From tismer@appliedbiometrics.com Sun Jun 6 20:54:04 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Sun, 06 Jun 1999 21:54:04 +0200 Subject: [Python-Dev] Stackless Preview (was: Memory leak under Idle?) References: <000901beafcc$424ec400$639e2299@tim> Message-ID: <375AD1DC.19C1C0F6@appliedbiometrics.com> Tim Peters wrote: [see the main list on idle leaks] > if-pystone-works-ship-it-ly y'rs - tim Well, on request of uncle Timmy, I do it. Although it's very early. A preview of stackless Python can be found under ftp://ftp.pns.cc/pub/stackless_990606.zip Current status: The main interpreter is completely stackless. Just for fun, I've set max recursion depth to 30000, so just try it. PyStone does of course run. My measurements were about 3-5 percent slower than with standard Python. I think this is quite fair. As a side effect, the exec_statement now behaves better than before, since exec without globals and locals should update the current environment, which worked only for exec "string". Most of the Run_ functions are stackless as well. Almost all cases could be treated tail recursively. I have just begun to work on the builtins, and there is a very bloody, new-born stackless map, which seems to behave quite well. (It is just an hour old, so don't blame me if I didn't get all refcounts right). This is a first special case, since I *had* to build a tiny interpreter from the old map code. Still quite hacky, but not so bad. It creates its own frame and bails out whenever it needs to call the interpreter. If not, it stays in the loop. Since this one is so fresh, the old map is still there, and the new one has the name "map_nr". As a little bonus, map_nr now also shows up in a traceback. I've set the line no to the iteration count. Beware, this is just a proof of concept and will most probably change. Further plans: I will make the other builtins stackless as well (reduce, filter), also the simple tail-recursive ones which I didn't do now due to lack of time. I think I will *not* think of stackless imports. After looking into this for a while, I think this is rather hairy, and also not necessary. On extensions: There will be a coroutine extension in a few days. This is now nearly a no-brainer, since I did the stackless Python with exactly that in mind. This is the real fruit I'm after, so please let me pick it :) Documentation: Besides the few new comments, there is nothing yet. Diff files: Sorry, there are no diffs but just the modified files. I had no time to do them now. All files stem from the official Python 1.5.2 release. You might wonder about the version: In order to support extension modules which rely on some special new features of frames, I decided to name this Python "1.5.42", since I believe it will be useful to at least "four two" people. :-) I consider this an Alpha 1 version. fearing the feedback :-) ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From da@ski.org Mon Jun 7 17:43:09 1999 From: da@ski.org (David Ascher) Date: Mon, 7 Jun 1999 09:43:09 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl Message-ID: In case you haven't heard about it, ActiveState has recently signed a contract with Microsoft to do some work on Perl on win32. 
One interesting aspect of this for Python is the specific work being performed. From the FAQ on this joint effort, one gets, under "What is the scope of the work that is being done?": fork() This implementation of fork() will clone the running interpreter and create a new interpreter with its own thread, but running in the same process space. The goal is to achieve functional equivalence to fork() on UNIX systems without suffering the performance hit of the process creation overhead on Win32 platforms. Emulating fork() within a single process needs the ability to run multiple interpreters concurrently in separate threads. Perl version 5.005 has experimental support for this in the form of the PERL_OBJECT build option, but it has some shortcomings. PERL_OBJECT needs a C++ compiler, and currently only works on Windows. ActiveState will be working to provide support for revamped support for the PERL_OBJECT functionality that will run on every platform that Perl will build on, and will no longer require C++ to work. This means that other operating systems that lack fork() but have support for threads (such as VMS and MacOS) will benefit from this aspect of the work. Any guesses as to whether we could hijack this work if/when it is released as Open Source? --david From guido@CNRI.Reston.VA.US Mon Jun 7 17:49:27 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 12:49:27 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 09:43:09 PDT." References: Message-ID: <199906071649.MAA12619@eric.cnri.reston.va.us> > In case you haven't heard about it, ActiveState has recently signed a > contract with Microsoft to do some work on Perl on win32. Have I ever heard of it! :-) David Grove pulled me into one of his bouts of paranoia. I think he's calmed down for the moment. > One interesting aspect of this for Python is the specific work being > performed. From the FAQ on this joint effort, one gets, under "What is > the scope of the work that is being done?": > > fork() > > This implementation of fork() will clone the running interpreter > and create a new interpreter with its own thread, but running in the > same process space. The goal is to achieve functional equivalence to > fork() on UNIX systems without suffering the performance hit of the > process creation overhead on Win32 platforms. > > Emulating fork() within a single process needs the ability to run > multiple interpreters concurrently in separate threads. Perl version > 5.005 has experimental support for this in the form of the PERL_OBJECT > build option, but it has some shortcomings. PERL_OBJECT needs a C++ > compiler, and currently only works on Windows. ActiveState will be > working to provide support for revamped support for the PERL_OBJECT > functionality that will run on every platform that Perl will build on, > and will no longer require C++ to work. This means that other operating > systems that lack fork() but have support for threads (such as VMS and > MacOS) will benefit from this aspect of the work. > > Any guesses as to whether we could hijack this work if/when it is released > as Open Source? When I saw this, my own response was simply "those poor Perl suckers are relying too much of fork()." Am I wrong, and is this also a habit of Python programmers? Anyway, I doubt that we coould use their code, as it undoubtedly refers to reimplementing fork() at the Perl level, not at the C level (which would be much harder). 
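(For readers who have never used it, the idiom being debated is the plain Unix one sketched below -- a minimal, hypothetical 1.5.2-era example, Unix-only; it is this behaviour that the ActiveState work aims to reproduce on Win32 with one interpreter per thread.)

    import os

    # Minimal sketch of the classic fork idiom (hypothetical example).
    # Unix-only: os.fork() does not exist on Win32, which is the whole point.
    pid = os.fork()
    if pid == 0:
        # Child: gets its own copy of the interpreter state from here on.
        print "child %d doing the work" % os.getpid()
        os._exit(0)
    else:
        # Parent: carries on independently, then reaps the child.
        pid, status = os.waitpid(pid, 0)
        print "child %d exited with status %04x" % (pid, status)
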
--Guido van Rossum (home page: http://www.python.org/~guido/) From da@ski.org Mon Jun 7 17:51:45 1999 From: da@ski.org (David Ascher) Date: Mon, 7 Jun 1999 09:51:45 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071649.MAA12619@eric.cnri.reston.va.us> Message-ID: On Mon, 7 Jun 1999, Guido van Rossum wrote: > When I saw this, my own response was simply "those poor Perl suckers > are relying too much of fork()." Am I wrong, and is this also a habit > of Python programmers? Well, I find the fork() model to be a very simple one to use, much easier to manage than threads or full-fledged IPC. So, while I don't rely on it in any crucial way, it's quite convenient at times. --david From guido@CNRI.Reston.VA.US Mon Jun 7 17:56:22 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 12:56:22 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 09:51:45 PDT." References: Message-ID: <199906071656.MAA12642@eric.cnri.reston.va.us> > Well, I find the fork() model to be a very simple one to use, much easier > to manage than threads or full-fledged IPC. So, while I don't rely on it > in any crucial way, it's quite convenient at times. Can you give a typical example where you use it, or is this just a gut feeling? It's also dangerous -- e.g. unexpected errors may percolate down the wrong stack (many mailman bugs had to do with forking), GUI apps generally won't be cloned, and some extension libraries don't like to be cloned either (e.g. ILU). --Guido van Rossum (home page: http://www.python.org/~guido/) From da@ski.org Mon Jun 7 18:02:31 1999 From: da@ski.org (David Ascher) Date: Mon, 7 Jun 1999 10:02:31 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071656.MAA12642@eric.cnri.reston.va.us> Message-ID: On Mon, 7 Jun 1999, Guido van Rossum wrote: > Can you give a typical example where you use it, or is this just a gut > feeling? Well, the latest example was that I wanted to spawn a Python process to do viewing of NumPy arrays with Tk from within the Python interactive shell (without using a shell wrapper). It's trivial with a fork(), and non-trivial with threads. The solution I had to finalize on was to branch based on OS and do threads where threads are available and fork() otherwise. Likely 2.05 times as many errors as with a single solution =). > It's also dangerous -- e.g. unexpected errors may percolate down the > wrong stack (many mailman bugs had to do with forking), GUI apps > generally won't be cloned, and some extension libraries don't like to > be cloned either (e.g. ILU). More dangerous than threads? Bwaaahaahaa! =). fork() might be "deceivingly simple in appearance", I grant you that. But sometimes that's good enough. It's also possible that fork() without all of its process-handling relatives isn't useful enough to warrant the effort. --david From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Mon Jun 7 18:05:20 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Mon, 7 Jun 1999 13:05:20 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl References: <199906071656.MAA12642@eric.cnri.reston.va.us> Message-ID: <14171.64464.805578.325069@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> It's also dangerous -- e.g. 
unexpected errors may percolate Guido> down the wrong stack (many mailman bugs had to do with Guido> forking), GUI apps generally won't be cloned, and some Guido> extension libraries don't like to be cloned either Guido> (e.g. ILU). Rambling mode on... Okay, so you can't guarantee that fork will be everywhere you might want to run an application. For example, that's one of the main reasons Mailman hasn't been ported off of Un*x. But you also can't guarantee that threads will be everywhere either. One of the things I'd (eventually) like to do is to re-architect Mailman so that it uses a threaded central server instead of the current one-shot process model. But there's been debate among the developers because 1) threads aren't supported everywhere, and 2) thread support isn't built-in by default anyway. I wonder if it's feasible or useful to promote threading support in Python? Thoughts would include building threads in by default if possible on the platform, integrating Greg's free threading mods, etc. Providing more integrated support for threads might encourage programmers to reach for that particular tool instead of fork, which is crude, but pretty damn handy and easy to use. Rambling mode off... -Barry From jim@digicool.com Mon Jun 7 18:07:59 1999 From: jim@digicool.com (Jim Fulton) Date: Mon, 07 Jun 1999 13:07:59 -0400 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <375BFC6F.BF779796@digicool.com> David Ascher wrote: > > On Mon, 7 Jun 1999, Guido van Rossum wrote: > > > When I saw this, my own response was simply "those poor Perl suckers > > are relying too much of fork()." Am I wrong, and is this also a habit > > of Python programmers? > > Well, I find the fork() model to be a very simple one to use, much easier > to manage than threads or full-fledged IPC. So, while I don't rely on it > in any crucial way, it's quite convenient at times. Interesting. I prefer threads because they eliminate the *need* for an IPC. I find locks and the various interesting things you can build from them to be much easier to deal with and more elegant than IPC. I wonder if the perl folks are also going to emulate doing IPC in the same process. Hee hee. :) Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From da@ski.org Mon Jun 7 18:10:56 1999 From: da@ski.org (David Ascher) Date: Mon, 7 Jun 1999 10:10:56 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14171.64464.805578.325069@anthem.cnri.reston.va.us> Message-ID: On Mon, 7 Jun 1999, Barry A. Warsaw wrote: > I wonder if it's feasible or useful to promote threading support in > Python? Thoughts would include building threads in by default if > possible on the platform, That seems a good idea to me. It's a relatively safe thing to enable by default, no? > Providing more integrated support for threads might encourage > programmers to reach for that particular tool instead of fork, which > is crude, but pretty damn handy and easy to use. 
While we're at it, it'd be nice if we could provide a better answer when someone asks (as "they" often do) "how do I program with threads in Python" than our usual "the way you'd do it in C". Threading tutorials are very hard to come by, I've found (I got the ORA multi-threaded programming in win32, but it's such a monster I've barely looked at it). I suggest that we allocate about 10% of TimBot's time to that task. If necessary, we can upgrade it to a dual-CPU setup. With Greg's threading patches, we could even get it to run on both CPUs efficiently. It could write about itself. --david From akuchlin@mems-exchange.org Mon Jun 7 18:20:15 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Mon, 7 Jun 1999 13:20:15 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> Message-ID: <14171.65359.306743.276505@amarok.cnri.reston.va.us> David Ascher writes: >While we're at it, it'd be nice if we could provide a better answer when >someone asks (as "they" often do) "how do I program with threads in >Python" than our usual "the way you'd do it in C". Threading tutorials >are very hard to come by, I've found (I got the ORA multi-threaded Agreed; I'd love to see a HOWTO on thread programming. I really liked Andrew Birrell's introduction to threads for Modula-3; see http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html (Postscript and PDF versions available.) Translating its approach to Python would be an excellent starting point. -- A.M. Kuchling http://starship.python.net/crew/amk/ "If you had stayed with us, we could have given you life until death." "Don't I get that anyway?" -- Stheno and Lyta Hall, in SANDMAN #61: "The Kindly Ones:5" From guido@CNRI.Reston.VA.US Mon Jun 7 18:24:45 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 13:24:45 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 13:20:15 EDT." <14171.65359.306743.276505@amarok.cnri.reston.va.us> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14171.65359.306743.276505@amarok.cnri.reston.va.us> Message-ID: <199906071724.NAA12743@eric.cnri.reston.va.us> > David Ascher writes: > >While we're at it, it'd be nice if we could provide a better answer when > >someone asks (as "they" often do) "how do I program with threads in > >Python" than our usual "the way you'd do it in C". Threading tutorials > >are very hard to come by, I've found (I got the ORA multi-threaded Andrew Kuchling chimes in: > Agreed; I'd love to see a HOWTO on thread programming. I really > liked Andrew Birrell's introduction to threads for Modula-3; see > http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html > (Postscript and PDF versions available.) Translating its approach to > Python would be an excellent starting point. Another idea is for someone to finish the thread tutorial that I started early 1998 (and never finished because I realized that it needed the threading module and some thread-safety patches to urllib for the examples I had in mind to work). 
It's actually on the website (but unlinked-to): http://www.python.org/doc/essays/threads.html --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@cnri.reston.va.us Mon Jun 7 18:28:57 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Mon, 7 Jun 1999 13:28:57 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071724.NAA12743@eric.cnri.reston.va.us> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14171.65359.306743.276505@amarok.cnri.reston.va.us> <199906071724.NAA12743@eric.cnri.reston.va.us> Message-ID: <14172.289.552901.264826@bitdiddle.cnri.reston.va.us> Indeed, it might be better to start with the threading module for the first tutorial. While I'm also a fan of Birrell's paper, it would encourage people to start with the low-level thread module, instead of the higher-level threading module. So the right answer, of course, is to do both! Jeremy From bwarsaw@python.org Mon Jun 7 18:36:05 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Mon, 7 Jun 1999 13:36:05 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> Message-ID: <14172.773.807413.412693@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: >> I wonder if it's feasible or useful to promote threading >> support in Python? Thoughts would include building threads in >> by default if possible on the platform, DA> That seems a good idea to me. It's a relatively safe thing to DA> enable by default, no? Don't know how hard it would be to write the appropriate configure tests, but then again, if it was easy I'd'a figured Guido would have done it already. A simple thing would be to change the default sense of "Do we build in thread support?". Make this true by default, and add a --without-threads configure flag people can use to turn them off. -Barry From skip@mojam.com (Skip Montanaro) Mon Jun 7 23:37:38 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 7 Jun 1999 18:37:38 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.773.807413.412693@anthem.cnri.reston.va.us> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14172.773.807413.412693@anthem.cnri.reston.va.us> Message-ID: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> BAW> A simple thing would be to change the default sense of "Do we build BAW> in thread support?". Make this true by default, and add a BAW> --without-threads configure flag people can use to turn them off. True enough, but as Guido pointed out, enabling threads by default would immediately make the Mac a second-class citizen. Test cases and demos would eventually find their way into the distribution that Mac users could not run, etc., etc. It may not account for a huge fraction of the Python development seats, but it seems a shame to leave it out in the cold. Has there been an assessment of how hard it would be to add thread support to the Mac? On a scale of 1 to 10 (1: we know how, but it's not implemented because nobody's needed it so far, 10: drilling for oil on the sun would be easier), how hard would it be? I assume Jack Jansen is on this list. Jack, any thoughts? Alpha code? Pre-alpha code? 
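(As a reference point for the tutorial discussion above: the kind of minimal example such a document might open with -- a hedged sketch using the 1.5.2-era modules, showing the low-level thread interface next to the higher-level threading one that Jeremy suggests teaching first.)

    import thread, threading, time

    def worker(name, delay):
        # Toy task: print a few numbered steps, identified by name.
        for i in range(3):
            time.sleep(delay)
            print "%s: step %d" % (name, i)

    # Low-level interface: fire and forget, nothing to join on.
    thread.start_new_thread(worker, ("thread", 0.1))

    # Higher-level interface: a Thread object you can start and join.
    t = threading.Thread(target=worker, args=("threading", 0.1))
    t.start()
    t.join()
    time.sleep(1)   # crude: give the detached low-level thread time to finish
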
Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip@mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From da@ski.org Mon Jun 7 23:43:32 1999 From: da@ski.org (David Ascher) Date: Mon, 7 Jun 1999 15:43:32 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: On Mon, 7 Jun 1999, Skip Montanaro wrote: > True enough, but as Guido pointed out, enabling threads by default would > immediately make the Mac a second-class citizen. Test cases and demos would > eventually find their way into the distribution that Mac users could not > run, etc., etc. It may not account for a huge fraction of the Python > development seats, but it seems a shame to leave it out in the cold. I'm not sure I buy that argument. There are already thread demos in the current directory, and no one complains. The windows builds are already threaded by default, and it's not caused any problems that I know of. Think of it like enabling the *new* module. =) > Has there been an assessment of how hard it would be to add thread > support to the Mac? That's an interesting question, especially since ActiveState lists it as a machine w/ threads and w/o fork(). --david From skip@mojam.com (Skip Montanaro) Mon Jun 7 23:49:12 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 7 Jun 1999 18:49:12 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com> David> I'm not sure I buy that argument. Think of it like enabling the David> *new* module. =) That's not quite the same thing. The new module simply exposes some normally closed-from-Python-code data structures to the Python programmer. Enabling threads requires some support from the underlying runtime system. If that was already in place, I suspect the Mac binaries would come with the thread module enabled by default, yes? Skip From da@ski.org Mon Jun 7 23:58:22 1999 From: da@ski.org (David Ascher) Date: Mon, 7 Jun 1999 15:58:22 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com> Message-ID: On Mon, 7 Jun 1999, Skip Montanaro wrote: > That's not quite the same thing. The new module simply exposes some > normally closed-from-Python-code data structures to the Python programmer. > Enabling threads requires some support from the underlying runtime system. > If that was already in place, I suspect the Mac binaries would come with the > thread module enabled by default, yes? I'm not denying that. It's just that there are lots of things which fall into that category, like (to take a pointed example =), os.fork(). We don't have a --with-fork configure flag. We expose to the Python programmer all of the underlying OS that is 'wrapped' as long as it's reasonably portable. I think that most unices + win32 is a reasonable approximation of 'reasonably portable'. And in fact, this change might motivate someone with Mac fervor to explore adding Python support of Mac threads. 
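(The "branch based on OS" approach David described earlier in the thread comes out looking roughly like this -- a hypothetical sketch, not his actual code: prefer a thread where the build has thread support, fall back to fork() where the platform provides it.)

    import os

    def run_in_background(func, args):
        # Hypothetical helper: thread if available, else fork(), else give up.
        try:
            import thread
        except ImportError:
            thread = None
        if thread is not None:
            thread.start_new_thread(func, args)
        elif hasattr(os, 'fork'):
            if os.fork() == 0:      # child does the work and exits
                apply(func, args)
                os._exit(0)
            # (a real version would also arrange to reap the child)
        else:
            raise RuntimeError, "no thread support and no fork() here"

    def show_viewer(label):
        print "pretend a viewer window pops up for", label

    run_in_background(show_viewer, ("an array",))
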
--david From gmcm@hypernet.com Tue Jun 8 01:01:56 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 7 Jun 1999 19:01:56 -0500 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: <1283322126-63868517@hypernet.com> David Ascher wrote: > On Mon, 7 Jun 1999, Skip Montanaro wrote: > > > True enough, but as Guido pointed out, enabling threads by default would > > immediately make the Mac a second-class citizen. > I'm not sure I buy that argument. There are already thread demos in > the current directory, and no one complains. The windows builds are > already threaded by default, and it's not caused any problems that I > know of. Think of it like enabling the *new* module. =) > > > Has there been an assessment of how hard it would be to add thread > > support to the Mac? > > That's an interesting question, especially since ActiveState lists > it as a machine w/ threads and w/o fork(). Not a Mac programmer, but I recall that when Steve Jobs came back, they published a schedule that said threads would be available a couple releases down the road. Schedules only move one way, so I'd guess ActiveState is premature. Perhaps Christian's stackless Python would enable green threads... (And there are a number of things in the standard distribution which don't work on Windows, either; fork and select()ing on file fds). - Gordon From skip@mojam.com (Skip Montanaro) Tue Jun 8 00:06:34 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 7 Jun 1999 19:06:34 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com> Message-ID: <14172.20567.40217.703269@cm-24-29-94-19.nycap.rr.com> David> I think that most unices + win32 is a reasonable approximation of David> 'reasonably portable'. And in fact, this change might motivate David> someone with Mac fervor to explore adding Python support of Mac David> threads. One can hope... ;-) Skip From MHammond@skippinet.com.au Tue Jun 8 00:06:37 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 09:06:37 +1000 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071649.MAA12619@eric.cnri.reston.va.us> Message-ID: <000501beb13a$9eec2c10$0801a8c0@bobcat> > > In case you haven't heard about it, ActiveState has > recently signed a > > contract with Microsoft to do some work on Perl on win32. > > Have I ever heard of it! :-) David Grove pulled me into one of his > bouts of paranoia. I think he's calmed down for the moment. It sounds like a :-), but Im afraid I dont understand that reference. When I first heard this, two things sprung to mind: a) Why shouldnt Python push for a similar deal? b) Something more interesting in the MS/Python space is happening anyway, so nyah nya nya ;-) Getting some modest funds to (say) put together and maintain single core+win32 installers to place on the NT resource kit could only help Python. Sometimes I wish we had a few less good programmers, and a few more good marketting type people ;-) > Anyway, I doubt that we coould use their code, as it undoubtedly > refers to reimplementing fork() at the Perl level, not at the C level > (which would be much harder). Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate fork using the Win32 extensions? 
Python has basically all of the native Win32 process API exposed, and writing a "fork" in Python that only forked Python scripts (for example) may be feasable and not too difficult. It would have obvious limitations, including the fact that it is not available standard with Python on Windows (just like a working popen now :-) but if we could follow the old 80-20 rule, and catch 80% of the uses with 20% of the effort it may be worth investigating. My knowledge of fork is limited to muttering "something about cloning the current process", so I may be naive in the extreme - but is this feasible? Mark. From fredrik@pythonware.com Tue Jun 8 00:21:15 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:21:15 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <000501beb13a$9eec2c10$0801a8c0@bobcat> Message-ID: <001601beb13c$70ff5b90$f29b12c2@pythonware.com> Mark wrote: > Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate > fork using the Win32 extensions? Python has basically all of the native > Win32 process API exposed, and writing a "fork" in Python that only forked > Python scripts (for example) may be feasable and not too difficult. > > It would have obvious limitations, including the fact that it is not > available standard with Python on Windows (just like a working popen now > :-) but if we could follow the old 80-20 rule, and catch 80% of the uses > with 20% of the effort it may be worth investigating. > > My knowledge of fork is limited to muttering "something about cloning the > current process", so I may be naive in the extreme - but is this feasible? as an aside, GvR added Windows' "spawn" API in 1.5.2, so you can at least emulate some common variants of fork+exec. this means that if someone writes a spawn for Unix, we would at least catch >0% of the uses with ~0% of the effort ;-) fwiw, I'm more interested in the "unicode all the way down" parts of the activestate windows project. more on that later. From gstein@lyra.org Tue Jun 8 00:10:38 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 16:10:38 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <375C516E.76EC8ED4@lyra.org> David Ascher wrote: >... > I'm not denying that. It's just that there are lots of things which fall > into that category, like (to take a pointed example =), os.fork(). We > don't have a --with-fork configure flag. We expose to the Python > programmer all of the underlying OS that is 'wrapped' as long as it's > reasonably portable. I think that most unices + win32 is a reasonable > approximation of 'reasonably portable'. And in fact, this change might > motivate someone with Mac fervor to explore adding Python support of Mac > threads. Agreed. Python isn't a least-common-demoninator language. It tries to make things easy for people. Why should we kill all platforms because of a lack on one? Having threads by default will make a lot of things much simpler (in terms of knowing the default platform). Can't tell you how many times I curse to find that the default RedHat distribution (as of 5.x) did not use threads, even though they are well-supported on Linux. And about stuff creeping into the distribution: gee... does that mean that SocketServer doesn't work on the Mac? Threads *and* fork are not available on Python/Mac, so all you would get is a single-threaded server. icky. I can't see how adding threads to other platforms will *hurt* the Macintosh platform... it can only help others. 
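(Concretely, that is what the SocketServer point amounts to: the standard mix-ins only give you a concurrent server where threads or fork() exist, and everywhere else you fall back to the serial TCPServer. A rough sketch with a made-up handler and port:)

    import SocketServer

    class EchoHandler(SocketServer.StreamRequestHandler):
        # Made-up handler: echo one line back to the client.
        def handle(self):
            self.wfile.write(self.rfile.readline())

    # Pick the most concurrent flavour the platform supports:
    # ThreadingTCPServer needs the thread module, ForkingTCPServer needs
    # os.fork(), and plain TCPServer handles one request at a time anywhere.
    try:
        import thread
        server_class = SocketServer.ThreadingTCPServer
    except ImportError:
        import os
        if hasattr(os, 'fork'):
            server_class = SocketServer.ForkingTCPServer
        else:
            server_class = SocketServer.TCPServer

    server = server_class(("", 8037), EchoHandler)
    server.serve_forever()
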
About the only reason that I can see to *not* make them the default is the slight speed loss. But that seems a bit bogus, as the interpreter loop doesn't spend *that* much time mucking with the interp_lock to allow thread switches. There have also been some real good suggestions for making it take near-zero time until you actually create that second thread. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik@pythonware.com Tue Jun 8 00:26:08 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:26:08 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <1283322126-63868517@hypernet.com> Message-ID: <002a01beb13d$1fa23c80$f29b12c2@pythonware.com> > Not a Mac programmer, but I recall that when Steve Jobs came back, > they published a schedule that said threads would be available a > couple releases down the road. Schedules only move one way, so I'd > guess ActiveState is premature. http://www.computerworld.com/home/print.nsf/all/990531AAFA > Perhaps Christian's stackless Python would enable green threads... > > (And there are a number of things in the standard distribution which > don't work on Windows, either; fork and select()ing on file fds). time to implement channels? (Tcl's unified abstraction for all kinds of streams that you could theoretically use something like select on. sockets, pipes, asynchronous disk I/O, etc). does select really work on ordinary files under Unix, btw? From fredrik@pythonware.com Tue Jun 8 00:30:57 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:30:57 +0200 Subject: [Python-Dev] ActiveState & fork & Perl Message-ID: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> I wrote: > > Not a Mac programmer, but I recall that when Steve Jobs came back, > > they published a schedule that said threads would be available a > > couple releases down the road. Schedules only move one way, so I'd > > guess ActiveState is premature. > > http://www.computerworld.com/home/print.nsf/all/990531AAFA which was just my way of saying that "did he perhaps refer to OS X ?". or are they adding real threads to good old MacOS too? From fredrik@pythonware.com Tue Jun 8 00:38:02 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:38:02 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <375C516E.76EC8ED4@lyra.org> Message-ID: <003f01beb13e$c95a2750$f29b12c2@pythonware.com> > Having threads by default will make a lot of things much simpler > (in terms of knowing the default platform). Can't tell you how > many times I curse to find that the default RedHat distribution > (as of 5.x) did not use threads, even though they are well- > supported on Linux. I have a vague memory that once upon a time, the standard X libraries shipped with RedHat weren't thread safe, and Tkinter didn't work if you compiled Python with threads. but I might be wrong and/or that may have changed... From MHammond@skippinet.com.au Tue Jun 8 00:42:38 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 09:42:38 +1000 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> Message-ID: <000801beb13f$6e118310$0801a8c0@bobcat> > > http://www.computerworld.com/home/print.nsf/all/990531AAFA > > which was just my way of saying that "did he perhaps > refer to OS X ?". > > or are they adding real threads to good old MacOS too? 
Oh, /F, please dont start adding annotations to your collection of incredibly obscure URLs - takes away half the fun ;-) Mark. From gstein@lyra.org Tue Jun 8 01:01:41 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 17:01:41 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: <375C516E.76EC8ED4@lyra.org> <003f01beb13e$c95a2750$f29b12c2@pythonware.com> Message-ID: <375C5D65.6E6CD6F@lyra.org> Fredrik Lundh wrote: > > > Having threads by default will make a lot of things much simpler > > (in terms of knowing the default platform). Can't tell you how > > many times I curse to find that the default RedHat distribution > > (as of 5.x) did not use threads, even though they are well- > > supported on Linux. > > I have a vague memory that once upon a time, the standard > X libraries shipped with RedHat weren't thread safe, and > Tkinter didn't work if you compiled Python with threads. > > but I might be wrong and/or that may have changed... Yes, it has changed. RedHat now ships with a thread-safe X so that they can use GTK and Gnome (which use threads quite a bit). There may be other limitations, however, as I haven't tried to do any threaded GUI programming, especially on a recent RedHat (I'm using a patched/hacked RH 4.1 system). RedHat 6.0 may even ship with a threaded Python, but I dunno... -g -- Greg Stein, http://www.lyra.org/ From da@ski.org Tue Jun 8 01:43:27 1999 From: da@ski.org (David Ascher) Date: Mon, 7 Jun 1999 17:43:27 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000501beb13a$9eec2c10$0801a8c0@bobcat> Message-ID: On Tue, 8 Jun 1999, Mark Hammond wrote: > When I first heard this, two things sprung to mind: > a) Why shouldnt Python push for a similar deal? > b) Something more interesting in the MS/Python space is happening anyway, > so nyah nya nya ;-) > > Getting some modest funds to (say) put together and maintain single > core+win32 installers to place on the NT resource kit could only help > Python. How much money are we talking about (no, I'm not offering =)? I wonder if one problem we have is that the folks with $$'s don't want to advertise that they have $$'s because they don't want to be swamped with vultures (and because "that isn't done"), and the people with skills but no $$'s don't want to advertise that fact for a variety of reasons (modesty, fear of being labeled 'commercial', fear of exposing that they're not 100% busy, so "can't be good", etc.). I've been wondering if a broker service like sourceXchange for Python could work -- whether there are enough people who want something done to Python and are willing to pay for an Open Soure project (and whether there are enough "worker bees", although I suspect there are). I can think of several items on various TODO lists which could probably be tackled this way. (doing things *within* sourceXchange is clearly a possibility in the long term -- in the short term they seem focused on Linux, but time will tell). Guido, you're probably the point-man for such 'angels' -- do you get those kinds of requests periodically? How about you, Mark? One thing that ActiveState has going for it which doesn't exist in the Python world is a corporate entity devoted to software development and distribution. PPSI is a support company, or at least markets itself that way. 
--david From gstein@lyra.org Tue Jun 8 02:05:15 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 18:05:15 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <375C6C4B.617138AB@lyra.org> David Ascher wrote: > > On Tue, 8 Jun 1999, Mark Hammond wrote: > > > When I first heard this, two things sprung to mind: > > a) Why shouldnt Python push for a similar deal? As David points out, I believe this is simply because ActiveState is unique in their business type, products, and model. We don't have anything like that in the Python world (although Pythonware could theoretically go in a similar direction). >... > I've been wondering if a broker service like sourceXchange for Python > could work -- whether there are enough people who want something done to > Python and are willing to pay for an Open Soure project (and whether there > are enough "worker bees", although I suspect there are). I can think of > several items on various TODO lists which could probably be tackled this > way. (doing things *within* sourceXchange is clearly a possibility in the > long term -- in the short term they seem focused on Linux, but time will > tell). sourceXchange should work fine. I don't see it being Linux-only by any means. Heck, the server is a FreeBSD box, and Brian Behlendorf comes from the Apache world (and is a FreeBSD guy mostly). > Guido, you're probably the point-man for such 'angels' -- do you get those > kinds of requests periodically? How about you, Mark? > > One thing that ActiveState has going for it which doesn't exist in the > Python world is a corporate entity devoted to software development and > distribution. PPSI is a support company, or at least markets itself that > way. Yup. That's all we are. We are specifically avoiding any attempts to be a product company. ActiveState is all about products and support-type products. I met with Dick Hardt (ActiveState founder/president) just a couple weeks ago. Great guy. We spoke about ActiveState, what they're doing, and what they'd like to do. They might be looking for good Python people, too... Cheers, -g -- Greg Stein, http://www.lyra.org/ From akuchlin@mems-exchange.org Tue Jun 8 02:22:59 1999 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Mon, 7 Jun 1999 21:22:59 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14172.773.807413.412693@anthem.cnri.reston.va.us> <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: <14172.28787.399827.929220@newcnri.cnri.reston.va.us> Skip Montanaro writes: >True enough, but as Guido pointed out, enabling threads by default would >immediately make the Mac a second-class citizen. Test cases and demos would One possibility might be NSPR, the Netscape Portable Runtime, which provides platform-independent threads and I/O on Mac, Win32, and Unix. Perhaps a thread implementation could be written that sat on top of NSPR, in addition to the existing pthreads implementation. See http://www.mozilla.org/docs/refList/refNSPR/. (You'd probably only use NSPR on the Mac, though; there seems no point in adding another layer of complexity to Unix and Windows.) -- A.M. Kuchling http://starship.python.net/crew/amk/ When religion abandons poetic utterance, it cuts its own throat. 
-- Robertson Davies, _Marchbanks' Garland_ From tim_one@email.msn.com Tue Jun 8 02:24:47 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 7 Jun 1999 21:24:47 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Message-ID: <000901beb14d$b2759100$aaa02299@tim> [David Ascher] > In case you haven't heard about it, ActiveState has recently signed a > contract with Microsoft to do some work on Perl on win32. I'm astonished at the reaction this has provoked "out there". Here: D:\Python>perl -v This is perl, version 5.001 Unofficial patchlevel 1m. Copyright 1987-1994, Larry Wall Win32 port Copyright (c) 1995 Microsoft Corporation. All rights reserved. Developed by hip communications inc., http://info.hip.com/info/ Perl for Win32 Build 107 Built Apr 16 1996@14:47:22 Perl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the Perl 5.0 source kit. D:\Python> Notice the MS copyright? From 1995?! Perl for Win32 has *always* been funded by MS, even back when half of ActiveState was named "hip communications" <0.5 wink>. Thank Perl's dominance in CGI scripting -- MS couldn't sell NT Server if it didn't run Perl. MS may be vicious, but they're not stupid . > ... > fork() > ... > Any guesses as to whether we could hijack this work if/when it is released > as Open Source? It's proven impossible so far to reuse anything from the Perl source -- the code is an incestuous nightmare. From time to time the Perl-Porters talk about splitting some of it into reusable libraries, but that never happens; and the less they feel Perl's dominance is assured, the less they even talk about it. So I'm pessimistic (what else is new ?). I'd rather see the work put into threads anyway. The "Mac OS" problem will go away eventually; time to turn the suckers on by default. it's-not-like-millions-of-programmers-will-start-writing-thread-code-then- who-don't-now-ly y'rs - tim From guido@CNRI.Reston.VA.US Tue Jun 8 02:34:59 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 21:34:59 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 19:01:56 CDT." <1283322126-63868517@hypernet.com> References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <1283322126-63868517@hypernet.com> Message-ID: <199906080134.VAA13480@eric.cnri.reston.va.us> > Perhaps Christian's stackless Python would enable green threads... This has been suggested before... While this seems possible at first, all blocking I/O calls would have to be redone to pass control to the thread scheduler, before this would be useful -- a huge task! I believe SunOS 4.x's LWP (light-weight processes) library used this method. It was a drop-in replacement for the standard libc, containing changed versions of all system calls. I recall that there were one or two missing, which of course upset the posix module because it references almost *all* system calls... --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one@email.msn.com Tue Jun 8 02:38:38 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 7 Jun 1999 21:38:38 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> Message-ID: <000e01beb14f$a16d9a40$aaa02299@tim> [/F] > http://www.computerworld.com/home/print.nsf/all/990531AAFA > > which was just my way of saying that "did he perhaps > refer to OS X ?". > > or are they adding real threads to good old MacOS too? 
Dragon is doing a port of its speech recog software to "good old MacOS" and "OS X", and best we can tell the former is as close to an impossible target as we've ever seen. OS X looks like a pleasant romp, in comparison. I don't think they're going to do anything with "good old MacOS" except let it die. it-was-a-reasonable-architecture-15-years-ago-ly y'rs - tim From gstein@lyra.org Tue Jun 8 02:31:08 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 18:31:08 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14172.773.807413.412693@anthem.cnri.reston.va.us> <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <14172.28787.399827.929220@newcnri.cnri.reston.va.us> Message-ID: <375C725C.5A86D05B@lyra.org> Andrew Kuchling wrote: > > Skip Montanaro writes: > >True enough, but as Guido pointed out, enabling threads by default would > >immediately make the Mac a second-class citizen. Test cases and demos would > > One possibility might be NSPR, the Netscape Portable Runtime, > which provides platform-independent threads and I/O on Mac, Win32, and > Unix. Perhaps a thread implementation could be written that sat on > top of NSPR, in addition to the existing pthreads implementation. > See http://www.mozilla.org/docs/refList/refNSPR/. > > (You'd probably only use NSPR on the Mac, though; there seems no > point in adding another layer of complexity to Unix and Windows.) NSPR is licensed under the MPL, which is quite a bit more restrictive than Python's license. Of course, you could separately point Mac users to it to say "if you get NSPR, then you can have threads". Apache ran into the licensing issue and punted NSPR in favor of a home-grown runtime (which is not as ambitious as NSPR). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm@hypernet.com Tue Jun 8 03:37:34 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 7 Jun 1999 21:37:34 -0500 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <002a01beb13d$1fa23c80$f29b12c2@pythonware.com> Message-ID: <1283312788-64430290@hypernet.com> Fredrik Lundh writes: > > time to implement channels? (Tcl's unified abstraction > for all kinds of streams that you could theoretically use > something like select on. sockets, pipes, asynchronous > disk I/O, etc). I have mixed feelings about those types of things. I've recently run across a number of them in some C/C++ libs. On the "pro" side, they can give acceptable behavior and adequate performance and thus suffice for the majority of use. On the "con" side, they're usually an order of magnitude slower than the raw interface, don't quite behave correctly in borderline situations, and tend to produce "One True Path" believers. Of course, so do OSes, editors, languages, GUIs, browsers and colas. > does select really work on ordinary files under Unix, > btw? Sorry, should've said "where a socket is a real fd" or some such... just-like-God-intended-ly y'rs - Gordon From guido@CNRI.Reston.VA.US Tue Jun 8 02:46:40 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 21:46:40 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 17:43:27 PDT." References: Message-ID: <199906080146.VAA13572@eric.cnri.reston.va.us> > Guido, you're probably the point-man for such 'angels' -- do you get those > kinds of requests periodically? No, as far as I recall, nobody has ever offered me money for Python code to be donated to the body of open source. 
People sometimes seek to hire me, but promarily to further their highly competitive proprietary business goals... --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Jun 8 02:41:32 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 18:41:32 -0700 Subject: [Python-Dev] licensing Message-ID: <375C74CC.2947E4AE@lyra.org> Speaking of licensing issues... I seem to have read somewhere that the two Medusa files are under a separate license. Although, reading the files now, it seems they are not. The issue that I'm really raising is that Python should ship with a single license that covers everything. Otherwise, it will become very complicated for somebody to figure out which pieces fall under what restrictions. Is there anything in the distribution that is different than the normal license? For example, can I take the async modules and build a commercial product on them? Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@CNRI.Reston.VA.US Tue Jun 8 02:56:03 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 21:56:03 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Tue, 08 Jun 1999 09:06:37 +1000." <000501beb13a$9eec2c10$0801a8c0@bobcat> References: <000501beb13a$9eec2c10$0801a8c0@bobcat> Message-ID: <199906080156.VAA13612@eric.cnri.reston.va.us> [me] > > Have I ever heard of it! :-) David Grove pulled me into one of his > > bouts of paranoia. I think he's calmed down for the moment. [Mark] > It sounds like a :-), but Im afraid I dont understand that reference. David Grove occasionally posts to Perl lists with accusations that ActiveState is making Perl proprietary. He once announced a program editor to the Python list which upon inspection by me didn't contain any Python support, for which I flamed him. He then explained to me that he was in a hurry because ActiveState was taking over the Perl world. A couple of days ago, I received an email from him (part of a conversation on the perl5porters list apparently) where he warned me that ActiveState was planning a similar takeover of Python. After some comments from tchrist ("he's a loon") I decided to ignore David. > Sometimes I wish we had a few less good programmers, and a few more good > marketting type people ;-) Ditto... It sure ain't me! > Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate :-) > fork using the Win32 extensions? Python has basically all of the native > Win32 process API exposed, and writing a "fork" in Python that only forked > Python scripts (for example) may be feasable and not too difficult. > > It would have obvious limitations, including the fact that it is not > available standard with Python on Windows (just like a working popen now > :-) but if we could follow the old 80-20 rule, and catch 80% of the uses > with 20% of the effort it may be worth investigating. > > My knowledge of fork is limited to muttering "something about cloning the > current process", so I may be naive in the extreme - but is this feasible? I think it's not needed that much, but David has argued otherwise. I haven't heard much support either way from others. But I think it would be a huge task, because it would require taking control of all file descriptors (given the semantics that upon fork, file descriptors are shared, but if one half closes an fd it is still open in the other half). 
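(The semantics in question are easy to see on any Unix build -- a minimal, hypothetical sketch; /etc/passwd just stands in for any readable file. The child inherits the descriptor, the two copies share one file offset, and closing the child's copy leaves the parent's open -- exactly the bookkeeping a Win32 emulation would have to reproduce by hand.)

    import os

    fd = os.open("/etc/passwd", os.O_RDONLY)   # any readable file will do

    pid = os.fork()
    if pid == 0:
        # Child: inherits fd; reading advances the offset it shares with the
        # parent, and closing this copy does not close the parent's.
        os.read(fd, 16)
        os.close(fd)
        os._exit(0)

    os.waitpid(pid, 0)
    data = os.read(fd, 16)      # resumes at byte 16, and fd is still open
    print "parent reads:", repr(data)
    os.close(fd)
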
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Tue Jun 8 03:58:59 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Mon, 7 Jun 1999 21:58:59 -0500 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000e01beb14f$a16d9a40$aaa02299@tim> References: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> Message-ID: <1283311503-64507593@hypernet.com> [Tim] > Dragon is doing a port of its speech recog software to "good old > MacOS" and "OS X", and best we can tell the former is as close to an > impossible target as we've ever seen. OS X looks like a pleasant > romp, in comparison. I don't think they're going to do anything > with "good old MacOS" except let it die. > > it-was-a-reasonable-architecture-15-years-ago-ly y'rs - tim Don't Macs have another CPU in the keyboard already? Maybe you could just require a special microphone . that's-not-a-mini-tower-that's-a-um--subwoofer-ly y'rs - Gordon From guido@CNRI.Reston.VA.US Tue Jun 8 03:09:02 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 22:09:02 -0400 Subject: [Python-Dev] licensing In-Reply-To: Your message of "Mon, 07 Jun 1999 18:41:32 PDT." <375C74CC.2947E4AE@lyra.org> References: <375C74CC.2947E4AE@lyra.org> Message-ID: <199906080209.WAA13806@eric.cnri.reston.va.us> > Speaking of licensing issues... > > I seem to have read somewhere that the two Medusa files are under a > separate license. Although, reading the files now, it seems they are > not. > > The issue that I'm really raising is that Python should ship with a > single license that covers everything. Otherwise, it will become very > complicated for somebody to figure out which pieces fall under what > restrictions. > > Is there anything in the distribution that is different than the normal > license? There are pieces with different licenses but they only differ in the names of the beneficiaries, not in the conditions (although the words aren't always exactly the same). As far as I can tell, this is the situation for asyncore.py and asynchat.py: they have a copyright notice of their own (see the 1.5.2 source for the exact text) with Sam Rushing's copyright. > For example, can I take the async modules and build a commercial product > on them? As far as I know, yes. Sam Rushing promised me this when he gave them to me for inclusion. (I've had a complaint that they aren't the latest -- can someone confirm this?) --Guido van Rossum (home page: http://www.python.org/~guido/) From MHammond@skippinet.com.au Tue Jun 8 04:11:57 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 13:11:57 +1000 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906080156.VAA13612@eric.cnri.reston.va.us> Message-ID: <000b01beb15c$abd84ea0$0801a8c0@bobcat> [Please dont copy this out of this list :-] > world. A couple of days ago, I received an email from him (part of a > conversation on the perl5porters list apparently) where he warned me > that ActiveState was planning a similar takeover of Python. After > some comments from tchrist ("he's a loon") I decided to ignore David. I believe this to be true - at least "take over" in the same way they have "taken over" Perl. I have it on very good authority that Active State's medium term business plan includes expanding out of Perl alone, and Python is very high on their list. I also believe they would like to recruit people to help with this goal. 
They are of the opinion that Python alone could not support such a business quite yet, so attaching it to existing infrastructure could fly. On one hand I tend to agree, but on the other hand I think that we do a pretty damn good job as it is, so maybe a Python could fly all alone? And Ive got to say that personally, such an offer would be highly attractive. Depending on the terms (and I must admit I have not had a good look at the ActiveState Perl licenses) this could provide a real boost to the Python world. If the business model is open source software with paid-for support, it seems a win-win situation to me. However, it is very unclear to me, and the industry, that this model alone can work generally. A business-plan that involves withholding sources or technologies until a fee has been paid certainly moves quickly away from win-win to, to quote Guido, "highly competitive proprietary business goals". May be some interesting times ahead. For some time now I have meant to pass this on to PPSI as a heads-up, just incase they intend playing in that space in the future. So consider this it ;-) Mark. From gstein@lyra.org Tue Jun 8 04:13:42 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 20:13:42 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> Message-ID: <375C8A66.56B3F26B@lyra.org> Mark Hammond wrote: > > [Please dont copy this out of this list :-] It's in the archives now... :-) >...[well-said comments about open source and businesses]... > > May be some interesting times ahead. For some time now I have meant to > pass this on to PPSI as a heads-up, just incase they intend playing in that > space in the future. So consider this it ;-) I've already met Dick Hardt and spoken with him at length. Both on an individual basis, and as the President of PPSI. Nothing to report... (yet) Cheers, -g p.s. PPSI is a bit different, as we intend to fill the "support gap" rather than move into real products; ActiveState does products, along with support type stuff and other miscellaneous (I don't recall Dick's list offhand). -- Greg Stein, http://www.lyra.org/ From tim_one@email.msn.com Tue Jun 8 06:14:36 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 01:14:36 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000b01beb15c$abd84ea0$0801a8c0@bobcat> Message-ID: <000401beb16d$cd88d180$f29e2299@tim> [MarkH] > ... > And Ive got to say that personally, such an offer would be highly > attractive. Depending on the terms (and I must admit I have not > had a good look at the ActiveState Perl licenses) this could provide > a real boost to the Python world. I find the ActivePerl license to be quite confusing: http://www.activestate.com/ActivePerl/commlic.htm It appears to say flatly that you can't distribute it yourself, although other pages on the site say "sure, go ahead!". Also seems to imply you can't modify their code (they explicitly allow you to install patches obtained from ActiveState -- but that's all they mention). OTOH, they did a wonderful job on the Perl for Win32 port (a difficult port in the face of an often-hostile Perl community), and gave all the code back to the Perl folk. I've got no complaints about them so far. > If the business model is open source software with paid-for support, it > seems a win-win situation to me. "Part of our business model is to sell value added, proprietary components."; e.g., they sell a Perl Development Kit for $100, and so on. Fine by me! 
If I could sell tabnanny ... well, I wouldn't do that to anyone . would-like-to-earn-$1-from-python-before-he-dies-ly y'rs - tim From skip@mojam.com (Skip Montanaro) Tue Jun 8 06:37:22 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 8 Jun 1999 01:37:22 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <375C516E.76EC8ED4@lyra.org> References: <375C516E.76EC8ED4@lyra.org> Message-ID: <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> Greg> About the only reason that I can see to *not* make them the Greg> default is the slight speed loss. But that seems a bit bogus, as Greg> the interpreter loop doesn't spend *that* much time mucking with Greg> the interp_lock to allow thread switches. There have also been Greg> some real good suggestions for making it take near-zero time until Greg> you actually create that second thread. Okay, everyone has convinced me that holding threads hostage to the Mac is a red herring. I have other fish to fry. (It's 1:30AM and I haven't had dinner yet. Can you tell? ;-) Is there a way with configure to determine whether or not particular Unix variants should have threads enabled or not? If so, I think that's the way to go. I think it would be unfortunate to enable it by default, have it appear to work on some known to be unsupported platforms, but then bite the programmer in an inconvenient place at an inconvenient time. Such a self-deciding configure script should exit with some information about thread enablement: Yes, we support threads on RedHat Linux 6.0. No, you stinking Minix user, you will never have threads. Rhapsody, huh? I never heard of that. Some weird OS from Sunnyvale, you say? I don't know how to do threads there yet, but when you figure it out, send patches along to python-dev@python.org. Of course, users should be able to override anything using --with-thread or without-thread and possibly specify compile-time and link-time flags through arguments or the environment. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip@mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From skip@mojam.com (Skip Montanaro) Tue Jun 8 06:49:19 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 8 Jun 1999 01:49:19 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000b01beb15c$abd84ea0$0801a8c0@bobcat> References: <199906080156.VAA13612@eric.cnri.reston.va.us> <000b01beb15c$abd84ea0$0801a8c0@bobcat> Message-ID: <14172.44596.528927.548722@cm-24-29-94-19.nycap.rr.com> Okay, folks. I must have missed the memo. Who are ActiveState and sourceXchange? I can't be the only person on python-dev who never heard of either of them before this evening. I guess I'm the only one who's not shy about exposing their ignorance. but-i-can-tell-you-where-to-find-spare-parts-for-your-Triumph-ly 'yrs, Skip Montanaro 518-372-5583 See my car: http://www.musi-cal.com/~skip/ From da@ski.org Tue Jun 8 07:12:11 1999 From: da@ski.org (David Ascher) Date: Mon, 7 Jun 1999 23:12:11 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.44596.528927.548722@cm-24-29-94-19.nycap.rr.com> Message-ID: > Okay, folks. I must have missed the memo. Who are ActiveState and > sourceXchange? I can't be the only person on python-dev who never heard of > either of them before this evening. I guess I'm the only one who's not shy > about exposing their ignorance. 
Well, one answer is to look at www.activestate.com and www.sourcexchange.com, of course =) ActiveState "does" the win32 perl port, for money. (it's a little controversial within the Perl community, which has inherited some of RMS's "Microsoft OS? Ha!" attitude). sourceXchange is aiming to match open source programmers with companies who want open source work done for $$'s, in a 'market' format. It was started by Brian Behlendorf, now at O'Reilly, and of Apache fame. Go get dinner. =) --david From rushing@nightmare.com Tue Jun 8 01:10:18 1999 From: rushing@nightmare.com (Sam Rushing) Date: Mon, 7 Jun 1999 17:10:18 -0700 (PDT) Subject: [Python-Dev] licensing In-Reply-To: <9403621@toto.iv> Message-ID: <14172.23937.83700.673653@seattle.nightmare.com> Guido van Rossum writes: > Greg Stein writes: > > For example, can I take the async modules and build a commercial > > product on them? Yes, my intent was that they go under the normal Python 'do what thou wilt' license. If I goofed in any way, please let me know! > As far as I know, yes. Sam Rushing promised me this when he gave > them to me for inclusion. (I've had a complaint that they aren't > the latest -- can someone confirm this?) Guilty as charged. I've been tweaking them a bit lately, for performance, but anyone can grab the very latest versions out of the medusa CVS repository: CVSROOT=:pserver:medusa@seattle.nightmare.com:/usr/local/cvsroot (the password is 'medusa') Or download one of the snapshots. BTW, those particular files have always had the Python copyright/license. -Sam From gstein@lyra.org Tue Jun 8 08:09:00 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 08 Jun 1999 00:09:00 -0700 Subject: [Python-Dev] licensing References: <14172.23937.83700.673653@seattle.nightmare.com> Message-ID: <375CC18C.1DB5E9F2@lyra.org> Sam Rushing wrote: > > Greg Stein writes: > > > For example, can I take the async modules and build a commercial > > > product on them? > > Yes, my intent was that they go under the normal Python 'do what thou > wilt' license. If I goofed in any way, please let me know! Nope... you haven't goofed. I was thrown off when a certain person (nudge, nudge) goofed in their upcoming book, which I recently reviewed. thx! -g -- Greg Stein, http://www.lyra.org/ From fredrik@pythonware.com Tue Jun 8 09:08:08 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 10:08:08 +0200 Subject: [Python-Dev] licensing References: <375C74CC.2947E4AE@lyra.org> Message-ID: <00c501beb186$0c6d3450$f29b12c2@pythonware.com> > I seem to have read somewhere that the two Medusa files are under a > separate license. Although, reading the files now, it seems they are > not. the medusa server has restrictive license, but the asyncore and asynchat modules use the standard Python license, with Sam Rushing as the copyright owner. just use the source... > The issue that I'm really raising is that Python should ship with a > single license that covers everything. Otherwise, it will become very > complicated for somebody to figure out which pieces fall under what > restrictions. > > Is there anything in the distribution that is different than the normal > license? > > For example, can I take the async modules and build a commercial product > on them? surely hope so -- we're using them in everything we do. and my upcoming book is 60% about doing weird things with tkinter, and 40% about doing weird things with asynclib... 
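For readers who haven't met the async modules being discussed: asyncore supplies a dispatcher class that wraps a non-blocking socket, plus a loop() that drives every dispatcher off a single select() call, and asynchat layers terminator-based buffering on top of that for line-oriented protocols. The sketch below is only an illustration of that shape -- the class names and port number are made up, and nothing here is taken from Medusa itself:

import asyncore
import socket

class EchoHandler(asyncore.dispatcher):
    # one dispatcher per accepted connection
    def writable(self):
        return 0                      # we only react to incoming data
    def handle_read(self):
        data = self.recv(512)
        if data:
            self.send(data)           # echo it straight back

class EchoServer(asyncore.dispatcher):
    def __init__(self, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind(('', port))
        self.listen(5)
    def handle_accept(self):
        conn, addr = self.accept()
        EchoHandler(conn)             # wrap the new socket in its own dispatcher

if __name__ == '__main__':
    EchoServer(8007)
    asyncore.loop()                   # one select() loop drives every dispatcher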
From MHammond@skippinet.com.au Tue Jun 8 09:46:33 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 18:46:33 +1000 Subject: [Python-Dev] licensing In-Reply-To: <375CC18C.1DB5E9F2@lyra.org> Message-ID: <001101beb18b$6a049bd0$0801a8c0@bobcat> > Nope... you haven't goofed. I was thrown off when a certain person > (nudge, nudge) goofed in their upcoming book, which I > recently reviewed. I now feel for the other Mark and David, Aaron et al, etc. Our book is out of date in a number of ways before the tech reviewers even saw it. Medusa wasnt a good example - I should have known better when I wrote it. But Pythonwin is a _real_ problem. Just as I start writing the book, Neil sends me a really cool editor control and it leads me down a path of IDLE/Pythonwin integration. So almost _everything_ I have already written on "IDEs for Python" is already out of date - and printing is not scheduled for a number of months. [This may help explain to Guido and Tim my recent fervour in this area - I want to get the "new look" Pythonwin ready for the book. I just yesterday got a dockable interactive window happening. Now adding a splitter window to each window to expose a pyclbr based tree control and then it is time to stop (and re-write that chapter :-] Mark. From fredrik@pythonware.com Tue Jun 8 11:25:47 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 12:25:47 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <004d01beb199$4fe171c0$f29b12c2@pythonware.com> > (modesty, fear of being labeled 'commercial', fear of exposing that > they're not 100% busy, so "can't be good", etc.). fwiw, we're seeing an endless stream of mails from moral crusaders even before we have opened the little Python- Ware shoppe (coming soon, coming soon). some of them are quite nasty, to say the least... I usually tell them to raise their concerns on c.l.python instead. they never do. > One thing that ActiveState has going for it which doesn't exist in the > Python world is a corporate entity devoted to software development and > distribution. saying that there is NO such entity is a bit harsh, I think ;-) but different "scripting" companies are using different strategies, by various reasons. Scriptics, ActiveState, PythonWare, UserLand, Harlequin, Rebol, etc. are all doing similar things, but in different ways (due to markets, existing communities, and probably most important: different funding strategies). But we're all corporate entities devoted to software development... ... by the way, if someone thinks there's no money in Python, consider this: --- Google is looking to expand its operations and needs talented engineers to develop the next generation search engine. If you have a need to bring order to a chaotic web, contact us. Requirements: Several years of industry or hobby-based experience B.S. in Computer Science or equivalent (M.S. a plus) Extensive experience programming in C or C++ Extensive experience programming in the UNIX environment Knowledge of TCP/IP and network programming Experience developing/designing large software systems Experience programming in Python a plus --- Google Inc., a year-old Internet search-engine company, said it has attracted $25 million in venture-capital funding and will add two of Silicon Valley's best-known financiers, Michael Moritz and L. John Doerr, to its board. Even by Internet standards, Google has attracted an un- usually large amount of money for a company still in its infancy. 
--- looks like anyone on this list could get a cool Python job for an unusually over-funded startup within minutes ;-) From skip@mojam.com (Skip Montanaro) Tue Jun 8 12:12:02 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 8 Jun 1999 07:12:02 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <004d01beb199$4fe171c0$f29b12c2@pythonware.com> References: <004d01beb199$4fe171c0$f29b12c2@pythonware.com> Message-ID: <14172.63947.54638.275348@cm-24-29-94-19.nycap.rr.com> Fredrik> Even by Internet standards, Google has attracted an un- Fredrik> usually large amount of money for a company still in its Fredrik> infancy. And it's a damn good search engine to boot, so I think it probably deserves the funding (most of it will, I suspect, be used to muscle its way into a crowded market). It is *always* my first stop when I need a general-purpose search engine these days. I never use InfoSeek/Go, Lycos or HotBot for anything other than to check that Musi-Cal is still in their database. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip@mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From guido@CNRI.Reston.VA.US Tue Jun 8 13:46:51 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 08 Jun 1999 08:46:51 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Tue, 08 Jun 1999 01:37:22 EDT." <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> References: <375C516E.76EC8ED4@lyra.org> <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> Message-ID: <199906081246.IAA14302@eric.cnri.reston.va.us> > Is there a way with configure to determine whether or not particular Unix > variants should have threads enabled or not? If so, I think that's the way > to go. I think it would be unfortunate to enable it by default, have it > appear to work on some known to be unsupported platforms, but then bite the > programmer in an inconvenient place at an inconvenient time. That's not so much the problem, if you can get a threaded program to compile and link that probably means sufficient support exists. There currently are checks in the configure script that try to find out which thread library to use -- these could be expanded to disable threads when none of the known ones work. Anybody care enough to try hacking configure.in, or should I add this to my tired TODO list? --Guido van Rossum (home page: http://www.python.org/~guido/) From jack@oratrix.nl Tue Jun 8 13:47:44 1999 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 08 Jun 1999 14:47:44 +0200 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Message by Andrew Kuchling , Mon, 7 Jun 1999 21:22:59 -0400 (EDT) , <14172.28787.399827.929220@newcnri.cnri.reston.va.us> Message-ID: <19990608124745.3136B303120@snelboot.oratrix.nl> > One possibility might be NSPR, the Netscape Portable Runtime, > which provides platform-independent threads and I/O on Mac, Win32, and > Unix. Perhaps a thread implementation could be written that sat on > top of NSPR, in addition to the existing pthreads implementation. > See http://www.mozilla.org/docs/refList/refNSPR/. NSPR looks rather promising! Does anyone has any experiences with it? What I'd also be interested in is experiences in how it interacts with the "real" I/O system, i.e. can you mix and match NSPR calls with normal os calls, or will that break things? 
The latter is important for Python, because there are lots of external libraries, and while some are user-built (image libraries, gdbm, etc) and could conceivably be converted to use NSPR others are not... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido@CNRI.Reston.VA.US Tue Jun 8 14:28:02 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 08 Jun 1999 09:28:02 -0400 Subject: [Python-Dev] Python-dev archives going public In-Reply-To: Your message of "Mon, 07 Jun 1999 20:13:42 PDT." <375C8A66.56B3F26B@lyra.org> References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> <375C8A66.56B3F26B@lyra.org> Message-ID: <199906081328.JAA14584@eric.cnri.reston.va.us> > > [Please dont copy this out of this list :-] > > It's in the archives now... :-) Which reminds me... A while ago, Greg made some noises about the archives being public, and temporarily I made them private. In the following brief flurry of messages everybody who spoke up said they preferred the archives to be public (even though the list remains invitation-only). But I never made the change back, waiting for Greg to agree, but after returning from his well deserved tequilla-splashed vacation, he never gave a peep about this, and I "conveniently forgot". I still like the archives to be public. I hope Mark's remark there was a joke? --Guido van Rossum (home page: http://www.python.org/~guido/) From MHammond@skippinet.com.au Tue Jun 8 14:38:03 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 23:38:03 +1000 Subject: [Python-Dev] Python-dev archives going public In-Reply-To: <199906081328.JAA14584@eric.cnri.reston.va.us> Message-ID: <003101beb1b4$22786de0$0801a8c0@bobcat> > I still like the archives to be public. I hope Mark's remark there > was a joke? Well, not really a joke, but I am not naive to think this is a "private" forum even in the absence of archives. What I meant was closer to "please don't make public statements based purely on this information". I never agreed to keep it private, but by the same token didnt want to start the rumour mills and get bad press for either Dick or us ;-) Mark. From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Tue Jun 8 16:09:24 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Tue, 8 Jun 1999 11:09:24 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl References: <375C516E.76EC8ED4@lyra.org> <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> <199906081246.IAA14302@eric.cnri.reston.va.us> Message-ID: <14173.12836.616873.953134@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Anybody care enough to try hacking configure.in, or should Guido> I add this to my tired TODO list? I'll give it a look. I've done enough autoconf hacking that it shouldn't be too hard. I also need to get my string meths changes into the tree... -Barry From gstein@lyra.org Tue Jun 8 19:11:56 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 08 Jun 1999 11:11:56 -0700 Subject: [Python-Dev] Python-dev archives going public References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> <375C8A66.56B3F26B@lyra.org> <199906081328.JAA14584@eric.cnri.reston.va.us> Message-ID: <375D5CEC.340E2531@lyra.org> Guido van Rossum wrote: > > > > [Please dont copy this out of this list :-] > > > > It's in the archives now... :-) > > Which reminds me... 
A while ago, Greg made some noises about the > archives being public, and temporarily I made them private. In the > following brief flurry of messages everybody who spoke up said they > preferred the archives to be public (even though the list remains > invitation-only). But I never made the change back, waiting for Greg > to agree, but after returning from his well deserved tequilla-splashed > vacation, he never gave a peep about this, and I "conveniently > forgot". I appreciate the consideration, but figured it was a done deal based on feedback. My only consideration in keeping them private was the basic, human fact that people could feel left out. For example, if they read the archives, thought it was neat, and attempted to subscribe only to be refused. It is a bit easier to avoid engendering those bad feelings if the archives aren't public. Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim@digicool.com Tue Jun 8 19:41:11 1999 From: jim@digicool.com (Jim Fulton) Date: Tue, 08 Jun 1999 18:41:11 +0000 Subject: [Python-Dev] Python-dev archives going public References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> <375C8A66.56B3F26B@lyra.org> <199906081328.JAA14584@eric.cnri.reston.va.us> <375D5CEC.340E2531@lyra.org> Message-ID: <375D63C7.6BB6697E@digicool.com> Greg Stein wrote: > > My only consideration in keeping them private was the basic, human fact > that people could feel left out. For example, if they read the archives, > thought it was neat, and attempted to subscribe only to be refused. It > is a bit easier to avoid engendering those bad feelings if the archives > aren't public. I agree. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (540) 371-6909 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From tismer@appliedbiometrics.com Tue Jun 8 20:37:21 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 08 Jun 1999 21:37:21 +0200 Subject: [Python-Dev] Stackless Preview References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> Message-ID: <375D70F1.37007192@appliedbiometrics.com> Christian Tismer wrote: [a lot] > fearing the feedback :-) ciao - chris I expected everything but forgot to fear "no feedback". :-) About 5 or 6 people seem to have taken the .zip file. Now I'm wondering why nobody complains. Was my code so wonderful, so disgustingly bad, or is this just boring :-? If it's none of the three above, I'd be happy to get a hint if I should continue, or if and what I should change. Maybe it would make sense to add some documentation now, and also to come up with an application which makes use of the stackless implementation, since there is now not much to wonder about than that it seems to work :-) yes-call-me-impatient - ly chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy@cnri.reston.va.us Tue Jun 8 21:09:15 1999 From: jeremy@cnri.reston.va.us (Jeremy Hylton) Date: Tue, 8 Jun 1999 16:09:15 -0400 (EDT) Subject: [Python-Dev] Stackless Preview In-Reply-To: <375D70F1.37007192@appliedbiometrics.com> References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> <375D70F1.37007192@appliedbiometrics.com> Message-ID: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> >>>>> "CT" == Christian Tismer writes: CT> Christian Tismer wrote: [a lot] >> fearing the feedback :-) ciao - chris CT> I expected everything but forgot to fear "no feedback". :-) CT> About 5 or 6 people seem to have taken the .zip file. Now I'm CT> wondering why nobody complains. Was my code so wonderful, so CT> disgustingly bad, or is this just boring :-? CT> If it's none of the three above, I'd be happy to get a hint if I CT> should continue, or if and what I should change. I'm one of the silent 5 or 6. My reasons fall under "None of the above." They are three in number: 1. No time (the perennial excuse; next 2 weeks are quite hectic) 2. I tried to use ndiff to compare old and new ceval.c, but ran into some problems with that tool. (Tim, it looks like the line endings are identical -- all '\012'.) 3. Wasn't sure what to look at first My only suggestion would be to have an executive summary. If there was a short README file -- no more than 150 lines -- that described the essentials of the approach and told me what to look at first, I would be able to comment more quickly. Jeremy From tismer@appliedbiometrics.com Tue Jun 8 21:15:04 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 08 Jun 1999 22:15:04 +0200 Subject: [Python-Dev] Stackless Preview References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> <375D70F1.37007192@appliedbiometrics.com> <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> Message-ID: <375D79C8.90B3E721@appliedbiometrics.com> Jeremy Hylton wrote: [...] > I'm one of the silent 5 or 6. My reasons fall under "None of the > above." They are three in number: > 1. No time (the perennial excuse; next 2 weeks are quite hectic) > 2. I tried to use ndiff to compare old and new ceval.c, but > ran into some problems with that tool. (Tim, it looks > like the line endings are identical -- all '\012'.) Yes, there are a lot of changes. As a hint: windiff from VC++ does a great job here. You can see both sources in one, in a very readable colored form. > 3. Wasn't sure what to look at first > > My only suggestion would be to have an executive summary. If there > was a short README file -- no more than 150 lines -- that described > the essentials of the approach and told me what to look at first, I > would be able to comment more quickly. Thanks a lot. Will do this tomorrow moaning as my first task. feeling much better - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From Vladimir.Marangozov@inrialpes.fr Tue Jun 8 23:29:27 1999 From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov) Date: Wed, 9 Jun 1999 00:29:27 +0200 (DFT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <19990608124745.3136B303120@snelboot.oratrix.nl> from "Jack Jansen" at "Jun 8, 99 02:47:44 pm" Message-ID: <199906082229.AAA48646@pukapuka.inrialpes.fr> Jack Jansen wrote: > > NSPR looks rather promising! Does anyone has any experiences with it? What I'd > also be interested in is experiences in how it interacts with the "real" I/O > system, i.e. can you mix and match NSPR calls with normal os calls, or will > that break things? I've looked at it in the past. From memory, NSPR is a fairly big chunk of code and it seemed to me that it's self contained for lots of system stuff. Don't know about I/O, but I played with it to replace the BSD malloc it uses with pymalloc and I was pleased to see the resulting speed & mem stats after rebuilding one of the past Mozilla distribs. This is all the experience I have with it. > > The latter is important for Python, because there are lots of external > libraries, and while some are user-built (image libraries, gdbm, etc) and > could conceivably be converted to use NSPR others are not... I guess that this one would be hard... -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From Vladimir.Marangozov@inrialpes.fr Tue Jun 8 23:45:48 1999 From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov) Date: Wed, 9 Jun 1999 00:45:48 +0200 (DFT) Subject: [Python-Dev] Stackless Preview In-Reply-To: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> from "Jeremy Hylton" at "Jun 8, 99 04:09:15 pm" Message-ID: <199906082245.AAA48828@pukapuka.inrialpes.fr> Jeremy Hylton wrote: > > CT> If it's none of the three above, I'd be happy to get a hint if I > CT> should continue, or if and what I should change. > > I'm one of the silent 5 or 6. My reasons fall under "None of the > above." They are three in number: > ... > My only suggestion would be to have an executive summary. If there > was a short README file -- no more than 150 lines -- that described > the essentials of the approach and told me what to look at first, I > would be able to comment more quickly. Same here + a small wish: please save me the stripping of the ^M line endings typical for MSW, so that I can load the files directly in Xemacs on a Unix box. Otherwise, like Jeremy, I was a bit lost trying to read ceval.c which is already too hairy. -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From tim_one@email.msn.com Wed Jun 9 03:27:37 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 22:27:37 -0400 Subject: [Python-Dev] Stackless Preview In-Reply-To: <199906082245.AAA48828@pukapuka.inrialpes.fr> Message-ID: <000d01beb21f$a3daac20$2fa22299@tim> [Vladimir Marangozov] > ... > please save me the stripping of the ^M line endings typical for MSW, > so that I can load the files directly in Xemacs on a Unix box. Vlad, get linefix.py from Python FTP contrib's System area; converts among Unix, Windows and Mac line conventions; to Unix by default. 
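If that script isn't handy, the to-Unix case amounts to a couple of string replacements; a rough, hypothetical stand-in (1.5.2-style code, not the real linefix.py) looks like this:

import string
import sys

def to_unix(name):
    data = open(name, 'rb').read()
    data = string.replace(data, '\r\n', '\n')   # DOS/Windows -> Unix
    data = string.replace(data, '\r', '\n')     # old-style Mac -> Unix
    open(name, 'wb').write(data)

if __name__ == '__main__':
    for name in sys.argv[1:]:
        to_unix(name)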
For that matter, do a global replace of ^M in Emacs . buncha-lazy-whiners-ly y'rs - tim From tim_one@email.msn.com Wed Jun 9 03:27:35 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 22:27:35 -0400 Subject: [Python-Dev] Stackless Preview In-Reply-To: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> Message-ID: <000c01beb21f$a2bd5540$2fa22299@tim> [Christian Tismer] > ... > If it's none of the three above, I'd be happy to get a hint if I > should continue, or if and what I should change. Sorry, Chris! Just a case of "no time" here. Of *course* you should continue, and Guido should pop in with an encouraging word too -- or a "forget it". I think this design opens the doors to a world of interesting ideas, but that's based on informed prejudice rather than careful study of your code. Cheer up: if everyone thought you were a lame ass, we all would have studied your code intensely by now . [Jeremy] > 2. I tried to use ndiff to compare old and new ceval.c, but > ran into some problems with that tool. (Tim, it looks > like the line endings are identical -- all '\012'.) Then let's treat this like a real bug : which version of Python did you use? And ship me the files in a tarball (I'll find a way to extract them intact). And does that specific Python+ndiff combo work OK on *other* files? Or does it fail to find any lines in common no matter what you feed it (a 1-line test case would be a real help )? I couldn't provoke a problem with the stock 1.5.2 ndiff under the stock 1.5.2 Windows Python, using the then-current CVS snapshot of ceval.c as file1 and the ceval.c from Christian's stackless_990606.zip file as file2. Both files have \r\n line endings for me, though (one thanks to CVS line translation, and the other thanks to WinZip line translation). or-were-you-running-ndiff-under-the-stackless-python?-ly y'rs - tim From tim_one@email.msn.com Wed Jun 9 03:27:40 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 22:27:40 -0400 Subject: [Python-Dev] licensing In-Reply-To: <001101beb18b$6a049bd0$0801a8c0@bobcat> Message-ID: <000f01beb21f$a5e2ff40$2fa22299@tim> [Mark Hammond] > ... > [This may help explain to Guido and Tim my recent fervour in this area > - I want to get the "new look" Pythonwin ready for the book. I just > yesterday got a dockable interactive window happening. Now adding a > splitter window to each window to expose a pyclbr based tree control and > then it is time to stop (and re-write that chapter :-] All right! Do get the latest CVS versions of these files: pyclbr has been sped up a lot over the past two days, and is much less likely to get baffled now. And AutoIndent.py now defaults usetabs to 1 (which, of course, means it still uses spaces in new files ). From guido@CNRI.Reston.VA.US Wed Jun 9 04:31:11 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 08 Jun 1999 23:31:11 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: Your message of "Tue, 08 Jun 1999 22:27:35 EDT." <000c01beb21f$a2bd5540$2fa22299@tim> References: <000c01beb21f$a2bd5540$2fa22299@tim> Message-ID: <199906090331.XAA23066@eric.cnri.reston.va.us> Tim wrote: > Sorry, Chris! Just a case of "no time" here. Of *course* you > should continue, and Guido should pop in with an encouraging word > too -- or a "forget it". I think this design opens the doors to a > world of interesting ideas, but that's based on informed prejudice > rather than careful study of your code. 
Cheer up: if everyone > thought you were a lame ass, we all would have studied your code > intensely by now . No time here either... I did try to have a quick peek and my first impression is that it's *very* tricky code! You know what I think of that... Here's what I think we should do first (I've mentioned this before but nobody cheered me on :-). I'd like to see this as the basis for 1.6. We should structurally split the Python Virtual Machine and related code up into different parts -- both at the source code level and at the runtime level. The core PVM becomes a replaceable component, and so do a few other parts like the parser, the bytecode compiler, the import code, and the interactive read-eval-print loop. Most object implementations are shared between all -- or at least the interfaces are interchangeable. Clearly, a few object types are specific to one or another PVM (e.g. frames). The collection of builtins is also a separate component (though some builtins may again be specific to a PVM -- details, details!). The goal of course, is to create a market for 3rd party components here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's importer, and so on. Thoughts? --Guido van Rossum (home page: http://www.python.org/~guido/) From da@ski.org Wed Jun 9 04:37:36 1999 From: da@ski.org (David Ascher) Date: Tue, 8 Jun 1999 20:37:36 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Splitting up the PVM In-Reply-To: <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: On Tue, 8 Jun 1999, Guido van Rossum wrote: > We should structurally split the Python Virtual Machine and related > code up into different parts -- both at the source code level and at > the runtime level. The core PVM becomes a replaceable component, and > so do a few other parts like the parser, the bytecode compiler, the > import code, and the interactive read-eval-print loop. > The goal of course, is to create a market for 3rd party components > here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's > importer, and so on. > > Thoughts? If I understand it correctly, it means that I can fit in a third-party read-eval-print loop, which is my biggest area of frustration with the current internal structure. Sounds like a plan to me, and one which (lucky for me) I'm not qualified for! --david From skip@mojam.com (Skip Montanaro) Wed Jun 9 04:45:33 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 8 Jun 1999 23:45:33 -0400 (EDT) Subject: [Python-Dev] Stackless Preview In-Reply-To: <375D70F1.37007192@appliedbiometrics.com> References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> <375D70F1.37007192@appliedbiometrics.com> Message-ID: <14173.58054.869171.927699@cm-24-29-94-19.nycap.rr.com> Chris> If it's none of the three above, I'd be happy to get a hint if I Chris> should continue, or if and what I should change. Chris, My vote is for you to keep at it. I haven't looked at it because I have absolutely zero free time available. This will probably continue until at least the end of July, perhaps until Labor Day. Big doings at Musi-Cal and in the Montanaro household (look for an area code change in a month or so). 
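On the replaceable read-eval-print loop mentioned above: even without the proposed split, the code module that already ships with 1.5.2 lets you host your own loop, which is roughly the hook IDLE uses. A minimal, purely illustrative sketch -- MyConsole and the prompt string are made up:

import code

class MyConsole(code.InteractiveConsole):
    # raw_input() is the hook a GUI shell (IDLE, Pythonwin, ...) replaces
    def raw_input(self, prompt=''):
        return code.InteractiveConsole.raw_input(self, 'py> ')

if __name__ == '__main__':
    MyConsole().interact('a third-party read-eval-print loop')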
Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip@mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From tismer@appliedbiometrics.com Wed Jun 9 13:58:40 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 09 Jun 1999 14:58:40 +0200 Subject: [Python-Dev] Splitting up the PVM References: <000c01beb21f$a2bd5540$2fa22299@tim> <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: <375E6500.307EF39E@appliedbiometrics.com> Guido van Rossum wrote: > > Tim wrote: > > > Sorry, Chris! Just a case of "no time" here. Of *course* you > > should continue, and Guido should pop in with an encouraging word > > too -- or a "forget it". I think this design opens the doors to a > > world of interesting ideas, but that's based on informed prejudice > > rather than careful study of your code. Cheer up: if everyone > > thought you were a lame ass, we all would have studied your code > > intensely by now . > > No time here either... > > I did try to have a quick peek and my first impression is that it's > *very* tricky code! You know what I think of that... Thanks for looking into it, thanks for saying it's tricky. Since I failed to supply proper documentation yet, this impression must come up. But it is really not true. The code is not tricky but just straightforward and consequent, after one has understood what it means to work without a stack, under the precondition to avoid too much changes. I didn't want to rewrite the world, and I just added the tiny missing bits. I will write up my documentation now, and you will understand what the difficulties were. These will not vanish, "stackless" is a brainteaser. My problem was not how to change the code, but finally it was how to change my brain. Now everything is just obvious. > Here's what I think we should do first (I've mentioned this before but > nobody cheered me on :-). I'd like to see this as the basis for 1.6. > > We should structurally split the Python Virtual Machine and related > code up into different parts -- both at the source code level and at > the runtime level. The core PVM becomes a replaceable component, and > so do a few other parts like the parser, the bytecode compiler, the > import code, and the interactive read-eval-print loop. Most object > implementations are shared between all -- or at least the interfaces > are interchangeable. Clearly, a few object types are specific to one > or another PVM (e.g. frames). The collection of builtins is also a > separate component (though some builtins may again be specific to a > PVM -- details, details!). Good idea, and a lot of work. Having different frames for different PVM's was too much for me. Instead, I tried to adjust frames in a way where a lot of machines can work with. I tried to show the concept of having different VM's by implementing a stackless map. Stackless map is a very tiny one which uses frames again (and yes, this was really hacked). Well, different frame flavors would make sense, perhaps. But I have a central routine which handles all calls to frames, and this is what I think is needed. I already *have* pluggable interpreters here, since a function can produce a frame which is bound to an interpreter, and push it to the frame stack. > The goal of course, is to create a market for 3rd party components > here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's > importer, and so on. I'm with that component goal, of course. Much work, not for one persone, but great. 
While I don't think it makes sense to make a flat PVM pluggable. I would start with a flat PVM, since that opens a world of possibilities. You can hardly plug flatness in after you started with a wrong stack layout. Vice versa, plugging the old machine would be possible. later - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer@appliedbiometrics.com Wed Jun 9 14:08:38 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 09 Jun 1999 15:08:38 +0200 Subject: [Python-Dev] Stackless Preview References: <000c01beb21f$a2bd5540$2fa22299@tim> Message-ID: <375E6756.370BA78E@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer] > > ... > > If it's none of the three above, I'd be happy to get a hint if I > > should continue, or if and what I should change. > > Sorry, Chris! Just a case of "no time" here. Of *course* you should > continue, and Guido should pop in with an encouraging word too -- or a > "forget it". Yup, I know this time problem just too good. Well, I think I got something in between. I was warned before, so I didn't try to write final code, but I managed to prove the concept. I *will* continue, regardless what anybody says. > or-were-you-running-ndiff-under-the-stackless-python?-ly y'rs - tim I didn't use ndiff, but regular "diff", and it worked. But since theere is not much change to the code, but some significant change to the control flow, I found the diff output too confusing. Windiff was always open when I wrote that, to be sure that I didn't trample on things which I didn't want to mess up. A good tool! ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Jun 9 15:48:34 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Wed, 9 Jun 1999 10:48:34 -0400 (EDT) Subject: [Python-Dev] Stackless Preview References: <199906082245.AAA48828@pukapuka.inrialpes.fr> <000d01beb21f$a3daac20$2fa22299@tim> Message-ID: <14174.32450.29368.914458@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Vlad, get linefix.py from Python FTP contrib's System area; TP> converts among Unix, Windows and Mac line conventions; to Unix TP> by default. For that matter, do a global replace of ^M in TP> Emacs . I forgot to follow up to Vlad's original message, but in XEmacs (dunno about FSFmacs), you can visit DOS-eol files without seeing the ^M's. You will see a "DOS" in the modeline, and when you go to write the file it'll ask you if you want to write it in "plain text". 
I use XEmacs all the time to convert between DOS-eol and eol-The-Way-God-Intended :) To enable this, add the following to your .emacs file: (require 'crypt) -Barry From tismer@appliedbiometrics.com Wed Jun 9 18:58:52 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Wed, 09 Jun 1999 19:58:52 +0200 Subject: [Python-Dev] First Draft on Stackless Python References: <199906082245.AAA48828@pukapuka.inrialpes.fr> <000d01beb21f$a3daac20$2fa22299@tim> <14174.32450.29368.914458@anthem.cnri.reston.va.us> Message-ID: <375EAB5C.138D32CF@appliedbiometrics.com> Howdy, I've begun with a first draft on Stackless Python. Didn't have enough time to finish it, but something might already be useful. (Should I better drop the fish idea?) Will write the rest tomorrow. ciao - chris http://www.pns.cc/stackless/stackless.htm -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one@email.msn.com Thu Jun 10 06:25:11 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 01:25:11 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: <375E6500.307EF39E@appliedbiometrics.com> Message-ID: <001401beb301$9cf20b00$af9e2299@tim> [Christian Tismer, replying to Guido's enthusiasm ] > Thanks for looking into it, thanks for saying it's tricky. > Since I failed to supply proper documentation yet, this > impression must come up. > > But it is really not true. The code is not tricky > but just straightforward and consequent, after one has understood > what it means to work without a stack, under the precondition > to avoid too much changes. I didn't want to rewrite > the world, and I just added the tiny missing bits. > > I will write up my documentation now, and you will > understand what the difficulties were. These will not > vanish, "stackless" is a brainteaser. My problem was not how > to change the code, but finally it was how to change > my brain. Now everything is just obvious. FWIW, I believe you! There's something *inherently* tricky about maintaining the effect of a stack without using the stack C supplies implicitly, and from all you've said and what I've learned of your code, it really isn't the code that's tricky here. You're making formerly-hidden connections explicit, which means more stuff is visible, but also means more power and flexibility *because* "more stuff is visible". Agree too that this clearly moves in the direction of making the VM pluggable. > ... > I *will* continue, regardless what anybody says. Ah, if that's how this works, then STOP! Immediately! Don't you dare waste more of our time with this crap . want-some-money?-ly y'rs - tim From tim_one@email.msn.com Thu Jun 10 06:44:50 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 01:44:50 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: <001701beb304$5b8a8b80$af9e2299@tim> [Guido van Rossum] > ... > Here's what I think we should do first (I've mentioned this before but > nobody cheered me on :-). I'd like to see this as the basis for 1.6. > > We should structurally split the Python Virtual Machine and related > code up into different parts -- both at the source code level and at > the runtime level. 
The core PVM becomes a replaceable component, and > so do a few other parts like the parser, the bytecode compiler, the > import code, and the interactive read-eval-print loop. Most object > implementations are shared between all -- or at least the interfaces > are interchangeable. Clearly, a few object types are specific to one > or another PVM (e.g. frames). The collection of builtins is also a > separate component (though some builtins may again be specific to a > PVM -- details, details!). > > The goal of course, is to create a market for 3rd party components > here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's > importer, and so on. > > Thoughts? The idea of major subsystems getting reworked to conform to well-defined and well-controlled interfaces is certainly appealing. I'm just more comfortable squeezing another 1.7% out of list.sort() <0.9 wink>. trying-to-reduce-my-ambitions-to-match-my-time-ly y'rs - tim From jack@oratrix.nl Thu Jun 10 09:49:31 1999 From: jack@oratrix.nl (Jack Jansen) Date: Thu, 10 Jun 1999 10:49:31 +0200 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: Message by Guido van Rossum , Tue, 08 Jun 1999 23:31:11 -0400 , <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: <19990610084931.55882303120@snelboot.oratrix.nl> > Here's what I think we should do first (I've mentioned this before but > nobody cheered me on :-). Go, Guido, GO!!!! What I'd like in the split you propose is to see which of the items would be implementable in Python, and try to do the split in such a way that such a Python implementation isn't ruled out. Am I correct in guessing that after factoring out the components you mention the only things that aren't in a "replaceable component" are the builtin objects, and a little runtime glue (malloc and such)? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From tismer@appliedbiometrics.com Thu Jun 10 13:16:20 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Thu, 10 Jun 1999 14:16:20 +0200 Subject: [Python-Dev] Splitting up the PVM References: <001401beb301$9cf20b00$af9e2299@tim> Message-ID: <375FAC94.D17D43A7@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer, replying to Guido's enthusiasm ] ... > > I will write up my documentation now, and you will still under some work :) > > understand what the difficulties were. These will not > > vanish, "stackless" is a brainteaser. My problem was not how > > to change the code, but finally it was how to change > > my brain. Now everything is just obvious. > > FWIW, I believe you! There's something *inherently* tricky about > maintaining the effect of a stack without using the stack C supplies > implicitly, and from all you've said and what I've learned of your code, it > really isn't the code that's tricky here. You're making formerly-hidden > connections explicit, which means more stuff is visible, but also means more > power and flexibility *because* "more stuff is visible". I knew you would understand me. Feeling much, much better now :-)) After this is finalized, restartable exceptions might be interesting to explore. No, Chris, do the doco... > > I *will* continue, regardless what anybody says. > > Ah, if that's how this works, then STOP! Immediately! Don't you dare waste > more of our time with this crap . Thanks, you fired me a continuation. 
Here the way to get me into an endless loop: Give me an unsolvable problem and claim I can't do that. :) (just realized that I'm just another pluggable interpreter) > want-some-money?-ly y'rs - tim No, but meet you at least once in my life. -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From arw@ifu.net Thu Jun 10 14:40:51 1999 From: arw@ifu.net (Aaron Watters) Date: Thu, 10 Jun 1999 09:40:51 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] Message-ID: <375FC062.62850DE5@ifu.net> While we're talking about stacks... I've always considered it a major shame that Python ints and floats and chars and stuff have anything to do with dynamic allocation, and I always suspected it might be a major speed performance boost if there was some way they could be manipulated without the need for dynamic memory management. One conceivable alternative approach would change the basic manipulation of objects so that instead of representing objects via pyobject pointers everywhere represent them using two "slots" in a structure for each object, one of which is a type descriptor pointer and the other being a (void *) which could contain the data directly for small objects such as ints, floats, chars. In this case, for example, integer addition would never require any memory management, as it shouldn't, I think, in a perfect world. IE instead of C-stack or static: Heap: (pyobject *) ------------> (refcount, typedescr, data ...) in general you get (typedescr repr* ----------------------> (refcount, data, ...) ) or for small objects like ints and floats and chars simply (typedescr, value) with no dereferencing or memory management required. My feeling is that common things like arithmetic and indexing lists of integers and stuff could be much faster under this approach since it reduces memory management overhead and fragmentation, dereferencing, etc... One bad thing, of course, is that this might be a drastic assault on the way existing code works... Unless I'm just not being creative enough with my thinking. Is this a good idea? If so, is there any way to add it to the interpreter without breaking extension modules and everything else? If Python 2.0 will break stuff anyway, would this be an good change to the internals? Curious... -- Aaron Watters ps: I suppose another gotcha is "when do you do increfs/decrefs?" because they no longer make sense for ints in this case... maybe add a flag to the type descriptor "increfable" and assume that the typedescriptors are always in the CPU cache (?). This would slow down increfs by a couple cycles... Would it be worth it? Only the benchmark knows... Another fix would be to put the refcount in the static side with no speed penalty (typedescr repr* ----------------------> data refcount ) but would that be wasteful of space? From guido@CNRI.Reston.VA.US Thu Jun 10 14:45:51 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 10 Jun 1999 09:45:51 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: Your message of "Thu, 10 Jun 1999 10:49:31 +0200." 
<19990610084931.55882303120@snelboot.oratrix.nl> References: <19990610084931.55882303120@snelboot.oratrix.nl> Message-ID: <199906101345.JAA29917@eric.cnri.reston.va.us> [me] > > Here's what I think we should do first (I've mentioned this before but > > nobody cheered me on :-). [Jack] > Go, Guido, GO!!!! > > What I'd like in the split you propose is to see which of the items would be > implementable in Python, and try to do the split in such a way that such a > Python implementation isn't ruled out. Indeed. The importing code and the read-eval-print loop are obvious candidates (in fact IDLE shows how the latter can be done today). I'm not sure if it makes sense to have a parser/compiler or the VM written in Python, because of the expected slowdown (plus, the VM would present a chicken-egg problem :-) although for certain purposes one might want to do this. An optimizing pass would certainly be a good candidate. > Am I correct in guessing that after factoring out the components you mention > the only things that aren't in a "replaceable component" are the builtin > objects, and a little runtime glue (malloc and such)? I guess (although how much exactly will only become clear when it's done). I guess that things like thread-safety and GC policy are also pervasive. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Thu Jun 10 15:11:23 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 10 Jun 1999 10:11:23 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: Your message of "Thu, 10 Jun 1999 09:40:51 EDT." <375FC062.62850DE5@ifu.net> References: <375FC062.62850DE5@ifu.net> Message-ID: <199906101411.KAA29962@eric.cnri.reston.va.us> [Aaron] > I've always considered it a major shame that Python ints and floats > and chars and stuff have anything to do with dynamic allocation, and > I always suspected it might be a major speed performance boost if > there was some way they could be manipulated without the need for > dynamic memory management. What you're describing is very close to what I recall I once read about the runtime organization of Icon. Perl may also use a variant on this (it has fixed-length object headers). On the other hand, I believe Smalltalks typically uses something like the following ABC trick: In ABC, we used a variation: objects were represented by pointers as in Python, except when the low bit was 1, in which case the remaining 31 bits were a "small int". My experience with this approach was that it probably saved some memory, but perhaps not time (since almost all operations on objects were slowed down by the check "is it an int?" before the pointer could be accessed); and that because of this it was a major hassle in keeping the implementation code correct. There was always the temptation to make a check early in a piece of code and then skip the check later on, which sometimes didn't work when objects switched places. Plus in general the checks made the code less readable, and it was just one more thing to remember to do. The Icon approach (i.e. yours) seems to require a complete rethinking of all object implementations and all APIs at the C level -- perhaps we could think about it for Python 2.0. Some ramifications: - Uses more memory for highly shared objects (there are as many copies of the type pointer as there are references). - Thus, lists take double the memory assuming they reference objects that also exist elsewhere. This affects the performance of slices etc. 
- On the other hand, a list of ints takes half the memory (given that most of those ints are not shared). - *Homogeneous* lists (where all elements have the same type -- i.e. arrays) can be represented more efficiently by having only one copy of the type pointer. This was an idea for ABC (whose type system required all container types to be homogenous) that was never implemented (because in practice the type check wasn't always applied, and the top-level namespace used by the interactive command interpreter violated all the rules). - Reference count manipulations could be done by a macro (or C++ behind-the-scense magic using copy constructors and destructors) that calls a function in the type object -- i.e. each object could decide on its own reference counting implementation :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Thu Jun 10 19:02:30 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Thu, 10 Jun 1999 14:02:30 -0400 (EDT) Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: <375FC062.62850DE5@ifu.net> <199906101411.KAA29962@eric.cnri.reston.va.us> Message-ID: <14175.64950.720465.456133@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> In ABC, we used a variation: objects were represented by Guido> pointers as in Python, except when the low bit was 1, in Guido> which case the remaining 31 bits were a "small int". Very similar to how Emacs Lisp manages its type system, to which XEmacs extended. The following is from the XEmacs Internals documentation[1]. XEmacs' object representation (on a 32 bit machine) uses the top bit as a GC mark bit, followed by three type tag bits, followed by a pointer or an integer: [ 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 ] [ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 ] ^ <---> <------------------------------------------------------> | tag a pointer to a structure, or an integer | `---> mark bit One of the 8 possible types representable by the tag bits, one is a "record" type, which essentially allows an unlimited (well, 2^32) number of data types. As you might guess there are lots of interesting details and limitations to this scheme, with lots of interesting macros in the C code :). Reading and debugging the C implementation gets fun too (we'll ignore for the moment all the GCPRO'ing going on -- if you think INCREF/DECREF is trouble prone, hah!). Whether or not this is at all relevent for Python 2.0, it all seems to work pretty well in (X)Emacs. >>>>> "AW" == Aaron Watters writes: AW> ps: I suppose another gotcha is "when do you do AW> increfs/decrefs?" because they no longer make sense for ints AW> in this case... maybe add a flag to the type descriptor AW> "increfable" and assume that the typedescriptors are always in AW> the CPU cache (?). This would slow down increfs by a couple AW> cycles... Would it be worth it? Only the benchmark knows... AW> Another fix would be to put the refcount in the static side AW> with no speed penalty | (typedescr | repr* ----------------------> data | refcount | ) AW> but would that be wasteful of space? Once again, you can move the refcount out of the objects, a la NextStep. Could save space and improve LOC for read-only objects. -Barry [1] The Internals documentation comes with XEmacs's Info documetation. 
Hit: C-h i m Internals RET m How RET From tismer@appliedbiometrics.com Thu Jun 10 20:53:10 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Thu, 10 Jun 1999 21:53:10 +0200 Subject: [Python-Dev] Stackless Preview References: <000d01beb21f$a3daac20$2fa22299@tim> Message-ID: <376017A6.DC619723@appliedbiometrics.com> Howdy, I worked a little more on the docs and figured out that I could use a hint. http://www.pns.cc/stackless/stackless.htm Trying to give an example how coroutines could work, some weaknesses showed up. I wanted to write some function coroutine_transfer which swaps two frame chains. This function should return my unwind token, but unfortunately in that case a real result would be needed as well. Well, I know of several ways out, but it's a matter of design, and I'd like to find the most elegant solution for this. Could perhaps someone of those who encouraged me have a look into the problem? Do I have to add yet another field for return values and handle that in the dispatcher? thanks - chris (tired of thinking) -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 00:32:26 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Thu, 10 Jun 1999 19:32:26 -0400 (EDT) Subject: [Python-Dev] String methods... finally Message-ID: <14176.19210.146525.172100@anthem.cnri.reston.va.us> I've finally checked my string methods changes into the source tree, albeit on a CVS branch (see below). These changes are outgrowths of discussions we've had on the string-sig, with I think Greg Stein giving lots of very useful early feedback. I'll call these changes controversial (hence the branch) because Guido hasn't had much opportunity to play with them. Now that he -- and you -- can check them out, I'm sure I'll get lots more feedback! First, to check them out you need to switch to the string_methods CVS branch. On Un*x: cvs update -r string_methods You might want to do this in a separate tree because this will sticky tag your tree to this branch. If so, try cvs checkout -r string_methods python Here's a brief summary of the changes (as best I can restore the state -- its been a while since I actually made all these changes ;) Strings now have as methods most of the functions that were previously only in the string module. If you've played with JPython, you've already had this feature for a while. So you can do: Python 1.5.2+ (#1, Jun 10 1999, 18:22:14) [GCC 2.8.1] on sunos5 Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam >>> s = 'Hello There Devheads' >>> s.lower() 'hello there devheads' >>> s.upper() 'HELLO THERE DEVHEADS' >>> s.split() ['Hello', 'There', 'Devheads'] >>> 'hello'.upper() 'HELLO' that sort of thing. Some of the string module functions don't make sense as string methods, like join, and others never had a C implementation so weren't added, like center. Two new methods startswith and endswith act like their Java cousins. The string module has been rewritten to be completely (I hope) backwards compatible. No code should break, though they could be slower. Guido and I decided that was acceptable. What else? Some cleaning up of the internals based on Greg's suggestions. A couple of new C API additions. 
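To make the two new methods concrete, a quick illustrative session (on the string_methods branch; in 1.5.2-era Python the true/false results come back as 1 and 0):

    >>> s = 'Hello There Devheads'
    >>> s.startswith('Hello')
    1
    >>> s.endswith('heads')
    1
    >>> s.startswith('hello')
    0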
Builtin int(), long(), and float() have grown a few new features. I believe they are essentially interchangable with string.atoi(), string.atol(), and string.float() now. After you guys get to toast me (in either sense of the word) for a while and these changes settle down, I'll make a wider announcement. Enjoy, -Barry From da@ski.org Fri Jun 11 00:37:54 1999 From: da@ski.org (David Ascher) Date: Thu, 10 Jun 1999 16:37:54 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: On Thu, 10 Jun 1999, Barry A. Warsaw wrote: > I've finally checked my string methods changes into the source tree, Great! > ... others never had a C implementation so weren't added, like center. I assume that's not a design decision but a "haven't gotten around to it yet" statement, right? > Two new methods startswith and endswith act like their Java cousins. aaaah... . --david From MHammond@skippinet.com.au Fri Jun 11 00:59:17 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Fri, 11 Jun 1999 09:59:17 +1000 Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <003101beb39d$41b1c7c0$0801a8c0@bobcat> > I've finally checked my string methods changes into the source tree, > albeit on a CVS branch (see below). These changes are outgrowths of Yay! Would this also be a good opportunity to dust-off the Unicode implementation the string-sig recently came up with (as implemented by Fredrik) and get this in as a type? Although we still have the unresolved issue of how to use PyArg_ParseTuple etc to convert to/from Unicode and 8bit, it would still be nice to have Unicode and String objects capable of being used interchangably at the Python level. Of course, the big problem with attempting to test out these sorts of changes is that you must do so in code that will never see the public for a good 12 months. I suppose a 1.5.25 is out of the question ;-) Mark. From guido@CNRI.Reston.VA.US Fri Jun 11 02:40:07 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 10 Jun 1999 21:40:07 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Fri, 11 Jun 1999 09:59:17 +1000." <003101beb39d$41b1c7c0$0801a8c0@bobcat> References: <003101beb39d$41b1c7c0$0801a8c0@bobcat> Message-ID: <199906110140.VAA02180@eric.cnri.reston.va.us> > Would this also be a good opportunity to dust-off the Unicode > implementation the string-sig recently came up with (as implemented by > Fredrik) and get this in as a type? > > Although we still have the unresolved issue of how to use PyArg_ParseTuple > etc to convert to/from Unicode and 8bit, it would still be nice to have > Unicode and String objects capable of being used interchangably at the > Python level. Yes, yes, yes! Even if it's not supported everywhere, at least having the Unicode type in the source tree would definitely help! > Of course, the big problem with attempting to test out these sorts of > changes is that you must do so in code that will never see the public for a > good 12 months. I suppose a 1.5.25 is out of the question ;-) We'll see about that... (I sometimes wished I wasn't in the business of making releases. I've asked for help with making essential patches to 1.5.2 available but nobody volunteered... 
:-( ) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one@email.msn.com Fri Jun 11 04:08:28 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 23:08:28 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <14175.64950.720465.456133@anthem.cnri.reston.va.us> Message-ID: <000a01beb3b7$adda3b20$329e2299@tim> Jumping in to opine that mixing tag/type bits with native pointers is a Really Bad Idea. Put the bits on the low end and word-addressed machines are screwed. Put the bits on the high end and you've made severe assumptions about how the platform parcels out address space. In any case you're stuck with ugly macros everywhere. This technique was pioneered by Lisps, and was beautifully exploited by the Symbolics Lisp Machine and TI Lisp Explorer hardware. Lisp people don't want to admit those failed, so continue simulating the HW design by hand at comparatively sluggish C speed <0.6 wink>. BTW, I've never heard this approach argued as a speed optimization (except in the HW implementations): software mask-test-branch around every inc/dec-ref to exempt ints is a nasty new repeated expense. The original motivation was to save space, and that back in the days when a 128Mb RAM chip wasn't even conceivable, let alone under $100 . once-wrote-a-functional-language-interpreter-in-8085-assembler-that-ran- in-24Kb-cuz-that's-all-there-was-but-don't-feel-i-need-to-repeat-the- experience-today-wink>-ly y'rs - tim From bwarsaw@python.org Fri Jun 11 04:13:29 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Thu, 10 Jun 1999 23:13:29 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <14176.32473.408675.992145@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: >> ... others never had a C implementation so weren't added, like >> center. DA> I assume that's not a design decision but a "haven't gotten DA> around to it yet" statement, right? I think we decided that they weren't used enough to implement in C. >> Two new methods startswith and endswith act like their Java >> cousins. DA> aaaah... . Tell me about it! -Barry From tim_one@email.msn.com Fri Jun 11 04:33:25 1999 From: tim_one@email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 23:33:25 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <000b01beb3bb$29ccdaa0$329e2299@tim> > Two new methods startswith and endswith act like their Java cousins. Barry, suggest that both of these grow optional start and end slice indices. Why? It's Pythonic . Really, I'm forever marching over huge strings a slice-pair at a time, and it's important that searches and matches never give me false hits due to slobbering over the current slice bounds. regexp objects in general, and string.find/.rfind in particular, support this beautifully. Java feels less need since sub-stringing is via cheap descriptor there. The optional indices wouldn't hurt Java, but would help Python. then-again-if-strings-were-so-great-i'd-switch-to-tcl-ly y'rs - tim From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 04:41:55 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. 
Warsaw) Date: Thu, 10 Jun 1999 23:41:55 -0400 (EDT) Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: <14175.64950.720465.456133@anthem.cnri.reston.va.us> <000a01beb3b7$adda3b20$329e2299@tim> Message-ID: <14176.34179.125397.282079@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Jumping in to opine that mixing tag/type bits with native TP> pointers is a Really Bad Idea. Put the bits on the low end TP> and word-addressed machines are screwed. Put the bits on the TP> high end and you've made severe assumptions about how the TP> platform parcels out address space. In any case you're stuck TP> with ugly macros everywhere. Ah, so you /have/ read the Emacs source code! I'll agree that it's just an RBI for Emacs, but for Python, it'd be a RFSI. TP> This technique was pioneered by Lisps, and was beautifully TP> exploited by the Symbolics Lisp Machine and TI Lisp Explorer TP> hardware. Lisp people don't want to admit those failed, so TP> continue simulating the HW design by hand at comparatively TP> sluggish C speed <0.6 wink>. But of course, the ghosts live on at the FSF and xemacs.org (couldn't tell ya much about how modren Lisps do it). -Barry From skip@mojam.com (Skip Montanaro) Fri Jun 11 05:26:49 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 11 Jun 1999 00:26:49 -0400 (EDT) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> Barry> Some of the string module functions don't make sense as string Barry> methods, like join, and others never had a C implementation so Barry> weren't added, like center. I take it string.capwords falls into that category. It's one of those things that's so easy to write in Python and there's no real speed gain in going to C, that it didn't make much sense to add it to the strop module, right? I see the following functions in string.py that could reasonably be methodized: ljust, rjust, center, expandtabs, capwords That's not very many, and it would appear that this stuff won't see widespread use for quite some time. I think for completeness sake we should bite the bullet on them. BTW, I built it and think it is very cool. Tipping my virtual hat to Barry, I am... Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip@mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From skip@mojam.com (Skip Montanaro) Fri Jun 11 05:57:15 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 11 Jun 1999 00:57:15 -0400 (EDT) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> Message-ID: <14176.38521.124491.987817@cm-24-29-94-19.nycap.rr.com> Skip> I see the following functions in string.py that could reasonably be Skip> methodized: Skip> ljust, rjust, center, expandtabs, capwords It occurred to me just a few minutes after sending my previous message that it might make sense to make string.join a method for lists and tuples. They'd obviously have to make the same type checks that string.join does. That would leave the string/strip modules implementing just a couple functions. 
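To give a feel for how little would be left behind, here is a rough pure-Python rendering of capwords (illustrative only, not the string module's exact code):

    import string

    def capwords(s):
        # split on whitespace, capitalize each word, rejoin with single spaces
        return string.join(map(string.capitalize, string.split(s)), ' ')

    print capwords('the quick brown fox')    # -> The Quick Brown Fox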
Skip From da@ski.org Fri Jun 11 06:09:46 1999 From: da@ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:09:46 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.38521.124491.987817@cm-24-29-94-19.nycap.rr.com> Message-ID: On Fri, 11 Jun 1999, Skip Montanaro wrote: > It occurred to me just a few minutes after sending my previous message that > it might make sense to make string.join a method for lists and tuples. > They'd obviously have to make the same type checks that string.join does. as in: >>> ['spam!', 'eggs!'].join() 'spam! eggs!' ? I like the notion, but I think it would naturally migrate towards genericity, at which point it might be called "reduce", so that: >>> ['spam!', 'eggs!'].reduce() 'spam!eggs!' >>> ['spam!', 'eggs!'].reduce(' ') 'spam! eggs!' >>> [1,2,3].reduce() 6 # 1 + 2 + 3 >>> [1,2,3].reduce(10) 26 # 1 + 10 + 2 + 10 + 3 note that string.join(foo) == foo.reduce(' ') and string.join(foo, '') == foo.reduce() --david From guido@CNRI.Reston.VA.US Fri Jun 11 06:16:29 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 01:16:29 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Thu, 10 Jun 1999 22:09:46 PDT." References: Message-ID: <199906110516.BAA02520@eric.cnri.reston.va.us> > On Fri, 11 Jun 1999, Skip Montanaro wrote: > > > It occurred to me just a few minutes after sending my previous message that > > it might make sense to make string.join a method for lists and tuples. > > They'd obviously have to make the same type checks that string.join does. > > as in: > > >>> ['spam!', 'eggs!'].join() > 'spam! eggs!' Note that this is not as powerful as string.join(); the latter works on any sequence, not just on lists and tuples. (Though that may not be a big deal.) I also find it slightly objectionable that this is a general list method but only works if the list contains only strings; Dave Ascher's generalization to reduce() is cute but strikes me are more general than useful, and the name will forever present a mystery to most newcomers. Perhaps join() ought to be a built-in function? --Guido van Rossum (home page: http://www.python.org/~guido/) From da@ski.org Fri Jun 11 06:23:06 1999 From: da@ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:23:06 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Guido van Rossum wrote: > Perhaps join() ought to be a built-in function? Would it do the moral equivalent of a reduce(operator.add, ...) or of a string.join? I think it should do the former (otherwise something about 'string' should be in the name), and as a consequence I think it shouldn't have the default whitespace spacer. cute-but-general'ly y'rs, david From da@ski.org Fri Jun 11 06:35:42 1999 From: da@ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:35:42 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Aside: apply syntax Message-ID: I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core to allow * and ** in function calls, so that: class SubFoo(Foo): def __init__(self, *args, **kw): apply(Foo, (self, ) + args, kw) ... could be written class SubFoo(Foo): def __init__(self, *args, **kw): Foo(self, *args, **kw) ... I really like this notion, but before I poke around trying to see if it's doable, I'd like to get feedback on whether y'all think it's a good idea or not. 
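For concreteness, the equivalence being proposed, with today's apply() spelling as the runnable half (the starred call is only the proposed syntax, not something current Python accepts):

    def f(a, b, c=0, d=0):
        return (a, b, c, d)

    extra = (2, 3)
    kw = {'d': 4}

    print apply(f, (1,) + extra, kw)    # -> (1, 2, 3, 4)
    # proposed spelling: f(1, *extra, **kw)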
And if someone else wants to do it, feel free -- I am of course swamped, and I won't get to it until after rich comparisons. FWIW, apply() is one of my least favorite builtins, aesthetically speaking. --david From da@ski.org Fri Jun 11 06:36:30 1999 From: da@ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:36:30 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Re: Aside: apply syntax In-Reply-To: Message-ID: On Thu, 10 Jun 1999, David Ascher wrote: > class SubFoo(Foo): > def __init__(self, *args, **kw): > apply(Foo, (self, ) + args, kw) > ... > > could be written > > class SubFoo(Foo): > def __init__(self, *args, **kw): > Foo(self, *args, **kw) Of course I meant Foo.__init__ in both of the above! --david From skip@mojam.com (Skip Montanaro) Fri Jun 11 08:07:09 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 11 Jun 1999 03:07:09 -0400 (EDT) Subject: [Python-Dev] String methods... finally In-Reply-To: References: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: <14176.45761.801671.880774@cm-24-29-94-19.nycap.rr.com> David> I think it should do the former (otherwise something about David> 'string' should be in the name), and as a consequence I think it David> shouldn't have the default whitespace spacer. Perhaps "joinstrings" would be an appropriate name (though it seems gratuitously long) or join should call str() on non-string elements. My thought here is that we have left in the string module a couple functions that ought to be string object methods but aren't yet mostly for convenience or time constraints, and one (join) that is 99.9% of the time used on lists or tuples of strings. That leaves a very small handful of methods that don't naturally fit somewhere else. You can, of course, complete the picture and add a join method to string objects, which would be useful to explode them into individual characters. That would complete the join-as-a-sequence-method picture I think. If you don't somebody else (and not me, cuz I'll know why already!) is bound to ask why capwords, join, ljust, etc got left behind in the string module while all the other functions got promotions to object methods. Oh, one other thing I forgot. Split (join) and splitfields (joinfields) used to be different. They've been the same for a long time now, long enough that I no longer recall how they used to differ. In making the leap from string module to string methods, I suggest dropping the long names altogether. There's no particular compatibility reason to keep them and they're not really any more descriptive than their shorter siblings. It's not like you'll be preserving backward compatibility for anyone's code by having them. However, if you release this code to the larger public, then you'll be stuck with both in perpetuity. Skip From fredrik@pythonware.com Fri Jun 11 08:06:58 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 11 Jun 1999 09:06:58 +0200 Subject: [Python-Dev] String methods... finally References: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: <008701beb3da$5e2db9d0$f29b12c2@pythonware.com> Guido wrote: > Note that this is not as powerful as string.join(); the latter works > on any sequence, not just on lists and tuples. (Though that may not > be a big deal.) 
> > I also find it slightly objectionable that this is a general list > method but only works if the list contains only strings; Dave Ascher's > generalization to reduce() is cute but strikes me are more general > than useful, and the name will forever present a mystery to most > newcomers. > > Perhaps join() ought to be a built-in function? come to think of it, the last design I came up with (inspired by a mail from you which I cannot find right now), was this: def join(sequence, sep=None): # built-in if not sequence: return "" sequence[0].__join__(sequence, sep) string.join => join and __join__ methods in the unicode and string classes. Guido? From fredrik@pythonware.com Fri Jun 11 08:03:19 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 11 Jun 1999 09:03:19 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <008601beb3da$5e0a7a60$f29b12c2@pythonware.com> Barry wrote: > Some of the string module functions don't make sense as > string methods, like join, and others never had a C > implementation so weren't added, like center. fwiw, the Unicode module available from pythonware.com implements them all, and more importantly, it can be com- piled for either 8-bit or 16-bit characters... join is a special problem; IIRC, Guido came up with what I at that time thought was an excellent solution, but I don't recall what it was right now ;-) anyway, maybe we should start by figuring out what methods we really want in there, and then figure out whether we should have one or two independent string implementations in the core... From mal@lemburg.com Fri Jun 11 09:15:33 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jun 1999 10:15:33 +0200 Subject: [Python-Dev] String methods... finally References: Message-ID: <3760C5A5.43FB1658@lemburg.com> David Ascher wrote: > > On Fri, 11 Jun 1999, Guido van Rossum wrote: > > > Perhaps join() ought to be a built-in function? > > Would it do the moral equivalent of a reduce(operator.add, ...) or of a > string.join? > > I think it should do the former (otherwise something about 'string' should > be in the name), and as a consequence I think it shouldn't have the > default whitespace spacer. AFAIK, Guido himself proposed something like this on c.l.p a few months ago. I think something like the following written in C and optimized for lists of strings might be useful: def join(sequence,sep=None): x = sequence[0] if sep: for y in sequence[1:]: x = x + sep + y else: for y in sequence[1:]: x = x + y return x >>> join(('a','b')) 'ab' >>> join(('a','b'),' ') 'a b' >>> join((1,2,3),3) 12 >>> join(((1,2),(3,))) (1, 2, 3) Also, while we're at string functions/methods. Some of the stuff in mxTextTools (see Python Pages link below) might be of general use as well, e.g. splitat(), splitlines() and charsplit(). -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 203 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@CNRI.Reston.VA.US Fri Jun 11 13:31:51 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 08:31:51 -0400 Subject: [Python-Dev] Aside: apply syntax In-Reply-To: Your message of "Thu, 10 Jun 1999 22:35:42 PDT." 
References: Message-ID: <199906111231.IAA02774@eric.cnri.reston.va.us> > I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core > to allow * and ** in function calls, so that: > > class SubFoo(Foo): > def __init__(self, *args, **kw): > apply(Foo, (self, ) + args, kw) > ... > > could be written > > class SubFoo(Foo): > def __init__(self, *args, **kw): > Foo(self, *args, **kw) > ... > > I really like this notion, but before I poke around trying to see if it's > doable, I'd like to get feedback on whether y'all think it's a good idea > or not. And if someone else wants to do it, feel free -- I am of course > swamped, and I won't get to it until after rich comparisons. > > FWIW, apply() is one of my least favorite builtins, aesthetically > speaking. I like the idea, but it would mean a major reworking of the grammar and the parser. Can I persuade you to keep this on ice until 2.0? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Fri Jun 11 13:54:30 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 11 Jun 1999 14:54:30 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <004601beb409$8c535750$f29b12c2@pythonware.com> > Two new methods startswith and endswith act like their Java cousins. is it just me, or do those method names suck? begin? starts_with? startsWith? (ouch) has_prefix? From arw@ifu.net Fri Jun 11 14:05:17 1999 From: arw@ifu.net (Aaron Watters) Date: Fri, 11 Jun 1999 09:05:17 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: <199906110342.XAA07977@python.org> Message-ID: <3761098D.A56F58A8@ifu.net> From: "Tim Peters" >Jumping in to opine that mixing tag/type bits with native pointers is a >Really Bad Idea. Put the bits on the low end and word-addressed machines >are screwed. Put the bits on the high end and you've made severe >assumptions about how the platform parcels out address space. In any case >you're stuck with ugly macros everywhere. Agreed. Never ever mess with pointers. This mistake has been made over and over again by each new generation of computer hardware and software and it's still a mistake. I thought it would be good to be able to do the following loop with Numeric arrays for x in array1: array2[x] = array3[x] + array4[x] without any memory management being involved. Right now, I think the for loop has to continually dynamically allocate each new x and intermediate sum (and immediate deallocate them) and that makes the loop piteously slow. The idea replacing pyobject *'s with a struct [typedescr *, data *] was a space/time tradeoff to speed up operations like the above by eliminating any need for mallocs or other memory management.. I really can't say whether it'd be worth it or not without some sort of real testing. Just a thought. -- Aaron Watters From mal@lemburg.com Fri Jun 11 14:11:20 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jun 1999 15:11:20 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> Message-ID: <37610AF8.3EC610FD@lemburg.com> Fredrik Lundh wrote: > > > Two new methods startswith and endswith act like their Java cousins. > > is it just me, or do those method names suck? > > begin? starts_with? startsWith? (ouch) > has_prefix? 
In mxTextTools I used the names prefix() and suffix() for much the same thing except that those functions accept a list of strings and return the (first) matching string instead of just 1 or 0. Details are available at: http://starship.skyport.net/~lemburg/mxTextTools.html -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 203 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@CNRI.Reston.VA.US Fri Jun 11 14:58:10 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 09:58:10 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Fri, 11 Jun 1999 15:11:20 +0200." <37610AF8.3EC610FD@lemburg.com> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> <37610AF8.3EC610FD@lemburg.com> Message-ID: <199906111358.JAA02836@eric.cnri.reston.va.us> > > > Two new methods startswith and endswith act like their Java cousins. > > > > is it just me, or do those method names suck? It's just you. > > begin? starts_with? startsWith? (ouch) > > has_prefix? Those are all painful to type, except "begin", which isn't expressive. > In mxTextTools I used the names prefix() and suffix() for much The problem with those is that it's arbitrary (==> harder to remember) whether A.prefix(B) means that A is a prefix of B or that A has B for a prefix. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Fri Jun 11 15:55:14 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jun 1999 16:55:14 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> <37610AF8.3EC610FD@lemburg.com> <199906111358.JAA02836@eric.cnri.reston.va.us> Message-ID: <37612352.227FCA4B@lemburg.com> Guido van Rossum wrote: > > > > > Two new methods startswith and endswith act like their Java cousins. > > > > > > is it just me, or do those method names suck? > > It's just you. > > > > begin? starts_with? startsWith? (ouch) > > > has_prefix? > > Those are all painful to type, except "begin", which isn't expressive. > > > In mxTextTools I used the names prefix() and suffix() for much > > The problem with those is that it's arbitrary (==> harder to remember) > whether A.prefix(B) means that A is a prefix of B or that A has B for > a prefix. True. These are functions in mxTextTools and take a sequence as second argument, so the order is clear there... has_prefix() has_suffix() would probably be appropriate as methods (you don't type them that often ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 203 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jack@oratrix.nl Fri Jun 11 16:55:36 1999 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 11 Jun 1999 17:55:36 +0200 Subject: [Python-Dev] Aside: apply syntax In-Reply-To: Message by Guido van Rossum , Fri, 11 Jun 1999 08:31:51 -0400 , <199906111231.IAA02774@eric.cnri.reston.va.us> Message-ID: <19990611155536.944FA303120@snelboot.oratrix.nl> > > > > class SubFoo(Foo): > > def __init__(self, *args, **kw): > > Foo(self, *args, **kw) > > ... Guido: > I like the idea, but it would mean a major reworking of the grammar > and the parser. Can I persuade you to keep this on ice until 2.0? What exactly would the semantics be? 
While I hate the apply() loops you have to jump through nowadays to get this behaviour I don't funny understand how this would work in general (as opposed to in this case). For instance, would Foo(self, 12, *args, **kw) be allowed? And Foo(self, *args, x=12, **kw) ? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From da@ski.org Fri Jun 11 17:57:37 1999 From: da@ski.org (David Ascher) Date: Fri, 11 Jun 1999 09:57:37 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Aside: apply syntax In-Reply-To: <199906111231.IAA02774@eric.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Guido van Rossum wrote: > > I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core > > to allow * and ** in function calls, so that: > > > I like the idea, but it would mean a major reworking of the grammar > and the parser. Can I persuade you to keep this on ice until 2.0? Sure. That was hard. =) From da@ski.org Fri Jun 11 18:02:49 1999 From: da@ski.org (David Ascher) Date: Fri, 11 Jun 1999 10:02:49 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Aside: apply syntax In-Reply-To: <19990611155536.944FA303120@snelboot.oratrix.nl> Message-ID: On Fri, 11 Jun 1999, Jack Jansen wrote: > What exactly would the semantics be? While I hate the apply() loops you have > to jump through nowadays to get this behaviour I don't funny understand how > this would work in general (as opposed to in this case). For instance, would > Foo(self, 12, *args, **kw) > be allowed? And > Foo(self, *args, x=12, **kw) Following the rule used for argument processing now, if it's unambiguous, it should be allowed, and not otherwise. So, IMHO, the above two should be allowed, and I suspect Foo.__init__(self, *args, *args2) could be too, but Foo.__init__(self, **kw, **kw2) should not, as dictionary addition is not allowed. However, I could live with the more restricted version as well. --david From bwarsaw@python.org Fri Jun 11 18:17:20 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:17:20 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <000b01beb3bb$29ccdaa0$329e2299@tim> Message-ID: <14177.17568.637272.328126@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: >> Two new methods startswith and endswith act like their Java >> cousins. TP> Barry, suggest that both of these grow optional start and end TP> slice indices. 'Course it'll make the Java implementations of these extra args a little more work. Right now they just forward off to the underlying String methods. No biggie though. I've got new implementations to check in -- let me add a few new tests to cover 'em and watch your checkin emails. -Barry From guido@CNRI.Reston.VA.US Fri Jun 11 18:20:57 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 13:20:57 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Fri, 11 Jun 1999 13:17:20 EDT." <14177.17568.637272.328126@anthem.cnri.reston.va.us> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <000b01beb3bb$29ccdaa0$329e2299@tim> <14177.17568.637272.328126@anthem.cnri.reston.va.us> Message-ID: <199906111720.NAA03746@eric.cnri.reston.va.us> > From: "Barry A. Warsaw" > > 'Course it'll make the Java implementations of these extra args a > little more work. 
Right now they just forward off to the underlying > String methods. No biggie though. Which reminds me -- are you tracking this in JPython too? --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 18:39:41 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:39:41 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <000b01beb3bb$29ccdaa0$329e2299@tim> <14177.17568.637272.328126@anthem.cnri.reston.va.us> <199906111720.NAA03746@eric.cnri.reston.va.us> Message-ID: <14177.18909.980174.55751@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Which reminds me -- are you tracking this in JPython too? That's definitely my plan. From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 18:43:35 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:43:35 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <199906110516.BAA02520@eric.cnri.reston.va.us> <14176.45761.801671.880774@cm-24-29-94-19.nycap.rr.com> Message-ID: <14177.19143.463951.778491@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> Oh, one other thing I forgot. Split (join) and splitfields SM> (joinfields) used to be different. They've been the same for SM> a long time now, long enough that I no longer recall how they SM> used to differ. I think it was only in the number of arguments they'd accept (at least that's what's implied by the module docos). SM> In making the leap from string module to SM> string methods, I suggest dropping the long names altogether. I agree. Thinking about it, I'm also inclined to not include startswith and endswith in the string module. -Barry From da@ski.org Fri Jun 11 18:42:59 1999 From: da@ski.org (David Ascher) Date: Fri, 11 Jun 1999 10:42:59 -0700 (Pacific Daylight Time) Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <3761098D.A56F58A8@ifu.net> Message-ID: On Fri, 11 Jun 1999, Aaron Watters wrote: > I thought it would be good to be able to do the following loop with Numeric > arrays > > for x in array1: > array2[x] = array3[x] + array4[x] > > without any memory management being involved. Right now, I think the FYI, I think it should be done by writing: array2[array1] = array3[array1] + array4[array1] and doing "the right thing" in NumPy. In other words, I don't think the core needs to be involved. --david PS: I'm in the process of making the NumPy array objects ExtensionClasses, which will make the above much easier to do. From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 18:58:36 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:58:36 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> Message-ID: <14177.20044.69731.219173@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: >> Two new methods startswith and endswith act like their Java >> cousins. FL> is it just me, or do those method names suck? FL> begin? starts_with? startsWith? (ouch) FL> has_prefix? The inspiration was Java string objects, while trying to remain as Pythonic as possible (no mixed case). 
startswith and endswith doen't seem as bad as issubclass to me :) -Barry From bwarsaw@python.org Fri Jun 11 19:06:22 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:06:22 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <008601beb3da$5e0a7a60$f29b12c2@pythonware.com> Message-ID: <14177.20510.818041.110989@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> fwiw, the Unicode module available from pythonware.com FL> implements them all, and more importantly, it can be com- FL> piled for either 8-bit or 16-bit characters... Are these separately available? I don't see them under downloads. Send me a URL, and if I can figure out how to get CVS to add files to the branch :/, maybe I can check this in so people can play with it. -Barry From tismer@appliedbiometrics.com Fri Jun 11 19:17:46 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Fri, 11 Jun 1999 20:17:46 +0200 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: Message-ID: <376152CA.B46A691E@appliedbiometrics.com> David Ascher wrote: > > On Fri, 11 Jun 1999, Aaron Watters wrote: > > > I thought it would be good to be able to do the following loop with Numeric > > arrays > > > > for x in array1: > > array2[x] = array3[x] + array4[x] > > > > without any memory management being involved. Right now, I think the > > FYI, I think it should be done by writing: > > array2[array1] = array3[array1] + array4[array1] > > and doing "the right thing" in NumPy. In other words, I don't think the > core needs to be involved. For NumPy, this is very ok, dealing with arrays in an array world. Without trying to repeat myself, I'd like to say that I still consider it an unsolved problem which is worth to be solved or to be proven unsolvable: How to do simple things in an efficient way with many tiny Python objects, without writing an extension, without rethinking a problem into APL like style, and without changing the language. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 19:22:36 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:22:36 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: <14177.21484.126155.939932@anthem.cnri.reston.va.us> >> Perhaps join() ought to be a built-in function? IMO, builtin join ought to str()ify all the elements in the sequence, concatenating the results. That seems an intuitive interpretation of 'join'ing a sequence. Here's my Python prototype: def join(seq, sep=''): if not seq: return '' x = str(seq[0]) for y in seq[1:]: x = x + sep + str(y) return x Guido? -Barry From da@ski.org Fri Jun 11 19:24:34 1999 From: da@ski.org (David Ascher) Date: Fri, 11 Jun 1999 11:24:34 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.21484.126155.939932@anthem.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Barry A. Warsaw wrote: > IMO, builtin join ought to str()ify all the elements in the sequence, > concatenating the results. 
That seems an intuitive interpretation of > 'join'ing a sequence. Here's my Python prototype: I don't get it -- why? I'd expect join(((1,2,3), (4,5,6))) to yield (1,2,3,4,5,6), not anything involving strings. --david From bwarsaw@python.org Fri Jun 11 19:26:48 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:26:48 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> Message-ID: <14177.21736.100540.221487@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> I see the following functions in string.py that could SM> reasonably be methodized: SM> ljust, rjust, center, expandtabs, capwords Also zfill. What do you think, are these important enough to add? Maybe we can just drop in /F's implementation for these. -Barry From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 19:34:08 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:34:08 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14177.21484.126155.939932@anthem.cnri.reston.va.us> Message-ID: <14177.22176.328185.872134@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: DA> On Fri, 11 Jun 1999, Barry A. Warsaw wrote: >> IMO, builtin join ought to str()ify all the elements in the >> sequence, concatenating the results. That seems an intuitive >> interpretation of 'join'ing a sequence. Here's my Python >> prototype: DA> I don't get it -- why? DA> I'd expect join(((1,2,3), (4,5,6))) to yield (1,2,3,4,5,6), DA> not anything involving strings. Oh, just because I think it might useful, and would provide something that isn't easily provided with other constructs. Without those semantics join(((1,2,3), (4,5,6))) isn't much different than (1,2,3) + (4,5,6), or reduce(operator.add, ((1,2,3), (4,5,6))) as you point out. Since those latter two are easy enough to come up with, but str()ing the elements would require painful lambdas, I figured make the new built in do something new. -Barry From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 19:36:54 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:36:54 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> <14177.21736.100540.221487@anthem.cnri.reston.va.us> Message-ID: <14177.22342.320993.969742@anthem.cnri.reston.va.us> One other thing to think about. Where should this new methods be documented? I suppose we should reword the appropriate entries in modules-string and move them to typesseq-strings. What do you think Fred? -Barry From da@ski.org Fri Jun 11 19:36:32 1999 From: da@ski.org (David Ascher) Date: Fri, 11 Jun 1999 11:36:32 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.22176.328185.872134@anthem.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Barry A. Warsaw wrote: Barry: > >> IMO, builtin join ought to str()ify all the elements in the > >> sequence, concatenating the results. Me: > I don't get it -- why? Barry: > Oh, just because I think it might useful, and would provide something > that isn't easily provided with other constructs. I do map(str, ...) all the time. My real concern is that there is nothing about the word 'join' which implies string conversion. 
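For reference, the map(str, ...) idiom mentioned above already handles the conversion with today's string module, e.g. (plain 1.5.2 usage, nothing new):

    import string

    pieces = [1, 2.5, 'three']
    print string.join(map(str, pieces), ', ')    # -> 1, 2.5, three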
Either call it joinstrings or don't do the conversion, I say. --david From bwarsaw@python.org Fri Jun 11 19:42:27 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:42:27 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14177.22176.328185.872134@anthem.cnri.reston.va.us> Message-ID: <14177.22675.716917.331314@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: DA> My real concern is that there is nothing about the word 'join' DA> which implies string conversion. Either call it joinstrings DA> or don't do the conversion, I say. Can you say mapconcat() ? :) Or instead of join, just call it concat? -Barry From da@ski.org Fri Jun 11 19:46:19 1999 From: da@ski.org (David Ascher) Date: Fri, 11 Jun 1999 11:46:19 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.22675.716917.331314@anthem.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Barry A. Warsaw wrote: > >>>>> "DA" == David Ascher writes: > > DA> My real concern is that there is nothing about the word 'join' > DA> which implies string conversion. Either call it joinstrings > DA> or don't do the conversion, I say. > > Can you say mapconcat() ? :) > > Or instead of join, just call it concat? Again, no. Concatenating sequences is what I think the + operator does. I think you need the letters S, T, and R in there... But I'm still not convinced of its utility. From guido@CNRI.Reston.VA.US Fri Jun 11 19:51:18 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 14:51:18 -0400 Subject: [Python-Dev] join() Message-ID: <199906111851.OAA04105@eric.cnri.reston.va.us> Given the heat in this discussion, I'm not sure if I endorse *any* of the proposals so far any more... How would Java do this? A static function in the String class, probably. The Python equivalent is... A function in the string module. So maybe string.join() it remains. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Fri Jun 11 20:08:11 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Fri, 11 Jun 1999 15:08:11 -0400 (EDT) Subject: [Python-Dev] join() References: <199906111851.OAA04105@eric.cnri.reston.va.us> Message-ID: <14177.24219.94236.485421@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Given the heat in this discussion, I'm not sure if I Guido> endorse *any* of the proposals so far any more... Oh I dunno. David and I aren't throwing rocks at each other yet :) Guido> How would Java do this? A static function in the String Guido> class, probably. The Python equivalent is... A function Guido> in the string module. So maybe string.join() it remains. The only reason for making it a builtin would be to avoid pulling in all of string just to get join. But I guess we need to get some more experience using the methods before we know whether this is a real problem or not. as-good-as-a-from-string-import-join-and-easier-to-implement-ly y'rs, -Barry From skip@mojam.com (Skip Montanaro) Fri Jun 11 20:38:33 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 11 Jun 1999 15:38:33 -0400 (EDT) Subject: [Python-Dev] String methods... 
finally In-Reply-To: <14177.21484.126155.939932@anthem.cnri.reston.va.us> References: <199906110516.BAA02520@eric.cnri.reston.va.us> <14177.21484.126155.939932@anthem.cnri.reston.va.us> Message-ID: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com> Barry> IMO, builtin join ought to str()ify all the elements in the Barry> sequence, concatenating the results. That seems an intuitive Barry> interpretation of 'join'ing a sequence. Any reason why join should be a builtin and not a method available just to sequences? Would there some valid interpretation of join( {'a': 1} ) join( 1 ) ? If not, I vote for method-hood, not builtin-hood. Seems like you'd avoid some confusion (and some griping by Graham Matthews about how unpure it is ;-). Skip From skip@mojam.com (Skip Montanaro) Fri Jun 11 20:42:11 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Fri, 11 Jun 1999 15:42:11 -0400 (EDT) Subject: [Python-Dev] join() In-Reply-To: <14177.24219.94236.485421@anthem.cnri.reston.va.us> References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us> Message-ID: <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com> BAW> The only reason for making it a builtin would be to avoid pulling BAW> in all of string just to get join. I still don't understand the motivation for making it a builtin instead of a method of the types it operates on. Making it a builtin seems very un-object-oriented to me. Skip From guido@CNRI.Reston.VA.US Fri Jun 11 20:44:28 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 15:44:28 -0400 Subject: [Python-Dev] join() In-Reply-To: Your message of "Fri, 11 Jun 1999 15:42:11 EDT." <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com> References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us> <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com> Message-ID: <199906111944.PAA04277@eric.cnri.reston.va.us> > I still don't understand the motivation for making it a builtin instead of a > method of the types it operates on. Making it a builtin seems very > un-object-oriented to me. Because if you make it a method, every sequence type needs to know about joining strings. (This wouldn't be a problem in Smalltalk where sequence types inherit this stuff from an abstract sequence class, but in Python unfortunately that doesn't exist.) --Guido van Rossum (home page: http://www.python.org/~guido/) From da@ski.org Fri Jun 11 21:11:11 1999 From: da@ski.org (David Ascher) Date: Fri, 11 Jun 1999 13:11:11 -0700 (Pacific Daylight Time) Subject: [Python-Dev] join() In-Reply-To: <199906111944.PAA04277@eric.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Guido van Rossum wrote: > > I still don't understand the motivation for making it a builtin instead of a > > method of the types it operates on. Making it a builtin seems very > > un-object-oriented to me. > > Because if you make it a method, every sequence type needs to know > about joining strings. It still seems to me that we could do something like F/'s proposal, where sequences can define a join() method, which could be optimized if the first element is a string to do what string.join, by placing the class method in an instance method of strings, since string joining clearly has to involve at least one string. 
Pseudocode: class SequenceType: def join(self, separator=None): if hasattr(self[0], '__join__') # covers all types which can be efficiently joined if homogeneous return self[0].__join__(self, separator) # for the rest: if separator is None: return map(operator.add, self) result = self[0] for element in self[1:]: result = result + separator + element return result where the above would have to be done in abstract.c, with error handling, etc. and with strings (regular and unicode) defining efficient __join__'s as in: class StringType: def join(self, separator): raise AttributeError, ... def __join__(self, sequence): return string.join(sequence) # obviously not literally that =) class UnicodeStringType: def __join__(self, sequence): return unicode.join(sequence) (in C, of course). Yes, it's strange to fake class methods with instance methods, but it's been done before =). Yes, this means expanding what it means to "be a sequence" -- is that impossible without breaking lots of code? --david From gmcm@hypernet.com Fri Jun 11 22:30:10 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 11 Jun 1999 16:30:10 -0500 Subject: [Python-Dev] String methods... finally In-Reply-To: References: <14177.22675.716917.331314@anthem.cnri.reston.va.us> Message-ID: <1282985631-84109501@hypernet.com> David Ascher wrote: > Barry Warsaw wrote: > > Or instead of join, just call it concat? > > Again, no. Concatenating sequences is what I think the + operator > does. I think you need the letters S, T, and R in there... But I'm > still not convinced of its utility. But then Q will feel left out, and since Q doesn't go anywhere without U, pretty soon you'll have the whole damn alphabet in there. I-draw-the-line-at-$-well-$-&-@-but-definitely-not-#-ly y'rs - Gordon From MHammond@skippinet.com.au Fri Jun 11 23:49:29 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Sat, 12 Jun 1999 08:49:29 +1000 Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.20510.818041.110989@anthem.cnri.reston.va.us> Message-ID: <006801beb45c$aab5baa0$0801a8c0@bobcat> > Are these separately available? I don't see them under downloads. > Send me a URL, and if I can figure out how to get CVS to add files to > the branch :/, maybe I can check this in so people can play with it. Fredrik and I have spoken about this. He will dust it off and integrate some patches in the next few days. He will then send it to me to make sure the patches I made for Windows CE all made it OK, then one of us will integrate it with the branch and send it on... Mark. From tim_one@email.msn.com Sat Jun 12 01:56:03 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 11 Jun 1999 20:56:03 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.21736.100540.221487@anthem.cnri.reston.va.us> Message-ID: <000401beb46e$58b965a0$5ba22299@tim> [Skip Montanaro] > I see the following functions in string.py that could > reasonably be methodized: > > ljust, rjust, center, expandtabs, capwords > > Also zfill. > [Barry A. Warsaw] > What do you think, are these important enough to add? I think lack-of-surprise (gratuitous orthogonality ) was the motive here. If Guido could drop string functions in 2.0, which would he be happy to forget? Give him a head start. ljust and rjust were used often a long time ago, before the "%" sprintf-like operator was introduced; don't think I've seen new code use them in years. center was a nice convenience in the pre-HTML world, but probably never speed-critical and easy to write yourself. 
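"Easy to write yourself" meaning roughly the following (illustrative sketches only, not the string module's exact code, which handles a few edge cases differently):

    def ljust(s, width):
        return s + ' ' * (width - len(s))

    def rjust(s, width):
        return ' ' * (width - len(s)) + s

    def center(s, width):
        pad = width - len(s)
        if pad <= 0:
            return s
        left = pad / 2
        return ' ' * left + s + ' ' * (pad - left)

    print '[%s]' % center('hi', 6)    # -> [  hi  ]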
expandtabs is used frequently in IDLE and even pyclbr.py now. Curiously, though, they almost never want the tab-expanded string, but rather its len. capwords could become an absolute nightmare in a Unicode world <0.5 wink>. > Maybe we can just drop in /F's implementation for these. Sounds like A Plan to me. Wouldn't mourn the passing of the first three. and-i-even-cried-at-my-father's-funeral-ly y'rs - tim From tim_one@email.msn.com Sat Jun 12 07:19:33 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 02:19:33 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <199906110140.VAA02180@eric.cnri.reston.va.us> Message-ID: <000001beb49b$8a94f120$b19e2299@tim> [GvR] > (I sometimes wished I wasn't in the business of making releases. I've > asked for help with making essential patches to 1.5.2 available but > nobody volunteered... :-( ) It's kinda baffling "out here" -- checkin comments usually say what a patch does, but rarely make a judgment about a patch's importance. Sorting thru hundreds of patches without a clue is a pretty hopeless task. Perhaps future checkins that the checker-inner feels are essential could be commented as such in a machine-findable way? an-ounce-of-foresight-is-worth-a-sheet-of-foreskin-or-something-like-that-ly y'rs - tim From tim_one@email.msn.com Sat Jun 12 07:19:37 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 02:19:37 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <199906101411.KAA29962@eric.cnri.reston.va.us> Message-ID: <000101beb49b$8c27c620$b19e2299@tim> [Aaron, describes a scheme where objects are represented by a fixed-size (typecode, variant) pair, where if the typecode is e.g. INT or FLOAT the variant is the value directly instead of a pointer to the value] [Guido] > What you're describing is very close to what I recall I once read > about the runtime organization of Icon. At the lowest level it's exactly what Icon does. It does *not* exempt ints from Icon's flavor of dynamic memory management, but Icon doesn't use refcounting -- it uses compacting mark-&-sweep across some 5 distinct regions each with their own finer-grained policies (e.g., strings are central to Icon and so it manages the string region a little differently; and Icon coroutines save away pieces of the platform's C stack so need *very* special treatment). So: 1) There are no incref/decref expenses anywhere in Icon. 2) Because of compaction, all allocations cost the same and are dirt cheap: just increment the appropriate region's "avail" pointer by the number of bytes you need. If there aren't enough bytes, run GC and try again. If there still aren't enough bytes, Icon usually shuts down (it's not good at asking the OS for more memory! it carves up its initial memory in pretty rigid ways, and relies on tricks like comparing storage addresses to speed M&S and compaction -- those "regions" are in a fixed order relative to each other, so new memory can't be tacked on to a region except at the low and high ends). 3) All the expense is in finding and compacting live objects, so in an odd literal sense cleaning up trash comes for free. 4) Icon has no finalizers, so it doesn't need to identify or preserve trash -- compaction simply overwrites "the holes" where the trash used to be. Icon is nicely implemented, but it's a "self-contained universe" view of the world and its memory approach makes life hard for the tiny handful of folks who have *tried* to make it extendable via C. 
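A toy model of the bump allocation described above, just to make the "increment the avail pointer" point concrete (names invented for illustration; Icon's real regions, mark/sweep and compaction are far more involved):

    class Region:
        def __init__(self, size):
            self.size = size
            self.avail = 0                      # offset of the next free byte

        def alloc(self, nbytes):
            if self.avail + nbytes > self.size:
                self.collect()                  # stand-in for mark/sweep/compact
                if self.avail + nbytes > self.size:
                    raise MemoryError('region exhausted')   # Icon mostly just shuts down here
            offset = self.avail
            self.avail = self.avail + nbytes    # the entire cost of an allocation
            return offset

        def collect(self):
            pass    # a real collector would move live blocks down and reset avail

    r = Region(1024)
    print r.alloc(16), r.alloc(16)              # -> 0 16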
Icon is also purely procedural -- no OO, no destructors, no resurrection. Irony: one reason I picked up Python in '91 is that my int-fiddling code was too slow in Icon! Even Python 0.9.0 ran int algorithms significantly faster than the 10-years-refined Icon implementation of that time. Never looked into why, but now that Aaron brought up the issue I find it very surprising! Those algorithms had a huge rate of int trash creation, but very few persistent objects, so Icon's M&S should have run like the wind. And Icon's allocation is dirt-cheap (at least as fast as Python's fastest special-purpose allocators), and didn't have any refcounting expenses either. There's an important lesson *somewhere* in that . Maybe it was the fault of Icon's "goal-directed" expression evaluation, constantly asking "did this int succeed or fail?", "did that add suceed or fail?", etc. > ... > The Icon approach (i.e. yours) seems to require a complete rethinking > of all object implementations and all APIs at the C level -- perhaps > we could think about it for Python 2.0. Some ramifications: > > - Uses more memory for highly shared objects (there are as many copies > of the type pointer as there are references). Actually more than that in Icon: if the "variant" part is a pointer, the first word of the block it points to is also a copy of the typecode (turns out the redundancy speeds the GC). > - Thus, lists take double the memory assuming they reference objects > that also exist elsewhere. This affects the performance of slices > etc. > > - On the other hand, a list of ints takes half the memory (given that > most of those ints are not shared). Isn't this 2/3 rather than 1/2? I'm picturing a list element today as essentially a pointer to a type object pointer + int (3 units in all), and a type object pointer + int (2 units in all) "tomorrow". Throw in refcounts too and the ratio likely gets closer to 1. > - *Homogeneous* lists (where all elements have the same type -- > i.e. arrays) can be represented more efficiently by having only one > copy of the type pointer. This was an idea for ABC (whose type system > required all container types to be homogenous) that was never > implemented (because in practice the type check wasn't always applied, > and the top-level namespace used by the interactive command > interpreter violated all the rules). Well, Python already has homogeneous int lists (array.array), and while they save space they suffer in speed due to needing to wrap raw ints "in an object" upon reference and unwrap them upon storage. > - Reference count manipulations could be done by a macro (or C++ > behind-the-scense magic using copy constructors and destructors) that > calls a function in the type object -- i.e. each object could decide > on its own reference counting implementation :-) You don't need to switch representations to get that, though, right? That is, I don't see anything stopping today's type objects from growing __incref__ and __decref__ slots -- except for common sense . An apparent ramification I don't see above that may actually be worth something : - In "i = j + k", the eval stack could contain the ints directly, instead of pointers to the ints. So fetching the value of i takes two loads (get the type pointer + the variant) from adjacent stack locations, instead of today's load-the-pointer + follow-the-pointer (to some other part of memory); similarly for fetching the value of j. 
Then the sum can be stored *directly* into the stack too, without today's need for allocating and wrapping it in "an int object" first.

Possibly happy variant: on top of the above, *don't* exempt ints from refcounting. Let 'em incref and decref like everything else. Give them an initial refcount of max_count/2, and in the exceedingly unlikely event a decref on an int ever sees zero, the int "destructor" simply resets the refcount to max_count/2 and is otherwise a nop.

semi-thinking-semi-aloud-ly y'rs - tim

From ping@lfw.org Sat Jun 12 09:05:06 1999
From: ping@lfw.org (Ka-Ping Yee)
Date: Sat, 12 Jun 1999 01:05:06 -0700 (PDT)
Subject: [Python-Dev] String methods... finally
In-Reply-To: <004601beb409$8c535750$f29b12c2@pythonware.com>
Message-ID:

On Fri, 11 Jun 1999, Fredrik Lundh wrote:
> > Two new methods startswith and endswith act like their Java cousins.
>
> is it just me, or do those method names suck?
>
> begin? starts_with? startsWith? (ouch)
> has_prefix?

I'm quite happy with "startswith" and "endswith". I mean, they're a bit long, i suppose, but i can't think of anything better. You definitely want to avoid has_prefix, as that compounds the has_key vs. hasattr issue.

    x.startswith("foo")      x[:3] == "foo"
    x.startswith(y)          x[:len(y)] == y

Hmm. I guess it doesn't save you much typing until y is an expression. But it's still a lot easier to read.

!ping

From ping@lfw.org Sat Jun 12 09:12:38 1999
From: ping@lfw.org (Ka-Ping Yee)
Date: Sat, 12 Jun 1999 01:12:38 -0700 (PDT)
Subject: [Python-Dev] join()
In-Reply-To: <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com>
Message-ID:

On Fri, 11 Jun 1999, Skip Montanaro wrote:
>
> BAW> The only reason for making it a builtin would be to avoid pulling
> BAW> in all of string just to get join.
>
> I still don't understand the motivation for making it a builtin instead of a
> method of the types it operates on. Making it a builtin seems very
> un-object-oriented to me.

Builtin-hood makes it possible for one method to apply to many types (or a heterogeneous list of things). I think i'd support the

    def join(list, sep=None):
        if sep is None:
            result = list[0]
            for item in list[1:]:
                result = result + item
        else:
            result = list[0]
            for item in list[1:]:
                result = result + sep + item
        return result

idea, basically a reduce(operator.add...) with an optional separator -- *except* my main issue would be to make sure that the actual implementation optimizes the case of joining a list of strings. string.join() currently seems like the last refuge for those wanting to avoid O(n^2) time when assembling many small pieces in string buffers, and i don't want to see it go away.

!ping

From fredrik@pythonware.com Sat Jun 12 10:13:59 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 12 Jun 1999 11:13:59 +0200
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us><008601beb3da$5e0a7a60$f29b12c2@pythonware.com> <14177.20510.818041.110989@anthem.cnri.reston.va.us>
Message-ID: <00c301beb4b3$e84e3de0$f29b12c2@pythonware.com>

> FL> fwiw, the Unicode module available from pythonware.com
> FL> implements them all, and more importantly, it can be com-
> FL> piled for either 8-bit or 16-bit characters...
>
> Are these separately available? I don't see them under downloads.
> Send me a URL, and if I can figure out how to get CVS to add files to
> the branch :/, maybe I can check this in so people can play with it.

it's under:

    http://www.pythonware.com/madscientist/index.htm

but I've teamed up with Mark H.
to update the stuff a bit, test it with his CE port, and produce a set of patches. I'm working on this in this very moment. btw, as for the "missing methods in the string type" issue, my suggestion is to merge the source code into a unified string module, which is compiled twice (or three times, the day we find that we need a 32-bit string type). don't waste any time cutting and pasting until we've sorted that one out... From fredrik@pythonware.com Sat Jun 12 10:31:08 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 12 Jun 1999 11:31:08 +0200 Subject: [Python-Dev] String methods... finally References: <000401beb46e$58b965a0$5ba22299@tim> Message-ID: <00fb01beb4b6$4df59420$f29b12c2@pythonware.com> > expandtabs is used frequently in IDLE and even pyclbr.py now. Curiously, > though, they almost never want the tab-expanded string, but rather its len. looked in stropmodule.c lately: static PyObject * strop_expandtabs(self, args) ... /* First pass: determine size of output string */ ... /* Second pass: create output string and fill it */ ... (btw, I originally wrote that code for pythonworks ;-) how about an "expandtabslength" method? or maybe we should add lazy evaluation of strings! From fredrik@pythonware.com Sat Jun 12 10:49:07 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 12 Jun 1999 11:49:07 +0200 Subject: [Python-Dev] join() References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us> Message-ID: <014001beb4b9$63f1e820$f29b12c2@pythonware.com> > The only reason for making it a builtin would be to avoid pulling in > all of string just to get join. another reason is that you might be able to avoid a unicode module... From tismer@appliedbiometrics.com Sat Jun 12 14:27:45 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Sat, 12 Jun 1999 15:27:45 +0200 Subject: [Python-Dev] More flexible namespaces. References: <008d01be92b2$c56ef5d0$0801a8c0@bobcat> <199904300300.XAA00608@eric.cnri.reston.va.us> <37296096.D0C9C2CC@appliedbiometrics.com> <199904301517.LAA01422@eric.cnri.reston.va.us> Message-ID: <37626051.C1EA8AE0@appliedbiometrics.com> Guido van Rossum wrote: > > > From: Christian Tismer > > > I'd really like to look into that. > > Also I wouldn't worry too much about speed, since this is > > such a cool feature. It might even be a speedup in some cases > > which otherwise would need more complex handling. > > > > May I have a look? > > Sure! > > (I've forwarded Christian the files per separate mail.) > > I'm also interested in your opinion on how well thought-out and robust > the patches are -- I've never found the time to do a good close > reading of them. Coming back from the stackless task with is finished now, I popped this task from my stack. I had a look and it seems well-thought and robust so far. To make a more trustable claim, I would need to build and test it. Is this still of interest, or should I drop it? The follow-ups in this thread indicated that the opinions about flexible namespaces were quite mixed. So, should I waste time in building and testing or better save it? chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw@python.org Sat Jun 12 18:16:28 1999 From: bwarsaw@python.org (Barry A. 
Warsaw) Date: Sat, 12 Jun 1999 13:16:28 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <008601beb3da$5e0a7a60$f29b12c2@pythonware.com> <14177.20510.818041.110989@anthem.cnri.reston.va.us> <00c301beb4b3$e84e3de0$f29b12c2@pythonware.com> Message-ID: <14178.38380.734976.164568@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> btw, as for the "missing methods in the string type" FL> issue, my suggestion is to merge the source code into FL> a unified string module, which is compiled twice (or FL> three times, the day we find that we need a 32-bit FL> string type). don't waste any time cutting and FL> pasting until we've sorted that one out... Very good. Give me the nod when the sorting algorithm halts. From tim_one@email.msn.com Sat Jun 12 19:28:13 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 14:28:13 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com> Message-ID: <000101beb501$55fb9b60$ce9e2299@tim> [Skip Montanaro] > Any reason why join should be a builtin and not a method available just > to sequences? Would there some valid interpretation of > > join( {'a': 1} ) > join( 1 ) > > ? If not, I vote for method-hood, not builtin-hood. Same here, except as a method we've got it twice backwards : it should be a string method, but a method of the *separator*: sep.join(seq) same as convert each elt in seq to a string of the same flavor as sep, then paste the converted strings together with sep between adjacent elements So " ".join(list) delivers the same result as today's string.join(map(str, list), " ") and L" ".join(list) does much the same tomorrow but delivers a Unicode string (or is the "L" for Lundh string ?). It looks odd at first, but the more I play with it the more I think it's "the right thing" to do: captures everything that's done today, plus the most common idiom (mapping str first across the sequence) on top of that, adapts seamlessly (from the user's view) to new string types, and doesn't invite uselessly redundant generalization to non-sequence types. One other attraction perhaps unique to me: I can never remember whether string.join's default separator is a blank or a null string! Explicit is better than implicit . the-heart-of-a-join-is-the-glue-ly y'rs - tim From tim_one@email.msn.com Sat Jun 12 19:28:18 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 14:28:18 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <00fb01beb4b6$4df59420$f29b12c2@pythonware.com> Message-ID: <000201beb501$578548a0$ce9e2299@tim> [Tim] > expandtabs is used frequently in IDLE and even pyclbr.py now. > Curiously, though, they almost never want the tab-expanded string, > but rather its len. [/F] > looked in stropmodule.c lately: > > static PyObject * > strop_expandtabs(self, args) > ... > /* First pass: determine size of output string */ > ... > /* Second pass: create output string and fill it */ > ... > > (btw, I originally wrote that code for pythonworks ;-) Yes, it's nice code! The irony was the source of my "curiously" . > how about an "expandtabslength" method? Na, it's very specialized, easy to spell by hand, and even IDLE/pyclbr don't really need more speed in this area. From an end-user's view, it's much odder that Python supplies expandtabs but not the converse string.tabify(string, leadingwhitespaceonly=1, tabwidth=8). > or maybe we should add lazy evaluation of strings! 
In the compiler world, there's a famous story about a PL/1 compiler that blew everyone else out of the water by noticing that the inner loop of a benchmark extended a string by one character on each trip around, but only *used* the length of the string. So it skipped code for the quadratic-time repeated strcats, instead just adding 1 to an int representing the length. i.e.-lazy-strings-aren't-lazy-enough-ly y'rs - tim From tim_one@email.msn.com Sat Jun 12 22:37:08 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 17:37:08 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <3761098D.A56F58A8@ifu.net> Message-ID: <000501beb51b$b9cb3780$ce9e2299@tim> [Aaron Watters] > ... > I thought it would be good to be able to do the following loop > with Numeric arrays > > for x in array1: > array2[x] = array3[x] + array4[x] > > without any memory management being involved. Right now, I think the > for loop has to continually dynamically allocate each new x Actually not, it just binds x to the sequence of PyObject*'s already in array1, one at a time. It does bump & drop the refcount on that object a lot. Also irksome is that it keeps allocating/deallocating a little integer on each trip, for the under-the-covers loop index! Marc-Andre (I think) had/has a patch to worm around that, but IIRC it didn't make much difference (wouldn't expect it to, though -- not if the loop body does any real work). One thing a smarter Python compiler could do is notice the obvious : the *internal* incref/decref operations on the object denoted by x in the loop above must cancel out, so there's no need to do any of them. "internal" == those due to the routine actions of the PVM itself, while pushing and popping the eval stack. Exploiting that is tedious; e.g., inventing a pile of opcode variants that do the same thing as today's except skip an incref here and a decref there. > and intermediate sum (and immediate deallocate them) The intermediate sum is allocated each time, but not deallocated (the pre-existing object at array2[x] *may* be deallocated, though). > and that makes the loop piteously slow. A lot of things conspire to make it slow. David is certainly right that, in this particular case, array2[array1] = array3[array1] + etc worms around the worst of them. > The idea replacing pyobject *'s with a struct [typedescr *, data *] > was a space/time tradeoff to speed up operations like the above > by eliminating any need for mallocs or other memory management.. Fleshing out details may make it look less attractive. For machines where ints are no wider than pointers, the "data *" can be replaced with the int directly and then there's real potential. If for a float the "data*" really *is* a pointer, though, what does it point *at*? Some dynamically allocated memory to hold the float appears to be the only answer, and you're right back at the problem you were hoping to avoid. Make the "data*" field big enough to hold a Python float directly, and the descriptor likely zooms to 128 bits (assuming float is IEEE double and the machine requires natural alignment). Let's say we do that. Where does the "+" implementation get the 16 bytes it needs to store its result? The space presumably already exists in the slot indexed by array2[x], but the "+" implementation has no way to *know* that. Figuring it out requires non-local analysis, which is quite a few steps beyond what Python's compiler can do today. 
Easiest: internal functions all grow a new PyDescriptor* argument into which they are to write their result's descriptor. The PVM passes "+" the address of the slot indexed by array2[x] if it's smart enough; or, if it's not, the address of the stack slot descriptor into which today's PVM *would* push the result. In the latter case the PVM would need to copy those 16 bytes into the slot indexed by array2[x] later. Neither of those are simple as they sound, though, at least because if array2[x] holds a descriptor with a real pointer in its variant half, the thing to which it points needs to get decref'ed iff the add succeeds. It can get very messy! > I really can't say whether it'd be worth it or not without some sort of > real testing. Just a thought. It's a good thought! Just hard to make real. but-if-michael-hudson-keeps-hacking-at-bytecodes-and-christian- keeps-trying-to-prove-he's-crazier-than-michael-by-2001- we'll-be-able-to-generate-optimized-vector-assembler-for- it-ly y'rs - tim From tim_one@email.msn.com Sat Jun 12 22:37:14 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 17:37:14 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <375FC062.62850DE5@ifu.net> Message-ID: <000601beb51b$bc723ba0$ce9e2299@tim> [Aaron Watters] > ... > Another fix would be to put the refcount in the static side with > no speed penalty > > (typedescr > repr* ----------------------> data > refcount > ) > > but would that be wasteful of space? The killer is for types where repr* is a real pointer: x = [Whatever()] y = x[:] Now we have two physically distinct descriptors pointing at the same thing, and so also two distinct refcounts for that thing -- impossible to keep them in synch efficiently; "del y" has no way efficient way to find the refcount hiding in x. tbings-and-and-their-refcounts-are-monogamous-ly y'rs - tim From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Sun Jun 13 18:56:33 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Sun, 13 Jun 1999 13:56:33 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com> <000101beb501$55fb9b60$ce9e2299@tim> Message-ID: <14179.61649.286195.248429@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Same here, except as a method we've got it twice backwards TP> : it should be a string method, but a method of the TP> *separator*: TP> sep.join(seq) TP> same as | convert each elt in seq to a string of the same flavor as | sep, then paste the converted strings together with sep | between adjacent elements TP> So TP> " ".join(list) TP> delivers the same result as today's TP> string.join(map(str, list), " ") TP> and TP> L" ".join(list) TP> does much the same tomorrow but delivers a Unicode string (or TP> is the "L" for Lundh string ?). TP> It looks odd at first, but the more I play with it the more I TP> think it's "the right thing" to do At first glance, I like this proposal a lot. I'd be happy to code it up if David'll stop throwing those rocks. Whether or not they hit me, they still hurt :) -Barry From tim_one@email.msn.com Sun Jun 13 20:34:57 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 13 Jun 1999 15:34:57 -0400 Subject: [Python-Dev] String methods... 
finally In-Reply-To: <14179.61649.286195.248429@anthem.cnri.reston.va.us> Message-ID: <000801beb5d3$d1fd06e0$ae9e2299@tim> > >>>>> "TP" == Tim Peters writes: > > TP> Same here, except as a method we've got it twice backwards > TP> : it should be a string method, but a method of the > TP> *separator*: > > TP> sep.join(seq) > > TP> same as > > | convert each elt in seq to a string of the same flavor as > | sep, then paste the converted strings together with sep > | between adjacent elements > > TP> So > > TP> " ".join(list) > > TP> delivers the same result as today's > > TP> string.join(map(str, list), " ") > > TP> and > > TP> L" ".join(list) > > TP> does much the same tomorrow but delivers a Unicode string (or > TP> is the "L" for Lundh string ?). > > TP> It looks odd at first, but the more I play with it the more I > TP> think it's "the right thing" to do Barry, did it ever occur to you to that this fancy Emacs quoting is pig ugly ? [Barry A. Warsaw] > At first glance, I like this proposal a lot. That's a bit scary -- even I didn't like it at first glance. It kept growing on me, though, especially after a trivial naming trick: space, tab, null = ' ', '\t', '' ... sentence = space.join(list) table = tab.join(list) squashed = null.join(list) That's so beautifully self-descriptive I cried! Well, I actually jerked my leg and stubbed my little toe badly, but it's healing nicely, thank you. Note the naturalness too of creating zippier bound method objects for the kinds of join you're doing most often: spacejoin = ' '.join tabjoin = '\t'.join etc. I still like it more the more I play with it. > I'd be happy to code it up if David'll stop throwing those rocks. David warmed up to it in pvt email (his first response was the expected one-liner "Wacky!"). Other issues: + David may want C.join(T) generalized to other classes C and argument types T. So far my response to all such generalizations has been "wacky!" , but I don't think that bears one way or t'other on whether StringType.join(SequenceType) makes good sense on its own. + string.join(seq) doesn't currently convert seq elements to string type, and in my vision it would. At least three of us admit to mapping str across seq anyway before calling string.join, and I think it would be a nice convenience: I think there's no confusion because there's nothing sensible string.join *could* do with a non-string seq element other than convert it to string. The primary effect of string.join griping about a non-string seq element today is that my if not ok: sys.__stdout__.write("not ok, args are " + string.join(args) + "\n") debugging output blows up instead of being helpful <0.8 wink>. If Guido is opposed to being helpful, though , the auto-convert bit isn't essential. > Whether or not they hit me, they still hurt :) I know they do, Barry. That's why I never throw rocks at you. If you like, I'll have a word with David's ISP. if-this-was-a-flame-war-we're-too-civilized-to-live-long-enough-to- reproduce-ly y'rs - tim From da@ski.org Sun Jun 13 20:48:59 1999 From: da@ski.org (David Ascher) Date: Sun, 13 Jun 1999 12:48:59 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14179.61649.286195.248429@anthem.cnri.reston.va.us> Message-ID: On Sun, 13 Jun 1999, Barry A. Warsaw wrote: > At first glance, I like this proposal a lot. I'd be happy to code it > up if David'll stop throwing those rocks. Whether or not they hit me, > they still hurt :) I like it too, since you ask. 
=) (When you get a chance, could you bring the rocks back? I only have a limited supply. Thanks). --david From guido@CNRI.Reston.VA.US Mon Jun 14 15:46:34 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 14 Jun 1999 10:46:34 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Sat, 12 Jun 1999 14:28:13 EDT." <000101beb501$55fb9b60$ce9e2299@tim> References: <000101beb501$55fb9b60$ce9e2299@tim> Message-ID: <199906141446.KAA00733@eric.cnri.reston.va.us> > Same here, except as a method we've got it twice backwards : it > should be a string method, but a method of the *separator*: > > sep.join(seq) Funny, but it does seem right! Barry, go for it... --Guido van Rossum (home page: http://www.python.org/~guido/) From klm@digicool.com Mon Jun 14 16:09:58 1999 From: klm@digicool.com (Ken Manheimer) Date: Mon, 14 Jun 1999 11:09:58 -0400 Subject: [Python-Dev] String methods... finally Message-ID: <613145F79272D211914B0020AFF640191D1BAF@gandalf.digicool.com> > [Skip Montanaro] > > I see the following functions in string.py that could > > reasonably be methodized: > > > > ljust, rjust, center, expandtabs, capwords > > > > Also zfill. > > > > [Barry A. Warsaw] > > What do you think, are these important enough to add? I think expandtabs is worthwhile. Though i wouldn't say i use it frequently, when i do use it i'm thankful it's there - it's something i'm really glad to have precooked, since i'm generally not looking for the distraction when i do happen to need it... Ken klm@digicool.com From guido@CNRI.Reston.VA.US Mon Jun 14 16:12:33 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 14 Jun 1999 11:12:33 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: Your message of "Sat, 12 Jun 1999 02:19:37 EDT." <000101beb49b$8c27c620$b19e2299@tim> References: <000101beb49b$8c27c620$b19e2299@tim> Message-ID: <199906141512.LAA00793@eric.cnri.reston.va.us> [me] > > - Thus, lists take double the memory assuming they reference objects > > that also exist elsewhere. This affects the performance of slices > > etc. > > > > - On the other hand, a list of ints takes half the memory (given that > > most of those ints are not shared). [Tim] > Isn't this 2/3 rather than 1/2? I'm picturing a list element today as > essentially a pointer to a type object pointer + int (3 units in all), and a > type object pointer + int (2 units in all) "tomorrow". Throw in refcounts > too and the ratio likely gets closer to 1. An int is currently 3 units: type, refcnt, value. (The sepcial int allocator means that there's no malloc overhead.) A list item is one unit. So a list of N ints is 4N units (+ overhead). In the proposed scheme, there would be 2 units. That makes a factor of 1/2 for me... > Well, Python already has homogeneous int lists (array.array), and while they > save space they suffer in speed due to needing to wrap raw ints "in an > object" upon reference and unwrap them upon storage. Which would become faster with the proposed scheme since it would not require any heap allocation (presuming 2-unit structs can be passed around as function results). > > - Reference count manipulations could be done by a macro (or C++ > > behind-the-scense magic using copy constructors and destructors) that > > calls a function in the type object -- i.e. each object could decide > > on its own reference counting implementation :-) > > You don't need to switch representations to get that, though, right? 
That > is, I don't see anything stopping today's type objects from growing > __incref__ and __decref__ slots -- except for common sense . Eh, indeed . > An apparent ramification I don't see above that may actually be worth > something : > > - In "i = j + k", the eval stack could contain the ints directly, instead of > pointers to the ints. So fetching the value of i takes two loads (get the > type pointer + the variant) from adjacent stack locations, instead of > today's load-the-pointer + follow-the-pointer (to some other part of > memory); similarly for fetching the value of j. Then the sum can be stored > *directly* into the stack too, without today's need for allocating and > wrapping it in "an int object" first. I though this was assumed all the time? I mentioned "no heap allocation" above before I read this. I think this is the reason why it was proposed at all: things for which the value fits in a unit don't live on the heap at all, *without* playing tricks with pointer representations. > Possibly happy variant: on top of the above, *don't* exempt ints from > refcounting. Let 'em incref and decref like everything else. Give them an > intial refcount of max_count/2, and in the exceedingly unlikely event a > decref on an int ever sees zero, the int "destructor" simply resets the > refcount to max_count/2 and is otherwise a nop. Don't get this -- there's no object on the heap to hold the refcnt. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@python.org Mon Jun 14 19:47:32 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Mon, 14 Jun 1999 14:47:32 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us> <000801beb5d3$d1fd06e0$ae9e2299@tim> Message-ID: <14181.20036.857729.999835@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: Timbot> Barry, did it ever occur to you to that this fancy Emacs Timbot> quoting is pig ugly ? wink> + string.join(seq) doesn't currently convert seq elements to wink> string type, and in my vision it would. At least three of wink> us admit to mapping str across seq anyway before calling wink> string.join, and I think it would be a nice convenience: Check the CVS branch. It does seem pretty cool! From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Mon Jun 14 19:48:10 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Mon, 14 Jun 1999 14:48:10 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us> Message-ID: <14181.20074.728230.764485@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: DA> (When you get a chance, could you bring the rocks back? I DA> only have a limited supply. Thanks). Sorry, I need them to fill up the empty spaces in my skull. -Barry From tim_one@email.msn.com Tue Jun 15 03:50:08 1999 From: tim_one@email.msn.com (Tim Peters) Date: Mon, 14 Jun 1999 22:50:08 -0400 Subject: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <14181.20036.857729.999835@anthem.cnri.reston.va.us> Message-ID: <000001beb6d9$c82e7980$069e2299@tim> >> wink> + string.join(seq) [etc] [Barry] > Check the CVS branch. It does seem pretty cool! It's even more fun to play with than to argue about . Thank you, Barry! 
A bug: >>> 'ab'.endswith('b',0,1) # right 0 >>> 'ab'.endswith('ab',0,1) # wrong 1 >>> 'ab'.endswith('ab',0,0) # wrong 1 >>> Two legit compiler warnings from a previous checkin: Objects\intobject.c(236) : warning C4013: 'isspace' undefined; assuming extern returning int Objects\intobject.c(243) : warning C4013: 'isalnum' undefined; assuming extern returning int One docstring glitch ("very" -> "every"): >>> print ''.join.__doc__ S.join(sequence) -> string Return a string which is the concatenation of the string representation of very element in the sequence. The separator between elements is S. >>> "-".join("very nice indeed! ly".split()) + " y'rs - tim" From MHammond@skippinet.com.au Tue Jun 15 04:13:03 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Tue, 15 Jun 1999 13:13:03 +1000 Subject: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <000001beb6d9$c82e7980$069e2299@tim> Message-ID: <00e901beb6dc$fc830d60$0801a8c0@bobcat> > "-".join("very nice indeed! ly".split()) + " y'rs - tim" But now the IDLE "CallTips" extenion seems lame. Typing >>> " ".join( doesnt yield the help, where: >>> s=" "; s.join( does :-) Very cute, I must say. The biggest temptation is going to be, as I mentioned, avoiding the use of this stuff for "general" code. Im still unconvinced the "sep.join" concept is natural, but string methods in general sure as hell are. Guido almost hinted that post 1.5.2 interim release(s?) would be acceptable, so long as he didnt have to do it! Im tempted to volunteer to agree to do something for Windows, and if no other platform biggots volunteer, I wont mind in the least :-) I realize it still needs settling down, but this is too good to keep to "ourselves" (being CVS enabled people) for too long ;-) Mark. From tim_one@email.msn.com Tue Jun 15 06:29:03 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 01:29:03 -0400 Subject: [Python-Dev] RE: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <199906141512.LAA00793@eric.cnri.reston.va.us> Message-ID: <000a01beb6ef$fac66ea0$069e2299@tim> [Guido] >>> - On the other hand, a list of ints takes half the memory (given that >>> most of those ints are not shared). [Tim] >> Isn't this 2/3 rather than 1/2? [yadda yadda] [Guido] > An int is currently 3 units: type, refcnt, value. (The sepcial int > allocator means that there's no malloc overhead.) A list item is one > unit. So a list of N ints is 4N units (+ overhead). In the proposed > scheme, there would be 2 units. That makes a factor of 1/2 for me... Well, if you count the refcount, sure . Moving on, implies you're not contemplating making the descriptor big enough to hold a float (else it would still be 4 units assuming natural alignment), in turn implying that *only* ints would get the space advantage in lists/tuples? Plus maybe special-casing the snot out of short strings? >> Well, Python already has homogeneous int lists (array.array), >> and while they save space they suffer in speed ... > Which would become faster with the proposed scheme since it would not > require any heap allocation (presuming 2-unit structs can be passed > around as function results). They can be in any std (even reasonable) C (or C++). If this gets serious, though, strongly suggest timing it on important compiler + platform combos, especially RISC. 
You can probably *count* on a PyObject* result getting returned in a register, but depressed C++ compiler jockeys have been known to treat struct/class returns via an unoptimized chain of copy constructors. Probably better to allocate "result space" in the caller and pass that via reference to the callee. With care, you can get the result written into its final resting place efficiently then, more efficiently than even a gonzo globally optimizing compiler could figure out (A calls B call C calls D, and A can tell D exactly where to store the result if it's explicit). >> [other ramifications for >> "i = j + k" >> ] > I though this was assumed all the time? Apparently it was! At least by you . Now by me too; no problem. >> [refcount-on-int drivel] > Don't get this -- there's no object on the heap to hold the refcnt. I don't get it either. Desperation? The idea that incref/decref may need to be treated as virtual methods (in order to exempt ints or other possible direct values) really disturbs me -- incref/decref happen *all* the time, explicit integer ops only some of the time. Turning incref/decref into indirected function calls doesn't sound promising at all. Injecting a test-branch guard via macro sounds faster but still icky, and especially if the set of exempt types isn't a singleton. no-positive-suggestions-just-grousing-ly y'rs - tim From tim_one@email.msn.com Tue Jun 15 07:17:02 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 02:17:02 -0400 Subject: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <00e901beb6dc$fc830d60$0801a8c0@bobcat> Message-ID: <001201beb6f6$af0987c0$069e2299@tim> [Mark Hammond] > ... > But now the IDLE "CallTips" extenion seems lame. > > Typing > >>> " ".join( > > doesnt yield the help, where: > >>> s=" "; s.join( > > does :-) No Windows Guy will be stymied by how to hack that! Hint: string literals always end with one of two characters . > Very cute, I must say. The biggest temptation is going to be, as I > mentioned, avoiding the use of this stuff for "general" code. Im still > unconvinced the "sep.join" concept is natural, but string methods in > general sure as hell are. sep.join bothered me until I gave the separator a name (a la the "space.join, tab.join", etc examples earlier). Then it looked *achingly* natural! Using a one-character literal instead still rubs me the wrong way, although for some reason e.g. ", ".join(seq) no longer does. I can't account for any of it, but I know what I like . > Guido almost hinted that post 1.5.2 interim release(s?) would be > acceptable, so long as he didnt have to do it! Im tempted to volunteer to > agree to do something for Windows, and if no other platform biggots > volunteer, I wont mind in the least :-) I realize it still > needs settling down, but this is too good to keep to "ourselves" (being > CVS enabled people) for too long ;-) Yes, I really like the new string methods too! And I want to rewrite all of IDLE to use them ASAP . damn-the-users-let's-go-nuts-ly y'rs - tim From fredrik@pythonware.com Tue Jun 15 08:10:28 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 15 Jun 1999 09:10:28 +0200 Subject: [Python-Dev] Re: [Python-Dev] String methods... 
finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> Message-ID: <006801beb6fe$27490d80$f29b12c2@pythonware.com> > wink> + string.join(seq) doesn't currently convert seq elements to > wink> string type, and in my vision it would. At least three of > wink> us admit to mapping str across seq anyway before calling > wink> string.join, and I think it would be a nice convenience: hmm. consider the following: space = " " foo = L"foo" bar = L"bar" result = space.join((foo, bar)) what should happen if you run this: a) Python raises an exception b) result is an ordinary string object c) result is a unicode string object From ping@lfw.org Tue Jun 15 08:24:33 1999 From: ping@lfw.org (Ka-Ping Yee) Date: Tue, 15 Jun 1999 00:24:33 -0700 (PDT) Subject: [Python-Dev] Re: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <000001beb6d9$c82e7980$069e2299@tim> Message-ID: On Mon, 14 Jun 1999, Tim Peters wrote: > > A bug: > > >>> 'ab'.endswith('b',0,1) # right > 0 > >>> 'ab'.endswith('ab',0,1) # wrong > 1 > >>> 'ab'.endswith('ab',0,0) # wrong > 1 > >>> I assumed you meant that the extra arguments should be slices on the string being searched, i.e. specimen.startswith(text, start, end) is equivalent to specimen[start:end].startswith(text) without the overhead of slicing the specimen? Or did i understand you correctly? > Return a string which is the concatenation of the string representation > of very element in the sequence. The separator between elements is S. > >>> > > "-".join("very nice indeed! ly".split()) + " y'rs - tim" Yes, i have to agree that this (especially once you name the separator string) is a pretty nice way to present the "join" functionality. !ping "Is it so small a thing, To have enjoyed the sun, To have lived light in the Spring, To have loved, to have thought, to have done; To have advanced true friends, and beat down baffling foes-- That we must feign bliss Of a doubtful future date, And while we dream on this, Lose all our present state, And relegate to worlds... yet distant our repose?" -- Matthew Arnold From MHammond@skippinet.com.au Tue Jun 15 09:28:55 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Tue, 15 Jun 1999 18:28:55 +1000 Subject: [Python-Dev] RE: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <006801beb6fe$27490d80$f29b12c2@pythonware.com> Message-ID: <00f801beb709$1c874b90$0801a8c0@bobcat> > hmm. consider the following: > > space = " " > foo = L"foo" > bar = L"bar" > result = space.join((foo, bar)) > > what should happen if you run this: > > a) Python raises an exception > b) result is an ordinary string object > c) result is a unicode string object Well, we could take this to the extreme, and allow _every_ object to grow a join method, where join attempts to cooerce to the same type. Thus: " ".join([L"foo", L"bar"]) -> "foo bar" L" ".join(["foo", "bar"]) -> L"foo bar" " ".join([1,2]) -> "1 2" 0.join(['1',2']) -> 102 [].join([...]) # exercise for the reader ;-) etc. Mark. From ping@lfw.org Tue Jun 15 09:50:34 1999 From: ping@lfw.org (Ka-Ping Yee) Date: Tue, 15 Jun 1999 01:50:34 -0700 (PDT) Subject: [Python-Dev] Re: [Python-Dev] RE: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <00f801beb709$1c874b90$0801a8c0@bobcat> Message-ID: On Tue, 15 Jun 1999, Mark Hammond wrote: > > hmm. 
consider the following: > > > > space = " " > > foo = L"foo" > > bar = L"bar" > > result = space.join((foo, bar)) > > > > what should happen if you run this: > > > > a) Python raises an exception > > b) result is an ordinary string object > > c) result is a unicode string object > > Well, we could take this to the extreme, and allow _every_ object to grow a > join method, where join attempts to cooerce to the same type. I think i'd agree with Mark's answer for this situation, though i don't know about adding 'join' methods to other types. I see two arguments that can be made here: For b): the result should match the type of the object on which the method was called. This way the type of the result more easily determinable by the programmer or reader. Also, since the type of the result is immediately known to the "join" code, each member of the passed-in sequence need only be fetched once, and a __getitem__-style generator can easily stand in for the sequence. For c): the result should match the "biggest" type among the operands. This behaviour is consistent with what you would get if you added all the operands together. Unfortunately this means you have to see all the operands before you know the type of the result, which means you either scan twice or convert potentially the whole result. b) weighs more strongly in my opinion, so i think the right thing to do is to match the type of the separator. (But if a Unicode string contains characters outside of the Latin-1 range, is it supposed to raise an exception on an attempt to convert to an ordinary string? In that case, the actual behaviour of the above example would be a) and i'm not sure if that would get annoying fast.) -- ?!ng "In the sciences, we are now uniquely privileged to sit side by side with the giants on whose shoulders we stand." -- Gerald Holton From gstein@lyra.org Tue Jun 15 10:05:43 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 15 Jun 1999 02:05:43 -0700 Subject: [Python-Dev] Re: String methods... finally References: Message-ID: <37661767.37D8E370@lyra.org> Ka-Ping Yee wrote: >... > (But if a Unicode string contains characters outside of > the Latin-1 range, is it supposed to raise an exception > on an attempt to convert to an ordinary string? In that > case, the actual behaviour of the above example would be > a) and i'm not sure if that would get annoying fast.) I forget the "last word" on this, but (IMO) str(unicode_object) should return a UTF-8 encoded string. Cheers, -g p.s. what's up with Mailman... it seems to have broken badly on the [Python-Dev] insertion... I just stripped a bunch of 'em -- Greg Stein, http://www.lyra.org/ From fredrik@pythonware.com Tue Jun 15 10:48:40 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 15 Jun 1999 11:48:40 +0200 Subject: [Python-Dev] Re: String methods... finally References: Message-ID: <003e01beb714$55d7fd80$f29b12c2@pythonware.com> > > > a) Python raises an exception > > > b) result is an ordinary string object > > > c) result is a unicode string object > > > > Well, we could take this to the extreme, and allow _every_ object to grow a > > join method, where join attempts to cooerce to the same type. well, I think that unicode strings and ordinary strings should behave like "strings" where possible, just like integers, floats, long integers and complex values be- have like "numbers" in many (but not all) situations. 
if we make unicode strings easier to mix with ordinary strings, we don't necessarily have to make integers and lists easier to mix with strings too... (people who want that can use Tcl instead ;-) > I think i'd agree with Mark's answer for this situation, though > i don't know about adding 'join' methods to other types. I see two > arguments that can be made here: > > For b): the result should match the type of the object > on which the method was called. This way the type of > the result more easily determinable by the programmer > or reader. Also, since the type of the result is > immediately known to the "join" code, each member of the > passed-in sequence need only be fetched once, and a > __getitem__-style generator can easily stand in for the > sequence. > > For c): the result should match the "biggest" type among > the operands. This behaviour is consistent with what > you would get if you added all the operands together. > Unfortunately this means you have to see all the operands > before you know the type of the result, which means you > either scan twice or convert potentially the whole result. > > b) weighs more strongly in my opinion, so i think the right > thing to do is to match the type of the separator. > > (But if a Unicode string contains characters outside of > the Latin-1 range, is it supposed to raise an exception > on an attempt to convert to an ordinary string? In that > case, the actual behaviour of the above example would be > a) and i'm not sure if that would get annoying fast.) exactly. there are some major issues hidden in here, including: 1) what should "str" do for unicode strings? 2) should join really try to convert its arguments? 3) can "str" really raise an exception for a built-in type? 4) should code written by americans fail when used in other parts of the world? based on string-sig input, the unicode class currently solves (1) by returning a UTF-8 encoded version of the unicode string contents. this was chosen to make sure that the answer to (3) is "no, never", and that the an- swer (4) is "not always, at least" -- we've had enough of that, thank you: http://www.lysator.liu.se/%e5ttabitars/7bit-example.txt if (1) is a reasonable solution (I think it is), I think the answer to (2) should be no, based on the rule of least surprise. Python has always required me to explicitly state when I want to convert things in a way that may radically change their meaning. I see little reason to abandon that in 1.6. From gstein@lyra.org Tue Jun 15 11:01:09 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 15 Jun 1999 03:01:09 -0700 Subject: [Python-Dev] Re: [Python-Dev] Re: String methods... finally References: <003e01beb714$55d7fd80$f29b12c2@pythonware.com> Message-ID: <37662465.682FA81B@lyra.org> Fredrik Lundh wrote: >... > if (1) is a reasonable solution (I think it is), I think the > answer to (2) should be no, based on the rule of least > surprise. Python has always required me to explicitly > state when I want to convert things in a way that may > radically change their meaning. I see little reason to > abandon that in 1.6. Especially because it is such a simple translation: sep.join(sequence) becomes sep.join(map(str, sequence)) Very obvious what is happening. It isn't hard to read, and it doesn't take a lot out of a person to insert that extra phrase. And hey... people can always do: def strjoin(sep, seq): return sep.join(map(str, seq)) And just use strjoin() everywhere if they hate the typing. 
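A quick usage sketch of that helper, assuming the new string methods from the branch:

    def strjoin(sep, seq):
        return sep.join(map(str, seq))

    strjoin(" ", [1, 2, "three"])     # -> '1 2 three'
    strjoin("-", ["a", "b", "c"])     # -> 'a-b-c'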
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm@hypernet.com Tue Jun 15 14:08:08 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Tue, 15 Jun 1999 08:08:08 -0500 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: String methods... finally In-Reply-To: <37662465.682FA81B@lyra.org> Message-ID: <1282670144-103087754@hypernet.com> Greg Stein wrote: ... > And hey... people can always do: > > def strjoin(sep, seq): > return sep.join(map(str, seq)) > > And just use strjoin() everywhere if they hate the typing. Those who hate typing regard it as great injury that they have to define this. Of course, they'll gladly type huge long posts on the subject. But, I agree. string.join(['a', 'b', 3]) currently barfs. L" ".join(seq) should complain if seq isn't all unicode, and same for good old strings. - Gordon From guido@CNRI.Reston.VA.US Tue Jun 15 13:39:09 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 08:39:09 -0400 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: Your message of "Tue, 15 Jun 1999 09:10:28 +0200." <006801beb6fe$27490d80$f29b12c2@pythonware.com> References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> Message-ID: <199906151239.IAA02917@eric.cnri.reston.va.us> > hmm. consider the following: > > space = " " > foo = L"foo" > bar = L"bar" > result = space.join((foo, bar)) > > what should happen if you run this: > > a) Python raises an exception > b) result is an ordinary string object > c) result is a unicode string object The same should happen as for L"foo" + " " + L"bar". --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@mojam.com (Skip Montanaro) Tue Jun 15 13:50:59 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 15 Jun 1999 08:50:59 -0400 (EDT) Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <199906151239.IAA02917@eric.cnri.reston.va.us> References: <14179.61649.286195.248429@anthem.cnri.reston.va.us> <000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <14182.19420.462788.15633@cm-24-29-94-19.nycap.rr.com> Guido> The same should happen as for L"foo" + " " + L"bar". Remind me again, please. What mnemonic is "L" supposed to evoke? Long? Lundh? Are we talking about Unicode strings? If so, why not "U"? Apologies for my increased density. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip@mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From jack@oratrix.nl Tue Jun 15 13:58:05 1999 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 15 Jun 1999 14:58:05 +0200 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: Message by Guido van Rossum , Tue, 15 Jun 1999 08:39:09 -0400 , <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <19990615125805.8CF03303120@snelboot.oratrix.nl> > The same should happen as for L"foo" + " " + L"bar". This is probably the most reasonable solution. 
Unfortunately it breaks Marks truly novel suggestion that 0.join(1, 2) becomes 102, but I guess we'll have to live with that:-) -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From fredrik@pythonware.com Tue Jun 15 15:28:17 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 15 Jun 1999 16:28:17 +0200 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <00c201beb73b$5fa27b70$f29b12c2@pythonware.com> > > hmm. consider the following: > > > > space = " " > > foo = L"foo" > > bar = L"bar" > > result = space.join((foo, bar)) > > > > what should happen if you run this: > > > > a) Python raises an exception > > b) result is an ordinary string object > > c) result is a unicode string object > > The same should happen as for L"foo" + " " + L"bar". which is? (alright; for the moment, it's (a) for both: >>> import unicode >>> u = unicode.unicode >>> u("foo") + u(" ") + u("bar") Traceback (innermost last): File "", line 1, in ? TypeError: illegal argument type for built-in operation >>> u("foo") + " " + u("bar") Traceback (innermost last): File "", line 1, in ? TypeError: illegal argument type for built-in operation >>> u(" ").join(("foo", "bar")) Traceback (innermost last): File "", line 1, in ? TypeError: first argument must be sequence of unicode strings but that can of course be changed...) From guido@CNRI.Reston.VA.US Tue Jun 15 15:38:32 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 10:38:32 -0400 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: Your message of "Tue, 15 Jun 1999 16:28:17 +0200." <00c201beb73b$5fa27b70$f29b12c2@pythonware.com> References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us> <00c201beb73b$5fa27b70$f29b12c2@pythonware.com> Message-ID: <199906151438.KAA03355@eric.cnri.reston.va.us> > > The same should happen as for L"foo" + " " + L"bar". > > which is? Whatever it is -- I think we did a lot of reasoning about this, and perhaps we're not quite done -- but I truly believe that whatever is decided, join() should follow. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Tue Jun 15 16:28:11 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Tue, 15 Jun 1999 11:28:11 -0400 (EDT) Subject: [Python-Dev] Re: String methods... finally References: <37661767.37D8E370@lyra.org> Message-ID: <14182.28939.509040.125174@anthem.cnri.reston.va.us> >>>>> "GS" == Greg Stein writes: GS> p.s. what's up with Mailman... it seems to have broken badly GS> on the [Python-Dev] insertion... I just stripped a bunch of GS> 'em Harald Meland just checked in a fix for this, which I'm installing now, so the breakage should be just temporary. 
-Barry From tim_one@email.msn.com Tue Jun 15 16:33:38 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 11:33:38 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <006801beb6fe$27490d80$f29b12c2@pythonware.com> Message-ID: <000601beb744$70c6f9e0$979e2299@tim> > hmm. consider the following: > > space = " " > foo = L"foo" > bar = L"bar" > result = space.join((foo, bar)) > > what should happen if you run this: > > a) Python raises an exception > b) result is an ordinary string object > c) result is a unicode string object The proposal said #b, or, in general, that the resulting string be of the same flavor as the separator. From tim_one@email.msn.com Tue Jun 15 16:33:40 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 11:33:40 -0400 Subject: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: Message-ID: <000701beb744$71e450c0$979e2299@tim> >> A bug: >> >> >>> 'ab'.endswith('b',0,1) # right >> 0 >> >>> 'ab'.endswith('ab',0,1) # wrong >> 1 >> >>> 'ab'.endswith('ab',0,0) # wrong >> 1 >> >>> [Ka-Ping] > I assumed you meant that the extra arguments should be slices > on the string being searched, i.e. > > specimen.startswith(text, start, end) > > is equivalent to > > specimen[start:end].startswith(text) > > without the overhead of slicing the specimen? Or did i understand > you correctly? Yes, and e.g. 'ab'[0:1] == 'a', which does not end with 'ab'. So these are inconsistent today, and the second is a bug: >>> 'ab'[0:1].endswith('ab') 0 >>> 'ab'.endswith('ab', 0, 1) 1 >>> Or did I misunderstand you ? From gward@cnri.reston.va.us Tue Jun 15 16:41:39 1999 From: gward@cnri.reston.va.us (Greg Ward) Date: Tue, 15 Jun 1999 11:41:39 -0400 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <19990615125805.8CF03303120@snelboot.oratrix.nl>; from Jack Jansen on Tue, Jun 15, 1999 at 02:58:05PM +0200 References: <19990615125805.8CF03303120@snelboot.oratrix.nl> Message-ID: <19990615114139.A3697@cnri.reston.va.us> On 15 June 1999, Jack Jansen said: > > The same should happen as for L"foo" + " " + L"bar". > > This is probably the most reasonable solution. Unfortunately it breaks Marks > truly novel suggestion that 0.join(1, 2) becomes 102, but I guess we'll have > to live with that:-) Careful -- it actually works this way in Perl (well, except that join isn't a method of strings...): $ perl -de 1 [...] DB<2> $sep = 0 DB<3> @list = (1, 2) DB<4> p join ($sep, @list) 102 Cool! Who needs type-checking anyways? Greg -- Greg Ward - software developer gward@cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive voice: +1-703-620-8990 Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913 From tim_one@email.msn.com Tue Jun 15 16:58:48 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 11:58:48 -0400 Subject: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <000901beb747$f4531840$979e2299@tim> >> space = " " >> foo = L"foo" >> bar = L"bar" >> result = space.join((foo, bar)) > The same should happen as for L"foo" + " " + L"bar". Then " ".join([" ", 42]) should blow up, and auto-conversion for non-string types needs to be removed from the implementation. 
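For example (string.join as it behaves in 1.5.2; the method form assumes the branch semantics under discussion):

    import string

    string.join(map(str, ["args are", 1, 2]))   # explicit conversion: 'args are 1 2'
    # string.join(["args are", 1, 2])           # blows up today -- TypeError
    # " ".join(["args are", 1, 2])              # and would also blow up under the rule above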
The attraction of auto-conversion for me is that I had never once seen string.join blow up where the exception revealed a conceptual error; in every case conversion to string was the intent, and an obvious one at that. Just anal nagging. How about dropping Unicode instead ? Anyway, I'm already on record as saying auto-convert wasn't essential, and join should first and foremost make good sense for string arguments. off-to-work-ly y'rs - tim From MHammond@skippinet.com.au Tue Jun 15 23:29:32 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Wed, 16 Jun 1999 08:29:32 +1000 Subject: [Python-Dev] Re: String methods... finally In-Reply-To: <003e01beb714$55d7fd80$f29b12c2@pythonware.com> Message-ID: <010101beb77e$8af64430$0801a8c0@bobcat> > well, I think that unicode strings and ordinary strings > should behave like "strings" where possible, just like > integers, floats, long integers and complex values be- > have like "numbers" in many (but not all) situations. I obviously missed a few smileys in my post. I was serious that: L" ".join -> Unicode result " ".join -> String result and even " ".join([1,2]) -> "1 2" But integers and lists growing "join" methods was a little tounge in cheek :-) Mark. From da@ski.org Tue Jun 15 23:48:41 1999 From: da@ski.org (David Ascher) Date: Tue, 15 Jun 1999 15:48:41 -0700 (Pacific Daylight Time) Subject: [Python-Dev] mmap Message-ID: Another topic: what are the chances of adding the mmap module to the core distribution? It's restricted to a smallish set of platforms (modern Unices and Win32, I think), but it's quite small, and would be a nice thing to have available in the core, IMHO. (btw, the buffer object needs more documentation) --david From MHammond@skippinet.com.au Tue Jun 15 23:53:00 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Wed, 16 Jun 1999 08:53:00 +1000 Subject: [Python-Dev] String methods... finally In-Reply-To: <000901beb747$f4531840$979e2299@tim> Message-ID: <010201beb781$d1febf30$0801a8c0@bobcat> [Before I start: Skip mentioned "why L, not U". I know C/C++ uses L, presumably to denote a "long" string (presumably keeping the analogy between int and long ints). I guess Java has no such indicator, being native Unicode? Is there any sort of agreement that Python will use L"..." to denote Unicode strings? I would be happy with it. Also, should: print L"foo" -> 'foo' and print `L"foo"` -> L'foo' I would like to know if there is agreement for this, so I can change the Pythonwin implementation of Unicode now to make things more seamless later. ] > >> space = " " > >> foo = L"foo" > >> bar = L"bar" > >> result = space.join((foo, bar)) > > > The same should happen as for L"foo" + " " + L"bar". I must admit Guido's position has real appeal, even if just from a documentation POV. Eg, join can be defined as: sep.join([s1, ..., sn]) Returns s1 + sep + s2 + sep + ... + sepn Nice and simple to define and understand. Thus, if you can't add 2 items, you can't join them. Assuming the Unicode changes allow us to say: assert " " == L" ", "eek" assert L" " + "" == L" " assert " " + L"" == L" " # or even if this == " " Then this still works well in a Unicode environment; Unicode and strings could be mixed in the list, and as long as you understand what L" " + "" returns, you will understand immediately what the result of join() is going to be. 
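To make that definition concrete, here is a throwaway sketch in plain 1.5-style Python (illustration only -- the helper name is invented, and this is not the proposed implementation):

    def plus_join(sep, seq):
        # A literal reading of the definition above:
        #   sep.join([s1, ..., sn]) == s1 + sep + s2 + ... + sep + sn
        # so anything that cannot be "+"-ed with sep is rejected.
        if not seq:
            return sep[:0]
        result = seq[0]
        for item in seq[1:]:
            result = result + sep + item
        return result

    plus_join(" ", ["foo", "bar"])    # 'foo bar'
    plus_join(" ", ["foo", 42])       # TypeError, just like "foo" + " " + 42

Under that reading, mixing Unicode and 8-bit strings in the sequence works exactly as well (or as badly) as adding them does, which is the point being made above.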
> The attraction of auto-conversion for me is that I had never once seen > string.join blow up where the exception revealed a conceptual > error; in > every case conversion to string was the intent, and an > obvious one at that. OTOH, my gut tells me this is better - that an implicit conversion to the seperator type be performed. Also, it appears that this technique will never surprise anyone in a bad way. It seems the rule above, while simple, basically means "sep.join can only take string/Unicode objects", as all other objects will currently fail the add test. So, given that our rule is that the objects must all be strings, how can it hurt to help the user conform? > off-to-work-ly y'rs - tim where-i-should-be-instead-of-writing-rambling-mails-ly, Mark. From guido@CNRI.Reston.VA.US Tue Jun 15 23:54:42 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 18:54:42 -0400 Subject: [Python-Dev] mmap In-Reply-To: Your message of "Tue, 15 Jun 1999 15:48:41 PDT." References: Message-ID: <199906152254.SAA05114@eric.cnri.reston.va.us> > Another topic: what are the chances of adding the mmap module to the core > distribution? It's restricted to a smallish set of platforms (modern > Unices and Win32, I think), but it's quite small, and would be a nice > thing to have available in the core, IMHO. If it works on Linux, Solaris, Irix and Windows, and is reasonably clean, I'll take it. Please send it. > (btw, the buffer object needs more documentation) That's for Jack & Greg... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Wed Jun 16 00:04:17 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 19:04:17 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat> References: <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <199906152304.TAA05136@eric.cnri.reston.va.us> > Is there any sort of agreement that Python will use L"..." to denote > Unicode strings? I would be happy with it. I don't know of any agreement, but it makes sense. > Also, should: > print L"foo" -> 'foo' > and > print `L"foo"` -> L'foo' Yes, I think this should be the way. Exactly what happens to non-ASCII characters is up to the implementation. Do we have agreement on escapes like \xDDDD? Should \uDDDD be added? The difference between the two is that according to the ANSI C standard, which I follow rather strictly for string literals, '\xABCDEF' is a single character whose value is the lower bits (however many fit in a char) of 0xABCDEF; this makes it cumbersome to write a string consisting of a hex escape followed by a digit or letter a-f or A-F; you would have to use another hex escape or split the literal in two, like this: "\xABCD" "EF". (This is true for 8-bit chars as well as for long char in ANSI C.) The \u escape takes up to 4 bytes but is not ANSI C. In Java, \u has the additional funny property that it is recognized *everywhere* in the source code, not just in string literals, and I believe that this complicates the interpretation of things like "\\uffff" (is the \uffff interpreted before regular string \ processing happens?). I don't think we ought to copy this behavior, although JPython users or developers might disagree. (I don't know anyone who *uses* Unicode strings much, so it's hard to gauge the importance of these issues.) 
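For the record, the split-literal workaround mentioned above looks like this under the ANSI-C-style rule (which 1.5-era Python follows: the escape consumes every trailing hex digit and keeps only the low-order byte):

    s = "\xABCD"       # one character -- the escape swallows all four hex
                       # digits and only the low byte (0xCD) survives
    t = "\xAB" "CD"    # three characters -- adjacent literals are concatenated,
                       # so the escape stops at "AB" and "CD" stays literal text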
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Wed Jun 16 01:09:15 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Tue, 15 Jun 1999 19:09:15 -0500 Subject: [Python-Dev] String methods... finally In-Reply-To: <199906152304.TAA05136@eric.cnri.reston.va.us> References: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <1282630485-105472998@hypernet.com> Guido asks: > Do we have agreement on escapes like \xDDDD? Should \uDDDD be > added? > ... The \u escape > takes up to 4 bytes but is not ANSI C. How do endian issues fit in with \u? - Gordon From guido@CNRI.Reston.VA.US Wed Jun 16 00:20:07 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 19:20:07 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Tue, 15 Jun 1999 19:09:15 CDT." <1282630485-105472998@hypernet.com> References: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat> <1282630485-105472998@hypernet.com> Message-ID: <199906152320.TAA05211@eric.cnri.reston.va.us> > How do endian issues fit in with \u? I would assume that it uses the same rules as hex and octal numeric literals: these are always *written* in big-endian notation, since that is also what we use for decimal numbers. Thus, on a little-endian machine, the short integer 0x1234 would be stored as the bytes {0x34, 0x12} and so would the string literal "\x1234". --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Jun 16 00:27:44 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Tue, 15 Jun 1999 19:27:44 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <000901beb747$f4531840$979e2299@tim> <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <14182.57712.380574.385164@anthem.cnri.reston.va.us> >>>>> "MH" == Mark Hammond writes: MH> OTOH, my gut tells me this is better - that an implicit MH> conversion to the seperator type be performed. Right now, the implementation of join uses PyObject_Str() to str-ify the elements in the sequence. I can't remember, but in our Unicode worldview doesn't PyObject_Str() return a narrowed string if it can, and raise an exception if not? So maybe narrow-string's join shouldn't be doing it this way because that'll autoconvert to the separator's type, which breaks the symmetry. OTOH, we could promote sep to the type of sequence[0] and forward the call to it's join if it were a widestring. That should retain the symmetry. -Barry From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Jun 16 00:46:24 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Tue, 15 Jun 1999 19:46:24 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <010201beb781$d1febf30$0801a8c0@bobcat> <199906152304.TAA05136@eric.cnri.reston.va.us> Message-ID: <14182.58832.140587.711978@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Should \uDDDD be added? That'd be nice! :) Guido> In Java, \u has the additional funny property that it is Guido> recognized *everywhere* in the source code, not just in Guido> string literals, and I believe that this complicates the Guido> interpretation of things like "\\uffff" (is the \uffff Guido> interpreted before regular string \ processing happens?). No. 
JLS section 3.3 says[1] In addition to the processing implied by the grammar, for each raw input character that is a backslash \, input processing must consider how many other \ characters contiguously precede it, separating it from a non-\ character or the start of the input stream. If this number is even, then the \ is eligible to begin a Unicode escape; if the number is odd, then the \ is not eligible to begin a Unicode escape. and this is born out by example. -------------------- snip snip --------------------Uni.java public class Uni { static public void main(String[] args) { System.out.println("\\u00a9"); System.out.println("\u00a9"); } } -------------------- snip snip --------------------outputs \u00a9 © -------------------- snip snip -------------------- -Barry [1] http://java.sun.com/docs/books/jls/html/3.doc.html#44591 PS. it is wonderful having the JLS online :) From ping@lfw.org Tue Jun 15 17:05:40 1999 From: ping@lfw.org (Ka-Ping Yee) Date: Tue, 15 Jun 1999 09:05:40 -0700 (PDT) Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <19990615114139.A3697@cnri.reston.va.us> Message-ID: On Tue, 15 Jun 1999, Greg Ward wrote: > Careful -- it actually works this way in Perl (well, except that join > isn't a method of strings...): > > $ perl -de 1 > [...] > DB<2> $sep = 0 > > DB<3> @list = (1, 2) > > DB<4> p join ($sep, @list) > 102 > > Cool! Who needs type-checking anyways? Cool! So then >>> def f(x): return x ** 2 ... >>> def g(x): return x - 5 ... >>> h = join((f, g)) ... >>> h(8) 59 Right? Right? (Just kidding.) -- ?!ng "Any nitwit can understand computers. Many do." -- Ted Nelson From tim_one@email.msn.com Wed Jun 16 05:02:46 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 16 Jun 1999 00:02:46 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <199906152304.TAA05136@eric.cnri.reston.va.us> Message-ID: <000401beb7ad$175193c0$2ca22299@tim> [Guido] > Do we have agreement on escapes like \xDDDD? I think we have to agree to leave that alone -- it affects what e.g. the regular expression parser does too. > Should \uDDDD be added? Yes, but only in string literals. You don't want to be within 10 miles of Barry if you tell him that Emacs pymode has to treat the Unicode escape for a newline as if it were-- as Java treats it outside literals --an actual line break <0.01 wink>. > ... > The \u escape takes up to 4 bytes Not in Java: it requires exactly 4 hex characters after == exactly 2 bytes, and it's an error if it's followed by fewer than 4 hex characters. That's a good rule (simple!), while ANSI C's is too clumsy to live with if people want to take Unicode seriously. So what does it mean for a Unicode escape to appear in a non-L string? aha-the-secret-escape-to-ucs4-ly y'rs - tim From tim_one@email.msn.com Wed Jun 16 05:02:44 1999 From: tim_one@email.msn.com (Tim Peters) Date: Wed, 16 Jun 1999 00:02:44 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <000301beb7ad$1635c380$2ca22299@tim> [MarkH agonizes, over whether to auto-convert or not] Well, the rule *could* be that the result type is the widest string type among the separator and the sequences' string elements (if any), and other types convert to the result type along the way. I'd be more specific, except I'm not sure which flavor of string str() returns (or, indeed, whether that's up to each __str__ implementation). 
In any case, widening to Unicode should always be possible, and if "widest wins" it doesn't require a multi-pass algorithm regardless (although the partial result so far may need to be widened once -- but that's true even if auto-convert of non-string types isn't implemented). Or, IOW, sep.join([a, b, c]) == f(a) + sep + f(b) + sep + f(c) where I don't know how to spell f, but f(x) *means* x' = if x has a string type then x else x.__str__() return x' coerced to the widest string type seen so far So I think everyone can get what they want -- except that those who want auto-convert are at direct odds with those who prefer to wag Guido's fingers and go "tsk, tsk, we know what you want but you didn't say 'please' so your program dies" . master-of-fair-summaries-ly y'rs - tim From mal@lemburg.com Wed Jun 16 09:29:27 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 16 Jun 1999 10:29:27 +0200 Subject: [Python-Dev] String methods... finally References: <010201beb781$d1febf30$0801a8c0@bobcat> <199906152304.TAA05136@eric.cnri.reston.va.us> Message-ID: <37676067.62E272F4@lemburg.com> Guido van Rossum wrote: > > > Is there any sort of agreement that Python will use L"..." to denote > > Unicode strings? I would be happy with it. > > I don't know of any agreement, but it makes sense. The u"..." looks more intuitive too me. While inheriting C/C++ constructs usually makes sense I think usage in the C community is not that wide-spread yet and for a Python freak, the small u will definitely remind him of Unicode whereas the L will stand for (nearly) unlimited length/precision. Not that this is important, but... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 198 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fredrik@pythonware.com Wed Jun 16 10:53:23 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 11:53:23 +0200 Subject: [Python-Dev] String methods... finally References: <000401beb7ad$175193c0$2ca22299@tim> Message-ID: <00f701beb7de$cdb422f0$f29b12c2@pythonware.com> > > The \u escape takes up to 4 bytes > > Not in Java: it requires exactly 4 hex characters after == exactly 2 bytes, > and it's an error if it's followed by fewer than 4 hex characters. That's a > good rule (simple!), while ANSI C's is too clumsy to live with if people > want to take Unicode seriously. > > So what does it mean for a Unicode escape to appear in a non-L string? my suggestion is to store it as UTF-8; see the patches included in the unicode package for details. this also means that an u-string literal (L-string, whatever) could be stored as an 8-bit string internally. and that the following two are equivalent: string = u"foo" string = unicode("foo") also note that: unicode(str(u"whatever")) == u"whatever" ... on the other hand, this means that we have at least four major "arrays of bytes or characters" thingies mapped on two data types: the old string type is used for: -- plain old 8-bit strings (ascii, iso-latin-1, whatever) -- byte buffers containing arbitrary data -- unicode strings stored as 8-bit characters, using the UTF-8 encoding. and the unicode string type is used for: -- unicode strings stored as 16-bit characters is this reasonable? ... yet another question is how to deal with source code. is a python 1.6 source file written in ASCII, ISO Latin 1, or UTF-8. speaking from a non-us standpoint, it would be really cool if you could write Python sources in UTF-8... 
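To make the proposed storage model concrete, here is a tiny hand-rolled sketch of the UTF-8 byte layout (illustration only -- it ignores surrogates and is not how the unicode package actually does its encoding):

    import string

    def utf8(codepoints):
        # Encode a list of code points below 0x10000 as an ordinary
        # 8-bit string using the UTF-8 layout.
        bytes = []
        for cp in codepoints:
            if cp < 0x80:
                bytes.append(chr(cp))
            elif cp < 0x800:
                bytes.append(chr(0xC0 | (cp >> 6)))
                bytes.append(chr(0x80 | (cp & 0x3F)))
            else:
                bytes.append(chr(0xE0 | (cp >> 12)))
                bytes.append(chr(0x80 | ((cp >> 6) & 0x3F)))
                bytes.append(chr(0x80 | (cp & 0x3F)))
        return string.join(bytes, '')

    utf8(map(ord, "foo"))    # 'foo' -- pure ASCII survives unchanged, so the
                             # 8-bit and Unicode spellings can share one format
    utf8([0x00A9])           # two bytes, 0xC2 0xA9 -- non-ASCII characters
                             # expand, but the result is still a plain string

This is why u"foo" and "foo" can share the same internal bytes under such a scheme.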
From gstein@lyra.org Wed Jun 16 11:13:45 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 16 Jun 1999 03:13:45 -0700 (PDT) Subject: [Python-Dev] mmap In-Reply-To: <199906152254.SAA05114@eric.cnri.reston.va.us> Message-ID: On Tue, 15 Jun 1999, Guido van Rossum wrote: > > Another topic: what are the chances of adding the mmap module to the core > > distribution? It's restricted to a smallish set of platforms (modern > > Unices and Win32, I think), but it's quite small, and would be a nice > > thing to have available in the core, IMHO. > > If it works on Linux, Solaris, Irix and Windows, and is reasonably > clean, I'll take it. Please send it. Actually, my preference is to see a change to open() rather than a whole new module. For example, let's say that you open a file, specifying memory-mapping. Then you create a buffer against that file: f = open('foo','rm') # 'm' means mem-map b = buffer(f) print b[100:200] Disclaimer: I haven't looked at the mmap modules (AMK's and Mark's) to see what capabilities are in there. They may not be expressable soly as open() changes. (adding add'l params for mmap flags might be another way to handle this) I'd like to see mmap native in Python. I won't push, though, until I can run a test to see what kind of savings will occur when you mmap a .pyc file and open PyBuffer objects against the thing for the code bytes. My hypothesis is that you can reduce the working set of Python (i.e. amortize the cost of a .pyc's code over several processes by mmap'ing it); this depends on the proportion of code in the pyc relative to "other" stuff. > > (btw, the buffer object needs more documentation) > > That's for Jack & Greg... Quite true. My bad :-( ... That would go into the API doc, I guess... I'll put this on a todo list, but it could be a little while. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik@pythonware.com Wed Jun 16 11:53:29 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 12:53:29 +0200 Subject: [Python-Dev] mmap References: Message-ID: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Greg wrote: > Actually, my preference is to see a change to open() rather than a whole > new module. For example, let's say that you open a file, specifying > memory-mapping. Then you create a buffer against that file: > > f = open('foo','rm') # 'm' means mem-map > b = buffer(f) > print b[100:200] > > Disclaimer: I haven't looked at the mmap modules (AMK's and Mark's) to see > what capabilities are in there. They may not be expressable soly as open() > changes. (adding add'l params for mmap flags might be another way to > handle this) > > I'd like to see mmap native in Python. I won't push, though, until I can > run a test to see what kind of savings will occur when you mmap a .pyc > file and open PyBuffer objects against the thing for the code bytes. My > hypothesis is that you can reduce the working set of Python (i.e. amortize > the cost of a .pyc's code over several processes by mmap'ing it); this > depends on the proportion of code in the pyc relative to "other" stuff. yes, yes, yes! my good friend the mad scientist (the guy who writes code, not the flaming cult-ridden brainwashed script kiddie) has considered writing a whole new "abstract file" backend, to entirely get rid of stdio in the Python core. some potential advantages: -- performance (some stdio implementations are slow) -- portability (stdio doesn't exist on some platforms!) -- opens up for cool extensions (memory mapping, pluggable file handlers, etc). 
should I tell him to start hacking? or is this the same thing as PyBuffer/buffer (I've implemented PyBuffer support for the unicode class, but that doesn't mean that I understand how it works...) PS. someone once told me that Perl goes "below" the standard file I/O system. does anyone here know if that's true, and per- haps even explain how they're doing that... From guido@CNRI.Reston.VA.US Wed Jun 16 13:19:10 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 16 Jun 1999 08:19:10 -0400 Subject: [Python-Dev] mmap In-Reply-To: Your message of "Wed, 16 Jun 1999 03:13:45 PDT." References: Message-ID: <199906161219.IAA05802@eric.cnri.reston.va.us> [me] > > If it works on Linux, Solaris, Irix and Windows, and is reasonably > > clean, I'll take it. Please send it. [Greg] > Actually, my preference is to see a change to open() rather than a whole > new module. For example, let's say that you open a file, specifying > memory-mapping. Then you create a buffer against that file: > > f = open('foo','rm') # 'm' means mem-map > b = buffer(f) > print b[100:200] Buh. Changes of this kind to builtins are painful, especially since we expect that this feature may or may not be supported. And imagine the poor reader who comes across this for the first time... What's wrong with import mmap f = mmap.open('foo', 'r') ??? > I'd like to see mmap native in Python. I won't push, though, until I can > run a test to see what kind of savings will occur when you mmap a .pyc > file and open PyBuffer objects against the thing for the code bytes. My > hypothesis is that you can reduce the working set of Python (i.e. amortize > the cost of a .pyc's code over several processes by mmap'ing it); this > depends on the proportion of code in the pyc relative to "other" stuff. We've been through this before. I still doubt it will help much. Anyway, it's a completely independent feature from making the mmap module(any mmap module) available to users. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Wed Jun 16 13:24:26 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 16 Jun 1999 08:24:26 -0400 Subject: [Python-Dev] mmap In-Reply-To: Your message of "Wed, 16 Jun 1999 12:53:29 +0200." <015b01beb7e6$79b61610$f29b12c2@pythonware.com> References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Message-ID: <199906161224.IAA05815@eric.cnri.reston.va.us> > my good friend the mad scientist (the guy who writes code, > not the flaming cult-ridden brainwashed script kiddie) has > considered writing a whole new "abstract file" backend, to > entirely get rid of stdio in the Python core. some potential > advantages: > > -- performance (some stdio implementations are slow) > -- portability (stdio doesn't exist on some platforms!) You have this backwards -- you'd have to port the abstract backend first! Also don't forget that a *good* stdio might be using all sorts of platform-specific tricks that you'd have to copy to match its performance. > -- opens up for cool extensions (memory mapping, > pluggable file handlers, etc). > > should I tell him to start hacking? Tcl/Tk does this. I see some advantages (e.g. you have more control over and knowledge of how much data is buffered) but also some disadvantages (more work to port, harder to use from C), plus tons of changes needed in the rest of Python. I'd say wait until Python 2.0 and let's keep stdio for 1.6. > PS. someone once told me that Perl goes "below" the standard > file I/O system. 
does anyone here know if that's true, and per- > haps even explain how they're doing that... Probably just means that they use the C equivalent of os.open() and friends. --Guido van Rossum (home page: http://www.python.org/~guido/) From gward@cnri.reston.va.us Wed Jun 16 13:25:34 1999 From: gward@cnri.reston.va.us (Greg Ward) Date: Wed, 16 Jun 1999 08:25:34 -0400 Subject: [Python-Dev] mmap In-Reply-To: <015b01beb7e6$79b61610$f29b12c2@pythonware.com>; from Fredrik Lundh on Wed, Jun 16, 1999 at 12:53:29PM +0200 References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Message-ID: <19990616082533.A4142@cnri.reston.va.us> On 16 June 1999, Fredrik Lundh said: > my good friend the mad scientist (the guy who writes code, > not the flaming cult-ridden brainwashed script kiddie) has > considered writing a whole new "abstract file" backend, to > entirely get rid of stdio in the Python core. some potential > advantages: [...] > PS. someone once told me that Perl goes "below" the standard > file I/O system. does anyone here know if that's true, and per- > haps even explain how they're doing that... My understanding (mainly from folklore -- peeking into the Perl source has been known to turn otherwise staid, solid programmers into raving lunatics) is that yes, Perl does grovel around in the internals of stdio implementations to wring a few extra cycles out. However, what's probably of more interest to you -- I mean your mad scientist alter ego -- is Perl's I/O abstraction layer: a couple of years ago, somebody hacked up Perl's guts to do basically what you're proposing for Python. The main result was a half-baked, unfinished (at least as of last summer, when I actually asked an expert in person at the Perl Conference) way of building Perl with AT&T's sfio library instead of stdio. I think the other things you mentioned, eg. more natural support for memory-mapped files, have also been bandied about as advantages of this scheme. The main problem with Perl's I/O abstraction layer is that extension modules now have to call e.g. PerlIO_open(), PerlIO_printf(), etc. in place of their stdio counterparts. Surprise surprise, many extension modules have not adapted to the new way of doing things, even though it's been in Perl since version 5.003 (I think). Even more surprisingly, the fourth-party C libraries that those extension modules often interface to haven't switched to using Perl's I/O abstraction layer. This doesn't make a whit of difference if Perl is built in either the "standard way" (no abstraction layer, just direct stdio) or with the abstraction layer on top of stdio. But as soon as some poor fool decides Perl on top of sfio would be neat, lots of extension modules break -- their I/O calls go nowhere. I'm sure there is some sneaky way to make it all work using sfio's binary compatibility layer and some clever macros. This might even have been done. However, AFAIK it's not been documented anywhere. This is not merely to bitch about unfinished business in the Perl core; it's to warn you that others have walked down the road you propose to tread, and there may be potholes. Now if the Python source really does get even more modularized for 1.6, you might have a much easier job of it. ("Modular" is not the word that jumps to mind when one looks at the Perl source code.) 
Greg /* * "Far below them they saw the white waters pour into a foaming bowl, and * then swirl darkly about a deep oval basin in the rocks, until they found * their way out again through a narrow gate, and flowed away, fuming and * chattering, into calmer and more level reaches." */ -- Tolkein, by way of perl/doio.c -- Greg Ward - software developer gward@cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive voice: +1-703-620-8990 Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913 From beazley@cs.uchicago.edu Wed Jun 16 14:23:32 1999 From: beazley@cs.uchicago.edu (David Beazley) Date: Wed, 16 Jun 1999 08:23:32 -0500 (CDT) Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Message-ID: <199906161323.IAA28642@gargoyle.cs.uchicago.edu> Fredrik Lundh writes: > > my good friend the mad scientist (the guy who writes code, > not the flaming cult-ridden brainwashed script kiddie) has > considered writing a whole new "abstract file" backend, to > entirely get rid of stdio in the Python core. some potential > advantages: > > -- performance (some stdio implementations are slow) > -- portability (stdio doesn't exist on some platforms!) > -- opens up for cool extensions (memory mapping, > pluggable file handlers, etc). > > should I tell him to start hacking? > I am not in favor of obscuring Python's I/O model too much. When working with C extensions, it is critical to have access to normal I/O mechanisms such as 'FILE *' or integer file descriptors. If you hide all of this behind some sort of abstract I/O layer, it's going to make life hell for extension writers unless you also provide a way to get access to the raw underlying data structures. This is a major gripe I have with the Tcl channel model--namely, there seems to be no easy way to unravel a Tcl channel into a raw file-descriptor for use in C (unless I'm being dense and have missed some simple way to do it). Also, what platforms are we talking about here? I've never come across any normal machine that had a C compiler, but did not have stdio. Is this really a serious problem? Cheers, Dave From MHammond@skippinet.com.au Wed Jun 16 14:47:44 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Wed, 16 Jun 1999 23:47:44 +1000 Subject: [Python-Dev] mmap In-Reply-To: <19990616082533.A4142@cnri.reston.va.us> Message-ID: <011c01beb7fe$d213c600$0801a8c0@bobcat> [Greg writes] > The main problem with Perl's I/O abstraction layer is that extension > modules now have to call e.g. PerlIO_open(), PerlIO_printf(), etc. in > place of their stdio counterparts. Surprise surprise, many extension Interestingly, Python _nearly_ suffers this problem now. Although Python does use native FILE pointers, this scheme still assumes that Python and the extensions all use the same stdio. I understand that on most Unix system this can be taken for granted. However, to be truly cross-platform, this assumption may not be valid. A case in point is (surprise surprise :-) Windows. Windows has a number of C RTL options, and Python and its extensions must be careful to select the one that shares FILE * and the heap across separately compiled and linked modules. In-fact, Windows comes with an excellent debug version of the C RTL, but this gets in Python's way - if even one (but not all) Python extension attempts to use these debugging features, we die in a big way. and-dont-even-talk-to-me-about-Windows-CE ly, Mark. From bwarsaw@cnri.reston.va.us (Barry A. 
Warsaw) Wed Jun 16 15:42:01 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Wed, 16 Jun 1999 10:42:01 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <010201beb781$d1febf30$0801a8c0@bobcat> <199906152304.TAA05136@eric.cnri.reston.va.us> <37676067.62E272F4@lemburg.com> Message-ID: <14183.47033.656933.642197@anthem.cnri.reston.va.us> >>>>> "M" == M writes: M> The u"..." looks more intuitive too me. While inheriting C/C++ M> constructs usually makes sense I think usage in the C community M> is not that wide-spread yet and for a Python freak, the small u M> will definitely remind him of Unicode whereas the L will stand M> for (nearly) unlimited length/precision. I don't think I've every seen C code with L"..." strings in them. Here's my list in no particular order. U"..." -- reminds Java/JPython users of Unicode. Alternative mnemonic: Unamerican-strings L"..." -- long-strings, Lundh-strings, ... W"..." -- wide-strings, Warsaw-strings (just trying to take credit where credit's not due :), what-the-heck-are-these?-strings H"..." -- happy-strings, Hammond-strings, hey-you-just-made-my-extension-module-crash-strings F"..." -- funky-stuff-in-these-hyar-strings A"..." -- ain't-strings S"..." -- strange-strings, silly-strings M> Not that this is important, but... Agreed. -Barry From fredrik@pythonware.com Wed Jun 16 20:11:02 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 21:11:02 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <19990616082533.A4142@cnri.reston.va.us> Message-ID: <001901beb82b$fab54200$f29b12c2@pythonware.com> Greg Ward wrote: > This is not merely to bitch about unfinished business in the Perl core; > it's to warn you that others have walked down the road you propose to > tread, and there may be potholes. oh, the mad scientist have rushed down that road a few times before. we'll see if he's prepared to do that again; it sure won't happen before the unicode stuff is in place... From fredrik@pythonware.com Wed Jun 16 20:16:56 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 21:16:56 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161224.IAA05815@eric.cnri.reston.va.us> Message-ID: <004a01beb82e$36ba54a0$f29b12c2@pythonware.com> > > -- performance (some stdio implementations are slow) > > -- portability (stdio doesn't exist on some platforms!) > > You have this backwards -- you'd have to port the abstract backend > first! Also don't forget that a *good* stdio might be using all sorts > of platform-specific tricks that you'd have to copy to match its > performance. well, if the backend layer is good enough, I don't think a stdio-based standard version will be much slower than todays stdio-only implementation. > > PS. someone once told me that Perl goes "below" the standard > > file I/O system. does anyone here know if that's true, and per- > > haps even explain how they're doing that... > > Probably just means that they use the C equivalent of os.open() and > friends. hopefully. my original source described this as "digging around in the innards of the stdio package" (and so did greg). and the same source claimed it wasn't yet ported to Linux. sounds weird, to say the least, but maybe he referred to that sfio package greg mentioned. I'll do some digging, but not today. 
From fredrik@pythonware.com Wed Jun 16 20:27:02 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 21:27:02 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161323.IAA28642@gargoyle.cs.uchicago.edu> Message-ID: <004b01beb82e$36d44540$f29b12c2@pythonware.com> David Beazley wrote: > I am not in favor of obscuring Python's I/O model too much. When > working with C extensions, it is critical to have access to normal I/O > mechanisms such as 'FILE *' or integer file descriptors. If you hide > all of this behind some sort of abstract I/O layer, it's going to make > life hell for extension writers unless you also provide a way to get > access to the raw underlying data structures. This is a major gripe > I have with the Tcl channel model--namely, there seems to be no easy > way to unravel a Tcl channel into a raw file-descriptor for use in C > (unless I'm being dense and have missed some simple way to do it). > > Also, what platforms are we talking about here? I've never come > across any normal machine that had a C compiler, but did not have stdio. > Is this really a serious problem? in a way, it is a problem today under Windows (in other words, on most of the machines where Python is used today). it's very easy to end up with different DLL's using different stdio implementations, resulting in all kinds of strange errors. a rewrite could use OS-level handles instead, and get rid of that problem. not to mention Windows CE (iirc, Mark had to write his own stdio-ish package for the CE port), maybe PalmOS, BeOS's BFile's, and all the other upcoming platforms which will make Windows look like a fairly decent Unix clone ;-) ... and in Python, any decent extension writer should write code that works with arbitrary file objects, right? "if it cannot deal with StringIO objects, it's broken"... From beazley@cs.uchicago.edu Wed Jun 16 20:53:23 1999 From: beazley@cs.uchicago.edu (David Beazley) Date: Wed, 16 Jun 1999 14:53:23 -0500 (CDT) Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161323.IAA28642@gargoyle.cs.uchicago.edu> <004b01beb82e$36d44540$f29b12c2@pythonware.com> Message-ID: <199906161953.OAA04527@gargoyle.cs.uchicago.edu> Fredrik Lundh writes: > > and in Python, any decent extension writer should write > code that works with arbitrary file objects, right? "if it > cannot deal with StringIO objects, it's broken"... I disagree. Given that a lot of people use Python as a glue language for interfacing with legacy codes, it is unacceptable for extensions to be forced to use some sort of funky non-standard I/O abstraction. Unless you are volunteering to rewrite all of these codes to use the new I/O model, you are always going to need access (in one way or another) to plain old 'FILE *' and integer file descriptors. Of course, one can always just provide a function like FILE *PyFile_AsFile(PyObject *o) That takes an I/O object and returns a 'FILE *' where supported. (Of course, if it's not supported, then it doesn't matter if this function is missing since any extension that needs a 'FILE *' wouldn't work anyways). 
Cheers, Dave From fredrik@pythonware.com Wed Jun 16 21:04:54 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 22:04:54 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com><199906161323.IAA28642@gargoyle.cs.uchicago.edu><004b01beb82e$36d44540$f29b12c2@pythonware.com> <199906161953.OAA04527@gargoyle.cs.uchicago.edu> Message-ID: <009d01beb833$80d15d40$f29b12c2@pythonware.com> > > and in Python, any decent extension writer should write > > code that works with arbitrary file objects, right? "if it > > cannot deal with StringIO objects, it's broken"... > > I disagree. Given that a lot of people use Python as a glue language > for interfacing with legacy codes, it is unacceptable for extensions > to be forced to use some sort of funky non-standard I/O abstraction. oh, you're right, of course. should have added that extra smiley to that last line. cut and paste from this mail if necessary: ;-) > Unless you are volunteering to rewrite all of these codes to use the > new I/O model, you are always going to need access (in one way or > another) to plain old 'FILE *' and integer file descriptors. Of > course, one can always just provide a function like > > FILE *PyFile_AsFile(PyObject *o) > > That takes an I/O object and returns a 'FILE *' where supported. exactly my idea. when scanning the code, PyFile_AsFile immediately popped up as a potential pothole (if you need the fileno, there's already a method for that in the "standard file object interface"). btw, an "abstract file object" could actually make it much easier to support arbitrary file objects from C/C++ extensions. just map the calls back to Python. or add a tp_file slot, and things get really interesting... > (Of course, if it's not supported, then it doesn't matter if this > function is missing since any extension that needs a 'FILE *' wouldn't > work anyways). yup. I suspect some legacy code may have a hard time running under CE et al. but of course, with a little macro trickery, no- thing stops you from recompiling such code so it uses Python's new "abstract file... okay, okay, I'll stop now ;-) From beazley@cs.uchicago.edu Wed Jun 16 21:13:42 1999 From: beazley@cs.uchicago.edu (David Beazley) Date: Wed, 16 Jun 1999 15:13:42 -0500 (CDT) Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161323.IAA28642@gargoyle.cs.uchicago.edu> <004b01beb82e$36d44540$f29b12c2@pythonware.com> <199906161953.OAA04527@gargoyle.cs.uchicago.edu> <009d01beb833$80d15d40$f29b12c2@pythonware.com> Message-ID: <199906162013.PAA04781@gargoyle.cs.uchicago.edu> Fredrik Lundh writes: > > > and in Python, any decent extension writer should write > > > code that works with arbitrary file objects, right? "if it > > > cannot deal with StringIO objects, it's broken"... > > > > I disagree. Given that a lot of people use Python as a glue language > > for interfacing with legacy codes, it is unacceptable for extensions > > to be forced to use some sort of funky non-standard I/O abstraction. > > oh, you're right, of course. should have added that extra smiley > to that last line. cut and paste from this mail if necessary: ;-) > Good. You had me worried there for a second :-). > > yup. I suspect some legacy code may have a hard time running > under CE et al. but of course, with a little macro trickery, no- > thing stops you from recompiling such code so it uses Python's > new "abstract file... okay, okay, I'll stop now ;-) Macro trickery? Oh yes, we could use that too... 
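The "arbitrary file object" contract being joked about above is nothing more than duck typing on read() and write(); a minimal sketch (the function name is invented for illustration):

    import StringIO

    def copy_stream(infile, outfile, blocksize=8192):
        # Works with anything that supplies read() and write(): real file
        # objects, StringIO objects, sockets wrapped with makefile(), etc.
        while 1:
            block = infile.read(blocksize)
            if not block:
                break
            outfile.write(block)

    src = StringIO.StringIO("some data")
    dst = StringIO.StringIO()
    copy_stream(src, dst)
    assert dst.getvalue() == "some data"

Extensions that insist on a FILE * obviously cannot be written this way, which is exactly the tension between this style and PyFile_AsFile().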
(one can never have too much macro trickery if you ask me :-) Cheers, Dave From arw@ifu.net Thu Jun 17 15:12:16 1999 From: arw@ifu.net (Aaron Watters) Date: Thu, 17 Jun 1999 10:12:16 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] Message-ID: <37690240.66F601E1@ifu.net> > no-positive-suggestions-just-grousing-ly y'rs - tim On the contrary. I think this is definitively a bad idea. Retracted. A double negative is a positive. -- Aaron Watters === "Criticism serves the same purpose as pain. It's not pleasant but it suggests that something is wrong." -- Churchill (paraphrased from memory) From da@ski.org Thu Jun 17 18:50:20 1999 From: da@ski.org (David Ascher) Date: Thu, 17 Jun 1999 10:50:20 -0700 (Pacific Daylight Time) Subject: [Python-Dev] org.python.org Message-ID: Not all that revolutionary, but an interesting migration path. FWIW, I think the underlying issue is a real one. We're starting to have more and more conflicts, even among package names. (Of course the symlink solution doesn't work on Win32, but that's a detail =). --david ---------- Forwarded message ---------- Date: Thu, 17 Jun 1999 13:44:33 -0400 (EDT) From: Andy Dustman To: Gordon McMillan Cc: M.-A. Lemburg , Crew List Subject: Re: [Crew] Wizards' Resolution to Zope/PIL/mxDateTime conflict? On Thu, 17 Jun 1999, Gordon McMillan wrote: > M.A.L. wrote: > > > Or maybe we should start the com.domain.mypackage thing ASAP. > > I know many are against this proposal (makes Python look Feudal? > Reminds people of the J language?), but I think it's the only thing > that makes sense. It does mean you have to do some ugly things to get > Pickle working properly. Actually, it can be done very easily. I just tried this, in fact: cd /usr/lib/python1.5 mkdir -p org/python (cd org/python; ln -s ../.. core) touch __init__.py org/__init__.py org/python/__init__.py >>> from org.python.core import rfc822 >>> import profile So this seems to make things nice and backwards compatible. My only concern was having __init__.py in /usr/lib/python1.5, but this doesn't seem to break anything. Of course, if you are using some trendy new atrocity like Windoze, this might not work. -- andy dustman | programmer/analyst | comstar communications corporation telephone: 770.485.6025 / 706.549.7689 | icq: 32922760 | pgp: 0xc72f3f1d _______________________________________________ Crew maillist - Crew@starship.python.net http://starship.python.net/mailman/listinfo/crew From gmcm@hypernet.com Thu Jun 17 20:36:49 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 17 Jun 1999 14:36:49 -0500 Subject: [Python-Dev] org.python.org In-Reply-To: Message-ID: <1282474031-114884629@hypernet.com> David forwards from Starship Crew list: > Not all that revolutionary, but an interesting migration path. > FWIW, I think the underlying issue is a real one. We're starting to > have more and more conflicts, even among package names. (Of course > the symlink solution doesn't work on Win32, but that's a detail =). > > --david > > ---------- Forwarded message ---------- > Date: Thu, 17 Jun 1999 13:44:33 -0400 (EDT) > From: Andy Dustman > To: Gordon McMillan > Cc: M.-A. Lemburg , Crew List > Subject: Re: [Crew] Wizards' Resolution to > Zope/PIL/mxDateTime conflict? > > On Thu, 17 Jun 1999, Gordon McMillan wrote: > > > M.A.L. wrote: > > > > > Or maybe we should start the com.domain.mypackage thing ASAP. > > > > I know many are against this proposal (makes Python look Feudal? 
> > Reminds people of the J language?), but I think it's the only thing > > that makes sense. It does mean you have to do some ugly things to get > > Pickle working properly. > > Actually, it can be done very easily. I just tried this, in fact: > > cd /usr/lib/python1.5 > mkdir -p org/python > (cd org/python; ln -s ../.. core) > touch __init__.py org/__init__.py org/python/__init__.py > > >>> from org.python.core import rfc822 > >>> import profile > > So this seems to make things nice and backwards compatible. My only > concern was having __init__.py in /usr/lib/python1.5, but this > doesn't seem to break anything. Of course, if you are using some > trendy new atrocity like Windoze, this might not work. In vanilla cases it's backwards compatible. I try packag-izing almost everything I install. Sometimes it works, sometimes it doesn't. In your example, rfc822 uses only builtins at the top level. It's main will import os. Would that work if os lived in org.python.core? Though I really don't think we need to packagize the std distr, (if that happens, I would think it would be for a different reason). The 2 main problems I run across in packagizing things are intra-package imports (where M.A.L's proposal for relative names in dotted imports might ease the pain) and Pickle / cPickle (where the ugliness of the workarounds has often made me drop back to marshal). - Gordon From MHammond@skippinet.com.au Fri Jun 18 09:31:21 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Fri, 18 Jun 1999 18:31:21 +1000 Subject: [Python-Dev] Merge the string_methods tag? Message-ID: <015601beb964$f37a4fa0$0801a8c0@bobcat> Ive been running the string_methods tag (term?) under CVS for quite some time now, and it seems to work perfectly. I admit that I havent stressed the string methods much, but I feel confident that Barry's patches havent broken existing string code. Also, I find using that tag with CVS a bit of a pain. A few updates have been checked into the main branch, and you tend to miss these (its a pity CVS can't be told "only these files are affected by this tag, so the rest should follow the main branch." I know I can do that personally, but that means I personally need to know all files possibly affected by the branch.) Anyway, I digress... I propose that these extensions be merged into the main branch. The main advantage is that we force more people to bash on it, rather than allowing them to make that choice . If the Unicode type is also considered highly experimental, we can make a new tag for that change, but that is really quite independant of the string methods. Mark. From fredrik@pythonware.com Fri Jun 18 09:56:47 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 18 Jun 1999 10:56:47 +0200 Subject: [Python-Dev] cvs problems References: <015601beb964$f37a4fa0$0801a8c0@bobcat> Message-ID: <001d01beb968$7fd47540$f29b12c2@pythonware.com> maybe not the right forum, but I suppose everyone here is using CVS, so... ...could anyone explain why I keep getting this error? $ cvs -z6 up -P -d ... cvs server: Updating dist/src/Tools/ht2html cvs [server aborted]: cannot open directory /projects/cvsroot/python/dist/src/Tools/ht2html: No such file or directory it used to work... From tismer@appliedbiometrics.com Fri Jun 18 10:47:15 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Fri, 18 Jun 1999 11:47:15 +0200 Subject: [Python-Dev] Flat Python in Linux Weekly Message-ID: <376A15A3.3968EADE@appliedbiometrics.com> Howdy, Who would have thought this... Linux Weekly took notice. 
http://lwn.net/bigpage.phtml derangedly yours - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From mal@lemburg.com Fri Jun 18 11:05:52 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 18 Jun 1999 12:05:52 +0200 Subject: [Python-Dev] Relative package imports Message-ID: <376A1A00.3099DE99@lemburg.com> Although David has already copy-posted a message regarding this issue to the list, I would like to restate the problem to get a discussion going (and then maybe take it to c.l.p for general flaming ;). The problem we have run into on starship is that some well-known packages have introduced naming conflicts leading to the unfortunate situation that they can't be all installed on the same default path: 1. Zope has a module named DateTime which also is the base name of the package mxDateTime. 2. Both Zope and PIL have a top-level module named ImageFile.py (different ones of course). Now the problem is how to resolve these issues. One possibility is turning Zope and PIL into proper packages altogether. To ease this transition, one would need a way to specify relative intra-package imports and a way to tell pickle where to look for modules/packages. The next problem we'd probably run into sooner or later is that there are quite a few useful top-level modules with generic names that will conflict with package names and other modules with the same name. I guess we'd need at least three things to overcome this situation once and for all ;-): 1. Provide a way to do relative imports, e.g. a single dot could be interpreted as "parent package": modA.py modD.py [A] modA.py modB.py [B] modC.py modD.py In modC.py: from modD import * (works as usual: import A.B.modD) from .modA import * (imports A.modA) from ..modA import * (import the top-level modA) 2. Establish a general vendor based naming scheme much like the one used in the Java world: from org.python.core import time,os,string from org.zope.core import * from com.lemburg import DateTime from com.pythonware import PIL 3. Add a way to prevent double imports of the same file. This is the mayor gripe I have with pickle currently, because intra- package imports often lead to package modules being imported twice leading to many strange problems (e.g. splitting class hierarchies, problems with isinstance() and issubclass(), etc.), e.g. from org.python.core import UserDict u = UserDict.UserDict() import UserDict v = UserDict.UserDict() Now u and v will point to two different classes: >>> u.__class__ >>> v.__class__ 4. Add some kind of redirection or lookup hook to pickle et al. so that imports done during unpickling can be redirected to the correct (possibly renamed) package. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 196 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fredrik@pythonware.com Fri Jun 18 11:47:49 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 18 Jun 1999 12:47:49 +0200 Subject: [Python-Dev] Flat Python in Linux Weekly References: <376A15A3.3968EADE@appliedbiometrics.com> Message-ID: <001901beb978$0312a440$f29b12c2@pythonware.com> flat eric, flat beat, flat python? 
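The double-import problem in point 3 above is easy to reproduce; here is a self-contained demonstration (all file and module names are invented for illustration):

    import os, sys, tempfile

    # Build a throwaway package on disk.
    root = tempfile.mktemp()
    os.mkdir(root)
    os.mkdir(os.path.join(root, "pkg"))
    open(os.path.join(root, "pkg", "__init__.py"), "w").close()
    f = open(os.path.join(root, "pkg", "thing.py"), "w")
    f.write("class Thing:\n    pass\n")
    f.close()

    # Make the same file visible both as a top-level module and as a
    # package submodule -- the situation intra-package imports create.
    sys.path.insert(0, root)
    sys.path.insert(0, os.path.join(root, "pkg"))

    import pkg.thing
    import thing

    print pkg.thing is thing                          # false: one file, two modules
    print isinstance(pkg.thing.Thing(), thing.Thing)  # false: the hierarchy has split

Once two copies of the class exist, pickles written against one spelling will not load cleanly against the other, which is part of what point 4 is about.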
http://www.flateric-online.de (best viewed through babelfish.altavista.com, of course ;-) should-flat-eric-in-the-routeroute-route-along-ly yrs /F From fredrik@pythonware.com Fri Jun 18 11:51:21 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 18 Jun 1999 12:51:21 +0200 Subject: [Python-Dev] Relative package imports References: <376A1A00.3099DE99@lemburg.com> Message-ID: <001f01beb978$8177aab0$f29b12c2@pythonware.com> > 2. Both Zope and PIL have a top-level module named ImageFile.py > (different ones of course). > > Now the problem is how to resolve these issues. One possibility > is turning Zope and PIL into proper packages altogether. To > ease this transition, one would need a way to specify relative > intra-package imports and a way to tell pickle where to look > for modules/packages. fwiw, PIL 1.0b1 can already be used as a package, but you have to explicitly import the file format handlers you need: from PIL import Image import PIL.GifImagePlugin import PIL.PngImagePlugin import PIL.JpegImagePlugin etc. this has been fixed in PIL 1.0 final. From guido@CNRI.Reston.VA.US Fri Jun 18 15:51:16 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 18 Jun 1999 10:51:16 -0400 Subject: [Python-Dev] Merge the string_methods tag? In-Reply-To: Your message of "Fri, 18 Jun 1999 18:31:21 +1000." <015601beb964$f37a4fa0$0801a8c0@bobcat> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> Message-ID: <199906181451.KAA11549@eric.cnri.reston.va.us> > Ive been running the string_methods tag (term?) under CVS for quite some > time now, and it seems to work perfectly. I admit that I havent stressed > the string methods much, but I feel confident that Barry's patches havent > broken existing string code. > > Also, I find using that tag with CVS a bit of a pain. A few updates have > been checked into the main branch, and you tend to miss these (its a pity > CVS can't be told "only these files are affected by this tag, so the rest > should follow the main branch." I know I can do that personally, but that > means I personally need to know all files possibly affected by the branch.) > Anyway, I digress... > > I propose that these extensions be merged into the main branch. The main > advantage is that we force more people to bash on it, rather than allowing > them to make that choice . If the Unicode type is also considered > highly experimental, we can make a new tag for that change, but that is > really quite independant of the string methods. Hmm... This would make it hard to make a patch release for 1.5.2 (possible called 1.5.3?). I *really* don't want the string methods to end up in a release yet -- there are too many rough edges (e.g. some missing methods, should join str() or not, etc.). I admit that managing CVS branches is painful. We may find that it works better to create a branch for patch releases and to do all new development on the main release... But right now I don't want to change anything yet. In any case Barry just went on vacation so we'll have to wait 10 days... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Fri Jun 18 15:55:45 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 18 Jun 1999 10:55:45 -0400 Subject: [Python-Dev] cvs problems In-Reply-To: Your message of "Fri, 18 Jun 1999 10:56:47 +0200." 
<001d01beb968$7fd47540$f29b12c2@pythonware.com> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <001d01beb968$7fd47540$f29b12c2@pythonware.com> Message-ID: <199906181455.KAA11564@eric.cnri.reston.va.us> > maybe not the right forum, but I suppose everyone > here is using CVS, so... > > ...could anyone explain why I keep getting this error? > > $ cvs -z6 up -P -d > ... > cvs server: Updating dist/src/Tools/ht2html > cvs [server aborted]: cannot open directory /projects/cvsroot/python/dist/src/Tools/ht2html: No such > file or directory > > it used to work... EXPLANATION: For some reason that directory existed on the mirror server but not in the master CVS tree repository. It was created once but quickly deleted -- not quickly enough apparently to prevent it to leak to the slave. Then we did a global resync from the master to the mirror and that wiped out the mirror version. Good riddance. FIX: Edit Tools/CVS/Entries and delete the line that mentions ht2html, then do another cvs update. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one@email.msn.com Fri Jun 18 16:41:54 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 18 Jun 1999 11:41:54 -0400 Subject: [Python-Dev] cvs problems In-Reply-To: <001d01beb968$7fd47540$f29b12c2@pythonware.com> Message-ID: <000901beb9a1$179d2380$b79e2299@tim> [/F] > ...could anyone explain why I keep getting this error? > > $ cvs -z6 up -P -d > ... > cvs server: Updating dist/src/Tools/ht2html > cvs [server aborted]: cannot open directory > /projects/cvsroot/python/dist/src/Tools/ht2html: No such > file or directory > > it used to work... It stopped working a week ago Thursday, and Guido & Barry know about it. The directory in question vanished from the server under mysterious circumstances. You can get going again by deleting the ht2html line in your local Tools/CVS/Entries file. From da@ski.org Fri Jun 18 18:09:27 1999 From: da@ski.org (David Ascher) Date: Fri, 18 Jun 1999 10:09:27 -0700 (Pacific Daylight Time) Subject: [Python-Dev] automatic wildcard expansion on Win32 Message-ID: A python-help poster finally convinced me that there was a way to enable automatic wildcard expansion on win32. This is done by linking in "setargv.obj" along with all of the other MS libs. Quick testing shows that it works. Is this a feature we want to add? I can see both sides of that coin. --david PS: I saw a RISKS digest posting last week which had a horror story about wildcard expansion on some flavor of Windows. The person had two files with long filenames: verylongfile1.txt and verylongfile2.txt But Win32 stored them in 8.3 format, so they were stored as verylo~2.txt and verylo~1.txt (Yes, the 1 and 2 were swapped!). So when he did del *1.txt he removed the wrong file. Neat, eh? (This is actually relevant -- it's possible that setargv.obj and glob.glob could give different answers). --david From guido@CNRI.Reston.VA.US Fri Jun 18 19:09:29 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 18 Jun 1999 14:09:29 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Your message of "Fri, 18 Jun 1999 10:09:27 PDT." References: Message-ID: <199906181809.OAA12090@eric.cnri.reston.va.us> > A python-help poster finally convinced me that there was a way to enable > automatic wildcard expansion on win32. This is done by linking in > "setargv.obj" along with all of the other MS libs. Quick testing shows > that it works. > > Is this a feature we want to add? I can see both sides of that coin. 
I don't see big drawbacks except minor b/w compat problems. Should it be done for both python.exe and pythonw.exe? --Guido van Rossum (home page: http://www.python.org/~guido/) From da@ski.org Fri Jun 18 21:06:09 1999 From: da@ski.org (David Ascher) Date: Fri, 18 Jun 1999 13:06:09 -0700 (Pacific Daylight Time) Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: <199906181809.OAA12090@eric.cnri.reston.va.us> Message-ID: On Fri, 18 Jun 1999, Guido van Rossum wrote: > I don't see big drawbacks except minor b/w compat problems. > > Should it be done for both python.exe and pythonw.exe? Sure. From MHammond@skippinet.com.au Sat Jun 19 01:56:42 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Sat, 19 Jun 1999 10:56:42 +1000 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Message-ID: <016e01beb9ee$99e1a710$0801a8c0@bobcat> > A python-help poster finally convinced me that there was a > way to enable > automatic wildcard expansion on win32. This is done by linking in > "setargv.obj" along with all of the other MS libs. Quick > testing shows > that it works. This has existed since I have been using C on Windows. I personally would vote against it. AFAIK, common wisdom on Windows is to not use this. Indeed, if people felt that this behaviour was an improvement, MS would have enabled it by default at some stage over the last 10 years it has existed, and provided a way of disabling it! This behaviour causes subtle side effects; effects Unix users are well aware of, due to every single tool using it. Do the tricks needed to get the wildcard down to the program exist? Will any windows users know what they are? IMO, Windows "fixed" the Unix behaviour by dropping this, and they made a concession to die-hards by providing a rarely used way of enabling it. Windows C programmers dont expect it, VB programmers dont expect it, even batch file programmers dont expect it. I dont think we should use it. > (This is actually relevant -- it's possible that setargv.obj > and glob.glob > could give different answers). Exactly. As may win32api.FindFiles(). Give the user the wildcard, and let them make sense of it. The trivial case of using glob() is so simple I dont believe it worth hiding. Your horror story of the incorrect file being deleted could then only be blamed on the application, not on Python! Mark. From tim_one@email.msn.com Sat Jun 19 02:00:46 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 18 Jun 1999 21:00:46 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Message-ID: <000501beb9ef$2ac61720$a69e2299@tim> [David Ascher] > A python-help poster finally convinced me that there was a way to enable > automatic wildcard expansion on win32. This is done by linking in > "setargv.obj" along with all of the other MS libs. Quick testing shows > that it works. > > Is this a feature we want to add? I can see both sides of that coin. The only real drawback I see is that we're then under some obligation to document Python's behavior. Which is then inherited from the MS setargv.obj, which is in turn only partially documented in developer-only docs, and incorrectly documented at that. > PS: I saw a RISKS digest posting last week which had a horror story about > wildcard expansion on some flavor of Windows. 
The person had two files > with long filenames: > > verylongfile1.txt > and > verylongfile2.txt > > But Win32 stored them in 8.3 format, so they were stored as > verylo~2.txt > and > verylo~1.txt > > (Yes, the 1 and 2 were swapped!). So when he did > > del *1.txt > > he removed the wrong file. Neat, eh? > > (This is actually relevant -- it's possible that setargv.obj and > glob.glob could give different answers). Yes, and e.g. it works this way under Win95: D:\Python>dir *~* Volume in drive D is DISK1PART2 Volume Serial Number is 1DFF-0F59 Directory of D:\Python PYCLBR~1 PAT 5,765 06-07-99 11:41p pyclbr.patch KJBUCK~1 PYD 34,304 03-31-98 3:07a kjbuckets.pyd WIN32C~1 05-16-99 12:10a win32comext PYTHON~1 05-16-99 12:10a Pythonwin TEXTTO~1 01-15-99 11:35p TextTools UNWISE~1 EXE 109,056 07-03-97 8:35a UnWisePW32.exe 3 file(s) 149,125 bytes 3 dir(s) 1,502,511,104 bytes free Here's the same thing in an argv-spewing console app whipped up to link setargv.obj: D:\Python>garp\debug\garp *~* 0: D:\PYTHON\GARP\DEBUG\GARP.EXE 1: kjbuckets.pyd 2: pyclbr.patch 3: Pythonwin 4: TextTools 5: UnWisePW32.exe 6: win32comext D:\Python> setargv.obj is apparently consistent with what native wildcard expansion does (although you won't find that promise made anywhere!), and it's definitely surprising in the presence of non-8.3 names. The quoting rules too are impossible to explain, seemingly random: D:\Python>garp\debug\garp "\\a\\" 0: D:\PYTHON\GARP\DEBUG\GARP.EXE 1: \\a\ D:\Python> Before I was on the Help list, I used to believe it would work to just say "well, it does what Windows does" . magnification-of-ignorance-ly y'rs - tim From tim_one@email.msn.com Sat Jun 19 02:26:42 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 18 Jun 1999 21:26:42 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: <016e01beb9ee$99e1a710$0801a8c0@bobcat> Message-ID: <000701beb9f2$c95b9880$a69e2299@tim> [MarkH, with *the* killer argument <0.3 wink>] > Your horror story of the incorrect file being deleted could then > only be blamed on the application, not on Python! Sold! Some years ago in the Perl world, they solved this by making regular old perl.exe not expand wildcards on Windows, but also supplying perlglob.exe which did. Don't know what they're doing today, but they apparently changed their minds at least once, as the couple-years-old version of perl.exe on my machine does do wildcard expansion, and does the wrong (i.e., the Windows ) thing. screw-it-ly y'rs - tim From tim_one@email.msn.com Sat Jun 19 19:45:16 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sat, 19 Jun 1999 14:45:16 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <199906101411.KAA29962@eric.cnri.reston.va.us> Message-ID: <000801beba83$df719e80$c49e2299@tim> Backtracking: [Aaron] > I've always considered it a major shame that Python ints and floats > and chars and stuff have anything to do with dynamic allocation ... [Guido] > What you're describing is very close to what I recall I once read > about the runtime organization of Icon. Perl may also use a variant > on this (it has fixed-length object headers). ... I've rarely been able to make sense of Perl's source code, but gave it another try anyway. An hour later I gave up unenlightened, so cruised the web. Turns out there's a *terrific* writeup of Perl's type representation at: http://home.sol.no/~aas/perl/guts/ Pictures and everything . 
Header is 3 words: An 8-bit "type" field, 24 baffling flag bits (e.g., flag #14 is "BREAK -- refcnt is artificially low"(!)), 32 refcount bits, and a 32-bit pointer field. Appears that the pointer field is always a real (although possibly NULL) pointer. Plain ints have type code SvIV, and the pointer then points to a bogus address, but where that address + 3 words points to the actual integer value. Why? Because then they can use the same offset to get to the int as when the type is SvPVIV, which is the combined string/integer type, and needs three words (to point to the string start address, current len and allocated len) in addition to the integer value at the end. So why is the integer value at the end? So the same offsets work for the SvPV type, which is solely a string descriptor. So why is it important that SvPVIV, SvPV and SvIV all have the same layout? So that either of the latter types can be dynamically "upgraded" to SvPVIV (when a string is converted to int or vice versa; Perl then holds on to both representations internally) by plugging in a new type code and fiddling some of the baffling flag bits. Brr. I have no idea how they manage to keep Perl running! and-not-entirely-sure-that-they-do-ly y'rs - tim From mal@lemburg.com Mon Jun 21 10:54:50 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 21 Jun 1999 11:54:50 +0200 Subject: [Python-Dev] Relative package imports References: <376A1A00.3099DE99@lemburg.com> Message-ID: <376E0BEA.60F22945@lemburg.com> It seems that there is not much interest in the topic... I'll be offline for the next two weeks -- maybe someone could pick the thread up and toss it around a bit while I'm away. Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 193 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From MHammond@skippinet.com.au Mon Jun 21 12:23:34 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Mon, 21 Jun 1999 21:23:34 +1000 Subject: [Python-Dev] Relative package imports In-Reply-To: <376E0BEA.60F22945@lemburg.com> Message-ID: <000501bebbd8$80f56b10$0801a8c0@bobcat> > It seems that there is not much interest in the topic... > > I'll be offline for the next two weeks -- maybe someone could > pick the thread up and toss it around a bit while I'm away. OK - here are my 2c on it: Unless I am mistaken, this problem could be solved with 2 steps: * Code moves to Python packages. * The standard Python library move to a package. If all non-trivial Python program used packages, and some agreement on a standard namespace could be met, I think it would be addressed. There was a thread on the newsgroup about the potential naming of the standard library. You did state as much in your proposal - indeed, you state "to ease the transition". Personally, I dont think it is worth it, mainly because we end up with a half-baked scheme purely for the transition, but one that can never be removed. To me, the question is one of: * Why arent Zope/PIL capable of being used as packages. * If they are (as I understand to be the case) why do people choose not to use them as such, or why do the authors not recommend this? * Is there a deficiency in the package scheme that makes it hard to use? Eg, should "__" that ni used for the parent package be reinstated? Mark. 
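To make the clash concrete, here is a minimal, self-contained sketch (pkg_a and pkg_b are made-up names): two projects each ship a top-level ImageFile.py, and turning each project into a package keeps the two modules from shadowing one another on sys.path.

import os, sys, tempfile

# Build two throwaway packages, each carrying its own ImageFile.py.
root = tempfile.mkdtemp()
for pkg in ("pkg_a", "pkg_b"):
    os.mkdir(os.path.join(root, pkg))
    open(os.path.join(root, pkg, "__init__.py"), "w").close()
    f = open(os.path.join(root, pkg, "ImageFile.py"), "w")
    f.write("WHO = %r\n" % pkg)
    f.close()

sys.path.insert(0, root)

# Package-qualified imports: the two ImageFile modules no longer collide.
from pkg_a import ImageFile as a_imagefile
from pkg_b import ImageFile as b_imagefile
print(a_imagefile.WHO, b_imagefile.WHO)    # -> pkg_a pkg_b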
From fredrik@pythonware.com Mon Jun 21 13:41:27 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 21 Jun 1999 14:41:27 +0200 Subject: [Python-Dev] Relative package imports References: <000501bebbd8$80f56b10$0801a8c0@bobcat> Message-ID: <006501bebbe3$6189e570$f29b12c2@pythonware.com> Mark Hammond wrote: > * Why arent Zope/PIL capable of being used as packages. PIL can be used as a package ("from PIL import Image"), assuming that it's installed under a directory in your path. there's one problem in 1.0b1, though: you have to explicitly import the file format handlers you need: import PIL.JpegImagePlugin import PIL.PngImagePlugin this has been fixed in 1.0 final. > * If they are (as I understand to be the case) why do people choose not to > use them as such, or why do the authors not recommend this? inertia, and compatibility concerns. we've decided that all official material related to PIL 1.0 will use the old syntax (and all 1.X releases will be possible to install using the PIL.pth approach). too many users out there... now, PIL 2.0 is a completely different thing... > * Is there a deficiency in the package scheme that makes it hard to use? not that I'm aware... From mal@lemburg.com Mon Jun 21 15:36:58 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 21 Jun 1999 16:36:58 +0200 Subject: [Python-Dev] Relative package imports References: <000501bebbd8$80f56b10$0801a8c0@bobcat> Message-ID: <376E4E0A.3B714BAB@lemburg.com> Mark Hammond wrote: > > > It seems that there is not much interest in the topic... > > > > I'll be offline for the next two weeks -- maybe someone could > > pick the thread up and toss it around a bit while I'm away. > > OK - here are my 2c on it: > > Unless I am mistaken, this problem could be solved with 2 steps: > * Code moves to Python packages. > * The standard Python library move to a package. > > If all non-trivial Python program used packages, and some agreement on a > standard namespace could be met, I think it would be addressed. There was > a thread on the newsgroup about the potential naming of the standard > library. > > You did state as much in your proposal - indeed, you state "to ease the > transition". Personally, I dont think it is worth it, mainly because we > end up with a half-baked scheme purely for the transition, but one that can > never be removed. With "easing the transition" I meant introducing a way to do relative package imports: you don't need relative imports if you can be sure that the package name will never change (with a fixed naming scheme, a la com.domain.product.package...). The smarter import mechanism is needed to work around the pickle problems you face (because pickle uses absolute package names). > To me, the question is one of: > > * Why arent Zope/PIL capable of being used as packages. > * If they are (as I understand to be the case) why do people choose not to > use them as such, or why do the authors not recommend this? > * Is there a deficiency in the package scheme that makes it hard to use? > Eg, should "__" that ni used for the parent package be reinstated? I guess this would help a great deal; although I personally wouldn't like yet another underscore in the language. Simply leave the name empty as in '.submodule' or '..subpackage.submodule'.
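To make the pickle point above concrete, a small runnable sketch (the mypkg.widgets path in the comment is made up): pickle records the absolute module name of a class, which is exactly what breaks when a package gets renamed or re-rooted.

import pickle

class Widget:
    pass

blob = pickle.dumps(Widget())
print(blob)          # the stream literally records the absolute location, __main__ / Widget

# Unpickling looks that recorded name up verbatim, so it only works as long
# as the name still resolves; if Widget later moves into, say, mypkg.widgets,
# old pickles stop loading unless something maps the old name to the new one.
obj = pickle.loads(blob)
print(type(obj).__name__)    # -> Widget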
Cheers, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 193 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@CNRI.Reston.VA.US Mon Jun 21 23:44:24 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 21 Jun 1999 18:44:24 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Your message of "Fri, 18 Jun 1999 21:26:42 EDT." <000701beb9f2$c95b9880$a69e2299@tim> References: <000701beb9f2$c95b9880$a69e2299@tim> Message-ID: <199906212244.SAA18866@eric.cnri.reston.va.us> > Some years ago in the Perl world, they solved this by making regular old > perl.exe not expand wildcards on Windows, but also supplying perlglob.exe > which did. This seems a reasonable way out. Just like we have pythonw.exe, we could add pythong.exe and pythongw.exe (or pythonwg.exe?). I guess it's time for a README.txt file to be installed explaining all the different executables... By default the g versions would not be used unless invoked explicitly. --Guido van Rossum (home page: http://www.python.org/~guido/) From Vladimir.Marangozov@inrialpes.fr Thu Jun 24 13:23:48 1999 From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov) Date: Thu, 24 Jun 1999 14:23:48 +0200 (DFT) Subject: [Python-Dev] ob_refcnt access Message-ID: <199906241223.OAA46222@pukapuka.inrialpes.fr> How about introducing internal macros for explicit ob_refcnt accesses in the core? Actually, there are a number of places where one can see "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), _Py_SETREF(op, n) thus decoupling completely the low level refcount management defined in object.h: #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) Comments? I've contributed myself to the mess in intobject.c & floatobject.c, so I thought that such macros would make the code cleaner. Here's the current state of affairs: python/dist/src>find . -name "*.[c]" -exec grep ob_refcnt {} \; -print (void *) v, ((PyObject *) v)->ob_refcnt)) ./Modules/_tkinter.c if (self->arg->ob_refcnt > 1) { \ if (ob->ob_refcnt < 2 || self->fast) if (args->ob_refcnt > 1) { ./Modules/cPickle.c if (--inst->ob_refcnt > 0) { ./Objects/classobject.c if (result->ob_refcnt == 1) ./Objects/fileobject.c if (PyFloat_Check(p) && p->ob_refcnt != 0) if (!PyFloat_Check(p) || p->ob_refcnt == 0) { if (PyFloat_Check(p) && p->ob_refcnt != 0) { p, p->ob_refcnt, buf); ./Objects/floatobject.c if (PyInt_Check(p) && p->ob_refcnt != 0) if (!PyInt_Check(p) || p->ob_refcnt == 0) { if (PyInt_Check(p) && p->ob_refcnt != 0) p, p->ob_refcnt, p->ob_ival); ./Objects/intobject.c assert(v->ob_refcnt == 1); /* Since v will be used as accumulator! 
*/ ./Objects/longobject.c if (op->ob_refcnt <= 0) op->ob_refcnt, (long)op); op->ob_refcnt = 1; if (op->ob_refcnt < 0) fprintf(fp, "[%d] ", op->ob_refcnt); ./Objects/object.c if (!PyString_Check(v) || v->ob_refcnt != 1) { if (key->ob_refcnt == 2 && key == value) { ./Objects/stringobject.c if (!PyTuple_Check(op) || op->ob_refcnt != 1) { if (v == NULL || !PyTuple_Check(v) || v->ob_refcnt != 1) { ./Objects/tupleobject.c if (PyList_Check(seq) && seq->ob_refcnt == 1) { if (args->ob_refcnt > 1) { ./Python/bltinmodule.c if (value->ob_refcnt != 1) ./Python/import.c return PyInt_FromLong((long) arg->ob_refcnt); ./Python/sysmodule.c -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From guido@CNRI.Reston.VA.US Thu Jun 24 16:30:45 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 24 Jun 1999 11:30:45 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Your message of "Thu, 24 Jun 1999 14:23:48 +0200." <199906241223.OAA46222@pukapuka.inrialpes.fr> References: <199906241223.OAA46222@pukapuka.inrialpes.fr> Message-ID: <199906241530.LAA27887@eric.cnri.reston.va.us> > How about introducing internal macros for explicit ob_refcnt accesses > in the core? What problem does this solve? > Actually, there are a number of places where one can see > "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), > _Py_SETREF(op, n) thus decoupling completely the low level refcount > management defined in object.h: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) Why the cast? It loses some type-safety, e.g. _Py_GETREF(0) will now cause a core dump instead of a compile-time error. > Comments? I don't see how it's cleaner or saves typing: op->ob_refcnt _Py_GETREF(op) op->ob_refcnt = 1 _Py_SETREF(op, 1) --Guido van Rossum (home page: http://www.python.org/~guido/) From Vladimir.Marangozov@inrialpes.fr Thu Jun 24 17:33:31 1999 From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov) Date: Thu, 24 Jun 1999 18:33:31 +0200 (DFT) Subject: [Python-Dev] Re: ob_refcnt access In-Reply-To: from "marangoz" at "Jun 24, 99 02:23:47 pm" Message-ID: <199906241633.SAA44314@pukapuka.inrialpes.fr> marangoz wrote: > > > How about introducing internal macros for explicit ob_refcnt accesses > in the core? Actually, there are a number of places where one can see > "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), > _Py_SETREF(op, n) thus decoupling completely the low level refcount > management defined in object.h: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) > > Comments? Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot to mention that if this detail doesn't hurt code aesthetics, one (I) could experiment more easily all sort of weird things with refcounting... I formulated the same wish for malloc & friends some time ago, that is, use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be defined for now as malloc, free, but nobody seems to be very excited about a smooth transition to other kinds of malloc. Hence, I reiterate this wish, 'cause switching to macros means preparing the code for the future, even if in the future it remains intact ;-). 
Defining these basic interfaces is clearly Guido's job :-) as he points out in his summary of the last Open Source summit, but nevertheless, I'm raising the issue to let him see what other people think about this and allow him to make decisions easier :-) -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From ping@lfw.org Thu Jun 24 18:29:19 1999 From: ping@lfw.org (Ka-Ping Yee) Date: Thu, 24 Jun 1999 10:29:19 -0700 (PDT) Subject: [Python-Dev] ob_refcnt access In-Reply-To: <199906241530.LAA27887@eric.cnri.reston.va.us> Message-ID: On Thu, 24 Jun 1999, Guido van Rossum wrote: > > How about introducing internal macros for explicit ob_refcnt accesses > > in the core? > > What problem does this solve? I assume Vladimir was trying to leave the door open for further ob_refcnt manipulation hooks later, like having objects manage their own refcounts. Until there's an actual problem to solve that requires this, though, i'm not sure it's necessary. Are there obvious reasons to want to allow this? * * * While we're talking about refcounts and all, i've had the argument quite successfully made to me that a reasonably written garbage collector can be both (a) simple and (b) more efficient than refcounting. Having spent a good number of work days doing nothing but debugging crashes by tracing refcounting bugs, i was easily converted into a believer once a friend dispelled the notion that garbage collectors were either slow or horribly complicated. I had always been scared of them before, but less so now. Is an incremental GC being considered for a future Python? I've idly been pondering various tricks by which it could be made to work with existing extension modules -- here are some possibilities: 1. Keep the refcounts and let existing code do the usual thing; introduce a new variant of PyObject_NEW that puts an object into the "gc-able" pool rather than the "refcounted" pool. 2. Have Py_DECREF and Py_INCREF just do nothing, and let the garbage collector guess from the contents of the structure where the pointers are. (I'm told it's possible to do this safely, since you can only have false positives, never false negatives.) 3. Have Py_DECREF and Py_INCREF just do nothing, and ask the extension module to just provide (in its type object) a table of where the pointers are in its struct. And so on; mix and match. What are everyone's thoughts on this one? -- ?!ng "All models are wrong; some models are useful." -- George Box From tim_one@email.msn.com Fri Jun 25 07:38:11 1999 From: tim_one@email.msn.com (Tim Peters) Date: Fri, 25 Jun 1999 02:38:11 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Message-ID: <000c01bebed5$4b8d1040$d29e2299@tim> [Ka-Ping Yee, opines about GC] Ping, I think you're not getting any responses because this has been beaten to death on c.l.py over the last month (for the 53rd time, no less ). A hefty percentage of CPython users *like* the reliably timely destruction refcounting yields, and some clearly rely on it. Guido recently (10 June) posted the start of a "add GC on top of RC" scheme, in a thread with the unlikely name "fork()". The combination of cycles, destructors and resurrection is quite difficult to handle in a way both principled and useful (Java's way is principled but by most accounts unhelpful to the point of uselessness). 
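A minimal sketch of the awkward case in question: a reference cycle whose members have destructors. Pure refcounting never reclaims the pair, and any collector that does reclaim them has to pick an order for the two __del__ calls.

class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None
    def __del__(self):
        print("finalizing", self.name)

a = Node("a")
b = Node("b")
a.partner = b
b.partner = a    # a -> b -> a: the cycle keeps itself alive
del a, b         # each refcount drops to 1, not 0, so nothing is finalized here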
Python experience with the Boehm collector can be found in the FAQ; note that the Boehm collector deals with finalizers in cycles by letting cycles with finalizers leak! > ... > While we're talking about refcounts and all, i've had the > argument quite successfully made to me that a reasonably > written garbage collector can be both (a) simple and (b) more > efficient than refcounting. That's a dubious claim. Sophisticated mark-and-sweep (with or without compaction) is almost universally acknowledged to beat RC, but simple M&S has terrible cache behavior (you fill up the address space before reclaiming anything, then leap all over the address space repeatedly cleaning it up). Don't discount that, in Python unlike as in most other languages, the simple loop for i in xrange(1000000): pass creates a huge amount of trash at a furious pace. Under RC it can happily reuse the same little bit of storage each time around. > Having spent a good number of work days doing nothing but debugging > crashes by tracing refcounting bugs, Yes, we can trade that for tracking down M&S bugs <0.5 wink> -- instead of INCREF/DECREF macros, you end up with M&S macros marking regions where the collector must not be run (because you're in a temporarily "inconsistent" state). That's under sophisticated M&S, though, but is an absolute nightmare when you miss a pair (the bugs only show up "sometimes", and not always the same ways -- depends on when M&S happens to run, and "how inconsistent" you happen to be at the time). > ... > And so on; mix and match. What are everyone's thoughts on this one? I think Python probably needs to clean up cycles, but by some variant of Guido's scheme on top of RC; I very much dislike the property of his scheme that objects with destructors may be get destroyed without their destructors getting invoked, but it seems hard to fix. Alternatives include Java's scheme (which really has nothing going for it other than that Java does it <0.3 wink>); Scheme's "guardian" scheme (which would let the user "get at" cyclic trash with destructors, but refuses to do anything with them on its own); following Boehm by saying that cycles with destructors are immortal; following goofier historical precedent by e.g. destroying such objects in reverse order of creation; or maybe just raising an exception if a trash cycle containing a destructor is found. All of those seem a comparative pain to implement, with Java's being the most painful -- and quite possibly the least satisfying! it's-a-whale-of-a-lot-easier-in-a-self-contained-universe-or-even-an- all-c-one-ly y'rs - tim From Vladimir.Marangozov@inrialpes.fr Fri Jun 25 12:27:43 1999 From: Vladimir.Marangozov@inrialpes.fr (Vladimir Marangozov) Date: Fri, 25 Jun 1999 13:27:43 +0200 (DFT) Subject: [Python-Dev] Re: ob_refcnt access (fwd) Message-ID: <199906251127.NAA27464@pukapuka.inrialpes.fr> FYI, my second message on this issue didn't reach the list because of a stupid error of mine, so Guido and I exchanged two mails in private. His response to the msg below was that he thinks that tweaking the refcount scheme at this level wouldn't contribute much and that he doesn't intend to change anything on this until 2.0 which will be rewritten from scratch. Besides, if I want to satisfy my curiosity in hacking the refcounts I can do it with a small patch because I've already located the places where the ob_refcnt slot is accessed directly. 
----- Forwarded message ----- From Vladimir.Marangozov@inrialpes.fr Thu Jun 24 17:33:31 1999 From: Vladimir.Marangozov@inrialpes.fr (Vladimir.Marangozov@inrialpes.fr) Date: Thu, 24 Jun 1999 18:33:31 +0200 (DFT) Subject: ob_refcnt access In-Reply-To: from "marangoz" at "Jun 24, 99 02:23:47 pm" marangoz wrote: > > > How about introducing internal macros for explicit ob_refcnt accesses > in the core? Actually, there are a number of places where one can see > "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), > _Py_SETREF(op, n) thus decoupling completely the low level refcount > management defined in object.h: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) > > Comments? Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot to mention that if this detail doesn't hurt code aesthetics, one (I) could experiment more easily all sort of weird things with refcounting... I formulated the same wish for malloc & friends some time ago, that is, use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be defined for now as malloc, free, but nobody seems to be very excited about a smooth transition to other kinds of malloc. Hence, I reiterate this wish, 'cause switching to macros means preparing the code for the future, even if in the future it remains intact ;-). Defining these basic interfaces is clearly Guido's job :-) as he points out in his summary of the last Open Source summit, but nevertheless, I'm raising the issue to let him see what other people think about this and allow him to make decisions easier :-) -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 ----- End of forwarded message ----- -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From tismer@appliedbiometrics.com Fri Jun 25 19:47:51 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Fri, 25 Jun 1999 20:47:51 +0200 Subject: [Python-Dev] Re: ob_refcnt access (fwd) References: <199906251127.NAA27464@pukapuka.inrialpes.fr> Message-ID: <3773CED7.B87D055C@appliedbiometrics.com> Vladimir Marangozov wrote: > > FYI, my second message on this issue didn't reach the list because > of a stupid error of mine, so Guido and I exchanged two mails > in private. His response to the msg below was that he thinks > that tweaking the refcount scheme at this level wouldn't contribute > much and that he doesn't intend to change anything on this until 2.0 > which will be rewritten from scratch. > > Besides, if I want to satisfy my curiosity in hacking the refcounts > I can do it with a small patch because I've already located the places > where the ob_refcnt slot is accessed directly. Well, one Euro on that issue: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) > > > > Comments? > > Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot > to mention that if this detail doesn't hurt code aesthetics, one (I) could > experiment more easily all sort of weird things with refcounting... I think if at all, this should be no typecast to stay safe. As long as every PyObject has a refcount, this would be correct and checked by the compiler. Why loose it? 
#define _Py_GETREF(op) ((op)->ob_refcnt) This carries the same semantics, the same compiler check, but adds a level of abstraction for future changes. > I formulated the same wish for malloc & friends some time ago, that is, > use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be > defined for now as malloc, free, but nobody seems to be very excited > about a smooth transition to other kinds of malloc. Hence, I reiterate > this wish, 'cause switching to macros means preparing the code for the > future, even if in the future it remains intact ;-). I wish to incref this wish by mine. In order to be able to try different memory allocation strategies, I would go even further and give every object type its own allocation macro which carries info about the object type about to be allocated. This costs nothing but a little macro expansion for the C compiler, but would allow to try new schemes, without always patching the Python source. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer@appliedbiometrics.com Fri Jun 25 19:56:39 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Fri, 25 Jun 1999 20:56:39 +0200 Subject: [Python-Dev] ob_refcnt access References: <000c01bebed5$4b8d1040$d29e2299@tim> Message-ID: <3773D0E7.458E00F1@appliedbiometrics.com> Tim Peters wrote: > > [Ka-Ping Yee, opines about GC] > > Ping, I think you're not getting any responses because this has been beaten > to death on c.l.py over the last month (for the 53rd time, no less ). > > A hefty percentage of CPython users *like* the reliably timely destruction > refcounting yields, and some clearly rely on it. [CG issue dropped, I know the thread] I know how much of a pain in the .. proper refcounting can be. Sometimes, after long debugging, I wished it would go. But finally, I think it is a *really good thing* to have to do proper refcounting. The reason is that this causes a lot of discipline, which improves the whole program. I guess with GC always there, quite a number of errors stay undetected. I can say this, since I have been through a week of debugging now, and I can now publish full blown first class continuations for Python yes I'm happy - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip@mojam.com (Skip Montanaro) Sun Jun 27 23:11:28 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Sun, 27 Jun 1999 18:11:28 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? In-Reply-To: <199906181451.KAA11549@eric.cnri.reston.va.us> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> Message-ID: <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> Guido> Hmm... This would make it hard to make a patch release for 1.5.2 Guido> (possible called 1.5.3?). I *really* don't want the string Guido> methods to end up in a release yet -- there are too many rough Guido> edges (e.g. some missing methods, should join str() or not, Guido> etc.). 
Sorry for the delayed response. I've been out of town. When Barry returns would it be possible to merge the string methods in conditionally (#ifdef STRING_METHODS) and add a --with-string-methods configure option? How hard would it be to modify string.py, stringobject.c and stropmodule.c to carry that around? Skip Montanaro | http://www.mojam.com/ skip@mojam.com | http://www.musi-cal.com/~skip/ 518-372-5583 From tim_one@email.msn.com Mon Jun 28 03:27:06 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 27 Jun 1999 22:27:06 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: <3773D0E7.458E00F1@appliedbiometrics.com> Message-ID: <000501bec10d$b6f1fb40$e19e2299@tim> [Christian Tismer] > ... > I can say this, since I have been through a week of debugging > now, and I can now publish > > full blown first class continuations for Python > > yes I'm happy - chris You should be! So how come nobody else is ? Let's fire some imagination here: without the stinkin' C stack snaking its way thru everything, then with the exception of external system objects (like open files), the full state of a running Python program is comprised of objects Python understands and controls. So with some amount of additional pain we could pickle them. And unpickle them. Painlessly checkpoint a long computation for possible restarting? Freeze a program while it's running on your mainframe, download it to your laptop and resume it while you're on the road? Ship a bug report with the computation frozen right before the error occurs? Take an app with gobs of expensive initialization, freeze it after it's "finally ready to go", and ship the latter instead? Capture the state of an interactive session for later resumption? Etc. Not saying those are easy, but getting the C stack out of the way means they move from impossible to plausible. Maybe it would help get past the Schemeophobia if, instead of calling them "continuations", you called 'em "platform-independent potentially picklable threads". pippt-sounds-as-good-as-it-reads-ly y'rs - tim From tim_one@email.msn.com Mon Jun 28 04:13:15 1999 From: tim_one@email.msn.com (Tim Peters) Date: Sun, 27 Jun 1999 23:13:15 -0400 Subject: [Python-Dev] ActiveState & fork & Perl Message-ID: <000601bec114$2a2929c0$e19e2299@tim> Moving back in time ... [GordonM] > Perhaps Christian's stackless Python would enable green threads... [Guido] > This has been suggested before... While this seems possible at first, > all blocking I/O calls would have to be redone to pass control to the > thread scheduler, before this would be useful -- a huge task! I didn't understand this. If I/O calls are left alone, and a green thread hit one, the whole program just sits there waiting for the call to complete, right? But if the same thing happens using "real threads" today, the same thing happens today anyway . That is, if a thread doesn't release the global lock before a blocking call today, the whole program just sits there etc. Or do you have some other kind of problem in mind here? unconvincedly y'rs - tim From MHammond@skippinet.com.au Mon Jun 28 05:29:29 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Mon, 28 Jun 1999 14:29:29 +1000 Subject: [Python-Dev] ob_refcnt access In-Reply-To: <000501bec10d$b6f1fb40$e19e2299@tim> Message-ID: <003301bec11e$d0cfc6d0$0801a8c0@bobcat> > > yes I'm happy - chris > > You should be! So how come nobody else is ? 
Im a little unhappy as this will break the Active Debugging stuff - ie, the ability for Python, Java, Perl, VBScript etc to all exist in the same process, each calling each other, and each being debuggable (makes a _great_ demo :-) Im not _really_ unhappy, Im just throwing this in as an FYI. The Active Debugging interfaces need some way of sorting a call stack. As many languages may be participating in a debugging session, there is no implicit ordering available. Inter-language calls are not made via the debugger, so it has no chance to intercept. So the solution MS came up with was, surprise surprise, the machine stack! :-) The assumption is that all languages will make _some_ use of the stack, so they ask a language to report its "stack base address" and "stack size". Using this information, the debugger sorts into the correct call sequence. Indeed, getting this information (even the half of it I did manage :-) was painful, and hard to get right. Ahh, the joys of bleeding-edge technologies :-) > Let's fire some imagination here: without the stinkin' C > stack snaking its I tried, and look what happened :-) Seriously, some if this stuff would be way cool. Bit I also understand completely the silence on this issue. When the thread started, there was much discussion about exactly what the hell these continuation/coroutine thingies even were. However, there were precious few real-world examples where they could be used. A few acedemic, theoretical places, but the only real contender I have seen brought up was Medusa. There were certainly no clear examples of "as soon as we have this, I could change abc to take advantage, and this would give us the very cool xyz" So, if anyone else if feeling at all like me about this issue, they are feeling all warm and fuzzy knowing that a few smart people are giving us the facility to do something we hope we never, ever have to do. :-) Mark. From rushing@nightmare.com Mon Jun 28 10:53:21 1999 From: rushing@nightmare.com (Sam Rushing) Date: Mon, 28 Jun 1999 02:53:21 -0700 (PDT) Subject: [Python-Dev] ob_refcnt access In-Reply-To: <41219828@toto.iv> Message-ID: <14199.13497.439332.366329@seattle.nightmare.com> Mark Hammond writes: > I tried, and look what happened :-) Seriously, some if this stuff > would be way cool. > > Bit I also understand completely the silence on this issue. When > the thread started, there was much discussion about exactly what > the hell these continuation/coroutine thingies even were. However, > there were precious few real-world examples where they could be > used. A few acedemic, theoretical places, but the only real > contender I have seen brought up was Medusa. There were certainly > no clear examples of "as soon as we have this, I could change abc > to take advantage, and this would give us the very cool xyz" Part of the problem is that we didn't have the feature to play with. Many of the possibilities are showing up now that it's here... The basic advantage to coroutines is they allow you to turn any event-driven/state-machine problem into one that is managed with 'normal' control state; i.e., for loops, while loops, nested procedure calls, etc... Here are a few possible real-world uses: ================================================== Parsing. I remember a discussion from a few years back about the distinction between 'push' and 'pull' model parsers. Coroutines let you have it both ways; you can write a parser in the most natural way (pull), but use it as a 'push'; i.e. for a web browser. 
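A small sketch of the pull-vs-push point, with a present-day generator standing in for the coroutine (the "parsing" is a dummy; only the control flow matters): the parser is written in its natural pull style, yet the outside world feeds it one chunk at a time, the way a network event loop would.

def parser():
    # Pull-style body: just ask for the next chunk whenever one is needed.
    pieces = []
    while True:
        chunk = yield            # suspend here until the driver pushes data
        if chunk is None:        # end of input
            break
        pieces.append(chunk)
    print("parsed:", "".join(pieces))

p = parser()
next(p)                                   # run up to the first yield
for piece in ("<a>", "hello", "</a>"):    # push-style driver, e.g. one call per packet
    p.send(piece)
try:
    p.send(None)                          # signal end of input
except StopIteration:
    pass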
================================================== "http sessions". A single 'thread' of control that is re-entered whenever a hit from a particular user ('session') comes in to the web server: [Apologies to those that have already seen this cheezy example] def ecommerce (session): session.login() # sends a login form, waits for it to return basket = [] while 1: item = session.shop_for_item() if item: basket.append (item) else: break if basket: session.get_shipping_info() session.get_payment_info() session.transact() 'session.shop_for_item()' will resume the main coroutine, which will resume this coroutine only when a new hit comes in from that session/user, and 'return' this hit to the while loop. I have a little web server that uses this idea to play blackjack: http://www.nightmare.com:7777/ http://www.nightmare.com/stuff/blackjack_httpd.py [though I'm a little fuzzy on the rules]. Rather than building a state machine that keeps track of where the user has been, and what they're doing, you can keep all the state in local variables (like 'basket' above) - in other words, it's a much more natural style of programming. ================================================== One of the areas I'm most excited about is GUI coding. All GUI's are event driven. All GUI code is therefore written in a really twisted, state-machine fashion; interactions are very complex. OO helps a bit, but doesn't change the basic difficulty - past a certain point interesting things become too complex to try... Mr. Fuchs' paper ("Escaping the event loop: an alternative control structure for multi-threaded GUIs") does a much better job of describing this than I can: http://cs.nyu.edu/phd_students/fuchs/ http://cs.nyu.edu/phd_students/fuchs/gui.ps ================================================== Tim's example of 'dumping' a computation in the middle and storing it on disk (or sending it over a network), is not a fantasy... I have a 'stackless' Scheme system that does this right now. ================================================== Ok, final example. Isn't there an interface in Python to call a certain function after every so many vm insns? Using coroutines you could hook into this and provide non-preemptive 'threads' for those platforms that don't have them. [And the whole thing would be written in Python, not in C!] ================================================== > So, if anyone else if feeling at all like me about this issue, they > are feeling all warm and fuzzy knowing that a few smart people are > giving us the facility to do something we hope we never, ever have > to do. :-) "When the only tool you have is a hammer, everything looks like a nail". I saw the guys over in the Scheme shop cutting wood with a power saw; now I feel like a schmuck with my hand saw. You are right to be frightened by the strangeness of the underlying machinery; hopefully a simple and easy-to-understand interface can be built for the C level as well as Python. I think Christian's 'frame dispatcher' is fairly clear, and not *that* much of a departure from the current VM; it's amazing to me how little work really had to be done! -Sam From tismer@appliedbiometrics.com Mon Jun 28 13:07:33 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Mon, 28 Jun 1999 14:07:33 +0200 Subject: [Python-Dev] ob_refcnt access References: <003301bec11e$d0cfc6d0$0801a8c0@bobcat> Message-ID: <37776585.17B78DD1@appliedbiometrics.com> Mark Hammond wrote: > > > > yes I'm happy - chris > > > > You should be! So how come nobody else is ? 
(to Tim) I believe this comes simply since following me would force people to change their way of thinking. I am through this already, but it was hard for me. And after accepting to be stackless, there is no way to go back. Today I'm wondering about my past: "how could I think of stacks when thinking of programs?" This is so wrong. The truth is: Programs are just some data, part of it called code, part of it is local state, and! its future of computation. Out, over, roger. All the rest is artificial showstoppers. > Im a little unhappy as this will break the Active Debugging stuff - ie, the > ability for Python, Java, Perl, VBScript etc to all exist in the same > process, each calling each other, and each being debuggable (makes a > _great_ demo :-) > > Im not _really_ unhappy, Im just throwing this in as an FYI. Well, yet I see no problem. > The Active Debugging interfaces need some way of sorting a call stack. As > many languages may be participating in a debugging session, there is no > implicit ordering available. Inter-language calls are not made via the > debugger, so it has no chance to intercept. > > So the solution MS came up with was, surprise surprise, the machine stack! > :-) The assumption is that all languages will make _some_ use of the > stack, so they ask a language to report its "stack base address" and "stack > size". Using this information, the debugger sorts into the correct call > sequence. Now, I can give it a machine stack. There is just a frame dispatcher sitting on the stack, and it grabs frames from the current thread state. > Indeed, getting this information (even the half of it I did manage :-) was > painful, and hard to get right. I would have to see the AX interface. But for sure there will be some method hooks with which I can tell AX how to walk the frame chain. And why don't I simply publish frames as COM objects? This would give you much more than everything else, I guess. BTW, as it is now, there is no need to use AX debugging for Python, since Python can do it alone now. Of course it makes sense to have it all in the AX environment. You will be able to modify a running programs local variables, its evaluation stack, change its code, change where it returns to, all is doable. ... > Bit I also understand completely the silence on this issue. When the > thread started, there was much discussion about exactly what the hell these > continuation/coroutine thingies even were. However, there were precious > few real-world examples where they could be used. A few acedemic, > theoretical places, but the only real contender I have seen brought up was > Medusa. There were certainly no clear examples of "as soon as we have > this, I could change abc to take advantage, and this would give us the very > cool xyz" The problem was for me, that I had also no understanding what I was doing, actually. Implemented continuations without an idea how they work. But Tim and Sam said they were the most powerful control strucure possible, so I used all my time to find this out. Now I'm beginning to understand. And my continuation based coroutine example turns out to be twenty lines of Python code. Coming soon, after I served my whining customers. > So, if anyone else if feeling at all like me about this issue, they are > feeling all warm and fuzzy knowing that a few smart people are giving us > the facility to do something we hope we never, ever have to do. :-) Think of it as just a flare gun in your hands. 
By reading the fine print, you will realize that you actually hold an atom bomb, with a little code taming it for you. :-) back-to-the-future - ly y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip@mojam.com (Skip Montanaro) Mon Jun 28 14:13:31 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 28 Jun 1999 09:13:31 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000601bec114$2a2929c0$e19e2299@tim> References: <000601bec114$2a2929c0$e19e2299@tim> Message-ID: <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com> Still trying to make the brain shift from out-of-town to back-to-work... Tim> [GordonM] >> Perhaps Christian's stackless Python would enable green threads... What's a green thread? Skip From fredrik@pythonware.com Mon Jun 28 14:37:30 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 28 Jun 1999 15:37:30 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <000601bec114$2a2929c0$e19e2299@tim> <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com> Message-ID: <00ca01bec16b$5eef11e0$f29b12c2@secret.pythonware.com> > What's a green thread? a user-level thread (essentially what you can implement yourself by swapping stacks, etc). it's enough to write smoothly running threaded programs, but not enough to support true concurrency on multiple processors. also see: http://www.sun.com/solaris/java/wp-java/4.html From tismer@appliedbiometrics.com Mon Jun 28 17:11:43 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Mon, 28 Jun 1999 18:11:43 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <000601bec114$2a2929c0$e19e2299@tim> <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com> Message-ID: <37779EBF.A146D355@appliedbiometrics.com> Skip Montanaro wrote: > > Still trying to make the brain shift from out-of-town to back-to-work... > > Tim> [GordonM] > >> Perhaps Christian's stackless Python would enable green threads... > > What's a green thread? Nano-Threads. Threadless threads, solely Python driven, no system threads needed but possible. Think of the "big" system threads where each can run any number of tiny Python threads. Powered by snake oil - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From akuchlin@mems-exchange.org Mon Jun 28 18:55:16 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Mon, 28 Jun 1999 13:55:16 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 Message-ID: <14199.46852.932030.576094@amarok.cnri.reston.va.us> --LBZCZBunrI Content-Type: text/plain; charset=us-ascii Content-Description: message body text Content-Transfer-Encoding: 7bit Paul Prescod sent the following note to the XML-SIG mailing list. Thoughts? 
--amk --LBZCZBunrI Content-Type: message/rfc822 Content-Description: forwarded message Received: from cnri.reston.va.us (ns.cnri.reston.va.us [132.151.1.1]) by newcnri.cnri.reston.va.us (8.9.1a/8.9.1) with SMTP id NAA21297 for ; Mon, 28 Jun 1999 13:31:36 -0400 (EDT) Received: from python.org (parrot [132.151.1.90]) by cnri.reston.va.us (8.9.1a/8.9.1) with ESMTP id NAA09256; Mon, 28 Jun 1999 13:31:36 -0400 (EDT) Received: from python.org (localhost [127.0.0.1]) by python.org (8.9.1a/8.9.1) with ESMTP id NAA19449; Mon, 28 Jun 1999 13:29:36 -0400 (EDT) Received: from relay.pair.com (relay1.pair.com [209.68.1.20]) by python.org (8.9.1a/8.9.1) with ESMTP id NAA19425 for ; Mon, 28 Jun 1999 13:29:00 -0400 (EDT) Received: from prescod.net (sdn-ar-004txdallP126.dialsprint.net [168.191.157.190]) by relay.pair.com (8.8.7/8.8.5) with ESMTP id NAA24949 for ; Mon, 28 Jun 1999 13:31:19 -0400 (EDT) Message-ID: <37779C32.780A9134@prescod.net> X-Mailer: Mozilla 4.51 [en] (WinNT; I) X-Accept-Language: en,tr MIME-Version: 1.0 Errors-To: xml-sig-admin@python.org X-Mailman-Version: 1.0rc2 Precedence: bulk List-Id: XML Processing in Python X-BeenThere: xml-sig@python.org Content-Type: text/plain; charset=us-ascii Content-Length: 1040 From: Paul Prescod Sender: xml-sig-admin@python.org To: "xml-sig@python.org" Subject: [XML-SIG] [Fwd: Re: parsers for Palm?] Date: Mon, 28 Jun 1999 12:00:50 -0400 > Expat 1.1 added a compile-time option to allow a smaller (and slightly > slower) parser. With this option on Win32 it compiles into a single DLL > that compresses to 23k. Is that too large for Palm? > > James Wow. I didn't notice that Expat was so small now. I think that we should certainly move for Python 1.6 to include eXpat and easysax. At compile time, Unix Python users could choose whether they want small or fast. For Windows we could just make both DLLs available (though only the small one would be built-in to the distribution). 23K for something as significant as massively-accelarated XML seems like a small price. Note that this 23k includes full Unicode support and is completely Ansi C, just like Python. Also, I understand that it now supports internal and external, general and parameter entities. In other words, almost everything except validation! Opinions? Paul Prescod _______________________________________________ XML-SIG maillist - XML-SIG@python.org http://www.python.org/mailman/listinfo/xml-sig --LBZCZBunrI-- From guido@CNRI.Reston.VA.US Mon Jun 28 20:35:04 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 15:35:04 -0400 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: Your message of "Mon, 28 Jun 1999 13:55:16 EDT." <14199.46852.932030.576094@amarok.cnri.reston.va.us> References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> Message-ID: <199906281935.PAA01439@eric.cnri.reston.va.us> > Paul Prescod sent the following note to the XML-SIG mailing list. > Thoughts? I don't know any of the acronyms, and I'm busy writing a funding proposal plus two talks for the Monterey conference, so I don't have any thoughts to spare at the moment. Perhaps someone could present the case with some more background info? (It does sounds intriguing, but then again I'm not sure how many people *really* need to parse XML -- it doesn't strike me as something of the same generality as regular expressions yet.) 
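For anyone wanting the background: a minimal sketch of what driving Expat from Python looks like through the pyexpat binding under discussion (handler names follow Expat's callback model; the XML snippet is made up).

import pyexpat

def start(name, attrs):
    print("start", name, attrs)

def end(name):
    print("end", name)

def chars(data):
    print("chars", repr(data))

p = pyexpat.ParserCreate()
p.StartElementHandler = start
p.EndElementHandler = end
p.CharacterDataHandler = chars
p.Parse("<doc><greeting lang='en'>Hello</greeting></doc>", 1)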
--Guido van Rossum (home page: http://www.python.org/~guido/) From jim@digicool.com Mon Jun 28 20:51:00 1999 From: jim@digicool.com (Jim Fulton) Date: Mon, 28 Jun 1999 15:51:00 -0400 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> Message-ID: <3777D224.6936B890@digicool.com> "Andrew M. Kuchling" wrote: > > Paul Prescod sent the following note to the XML-SIG mailing list. > Thoughts? > When I brought up some ideas for adding a separate validation mechanism for PyExpat, some folks suggested that I should look at some other C libraries, including one from the ILU folks and some other one that I can't remember the name of off hand. Should we (used loosely ;) look into the other libraries before including expat in the Python dist? Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From guido@CNRI.Reston.VA.US Mon Jun 28 21:07:50 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 16:07:50 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Your message of "Mon, 28 Jun 1999 02:53:21 PDT." <14199.13497.439332.366329@seattle.nightmare.com> References: <14199.13497.439332.366329@seattle.nightmare.com> Message-ID: <199906282007.QAA01570@eric.cnri.reston.va.us> > Part of the problem is that we didn't have the feature to play with. > Many of the possibilities are showing up now that it's here... > > The basic advantage to coroutines is they allow you to turn any > event-driven/state-machine problem into one that is managed with > 'normal' control state; i.e., for loops, while loops, nested procedure > calls, etc... > > Here are a few possible real-world uses: Thanks, Sam! Very useful collection of suggestions. (How come I'm not surprised to see these coming from you ;-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Mon Jun 28 21:08:42 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Mon, 28 Jun 1999 16:08:42 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <199906281935.PAA01439@eric.cnri.reston.va.us> References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> <199906281935.PAA01439@eric.cnri.reston.va.us> Message-ID: <14199.54858.464165.381344@amarok.cnri.reston.va.us> Guido van Rossum writes: >any thoughts to spare at the moment. Perhaps someone could present >the case with some more background info? (It does sounds intriguing, Paul is probably suggesting this so that Python comes with a fast, standardized XML parser out of the box. On the other hand, where do you draw the line? Paul suggests including PyExpat and easySAX (a small SAX implementation), but why not full SAX, and why not DOM? My personal leaning is that we can get more bang for the buck by working on the Distutils effort, so that installing a package like PyExpat becomes much easier, rather than piling more things into the core distribution. -- A.M. Kuchling http://starship.python.net/crew/amk/ The Law, in its majestic equality, forbids the rich, as well as the poor, to sleep under the bridges, to beg in the streets, and to steal bread. 
-- Anatole France From guido@CNRI.Reston.VA.US Mon Jun 28 21:17:41 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 16:17:41 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Sun, 27 Jun 1999 23:13:15 EDT." <000601bec114$2a2929c0$e19e2299@tim> References: <000601bec114$2a2929c0$e19e2299@tim> Message-ID: <199906282017.QAA01592@eric.cnri.reston.va.us> [Tim] > Moving back in time ... > > [GordonM] > > Perhaps Christian's stackless Python would enable green threads... > > [Guido] > > This has been suggested before... While this seems possible at first, > > all blocking I/O calls would have to be redone to pass control to the > > thread scheduler, before this would be useful -- a huge task! > > I didn't understand this. If I/O calls are left alone, and a green thread > hit one, the whole program just sits there waiting for the call to complete, > right? > > But if the same thing happens using "real threads" today, the same thing > happens today anyway . That is, if a thread doesn't release the > global lock before a blocking call today, the whole program just sits there > etc. > > Or do you have some other kind of problem in mind here? OK, I'll explain. Suppose there's a wrapper for a read() call whose essential code looks like this: Py_BEGIN_ALLOW_THREADS n = read(fd, buffer, size); Py_END_ALLOW_THREADS When the read() call is made, other threads can run. However in green threads (e.g. using Christian's stackless Python, where a thread switcher is easily added) the whole program would block at this point. The way to fix this is to have a way to tell the scheduler "come back to this thread when there's input ready on this fd". The scheduler has to combine such calls from all threads into a single giant select. It gets more complicated when you have blocking I/O wrapped in library functions, e.g. gethostbyname() or fread(). Then, you need to have a way to implement sleep() by talking to the thread scheduler (remember, this is the thread scheduler we have to write ourselves). Oh, and of course the thread scheduler must also have a select() lookalike API so I can still implement the select module. Does this help? Or am I misunderstanding your complaint? Or is a <wink> missing? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@CNRI.Reston.VA.US Mon Jun 28 21:23:57 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 16:23:57 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Your message of "Sun, 27 Jun 1999 22:27:06 EDT." <000501bec10d$b6f1fb40$e19e2299@tim> References: <000501bec10d$b6f1fb40$e19e2299@tim> Message-ID: <199906282023.QAA01605@eric.cnri.reston.va.us> > > yes I'm happy - chris > > You should be! So how come nobody else is ? Chris and I have been through this in private, but it seems that as long as I don't fess up in public I'm afraid it will come back and I'll get pressure coming at me to endorse Chris' code. I have no problem with the general concept (see my response to Sam's post of exciting examples). But I have a problem with a megapatch like this that affects many places including very sensitive areas like the main loop in ceval.c. The problem is simply that I know this is very intricate code, and I can't accept a patch of this scale to this code before I understand every little detail of the patch.
I'm just too worried otherwise that there's a reference count bug in it that will very subtly break stuff and that will take forever to track down; I feel that when I finally have the time to actually understand the whole patch I'll be able to prevent that (famous last words). Please don't expect action or endorsement of Chris' patch from me any time soon, I'm too busy. However I'd love it if others used the patch in a real system and related their experiences regarding performance, stability etc. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@mojam.com (Skip Montanaro) Mon Jun 28 21:24:46 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 28 Jun 1999 16:24:46 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <14199.54858.464165.381344@amarok.cnri.reston.va.us> References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> <199906281935.PAA01439@eric.cnri.reston.va.us> <14199.54858.464165.381344@amarok.cnri.reston.va.us> Message-ID: <14199.55737.544299.718558@cm-24-29-94-19.nycap.rr.com> Andrew> My personal leaning is that we can get more bang for the buck by Andrew> working on the Distutils effort, so that installing a package Andrew> like PyExpat becomes much easier, rather than piling more things Andrew> into the core distribution. Amen to that. See Guido's note and my response regarding soundex in the Doc-SIG. Perhaps you could get away with a very small core distribution that only contained the stuff necessary to pull everything else from the net via http or ftp... Skip From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Mon Jun 28 22:20:05 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Mon, 28 Jun 1999 17:20:05 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> Message-ID: <14199.59141.447168.107784@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> Sorry for the delayed response. I've been out of town. When SM> Barry returns would it be possible to merge the string methods SM> in conditionally (#ifdef STRING_METHODS) and add a SM> --with-string-methods configure option? How hard would it be SM> to modify string.py, stringobject.c and stropmodule.c to carry SM> that around? How clean do you want this separation to be? Just disabling the actual string methods would be easy, and I'm sure I can craft a string.py that would work in either case (remember stropmodule.c wasn't even touched). There are a few other miscellaneous changes mostly having to do with some code cleaning, but those are probably small (and uncontroversial?) enough that they can either stay in, or be easily understood and accepted (optimistic aren't I? :) by Guido during the merge. I'll see what I can put together in the next 1/2 hour or so. -Barry From skip@mojam.com (Skip Montanaro) Mon Jun 28 22:37:03 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 28 Jun 1999 17:37:03 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? 
In-Reply-To: <14199.59141.447168.107784@anthem.cnri.reston.va.us> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> Message-ID: <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> >>>>> "BAW" == Barry A Warsaw writes: >>>>> "SM" == Skip Montanaro writes: SM> would it be possible to merge the string methods in conditionally SM> (#ifdef STRING_METHODS) ... BAW> How clean do you want this separation to be? Just disabling the BAW> actual string methods would be easy, and I'm sure I can craft a BAW> string.py that would work in either case (remember stropmodule.c BAW> wasn't even touched). Barry, I would be happy with having to manually #define STRING_METHODS in stringobject.c. Forget about the configure flag at first. I think the main point for experimenters like myself is that it is a hell of a lot easier to twiddle a #define than to try merging different CVS branches to get access to the functionality. Most of us have probably advanced far enough on the Emacs, vi or Notepad learning curves to handle that change, while most of us are probably not CVS wizards. Once it's in the main CVS branch, you can announce the change or not on the main list as you see fit (perhaps on python-dev sooner and on python-list later after some more experience has been gained with the patches). Skip From tismer@appliedbiometrics.com Mon Jun 28 22:41:28 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Mon, 28 Jun 1999 23:41:28 +0200 Subject: [Python-Dev] ob_refcnt access References: <000501bec10d$b6f1fb40$e19e2299@tim> <199906282023.QAA01605@eric.cnri.reston.va.us> Message-ID: <3777EC08.42C15478@appliedbiometrics.com> Guido van Rossum wrote: > > > > yes I'm happy - chris > > > > You should be! So how come nobody else is ? > > Chris and I have been through this in private, but it seems that as > long as I don't fess up in public I'm afraid it will come back and > I'll get pressure coming at me to endorse Chris' code. Please let me add a few comments. > I have no problem with the general concept (see my response to Sam's > post of exciting examples). This is the most valuable statement I can get. And see below. > But I have a problem with a megapatch like this that affects many > places including very sensitive areas like the main loop in ceval.c. Actually it is a rather small patch, but the implicit semantic change is rather hefty. > The problem is simply that I know this is very intricate code, and I > can't accept a patch of this scale to this code before I understand > every little detail of the patch. I'm just too worried otherwise that > there's a reference count bug in it that will very subtly break stuff > and that will take forever to track down; I feel that when I finally > have the time to actually understand the whole patch I'll be able to > prevent that (famous last words). I never expected to see this patch go into Python right now. The current public version is an alpha 0.2. Meanwhile I have 0.3, with again new patches, and a completely reworked policy of frame refcounting. Even worse, there is a nightmare of more work which I simply had no time for. All the instance and object code must be carefully changed, since they still need to call back in a recursive way. This is hard to change until I have a better mechanism to generate all the callbacks. For instance, I cannot switch tasks in an __init__ at this time.
Although I can do so in regular methods. But this is all half-baked. In other words, the danger is by far not over, but still in the growing phase. I believe I should work on and maintain this until I'm convinced that there are not more refcount bugs than before, and until I have eliminated every recursion which has a serious impact. This is still months of work. When I release the final version, I will pay $100 to the first person who finds a refcount bug which I introduced. But not before. I don't want to waste Guido's time, and for sure not now with this bloody fresh code. What I needed to know is whether I am on the right track or if I'm wasting my time. But since I have users already, it is no waste at all. What I really could use would be some hints about API design. Guido, thank you for Python - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw@python.org Mon Jun 28 23:04:05 1999 From: bwarsaw@python.org (Barry A. Warsaw) Date: Mon, 28 Jun 1999 18:04:05 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> Message-ID: <14199.61781.695240.71428@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> I would be happy with having to manually #define SM> STRING_METHODS in stringobject.c. Forget about the configure SM> flag at first. Oh, I agree -- I wasn't going to add the configure flag anyway :) What I meant was how much of my changes should be ifdef-out-able? Just the methods on string objects? All my changes? -Barry From skip@mojam.com (Skip Montanaro) Mon Jun 28 23:30:55 1999 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Mon, 28 Jun 1999 18:30:55 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? In-Reply-To: <14199.61781.695240.71428@anthem.cnri.reston.va.us> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> <14199.61781.695240.71428@anthem.cnri.reston.va.us> Message-ID: <14199.63115.58129.480522@cm-24-29-94-19.nycap.rr.com> BAW> Oh, I agree -- I wasn't going to add the configure flag anyway :) BAW> What I meant was how much of my changes should be ifdef-out-able? BAW> Just the methods on string objects? All my changes? Well, when the CPP macro is undefined, the behavior from Python should be unchanged, yes? Am I missing something? There are string methods and what else involved in the changes? If string.py has to test to see if "".capitalize yields an AttributeError to decide what to do, I think that sort of change will be simple enough to accommodate. Any new code that gets well-exercised now before string methods become widely available is all to the good in my opinion. It's not fixing something that ain't broke, more like laying the groundwork for new directions. Skip From bwarsaw@python.org Tue Jun 29 00:04:55 1999 From: bwarsaw@python.org (Barry A.
Warsaw) Date: Mon, 28 Jun 1999 19:04:55 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> <14199.61781.695240.71428@anthem.cnri.reston.va.us> <14199.63115.58129.480522@cm-24-29-94-19.nycap.rr.com> Message-ID: <14199.65431.161001.730247@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> Well, when the CPP macro is undefined, the behavior from SM> Python should be unchanged, yes? Am I missing something? SM> There are string methods and what else involved in the SM> changes? There are a few additions to the C API, but these probably don't need to be ifdef'd, since they don't change the existing semantics or interfaces. abstract.c has some code cleaning and reorganization, but the public API and semantics should be unchanged. Builtin long() and int() have grown an extra optional argument, which specifies the base to use. If this extra argument isn't given then they should work the same as in the main branch. Should we ifdef out the extra argument? SM> If string.py has to test to see if "".capitalize yields an SM> AttributeError to decide what to do, I think that sort of SM> change will be simple enough to accommodate. Basically what I've got is to move the main-branch string.py to stringold.py and if you get an attribute error on ''.upper I do a "from stringold import *". I've also got some hackarounds for test_string.py to make it work with or without string methods. SM> Any new code that gets well-exercised now before string SM> methods become widely available is all to the good in my SM> opinion. It's not fixing something that ain't broke, more SM> like laying the groundwork for new directions. Agreed. I'll check my changes in shortly. The ifdef will only disable the string methods. long() and int() will still accept the option argument. Stay tuned, -Barry From tim_one@email.msn.com Tue Jun 29 05:16:34 1999 From: tim_one@email.msn.com (Tim Peters) Date: Tue, 29 Jun 1999 00:16:34 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199906282017.QAA01592@eric.cnri.reston.va.us> Message-ID: <000201bec1e6$2c496940$229e2299@tim> [Tim, claims not to understand Guido's > While this seems possible at first, all blocking I/O calls would > have to be redone to pass control to the thread scheduler, before > this would be useful -- a huge task! ] [Guido replies, sketching an elaborate scheme for making threads that are fake nevertheless act like real threads in the particular case of potentially blocking I/O calls] > ... > However in green threads (e.g. using Christian's stackless Python, > where a thread switcher is easily added) the whole program would block > at this point. The way to fix this is [very painful ]. > ... > Does this help? Or am I misunderstanding your complaint? Or is a > missing? No missing wink; I think it hinges on a confusion about the meaning of your original word "useful". Threads can be very useful purely as a means for algorithm structuring, due to independent control flows. Indeed, I use threads in Python most often these days without any hope or even *use* for potential parallelism (overlapped I/O or otherwise). 
It's the only non-brain-busting way to write code now that requires advanced control of the iterator, generator, coroutine, or even independent-agents-in-a-pipeline flavors. Fake threads would allow code like that to run portably, and also likely faster than with the overheads of OS-level threads. For pedagogical and debugging purposes too, fake threads could be very much friendlier than the real thing. Heck, we could even run them on a friendly old Macintosh . If all fake threads block when any hits an I/O call, waiting for the latter to return, we're no worse off than in a single-threaded program. Being "fake threads", it *is* a single-threaded program, so it's not even a surprise . Maybe in your Py_BEGIN_ALLOW_THREADS n = read(fd, buffer, size); Py_END_ALLOW_THREADS you're assuming that some other Python thread needs to run in order for the read implementation to find something to read? Then that's a dead program for sure, as it would be for a single-threaded run today too. I can live with that! I don't expect fake threads to act like real threads in all cases. My assumption was that the BEGIN/END macros would do nothing under fake threads -- since there isn't a real thread backing it up, a fake thread can't yield in the middle of random C code (Python has no way to capture/restore the C state). I didn't picture fake threads working except as a Python-level feature, with context switches limited to bytecode boundaries (which a stackless ceval can handle with ease; the macro context switch above is "in the middle of" some bytecode's interpretation, and while "green threads" may be interested in simulating that, Tim's "fake threads" aren't). different-threads-for-different-heads-ly y'rs - tim From guido@CNRI.Reston.VA.US Tue Jun 29 13:01:30 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 29 Jun 1999 08:01:30 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Your message of "Tue, 29 Jun 1999 00:16:34 EDT." <000201bec1e6$2c496940$229e2299@tim> References: <000201bec1e6$2c496940$229e2299@tim> Message-ID: <199906291201.IAA02535@eric.cnri.reston.va.us> > [Tim, claims not to understand Guido's > > > While this seems possible at first, all blocking I/O calls would > > have to be redone to pass control to the thread scheduler, before > > this would be useful -- a huge task! > > ] > > [Guido replies, sketching an elaborate scheme for making threads that > are fake nevertheless act like real threads in the particular case of > potentially blocking I/O calls] [Tim responds, explaining that without this threads are quite useful.] I guess it's all in the perspective. 99.99% of all thread apps I've ever written use threads primarily to overlap I/O -- if there wasn't I/O to overlap I wouldn't use a thread. I think I share this perspective with most of the thread community (after all, threads originate in the OS world where they were invented as a replacement for I/O completion routines). (And no, I don't use threads to get the use of multiple CPUs, since I almost never have had more than one of those. And no, I wasn't expecting the read() to be fed from another thread.) As far as I can tell, all the examples you give are easily done using coroutines. Can we call whatever you're asking for coroutines instead of fake threads? I think that when you mention threads, green or otherwise colored, most people who are at all familiar with the concept will assume they provide I/O overlapping, except perhaps when they grew up in the parallel machine world.
Certainly all examples I give in my never-completed thread tutorial (still available at http://www.python.org/doc/essays/threads.html) use I/O as the primary motivator -- this kind of example appeals to simple souls (e.g. downloading more than one file in parallel, which they probably have already seen in action in their web browser), as opposed to generators or pipelines or coroutines (for which you need to have some programming theory background to appreciate the powerful abstraction possibilities they give). Another good use of threads (suggested by Sam) is for GUI programming. An old GUI system, NeWS by David Rosenthal at Sun, used threads programmed in PostScript -- very elegant (and it failed for other reasons -- if only he had used Python instead :-). On the other hand, having written lots of GUI code using Tkinter, the event-driven version doesn't feel so bad to me. Threads would be nice when doing things like rubberbanding, but I generally agree with Ousterhout's premise that event-based GUI programming is more reliable than thread-based. Every time your Netscape freezes you can bet there's a threading bug somewhere in the code. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Wed Jun 30 01:03:37 1999 From: gmcm@hypernet.com (Gordon McMillan) Date: Tue, 29 Jun 1999 19:03:37 -0500 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199906291201.IAA02535@eric.cnri.reston.va.us> References: Your message of "Tue, 29 Jun 1999 00:16:34 EDT." <000201bec1e6$2c496940$229e2299@tim> Message-ID: <1281421591-30373695@hypernet.com> I've been out of town, too (not with Skip), but I'll jump back in here... [Guido] > When the read() call is made, other threads can run. However in > green threads (e.g. using Christian's stackless Python, where a > thread switcher is easily added) the whole program would block at > this point. The way to fix this is to have a way to tell the > scheduler "come back to this thread when there's input ready on > this fd". The scheduler has to combine such calls from all > threads into a single giant select. It gets more complicated when > you have blocking I/O I suppose, in the best of all possible worlds, this is true. But I'm fairly sure there are a number of well-used green thread implementations which go only part way - eg, if this is a "selectable" fd, do a select with a timeout of 0 on this one fd and choose to read/write or swap accordingly. That's a fair amount of bang for the buck, I think... [Tim] > Threads can be very useful purely as a means for algorithm > structuring, due to independent control flows. Spoken like a true schizo, Tim me boyos! Actually, you and Guido are saying almost the same thing - threads are useful when more than one thing is "driving" your processing. It's just that in the real world, that's almost always I/O, not some sick, tortured internal dialogue... I think the real question is: how useful would this be on a Mac? On Win31? (I'll answer that - useful, though I've finally got my last Win31 client to promise to upgrade, RSN ). - Gordon From MHammond@skippinet.com.au Wed Jun 30 00:47:26 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Wed, 30 Jun 1999 09:47:26 +1000 Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc?
Message-ID: <006f01bec289$bf1e3a90$0801a8c0@bobcat> This probably isnt the correct list, but I really dont want to start a philosophical discussion - hopefully people here are both "in the know" and able to resist a huge thread :-) Especially given the recent slashdot flamefest between RMS and ESR, I thought it worth getting correct. I just read a statement early in our book - "Python is an Open Source tool, ...". Is this "near enough"? Should I avoid this term in preference for something more generic (ie, even simply dropping the caps?) - but the OS(tm) idea seems doomed anyway... Just-hoping-to-avoid-flame-mail-from-rabid-devotees-of-either-religion :-) Mark. From da@ski.org Wed Jun 30 07:16:01 1999 From: da@ski.org (David Ascher) Date: Tue, 29 Jun 1999 23:16:01 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? In-Reply-To: <006f01bec289$bf1e3a90$0801a8c0@bobcat> Message-ID: On Wed, 30 Jun 1999, Mark Hammond wrote: > I just read a statement early in our book - "Python is an Open Source tool, > ...". > > Is this "near enough"? Should I avoid this term in preference for > something more generic (ie, even simply dropping the caps?) - but the > OS(tm) idea seems doomed anyway... It's not certified Open Source, but my understanding is that ESR believes the Python license would qualify if GvR applied for certification. BTW, you won't be able to avoid flames about something or other, and given that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and pseudo-RMSs, all Anonymous Cowards. =) --david From fredrik@pythonware.com Wed Jun 30 09:42:15 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 30 Jun 1999 10:42:15 +0200 Subject: [Python-Dev] Is Python Free Software, free software, Open Source,open source, etc? References: Message-ID: <012601bec2d4$74c315b0$f29b12c2@secret.pythonware.com> > BTW, you won't be able to avoid flames about something or other, and given > that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and > pseudo-RMSs, all Anonymous Cowards. =) just check the latest "learning python" review on Amazon... surely proves that perlers are weird people ;-) From guido@CNRI.Reston.VA.US Wed Jun 30 13:06:21 1999 From: guido@CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 30 Jun 1999 08:06:21 -0400 Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? In-Reply-To: Your message of "Tue, 29 Jun 1999 23:16:01 PDT." References: Message-ID: <199906301206.IAA04619@eric.cnri.reston.va.us> > On Wed, 30 Jun 1999, Mark Hammond wrote: > > > I just read a statement early in our book - "Python is an Open Source tool, > > ...". > > > > Is this "near enough"? Should I avoid this term in preference for > > something more generic (ie, even simply dropping the caps?) - but the > > OS(tm) idea seems doomed anyway... > > It's not certified Open Source, but my understanding is that ESR believes > the Python license would qualify if GvR applied for certification. I did, months ago, and haven't heard back yet. My current policy is to drop the initial caps and say "open source" -- most people don't know the difference anyway. > BTW, you won't be able to avoid flames about something or other, and given > that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and > pseudo-RMSs, all Anonymous Cowards. =) I don't have the time to read slashdot -- can anyone summarize what ESR and RMS were flaming about? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Wed Jun 30 13:22:09 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 30 Jun 1999 14:22:09 +0200 Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? References: <199906301206.IAA04619@eric.cnri.reston.va.us> Message-ID: <000701bec2f3$2df78430$f29b12c2@secret.pythonware.com> > I did, months ago, and haven't heard back yet. My current policy is > to drop the initial caps and say "open source" -- most people don't > know the difference anyway. and "Open Source" cannot be trademarked anyway... > I don't have the time to read slashdot -- can anyone summarize what > ESR and RMS were flaming about? the usual; RMS wrote in saying that 1) he's not part of the open source movement, 2) open source folks don't under- stand the real meaning of the word freedom, and 3) he's not a communist. ESR response is here: http://www.tuxedo.org/~esr/writings/shut-up-and-show-them.html ... OSI's tactics work. That's the easy part of the lesson. The hard part is that the FSF's tactics don't work, and never did. ... So the next time RMS, or anybody else, urges you to "talk about freedom", I urge you to reply "Shut up and show them the code." imo, the best thing is of course to ignore them both, and continue to ship great stuff under a truly open license... From bwarsaw@cnri.reston.va.us (Barry A. Warsaw) Wed Jun 30 13:54:06 1999 From: bwarsaw@cnri.reston.va.us (Barry A. Warsaw) (Barry A. Warsaw) Date: Wed, 30 Jun 1999 08:54:06 -0400 (EDT) Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? References: <199906301206.IAA04619@eric.cnri.reston.va.us> <000701bec2f3$2df78430$f29b12c2@secret.pythonware.com> Message-ID: <14202.4974.162380.284749@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> imo, the best thing is of course to ignore them both, and FL> continue to ship great stuff under a truly open license... Agreed, of course. I think given the current state of affairs (i.e. the non-trademarkability of "Open Source", but also the mind share that little-oh, little-ess has gotten), we should say that Python (and JPython) are "open source" projects and let people make up their own minds about what that means. waiting-for-guido's-inevitable-faq-entry-ly y'rs, -Barry From tismer@appliedbiometrics.com Tue Jun 29 19:17:51 1999 From: tismer@appliedbiometrics.com (Christian Tismer) Date: Tue, 29 Jun 1999 20:17:51 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000201bec1e6$2c496940$229e2299@tim> <199906291201.IAA02535@eric.cnri.reston.va.us> Message-ID: <37790DCF.7C0E8FA@appliedbiometrics.com> Guido van Rossum wrote: > [Guido and Tim, different opinions named misunderstanding :] > > I guess it's all in the perspective. 99.99% of all thread apps I've > ever written use threads primarily to overlap I/O -- if there wasn't > I/O to overlap I wouldn't use a thread. I think I share this > perspective with most of the thread community (after all, threads > originate in the OS world where they were invented as a replacement > for I/O completion routines). > > (And no, I don't use threads to get the use of multiple CPUs, since I > almost never have had more than one of those. And no, I wasn't > expecting the read() to be fed from another thread.) > > As far as I can tell, all the examples you give are easily done using > coroutines. 
Can we call whatever you're asking for coroutines instead > of fake threads? I don't think this would match it. These threads can be implemented by coroutines which always run apart, and have some scheduling running. When there is polled I/O available, they can of course give a threaded feeling. If an application polls the kbhit function instead of reading, the other "threads" can run nicely. Can be quite useful for very small computers like CE. Many years before, I had my own threads under Turbo Pascal (I had no idea that these are called so). Ok, this was DOS, but it was enough of threading to have a "process" which smoothly updated a graphics screen, while another (single! :) "process" wrote data to the disk, a third one handled keyboard input, and a fourth drove a multichannel A/D sampling device. ? Oops, I just realized that these were *true* threads. The disk process would not run smooth, I agree. All the rest would be fine with green threads. ... > On the other hand, having written lots of GUI code using Tkinter, the > event-driven version doesn't feel so bad to me. Threads would be nice > when doing things like rubberbanding, but I generally agree with > Ousterhout's premise that event-based GUI programming is more reliable > than thread-based. Every time your Netscape freezes you can bet > there's a threading bug somewhere in the code. Right. But with a traceback instead of a machine hang, this could be more attractive to do. Green threads/coroutines are incredibly fast (one c call per switch). And since they have local state, you can save most of the attribute lookups which are needed with event based programming. (But this is all theory until we tried it). ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home
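To make the scheduling idea from this thread concrete, here is a rough sketch of the kind of scheduler Guido and Gordon are describing: each green thread gives up control voluntarily, either just to be polite or because it wants to wait on a file descriptor, and the scheduler parks the waiters and folds all of their descriptors into one select() call. Generators are used below purely as a stand-in for the coroutine switching that stackless Python would provide (they do not exist in 1.5.2), and the sleep()/timeout handling Guido points out would still be missing:

    import select

    class Scheduler:
        # Tasks yield None to give up the CPU, or ('read', fd) to be
        # parked until select() reports fd readable.
        def __init__(self):
            self.ready = []        # runnable tasks, round robin
            self.waiting = {}      # fd -> task parked on that fd

        def add(self, task):
            self.ready.append(task)

        def run(self):
            while self.ready or self.waiting:
                if not self.ready:
                    # Everybody is blocked: one combined select for all fds.
                    r, w, e = select.select(self.waiting.keys(), [], [])
                    for fd in r:
                        self.ready.append(self.waiting[fd])
                        del self.waiting[fd]
                task = self.ready.pop(0)
                try:
                    request = task.next()      # run until the next yield
                except StopIteration:
                    continue                   # task finished
                if request is None:
                    self.ready.append(task)    # plain reschedule
                else:
                    self.waiting[request[1]] = task

    def echo(sock):
        # A hypothetical green thread: parks instead of blocking everyone.
        while 1:
            yield ('read', sock.fileno())
            data = sock.recv(4096)
            if not data:
                break
            sock.send(data)

Blocking calls buried inside C library functions (gethostbyname, fread) still stop everything, which is exactly the hard part Guido describes.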
From da at ski.org Mon Jun 7 18:51:45 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 09:51:45 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071649.MAA12619@eric.cnri.reston.va.us> Message-ID: On Mon, 7 Jun 1999, Guido van Rossum wrote: > When I saw this, my own response was simply "those poor Perl suckers > are relying too much of fork()." Am I wrong, and is this also a habit > of Python programmers? Well, I find the fork() model to be a very simple one to use, much easier to manage than threads or full-fledged IPC. So, while I don't rely on it in any crucial way, it's quite convenient at times. --david From guido at CNRI.Reston.VA.US Mon Jun 7 18:56:22 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 12:56:22 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 09:51:45 PDT." References: Message-ID: <199906071656.MAA12642@eric.cnri.reston.va.us> > Well, I find the fork() model to be a very simple one to use, much easier > to manage than threads or full-fledged IPC. So, while I don't rely on it > in any crucial way, it's quite convenient at times.
Can you give a typical example where you use it, or is this just a gut feeling? It's also dangerous -- e.g. unexpected errors may percolate down the wrong stack (many mailman bugs had to do with forking), GUI apps generally won't be cloned, and some extension libraries don't like to be cloned either (e.g. ILU). --Guido van Rossum (home page: http://www.python.org/~guido/) From da at ski.org Mon Jun 7 19:02:31 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 10:02:31 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071656.MAA12642@eric.cnri.reston.va.us> Message-ID: On Mon, 7 Jun 1999, Guido van Rossum wrote: > Can you give a typical example where you use it, or is this just a gut > feeling? Well, the latest example was that I wanted to spawn a Python process to do viewing of NumPy arrays with Tk from within the Python interactive shell (without using a shell wrapper). It's trivial with a fork(), and non-trivial with threads. The solution I had to finalize on was to branch based on OS and do threads where threads are available and fork() otherwise. Likely 2.05 times as many errors as with a single solution =). > It's also dangerous -- e.g. unexpected errors may percolate down the > wrong stack (many mailman bugs had to do with forking), GUI apps > generally won't be cloned, and some extension libraries don't like to > be cloned either (e.g. ILU). More dangerous than threads? Bwaaahaahaa! =). fork() might be "deceivingly simple in appearance", I grant you that. But sometimes that's good enough. It's also possible that fork() without all of its process-handling relatives isn't useful enough to warrant the effort. --david From bwarsaw at cnri.reston.va.us Mon Jun 7 19:05:20 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 7 Jun 1999 13:05:20 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl References: <199906071656.MAA12642@eric.cnri.reston.va.us> Message-ID: <14171.64464.805578.325069@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> It's also dangerous -- e.g. unexpected errors may percolate Guido> down the wrong stack (many mailman bugs had to do with Guido> forking), GUI apps generally won't be cloned, and some Guido> extension libraries don't like to be cloned either Guido> (e.g. ILU). Rambling mode on... Okay, so you can't guarantee that fork will be everywhere you might want to run an application. For example, that's one of the main reasons Mailman hasn't been ported off of Un*x. But you also can't guarantee that threads will be everywhere either. One of the things I'd (eventually) like to do is to re-architect Mailman so that it uses a threaded central server instead of the current one-shot process model. But there's been debate among the developers because 1) threads aren't supported everywhere, and 2) thread support isn't built-in by default anyway. I wonder if it's feasible or useful to promote threading support in Python? Thoughts would include building threads in by default if possible on the platform, integrating Greg's free threading mods, etc. Providing more integrated support for threads might encourage programmers to reach for that particular tool instead of fork, which is crude, but pretty damn handy and easy to use. Rambling mode off... 
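A concrete picture of what "reaching for threads instead of fork" looks like with the batteries that already ship (threading and urllib are both in 1.5.2 when thread support is compiled in; the URLs are just placeholders, and since Guido notes elsewhere in this thread that urllib needed thread-safety fixes in this era, treat this purely as an illustration of the threading API):

    import threading, urllib

    def fetch(url):
        data = urllib.urlopen(url).read()
        print url, "->", len(data), "bytes"

    urls = ["http://www.python.org/", "http://www.python.org/doc/"]
    workers = []
    for url in urls:
        t = threading.Thread(target=fetch, args=(url,))
        t.start()               # downloads overlap instead of running serially
        workers.append(t)
    for t in workers:
        t.join()                # wait for every download to finish

No fork(), no IPC, and the same code runs on platforms that have threads but no fork -- which is the portability point being made here.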
-Barry From jim at digicool.com Mon Jun 7 19:07:59 1999 From: jim at digicool.com (Jim Fulton) Date: Mon, 07 Jun 1999 13:07:59 -0400 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <375BFC6F.BF779796@digicool.com> David Ascher wrote: > > On Mon, 7 Jun 1999, Guido van Rossum wrote: > > > When I saw this, my own response was simply "those poor Perl suckers > > are relying too much of fork()." Am I wrong, and is this also a habit > > of Python programmers? > > Well, I find the fork() model to be a very simple one to use, much easier > to manage than threads or full-fledged IPC. So, while I don't rely on it > in any crucial way, it's quite convenient at times. Interesting. I prefer threads because they eliminate the *need* for an IPC. I find locks and the various interesting things you can build from them to be much easier to deal with and more elegant than IPC. I wonder if the perl folks are also going to emulate doing IPC in the same process. Hee hee. :) Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From da at ski.org Mon Jun 7 19:10:56 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 10:10:56 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14171.64464.805578.325069@anthem.cnri.reston.va.us> Message-ID: On Mon, 7 Jun 1999, Barry A. Warsaw wrote: > I wonder if it's feasible or useful to promote threading support in > Python? Thoughts would include building threads in by default if > possible on the platform, That seems a good idea to me. It's a relatively safe thing to enable by default, no? > Providing more integrated support for threads might encourage > programmers to reach for that particular tool instead of fork, which > is crude, but pretty damn handy and easy to use. While we're at it, it'd be nice if we could provide a better answer when someone asks (as "they" often do) "how do I program with threads in Python" than our usual "the way you'd do it in C". Threading tutorials are very hard to come by, I've found (I got the ORA multi-threaded programming in win32, but it's such a monster I've barely looked at it). I suggest that we allocate about 10% of TimBot's time to that task. If necessary, we can upgrade it to a dual-CPU setup. With Greg's threading patches, we could even get it to run on both CPUs efficiently. It could write about itself. --david From akuchlin at mems-exchange.org Mon Jun 7 19:20:15 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Mon, 7 Jun 1999 13:20:15 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> Message-ID: <14171.65359.306743.276505@amarok.cnri.reston.va.us> David Ascher writes: >While we're at it, it'd be nice if we could provide a better answer when >someone asks (as "they" often do) "how do I program with threads in >Python" than our usual "the way you'd do it in C". Threading tutorials >are very hard to come by, I've found (I got the ORA multi-threaded Agreed; I'd love to see a HOWTO on thread programming. 
I really liked Andrew Birrell's introduction to threads for Modula-3; see http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html (Postscript and PDF versions available.) Translating its approach to Python would be an excellent starting point. -- A.M. Kuchling http://starship.python.net/crew/amk/ "If you had stayed with us, we could have given you life until death." "Don't I get that anyway?" -- Stheno and Lyta Hall, in SANDMAN #61: "The Kindly Ones:5" From guido at CNRI.Reston.VA.US Mon Jun 7 19:24:45 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 13:24:45 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 13:20:15 EDT." <14171.65359.306743.276505@amarok.cnri.reston.va.us> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14171.65359.306743.276505@amarok.cnri.reston.va.us> Message-ID: <199906071724.NAA12743@eric.cnri.reston.va.us> > David Ascher writes: > >While we're at it, it'd be nice if we could provide a better answer when > >someone asks (as "they" often do) "how do I program with threads in > >Python" than our usual "the way you'd do it in C". Threading tutorials > >are very hard to come by, I've found (I got the ORA multi-threaded Andrew Kuchling chimes in: > Agreed; I'd love to see a HOWTO on thread programming. I really > liked Andrew Birrell's introduction to threads for Modula-3; see > http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html > (Postscript and PDF versions available.) Translating its approach to > Python would be an excellent starting point. Another idea is for someone to finish the thread tutorial that I started early 1998 (and never finished because I realized that it needed the threading module and some thread-safety patches to urllib for the examples I had in mind to work). It's actually on the website (but unlinked-to): http://www.python.org/doc/essays/threads.html --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at cnri.reston.va.us Mon Jun 7 19:28:57 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Mon, 7 Jun 1999 13:28:57 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071724.NAA12743@eric.cnri.reston.va.us> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14171.65359.306743.276505@amarok.cnri.reston.va.us> <199906071724.NAA12743@eric.cnri.reston.va.us> Message-ID: <14172.289.552901.264826@bitdiddle.cnri.reston.va.us> Indeed, it might be better to start with the threading module for the first tutorial. While I'm also a fan of Birrell's paper, it would encourage people to start with the low-level thread module, instead of the higher-level threading module. So the right answer, of course, is to do both! Jeremy From bwarsaw at cnri.reston.va.us Mon Jun 7 19:36:05 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 7 Jun 1999 13:36:05 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> Message-ID: <14172.773.807413.412693@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: >> I wonder if it's feasible or useful to promote threading >> support in Python? Thoughts would include building threads in >> by default if possible on the platform, DA> That seems a good idea to me. It's a relatively safe thing to DA> enable by default, no? 
Don't know how hard it would be to write the appropriate configure tests, but then again, if it was easy I'd'a figured Guido would have done it already. A simple thing would be to change the default sense of "Do we build in thread support?". Make this true by default, and add a --without-threads configure flag people can use to turn them off. -Barry From skip at mojam.com Tue Jun 8 00:37:38 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 7 Jun 1999 18:37:38 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.773.807413.412693@anthem.cnri.reston.va.us> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14172.773.807413.412693@anthem.cnri.reston.va.us> Message-ID: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> BAW> A simple thing would be to change the default sense of "Do we build BAW> in thread support?". Make this true by default, and add a BAW> --without-threads configure flag people can use to turn them off. True enough, but as Guido pointed out, enabling threads by default would immediately make the Mac a second-class citizen. Test cases and demos would eventually find their way into the distribution that Mac users could not run, etc., etc. It may not account for a huge fraction of the Python development seats, but it seems a shame to leave it out in the cold. Has there been an assessment of how hard it would be to add thread support to the Mac? On a scale of 1 to 10 (1: we know how, but it's not implemented because nobody's needed it so far, 10: drilling for oil on the sun would be easier), how hard would it be? I assume Jack Jansen is on this list. Jack, any thoughts? Alpha code? Pre-alpha code? Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From da at ski.org Tue Jun 8 00:43:32 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 15:43:32 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: On Mon, 7 Jun 1999, Skip Montanaro wrote: > True enough, but as Guido pointed out, enabling threads by default would > immediately make the Mac a second-class citizen. Test cases and demos would > eventually find their way into the distribution that Mac users could not > run, etc., etc. It may not account for a huge fraction of the Python > development seats, but it seems a shame to leave it out in the cold. I'm not sure I buy that argument. There are already thread demos in the current directory, and no one complains. The windows builds are already threaded by default, and it's not caused any problems that I know of. Think of it like enabling the *new* module. =) > Has there been an assessment of how hard it would be to add thread > support to the Mac? That's an interesting question, especially since ActiveState lists it as a machine w/ threads and w/o fork(). --david From skip at mojam.com Tue Jun 8 00:49:12 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 7 Jun 1999 18:49:12 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com> David> I'm not sure I buy that argument. Think of it like enabling the David> *new* module. =) That's not quite the same thing. The new module simply exposes some normally closed-from-Python-code data structures to the Python programmer. 
Enabling threads requires some support from the underlying runtime system. If that was already in place, I suspect the Mac binaries would come with the thread module enabled by default, yes? Skip From da at ski.org Tue Jun 8 00:58:22 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 15:58:22 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com> Message-ID: On Mon, 7 Jun 1999, Skip Montanaro wrote: > That's not quite the same thing. The new module simply exposes some > normally closed-from-Python-code data structures to the Python programmer. > Enabling threads requires some support from the underlying runtime system. > If that was already in place, I suspect the Mac binaries would come with the > thread module enabled by default, yes? I'm not denying that. It's just that there are lots of things which fall into that category, like (to take a pointed example =), os.fork(). We don't have a --with-fork configure flag. We expose to the Python programmer all of the underlying OS that is 'wrapped' as long as it's reasonably portable. I think that most unices + win32 is a reasonable approximation of 'reasonably portable'. And in fact, this change might motivate someone with Mac fervor to explore adding Python support of Mac threads. --david From gmcm at hypernet.com Tue Jun 8 02:01:56 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 7 Jun 1999 19:01:56 -0500 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: <1283322126-63868517@hypernet.com> David Ascher wrote: > On Mon, 7 Jun 1999, Skip Montanaro wrote: > > > True enough, but as Guido pointed out, enabling threads by default would > > immediately make the Mac a second-class citizen. > I'm not sure I buy that argument. There are already thread demos in > the current directory, and no one complains. The windows builds are > already threaded by default, and it's not caused any problems that I > know of. Think of it like enabling the *new* module. =) > > > Has there been an assessment of how hard it would be to add thread > > support to the Mac? > > That's an interesting question, especially since ActiveState lists > it as a machine w/ threads and w/o fork(). Not a Mac programmer, but I recall that when Steve Jobs came back, they published a schedule that said threads would be available a couple releases down the road. Schedules only move one way, so I'd guess ActiveState is premature. Perhaps Christian's stackless Python would enable green threads... (And there are a number of things in the standard distribution which don't work on Windows, either; fork and select()ing on file fds). - Gordon From skip at mojam.com Tue Jun 8 01:06:34 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 7 Jun 1999 19:06:34 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com> Message-ID: <14172.20567.40217.703269@cm-24-29-94-19.nycap.rr.com> David> I think that most unices + win32 is a reasonable approximation of David> 'reasonably portable'. And in fact, this change might motivate David> someone with Mac fervor to explore adding Python support of Mac David> threads. One can hope... 
;-) Skip From MHammond at skippinet.com.au Tue Jun 8 01:06:37 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 09:06:37 +1000 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071649.MAA12619@eric.cnri.reston.va.us> Message-ID: <000501beb13a$9eec2c10$0801a8c0@bobcat> > > In case you haven't heard about it, ActiveState has > recently signed a > > contract with Microsoft to do some work on Perl on win32. > > Have I ever heard of it! :-) David Grove pulled me into one of his > bouts of paranoia. I think he's calmed down for the moment. It sounds like a :-), but Im afraid I dont understand that reference. When I first heard this, two things sprung to mind: a) Why shouldnt Python push for a similar deal? b) Something more interesting in the MS/Python space is happening anyway, so nyah nya nya ;-) Getting some modest funds to (say) put together and maintain single core+win32 installers to place on the NT resource kit could only help Python. Sometimes I wish we had a few less good programmers, and a few more good marketting type people ;-) > Anyway, I doubt that we coould use their code, as it undoubtedly > refers to reimplementing fork() at the Perl level, not at the C level > (which would be much harder). Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate fork using the Win32 extensions? Python has basically all of the native Win32 process API exposed, and writing a "fork" in Python that only forked Python scripts (for example) may be feasable and not too difficult. It would have obvious limitations, including the fact that it is not available standard with Python on Windows (just like a working popen now :-) but if we could follow the old 80-20 rule, and catch 80% of the uses with 20% of the effort it may be worth investigating. My knowledge of fork is limited to muttering "something about cloning the current process", so I may be naive in the extreme - but is this feasible? Mark. From fredrik at pythonware.com Tue Jun 8 01:21:15 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:21:15 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <000501beb13a$9eec2c10$0801a8c0@bobcat> Message-ID: <001601beb13c$70ff5b90$f29b12c2@pythonware.com> Mark wrote: > Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate > fork using the Win32 extensions? Python has basically all of the native > Win32 process API exposed, and writing a "fork" in Python that only forked > Python scripts (for example) may be feasable and not too difficult. > > It would have obvious limitations, including the fact that it is not > available standard with Python on Windows (just like a working popen now > :-) but if we could follow the old 80-20 rule, and catch 80% of the uses > with 20% of the effort it may be worth investigating. > > My knowledge of fork is limited to muttering "something about cloning the > current process", so I may be naive in the extreme - but is this feasible? as an aside, GvR added Windows' "spawn" API in 1.5.2, so you can at least emulate some common variants of fork+exec. this means that if someone writes a spawn for Unix, we would at least catch >0% of the uses with ~0% of the effort ;-) fwiw, I'm more interested in the "unicode all the way down" parts of the activestate windows project. more on that later. 
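Fredrik's "spawn for Unix" aside is cheap to sketch: where os.spawnv does not exist, a rough stand-in can be layered on fork/exec/waitpid. This is only an illustration of the idea, not proposed library code; the constant values simply mirror the Windows ones:

    import os

    if not hasattr(os, 'spawnv'):
        P_WAIT, P_NOWAIT = 0, 1          # same values the Win32 spawn API uses

        def spawnv(mode, file, args):
            # Unix approximation of the Win32 spawnv: fork, exec, optionally wait.
            pid = os.fork()
            if pid == 0:
                try:
                    os.execv(file, args)
                finally:
                    os._exit(127)        # exec failed; don't fall back into the parent's code
            if mode == P_WAIT:
                pid, status = os.waitpid(pid, 0)
                return status >> 8       # exit code (ignores death-by-signal)
            return pid

Used as spawnv(P_WAIT, '/bin/ls', ['ls', '-l']), this covers the fork-then-exec pattern, though none of the fork-without-exec uses discussed above.
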
From gstein at lyra.org Tue Jun 8 01:10:38 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 16:10:38 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <375C516E.76EC8ED4@lyra.org> David Ascher wrote: >... > I'm not denying that. It's just that there are lots of things which fall > into that category, like (to take a pointed example =), os.fork(). We > don't have a --with-fork configure flag. We expose to the Python > programmer all of the underlying OS that is 'wrapped' as long as it's > reasonably portable. I think that most unices + win32 is a reasonable > approximation of 'reasonably portable'. And in fact, this change might > motivate someone with Mac fervor to explore adding Python support of Mac > threads. Agreed. Python isn't a least-common-demoninator language. It tries to make things easy for people. Why should we kill all platforms because of a lack on one? Having threads by default will make a lot of things much simpler (in terms of knowing the default platform). Can't tell you how many times I curse to find that the default RedHat distribution (as of 5.x) did not use threads, even though they are well-supported on Linux. And about stuff creeping into the distribution: gee... does that mean that SocketServer doesn't work on the Mac? Threads *and* fork are not available on Python/Mac, so all you would get is a single-threaded server. icky. I can't see how adding threads to other platforms will *hurt* the Macintosh platform... it can only help others. About the only reason that I can see to *not* make them the default is the slight speed loss. But that seems a bit bogus, as the interpreter loop doesn't spend *that* much time mucking with the interp_lock to allow thread switches. There have also been some real good suggestions for making it take near-zero time until you actually create that second thread. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Tue Jun 8 01:26:08 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:26:08 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <1283322126-63868517@hypernet.com> Message-ID: <002a01beb13d$1fa23c80$f29b12c2@pythonware.com> > Not a Mac programmer, but I recall that when Steve Jobs came back, > they published a schedule that said threads would be available a > couple releases down the road. Schedules only move one way, so I'd > guess ActiveState is premature. http://www.computerworld.com/home/print.nsf/all/990531AAFA > Perhaps Christian's stackless Python would enable green threads... > > (And there are a number of things in the standard distribution which > don't work on Windows, either; fork and select()ing on file fds). time to implement channels? (Tcl's unified abstraction for all kinds of streams that you could theoretically use something like select on. sockets, pipes, asynchronous disk I/O, etc). does select really work on ordinary files under Unix, btw? From fredrik at pythonware.com Tue Jun 8 01:30:57 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:30:57 +0200 Subject: [Python-Dev] ActiveState & fork & Perl Message-ID: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> I wrote: > > Not a Mac programmer, but I recall that when Steve Jobs came back, > > they published a schedule that said threads would be available a > > couple releases down the road. Schedules only move one way, so I'd > > guess ActiveState is premature. 
> > http://www.computerworld.com/home/print.nsf/all/990531AAFA which was just my way of saying that "did he perhaps refer to OS X ?". or are they adding real threads to good old MacOS too? From fredrik at pythonware.com Tue Jun 8 01:38:02 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:38:02 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <375C516E.76EC8ED4@lyra.org> Message-ID: <003f01beb13e$c95a2750$f29b12c2@pythonware.com> > Having threads by default will make a lot of things much simpler > (in terms of knowing the default platform). Can't tell you how > many times I curse to find that the default RedHat distribution > (as of 5.x) did not use threads, even though they are well- > supported on Linux. I have a vague memory that once upon a time, the standard X libraries shipped with RedHat weren't thread safe, and Tkinter didn't work if you compiled Python with threads. but I might be wrong and/or that may have changed... From MHammond at skippinet.com.au Tue Jun 8 01:42:38 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 09:42:38 +1000 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> Message-ID: <000801beb13f$6e118310$0801a8c0@bobcat> > > http://www.computerworld.com/home/print.nsf/all/990531AAFA > > which was just my way of saying that "did he perhaps > refer to OS X ?". > > or are they adding real threads to good old MacOS too? Oh, /F, please dont start adding annotations to your collection of incredibly obscure URLs - takes away half the fun ;-) Mark. From gstein at lyra.org Tue Jun 8 02:01:41 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 17:01:41 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: <375C516E.76EC8ED4@lyra.org> <003f01beb13e$c95a2750$f29b12c2@pythonware.com> Message-ID: <375C5D65.6E6CD6F@lyra.org> Fredrik Lundh wrote: > > > Having threads by default will make a lot of things much simpler > > (in terms of knowing the default platform). Can't tell you how > > many times I curse to find that the default RedHat distribution > > (as of 5.x) did not use threads, even though they are well- > > supported on Linux. > > I have a vague memory that once upon a time, the standard > X libraries shipped with RedHat weren't thread safe, and > Tkinter didn't work if you compiled Python with threads. > > but I might be wrong and/or that may have changed... Yes, it has changed. RedHat now ships with a thread-safe X so that they can use GTK and Gnome (which use threads quite a bit). There may be other limitations, however, as I haven't tried to do any threaded GUI programming, especially on a recent RedHat (I'm using a patched/hacked RH 4.1 system). RedHat 6.0 may even ship with a threaded Python, but I dunno... -g -- Greg Stein, http://www.lyra.org/ From da at ski.org Tue Jun 8 02:43:27 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 17:43:27 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000501beb13a$9eec2c10$0801a8c0@bobcat> Message-ID: On Tue, 8 Jun 1999, Mark Hammond wrote: > When I first heard this, two things sprung to mind: > a) Why shouldnt Python push for a similar deal? > b) Something more interesting in the MS/Python space is happening anyway, > so nyah nya nya ;-) > > Getting some modest funds to (say) put together and maintain single > core+win32 installers to place on the NT resource kit could only help > Python. 
How much money are we talking about (no, I'm not offering =)? I wonder if one problem we have is that the folks with $$'s don't want to advertise that they have $$'s because they don't want to be swamped with vultures (and because "that isn't done"), and the people with skills but no $$'s don't want to advertise that fact for a variety of reasons (modesty, fear of being labeled 'commercial', fear of exposing that they're not 100% busy, so "can't be good", etc.). I've been wondering if a broker service like sourceXchange for Python could work -- whether there are enough people who want something done to Python and are willing to pay for an Open Soure project (and whether there are enough "worker bees", although I suspect there are). I can think of several items on various TODO lists which could probably be tackled this way. (doing things *within* sourceXchange is clearly a possibility in the long term -- in the short term they seem focused on Linux, but time will tell). Guido, you're probably the point-man for such 'angels' -- do you get those kinds of requests periodically? How about you, Mark? One thing that ActiveState has going for it which doesn't exist in the Python world is a corporate entity devoted to software development and distribution. PPSI is a support company, or at least markets itself that way. --david From gstein at lyra.org Tue Jun 8 03:05:15 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 18:05:15 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <375C6C4B.617138AB@lyra.org> David Ascher wrote: > > On Tue, 8 Jun 1999, Mark Hammond wrote: > > > When I first heard this, two things sprung to mind: > > a) Why shouldnt Python push for a similar deal? As David points out, I believe this is simply because ActiveState is unique in their business type, products, and model. We don't have anything like that in the Python world (although Pythonware could theoretically go in a similar direction). >... > I've been wondering if a broker service like sourceXchange for Python > could work -- whether there are enough people who want something done to > Python and are willing to pay for an Open Soure project (and whether there > are enough "worker bees", although I suspect there are). I can think of > several items on various TODO lists which could probably be tackled this > way. (doing things *within* sourceXchange is clearly a possibility in the > long term -- in the short term they seem focused on Linux, but time will > tell). sourceXchange should work fine. I don't see it being Linux-only by any means. Heck, the server is a FreeBSD box, and Brian Behlendorf comes from the Apache world (and is a FreeBSD guy mostly). > Guido, you're probably the point-man for such 'angels' -- do you get those > kinds of requests periodically? How about you, Mark? > > One thing that ActiveState has going for it which doesn't exist in the > Python world is a corporate entity devoted to software development and > distribution. PPSI is a support company, or at least markets itself that > way. Yup. That's all we are. We are specifically avoiding any attempts to be a product company. ActiveState is all about products and support-type products. I met with Dick Hardt (ActiveState founder/president) just a couple weeks ago. Great guy. We spoke about ActiveState, what they're doing, and what they'd like to do. They might be looking for good Python people, too... 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From akuchlin at mems-exchange.org Tue Jun 8 03:22:59 1999 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Mon, 7 Jun 1999 21:22:59 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14172.773.807413.412693@anthem.cnri.reston.va.us> <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: <14172.28787.399827.929220@newcnri.cnri.reston.va.us> Skip Montanaro writes: >True enough, but as Guido pointed out, enabling threads by default would >immediately make the Mac a second-class citizen. Test cases and demos would One possibility might be NSPR, the Netscape Portable Runtime, which provides platform-independent threads and I/O on Mac, Win32, and Unix. Perhaps a thread implementation could be written that sat on top of NSPR, in addition to the existing pthreads implementation. See http://www.mozilla.org/docs/refList/refNSPR/. (You'd probably only use NSPR on the Mac, though; there seems no point in adding another layer of complexity to Unix and Windows.) -- A.M. Kuchling http://starship.python.net/crew/amk/ When religion abandons poetic utterance, it cuts its own throat. -- Robertson Davies, _Marchbanks' Garland_ From tim_one at email.msn.com Tue Jun 8 03:24:47 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 7 Jun 1999 21:24:47 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Message-ID: <000901beb14d$b2759100$aaa02299@tim> [David Ascher] > In case you haven't heard about it, ActiveState has recently signed a > contract with Microsoft to do some work on Perl on win32. I'm astonished at the reaction this has provoked "out there". Here: D:\Python>perl -v This is perl, version 5.001 Unofficial patchlevel 1m. Copyright 1987-1994, Larry Wall Win32 port Copyright (c) 1995 Microsoft Corporation. All rights reserved. Developed by hip communications inc., http://info.hip.com/info/ Perl for Win32 Build 107 Built Apr 16 1996 at 14:47:22 Perl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the Perl 5.0 source kit. D:\Python> Notice the MS copyright? From 1995?! Perl for Win32 has *always* been funded by MS, even back when half of ActiveState was named "hip communications" <0.5 wink>. Thank Perl's dominance in CGI scripting -- MS couldn't sell NT Server if it didn't run Perl. MS may be vicious, but they're not stupid . > ... > fork() > ... > Any guesses as to whether we could hijack this work if/when it is released > as Open Source? It's proven impossible so far to reuse anything from the Perl source -- the code is an incestuous nightmare. From time to time the Perl-Porters talk about splitting some of it into reusable libraries, but that never happens; and the less they feel Perl's dominance is assured, the less they even talk about it. So I'm pessimistic (what else is new ?). I'd rather see the work put into threads anyway. The "Mac OS" problem will go away eventually; time to turn the suckers on by default. it's-not-like-millions-of-programmers-will-start-writing-thread-code-then- who-don't-now-ly y'rs - tim From guido at CNRI.Reston.VA.US Tue Jun 8 03:34:59 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 21:34:59 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 19:01:56 CDT." 
<1283322126-63868517@hypernet.com> References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <1283322126-63868517@hypernet.com> Message-ID: <199906080134.VAA13480@eric.cnri.reston.va.us> > Perhaps Christian's stackless Python would enable green threads... This has been suggested before... While this seems possible at first, all blocking I/O calls would have to be redone to pass control to the thread scheduler, before this would be useful -- a huge task! I believe SunOS 4.x's LWP (light-weight processes) library used this method. It was a drop-in replacement for the standard libc, containing changed versions of all system calls. I recall that there were one or two missing, which of course upset the posix module because it references almost *all* system calls... --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Tue Jun 8 03:38:38 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 7 Jun 1999 21:38:38 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> Message-ID: <000e01beb14f$a16d9a40$aaa02299@tim> [/F] > http://www.computerworld.com/home/print.nsf/all/990531AAFA > > which was just my way of saying that "did he perhaps > refer to OS X ?". > > or are they adding real threads to good old MacOS too? Dragon is doing a port of its speech recog software to "good old MacOS" and "OS X", and best we can tell the former is as close to an impossible target as we've ever seen. OS X looks like a pleasant romp, in comparison. I don't think they're going to do anything with "good old MacOS" except let it die. it-was-a-reasonable-architecture-15-years-ago-ly y'rs - tim From gstein at lyra.org Tue Jun 8 03:31:08 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 18:31:08 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14172.773.807413.412693@anthem.cnri.reston.va.us> <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <14172.28787.399827.929220@newcnri.cnri.reston.va.us> Message-ID: <375C725C.5A86D05B@lyra.org> Andrew Kuchling wrote: > > Skip Montanaro writes: > >True enough, but as Guido pointed out, enabling threads by default would > >immediately make the Mac a second-class citizen. Test cases and demos would > > One possibility might be NSPR, the Netscape Portable Runtime, > which provides platform-independent threads and I/O on Mac, Win32, and > Unix. Perhaps a thread implementation could be written that sat on > top of NSPR, in addition to the existing pthreads implementation. > See http://www.mozilla.org/docs/refList/refNSPR/. > > (You'd probably only use NSPR on the Mac, though; there seems no > point in adding another layer of complexity to Unix and Windows.) NSPR is licensed under the MPL, which is quite a bit more restrictive than Python's license. Of course, you could separately point Mac users to it to say "if you get NSPR, then you can have threads". Apache ran into the licensing issue and punted NSPR in favor of a home-grown runtime (which is not as ambitious as NSPR). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm at hypernet.com Tue Jun 8 04:37:34 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 7 Jun 1999 21:37:34 -0500 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <002a01beb13d$1fa23c80$f29b12c2@pythonware.com> Message-ID: <1283312788-64430290@hypernet.com> Fredrik Lundh writes: > > time to implement channels? 
(Tcl's unified abstraction > for all kinds of streams that you could theoretically use > something like select on. sockets, pipes, asynchronous > disk I/O, etc). I have mixed feelings about those types of things. I've recently run across a number of them in some C/C++ libs. On the "pro" side, they can give acceptable behavior and adequate performance and thus suffice for the majority of use. On the "con" side, they're usually an order of magnitude slower than the raw interface, don't quite behave correctly in borderline situations, and tend to produce "One True Path" believers. Of course, so do OSes, editors, languages, GUIs, browsers and colas. > does select really work on ordinary files under Unix, > btw? Sorry, should've said "where a socket is a real fd" or some such... just-like-God-intended-ly y'rs - Gordon From guido at CNRI.Reston.VA.US Tue Jun 8 03:46:40 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 21:46:40 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 17:43:27 PDT." References: Message-ID: <199906080146.VAA13572@eric.cnri.reston.va.us> > Guido, you're probably the point-man for such 'angels' -- do you get those > kinds of requests periodically? No, as far as I recall, nobody has ever offered me money for Python code to be donated to the body of open source. People sometimes seek to hire me, but promarily to further their highly competitive proprietary business goals... --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Tue Jun 8 03:41:32 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 18:41:32 -0700 Subject: [Python-Dev] licensing Message-ID: <375C74CC.2947E4AE@lyra.org> Speaking of licensing issues... I seem to have read somewhere that the two Medusa files are under a separate license. Although, reading the files now, it seems they are not. The issue that I'm really raising is that Python should ship with a single license that covers everything. Otherwise, it will become very complicated for somebody to figure out which pieces fall under what restrictions. Is there anything in the distribution that is different than the normal license? For example, can I take the async modules and build a commercial product on them? Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at CNRI.Reston.VA.US Tue Jun 8 03:56:03 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 21:56:03 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Tue, 08 Jun 1999 09:06:37 +1000." <000501beb13a$9eec2c10$0801a8c0@bobcat> References: <000501beb13a$9eec2c10$0801a8c0@bobcat> Message-ID: <199906080156.VAA13612@eric.cnri.reston.va.us> [me] > > Have I ever heard of it! :-) David Grove pulled me into one of his > > bouts of paranoia. I think he's calmed down for the moment. [Mark] > It sounds like a :-), but Im afraid I dont understand that reference. David Grove occasionally posts to Perl lists with accusations that ActiveState is making Perl proprietary. He once announced a program editor to the Python list which upon inspection by me didn't contain any Python support, for which I flamed him. He then explained to me that he was in a hurry because ActiveState was taking over the Perl world. A couple of days ago, I received an email from him (part of a conversation on the perl5porters list apparently) where he warned me that ActiveState was planning a similar takeover of Python. 
After some comments from tchrist ("he's a loon") I decided to ignore David. > Sometimes I wish we had a few less good programmers, and a few more good > marketting type people ;-) Ditto... It sure ain't me! > Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate :-) > fork using the Win32 extensions? Python has basically all of the native > Win32 process API exposed, and writing a "fork" in Python that only forked > Python scripts (for example) may be feasable and not too difficult. > > It would have obvious limitations, including the fact that it is not > available standard with Python on Windows (just like a working popen now > :-) but if we could follow the old 80-20 rule, and catch 80% of the uses > with 20% of the effort it may be worth investigating. > > My knowledge of fork is limited to muttering "something about cloning the > current process", so I may be naive in the extreme - but is this feasible? I think it's not needed that much, but David has argued otherwise. I haven't heard much support either way from others. But I think it would be a huge task, because it would require taking control of all file descriptors (given the semantics that upon fork, file descriptors are shared, but if one half closes an fd it is still open in the other half). --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Tue Jun 8 04:58:59 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 7 Jun 1999 21:58:59 -0500 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000e01beb14f$a16d9a40$aaa02299@tim> References: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> Message-ID: <1283311503-64507593@hypernet.com> [Tim] > Dragon is doing a port of its speech recog software to "good old > MacOS" and "OS X", and best we can tell the former is as close to an > impossible target as we've ever seen. OS X looks like a pleasant > romp, in comparison. I don't think they're going to do anything > with "good old MacOS" except let it die. > > it-was-a-reasonable-architecture-15-years-ago-ly y'rs - tim Don't Macs have another CPU in the keyboard already? Maybe you could just require a special microphone . that's-not-a-mini-tower-that's-a-um--subwoofer-ly y'rs - Gordon From guido at CNRI.Reston.VA.US Tue Jun 8 04:09:02 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 22:09:02 -0400 Subject: [Python-Dev] licensing In-Reply-To: Your message of "Mon, 07 Jun 1999 18:41:32 PDT." <375C74CC.2947E4AE@lyra.org> References: <375C74CC.2947E4AE@lyra.org> Message-ID: <199906080209.WAA13806@eric.cnri.reston.va.us> > Speaking of licensing issues... > > I seem to have read somewhere that the two Medusa files are under a > separate license. Although, reading the files now, it seems they are > not. > > The issue that I'm really raising is that Python should ship with a > single license that covers everything. Otherwise, it will become very > complicated for somebody to figure out which pieces fall under what > restrictions. > > Is there anything in the distribution that is different than the normal > license? There are pieces with different licenses but they only differ in the names of the beneficiaries, not in the conditions (although the words aren't always exactly the same). As far as I can tell, this is the situation for asyncore.py and asynchat.py: they have a copyright notice of their own (see the 1.5.2 source for the exact text) with Sam Rushing's copyright. 
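For concreteness, asyncore is the select()-based dispatcher layer those two files provide; a minimal server built on it looks roughly like the sketch below (the port number and class names are invented for the example):

    import asyncore, socket

    class EchoChannel(asyncore.dispatcher):
        def writable(self):
            return 0                       # never wait for write readiness
        def handle_read(self):
            self.send(self.recv(8192))     # echo whatever arrives
        def handle_close(self):
            self.close()

    class EchoServer(asyncore.dispatcher):
        def __init__(self, port):
            asyncore.dispatcher.__init__(self)
            self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
            self.set_reuse_addr()
            self.bind(('', port))
            self.listen(5)
        def writable(self):
            return 0                       # the listening socket only accepts
        def handle_accept(self):
            conn, addr = self.accept()
            EchoChannel(conn)

    EchoServer(8007)
    asyncore.loop()
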
> For example, can I take the async modules and build a commercial product > on them? As far as I know, yes. Sam Rushing promised me this when he gave them to me for inclusion. (I've had a complaint that they aren't the latest -- can someone confirm this?) --Guido van Rossum (home page: http://www.python.org/~guido/) From MHammond at skippinet.com.au Tue Jun 8 05:11:57 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 13:11:57 +1000 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906080156.VAA13612@eric.cnri.reston.va.us> Message-ID: <000b01beb15c$abd84ea0$0801a8c0@bobcat> [Please dont copy this out of this list :-] > world. A couple of days ago, I received an email from him (part of a > conversation on the perl5porters list apparently) where he warned me > that ActiveState was planning a similar takeover of Python. After > some comments from tchrist ("he's a loon") I decided to ignore David. I believe this to be true - at least "take over" in the same way they have "taken over" Perl. I have it on very good authority that Active State's medium term business plan includes expanding out of Perl alone, and Python is very high on their list. I also believe they would like to recruit people to help with this goal. They are of the opinion that Python alone could not support such a business quite yet, so attaching it to existing infrastructure could fly. On one hand I tend to agree, but on the other hand I think that we do a pretty damn good job as it is, so maybe a Python could fly all alone? And Ive got to say that personally, such an offer would be highly attractive. Depending on the terms (and I must admit I have not had a good look at the ActiveState Perl licenses) this could provide a real boost to the Python world. If the business model is open source software with paid-for support, it seems a win-win situation to me. However, it is very unclear to me, and the industry, that this model alone can work generally. A business-plan that involves withholding sources or technologies until a fee has been paid certainly moves quickly away from win-win to, to quote Guido, "highly competitive proprietary business goals". May be some interesting times ahead. For some time now I have meant to pass this on to PPSI as a heads-up, just incase they intend playing in that space in the future. So consider this it ;-) Mark. From gstein at lyra.org Tue Jun 8 05:13:42 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 20:13:42 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> Message-ID: <375C8A66.56B3F26B@lyra.org> Mark Hammond wrote: > > [Please dont copy this out of this list :-] It's in the archives now... :-) >...[well-said comments about open source and businesses]... > > May be some interesting times ahead. For some time now I have meant to > pass this on to PPSI as a heads-up, just incase they intend playing in that > space in the future. So consider this it ;-) I've already met Dick Hardt and spoken with him at length. Both on an individual basis, and as the President of PPSI. Nothing to report... (yet) Cheers, -g p.s. PPSI is a bit different, as we intend to fill the "support gap" rather than move into real products; ActiveState does products, along with support type stuff and other miscellaneous (I don't recall Dick's list offhand). 
-- Greg Stein, http://www.lyra.org/ From tim_one at email.msn.com Tue Jun 8 07:14:36 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 01:14:36 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000b01beb15c$abd84ea0$0801a8c0@bobcat> Message-ID: <000401beb16d$cd88d180$f29e2299@tim> [MarkH] > ... > And Ive got to say that personally, such an offer would be highly > attractive. Depending on the terms (and I must admit I have not > had a good look at the ActiveState Perl licenses) this could provide > a real boost to the Python world. I find the ActivePerl license to be quite confusing: http://www.activestate.com/ActivePerl/commlic.htm It appears to say flatly that you can't distribute it yourself, although other pages on the site say "sure, go ahead!". Also seems to imply you can't modify their code (they explicitly allow you to install patches obtained from ActiveState -- but that's all they mention). OTOH, they did a wonderful job on the Perl for Win32 port (a difficult port in the face of an often-hostile Perl community), and gave all the code back to the Perl folk. I've got no complaints about them so far. > If the business model is open source software with paid-for support, it > seems a win-win situation to me. "Part of our business model is to sell value added, proprietary components."; e.g., they sell a Perl Development Kit for $100, and so on. Fine by me! If I could sell tabnanny ... well, I wouldn't do that to anyone . would-like-to-earn-$1-from-python-before-he-dies-ly y'rs - tim From skip at mojam.com Tue Jun 8 07:37:22 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 8 Jun 1999 01:37:22 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <375C516E.76EC8ED4@lyra.org> References: <375C516E.76EC8ED4@lyra.org> Message-ID: <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> Greg> About the only reason that I can see to *not* make them the Greg> default is the slight speed loss. But that seems a bit bogus, as Greg> the interpreter loop doesn't spend *that* much time mucking with Greg> the interp_lock to allow thread switches. There have also been Greg> some real good suggestions for making it take near-zero time until Greg> you actually create that second thread. Okay, everyone has convinced me that holding threads hostage to the Mac is a red herring. I have other fish to fry. (It's 1:30AM and I haven't had dinner yet. Can you tell? ;-) Is there a way with configure to determine whether or not particular Unix variants should have threads enabled or not? If so, I think that's the way to go. I think it would be unfortunate to enable it by default, have it appear to work on some known to be unsupported platforms, but then bite the programmer in an inconvenient place at an inconvenient time. Such a self-deciding configure script should exit with some information about thread enablement: Yes, we support threads on RedHat Linux 6.0. No, you stinking Minix user, you will never have threads. Rhapsody, huh? I never heard of that. Some weird OS from Sunnyvale, you say? I don't know how to do threads there yet, but when you figure it out, send patches along to python-dev at python.org. Of course, users should be able to override anything using --with-thread or without-thread and possibly specify compile-time and link-time flags through arguments or the environment. 
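Whatever configure ends up deciding, Python-level code already has to cope with builds where the thread module is simply missing -- the single-threaded fallback Greg alludes to for SocketServer boils down to a pattern like this (the serve/handler/requests names are placeholders for the example):

    try:
        import thread
        have_threads = 1
    except ImportError:            # built --without-threads
        thread = None
        have_threads = 0

    def serve(handler, requests):
        for request in requests:
            if have_threads:
                thread.start_new_thread(handler, (request,))   # one worker per request
            else:
                handler(request)                               # degrade to serial handling
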
Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From skip at mojam.com Tue Jun 8 07:49:19 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 8 Jun 1999 01:49:19 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000b01beb15c$abd84ea0$0801a8c0@bobcat> References: <199906080156.VAA13612@eric.cnri.reston.va.us> <000b01beb15c$abd84ea0$0801a8c0@bobcat> Message-ID: <14172.44596.528927.548722@cm-24-29-94-19.nycap.rr.com> Okay, folks. I must have missed the memo. Who are ActiveState and sourceXchange? I can't be the only person on python-dev who never heard of either of them before this evening. I guess I'm the only one who's not shy about exposing their ignorance. but-i-can-tell-you-where-to-find-spare-parts-for-your-Triumph-ly 'yrs, Skip Montanaro 518-372-5583 See my car: http://www.musi-cal.com/~skip/ From da at ski.org Tue Jun 8 08:12:11 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 23:12:11 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.44596.528927.548722@cm-24-29-94-19.nycap.rr.com> Message-ID: > Okay, folks. I must have missed the memo. Who are ActiveState and > sourceXchange? I can't be the only person on python-dev who never heard of > either of them before this evening. I guess I'm the only one who's not shy > about exposing their ignorance. Well, one answer is to look at www.activestate.com and www.sourcexchange.com, of course =) ActiveState "does" the win32 perl port, for money. (it's a little controversial within the Perl community, which has inherited some of RMS's "Microsoft OS? Ha!" attitude). sourceXchange is aiming to match open source programmers with companies who want open source work done for $$'s, in a 'market' format. It was started by Brian Behlendorf, now at O'Reilly, and of Apache fame. Go get dinner. =) --david From rushing at nightmare.com Tue Jun 8 02:10:18 1999 From: rushing at nightmare.com (Sam Rushing) Date: Mon, 7 Jun 1999 17:10:18 -0700 (PDT) Subject: [Python-Dev] licensing In-Reply-To: <9403621@toto.iv> Message-ID: <14172.23937.83700.673653@seattle.nightmare.com> Guido van Rossum writes: > Greg Stein writes: > > For example, can I take the async modules and build a commercial > > product on them? Yes, my intent was that they go under the normal Python 'do what thou wilt' license. If I goofed in any way, please let me know! > As far as I know, yes. Sam Rushing promised me this when he gave > them to me for inclusion. (I've had a complaint that they aren't > the latest -- can someone confirm this?) Guilty as charged. I've been tweaking them a bit lately, for performance, but anyone can grab the very latest versions out of the medusa CVS repository: CVSROOT=:pserver:medusa at seattle.nightmare.com:/usr/local/cvsroot (the password is 'medusa') Or download one of the snapshots. BTW, those particular files have always had the Python copyright/license. -Sam From gstein at lyra.org Tue Jun 8 09:09:00 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 08 Jun 1999 00:09:00 -0700 Subject: [Python-Dev] licensing References: <14172.23937.83700.673653@seattle.nightmare.com> Message-ID: <375CC18C.1DB5E9F2@lyra.org> Sam Rushing wrote: > > Greg Stein writes: > > > For example, can I take the async modules and build a commercial > > > product on them? > > Yes, my intent was that they go under the normal Python 'do what thou > wilt' license. If I goofed in any way, please let me know! 
Nope... you haven't goofed. I was thrown off when a certain person (nudge, nudge) goofed in their upcoming book, which I recently reviewed. thx! -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Tue Jun 8 10:08:08 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 10:08:08 +0200 Subject: [Python-Dev] licensing References: <375C74CC.2947E4AE@lyra.org> Message-ID: <00c501beb186$0c6d3450$f29b12c2@pythonware.com> > I seem to have read somewhere that the two Medusa files are under a > separate license. Although, reading the files now, it seems they are > not. the medusa server has restrictive license, but the asyncore and asynchat modules use the standard Python license, with Sam Rushing as the copyright owner. just use the source... > The issue that I'm really raising is that Python should ship with a > single license that covers everything. Otherwise, it will become very > complicated for somebody to figure out which pieces fall under what > restrictions. > > Is there anything in the distribution that is different than the normal > license? > > For example, can I take the async modules and build a commercial product > on them? surely hope so -- we're using them in everything we do. and my upcoming book is 60% about doing weird things with tkinter, and 40% about doing weird things with asynclib... From MHammond at skippinet.com.au Tue Jun 8 10:46:33 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 18:46:33 +1000 Subject: [Python-Dev] licensing In-Reply-To: <375CC18C.1DB5E9F2@lyra.org> Message-ID: <001101beb18b$6a049bd0$0801a8c0@bobcat> > Nope... you haven't goofed. I was thrown off when a certain person > (nudge, nudge) goofed in their upcoming book, which I > recently reviewed. I now feel for the other Mark and David, Aaron et al, etc. Our book is out of date in a number of ways before the tech reviewers even saw it. Medusa wasnt a good example - I should have known better when I wrote it. But Pythonwin is a _real_ problem. Just as I start writing the book, Neil sends me a really cool editor control and it leads me down a path of IDLE/Pythonwin integration. So almost _everything_ I have already written on "IDEs for Python" is already out of date - and printing is not scheduled for a number of months. [This may help explain to Guido and Tim my recent fervour in this area - I want to get the "new look" Pythonwin ready for the book. I just yesterday got a dockable interactive window happening. Now adding a splitter window to each window to expose a pyclbr based tree control and then it is time to stop (and re-write that chapter :-] Mark. From fredrik at pythonware.com Tue Jun 8 12:25:47 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 12:25:47 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <004d01beb199$4fe171c0$f29b12c2@pythonware.com> > (modesty, fear of being labeled 'commercial', fear of exposing that > they're not 100% busy, so "can't be good", etc.). fwiw, we're seeing an endless stream of mails from moral crusaders even before we have opened the little Python- Ware shoppe (coming soon, coming soon). some of them are quite nasty, to say the least... I usually tell them to raise their concerns on c.l.python instead. they never do. > One thing that ActiveState has going for it which doesn't exist in the > Python world is a corporate entity devoted to software development and > distribution. 
saying that there is NO such entity is a bit harsh, I think ;-) but different "scripting" companies are using different strategies, by various reasons. Scriptics, ActiveState, PythonWare, UserLand, Harlequin, Rebol, etc. are all doing similar things, but in different ways (due to markets, existing communities, and probably most important: different funding strategies). But we're all corporate entities devoted to software development... ... by the way, if someone thinks there's no money in Python, consider this: --- Google is looking to expand its operations and needs talented engineers to develop the next generation search engine. If you have a need to bring order to a chaotic web, contact us. Requirements: Several years of industry or hobby-based experience B.S. in Computer Science or equivalent (M.S. a plus) Extensive experience programming in C or C++ Extensive experience programming in the UNIX environment Knowledge of TCP/IP and network programming Experience developing/designing large software systems Experience programming in Python a plus --- Google Inc., a year-old Internet search-engine company, said it has attracted $25 million in venture-capital funding and will add two of Silicon Valley's best-known financiers, Michael Moritz and L. John Doerr, to its board. Even by Internet standards, Google has attracted an un- usually large amount of money for a company still in its infancy. --- looks like anyone on this list could get a cool Python job for an unusually over-funded startup within minutes ;-) From skip at mojam.com Tue Jun 8 13:12:02 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 8 Jun 1999 07:12:02 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <004d01beb199$4fe171c0$f29b12c2@pythonware.com> References: <004d01beb199$4fe171c0$f29b12c2@pythonware.com> Message-ID: <14172.63947.54638.275348@cm-24-29-94-19.nycap.rr.com> Fredrik> Even by Internet standards, Google has attracted an un- Fredrik> usually large amount of money for a company still in its Fredrik> infancy. And it's a damn good search engine to boot, so I think it probably deserves the funding (most of it will, I suspect, be used to muscle its way into a crowded market). It is *always* my first stop when I need a general-purpose search engine these days. I never use InfoSeek/Go, Lycos or HotBot for anything other than to check that Musi-Cal is still in their database. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From guido at CNRI.Reston.VA.US Tue Jun 8 14:46:51 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 08 Jun 1999 08:46:51 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Tue, 08 Jun 1999 01:37:22 EDT." <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> References: <375C516E.76EC8ED4@lyra.org> <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> Message-ID: <199906081246.IAA14302@eric.cnri.reston.va.us> > Is there a way with configure to determine whether or not particular Unix > variants should have threads enabled or not? If so, I think that's the way > to go. I think it would be unfortunate to enable it by default, have it > appear to work on some known to be unsupported platforms, but then bite the > programmer in an inconvenient place at an inconvenient time. That's not so much the problem, if you can get a threaded program to compile and link that probably means sufficient support exists. 
There currently are checks in the configure script that try to find out which thread library to use -- these could be expanded to disable threads when none of the known ones work. Anybody care enough to try hacking configure.in, or should I add this to my tired TODO list? --Guido van Rossum (home page: http://www.python.org/~guido/) From jack at oratrix.nl Tue Jun 8 14:47:44 1999 From: jack at oratrix.nl (Jack Jansen) Date: Tue, 08 Jun 1999 14:47:44 +0200 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Message by Andrew Kuchling , Mon, 7 Jun 1999 21:22:59 -0400 (EDT) , <14172.28787.399827.929220@newcnri.cnri.reston.va.us> Message-ID: <19990608124745.3136B303120@snelboot.oratrix.nl> > One possibility might be NSPR, the Netscape Portable Runtime, > which provides platform-independent threads and I/O on Mac, Win32, and > Unix. Perhaps a thread implementation could be written that sat on > top of NSPR, in addition to the existing pthreads implementation. > See http://www.mozilla.org/docs/refList/refNSPR/. NSPR looks rather promising! Does anyone has any experiences with it? What I'd also be interested in is experiences in how it interacts with the "real" I/O system, i.e. can you mix and match NSPR calls with normal os calls, or will that break things? The latter is important for Python, because there are lots of external libraries, and while some are user-built (image libraries, gdbm, etc) and could conceivably be converted to use NSPR others are not... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido at CNRI.Reston.VA.US Tue Jun 8 15:28:02 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 08 Jun 1999 09:28:02 -0400 Subject: [Python-Dev] Python-dev archives going public In-Reply-To: Your message of "Mon, 07 Jun 1999 20:13:42 PDT." <375C8A66.56B3F26B@lyra.org> References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> <375C8A66.56B3F26B@lyra.org> Message-ID: <199906081328.JAA14584@eric.cnri.reston.va.us> > > [Please dont copy this out of this list :-] > > It's in the archives now... :-) Which reminds me... A while ago, Greg made some noises about the archives being public, and temporarily I made them private. In the following brief flurry of messages everybody who spoke up said they preferred the archives to be public (even though the list remains invitation-only). But I never made the change back, waiting for Greg to agree, but after returning from his well deserved tequilla-splashed vacation, he never gave a peep about this, and I "conveniently forgot". I still like the archives to be public. I hope Mark's remark there was a joke? --Guido van Rossum (home page: http://www.python.org/~guido/) From MHammond at skippinet.com.au Tue Jun 8 15:38:03 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 23:38:03 +1000 Subject: [Python-Dev] Python-dev archives going public In-Reply-To: <199906081328.JAA14584@eric.cnri.reston.va.us> Message-ID: <003101beb1b4$22786de0$0801a8c0@bobcat> > I still like the archives to be public. I hope Mark's remark there > was a joke? Well, not really a joke, but I am not naive to think this is a "private" forum even in the absence of archives. What I meant was closer to "please don't make public statements based purely on this information". 
I never agreed to keep it private, but by the same token didnt want to start the rumour mills and get bad press for either Dick or us ;-) Mark. From bwarsaw at cnri.reston.va.us Tue Jun 8 17:09:24 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Tue, 8 Jun 1999 11:09:24 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl References: <375C516E.76EC8ED4@lyra.org> <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> <199906081246.IAA14302@eric.cnri.reston.va.us> Message-ID: <14173.12836.616873.953134@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Anybody care enough to try hacking configure.in, or should Guido> I add this to my tired TODO list? I'll give it a look. I've done enough autoconf hacking that it shouldn't be too hard. I also need to get my string meths changes into the tree... -Barry From gstein at lyra.org Tue Jun 8 20:11:56 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 08 Jun 1999 11:11:56 -0700 Subject: [Python-Dev] Python-dev archives going public References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> <375C8A66.56B3F26B@lyra.org> <199906081328.JAA14584@eric.cnri.reston.va.us> Message-ID: <375D5CEC.340E2531@lyra.org> Guido van Rossum wrote: > > > > [Please dont copy this out of this list :-] > > > > It's in the archives now... :-) > > Which reminds me... A while ago, Greg made some noises about the > archives being public, and temporarily I made them private. In the > following brief flurry of messages everybody who spoke up said they > preferred the archives to be public (even though the list remains > invitation-only). But I never made the change back, waiting for Greg > to agree, but after returning from his well deserved tequilla-splashed > vacation, he never gave a peep about this, and I "conveniently > forgot". I appreciate the consideration, but figured it was a done deal based on feedback. My only consideration in keeping them private was the basic, human fact that people could feel left out. For example, if they read the archives, thought it was neat, and attempted to subscribe only to be refused. It is a bit easier to avoid engendering those bad feelings if the archives aren't public. Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim at digicool.com Tue Jun 8 20:41:11 1999 From: jim at digicool.com (Jim Fulton) Date: Tue, 08 Jun 1999 18:41:11 +0000 Subject: [Python-Dev] Python-dev archives going public References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> <375C8A66.56B3F26B@lyra.org> <199906081328.JAA14584@eric.cnri.reston.va.us> <375D5CEC.340E2531@lyra.org> Message-ID: <375D63C7.6BB6697E@digicool.com> Greg Stein wrote: > > My only consideration in keeping them private was the basic, human fact > that people could feel left out. For example, if they read the archives, > thought it was neat, and attempted to subscribe only to be refused. It > is a bit easier to avoid engendering those bad feelings if the archives > aren't public. I agree. Jim -- Jim Fulton mailto:jim at digicool.com Technical Director (540) 371-6909 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. 
From tismer at appliedbiometrics.com Tue Jun 8 21:37:21 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 08 Jun 1999 21:37:21 +0200 Subject: [Python-Dev] Stackless Preview References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> Message-ID: <375D70F1.37007192@appliedbiometrics.com> Christian Tismer wrote: [a lot] > fearing the feedback :-) ciao - chris I expected everything but forgot to fear "no feedback". :-) About 5 or 6 people seem to have taken the .zip file. Now I'm wondering why nobody complains. Was my code so wonderful, so disgustingly bad, or is this just boring :-? If it's none of the three above, I'd be happy to get a hint if I should continue, or if and what I should change. Maybe it would make sense to add some documentation now, and also to come up with an application which makes use of the stackless implementation, since there is now not much to wonder about than that it seems to work :-) yes-call-me-impatient - ly chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy at cnri.reston.va.us Tue Jun 8 22:09:15 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Tue, 8 Jun 1999 16:09:15 -0400 (EDT) Subject: [Python-Dev] Stackless Preview In-Reply-To: <375D70F1.37007192@appliedbiometrics.com> References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> <375D70F1.37007192@appliedbiometrics.com> Message-ID: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> >>>>> "CT" == Christian Tismer writes: CT> Christian Tismer wrote: [a lot] >> fearing the feedback :-) ciao - chris CT> I expected everything but forgot to fear "no feedback". :-) CT> About 5 or 6 people seem to have taken the .zip file. Now I'm CT> wondering why nobody complains. Was my code so wonderful, so CT> disgustingly bad, or is this just boring :-? CT> If it's none of the three above, I'd be happy to get a hint if I CT> should continue, or if and what I should change. I'm one of the silent 5 or 6. My reasons fall under "None of the above." They are three in number: 1. No time (the perennial excuse; next 2 weeks are quite hectic) 2. I tried to use ndiff to compare old and new ceval.c, but ran into some problems with that tool. (Tim, it looks like the line endings are identical -- all '\012'.) 3. Wasn't sure what to look at first My only suggestion would be to have an executive summary. If there was a short README file -- no more than 150 lines -- that described the essentials of the approach and told me what to look at first, I would be able to comment more quickly. Jeremy From tismer at appliedbiometrics.com Tue Jun 8 22:15:04 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 08 Jun 1999 22:15:04 +0200 Subject: [Python-Dev] Stackless Preview References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> <375D70F1.37007192@appliedbiometrics.com> <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> Message-ID: <375D79C8.90B3E721@appliedbiometrics.com> Jeremy Hylton wrote: [...] > I'm one of the silent 5 or 6. My reasons fall under "None of the > above." They are three in number: > 1. No time (the perennial excuse; next 2 weeks are quite hectic) > 2. 
I tried to use ndiff to compare old and new ceval.c, but > ran into some problems with that tool. (Tim, it looks > like the line endings are identical -- all '\012'.) Yes, there are a lot of changes. As a hint: windiff from VC++ does a great job here. You can see both sources in one, in a very readable colored form. > 3. Wasn't sure what to look at first > > My only suggestion would be to have an executive summary. If there > was a short README file -- no more than 150 lines -- that described > the essentials of the approach and told me what to look at first, I > would be able to comment more quickly. Thanks a lot. Will do this tomorrow moaning as my first task. feeling much better - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From Vladimir.Marangozov at inrialpes.fr Wed Jun 9 00:29:27 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Wed, 9 Jun 1999 00:29:27 +0200 (DFT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <19990608124745.3136B303120@snelboot.oratrix.nl> from "Jack Jansen" at "Jun 8, 99 02:47:44 pm" Message-ID: <199906082229.AAA48646@pukapuka.inrialpes.fr> Jack Jansen wrote: > > NSPR looks rather promising! Does anyone has any experiences with it? What I'd > also be interested in is experiences in how it interacts with the "real" I/O > system, i.e. can you mix and match NSPR calls with normal os calls, or will > that break things? I've looked at it in the past. From memory, NSPR is a fairly big chunk of code and it seemed to me that it's self contained for lots of system stuff. Don't know about I/O, but I played with it to replace the BSD malloc it uses with pymalloc and I was pleased to see the resulting speed & mem stats after rebuilding one of the past Mozilla distribs. This is all the experience I have with it. > > The latter is important for Python, because there are lots of external > libraries, and while some are user-built (image libraries, gdbm, etc) and > could conceivably be converted to use NSPR others are not... I guess that this one would be hard... -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From Vladimir.Marangozov at inrialpes.fr Wed Jun 9 00:45:48 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Wed, 9 Jun 1999 00:45:48 +0200 (DFT) Subject: [Python-Dev] Stackless Preview In-Reply-To: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> from "Jeremy Hylton" at "Jun 8, 99 04:09:15 pm" Message-ID: <199906082245.AAA48828@pukapuka.inrialpes.fr> Jeremy Hylton wrote: > > CT> If it's none of the three above, I'd be happy to get a hint if I > CT> should continue, or if and what I should change. > > I'm one of the silent 5 or 6. My reasons fall under "None of the > above." They are three in number: > ... > My only suggestion would be to have an executive summary. If there > was a short README file -- no more than 150 lines -- that described > the essentials of the approach and told me what to look at first, I > would be able to comment more quickly. Same here + a small wish: please save me the stripping of the ^M line endings typical for MSW, so that I can load the files directly in Xemacs on a Unix box. 
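The ^M-stripping itself is a few lines of 1.5-era Python, along the lines of the helper Tim points to in his reply below (filenames come from the command line; this is only an illustration):

    import string, sys

    def to_unix(path):
        f = open(path, 'rb')
        data = f.read()
        f.close()
        data = string.replace(data, '\r\n', '\n')   # DOS -> Unix
        data = string.replace(data, '\r', '\n')     # old Mac -> Unix
        f = open(path, 'wb')
        f.write(data)
        f.close()

    for name in sys.argv[1:]:
        to_unix(name)
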
Otherwise, like Jeremy, I was a bit lost trying to read ceval.c which is already too hairy. -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From tim_one at email.msn.com Wed Jun 9 04:27:37 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 22:27:37 -0400 Subject: [Python-Dev] Stackless Preview In-Reply-To: <199906082245.AAA48828@pukapuka.inrialpes.fr> Message-ID: <000d01beb21f$a3daac20$2fa22299@tim> [Vladimir Marangozov] > ... > please save me the stripping of the ^M line endings typical for MSW, > so that I can load the files directly in Xemacs on a Unix box. Vlad, get linefix.py from Python FTP contrib's System area; converts among Unix, Windows and Mac line conventions; to Unix by default. For that matter, do a global replace of ^M in Emacs . buncha-lazy-whiners-ly y'rs - tim From tim_one at email.msn.com Wed Jun 9 04:27:35 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 22:27:35 -0400 Subject: [Python-Dev] Stackless Preview In-Reply-To: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> Message-ID: <000c01beb21f$a2bd5540$2fa22299@tim> [Christian Tismer] > ... > If it's none of the three above, I'd be happy to get a hint if I > should continue, or if and what I should change. Sorry, Chris! Just a case of "no time" here. Of *course* you should continue, and Guido should pop in with an encouraging word too -- or a "forget it". I think this design opens the doors to a world of interesting ideas, but that's based on informed prejudice rather than careful study of your code. Cheer up: if everyone thought you were a lame ass, we all would have studied your code intensely by now . [Jeremy] > 2. I tried to use ndiff to compare old and new ceval.c, but > ran into some problems with that tool. (Tim, it looks > like the line endings are identical -- all '\012'.) Then let's treat this like a real bug : which version of Python did you use? And ship me the files in a tarball (I'll find a way to extract them intact). And does that specific Python+ndiff combo work OK on *other* files? Or does it fail to find any lines in common no matter what you feed it (a 1-line test case would be a real help )? I couldn't provoke a problem with the stock 1.5.2 ndiff under the stock 1.5.2 Windows Python, using the then-current CVS snapshot of ceval.c as file1 and the ceval.c from Christian's stackless_990606.zip file as file2. Both files have \r\n line endings for me, though (one thanks to CVS line translation, and the other thanks to WinZip line translation). or-were-you-running-ndiff-under-the-stackless-python?-ly y'rs - tim From tim_one at email.msn.com Wed Jun 9 04:27:40 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 22:27:40 -0400 Subject: [Python-Dev] licensing In-Reply-To: <001101beb18b$6a049bd0$0801a8c0@bobcat> Message-ID: <000f01beb21f$a5e2ff40$2fa22299@tim> [Mark Hammond] > ... > [This may help explain to Guido and Tim my recent fervour in this area > - I want to get the "new look" Pythonwin ready for the book. I just > yesterday got a dockable interactive window happening. Now adding a > splitter window to each window to expose a pyclbr based tree control and > then it is time to stop (and re-write that chapter :-] All right! Do get the latest CVS versions of these files: pyclbr has been sped up a lot over the past two days, and is much less likely to get baffled now. 
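(For anyone who hasn't used it, pyclbr is the stdlib source-code class browser that tree control is built on; a minimal sketch of its 1.5.2 interface, assuming readmodule() and the lineno/methods attributes:)

    import pyclbr

    # Parse a module's source without importing it; StringIO is just a
    # convenient stdlib module that happens to define a class.
    classes = pyclbr.readmodule("StringIO")
    for name, cls in classes.items():
        print name, "starts at line", cls.lineno
        print "    methods:", cls.methods.keys()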
And AutoIndent.py now defaults usetabs to 1 (which, of course, means it still uses spaces in new files ). From guido at CNRI.Reston.VA.US Wed Jun 9 05:31:11 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 08 Jun 1999 23:31:11 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: Your message of "Tue, 08 Jun 1999 22:27:35 EDT." <000c01beb21f$a2bd5540$2fa22299@tim> References: <000c01beb21f$a2bd5540$2fa22299@tim> Message-ID: <199906090331.XAA23066@eric.cnri.reston.va.us> Tim wrote: > Sorry, Chris! Just a case of "no time" here. Of *course* you > should continue, and Guido should pop in with an encouraging word > too -- or a "forget it". I think this design opens the doors to a > world of interesting ideas, but that's based on informed prejudice > rather than careful study of your code. Cheer up: if everyone > thought you were a lame ass, we all would have studied your code > intensely by now . No time here either... I did try to have a quick peek and my first impression is that it's *very* tricky code! You know what I think of that... Here's what I think we should do first (I've mentioned this before but nobody cheered me on :-). I'd like to see this as the basis for 1.6. We should structurally split the Python Virtual Machine and related code up into different parts -- both at the source code level and at the runtime level. The core PVM becomes a replaceable component, and so do a few other parts like the parser, the bytecode compiler, the import code, and the interactive read-eval-print loop. Most object implementations are shared between all -- or at least the interfaces are interchangeable. Clearly, a few object types are specific to one or another PVM (e.g. frames). The collection of builtins is also a separate component (though some builtins may again be specific to a PVM -- details, details!). The goal of course, is to create a market for 3rd party components here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's importer, and so on. Thoughts? --Guido van Rossum (home page: http://www.python.org/~guido/) From da at ski.org Wed Jun 9 05:37:36 1999 From: da at ski.org (David Ascher) Date: Tue, 8 Jun 1999 20:37:36 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Splitting up the PVM In-Reply-To: <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: On Tue, 8 Jun 1999, Guido van Rossum wrote: > We should structurally split the Python Virtual Machine and related > code up into different parts -- both at the source code level and at > the runtime level. The core PVM becomes a replaceable component, and > so do a few other parts like the parser, the bytecode compiler, the > import code, and the interactive read-eval-print loop. > The goal of course, is to create a market for 3rd party components > here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's > importer, and so on. > > Thoughts? If I understand it correctly, it means that I can fit in a third-party read-eval-print loop, which is my biggest area of frustration with the current internal structure. Sounds like a plan to me, and one which (lucky for me) I'm not qualified for! 
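(To make that concrete: the read-eval-print loop is already the easiest piece to replace, since compile() and exec do most of the work. A toy sketch written against the 1.5-era interpreter; a real replacement like IDLE adds multi-line input, history and so on:)

    import traceback

    def repl(namespace):
        # one-line-at-a-time read-eval-print loop built on compile()/exec
        while 1:
            try:
                line = raw_input("py> ")
            except EOFError:
                break
            try:
                # 'single' mode makes expression statements print their value
                code = compile(line, "<repl>", "single")
                exec code in namespace
            except SystemExit:
                raise
            except:
                traceback.print_exc()

    repl({"__name__": "__console__"})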
--david From skip at mojam.com Wed Jun 9 05:45:33 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 8 Jun 1999 23:45:33 -0400 (EDT) Subject: [Python-Dev] Stackless Preview In-Reply-To: <375D70F1.37007192@appliedbiometrics.com> References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> <375D70F1.37007192@appliedbiometrics.com> Message-ID: <14173.58054.869171.927699@cm-24-29-94-19.nycap.rr.com> Chris> If it's none of the three above, I'd be happy to get a hint if I Chris> should continue, or if and what I should change. Chris, My vote is for you to keep at it. I haven't looked at it because I have absolutely zero free time available. This will probably continue until at least the end of July, perhaps until Labor Day. Big doings at Musi-Cal and in the Montanaro household (look for an area code change in a month or so). Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From tismer at appliedbiometrics.com Wed Jun 9 14:58:40 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 09 Jun 1999 14:58:40 +0200 Subject: [Python-Dev] Splitting up the PVM References: <000c01beb21f$a2bd5540$2fa22299@tim> <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: <375E6500.307EF39E@appliedbiometrics.com> Guido van Rossum wrote: > > Tim wrote: > > > Sorry, Chris! Just a case of "no time" here. Of *course* you > > should continue, and Guido should pop in with an encouraging word > > too -- or a "forget it". I think this design opens the doors to a > > world of interesting ideas, but that's based on informed prejudice > > rather than careful study of your code. Cheer up: if everyone > > thought you were a lame ass, we all would have studied your code > > intensely by now . > > No time here either... > > I did try to have a quick peek and my first impression is that it's > *very* tricky code! You know what I think of that... Thanks for looking into it, thanks for saying it's tricky. Since I failed to supply proper documentation yet, this impression must come up. But it is really not true. The code is not tricky but just straightforward and consequent, after one has understood what it means to work without a stack, under the precondition to avoid too much changes. I didn't want to rewrite the world, and I just added the tiny missing bits. I will write up my documentation now, and you will understand what the difficulties were. These will not vanish, "stackless" is a brainteaser. My problem was not how to change the code, but finally it was how to change my brain. Now everything is just obvious. > Here's what I think we should do first (I've mentioned this before but > nobody cheered me on :-). I'd like to see this as the basis for 1.6. > > We should structurally split the Python Virtual Machine and related > code up into different parts -- both at the source code level and at > the runtime level. The core PVM becomes a replaceable component, and > so do a few other parts like the parser, the bytecode compiler, the > import code, and the interactive read-eval-print loop. Most object > implementations are shared between all -- or at least the interfaces > are interchangeable. Clearly, a few object types are specific to one > or another PVM (e.g. frames). The collection of builtins is also a > separate component (though some builtins may again be specific to a > PVM -- details, details!). Good idea, and a lot of work. 
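(One small piece of that split already exists today: the set of builtins is just a dictionary that exec looks up under __builtins__ in the globals you hand it. Illustration only, 1.5-era syntax:)

    # hand exec a trimmed-down builtin namespace instead of the real one
    ns = {"__builtins__": {"len": len, "range": range}}

    exec "print len(range(3))" in ns       # prints 3
    try:
        exec "f = open('spam.txt')" in ns  # open() was not handed in
    except NameError:
        print "open() is not part of this builtin set"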
Having different frames for different PVM's was too much for me. Instead, I tried to adjust frames in a way where a lot of machines can work with. I tried to show the concept of having different VM's by implementing a stackless map. Stackless map is a very tiny one which uses frames again (and yes, this was really hacked). Well, different frame flavors would make sense, perhaps. But I have a central routine which handles all calls to frames, and this is what I think is needed. I already *have* pluggable interpreters here, since a function can produce a frame which is bound to an interpreter, and push it to the frame stack. > The goal of course, is to create a market for 3rd party components > here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's > importer, and so on. I'm with that component goal, of course. Much work, not for one persone, but great. While I don't think it makes sense to make a flat PVM pluggable. I would start with a flat PVM, since that opens a world of possibilities. You can hardly plug flatness in after you started with a wrong stack layout. Vice versa, plugging the old machine would be possible. later - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Wed Jun 9 15:08:38 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 09 Jun 1999 15:08:38 +0200 Subject: [Python-Dev] Stackless Preview References: <000c01beb21f$a2bd5540$2fa22299@tim> Message-ID: <375E6756.370BA78E@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer] > > ... > > If it's none of the three above, I'd be happy to get a hint if I > > should continue, or if and what I should change. > > Sorry, Chris! Just a case of "no time" here. Of *course* you should > continue, and Guido should pop in with an encouraging word too -- or a > "forget it". Yup, I know this time problem just too good. Well, I think I got something in between. I was warned before, so I didn't try to write final code, but I managed to prove the concept. I *will* continue, regardless what anybody says. > or-were-you-running-ndiff-under-the-stackless-python?-ly y'rs - tim I didn't use ndiff, but regular "diff", and it worked. But since theere is not much change to the code, but some significant change to the control flow, I found the diff output too confusing. Windiff was always open when I wrote that, to be sure that I didn't trample on things which I didn't want to mess up. A good tool! ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Wed Jun 9 16:48:34 1999 From: bwarsaw at cnri.reston.va.us (Barry A. 
Warsaw) Date: Wed, 9 Jun 1999 10:48:34 -0400 (EDT) Subject: [Python-Dev] Stackless Preview References: <199906082245.AAA48828@pukapuka.inrialpes.fr> <000d01beb21f$a3daac20$2fa22299@tim> Message-ID: <14174.32450.29368.914458@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Vlad, get linefix.py from Python FTP contrib's System area; TP> converts among Unix, Windows and Mac line conventions; to Unix TP> by default. For that matter, do a global replace of ^M in TP> Emacs . I forgot to follow up to Vlad's original message, but in XEmacs (dunno about FSFmacs), you can visit DOS-eol files without seeing the ^M's. You will see a "DOS" in the modeline, and when you go to write the file it'll ask you if you want to write it in "plain text". I use XEmacs all the time to convert between DOS-eol and eol-The-Way-God-Intended :) To enable this, add the following to your .emacs file: (require 'crypt) -Barry From tismer at appliedbiometrics.com Wed Jun 9 19:58:52 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 09 Jun 1999 19:58:52 +0200 Subject: [Python-Dev] First Draft on Stackless Python References: <199906082245.AAA48828@pukapuka.inrialpes.fr> <000d01beb21f$a3daac20$2fa22299@tim> <14174.32450.29368.914458@anthem.cnri.reston.va.us> Message-ID: <375EAB5C.138D32CF@appliedbiometrics.com> Howdy, I've begun with a first draft on Stackless Python. Didn't have enough time to finish it, but something might already be useful. (Should I better drop the fish idea?) Will write the rest tomorrow. ciao - chris http://www.pns.cc/stackless/stackless.htm -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one at email.msn.com Thu Jun 10 07:25:11 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 01:25:11 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: <375E6500.307EF39E@appliedbiometrics.com> Message-ID: <001401beb301$9cf20b00$af9e2299@tim> [Christian Tismer, replying to Guido's enthusiasm ] > Thanks for looking into it, thanks for saying it's tricky. > Since I failed to supply proper documentation yet, this > impression must come up. > > But it is really not true. The code is not tricky > but just straightforward and consequent, after one has understood > what it means to work without a stack, under the precondition > to avoid too much changes. I didn't want to rewrite > the world, and I just added the tiny missing bits. > > I will write up my documentation now, and you will > understand what the difficulties were. These will not > vanish, "stackless" is a brainteaser. My problem was not how > to change the code, but finally it was how to change > my brain. Now everything is just obvious. FWIW, I believe you! There's something *inherently* tricky about maintaining the effect of a stack without using the stack C supplies implicitly, and from all you've said and what I've learned of your code, it really isn't the code that's tricky here. You're making formerly-hidden connections explicit, which means more stuff is visible, but also means more power and flexibility *because* "more stuff is visible". Agree too that this clearly moves in the direction of making the VM pluggable. > ... > I *will* continue, regardless what anybody says. 
Ah, if that's how this works, then STOP! Immediately! Don't you dare waste more of our time with this crap . want-some-money?-ly y'rs - tim From tim_one at email.msn.com Thu Jun 10 07:44:50 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 01:44:50 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: <001701beb304$5b8a8b80$af9e2299@tim> [Guido van Rossum] > ... > Here's what I think we should do first (I've mentioned this before but > nobody cheered me on :-). I'd like to see this as the basis for 1.6. > > We should structurally split the Python Virtual Machine and related > code up into different parts -- both at the source code level and at > the runtime level. The core PVM becomes a replaceable component, and > so do a few other parts like the parser, the bytecode compiler, the > import code, and the interactive read-eval-print loop. Most object > implementations are shared between all -- or at least the interfaces > are interchangeable. Clearly, a few object types are specific to one > or another PVM (e.g. frames). The collection of builtins is also a > separate component (though some builtins may again be specific to a > PVM -- details, details!). > > The goal of course, is to create a market for 3rd party components > here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's > importer, and so on. > > Thoughts? The idea of major subsystems getting reworked to conform to well-defined and well-controlled interfaces is certainly appealing. I'm just more comfortable squeezing another 1.7% out of list.sort() <0.9 wink>. trying-to-reduce-my-ambitions-to-match-my-time-ly y'rs - tim From jack at oratrix.nl Thu Jun 10 10:49:31 1999 From: jack at oratrix.nl (Jack Jansen) Date: Thu, 10 Jun 1999 10:49:31 +0200 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: Message by Guido van Rossum , Tue, 08 Jun 1999 23:31:11 -0400 , <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: <19990610084931.55882303120@snelboot.oratrix.nl> > Here's what I think we should do first (I've mentioned this before but > nobody cheered me on :-). Go, Guido, GO!!!! What I'd like in the split you propose is to see which of the items would be implementable in Python, and try to do the split in such a way that such a Python implementation isn't ruled out. Am I correct in guessing that after factoring out the components you mention the only things that aren't in a "replaceable component" are the builtin objects, and a little runtime glue (malloc and such)? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From tismer at appliedbiometrics.com Thu Jun 10 14:16:20 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 10 Jun 1999 14:16:20 +0200 Subject: [Python-Dev] Splitting up the PVM References: <001401beb301$9cf20b00$af9e2299@tim> Message-ID: <375FAC94.D17D43A7@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer, replying to Guido's enthusiasm ] ... > > I will write up my documentation now, and you will still under some work :) > > understand what the difficulties were. These will not > > vanish, "stackless" is a brainteaser. My problem was not how > > to change the code, but finally it was how to change > > my brain. Now everything is just obvious. > > FWIW, I believe you! 
There's something *inherently* tricky about > maintaining the effect of a stack without using the stack C supplies > implicitly, and from all you've said and what I've learned of your code, it > really isn't the code that's tricky here. You're making formerly-hidden > connections explicit, which means more stuff is visible, but also means more > power and flexibility *because* "more stuff is visible". I knew you would understand me. Feeling much, much better now :-)) After this is finalized, restartable exceptions might be interesting to explore. No, Chris, do the doco... > > I *will* continue, regardless what anybody says. > > Ah, if that's how this works, then STOP! Immediately! Don't you dare waste > more of our time with this crap . Thanks, you fired me a continuation. Here the way to get me into an endless loop: Give me an unsolvable problem and claim I can't do that. :) (just realized that I'm just another pluggable interpreter) > want-some-money?-ly y'rs - tim No, but meet you at least once in my life. -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From arw at ifu.net Thu Jun 10 15:40:51 1999 From: arw at ifu.net (Aaron Watters) Date: Thu, 10 Jun 1999 09:40:51 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] Message-ID: <375FC062.62850DE5@ifu.net> While we're talking about stacks... I've always considered it a major shame that Python ints and floats and chars and stuff have anything to do with dynamic allocation, and I always suspected it might be a major speed performance boost if there was some way they could be manipulated without the need for dynamic memory management. One conceivable alternative approach would change the basic manipulation of objects so that instead of representing objects via pyobject pointers everywhere represent them using two "slots" in a structure for each object, one of which is a type descriptor pointer and the other being a (void *) which could contain the data directly for small objects such as ints, floats, chars. In this case, for example, integer addition would never require any memory management, as it shouldn't, I think, in a perfect world. IE instead of C-stack or static: Heap: (pyobject *) ------------> (refcount, typedescr, data ...) in general you get (typedescr repr* ----------------------> (refcount, data, ...) ) or for small objects like ints and floats and chars simply (typedescr, value) with no dereferencing or memory management required. My feeling is that common things like arithmetic and indexing lists of integers and stuff could be much faster under this approach since it reduces memory management overhead and fragmentation, dereferencing, etc... One bad thing, of course, is that this might be a drastic assault on the way existing code works... Unless I'm just not being creative enough with my thinking. Is this a good idea? If so, is there any way to add it to the interpreter without breaking extension modules and everything else? If Python 2.0 will break stuff anyway, would this be an good change to the internals? Curious... -- Aaron Watters ps: I suppose another gotcha is "when do you do increfs/decrefs?" because they no longer make sense for ints in this case... 
maybe add a flag to the type descriptor "increfable" and assume that the typedescriptors are always in the CPU cache (?). This would slow down increfs by a couple cycles... Would it be worth it? Only the benchmark knows... Another fix would be to put the refcount in the static side with no speed penalty (typedescr repr* ----------------------> data refcount ) but would that be wasteful of space? From guido at CNRI.Reston.VA.US Thu Jun 10 15:45:51 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 10 Jun 1999 09:45:51 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: Your message of "Thu, 10 Jun 1999 10:49:31 +0200." <19990610084931.55882303120@snelboot.oratrix.nl> References: <19990610084931.55882303120@snelboot.oratrix.nl> Message-ID: <199906101345.JAA29917@eric.cnri.reston.va.us> [me] > > Here's what I think we should do first (I've mentioned this before but > > nobody cheered me on :-). [Jack] > Go, Guido, GO!!!! > > What I'd like in the split you propose is to see which of the items would be > implementable in Python, and try to do the split in such a way that such a > Python implementation isn't ruled out. Indeed. The importing code and the read-eval-print loop are obvious candidates (in fact IDLE shows how the latter can be done today). I'm not sure if it makes sense to have a parser/compiler or the VM written in Python, because of the expected slowdown (plus, the VM would present a chicken-egg problem :-) although for certain purposes one might want to do this. An optimizing pass would certainly be a good candidate. > Am I correct in guessing that after factoring out the components you mention > the only things that aren't in a "replaceable component" are the builtin > objects, and a little runtime glue (malloc and such)? I guess (although how much exactly will only become clear when it's done). I guess that things like thread-safety and GC policy are also pervasive. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Thu Jun 10 16:11:23 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 10 Jun 1999 10:11:23 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: Your message of "Thu, 10 Jun 1999 09:40:51 EDT." <375FC062.62850DE5@ifu.net> References: <375FC062.62850DE5@ifu.net> Message-ID: <199906101411.KAA29962@eric.cnri.reston.va.us> [Aaron] > I've always considered it a major shame that Python ints and floats > and chars and stuff have anything to do with dynamic allocation, and > I always suspected it might be a major speed performance boost if > there was some way they could be manipulated without the need for > dynamic memory management. What you're describing is very close to what I recall I once read about the runtime organization of Icon. Perl may also use a variant on this (it has fixed-length object headers). On the other hand, I believe Smalltalks typically uses something like the following ABC trick: In ABC, we used a variation: objects were represented by pointers as in Python, except when the low bit was 1, in which case the remaining 31 bits were a "small int". My experience with this approach was that it probably saved some memory, but perhaps not time (since almost all operations on objects were slowed down by the check "is it an int?" before the pointer could be accessed); and that because of this it was a major hassle in keeping the implementation code correct. 
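(The low-bit encoding described here can be mimicked with ordinary Python integers just to see the mechanics; this is an illustration of the idea, not ABC or Python source, and every operation has to pay for the is-it-an-int test first:)

    # a tagged word is either a 31-bit small int (low bit 1) or, in the
    # real scheme, a pointer (low bit 0)
    def box_small_int(n):
        return (n << 1) | 1

    def is_small_int(word):
        return word & 1

    def unbox_small_int(word):
        return word >> 1              # arithmetic shift keeps the sign

    print unbox_small_int(box_small_int(42))    # 42
    print unbox_small_int(box_small_int(-7))    # -7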
There was always the temptation to make a check early in a piece of code and then skip the check later on, which sometimes didn't work when objects switched places. Plus in general the checks made the code less readable, and it was just one more thing to remember to do. The Icon approach (i.e. yours) seems to require a complete rethinking of all object implementations and all APIs at the C level -- perhaps we could think about it for Python 2.0. Some ramifications: - Uses more memory for highly shared objects (there are as many copies of the type pointer as there are references). - Thus, lists take double the memory assuming they reference objects that also exist elsewhere. This affects the performance of slices etc. - On the other hand, a list of ints takes half the memory (given that most of those ints are not shared). - *Homogeneous* lists (where all elements have the same type -- i.e. arrays) can be represented more efficiently by having only one copy of the type pointer. This was an idea for ABC (whose type system required all container types to be homogenous) that was never implemented (because in practice the type check wasn't always applied, and the top-level namespace used by the interactive command interpreter violated all the rules). - Reference count manipulations could be done by a macro (or C++ behind-the-scense magic using copy constructors and destructors) that calls a function in the type object -- i.e. each object could decide on its own reference counting implementation :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Thu Jun 10 20:02:30 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 10 Jun 1999 14:02:30 -0400 (EDT) Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: <375FC062.62850DE5@ifu.net> <199906101411.KAA29962@eric.cnri.reston.va.us> Message-ID: <14175.64950.720465.456133@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> In ABC, we used a variation: objects were represented by Guido> pointers as in Python, except when the low bit was 1, in Guido> which case the remaining 31 bits were a "small int". Very similar to how Emacs Lisp manages its type system, to which XEmacs extended. The following is from the XEmacs Internals documentation[1]. XEmacs' object representation (on a 32 bit machine) uses the top bit as a GC mark bit, followed by three type tag bits, followed by a pointer or an integer: [ 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 ] [ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 ] ^ <---> <------------------------------------------------------> | tag a pointer to a structure, or an integer | `---> mark bit One of the 8 possible types representable by the tag bits, one is a "record" type, which essentially allows an unlimited (well, 2^32) number of data types. As you might guess there are lots of interesting details and limitations to this scheme, with lots of interesting macros in the C code :). Reading and debugging the C implementation gets fun too (we'll ignore for the moment all the GCPRO'ing going on -- if you think INCREF/DECREF is trouble prone, hah!). Whether or not this is at all relevent for Python 2.0, it all seems to work pretty well in (X)Emacs. >>>>> "AW" == Aaron Watters writes: AW> ps: I suppose another gotcha is "when do you do AW> increfs/decrefs?" because they no longer make sense for ints AW> in this case... 
maybe add a flag to the type descriptor AW> "increfable" and assume that the typedescriptors are always in AW> the CPU cache (?). This would slow down increfs by a couple AW> cycles... Would it be worth it? Only the benchmark knows... AW> Another fix would be to put the refcount in the static side AW> with no speed penalty | (typedescr | repr* ----------------------> data | refcount | ) AW> but would that be wasteful of space? Once again, you can move the refcount out of the objects, a la NextStep. Could save space and improve LOC for read-only objects. -Barry [1] The Internals documentation comes with XEmacs's Info documetation. Hit: C-h i m Internals RET m How RET From tismer at appliedbiometrics.com Thu Jun 10 21:53:10 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 10 Jun 1999 21:53:10 +0200 Subject: [Python-Dev] Stackless Preview References: <000d01beb21f$a3daac20$2fa22299@tim> Message-ID: <376017A6.DC619723@appliedbiometrics.com> Howdy, I worked a little more on the docs and figured out that I could use a hint. http://www.pns.cc/stackless/stackless.htm Trying to give an example how coroutines could work, some weaknesses showed up. I wanted to write some function coroutine_transfer which swaps two frame chains. This function should return my unwind token, but unfortunately in that case a real result would be needed as well. Well, I know of several ways out, but it's a matter of design, and I'd like to find the most elegant solution for this. Could perhaps someone of those who encouraged me have a look into the problem? Do I have to add yet another field for return values and handle that in the dispatcher? thanks - chris (tired of thinking) -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Fri Jun 11 01:32:26 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 10 Jun 1999 19:32:26 -0400 (EDT) Subject: [Python-Dev] String methods... finally Message-ID: <14176.19210.146525.172100@anthem.cnri.reston.va.us> I've finally checked my string methods changes into the source tree, albeit on a CVS branch (see below). These changes are outgrowths of discussions we've had on the string-sig, with I think Greg Stein giving lots of very useful early feedback. I'll call these changes controversial (hence the branch) because Guido hasn't had much opportunity to play with them. Now that he -- and you -- can check them out, I'm sure I'll get lots more feedback! First, to check them out you need to switch to the string_methods CVS branch. On Un*x: cvs update -r string_methods You might want to do this in a separate tree because this will sticky tag your tree to this branch. If so, try cvs checkout -r string_methods python Here's a brief summary of the changes (as best I can restore the state -- its been a while since I actually made all these changes ;) Strings now have as methods most of the functions that were previously only in the string module. If you've played with JPython, you've already had this feature for a while. 
So you can do: Python 1.5.2+ (#1, Jun 10 1999, 18:22:14) [GCC 2.8.1] on sunos5 Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam >>> s = 'Hello There Devheads' >>> s.lower() 'hello there devheads' >>> s.upper() 'HELLO THERE DEVHEADS' >>> s.split() ['Hello', 'There', 'Devheads'] >>> 'hello'.upper() 'HELLO' that sort of thing. Some of the string module functions don't make sense as string methods, like join, and others never had a C implementation so weren't added, like center. Two new methods startswith and endswith act like their Java cousins. The string module has been rewritten to be completely (I hope) backwards compatible. No code should break, though they could be slower. Guido and I decided that was acceptable. What else? Some cleaning up of the internals based on Greg's suggestions. A couple of new C API additions. Builtin int(), long(), and float() have grown a few new features. I believe they are essentially interchangable with string.atoi(), string.atol(), and string.float() now. After you guys get to toast me (in either sense of the word) for a while and these changes settle down, I'll make a wider announcement. Enjoy, -Barry From da at ski.org Fri Jun 11 01:37:54 1999 From: da at ski.org (David Ascher) Date: Thu, 10 Jun 1999 16:37:54 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: On Thu, 10 Jun 1999, Barry A. Warsaw wrote: > I've finally checked my string methods changes into the source tree, Great! > ... others never had a C implementation so weren't added, like center. I assume that's not a design decision but a "haven't gotten around to it yet" statement, right? > Two new methods startswith and endswith act like their Java cousins. aaaah... . --david From MHammond at skippinet.com.au Fri Jun 11 01:59:17 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Fri, 11 Jun 1999 09:59:17 +1000 Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <003101beb39d$41b1c7c0$0801a8c0@bobcat> > I've finally checked my string methods changes into the source tree, > albeit on a CVS branch (see below). These changes are outgrowths of Yay! Would this also be a good opportunity to dust-off the Unicode implementation the string-sig recently came up with (as implemented by Fredrik) and get this in as a type? Although we still have the unresolved issue of how to use PyArg_ParseTuple etc to convert to/from Unicode and 8bit, it would still be nice to have Unicode and String objects capable of being used interchangably at the Python level. Of course, the big problem with attempting to test out these sorts of changes is that you must do so in code that will never see the public for a good 12 months. I suppose a 1.5.25 is out of the question ;-) Mark. From guido at CNRI.Reston.VA.US Fri Jun 11 03:40:07 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 10 Jun 1999 21:40:07 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Fri, 11 Jun 1999 09:59:17 +1000." <003101beb39d$41b1c7c0$0801a8c0@bobcat> References: <003101beb39d$41b1c7c0$0801a8c0@bobcat> Message-ID: <199906110140.VAA02180@eric.cnri.reston.va.us> > Would this also be a good opportunity to dust-off the Unicode > implementation the string-sig recently came up with (as implemented by > Fredrik) and get this in as a type? 
> > Although we still have the unresolved issue of how to use PyArg_ParseTuple > etc to convert to/from Unicode and 8bit, it would still be nice to have > Unicode and String objects capable of being used interchangably at the > Python level. Yes, yes, yes! Even if it's not supported everywhere, at least having the Unicode type in the source tree would definitely help! > Of course, the big problem with attempting to test out these sorts of > changes is that you must do so in code that will never see the public for a > good 12 months. I suppose a 1.5.25 is out of the question ;-) We'll see about that... (I sometimes wished I wasn't in the business of making releases. I've asked for help with making essential patches to 1.5.2 available but nobody volunteered... :-( ) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Fri Jun 11 05:08:28 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 23:08:28 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <14175.64950.720465.456133@anthem.cnri.reston.va.us> Message-ID: <000a01beb3b7$adda3b20$329e2299@tim> Jumping in to opine that mixing tag/type bits with native pointers is a Really Bad Idea. Put the bits on the low end and word-addressed machines are screwed. Put the bits on the high end and you've made severe assumptions about how the platform parcels out address space. In any case you're stuck with ugly macros everywhere. This technique was pioneered by Lisps, and was beautifully exploited by the Symbolics Lisp Machine and TI Lisp Explorer hardware. Lisp people don't want to admit those failed, so continue simulating the HW design by hand at comparatively sluggish C speed <0.6 wink>. BTW, I've never heard this approach argued as a speed optimization (except in the HW implementations): software mask-test-branch around every inc/dec-ref to exempt ints is a nasty new repeated expense. The original motivation was to save space, and that back in the days when a 128Mb RAM chip wasn't even conceivable, let alone under $100 . once-wrote-a-functional-language-interpreter-in-8085-assembler-that-ran- in-24Kb-cuz-that's-all-there-was-but-don't-feel-i-need-to-repeat-the- experience-today-wink>-ly y'rs - tim From bwarsaw at cnri.reston.va.us Fri Jun 11 05:13:29 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 10 Jun 1999 23:13:29 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <14176.32473.408675.992145@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: >> ... others never had a C implementation so weren't added, like >> center. DA> I assume that's not a design decision but a "haven't gotten DA> around to it yet" statement, right? I think we decided that they weren't used enough to implement in C. >> Two new methods startswith and endswith act like their Java >> cousins. DA> aaaah... . Tell me about it! -Barry From tim_one at email.msn.com Fri Jun 11 05:33:25 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 23:33:25 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <000b01beb3bb$29ccdaa0$329e2299@tim> > Two new methods startswith and endswith act like their Java cousins. Barry, suggest that both of these grow optional start and end slice indices. Why? It's Pythonic . 
Really, I'm forever marching over huge strings a slice-pair at a time, and it's important that searches and matches never give me false hits due to slobbering over the current slice bounds. regexp objects in general, and string.find/.rfind in particular, support this beautifully. Java feels less need since sub-stringing is via cheap descriptor there. The optional indices wouldn't hurt Java, but would help Python. then-again-if-strings-were-so-great-i'd-switch-to-tcl-ly y'rs - tim From bwarsaw at cnri.reston.va.us Fri Jun 11 05:41:55 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 10 Jun 1999 23:41:55 -0400 (EDT) Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: <14175.64950.720465.456133@anthem.cnri.reston.va.us> <000a01beb3b7$adda3b20$329e2299@tim> Message-ID: <14176.34179.125397.282079@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Jumping in to opine that mixing tag/type bits with native TP> pointers is a Really Bad Idea. Put the bits on the low end TP> and word-addressed machines are screwed. Put the bits on the TP> high end and you've made severe assumptions about how the TP> platform parcels out address space. In any case you're stuck TP> with ugly macros everywhere. Ah, so you /have/ read the Emacs source code! I'll agree that it's just an RBI for Emacs, but for Python, it'd be a RFSI. TP> This technique was pioneered by Lisps, and was beautifully TP> exploited by the Symbolics Lisp Machine and TI Lisp Explorer TP> hardware. Lisp people don't want to admit those failed, so TP> continue simulating the HW design by hand at comparatively TP> sluggish C speed <0.6 wink>. But of course, the ghosts live on at the FSF and xemacs.org (couldn't tell ya much about how modren Lisps do it). -Barry From skip at mojam.com Fri Jun 11 06:26:49 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 11 Jun 1999 00:26:49 -0400 (EDT) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> Barry> Some of the string module functions don't make sense as string Barry> methods, like join, and others never had a C implementation so Barry> weren't added, like center. I take it string.capwords falls into that category. It's one of those things that's so easy to write in Python and there's no real speed gain in going to C, that it didn't make much sense to add it to the strop module, right? I see the following functions in string.py that could reasonably be methodized: ljust, rjust, center, expandtabs, capwords That's not very many, and it would appear that this stuff won't see widespread use for quite some time. I think for completeness sake we should bite the bullet on them. BTW, I built it and think it is very cool. Tipping my virtual hat to Barry, I am... Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From skip at mojam.com Fri Jun 11 06:57:15 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 11 Jun 1999 00:57:15 -0400 (EDT) Subject: [Python-Dev] String methods... 
finally In-Reply-To: <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> Message-ID: <14176.38521.124491.987817@cm-24-29-94-19.nycap.rr.com> Skip> I see the following functions in string.py that could reasonably be Skip> methodized: Skip> ljust, rjust, center, expandtabs, capwords It occurred to me just a few minutes after sending my previous message that it might make sense to make string.join a method for lists and tuples. They'd obviously have to make the same type checks that string.join does. That would leave the string/strip modules implementing just a couple functions. Skip From da at ski.org Fri Jun 11 07:09:46 1999 From: da at ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:09:46 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.38521.124491.987817@cm-24-29-94-19.nycap.rr.com> Message-ID: On Fri, 11 Jun 1999, Skip Montanaro wrote: > It occurred to me just a few minutes after sending my previous message that > it might make sense to make string.join a method for lists and tuples. > They'd obviously have to make the same type checks that string.join does. as in: >>> ['spam!', 'eggs!'].join() 'spam! eggs!' ? I like the notion, but I think it would naturally migrate towards genericity, at which point it might be called "reduce", so that: >>> ['spam!', 'eggs!'].reduce() 'spam!eggs!' >>> ['spam!', 'eggs!'].reduce(' ') 'spam! eggs!' >>> [1,2,3].reduce() 6 # 1 + 2 + 3 >>> [1,2,3].reduce(10) 26 # 1 + 10 + 2 + 10 + 3 note that string.join(foo) == foo.reduce(' ') and string.join(foo, '') == foo.reduce() --david From guido at CNRI.Reston.VA.US Fri Jun 11 07:16:29 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 01:16:29 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Thu, 10 Jun 1999 22:09:46 PDT." References: Message-ID: <199906110516.BAA02520@eric.cnri.reston.va.us> > On Fri, 11 Jun 1999, Skip Montanaro wrote: > > > It occurred to me just a few minutes after sending my previous message that > > it might make sense to make string.join a method for lists and tuples. > > They'd obviously have to make the same type checks that string.join does. > > as in: > > >>> ['spam!', 'eggs!'].join() > 'spam! eggs!' Note that this is not as powerful as string.join(); the latter works on any sequence, not just on lists and tuples. (Though that may not be a big deal.) I also find it slightly objectionable that this is a general list method but only works if the list contains only strings; Dave Ascher's generalization to reduce() is cute but strikes me are more general than useful, and the name will forever present a mystery to most newcomers. Perhaps join() ought to be a built-in function? --Guido van Rossum (home page: http://www.python.org/~guido/) From da at ski.org Fri Jun 11 07:23:06 1999 From: da at ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:23:06 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Guido van Rossum wrote: > Perhaps join() ought to be a built-in function? Would it do the moral equivalent of a reduce(operator.add, ...) or of a string.join? I think it should do the former (otherwise something about 'string' should be in the name), and as a consequence I think it shouldn't have the default whitespace spacer. 
cute-but-general'ly y'rs, david From da at ski.org Fri Jun 11 07:35:42 1999 From: da at ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:35:42 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Aside: apply syntax Message-ID: I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core to allow * and ** in function calls, so that: class SubFoo(Foo): def __init__(self, *args, **kw): apply(Foo, (self, ) + args, kw) ... could be written class SubFoo(Foo): def __init__(self, *args, **kw): Foo(self, *args, **kw) ... I really like this notion, but before I poke around trying to see if it's doable, I'd like to get feedback on whether y'all think it's a good idea or not. And if someone else wants to do it, feel free -- I am of course swamped, and I won't get to it until after rich comparisons. FWIW, apply() is one of my least favorite builtins, aesthetically speaking. --david From da at ski.org Fri Jun 11 07:36:30 1999 From: da at ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:36:30 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Re: Aside: apply syntax In-Reply-To: Message-ID: On Thu, 10 Jun 1999, David Ascher wrote: > class SubFoo(Foo): > def __init__(self, *args, **kw): > apply(Foo, (self, ) + args, kw) > ... > > could be written > > class SubFoo(Foo): > def __init__(self, *args, **kw): > Foo(self, *args, **kw) Of course I meant Foo.__init__ in both of the above! --david From skip at mojam.com Fri Jun 11 09:07:09 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 11 Jun 1999 03:07:09 -0400 (EDT) Subject: [Python-Dev] String methods... finally In-Reply-To: References: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: <14176.45761.801671.880774@cm-24-29-94-19.nycap.rr.com> David> I think it should do the former (otherwise something about David> 'string' should be in the name), and as a consequence I think it David> shouldn't have the default whitespace spacer. Perhaps "joinstrings" would be an appropriate name (though it seems gratuitously long) or join should call str() on non-string elements. My thought here is that we have left in the string module a couple functions that ought to be string object methods but aren't yet mostly for convenience or time constraints, and one (join) that is 99.9% of the time used on lists or tuples of strings. That leaves a very small handful of methods that don't naturally fit somewhere else. You can, of course, complete the picture and add a join method to string objects, which would be useful to explode them into individual characters. That would complete the join-as-a-sequence-method picture I think. If you don't somebody else (and not me, cuz I'll know why already!) is bound to ask why capwords, join, ljust, etc got left behind in the string module while all the other functions got promotions to object methods. Oh, one other thing I forgot. Split (join) and splitfields (joinfields) used to be different. They've been the same for a long time now, long enough that I no longer recall how they used to differ. In making the leap from string module to string methods, I suggest dropping the long names altogether. There's no particular compatibility reason to keep them and they're not really any more descriptive than their shorter siblings. It's not like you'll be preserving backward compatibility for anyone's code by having them. However, if you release this code to the larger public, then you'll be stuck with both in perpetuity. 
Skip From fredrik at pythonware.com Fri Jun 11 09:06:58 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 11 Jun 1999 09:06:58 +0200 Subject: [Python-Dev] String methods... finally References: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: <008701beb3da$5e2db9d0$f29b12c2@pythonware.com> Guido wrote: > Note that this is not as powerful as string.join(); the latter works > on any sequence, not just on lists and tuples. (Though that may not > be a big deal.) > > I also find it slightly objectionable that this is a general list > method but only works if the list contains only strings; Dave Ascher's > generalization to reduce() is cute but strikes me are more general > than useful, and the name will forever present a mystery to most > newcomers. > > Perhaps join() ought to be a built-in function? come to think of it, the last design I came up with (inspired by a mail from you which I cannot find right now), was this: def join(sequence, sep=None): # built-in if not sequence: return "" sequence[0].__join__(sequence, sep) string.join => join and __join__ methods in the unicode and string classes. Guido? From fredrik at pythonware.com Fri Jun 11 09:03:19 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 11 Jun 1999 09:03:19 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <008601beb3da$5e0a7a60$f29b12c2@pythonware.com> Barry wrote: > Some of the string module functions don't make sense as > string methods, like join, and others never had a C > implementation so weren't added, like center. fwiw, the Unicode module available from pythonware.com implements them all, and more importantly, it can be com- piled for either 8-bit or 16-bit characters... join is a special problem; IIRC, Guido came up with what I at that time thought was an excellent solution, but I don't recall what it was right now ;-) anyway, maybe we should start by figuring out what methods we really want in there, and then figure out whether we should have one or two independent string implementations in the core... From mal at lemburg.com Fri Jun 11 10:15:33 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jun 1999 10:15:33 +0200 Subject: [Python-Dev] String methods... finally References: Message-ID: <3760C5A5.43FB1658@lemburg.com> David Ascher wrote: > > On Fri, 11 Jun 1999, Guido van Rossum wrote: > > > Perhaps join() ought to be a built-in function? > > Would it do the moral equivalent of a reduce(operator.add, ...) or of a > string.join? > > I think it should do the former (otherwise something about 'string' should > be in the name), and as a consequence I think it shouldn't have the > default whitespace spacer. AFAIK, Guido himself proposed something like this on c.l.p a few months ago. I think something like the following written in C and optimized for lists of strings might be useful: def join(sequence,sep=None): x = sequence[0] if sep: for y in sequence[1:]: x = x + sep + y else: for y in sequence[1:]: x = x + y return x >>> join(('a','b')) 'ab' >>> join(('a','b'),' ') 'a b' >>> join((1,2,3),3) 12 >>> join(((1,2),(3,))) (1, 2, 3) Also, while we're at string functions/methods. Some of the stuff in mxTextTools (see Python Pages link below) might be of general use as well, e.g. splitat(), splitlines() and charsplit(). 
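(For reference, a rough pure-Python sketch of the slice-aware startswith/endswith behaviour Tim asked for above; the function names and defaults here are illustrative only, not the C implementation that was checked in:)

    def startswith(s, prefix, start=0, end=None):
        if end is None:
            end = len(s)
        return s[start:end][:len(prefix)] == prefix

    def endswith(s, suffix, start=0, end=None):
        if end is None:
            end = len(s)
        if not suffix:
            return 1
        return s[start:end][-len(suffix):] == suffix

    print startswith("Hello There Devheads", "There", 6)      # 1
    print endswith("Hello There Devheads", "There", 0, 11)    # 1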
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 203 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Jun 11 14:31:51 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 08:31:51 -0400 Subject: [Python-Dev] Aside: apply syntax In-Reply-To: Your message of "Thu, 10 Jun 1999 22:35:42 PDT." References: Message-ID: <199906111231.IAA02774@eric.cnri.reston.va.us> > I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core > to allow * and ** in function calls, so that: > > class SubFoo(Foo): > def __init__(self, *args, **kw): > apply(Foo, (self, ) + args, kw) > ... > > could be written > > class SubFoo(Foo): > def __init__(self, *args, **kw): > Foo(self, *args, **kw) > ... > > I really like this notion, but before I poke around trying to see if it's > doable, I'd like to get feedback on whether y'all think it's a good idea > or not. And if someone else wants to do it, feel free -- I am of course > swamped, and I won't get to it until after rich comparisons. > > FWIW, apply() is one of my least favorite builtins, aesthetically > speaking. I like the idea, but it would mean a major reworking of the grammar and the parser. Can I persuade you to keep this on ice until 2.0? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Fri Jun 11 14:54:30 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 11 Jun 1999 14:54:30 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <004601beb409$8c535750$f29b12c2@pythonware.com> > Two new methods startswith and endswith act like their Java cousins. is it just me, or do those method names suck? begin? starts_with? startsWith? (ouch) has_prefix? From arw at ifu.net Fri Jun 11 15:05:17 1999 From: arw at ifu.net (Aaron Watters) Date: Fri, 11 Jun 1999 09:05:17 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: <199906110342.XAA07977@python.org> Message-ID: <3761098D.A56F58A8@ifu.net> From: "Tim Peters" >Jumping in to opine that mixing tag/type bits with native pointers is a >Really Bad Idea. Put the bits on the low end and word-addressed machines >are screwed. Put the bits on the high end and you've made severe >assumptions about how the platform parcels out address space. In any case >you're stuck with ugly macros everywhere. Agreed. Never ever mess with pointers. This mistake has been made over and over again by each new generation of computer hardware and software and it's still a mistake. I thought it would be good to be able to do the following loop with Numeric arrays for x in array1: array2[x] = array3[x] + array4[x] without any memory management being involved. Right now, I think the for loop has to continually dynamically allocate each new x and intermediate sum (and immediate deallocate them) and that makes the loop piteously slow. The idea replacing pyobject *'s with a struct [typedescr *, data *] was a space/time tradeoff to speed up operations like the above by eliminating any need for mallocs or other memory management.. I really can't say whether it'd be worth it or not without some sort of real testing. Just a thought. -- Aaron Watters From mal at lemburg.com Fri Jun 11 15:11:20 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jun 1999 15:11:20 +0200 Subject: [Python-Dev] String methods... 
finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> Message-ID: <37610AF8.3EC610FD@lemburg.com> Fredrik Lundh wrote: > > > Two new methods startswith and endswith act like their Java cousins. > > is it just me, or do those method names suck? > > begin? starts_with? startsWith? (ouch) > has_prefix? In mxTextTools I used the names prefix() and suffix() for much the same thing except that those functions accept a list of strings and return the (first) matching string instead of just 1 or 0. Details are available at: http://starship.skyport.net/~lemburg/mxTextTools.html -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 203 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Jun 11 15:58:10 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 09:58:10 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Fri, 11 Jun 1999 15:11:20 +0200." <37610AF8.3EC610FD@lemburg.com> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> <37610AF8.3EC610FD@lemburg.com> Message-ID: <199906111358.JAA02836@eric.cnri.reston.va.us> > > > Two new methods startswith and endswith act like their Java cousins. > > > > is it just me, or do those method names suck? It's just you. > > begin? starts_with? startsWith? (ouch) > > has_prefix? Those are all painful to type, except "begin", which isn't expressive. > In mxTextTools I used the names prefix() and suffix() for much The problem with those is that it's arbitrary (==> harder to remember) whether A.prefix(B) means that A is a prefix of B or that A has B for a prefix. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Fri Jun 11 16:55:14 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jun 1999 16:55:14 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> <37610AF8.3EC610FD@lemburg.com> <199906111358.JAA02836@eric.cnri.reston.va.us> Message-ID: <37612352.227FCA4B@lemburg.com> Guido van Rossum wrote: > > > > > Two new methods startswith and endswith act like their Java cousins. > > > > > > is it just me, or do those method names suck? > > It's just you. > > > > begin? starts_with? startsWith? (ouch) > > > has_prefix? > > Those are all painful to type, except "begin", which isn't expressive. > > > In mxTextTools I used the names prefix() and suffix() for much > > The problem with those is that it's arbitrary (==> harder to remember) > whether A.prefix(B) means that A is a prefix of B or that A has B for > a prefix. True. These are functions in mxTextTools and take a sequence as second argument, so the order is clear there... 
has_prefix() has_suffix() would probably be appropriate as methods (you don't type them that often ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 203 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jack at oratrix.nl Fri Jun 11 17:55:36 1999 From: jack at oratrix.nl (Jack Jansen) Date: Fri, 11 Jun 1999 17:55:36 +0200 Subject: [Python-Dev] Aside: apply syntax In-Reply-To: Message by Guido van Rossum , Fri, 11 Jun 1999 08:31:51 -0400 , <199906111231.IAA02774@eric.cnri.reston.va.us> Message-ID: <19990611155536.944FA303120@snelboot.oratrix.nl> > > > > class SubFoo(Foo): > > def __init__(self, *args, **kw): > > Foo(self, *args, **kw) > > ... Guido: > I like the idea, but it would mean a major reworking of the grammar > and the parser. Can I persuade you to keep this on ice until 2.0? What exactly would the semantics be? While I hate the apply() loops you have to jump through nowadays to get this behaviour I don't funny understand how this would work in general (as opposed to in this case). For instance, would Foo(self, 12, *args, **kw) be allowed? And Foo(self, *args, x=12, **kw) ? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From da at ski.org Fri Jun 11 18:57:37 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 09:57:37 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Aside: apply syntax In-Reply-To: <199906111231.IAA02774@eric.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Guido van Rossum wrote: > > I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core > > to allow * and ** in function calls, so that: > > > I like the idea, but it would mean a major reworking of the grammar > and the parser. Can I persuade you to keep this on ice until 2.0? Sure. That was hard. =) From da at ski.org Fri Jun 11 19:02:49 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 10:02:49 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Aside: apply syntax In-Reply-To: <19990611155536.944FA303120@snelboot.oratrix.nl> Message-ID: On Fri, 11 Jun 1999, Jack Jansen wrote: > What exactly would the semantics be? While I hate the apply() loops you have > to jump through nowadays to get this behaviour I don't funny understand how > this would work in general (as opposed to in this case). For instance, would > Foo(self, 12, *args, **kw) > be allowed? And > Foo(self, *args, x=12, **kw) Following the rule used for argument processing now, if it's unambiguous, it should be allowed, and not otherwise. So, IMHO, the above two should be allowed, and I suspect Foo.__init__(self, *args, *args2) could be too, but Foo.__init__(self, **kw, **kw2) should not, as dictionary addition is not allowed. However, I could live with the more restricted version as well. --david From bwarsaw at cnri.reston.va.us Fri Jun 11 19:17:20 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:17:20 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <000b01beb3bb$29ccdaa0$329e2299@tim> Message-ID: <14177.17568.637272.328126@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: >> Two new methods startswith and endswith act like their Java >> cousins. 
TP> Barry, suggest that both of these grow optional start and end TP> slice indices. 'Course it'll make the Java implementations of these extra args a little more work. Right now they just forward off to the underlying String methods. No biggie though. I've got new implementations to check in -- let me add a few new tests to cover 'em and watch your checkin emails. -Barry From guido at CNRI.Reston.VA.US Fri Jun 11 19:20:57 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 13:20:57 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Fri, 11 Jun 1999 13:17:20 EDT." <14177.17568.637272.328126@anthem.cnri.reston.va.us> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <000b01beb3bb$29ccdaa0$329e2299@tim> <14177.17568.637272.328126@anthem.cnri.reston.va.us> Message-ID: <199906111720.NAA03746@eric.cnri.reston.va.us> > From: "Barry A. Warsaw" > > 'Course it'll make the Java implementations of these extra args a > little more work. Right now they just forward off to the underlying > String methods. No biggie though. Which reminds me -- are you tracking this in JPython too? --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Fri Jun 11 19:39:41 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:39:41 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <000b01beb3bb$29ccdaa0$329e2299@tim> <14177.17568.637272.328126@anthem.cnri.reston.va.us> <199906111720.NAA03746@eric.cnri.reston.va.us> Message-ID: <14177.18909.980174.55751@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Which reminds me -- are you tracking this in JPython too? That's definitely my plan. From bwarsaw at cnri.reston.va.us Fri Jun 11 19:43:35 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:43:35 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <199906110516.BAA02520@eric.cnri.reston.va.us> <14176.45761.801671.880774@cm-24-29-94-19.nycap.rr.com> Message-ID: <14177.19143.463951.778491@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> Oh, one other thing I forgot. Split (join) and splitfields SM> (joinfields) used to be different. They've been the same for SM> a long time now, long enough that I no longer recall how they SM> used to differ. I think it was only in the number of arguments they'd accept (at least that's what's implied by the module docos). SM> In making the leap from string module to SM> string methods, I suggest dropping the long names altogether. I agree. Thinking about it, I'm also inclined to not include startswith and endswith in the string module. -Barry From da at ski.org Fri Jun 11 19:42:59 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 10:42:59 -0700 (Pacific Daylight Time) Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <3761098D.A56F58A8@ifu.net> Message-ID: On Fri, 11 Jun 1999, Aaron Watters wrote: > I thought it would be good to be able to do the following loop with Numeric > arrays > > for x in array1: > array2[x] = array3[x] + array4[x] > > without any memory management being involved. Right now, I think the FYI, I think it should be done by writing: array2[array1] = array3[array1] + array4[array1] and doing "the right thing" in NumPy. In other words, I don't think the core needs to be involved. 
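As a present-day illustration of the whole-array form David describes (spelled here with today's numpy rather than the 1999 Numeric module, so the exact indexing syntax is an anachronism), the loop below creates a Python-level object per element while the one-liner keeps the work inside the extension:

    import numpy as np

    array1 = np.array([0, 2, 4, 1])           # index array
    array3 = np.arange(5)
    array4 = np.arange(5) * 10
    array2 = np.zeros(5, dtype=array3.dtype)

    # Element by element: each x and each intermediate sum is a
    # separately allocated object.
    for x in array1:
        array2[x] = array3[x] + array4[x]

    # Whole-array form: indexing, addition and assignment all run in C,
    # with no per-element object creation.
    array2[array1] = array3[array1] + array4[array1]
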
--david PS: I'm in the process of making the NumPy array objects ExtensionClasses, which will make the above much easier to do. From bwarsaw at cnri.reston.va.us Fri Jun 11 19:58:36 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:58:36 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> Message-ID: <14177.20044.69731.219173@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: >> Two new methods startswith and endswith act like their Java >> cousins. FL> is it just me, or do those method names suck? FL> begin? starts_with? startsWith? (ouch) FL> has_prefix? The inspiration was Java string objects, while trying to remain as Pythonic as possible (no mixed case). startswith and endswith doen't seem as bad as issubclass to me :) -Barry From bwarsaw at cnri.reston.va.us Fri Jun 11 20:06:22 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:06:22 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <008601beb3da$5e0a7a60$f29b12c2@pythonware.com> Message-ID: <14177.20510.818041.110989@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> fwiw, the Unicode module available from pythonware.com FL> implements them all, and more importantly, it can be com- FL> piled for either 8-bit or 16-bit characters... Are these separately available? I don't see them under downloads. Send me a URL, and if I can figure out how to get CVS to add files to the branch :/, maybe I can check this in so people can play with it. -Barry From tismer at appliedbiometrics.com Fri Jun 11 20:17:46 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 11 Jun 1999 20:17:46 +0200 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: Message-ID: <376152CA.B46A691E@appliedbiometrics.com> David Ascher wrote: > > On Fri, 11 Jun 1999, Aaron Watters wrote: > > > I thought it would be good to be able to do the following loop with Numeric > > arrays > > > > for x in array1: > > array2[x] = array3[x] + array4[x] > > > > without any memory management being involved. Right now, I think the > > FYI, I think it should be done by writing: > > array2[array1] = array3[array1] + array4[array1] > > and doing "the right thing" in NumPy. In other words, I don't think the > core needs to be involved. For NumPy, this is very ok, dealing with arrays in an array world. Without trying to repeat myself, I'd like to say that I still consider it an unsolved problem which is worth to be solved or to be proven unsolvable: How to do simple things in an efficient way with many tiny Python objects, without writing an extension, without rethinking a problem into APL like style, and without changing the language. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Fri Jun 11 20:22:36 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:22:36 -0400 (EDT) Subject: [Python-Dev] String methods... 
finally References: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: <14177.21484.126155.939932@anthem.cnri.reston.va.us> >> Perhaps join() ought to be a built-in function? IMO, builtin join ought to str()ify all the elements in the sequence, concatenating the results. That seems an intuitive interpretation of 'join'ing a sequence. Here's my Python prototype: def join(seq, sep=''): if not seq: return '' x = str(seq[0]) for y in seq[1:]: x = x + sep + str(y) return x Guido? -Barry From da at ski.org Fri Jun 11 20:24:34 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 11:24:34 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.21484.126155.939932@anthem.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Barry A. Warsaw wrote: > IMO, builtin join ought to str()ify all the elements in the sequence, > concatenating the results. That seems an intuitive interpretation of > 'join'ing a sequence. Here's my Python prototype: I don't get it -- why? I'd expect join(((1,2,3), (4,5,6))) to yield (1,2,3,4,5,6), not anything involving strings. --david From bwarsaw at cnri.reston.va.us Fri Jun 11 20:26:48 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:26:48 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> Message-ID: <14177.21736.100540.221487@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> I see the following functions in string.py that could SM> reasonably be methodized: SM> ljust, rjust, center, expandtabs, capwords Also zfill. What do you think, are these important enough to add? Maybe we can just drop in /F's implementation for these. -Barry From bwarsaw at cnri.reston.va.us Fri Jun 11 20:34:08 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:34:08 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14177.21484.126155.939932@anthem.cnri.reston.va.us> Message-ID: <14177.22176.328185.872134@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: DA> On Fri, 11 Jun 1999, Barry A. Warsaw wrote: >> IMO, builtin join ought to str()ify all the elements in the >> sequence, concatenating the results. That seems an intuitive >> interpretation of 'join'ing a sequence. Here's my Python >> prototype: DA> I don't get it -- why? DA> I'd expect join(((1,2,3), (4,5,6))) to yield (1,2,3,4,5,6), DA> not anything involving strings. Oh, just because I think it might useful, and would provide something that isn't easily provided with other constructs. Without those semantics join(((1,2,3), (4,5,6))) isn't much different than (1,2,3) + (4,5,6), or reduce(operator.add, ((1,2,3), (4,5,6))) as you point out. Since those latter two are easy enough to come up with, but str()ing the elements would require painful lambdas, I figured make the new built in do something new. -Barry From bwarsaw at cnri.reston.va.us Fri Jun 11 20:36:54 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:36:54 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> <14177.21736.100540.221487@anthem.cnri.reston.va.us> Message-ID: <14177.22342.320993.969742@anthem.cnri.reston.va.us> One other thing to think about. Where should this new methods be documented? 
I suppose we should reword the appropriate entries in modules-string and move them to typesseq-strings. What do you think Fred? -Barry From da at ski.org Fri Jun 11 20:36:32 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 11:36:32 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.22176.328185.872134@anthem.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Barry A. Warsaw wrote: Barry: > >> IMO, builtin join ought to str()ify all the elements in the > >> sequence, concatenating the results. Me: > I don't get it -- why? Barry: > Oh, just because I think it might useful, and would provide something > that isn't easily provided with other constructs. I do map(str, ...) all the time. My real concern is that there is nothing about the word 'join' which implies string conversion. Either call it joinstrings or don't do the conversion, I say. --david From bwarsaw at cnri.reston.va.us Fri Jun 11 20:42:27 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:42:27 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14177.22176.328185.872134@anthem.cnri.reston.va.us> Message-ID: <14177.22675.716917.331314@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: DA> My real concern is that there is nothing about the word 'join' DA> which implies string conversion. Either call it joinstrings DA> or don't do the conversion, I say. Can you say mapconcat() ? :) Or instead of join, just call it concat? -Barry From da at ski.org Fri Jun 11 20:46:19 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 11:46:19 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.22675.716917.331314@anthem.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Barry A. Warsaw wrote: > >>>>> "DA" == David Ascher writes: > > DA> My real concern is that there is nothing about the word 'join' > DA> which implies string conversion. Either call it joinstrings > DA> or don't do the conversion, I say. > > Can you say mapconcat() ? :) > > Or instead of join, just call it concat? Again, no. Concatenating sequences is what I think the + operator does. I think you need the letters S, T, and R in there... But I'm still not convinced of its utility. From guido at CNRI.Reston.VA.US Fri Jun 11 20:51:18 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 14:51:18 -0400 Subject: [Python-Dev] join() Message-ID: <199906111851.OAA04105@eric.cnri.reston.va.us> Given the heat in this discussion, I'm not sure if I endorse *any* of the proposals so far any more... How would Java do this? A static function in the String class, probably. The Python equivalent is... A function in the string module. So maybe string.join() it remains. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Fri Jun 11 21:08:11 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 15:08:11 -0400 (EDT) Subject: [Python-Dev] join() References: <199906111851.OAA04105@eric.cnri.reston.va.us> Message-ID: <14177.24219.94236.485421@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Given the heat in this discussion, I'm not sure if I Guido> endorse *any* of the proposals so far any more... Oh I dunno. David and I aren't throwing rocks at each other yet :) Guido> How would Java do this? A static function in the String Guido> class, probably. The Python equivalent is... 
A function Guido> in the string module. So maybe string.join() it remains. The only reason for making it a builtin would be to avoid pulling in all of string just to get join. But I guess we need to get some more experience using the methods before we know whether this is a real problem or not. as-good-as-a-from-string-import-join-and-easier-to-implement-ly y'rs, -Barry From skip at mojam.com Fri Jun 11 21:38:33 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 11 Jun 1999 15:38:33 -0400 (EDT) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.21484.126155.939932@anthem.cnri.reston.va.us> References: <199906110516.BAA02520@eric.cnri.reston.va.us> <14177.21484.126155.939932@anthem.cnri.reston.va.us> Message-ID: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com> Barry> IMO, builtin join ought to str()ify all the elements in the Barry> sequence, concatenating the results. That seems an intuitive Barry> interpretation of 'join'ing a sequence. Any reason why join should be a builtin and not a method available just to sequences? Would there some valid interpretation of join( {'a': 1} ) join( 1 ) ? If not, I vote for method-hood, not builtin-hood. Seems like you'd avoid some confusion (and some griping by Graham Matthews about how unpure it is ;-). Skip From skip at mojam.com Fri Jun 11 21:42:11 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 11 Jun 1999 15:42:11 -0400 (EDT) Subject: [Python-Dev] join() In-Reply-To: <14177.24219.94236.485421@anthem.cnri.reston.va.us> References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us> Message-ID: <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com> BAW> The only reason for making it a builtin would be to avoid pulling BAW> in all of string just to get join. I still don't understand the motivation for making it a builtin instead of a method of the types it operates on. Making it a builtin seems very un-object-oriented to me. Skip From guido at CNRI.Reston.VA.US Fri Jun 11 21:44:28 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 15:44:28 -0400 Subject: [Python-Dev] join() In-Reply-To: Your message of "Fri, 11 Jun 1999 15:42:11 EDT." <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com> References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us> <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com> Message-ID: <199906111944.PAA04277@eric.cnri.reston.va.us> > I still don't understand the motivation for making it a builtin instead of a > method of the types it operates on. Making it a builtin seems very > un-object-oriented to me. Because if you make it a method, every sequence type needs to know about joining strings. (This wouldn't be a problem in Smalltalk where sequence types inherit this stuff from an abstract sequence class, but in Python unfortunately that doesn't exist.) --Guido van Rossum (home page: http://www.python.org/~guido/) From da at ski.org Fri Jun 11 22:11:11 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 13:11:11 -0700 (Pacific Daylight Time) Subject: [Python-Dev] join() In-Reply-To: <199906111944.PAA04277@eric.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Guido van Rossum wrote: > > I still don't understand the motivation for making it a builtin instead of a > > method of the types it operates on. Making it a builtin seems very > > un-object-oriented to me. 
> > Because if you make it a method, every sequence type needs to know > about joining strings. It still seems to me that we could do something like F/'s proposal, where sequences can define a join() method, which could be optimized if the first element is a string to do what string.join, by placing the class method in an instance method of strings, since string joining clearly has to involve at least one string. Pseudocode: class SequenceType: def join(self, separator=None): if hasattr(self[0], '__join__') # covers all types which can be efficiently joined if homogeneous return self[0].__join__(self, separator) # for the rest: if separator is None: return map(operator.add, self) result = self[0] for element in self[1:]: result = result + separator + element return result where the above would have to be done in abstract.c, with error handling, etc. and with strings (regular and unicode) defining efficient __join__'s as in: class StringType: def join(self, separator): raise AttributeError, ... def __join__(self, sequence): return string.join(sequence) # obviously not literally that =) class UnicodeStringType: def __join__(self, sequence): return unicode.join(sequence) (in C, of course). Yes, it's strange to fake class methods with instance methods, but it's been done before =). Yes, this means expanding what it means to "be a sequence" -- is that impossible without breaking lots of code? --david From gmcm at hypernet.com Fri Jun 11 23:30:10 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Fri, 11 Jun 1999 16:30:10 -0500 Subject: [Python-Dev] String methods... finally In-Reply-To: References: <14177.22675.716917.331314@anthem.cnri.reston.va.us> Message-ID: <1282985631-84109501@hypernet.com> David Ascher wrote: > Barry Warsaw wrote: > > Or instead of join, just call it concat? > > Again, no. Concatenating sequences is what I think the + operator > does. I think you need the letters S, T, and R in there... But I'm > still not convinced of its utility. But then Q will feel left out, and since Q doesn't go anywhere without U, pretty soon you'll have the whole damn alphabet in there. I-draw-the-line-at-$-well-$-&- at -but-definitely-not-#-ly y'rs - Gordon From MHammond at skippinet.com.au Sat Jun 12 00:49:29 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Sat, 12 Jun 1999 08:49:29 +1000 Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.20510.818041.110989@anthem.cnri.reston.va.us> Message-ID: <006801beb45c$aab5baa0$0801a8c0@bobcat> > Are these separately available? I don't see them under downloads. > Send me a URL, and if I can figure out how to get CVS to add files to > the branch :/, maybe I can check this in so people can play with it. Fredrik and I have spoken about this. He will dust it off and integrate some patches in the next few days. He will then send it to me to make sure the patches I made for Windows CE all made it OK, then one of us will integrate it with the branch and send it on... Mark. From tim_one at email.msn.com Sat Jun 12 02:56:03 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 11 Jun 1999 20:56:03 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.21736.100540.221487@anthem.cnri.reston.va.us> Message-ID: <000401beb46e$58b965a0$5ba22299@tim> [Skip Montanaro] > I see the following functions in string.py that could > reasonably be methodized: > > ljust, rjust, center, expandtabs, capwords > > Also zfill. > [Barry A. Warsaw] > What do you think, are these important enough to add? 
I think lack-of-surprise (gratuitous orthogonality ) was the motive here. If Guido could drop string functions in 2.0, which would he be happy to forget? Give him a head start. ljust and rjust were used often a long time ago, before the "%" sprintf-like operator was introduced; don't think I've seen new code use them in years. center was a nice convenience in the pre-HTML world, but probably never speed-critical and easy to write yourself. expandtabs is used frequently in IDLE and even pyclbr.py now. Curiously, though, they almost never want the tab-expanded string, but rather its len. capwords could become an absolute nightmare in a Unicode world <0.5 wink>. > Maybe we can just drop in /F's implementation for these. Sounds like A Plan to me. Wouldn't mourn the passing of the first three. and-i-even-cried-at-my-father's-funeral-ly y'rs - tim From tim_one at email.msn.com Sat Jun 12 08:19:33 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 02:19:33 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <199906110140.VAA02180@eric.cnri.reston.va.us> Message-ID: <000001beb49b$8a94f120$b19e2299@tim> [GvR] > (I sometimes wished I wasn't in the business of making releases. I've > asked for help with making essential patches to 1.5.2 available but > nobody volunteered... :-( ) It's kinda baffling "out here" -- checkin comments usually say what a patch does, but rarely make a judgment about a patch's importance. Sorting thru hundreds of patches without a clue is a pretty hopeless task. Perhaps future checkins that the checker-inner feels are essential could be commented as such in a machine-findable way? an-ounce-of-foresight-is-worth-a-sheet-of-foreskin-or-something-like-that-ly y'rs - tim From tim_one at email.msn.com Sat Jun 12 08:19:37 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 02:19:37 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <199906101411.KAA29962@eric.cnri.reston.va.us> Message-ID: <000101beb49b$8c27c620$b19e2299@tim> [Aaron, describes a scheme where objects are represented by a fixed-size (typecode, variant) pair, where if the typecode is e.g. INT or FLOAT the variant is the value directly instead of a pointer to the value] [Guido] > What you're describing is very close to what I recall I once read > about the runtime organization of Icon. At the lowest level it's exactly what Icon does. It does *not* exempt ints from Icon's flavor of dynamic memory management, but Icon doesn't use refcounting -- it uses compacting mark-&-sweep across some 5 distinct regions each with their own finer-grained policies (e.g., strings are central to Icon and so it manages the string region a little differently; and Icon coroutines save away pieces of the platform's C stack so need *very* special treatment). So: 1) There are no incref/decref expenses anywhere in Icon. 2) Because of compaction, all allocations cost the same and are dirt cheap: just increment the appropriate region's "avail" pointer by the number of bytes you need. If there aren't enough bytes, run GC and try again. If there still aren't enough bytes, Icon usually shuts down (it's not good at asking the OS for more memory! it carves up its initial memory in pretty rigid ways, and relies on tricks like comparing storage addresses to speed M&S and compaction -- those "regions" are in a fixed order relative to each other, so new memory can't be tacked on to a region except at the low and high ends). 
3) All the expense is in finding and compacting live objects, so in an odd literal sense cleaning up trash comes for free. 4) Icon has no finalizers, so it doesn't need to identify or preserve trash -- compaction simply overwrites "the holes" where the trash used to be. Icon is nicely implemented, but it's a "self-contained universe" view of the world and its memory approach makes life hard for the tiny handful of folks who have *tried* to make it extendable via C. Icon is also purely procedural -- no OO, no destructors, no resurrection. Irony: one reason I picked up Python in '91 is that my int-fiddling code was too slow in Icon! Even Python 0.9.0 ran int algorithms significantly faster than the 10-years-refined Icon implementation of that time. Never looked into why, but now that Aaron brought up the issue I find it very surprising! Those algorithms had a huge rate of int trash creation, but very few persistent objects, so Icon's M&S should have run like the wind. And Icon's allocation is dirt-cheap (at least as fast as Python's fastest special-purpose allocators), and didn't have any refcounting expenses either. There's an important lesson *somewhere* in that . Maybe it was the fault of Icon's "goal-directed" expression evaluation, constantly asking "did this int succeed or fail?", "did that add suceed or fail?", etc. > ... > The Icon approach (i.e. yours) seems to require a complete rethinking > of all object implementations and all APIs at the C level -- perhaps > we could think about it for Python 2.0. Some ramifications: > > - Uses more memory for highly shared objects (there are as many copies > of the type pointer as there are references). Actually more than that in Icon: if the "variant" part is a pointer, the first word of the block it points to is also a copy of the typecode (turns out the redundancy speeds the GC). > - Thus, lists take double the memory assuming they reference objects > that also exist elsewhere. This affects the performance of slices > etc. > > - On the other hand, a list of ints takes half the memory (given that > most of those ints are not shared). Isn't this 2/3 rather than 1/2? I'm picturing a list element today as essentially a pointer to a type object pointer + int (3 units in all), and a type object pointer + int (2 units in all) "tomorrow". Throw in refcounts too and the ratio likely gets closer to 1. > - *Homogeneous* lists (where all elements have the same type -- > i.e. arrays) can be represented more efficiently by having only one > copy of the type pointer. This was an idea for ABC (whose type system > required all container types to be homogenous) that was never > implemented (because in practice the type check wasn't always applied, > and the top-level namespace used by the interactive command > interpreter violated all the rules). Well, Python already has homogeneous int lists (array.array), and while they save space they suffer in speed due to needing to wrap raw ints "in an object" upon reference and unwrap them upon storage. > - Reference count manipulations could be done by a macro (or C++ > behind-the-scense magic using copy constructors and destructors) that > calls a function in the type object -- i.e. each object could decide > on its own reference counting implementation :-) You don't need to switch representations to get that, though, right? That is, I don't see anything stopping today's type objects from growing __incref__ and __decref__ slots -- except for common sense . 
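As a concrete reminder of the wrap/unwrap cost Tim mentions, here is the array.array case in miniature; the space saving is real, but every access in the loop still has to materialise a full int object (a quick sketch, nothing more):

    import array

    ints = array.array('i', range(1000))   # raw C ints, a few bytes each
    objs = list(range(1000))                # full int objects plus list slots

    total = 0
    for x in ints:
        # x is a freshly wrapped int object on every iteration,
        # even though the underlying storage is unboxed.
        total = total + x
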
An apparent ramification I don't see above that may actually be worth something : - In "i = j + k", the eval stack could contain the ints directly, instead of pointers to the ints. So fetching the value of i takes two loads (get the type pointer + the variant) from adjacent stack locations, instead of today's load-the-pointer + follow-the-pointer (to some other part of memory); similarly for fetching the value of j. Then the sum can be stored *directly* into the stack too, without today's need for allocating and wrapping it in "an int object" first. Possibly happy variant: on top of the above, *don't* exempt ints from refcounting. Let 'em incref and decref like everything else. Give them an intial refcount of max_count/2, and in the exceedingly unlikely event a decref on an int ever sees zero, the int "destructor" simply resets the refcount to max_count/2 and is otherwise a nop. semi-thinking-semi-aloud-ly y'rs - tim From ping at lfw.org Sat Jun 12 10:05:06 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Sat, 12 Jun 1999 01:05:06 -0700 (PDT) Subject: [Python-Dev] String methods... finally In-Reply-To: <004601beb409$8c535750$f29b12c2@pythonware.com> Message-ID: On Fri, 11 Jun 1999, Fredrik Lundh wrote: > > Two new methods startswith and endswith act like their Java cousins. > > is it just me, or do those method names suck? > > begin? starts_with? startsWith? (ouch) > has_prefix? I'm quite happy with "startswith" and "endswith". I mean, they're a bit long, i suppose, but i can't think of anything better. You definitely want to avoid has_prefix, as that compounds the has_key vs. hasattr issue. x.startswith("foo") x[:3] == "foo" x.startswith(y) x[:len(y)] == y Hmm. I guess it doesn't save you much typing until y is an expression. But it's still a lot easier to read. !ping From ping at lfw.org Sat Jun 12 10:12:38 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Sat, 12 Jun 1999 01:12:38 -0700 (PDT) Subject: [Python-Dev] join() In-Reply-To: <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com> Message-ID: On Fri, 11 Jun 1999, Skip Montanaro wrote: > > BAW> The only reason for making it a builtin would be to avoid pulling > BAW> in all of string just to get join. > > I still don't understand the motivation for making it a builtin instead of a > method of the types it operates on. Making it a builtin seems very > un-object-oriented to me. Builtin-hood makes it possible for one method to apply to many types (or a heterogeneous list of things). I think i'd support the def join(list, sep=None): if sep is None: result = list[0] for item in list[1:]: result = result + item else: result = list[0] for item in list[1:]: result = result + sep + item idea, basically a reduce(operator.add...) with an optional separator -- *except* my main issue would be to make sure that the actual implementation optimizes the case of joining a list of strings. string.join() currently seems like the last refuge for those wanting to avoid O(n^2) time when assembling many small pieces in string buffers, and i don't want it to see it go away. !ping From fredrik at pythonware.com Sat Jun 12 11:13:59 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 12 Jun 1999 11:13:59 +0200 Subject: [Python-Dev] String methods... 
finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us><008601beb3da$5e0a7a60$f29b12c2@pythonware.com> <14177.20510.818041.110989@anthem.cnri.reston.va.us> Message-ID: <00c301beb4b3$e84e3de0$f29b12c2@pythonware.com> > FL> fwiw, the Unicode module available from pythonware.com > FL> implements them all, and more importantly, it can be com- > FL> piled for either 8-bit or 16-bit characters... > > Are these separately available? I don't see them under downloads. > Send me a URL, and if I can figure out how to get CVS to add files to > the branch :/, maybe I can check this in so people can play with it. it's under: http://www.pythonware.com/madscientist/index.htm but I've teamed up with Mark H. to update the stuff a bit, test it with his CE port, and produce a set of patches. I'm working on this in this very moment. btw, as for the "missing methods in the string type" issue, my suggestion is to merge the source code into a unified string module, which is compiled twice (or three times, the day we find that we need a 32-bit string type). don't waste any time cutting and pasting until we've sorted that one out... From fredrik at pythonware.com Sat Jun 12 11:31:08 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 12 Jun 1999 11:31:08 +0200 Subject: [Python-Dev] String methods... finally References: <000401beb46e$58b965a0$5ba22299@tim> Message-ID: <00fb01beb4b6$4df59420$f29b12c2@pythonware.com> > expandtabs is used frequently in IDLE and even pyclbr.py now. Curiously, > though, they almost never want the tab-expanded string, but rather its len. looked in stropmodule.c lately: static PyObject * strop_expandtabs(self, args) ... /* First pass: determine size of output string */ ... /* Second pass: create output string and fill it */ ... (btw, I originally wrote that code for pythonworks ;-) how about an "expandtabslength" method? or maybe we should add lazy evaluation of strings! From fredrik at pythonware.com Sat Jun 12 11:49:07 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 12 Jun 1999 11:49:07 +0200 Subject: [Python-Dev] join() References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us> Message-ID: <014001beb4b9$63f1e820$f29b12c2@pythonware.com> > The only reason for making it a builtin would be to avoid pulling in > all of string just to get join. another reason is that you might be able to avoid a unicode module... From tismer at appliedbiometrics.com Sat Jun 12 15:27:45 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sat, 12 Jun 1999 15:27:45 +0200 Subject: [Python-Dev] More flexible namespaces. References: <008d01be92b2$c56ef5d0$0801a8c0@bobcat> <199904300300.XAA00608@eric.cnri.reston.va.us> <37296096.D0C9C2CC@appliedbiometrics.com> <199904301517.LAA01422@eric.cnri.reston.va.us> Message-ID: <37626051.C1EA8AE0@appliedbiometrics.com> Guido van Rossum wrote: > > > From: Christian Tismer > > > I'd really like to look into that. > > Also I wouldn't worry too much about speed, since this is > > such a cool feature. It might even be a speedup in some cases > > which otherwise would need more complex handling. > > > > May I have a look? > > Sure! > > (I've forwarded Christian the files per separate mail.) > > I'm also interested in your opinion on how well thought-out and robust > the patches are -- I've never found the time to do a good close > reading of them. Coming back from the stackless task with is finished now, I popped this task from my stack. 
I had a look and it seems well-thought and robust so far. To make a more trustable claim, I would need to build and test it. Is this still of interest, or should I drop it? The follow-ups in this thread indicated that the opinions about flexible namespaces were quite mixed. So, should I waste time in building and testing or better save it? chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Sat Jun 12 19:16:28 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Sat, 12 Jun 1999 13:16:28 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <008601beb3da$5e0a7a60$f29b12c2@pythonware.com> <14177.20510.818041.110989@anthem.cnri.reston.va.us> <00c301beb4b3$e84e3de0$f29b12c2@pythonware.com> Message-ID: <14178.38380.734976.164568@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> btw, as for the "missing methods in the string type" FL> issue, my suggestion is to merge the source code into FL> a unified string module, which is compiled twice (or FL> three times, the day we find that we need a 32-bit FL> string type). don't waste any time cutting and FL> pasting until we've sorted that one out... Very good. Give me the nod when the sorting algorithm halts. From tim_one at email.msn.com Sat Jun 12 20:28:13 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 14:28:13 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com> Message-ID: <000101beb501$55fb9b60$ce9e2299@tim> [Skip Montanaro] > Any reason why join should be a builtin and not a method available just > to sequences? Would there some valid interpretation of > > join( {'a': 1} ) > join( 1 ) > > ? If not, I vote for method-hood, not builtin-hood. Same here, except as a method we've got it twice backwards : it should be a string method, but a method of the *separator*: sep.join(seq) same as convert each elt in seq to a string of the same flavor as sep, then paste the converted strings together with sep between adjacent elements So " ".join(list) delivers the same result as today's string.join(map(str, list), " ") and L" ".join(list) does much the same tomorrow but delivers a Unicode string (or is the "L" for Lundh string ?). It looks odd at first, but the more I play with it the more I think it's "the right thing" to do: captures everything that's done today, plus the most common idiom (mapping str first across the sequence) on top of that, adapts seamlessly (from the user's view) to new string types, and doesn't invite uselessly redundant generalization to non-sequence types. One other attraction perhaps unique to me: I can never remember whether string.join's default separator is a blank or a null string! Explicit is better than implicit . the-heart-of-a-join-is-the-glue-ly y'rs - tim From tim_one at email.msn.com Sat Jun 12 20:28:18 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 14:28:18 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <00fb01beb4b6$4df59420$f29b12c2@pythonware.com> Message-ID: <000201beb501$578548a0$ce9e2299@tim> [Tim] > expandtabs is used frequently in IDLE and even pyclbr.py now. 
> Curiously, though, they almost never want the tab-expanded string, > but rather its len. [/F] > looked in stropmodule.c lately: > > static PyObject * > strop_expandtabs(self, args) > ... > /* First pass: determine size of output string */ > ... > /* Second pass: create output string and fill it */ > ... > > (btw, I originally wrote that code for pythonworks ;-) Yes, it's nice code! The irony was the source of my "curiously" . > how about an "expandtabslength" method? Na, it's very specialized, easy to spell by hand, and even IDLE/pyclbr don't really need more speed in this area. From tim_one at email.msn.com Sat Jun 12 23:37:08 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 17:37:08 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <3761098D.A56F58A8@ifu.net> Message-ID: <000501beb51b$b9cb3780$ce9e2299@tim> [Aaron Watters] > ... > I thought it would be good to be able to do the following loop > with Numeric arrays > > for x in array1: > array2[x] = array3[x] + array4[x] > > without any memory management being involved. Right now, I think the > for loop has to continually dynamically allocate each new x Actually not, it just binds x to the sequence of PyObject*'s already in array1, one at a time. It does bump & drop the refcount on that object a lot. Also irksome is that it keeps allocating/deallocating a little integer on each trip, for the under-the-covers loop index! Marc-Andre (I think) had/has a patch to worm around that, but IIRC it didn't make much difference (wouldn't expect it to, though -- not if the loop body does any real work). One thing a smarter Python compiler could do is notice the obvious : the *internal* incref/decref operations on the object denoted by x in the loop above must cancel out, so there's no need to do any of them. "internal" == those due to the routine actions of the PVM itself, while pushing and popping the eval stack. Exploiting that is tedious; e.g., inventing a pile of opcode variants that do the same thing as today's except skip an incref here and a decref there. > and intermediate sum (and immediate deallocate them) The intermediate sum is allocated each time, but not deallocated (the pre-existing object at array2[x] *may* be deallocated, though). > and that makes the loop piteously slow. A lot of things conspire to make it slow. David is certainly right that, in this particular case, array2[array1] = array3[array1] + etc worms around the worst of them. > The idea replacing pyobject *'s with a struct [typedescr *, data *] > was a space/time tradeoff to speed up operations like the above > by eliminating any need for mallocs or other memory management.. Fleshing out details may make it look less attractive. For machines where ints are no wider than pointers, the "data *" can be replaced with the int directly and then there's real potential. If for a float the "data*" really *is* a pointer, though, what does it point *at*? Some dynamically allocated memory to hold the float appears to be the only answer, and you're right back at the problem you were hoping to avoid. Make the "data*" field big enough to hold a Python float directly, and the descriptor likely zooms to 128 bits (assuming float is IEEE double and the machine requires natural alignment). Let's say we do that. Where does the "+" implementation get the 16 bytes it needs to store its result? The space presumably already exists in the slot indexed by array2[x], but the "+" implementation has no way to *know* that. 
Figuring it out requires non-local analysis, which is quite a few steps beyond what Python's compiler can do today. Easiest: internal functions all grow a new PyDescriptor* argument into which they are to write their result's descriptor. The PVM passes "+" the address of the slot indexed by array2[x] if it's smart enough; or, if it's not, the address of the stack slot descriptor into which today's PVM *would* push the result. In the latter case the PVM would need to copy those 16 bytes into the slot indexed by array2[x] later. Neither of those are simple as they sound, though, at least because if array2[x] holds a descriptor with a real pointer in its variant half, the thing to which it points needs to get decref'ed iff the add succeeds. It can get very messy! > I really can't say whether it'd be worth it or not without some sort of > real testing. Just a thought. It's a good thought! Just hard to make real. but-if-michael-hudson-keeps-hacking-at-bytecodes-and-christian- keeps-trying-to-prove-he's-crazier-than-michael-by-2001- we'll-be-able-to-generate-optimized-vector-assembler-for- it-ly y'rs - tim From tim_one at email.msn.com Sat Jun 12 23:37:14 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 17:37:14 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <375FC062.62850DE5@ifu.net> Message-ID: <000601beb51b$bc723ba0$ce9e2299@tim> [Aaron Watters] > ... > Another fix would be to put the refcount in the static side with > no speed penalty > > (typedescr > repr* ----------------------> data > refcount > ) > > but would that be wasteful of space? The killer is for types where repr* is a real pointer: x = [Whatever()] y = x[:] Now we have two physically distinct descriptors pointing at the same thing, and so also two distinct refcounts for that thing -- impossible to keep them in synch efficiently; "del y" has no way efficient way to find the refcount hiding in x. tbings-and-and-their-refcounts-are-monogamous-ly y'rs - tim From bwarsaw at cnri.reston.va.us Sun Jun 13 19:56:33 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Sun, 13 Jun 1999 13:56:33 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com> <000101beb501$55fb9b60$ce9e2299@tim> Message-ID: <14179.61649.286195.248429@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Same here, except as a method we've got it twice backwards TP> : it should be a string method, but a method of the TP> *separator*: TP> sep.join(seq) TP> same as | convert each elt in seq to a string of the same flavor as | sep, then paste the converted strings together with sep | between adjacent elements TP> So TP> " ".join(list) TP> delivers the same result as today's TP> string.join(map(str, list), " ") TP> and TP> L" ".join(list) TP> does much the same tomorrow but delivers a Unicode string (or TP> is the "L" for Lundh string ?). TP> It looks odd at first, but the more I play with it the more I TP> think it's "the right thing" to do At first glance, I like this proposal a lot. I'd be happy to code it up if David'll stop throwing those rocks. Whether or not they hit me, they still hurt :) -Barry From tim_one at email.msn.com Sun Jun 13 21:34:57 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 13 Jun 1999 15:34:57 -0400 Subject: [Python-Dev] String methods... 
finally In-Reply-To: <14179.61649.286195.248429@anthem.cnri.reston.va.us> Message-ID: <000801beb5d3$d1fd06e0$ae9e2299@tim> > >>>>> "TP" == Tim Peters writes: > > TP> Same here, except as a method we've got it twice backwards > TP> : it should be a string method, but a method of the > TP> *separator*: > > TP> sep.join(seq) > > TP> same as > > | convert each elt in seq to a string of the same flavor as > | sep, then paste the converted strings together with sep > | between adjacent elements > > TP> So > > TP> " ".join(list) > > TP> delivers the same result as today's > > TP> string.join(map(str, list), " ") > > TP> and > > TP> L" ".join(list) > > TP> does much the same tomorrow but delivers a Unicode string (or > TP> is the "L" for Lundh string ?). > > TP> It looks odd at first, but the more I play with it the more I > TP> think it's "the right thing" to do Barry, did it ever occur to you to that this fancy Emacs quoting is pig ugly ? [Barry A. Warsaw] > At first glance, I like this proposal a lot. That's a bit scary -- even I didn't like it at first glance. It kept growing on me, though, especially after a trivial naming trick: space, tab, null = ' ', '\t', '' ... sentence = space.join(list) table = tab.join(list) squashed = null.join(list) That's so beautifully self-descriptive I cried! Well, I actually jerked my leg and stubbed my little toe badly, but it's healing nicely, thank you. Note the naturalness too of creating zippier bound method objects for the kinds of join you're doing most often: spacejoin = ' '.join tabjoin = '\t'.join etc. I still like it more the more I play with it. > I'd be happy to code it up if David'll stop throwing those rocks. David warmed up to it in pvt email (his first response was the expected one-liner "Wacky!"). Other issues: + David may want C.join(T) generalized to other classes C and argument types T. So far my response to all such generalizations has been "wacky!" , but I don't think that bears one way or t'other on whether StringType.join(SequenceType) makes good sense on its own. + string.join(seq) doesn't currently convert seq elements to string type, and in my vision it would. At least three of us admit to mapping str across seq anyway before calling string.join, and I think it would be a nice convenience: I think there's no confusion because there's nothing sensible string.join *could* do with a non-string seq element other than convert it to string. The primary effect of string.join griping about a non-string seq element today is that my if not ok: sys.__stdout__.write("not ok, args are " + string.join(args) + "\n") debugging output blows up instead of being helpful <0.8 wink>. If Guido is opposed to being helpful, though , the auto-convert bit isn't essential. > Whether or not they hit me, they still hurt :) I know they do, Barry. That's why I never throw rocks at you. If you like, I'll have a word with David's ISP. if-this-was-a-flame-war-we're-too-civilized-to-live-long-enough-to- reproduce-ly y'rs - tim From da at ski.org Sun Jun 13 21:48:59 1999 From: da at ski.org (David Ascher) Date: Sun, 13 Jun 1999 12:48:59 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14179.61649.286195.248429@anthem.cnri.reston.va.us> Message-ID: On Sun, 13 Jun 1999, Barry A. Warsaw wrote: > At first glance, I like this proposal a lot. I'd be happy to code it > up if David'll stop throwing those rocks. Whether or not they hit me, > they still hurt :) I like it too, since you ask. 
=) (When you get a chance, could you bring the rocks back? I only have a limited supply. Thanks). --david From guido at CNRI.Reston.VA.US Mon Jun 14 16:46:34 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 14 Jun 1999 10:46:34 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Sat, 12 Jun 1999 14:28:13 EDT." <000101beb501$55fb9b60$ce9e2299@tim> References: <000101beb501$55fb9b60$ce9e2299@tim> Message-ID: <199906141446.KAA00733@eric.cnri.reston.va.us> > Same here, except as a method we've got it twice backwards : it > should be a string method, but a method of the *separator*: > > sep.join(seq) Funny, but it does seem right! Barry, go for it... --Guido van Rossum (home page: http://www.python.org/~guido/) From klm at digicool.com Mon Jun 14 17:09:58 1999 From: klm at digicool.com (Ken Manheimer) Date: Mon, 14 Jun 1999 11:09:58 -0400 Subject: [Python-Dev] String methods... finally Message-ID: <613145F79272D211914B0020AFF640191D1BAF@gandalf.digicool.com> > [Skip Montanaro] > > I see the following functions in string.py that could > > reasonably be methodized: > > > > ljust, rjust, center, expandtabs, capwords > > > > Also zfill. > > > > [Barry A. Warsaw] > > What do you think, are these important enough to add? I think expandtabs is worthwhile. Though i wouldn't say i use it frequently, when i do use it i'm thankful it's there - it's something i'm really glad to have precooked, since i'm generally not looking for the distraction when i do happen to need it... Ken klm at digicool.com From guido at CNRI.Reston.VA.US Mon Jun 14 17:12:33 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 14 Jun 1999 11:12:33 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: Your message of "Sat, 12 Jun 1999 02:19:37 EDT." <000101beb49b$8c27c620$b19e2299@tim> References: <000101beb49b$8c27c620$b19e2299@tim> Message-ID: <199906141512.LAA00793@eric.cnri.reston.va.us> [me] > > - Thus, lists take double the memory assuming they reference objects > > that also exist elsewhere. This affects the performance of slices > > etc. > > > > - On the other hand, a list of ints takes half the memory (given that > > most of those ints are not shared). [Tim] > Isn't this 2/3 rather than 1/2? I'm picturing a list element today as > essentially a pointer to a type object pointer + int (3 units in all), and a > type object pointer + int (2 units in all) "tomorrow". Throw in refcounts > too and the ratio likely gets closer to 1. An int is currently 3 units: type, refcnt, value. (The sepcial int allocator means that there's no malloc overhead.) A list item is one unit. So a list of N ints is 4N units (+ overhead). In the proposed scheme, there would be 2 units. That makes a factor of 1/2 for me... > Well, Python already has homogeneous int lists (array.array), and while they > save space they suffer in speed due to needing to wrap raw ints "in an > object" upon reference and unwrap them upon storage. Which would become faster with the proposed scheme since it would not require any heap allocation (presuming 2-unit structs can be passed around as function results). > > - Reference count manipulations could be done by a macro (or C++ > > behind-the-scense magic using copy constructors and destructors) that > > calls a function in the type object -- i.e. each object could decide > > on its own reference counting implementation :-) > > You don't need to switch representations to get that, though, right? 
That > is, I don't see anything stopping today's type objects from growing > __incref__ and __decref__ slots -- except for common sense . Eh, indeed . > An apparent ramification I don't see above that may actually be worth > something : > > - In "i = j + k", the eval stack could contain the ints directly, instead of > pointers to the ints. So fetching the value of i takes two loads (get the > type pointer + the variant) from adjacent stack locations, instead of > today's load-the-pointer + follow-the-pointer (to some other part of > memory); similarly for fetching the value of j. Then the sum can be stored > *directly* into the stack too, without today's need for allocating and > wrapping it in "an int object" first. I though this was assumed all the time? I mentioned "no heap allocation" above before I read this. I think this is the reason why it was proposed at all: things for which the value fits in a unit don't live on the heap at all, *without* playing tricks with pointer representations. > Possibly happy variant: on top of the above, *don't* exempt ints from > refcounting. Let 'em incref and decref like everything else. Give them an > intial refcount of max_count/2, and in the exceedingly unlikely event a > decref on an int ever sees zero, the int "destructor" simply resets the > refcount to max_count/2 and is otherwise a nop. Don't get this -- there's no object on the heap to hold the refcnt. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Mon Jun 14 20:47:32 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 14 Jun 1999 14:47:32 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us> <000801beb5d3$d1fd06e0$ae9e2299@tim> Message-ID: <14181.20036.857729.999835@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: Timbot> Barry, did it ever occur to you to that this fancy Emacs Timbot> quoting is pig ugly ? wink> + string.join(seq) doesn't currently convert seq elements to wink> string type, and in my vision it would. At least three of wink> us admit to mapping str across seq anyway before calling wink> string.join, and I think it would be a nice convenience: Check the CVS branch. It does seem pretty cool! From bwarsaw at cnri.reston.va.us Mon Jun 14 20:48:10 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 14 Jun 1999 14:48:10 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us> Message-ID: <14181.20074.728230.764485@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: DA> (When you get a chance, could you bring the rocks back? I DA> only have a limited supply. Thanks). Sorry, I need them to fill up the empty spaces in my skull. -Barry From tim_one at email.msn.com Tue Jun 15 04:50:08 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 14 Jun 1999 22:50:08 -0400 Subject: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <14181.20036.857729.999835@anthem.cnri.reston.va.us> Message-ID: <000001beb6d9$c82e7980$069e2299@tim> >> wink> + string.join(seq) [etc] [Barry] > Check the CVS branch. It does seem pretty cool! It's even more fun to play with than to argue about . Thank you, Barry! 
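For readers skimming the thread, here is a rough pure-Python sketch of the semantics being played with -- join_with is an invented stand-in for the new method, and the str() step is the auto-convert variant, which is exactly the part still being argued about further down:

    def join_with(sep, seq):
        # sep.join(seq): paste the elements together with sep between
        # adjacent elements, in a single pass over the sequence.
        pieces = list(map(str, seq))    # the contested auto-str() step
        if not pieces:
            return ""
        result = pieces[0]
        for piece in pieces[1:]:
            result = result + sep + piece
        return result

    space, tab, null = ' ', '\t', ''
    assert join_with(space, ["to", "have", "loved"]) == "to have loved"
    assert join_with(null, ["a", "b", "c"]) == "abc"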
A bug: >>> 'ab'.endswith('b',0,1) # right 0 >>> 'ab'.endswith('ab',0,1) # wrong 1 >>> 'ab'.endswith('ab',0,0) # wrong 1 >>> Two legit compiler warnings from a previous checkin: Objects\intobject.c(236) : warning C4013: 'isspace' undefined; assuming extern returning int Objects\intobject.c(243) : warning C4013: 'isalnum' undefined; assuming extern returning int One docstring glitch ("very" -> "every"): >>> print ''.join.__doc__ S.join(sequence) -> string Return a string which is the concatenation of the string representation of very element in the sequence. The separator between elements is S. >>> "-".join("very nice indeed! ly".split()) + " y'rs - tim" From MHammond at skippinet.com.au Tue Jun 15 05:13:03 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 15 Jun 1999 13:13:03 +1000 Subject: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <000001beb6d9$c82e7980$069e2299@tim> Message-ID: <00e901beb6dc$fc830d60$0801a8c0@bobcat> > "-".join("very nice indeed! ly".split()) + " y'rs - tim" But now the IDLE "CallTips" extenion seems lame. Typing >>> " ".join( doesnt yield the help, where: >>> s=" "; s.join( does :-) Very cute, I must say. The biggest temptation is going to be, as I mentioned, avoiding the use of this stuff for "general" code. Im still unconvinced the "sep.join" concept is natural, but string methods in general sure as hell are. Guido almost hinted that post 1.5.2 interim release(s?) would be acceptable, so long as he didnt have to do it! Im tempted to volunteer to agree to do something for Windows, and if no other platform biggots volunteer, I wont mind in the least :-) I realize it still needs settling down, but this is too good to keep to "ourselves" (being CVS enabled people) for too long ;-) Mark. From tim_one at email.msn.com Tue Jun 15 07:29:03 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 01:29:03 -0400 Subject: [Python-Dev] RE: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <199906141512.LAA00793@eric.cnri.reston.va.us> Message-ID: <000a01beb6ef$fac66ea0$069e2299@tim> [Guido] >>> - On the other hand, a list of ints takes half the memory (given that >>> most of those ints are not shared). [Tim] >> Isn't this 2/3 rather than 1/2? [yadda yadda] [Guido] > An int is currently 3 units: type, refcnt, value. (The sepcial int > allocator means that there's no malloc overhead.) A list item is one > unit. So a list of N ints is 4N units (+ overhead). In the proposed > scheme, there would be 2 units. That makes a factor of 1/2 for me... Well, if you count the refcount, sure . Moving on, implies you're not contemplating making the descriptor big enough to hold a float (else it would still be 4 units assuming natural alignment), in turn implying that *only* ints would get the space advantage in lists/tuples? Plus maybe special-casing the snot out of short strings? >> Well, Python already has homogeneous int lists (array.array), >> and while they save space they suffer in speed ... > Which would become faster with the proposed scheme since it would not > require any heap allocation (presuming 2-unit structs can be passed > around as function results). They can be in any std (even reasonable) C (or C++). If this gets serious, though, strongly suggest timing it on important compiler + platform combos, especially RISC. 
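To pin down the 1/2-versus-2/3 arithmetic from the exchange above (back-of-the-envelope only; a "unit" here is one machine word, as in Guido's accounting):

    # Today: an unshared int is (type pointer, refcount, value) on the
    # heap, plus the one-word pointer sitting in the list slot.
    today = 3 + 1
    # Proposed: the list slot itself holds (type pointer, value) and
    # there is no separate heap object at all.
    proposed = 2

    print(float(proposed) / today)   # 0.5   -- Guido's factor of 1/2
    print(2 / 3.0)                   # ~0.67 -- Tim's 2/3, refcount left out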
You can probably *count* on a PyObject* result getting returned in a register, but depressed C++ compiler jockeys have been known to treat struct/class returns via an unoptimized chain of copy constructors. Probably better to allocate "result space" in the caller and pass that via reference to the callee. With care, you can get the result written into its final resting place efficiently then, more efficiently than even a gonzo globally optimizing compiler could figure out (A calls B call C calls D, and A can tell D exactly where to store the result if it's explicit). >> [other ramifications for >> "i = j + k" >> ] > I though this was assumed all the time? Apparently it was! At least by you . Now by me too; no problem. >> [refcount-on-int drivel] > Don't get this -- there's no object on the heap to hold the refcnt. I don't get it either. Desperation? The idea that incref/decref may need to be treated as virtual methods (in order to exempt ints or other possible direct values) really disturbs me -- incref/decref happen *all* the time, explicit integer ops only some of the time. Turning incref/decref into indirected function calls doesn't sound promising at all. Injecting a test-branch guard via macro sounds faster but still icky, and especially if the set of exempt types isn't a singleton. no-positive-suggestions-just-grousing-ly y'rs - tim From tim_one at email.msn.com Tue Jun 15 08:17:02 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 02:17:02 -0400 Subject: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <00e901beb6dc$fc830d60$0801a8c0@bobcat> Message-ID: <001201beb6f6$af0987c0$069e2299@tim> [Mark Hammond] > ... > But now the IDLE "CallTips" extenion seems lame. > > Typing > >>> " ".join( > > doesnt yield the help, where: > >>> s=" "; s.join( > > does :-) No Windows Guy will be stymied by how to hack that! Hint: string literals always end with one of two characters . > Very cute, I must say. The biggest temptation is going to be, as I > mentioned, avoiding the use of this stuff for "general" code. Im still > unconvinced the "sep.join" concept is natural, but string methods in > general sure as hell are. sep.join bothered me until I gave the separator a name (a la the "space.join, tab.join", etc examples earlier). Then it looked *achingly* natural! Using a one-character literal instead still rubs me the wrong way, although for some reason e.g. ", ".join(seq) no longer does. I can't account for any of it, but I know what I like . > Guido almost hinted that post 1.5.2 interim release(s?) would be > acceptable, so long as he didnt have to do it! Im tempted to volunteer to > agree to do something for Windows, and if no other platform biggots > volunteer, I wont mind in the least :-) I realize it still > needs settling down, but this is too good to keep to "ourselves" (being > CVS enabled people) for too long ;-) Yes, I really like the new string methods too! And I want to rewrite all of IDLE to use them ASAP . damn-the-users-let's-go-nuts-ly y'rs - tim From fredrik at pythonware.com Tue Jun 15 09:10:28 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 15 Jun 1999 09:10:28 +0200 Subject: [Python-Dev] Re: [Python-Dev] String methods... 
finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> Message-ID: <006801beb6fe$27490d80$f29b12c2@pythonware.com> > wink> + string.join(seq) doesn't currently convert seq elements to > wink> string type, and in my vision it would. At least three of > wink> us admit to mapping str across seq anyway before calling > wink> string.join, and I think it would be a nice convenience: hmm. consider the following: space = " " foo = L"foo" bar = L"bar" result = space.join((foo, bar)) what should happen if you run this: a) Python raises an exception b) result is an ordinary string object c) result is a unicode string object From ping at lfw.org Tue Jun 15 09:24:33 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Tue, 15 Jun 1999 00:24:33 -0700 (PDT) Subject: [Python-Dev] Re: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <000001beb6d9$c82e7980$069e2299@tim> Message-ID: On Mon, 14 Jun 1999, Tim Peters wrote: > > A bug: > > >>> 'ab'.endswith('b',0,1) # right > 0 > >>> 'ab'.endswith('ab',0,1) # wrong > 1 > >>> 'ab'.endswith('ab',0,0) # wrong > 1 > >>> I assumed you meant that the extra arguments should be slices on the string being searched, i.e. specimen.startswith(text, start, end) is equivalent to specimen[start:end].startswith(text) without the overhead of slicing the specimen? Or did i understand you correctly? > Return a string which is the concatenation of the string representation > of very element in the sequence. The separator between elements is S. > >>> > > "-".join("very nice indeed! ly".split()) + " y'rs - tim" Yes, i have to agree that this (especially once you name the separator string) is a pretty nice way to present the "join" functionality. !ping "Is it so small a thing, To have enjoyed the sun, To have lived light in the Spring, To have loved, to have thought, to have done; To have advanced true friends, and beat down baffling foes-- That we must feign bliss Of a doubtful future date, And while we dream on this, Lose all our present state, And relegate to worlds... yet distant our repose?" -- Matthew Arnold From MHammond at skippinet.com.au Tue Jun 15 10:28:55 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 15 Jun 1999 18:28:55 +1000 Subject: [Python-Dev] RE: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <006801beb6fe$27490d80$f29b12c2@pythonware.com> Message-ID: <00f801beb709$1c874b90$0801a8c0@bobcat> > hmm. consider the following: > > space = " " > foo = L"foo" > bar = L"bar" > result = space.join((foo, bar)) > > what should happen if you run this: > > a) Python raises an exception > b) result is an ordinary string object > c) result is a unicode string object Well, we could take this to the extreme, and allow _every_ object to grow a join method, where join attempts to cooerce to the same type. Thus: " ".join([L"foo", L"bar"]) -> "foo bar" L" ".join(["foo", "bar"]) -> L"foo bar" " ".join([1,2]) -> "1 2" 0.join(['1',2']) -> 102 [].join([...]) # exercise for the reader ;-) etc. Mark. From ping at lfw.org Tue Jun 15 10:50:34 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Tue, 15 Jun 1999 01:50:34 -0700 (PDT) Subject: [Python-Dev] Re: [Python-Dev] RE: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <00f801beb709$1c874b90$0801a8c0@bobcat> Message-ID: On Tue, 15 Jun 1999, Mark Hammond wrote: > > hmm. 
consider the following: > > > > space = " " > > foo = L"foo" > > bar = L"bar" > > result = space.join((foo, bar)) > > > > what should happen if you run this: > > > > a) Python raises an exception > > b) result is an ordinary string object > > c) result is a unicode string object > > Well, we could take this to the extreme, and allow _every_ object to grow a > join method, where join attempts to cooerce to the same type. I think i'd agree with Mark's answer for this situation, though i don't know about adding 'join' methods to other types. I see two arguments that can be made here: For b): the result should match the type of the object on which the method was called. This way the type of the result more easily determinable by the programmer or reader. Also, since the type of the result is immediately known to the "join" code, each member of the passed-in sequence need only be fetched once, and a __getitem__-style generator can easily stand in for the sequence. For c): the result should match the "biggest" type among the operands. This behaviour is consistent with what you would get if you added all the operands together. Unfortunately this means you have to see all the operands before you know the type of the result, which means you either scan twice or convert potentially the whole result. b) weighs more strongly in my opinion, so i think the right thing to do is to match the type of the separator. (But if a Unicode string contains characters outside of the Latin-1 range, is it supposed to raise an exception on an attempt to convert to an ordinary string? In that case, the actual behaviour of the above example would be a) and i'm not sure if that would get annoying fast.) -- ?!ng "In the sciences, we are now uniquely privileged to sit side by side with the giants on whose shoulders we stand." -- Gerald Holton From gstein at lyra.org Tue Jun 15 11:05:43 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 15 Jun 1999 02:05:43 -0700 Subject: [Python-Dev] Re: String methods... finally References: Message-ID: <37661767.37D8E370@lyra.org> Ka-Ping Yee wrote: >... > (But if a Unicode string contains characters outside of > the Latin-1 range, is it supposed to raise an exception > on an attempt to convert to an ordinary string? In that > case, the actual behaviour of the above example would be > a) and i'm not sure if that would get annoying fast.) I forget the "last word" on this, but (IMO) str(unicode_object) should return a UTF-8 encoded string. Cheers, -g p.s. what's up with Mailman... it seems to have broken badly on the [Python-Dev] insertion... I just stripped a bunch of 'em -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Tue Jun 15 11:48:40 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 15 Jun 1999 11:48:40 +0200 Subject: [Python-Dev] Re: String methods... finally References: Message-ID: <003e01beb714$55d7fd80$f29b12c2@pythonware.com> > > > a) Python raises an exception > > > b) result is an ordinary string object > > > c) result is a unicode string object > > > > Well, we could take this to the extreme, and allow _every_ object to grow a > > join method, where join attempts to cooerce to the same type. well, I think that unicode strings and ordinary strings should behave like "strings" where possible, just like integers, floats, long integers and complex values be- have like "numbers" in many (but not all) situations. 
if we make unicode strings easier to mix with ordinary strings, we don't necessarily have to make integers and lists easier to mix with strings too... (people who want that can use Tcl instead ;-) > I think i'd agree with Mark's answer for this situation, though > i don't know about adding 'join' methods to other types. I see two > arguments that can be made here: > > For b): the result should match the type of the object > on which the method was called. This way the type of > the result more easily determinable by the programmer > or reader. Also, since the type of the result is > immediately known to the "join" code, each member of the > passed-in sequence need only be fetched once, and a > __getitem__-style generator can easily stand in for the > sequence. > > For c): the result should match the "biggest" type among > the operands. This behaviour is consistent with what > you would get if you added all the operands together. > Unfortunately this means you have to see all the operands > before you know the type of the result, which means you > either scan twice or convert potentially the whole result. > > b) weighs more strongly in my opinion, so i think the right > thing to do is to match the type of the separator. > > (But if a Unicode string contains characters outside of > the Latin-1 range, is it supposed to raise an exception > on an attempt to convert to an ordinary string? In that > case, the actual behaviour of the above example would be > a) and i'm not sure if that would get annoying fast.) exactly. there are some major issues hidden in here, including: 1) what should "str" do for unicode strings? 2) should join really try to convert its arguments? 3) can "str" really raise an exception for a built-in type? 4) should code written by americans fail when used in other parts of the world? based on string-sig input, the unicode class currently solves (1) by returning a UTF-8 encoded version of the unicode string contents. this was chosen to make sure that the answer to (3) is "no, never", and that the an- swer (4) is "not always, at least" -- we've had enough of that, thank you: http://www.lysator.liu.se/%e5ttabitars/7bit-example.txt if (1) is a reasonable solution (I think it is), I think the answer to (2) should be no, based on the rule of least surprise. Python has always required me to explicitly state when I want to convert things in a way that may radically change their meaning. I see little reason to abandon that in 1.6. From gstein at lyra.org Tue Jun 15 12:01:09 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 15 Jun 1999 03:01:09 -0700 Subject: [Python-Dev] Re: [Python-Dev] Re: String methods... finally References: <003e01beb714$55d7fd80$f29b12c2@pythonware.com> Message-ID: <37662465.682FA81B@lyra.org> Fredrik Lundh wrote: >... > if (1) is a reasonable solution (I think it is), I think the > answer to (2) should be no, based on the rule of least > surprise. Python has always required me to explicitly > state when I want to convert things in a way that may > radically change their meaning. I see little reason to > abandon that in 1.6. Especially because it is such a simple translation: sep.join(sequence) becomes sep.join(map(str, sequence)) Very obvious what is happening. It isn't hard to read, and it doesn't take a lot out of a person to insert that extra phrase. And hey... people can always do: def strjoin(sep, seq): return sep.join(map(str, seq)) And just use strjoin() everywhere if they hate the typing. 
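As a usage note, the explicit spelling also covers the mixed-type case that keeps coming up in this thread, e.g. the debugging-output example:

    args = ["not ok, args are", 1, ("spam", 2)]
    # Under the strict "join is repeated +" reading, " ".join(args)
    # raises TypeError; mapping str() first is the one-line escape hatch.
    print(" ".join(map(str, args)))   # not ok, args are 1 ('spam', 2)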
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm at hypernet.com Tue Jun 15 15:08:08 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Tue, 15 Jun 1999 08:08:08 -0500 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: String methods... finally In-Reply-To: <37662465.682FA81B@lyra.org> Message-ID: <1282670144-103087754@hypernet.com> Greg Stein wrote: ... > And hey... people can always do: > > def strjoin(sep, seq): > return sep.join(map(str, seq)) > > And just use strjoin() everywhere if they hate the typing. Those who hate typing regard it as great injury that they have to define this. Of course, they'll gladly type huge long posts on the subject. But, I agree. string.join(['a', 'b', 3]) currently barfs. L" ".join(seq) should complain if seq isn't all unicode, and same for good old strings. - Gordon From guido at CNRI.Reston.VA.US Tue Jun 15 14:39:09 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 08:39:09 -0400 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: Your message of "Tue, 15 Jun 1999 09:10:28 +0200." <006801beb6fe$27490d80$f29b12c2@pythonware.com> References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> Message-ID: <199906151239.IAA02917@eric.cnri.reston.va.us> > hmm. consider the following: > > space = " " > foo = L"foo" > bar = L"bar" > result = space.join((foo, bar)) > > what should happen if you run this: > > a) Python raises an exception > b) result is an ordinary string object > c) result is a unicode string object The same should happen as for L"foo" + " " + L"bar". --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Tue Jun 15 14:50:59 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 15 Jun 1999 08:50:59 -0400 (EDT) Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <199906151239.IAA02917@eric.cnri.reston.va.us> References: <14179.61649.286195.248429@anthem.cnri.reston.va.us> <000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <14182.19420.462788.15633@cm-24-29-94-19.nycap.rr.com> Guido> The same should happen as for L"foo" + " " + L"bar". Remind me again, please. What mnemonic is "L" supposed to evoke? Long? Lundh? Are we talking about Unicode strings? If so, why not "U"? Apologies for my increased density. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From jack at oratrix.nl Tue Jun 15 14:58:05 1999 From: jack at oratrix.nl (Jack Jansen) Date: Tue, 15 Jun 1999 14:58:05 +0200 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: Message by Guido van Rossum , Tue, 15 Jun 1999 08:39:09 -0400 , <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <19990615125805.8CF03303120@snelboot.oratrix.nl> > The same should happen as for L"foo" + " " + L"bar". This is probably the most reasonable solution. 
Unfortunately it breaks Marks truly novel suggestion that 0.join(1, 2) becomes 102, but I guess we'll have to live with that:-) -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From fredrik at pythonware.com Tue Jun 15 16:28:17 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 15 Jun 1999 16:28:17 +0200 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <00c201beb73b$5fa27b70$f29b12c2@pythonware.com> > > hmm. consider the following: > > > > space = " " > > foo = L"foo" > > bar = L"bar" > > result = space.join((foo, bar)) > > > > what should happen if you run this: > > > > a) Python raises an exception > > b) result is an ordinary string object > > c) result is a unicode string object > > The same should happen as for L"foo" + " " + L"bar". which is? (alright; for the moment, it's (a) for both: >>> import unicode >>> u = unicode.unicode >>> u("foo") + u(" ") + u("bar") Traceback (innermost last): File "", line 1, in ? TypeError: illegal argument type for built-in operation >>> u("foo") + " " + u("bar") Traceback (innermost last): File "", line 1, in ? TypeError: illegal argument type for built-in operation >>> u(" ").join(("foo", "bar")) Traceback (innermost last): File "", line 1, in ? TypeError: first argument must be sequence of unicode strings but that can of course be changed...) From guido at CNRI.Reston.VA.US Tue Jun 15 16:38:32 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 10:38:32 -0400 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: Your message of "Tue, 15 Jun 1999 16:28:17 +0200." <00c201beb73b$5fa27b70$f29b12c2@pythonware.com> References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us> <00c201beb73b$5fa27b70$f29b12c2@pythonware.com> Message-ID: <199906151438.KAA03355@eric.cnri.reston.va.us> > > The same should happen as for L"foo" + " " + L"bar". > > which is? Whatever it is -- I think we did a lot of reasoning about this, and perhaps we're not quite done -- but I truly believe that whatever is decided, join() should follow. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Tue Jun 15 17:28:11 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Tue, 15 Jun 1999 11:28:11 -0400 (EDT) Subject: [Python-Dev] Re: String methods... finally References: <37661767.37D8E370@lyra.org> Message-ID: <14182.28939.509040.125174@anthem.cnri.reston.va.us> >>>>> "GS" == Greg Stein writes: GS> p.s. what's up with Mailman... it seems to have broken badly GS> on the [Python-Dev] insertion... I just stripped a bunch of GS> 'em Harald Meland just checked in a fix for this, which I'm installing now, so the breakage should be just temporary. 
-Barry From tim_one at email.msn.com Tue Jun 15 17:33:38 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 11:33:38 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <006801beb6fe$27490d80$f29b12c2@pythonware.com> Message-ID: <000601beb744$70c6f9e0$979e2299@tim> > hmm. consider the following: > > space = " " > foo = L"foo" > bar = L"bar" > result = space.join((foo, bar)) > > what should happen if you run this: > > a) Python raises an exception > b) result is an ordinary string object > c) result is a unicode string object The proposal said #b, or, in general, that the resulting string be of the same flavor as the separator. From tim_one at email.msn.com Tue Jun 15 17:33:40 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 11:33:40 -0400 Subject: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: Message-ID: <000701beb744$71e450c0$979e2299@tim> >> A bug: >> >> >>> 'ab'.endswith('b',0,1) # right >> 0 >> >>> 'ab'.endswith('ab',0,1) # wrong >> 1 >> >>> 'ab'.endswith('ab',0,0) # wrong >> 1 >> >>> [Ka-Ping] > I assumed you meant that the extra arguments should be slices > on the string being searched, i.e. > > specimen.startswith(text, start, end) > > is equivalent to > > specimen[start:end].startswith(text) > > without the overhead of slicing the specimen? Or did i understand > you correctly? Yes, and e.g. 'ab'[0:1] == 'a', which does not end with 'ab'. So these are inconsistent today, and the second is a bug: >>> 'ab'[0:1].endswith('ab') 0 >>> 'ab'.endswith('ab', 0, 1) 1 >>> Or did I misunderstand you ? From gward at cnri.reston.va.us Tue Jun 15 17:41:39 1999 From: gward at cnri.reston.va.us (Greg Ward) Date: Tue, 15 Jun 1999 11:41:39 -0400 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <19990615125805.8CF03303120@snelboot.oratrix.nl>; from Jack Jansen on Tue, Jun 15, 1999 at 02:58:05PM +0200 References: <19990615125805.8CF03303120@snelboot.oratrix.nl> Message-ID: <19990615114139.A3697@cnri.reston.va.us> On 15 June 1999, Jack Jansen said: > > The same should happen as for L"foo" + " " + L"bar". > > This is probably the most reasonable solution. Unfortunately it breaks Marks > truly novel suggestion that 0.join(1, 2) becomes 102, but I guess we'll have > to live with that:-) Careful -- it actually works this way in Perl (well, except that join isn't a method of strings...): $ perl -de 1 [...] DB<2> $sep = 0 DB<3> @list = (1, 2) DB<4> p join ($sep, @list) 102 Cool! Who needs type-checking anyways? Greg -- Greg Ward - software developer gward at cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive voice: +1-703-620-8990 Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913 From tim_one at email.msn.com Tue Jun 15 17:58:48 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 11:58:48 -0400 Subject: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <000901beb747$f4531840$979e2299@tim> >> space = " " >> foo = L"foo" >> bar = L"bar" >> result = space.join((foo, bar)) > The same should happen as for L"foo" + " " + L"bar". Then " ".join([" ", 42]) should blow up, and auto-conversion for non-string types needs to be removed from the implementation. 
The attraction of auto-conversion for me is that I had never once seen string.join blow up where the exception revealed a conceptual error; in every case conversion to string was the intent, and an obvious one at that. Just anal nagging. How about dropping Unicode instead ? Anyway, I'm already on record as saying auto-convert wasn't essential, and join should first and foremost make good sense for string arguments. off-to-work-ly y'rs - tim From MHammond at skippinet.com.au Wed Jun 16 00:29:32 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 16 Jun 1999 08:29:32 +1000 Subject: [Python-Dev] Re: String methods... finally In-Reply-To: <003e01beb714$55d7fd80$f29b12c2@pythonware.com> Message-ID: <010101beb77e$8af64430$0801a8c0@bobcat> > well, I think that unicode strings and ordinary strings > should behave like "strings" where possible, just like > integers, floats, long integers and complex values be- > have like "numbers" in many (but not all) situations. I obviously missed a few smileys in my post. I was serious that: L" ".join -> Unicode result " ".join -> String result and even " ".join([1,2]) -> "1 2" But integers and lists growing "join" methods was a little tounge in cheek :-) Mark. From da at ski.org Wed Jun 16 00:48:41 1999 From: da at ski.org (David Ascher) Date: Tue, 15 Jun 1999 15:48:41 -0700 (Pacific Daylight Time) Subject: [Python-Dev] mmap Message-ID: Another topic: what are the chances of adding the mmap module to the core distribution? It's restricted to a smallish set of platforms (modern Unices and Win32, I think), but it's quite small, and would be a nice thing to have available in the core, IMHO. (btw, the buffer object needs more documentation) --david From MHammond at skippinet.com.au Wed Jun 16 00:53:00 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 16 Jun 1999 08:53:00 +1000 Subject: [Python-Dev] String methods... finally In-Reply-To: <000901beb747$f4531840$979e2299@tim> Message-ID: <010201beb781$d1febf30$0801a8c0@bobcat> [Before I start: Skip mentioned "why L, not U". I know C/C++ uses L, presumably to denote a "long" string (presumably keeping the analogy between int and long ints). I guess Java has no such indicator, being native Unicode? Is there any sort of agreement that Python will use L"..." to denote Unicode strings? I would be happy with it. Also, should: print L"foo" -> 'foo' and print `L"foo"` -> L'foo' I would like to know if there is agreement for this, so I can change the Pythonwin implementation of Unicode now to make things more seamless later. ] > >> space = " " > >> foo = L"foo" > >> bar = L"bar" > >> result = space.join((foo, bar)) > > > The same should happen as for L"foo" + " " + L"bar". I must admit Guido's position has real appeal, even if just from a documentation POV. Eg, join can be defined as: sep.join([s1, ..., sn]) Returns s1 + sep + s2 + sep + ... + sepn Nice and simple to define and understand. Thus, if you can't add 2 items, you can't join them. Assuming the Unicode changes allow us to say: assert " " == L" ", "eek" assert L" " + "" == L" " assert " " + L"" == L" " # or even if this == " " Then this still works well in a Unicode environment; Unicode and strings could be mixed in the list, and as long as you understand what L" " + "" returns, you will understand immediately what the result of join() is going to be. 
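A sketch of that reading -- plus_join is an invented name; it simply inherits whatever the concatenation rules turn out to be, exceptions included:

    def plus_join(sep, seq):
        # sep.join([s1, ..., sn]) == s1 + sep + s2 + sep + ... + sep + sn
        # No conversions: if two items can't be added, they can't be joined.
        if not seq:
            return sep[:0]          # empty result, of the separator's flavor
        result = seq[0]
        for item in seq[1:]:
            result = result + sep + item
        return result

    assert plus_join(" ", ["foo", "bar"]) == "foo bar"
    try:
        plus_join(" ", [" ", 42])   # mixing in a non-string blows up, as with +
    except TypeError:
        pass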
> The attraction of auto-conversion for me is that I had never once seen > string.join blow up where the exception revealed a conceptual > error; in > every case conversion to string was the intent, and an > obvious one at that. OTOH, my gut tells me this is better - that an implicit conversion to the seperator type be performed. Also, it appears that this technique will never surprise anyone in a bad way. It seems the rule above, while simple, basically means "sep.join can only take string/Unicode objects", as all other objects will currently fail the add test. So, given that our rule is that the objects must all be strings, how can it hurt to help the user conform? > off-to-work-ly y'rs - tim where-i-should-be-instead-of-writing-rambling-mails-ly, Mark. From guido at CNRI.Reston.VA.US Wed Jun 16 00:54:42 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 18:54:42 -0400 Subject: [Python-Dev] mmap In-Reply-To: Your message of "Tue, 15 Jun 1999 15:48:41 PDT." References: Message-ID: <199906152254.SAA05114@eric.cnri.reston.va.us> > Another topic: what are the chances of adding the mmap module to the core > distribution? It's restricted to a smallish set of platforms (modern > Unices and Win32, I think), but it's quite small, and would be a nice > thing to have available in the core, IMHO. If it works on Linux, Solaris, Irix and Windows, and is reasonably clean, I'll take it. Please send it. > (btw, the buffer object needs more documentation) That's for Jack & Greg... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Wed Jun 16 01:04:17 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 19:04:17 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat> References: <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <199906152304.TAA05136@eric.cnri.reston.va.us> > Is there any sort of agreement that Python will use L"..." to denote > Unicode strings? I would be happy with it. I don't know of any agreement, but it makes sense. > Also, should: > print L"foo" -> 'foo' > and > print `L"foo"` -> L'foo' Yes, I think this should be the way. Exactly what happens to non-ASCII characters is up to the implementation. Do we have agreement on escapes like \xDDDD? Should \uDDDD be added? The difference between the two is that according to the ANSI C standard, which I follow rather strictly for string literals, '\xABCDEF' is a single character whose value is the lower bits (however many fit in a char) of 0xABCDEF; this makes it cumbersome to write a string consisting of a hex escape followed by a digit or letter a-f or A-F; you would have to use another hex escape or split the literal in two, like this: "\xABCD" "EF". (This is true for 8-bit chars as well as for long char in ANSI C.) The \u escape takes up to 4 bytes but is not ANSI C. In Java, \u has the additional funny property that it is recognized *everywhere* in the source code, not just in string literals, and I believe that this complicates the interpretation of things like "\\uffff" (is the \uffff interpreted before regular string \ processing happens?). I don't think we ought to copy this behavior, although JPython users or developers might disagree. (I don't know anyone who *uses* Unicode strings much, so it's hard to gauge the importance of these issues.) 
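To make the difference concrete, a small sketch of the Java-style scan (exactly four hex digits after the \u, never more, never fewer); scan_u_escape is invented for illustration and is handed the text just after the backslash-u:

    import string

    def scan_u_escape(text, i):
        # Java rule: a \u escape is followed by exactly four hex digits.
        digits = text[i:i+4]
        if len(digits) != 4:
            raise ValueError("\\u needs exactly 4 hex digits")
        for c in digits:
            if c not in string.hexdigits:
                raise ValueError("\\u needs exactly 4 hex digits")
        return int(digits, 16), i + 4

    # The scan stops after four digits, so "\u00a9abc" is the copyright
    # sign followed by the literal text "abc" -- no "\xABCD" "EF"
    # literal-splitting trick needed, which is the clumsiness in the
    # ANSI C \x rule described above.
    assert scan_u_escape("00a9abc", 0) == (0x00A9, 4)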
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Jun 16 02:09:15 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Tue, 15 Jun 1999 19:09:15 -0500 Subject: [Python-Dev] String methods... finally In-Reply-To: <199906152304.TAA05136@eric.cnri.reston.va.us> References: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <1282630485-105472998@hypernet.com> Guido asks: > Do we have agreement on escapes like \xDDDD? Should \uDDDD be > added? > ... The \u escape > takes up to 4 bytes but is not ANSI C. How do endian issues fit in with \u? - Gordon From guido at CNRI.Reston.VA.US Wed Jun 16 01:20:07 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 19:20:07 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Tue, 15 Jun 1999 19:09:15 CDT." <1282630485-105472998@hypernet.com> References: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat> <1282630485-105472998@hypernet.com> Message-ID: <199906152320.TAA05211@eric.cnri.reston.va.us> > How do endian issues fit in with \u? I would assume that it uses the same rules as hex and octal numeric literals: these are always *written* in big-endian notation, since that is also what we use for decimal numbers. Thus, on a little-endian machine, the short integer 0x1234 would be stored as the bytes {0x34, 0x12} and so would the string literal "\x1234". --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Wed Jun 16 01:27:44 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Tue, 15 Jun 1999 19:27:44 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <000901beb747$f4531840$979e2299@tim> <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <14182.57712.380574.385164@anthem.cnri.reston.va.us> >>>>> "MH" == Mark Hammond writes: MH> OTOH, my gut tells me this is better - that an implicit MH> conversion to the seperator type be performed. Right now, the implementation of join uses PyObject_Str() to str-ify the elements in the sequence. I can't remember, but in our Unicode worldview doesn't PyObject_Str() return a narrowed string if it can, and raise an exception if not? So maybe narrow-string's join shouldn't be doing it this way because that'll autoconvert to the separator's type, which breaks the symmetry. OTOH, we could promote sep to the type of sequence[0] and forward the call to it's join if it were a widestring. That should retain the symmetry. -Barry From bwarsaw at cnri.reston.va.us Wed Jun 16 01:46:24 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Tue, 15 Jun 1999 19:46:24 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <010201beb781$d1febf30$0801a8c0@bobcat> <199906152304.TAA05136@eric.cnri.reston.va.us> Message-ID: <14182.58832.140587.711978@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Should \uDDDD be added? That'd be nice! :) Guido> In Java, \u has the additional funny property that it is Guido> recognized *everywhere* in the source code, not just in Guido> string literals, and I believe that this complicates the Guido> interpretation of things like "\\uffff" (is the \uffff Guido> interpreted before regular string \ processing happens?). No. 
JLS section 3.3 says[1] In addition to the processing implied by the grammar, for each raw input character that is a backslash \, input processing must consider how many other \ characters contiguously precede it, separating it from a non-\ character or the start of the input stream. If this number is even, then the \ is eligible to begin a Unicode escape; if the number is odd, then the \ is not eligible to begin a Unicode escape. and this is born out by example. -------------------- snip snip --------------------Uni.java public class Uni { static public void main(String[] args) { System.out.println("\\u00a9"); System.out.println("\u00a9"); } } -------------------- snip snip --------------------outputs \u00a9 ? -------------------- snip snip -------------------- -Barry [1] http://java.sun.com/docs/books/jls/html/3.doc.html#44591 PS. it is wonderful having the JLS online :) From ping at lfw.org Tue Jun 15 18:05:40 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Tue, 15 Jun 1999 09:05:40 -0700 (PDT) Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <19990615114139.A3697@cnri.reston.va.us> Message-ID: On Tue, 15 Jun 1999, Greg Ward wrote: > Careful -- it actually works this way in Perl (well, except that join > isn't a method of strings...): > > $ perl -de 1 > [...] > DB<2> $sep = 0 > > DB<3> @list = (1, 2) > > DB<4> p join ($sep, @list) > 102 > > Cool! Who needs type-checking anyways? Cool! So then >>> def f(x): return x ** 2 ... >>> def g(x): return x - 5 ... >>> h = join((f, g)) ... >>> h(8) 59 Right? Right? (Just kidding.) -- ?!ng "Any nitwit can understand computers. Many do." -- Ted Nelson From tim_one at email.msn.com Wed Jun 16 06:02:46 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 16 Jun 1999 00:02:46 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <199906152304.TAA05136@eric.cnri.reston.va.us> Message-ID: <000401beb7ad$175193c0$2ca22299@tim> [Guido] > Do we have agreement on escapes like \xDDDD? I think we have to agree to leave that alone -- it affects what e.g. the regular expression parser does too. > Should \uDDDD be added? Yes, but only in string literals. You don't want to be within 10 miles of Barry if you tell him that Emacs pymode has to treat the Unicode escape for a newline as if it were-- as Java treats it outside literals --an actual line break <0.01 wink>. > ... > The \u escape takes up to 4 bytes Not in Java: it requires exactly 4 hex characters after == exactly 2 bytes, and it's an error if it's followed by fewer than 4 hex characters. That's a good rule (simple!), while ANSI C's is too clumsy to live with if people want to take Unicode seriously. So what does it mean for a Unicode escape to appear in a non-L string? aha-the-secret-escape-to-ucs4-ly y'rs - tim From tim_one at email.msn.com Wed Jun 16 06:02:44 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 16 Jun 1999 00:02:44 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <000301beb7ad$1635c380$2ca22299@tim> [MarkH agonizes, over whether to auto-convert or not] Well, the rule *could* be that the result type is the widest string type among the separator and the sequences' string elements (if any), and other types convert to the result type along the way. I'd be more specific, except I'm not sure which flavor of string str() returns (or, indeed, whether that's up to each __str__ implementation). 
In any case, widening to Unicode should always be possible, and if "widest wins" it doesn't require a multi-pass algorithm regardless (although the partial result so far may need to be widened once -- but that's true even if auto-convert of non-string types isn't implemented). Or, IOW, sep.join([a, b, c]) == f(a) + sep + f(b) + sep + f(c) where I don't know how to spell f, but f(x) *means* x' = if x has a string type then x else x.__str__() return x' coerced to the widest string type seen so far So I think everyone can get what they want -- except that those who want auto-convert are at direct odds with those who prefer to wag Guido's fingers and go "tsk, tsk, we know what you want but you didn't say 'please' so your program dies" . master-of-fair-summaries-ly y'rs - tim From mal at lemburg.com Wed Jun 16 10:29:27 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Wed, 16 Jun 1999 10:29:27 +0200 Subject: [Python-Dev] String methods... finally References: <010201beb781$d1febf30$0801a8c0@bobcat> <199906152304.TAA05136@eric.cnri.reston.va.us> Message-ID: <37676067.62E272F4@lemburg.com> Guido van Rossum wrote: > > > Is there any sort of agreement that Python will use L"..." to denote > > Unicode strings? I would be happy with it. > > I don't know of any agreement, but it makes sense. The u"..." looks more intuitive too me. While inheriting C/C++ constructs usually makes sense I think usage in the C community is not that wide-spread yet and for a Python freak, the small u will definitely remind him of Unicode whereas the L will stand for (nearly) unlimited length/precision. Not that this is important, but... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 198 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fredrik at pythonware.com Wed Jun 16 11:53:23 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 11:53:23 +0200 Subject: [Python-Dev] String methods... finally References: <000401beb7ad$175193c0$2ca22299@tim> Message-ID: <00f701beb7de$cdb422f0$f29b12c2@pythonware.com> > > The \u escape takes up to 4 bytes > > Not in Java: it requires exactly 4 hex characters after == exactly 2 bytes, > and it's an error if it's followed by fewer than 4 hex characters. That's a > good rule (simple!), while ANSI C's is too clumsy to live with if people > want to take Unicode seriously. > > So what does it mean for a Unicode escape to appear in a non-L string? my suggestion is to store it as UTF-8; see the patches included in the unicode package for details. this also means that an u-string literal (L-string, whatever) could be stored as an 8-bit string internally. and that the following two are equivalent: string = u"foo" string = unicode("foo") also note that: unicode(str(u"whatever")) == u"whatever" ... on the other hand, this means that we have at least four major "arrays of bytes or characters" thingies mapped on two data types: the old string type is used for: -- plain old 8-bit strings (ascii, iso-latin-1, whatever) -- byte buffers containing arbitrary data -- unicode strings stored as 8-bit characters, using the UTF-8 encoding. and the unicode string type is used for: -- unicode strings stored as 16-bit characters is this reasonable? ... yet another question is how to deal with source code. is a python 1.6 source file written in ASCII, ISO Latin 1, or UTF-8. speaking from a non-us standpoint, it would be really cool if you could write Python sources in UTF-8... 
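As a concrete reminder of what "stored as UTF-8" means for the 16-bit case, a rough sketch of the encoder for code points below 0x10000 (which is all the proposed type would hold):

    def utf8_encode_char(cp):
        # Encode one code point (0..0xFFFF) as a list of byte values.
        if cp < 0x80:
            return [cp]
        if cp < 0x800:
            return [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)]
        return [0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)]

    # U+00A9 (copyright sign) becomes the two bytes C2 A9, so an 8-bit
    # string holding UTF-8 encoded text is not the same animal as an
    # 8-bit Latin-1 string -- which is the "four thingies mapped on two
    # types" point above.
    assert utf8_encode_char(0x00A9) == [0xC2, 0xA9]
    assert utf8_encode_char(ord("A")) == [ord("A")]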
From gstein at lyra.org Wed Jun 16 12:13:45 1999 From: gstein at lyra.org (Greg Stein) Date: Wed, 16 Jun 1999 03:13:45 -0700 (PDT) Subject: [Python-Dev] mmap In-Reply-To: <199906152254.SAA05114@eric.cnri.reston.va.us> Message-ID: On Tue, 15 Jun 1999, Guido van Rossum wrote: > > Another topic: what are the chances of adding the mmap module to the core > > distribution? It's restricted to a smallish set of platforms (modern > > Unices and Win32, I think), but it's quite small, and would be a nice > > thing to have available in the core, IMHO. > > If it works on Linux, Solaris, Irix and Windows, and is reasonably > clean, I'll take it. Please send it. Actually, my preference is to see a change to open() rather than a whole new module. For example, let's say that you open a file, specifying memory-mapping. Then you create a buffer against that file: f = open('foo','rm') # 'm' means mem-map b = buffer(f) print b[100:200] Disclaimer: I haven't looked at the mmap modules (AMK's and Mark's) to see what capabilities are in there. They may not be expressable soly as open() changes. (adding add'l params for mmap flags might be another way to handle this) I'd like to see mmap native in Python. I won't push, though, until I can run a test to see what kind of savings will occur when you mmap a .pyc file and open PyBuffer objects against the thing for the code bytes. My hypothesis is that you can reduce the working set of Python (i.e. amortize the cost of a .pyc's code over several processes by mmap'ing it); this depends on the proportion of code in the pyc relative to "other" stuff. > > (btw, the buffer object needs more documentation) > > That's for Jack & Greg... Quite true. My bad :-( ... That would go into the API doc, I guess... I'll put this on a todo list, but it could be a little while. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Wed Jun 16 12:53:29 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 12:53:29 +0200 Subject: [Python-Dev] mmap References: Message-ID: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Greg wrote: > Actually, my preference is to see a change to open() rather than a whole > new module. For example, let's say that you open a file, specifying > memory-mapping. Then you create a buffer against that file: > > f = open('foo','rm') # 'm' means mem-map > b = buffer(f) > print b[100:200] > > Disclaimer: I haven't looked at the mmap modules (AMK's and Mark's) to see > what capabilities are in there. They may not be expressable soly as open() > changes. (adding add'l params for mmap flags might be another way to > handle this) > > I'd like to see mmap native in Python. I won't push, though, until I can > run a test to see what kind of savings will occur when you mmap a .pyc > file and open PyBuffer objects against the thing for the code bytes. My > hypothesis is that you can reduce the working set of Python (i.e. amortize > the cost of a .pyc's code over several processes by mmap'ing it); this > depends on the proportion of code in the pyc relative to "other" stuff. yes, yes, yes! my good friend the mad scientist (the guy who writes code, not the flaming cult-ridden brainwashed script kiddie) has considered writing a whole new "abstract file" backend, to entirely get rid of stdio in the Python core. some potential advantages: -- performance (some stdio implementations are slow) -- portability (stdio doesn't exist on some platforms!) -- opens up for cool extensions (memory mapping, pluggable file handlers, etc). 
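For the "pluggable file handlers" bullet, the Python-level shape of the idea is just the familiar file-like protocol: anything with read/write/seek/close can stand in for a real file. A toy in-memory handler, purely for illustration (this is essentially what StringIO already does; the interesting part of the proposal is doing the same kind of thing underneath the interpreter, in C):

    class MemoryFile:
        # Toy "pluggable" handler: the same methods as a real file
        # object, backed by a plain string instead of stdio.
        def __init__(self, data=""):
            self.data = data
            self.pos = 0
        def read(self, n=-1):
            if n < 0:
                n = len(self.data) - self.pos
            chunk = self.data[self.pos:self.pos + n]
            self.pos = self.pos + len(chunk)
            return chunk
        def write(self, s):
            self.data = self.data[:self.pos] + s + self.data[self.pos + len(s):]
            self.pos = self.pos + len(s)
        def seek(self, pos):
            self.pos = pos
        def close(self):
            pass

    f = MemoryFile("hello world")
    f.seek(6)
    assert f.read() == "world"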
should I tell him to start hacking? or is this the same thing as PyBuffer/buffer (I've implemented PyBuffer support for the unicode class, but that doesn't mean that I understand how it works...) PS. someone once told me that Perl goes "below" the standard file I/O system. does anyone here know if that's true, and per- haps even explain how they're doing that... From guido at CNRI.Reston.VA.US Wed Jun 16 14:19:10 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 16 Jun 1999 08:19:10 -0400 Subject: [Python-Dev] mmap In-Reply-To: Your message of "Wed, 16 Jun 1999 03:13:45 PDT." References: Message-ID: <199906161219.IAA05802@eric.cnri.reston.va.us> [me] > > If it works on Linux, Solaris, Irix and Windows, and is reasonably > > clean, I'll take it. Please send it. [Greg] > Actually, my preference is to see a change to open() rather than a whole > new module. For example, let's say that you open a file, specifying > memory-mapping. Then you create a buffer against that file: > > f = open('foo','rm') # 'm' means mem-map > b = buffer(f) > print b[100:200] Buh. Changes of this kind to builtins are painful, especially since we expect that this feature may or may not be supported. And imagine the poor reader who comes across this for the first time... What's wrong with import mmap f = mmap.open('foo', 'r') ??? > I'd like to see mmap native in Python. I won't push, though, until I can > run a test to see what kind of savings will occur when you mmap a .pyc > file and open PyBuffer objects against the thing for the code bytes. My > hypothesis is that you can reduce the working set of Python (i.e. amortize > the cost of a .pyc's code over several processes by mmap'ing it); this > depends on the proportion of code in the pyc relative to "other" stuff. We've been through this before. I still doubt it will help much. Anyway, it's a completely independent feature from making the mmap module(any mmap module) available to users. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Wed Jun 16 14:24:26 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 16 Jun 1999 08:24:26 -0400 Subject: [Python-Dev] mmap In-Reply-To: Your message of "Wed, 16 Jun 1999 12:53:29 +0200." <015b01beb7e6$79b61610$f29b12c2@pythonware.com> References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Message-ID: <199906161224.IAA05815@eric.cnri.reston.va.us> > my good friend the mad scientist (the guy who writes code, > not the flaming cult-ridden brainwashed script kiddie) has > considered writing a whole new "abstract file" backend, to > entirely get rid of stdio in the Python core. some potential > advantages: > > -- performance (some stdio implementations are slow) > -- portability (stdio doesn't exist on some platforms!) You have this backwards -- you'd have to port the abstract backend first! Also don't forget that a *good* stdio might be using all sorts of platform-specific tricks that you'd have to copy to match its performance. > -- opens up for cool extensions (memory mapping, > pluggable file handlers, etc). > > should I tell him to start hacking? Tcl/Tk does this. I see some advantages (e.g. you have more control over and knowledge of how much data is buffered) but also some disadvantages (more work to port, harder to use from C), plus tons of changes needed in the rest of Python. I'd say wait until Python 2.0 and let's keep stdio for 1.6. > PS. someone once told me that Perl goes "below" the standard > file I/O system. 
does anyone here know if that's true, and per- > haps even explain how they're doing that... Probably just means that they use the C equivalent of os.open() and friends. --Guido van Rossum (home page: http://www.python.org/~guido/) From gward at cnri.reston.va.us Wed Jun 16 14:25:34 1999 From: gward at cnri.reston.va.us (Greg Ward) Date: Wed, 16 Jun 1999 08:25:34 -0400 Subject: [Python-Dev] mmap In-Reply-To: <015b01beb7e6$79b61610$f29b12c2@pythonware.com>; from Fredrik Lundh on Wed, Jun 16, 1999 at 12:53:29PM +0200 References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Message-ID: <19990616082533.A4142@cnri.reston.va.us> On 16 June 1999, Fredrik Lundh said: > my good friend the mad scientist (the guy who writes code, > not the flaming cult-ridden brainwashed script kiddie) has > considered writing a whole new "abstract file" backend, to > entirely get rid of stdio in the Python core. some potential > advantages: [...] > PS. someone once told me that Perl goes "below" the standard > file I/O system. does anyone here know if that's true, and per- > haps even explain how they're doing that... My understanding (mainly from folklore -- peeking into the Perl source has been known to turn otherwise staid, solid programmers into raving lunatics) is that yes, Perl does grovel around in the internals of stdio implementations to wring a few extra cycles out. However, what's probably of more interest to you -- I mean your mad scientist alter ego -- is Perl's I/O abstraction layer: a couple of years ago, somebody hacked up Perl's guts to do basically what you're proposing for Python. The main result was a half-baked, unfinished (at least as of last summer, when I actually asked an expert in person at the Perl Conference) way of building Perl with AT&T's sfio library instead of stdio. I think the other things you mentioned, eg. more natural support for memory-mapped files, have also been bandied about as advantages of this scheme. The main problem with Perl's I/O abstraction layer is that extension modules now have to call e.g. PerlIO_open(), PerlIO_printf(), etc. in place of their stdio counterparts. Surprise surprise, many extension modules have not adapted to the new way of doing things, even though it's been in Perl since version 5.003 (I think). Even more surprisingly, the fourth-party C libraries that those extension modules often interface to haven't switched to using Perl's I/O abstraction layer. This doesn't make a whit of difference if Perl is built in either the "standard way" (no abstraction layer, just direct stdio) or with the abstraction layer on top of stdio. But as soon as some poor fool decides Perl on top of sfio would be neat, lots of extension modules break -- their I/O calls go nowhere. I'm sure there is some sneaky way to make it all work using sfio's binary compatibility layer and some clever macros. This might even have been done. However, AFAIK it's not been documented anywhere. This is not merely to bitch about unfinished business in the Perl core; it's to warn you that others have walked down the road you propose to tread, and there may be potholes. Now if the Python source really does get even more modularized for 1.6, you might have a much easier job of it. ("Modular" is not the word that jumps to mind when one looks at the Perl source code.) 
Greg /* * "Far below them they saw the white waters pour into a foaming bowl, and * then swirl darkly about a deep oval basin in the rocks, until they found * their way out again through a narrow gate, and flowed away, fuming and * chattering, into calmer and more level reaches." */ -- Tolkein, by way of perl/doio.c -- Greg Ward - software developer gward at cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive voice: +1-703-620-8990 Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913 From beazley at cs.uchicago.edu Wed Jun 16 15:23:32 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Wed, 16 Jun 1999 08:23:32 -0500 (CDT) Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Message-ID: <199906161323.IAA28642@gargoyle.cs.uchicago.edu> Fredrik Lundh writes: > > my good friend the mad scientist (the guy who writes code, > not the flaming cult-ridden brainwashed script kiddie) has > considered writing a whole new "abstract file" backend, to > entirely get rid of stdio in the Python core. some potential > advantages: > > -- performance (some stdio implementations are slow) > -- portability (stdio doesn't exist on some platforms!) > -- opens up for cool extensions (memory mapping, > pluggable file handlers, etc). > > should I tell him to start hacking? > I am not in favor of obscuring Python's I/O model too much. When working with C extensions, it is critical to have access to normal I/O mechanisms such as 'FILE *' or integer file descriptors. If you hide all of this behind some sort of abstract I/O layer, it's going to make life hell for extension writers unless you also provide a way to get access to the raw underlying data structures. This is a major gripe I have with the Tcl channel model--namely, there seems to be no easy way to unravel a Tcl channel into a raw file-descriptor for use in C (unless I'm being dense and have missed some simple way to do it). Also, what platforms are we talking about here? I've never come across any normal machine that had a C compiler, but did not have stdio. Is this really a serious problem? Cheers, Dave From MHammond at skippinet.com.au Wed Jun 16 15:47:44 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 16 Jun 1999 23:47:44 +1000 Subject: [Python-Dev] mmap In-Reply-To: <19990616082533.A4142@cnri.reston.va.us> Message-ID: <011c01beb7fe$d213c600$0801a8c0@bobcat> [Greg writes] > The main problem with Perl's I/O abstraction layer is that extension > modules now have to call e.g. PerlIO_open(), PerlIO_printf(), etc. in > place of their stdio counterparts. Surprise surprise, many extension Interestingly, Python _nearly_ suffers this problem now. Although Python does use native FILE pointers, this scheme still assumes that Python and the extensions all use the same stdio. I understand that on most Unix system this can be taken for granted. However, to be truly cross-platform, this assumption may not be valid. A case in point is (surprise surprise :-) Windows. Windows has a number of C RTL options, and Python and its extensions must be careful to select the one that shares FILE * and the heap across separately compiled and linked modules. In-fact, Windows comes with an excellent debug version of the C RTL, but this gets in Python's way - if even one (but not all) Python extension attempts to use these debugging features, we die in a big way. and-dont-even-talk-to-me-about-Windows-CE ly, Mark. 
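To make the earlier mmap exchange concrete: Guido's module-based spelling leaves the builtin open() alone and puts the platform-dependent feature somewhere it can fail gracefully. A minimal sketch of what client code might look like under that proposal -- mmap.open() and slice access are the interface being discussed in this thread, not an existing module, so every detail here is an assumption:

    import mmap                      # the proposed module, not a shipped one

    f = mmap.open('foo', 'r')        # map the whole file read-only
    header = f[0:16]                 # random access, no read()/seek() dance
    print len(f), `header`
    f.close()

Greg's buffer(open('foo', 'rm')) spelling would give the same slice-style access, but only by teaching the builtin open() a mode flag that some platforms cannot honour, which is exactly the objection raised above.
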
From bwarsaw at cnri.reston.va.us Wed Jun 16 16:42:01 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Wed, 16 Jun 1999 10:42:01 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <010201beb781$d1febf30$0801a8c0@bobcat> <199906152304.TAA05136@eric.cnri.reston.va.us> <37676067.62E272F4@lemburg.com> Message-ID: <14183.47033.656933.642197@anthem.cnri.reston.va.us> >>>>> "M" == M writes: M> The u"..." looks more intuitive too me. While inheriting C/C++ M> constructs usually makes sense I think usage in the C community M> is not that wide-spread yet and for a Python freak, the small u M> will definitely remind him of Unicode whereas the L will stand M> for (nearly) unlimited length/precision. I don't think I've every seen C code with L"..." strings in them. Here's my list in no particular order. U"..." -- reminds Java/JPython users of Unicode. Alternative mnemonic: Unamerican-strings L"..." -- long-strings, Lundh-strings, ... W"..." -- wide-strings, Warsaw-strings (just trying to take credit where credit's not due :), what-the-heck-are-these?-strings H"..." -- happy-strings, Hammond-strings, hey-you-just-made-my-extension-module-crash-strings F"..." -- funky-stuff-in-these-hyar-strings A"..." -- ain't-strings S"..." -- strange-strings, silly-strings M> Not that this is important, but... Agreed. -Barry From fredrik at pythonware.com Wed Jun 16 21:11:02 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 21:11:02 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <19990616082533.A4142@cnri.reston.va.us> Message-ID: <001901beb82b$fab54200$f29b12c2@pythonware.com> Greg Ward wrote: > This is not merely to bitch about unfinished business in the Perl core; > it's to warn you that others have walked down the road you propose to > tread, and there may be potholes. oh, the mad scientist have rushed down that road a few times before. we'll see if he's prepared to do that again; it sure won't happen before the unicode stuff is in place... From fredrik at pythonware.com Wed Jun 16 21:16:56 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 21:16:56 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161224.IAA05815@eric.cnri.reston.va.us> Message-ID: <004a01beb82e$36ba54a0$f29b12c2@pythonware.com> > > -- performance (some stdio implementations are slow) > > -- portability (stdio doesn't exist on some platforms!) > > You have this backwards -- you'd have to port the abstract backend > first! Also don't forget that a *good* stdio might be using all sorts > of platform-specific tricks that you'd have to copy to match its > performance. well, if the backend layer is good enough, I don't think a stdio-based standard version will be much slower than todays stdio-only implementation. > > PS. someone once told me that Perl goes "below" the standard > > file I/O system. does anyone here know if that's true, and per- > > haps even explain how they're doing that... > > Probably just means that they use the C equivalent of os.open() and > friends. hopefully. my original source described this as "digging around in the innards of the stdio package" (and so did greg). and the same source claimed it wasn't yet ported to Linux. sounds weird, to say the least, but maybe he referred to that sfio package greg mentioned. I'll do some digging, but not today. 
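On the "abstract file" idea itself: at the Python level the abstraction already exists informally, because most library code only assumes a handful of methods such as read() and readline(), so any file-like object works. A small sketch (not from the thread) of what that buys you:

    import string, StringIO

    def count_words(fp):
        # fp only needs readline() -- a real file, a StringIO,
        # a socket's makefile(), ...
        n = 0
        while 1:
            line = fp.readline()
            if not line:
                break
            n = n + len(string.split(line))
        return n

    print count_words(StringIO.StringIO('a b c\nd e\n'))    # prints 5
    # count_words(open('spam.txt')) works the same way

The hard part, as the rest of this thread shows, is getting the same kind of indirection on the C side without breaking every extension that expects a FILE *.
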
From fredrik at pythonware.com Wed Jun 16 21:27:02 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 21:27:02 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161323.IAA28642@gargoyle.cs.uchicago.edu> Message-ID: <004b01beb82e$36d44540$f29b12c2@pythonware.com> David Beazley wrote: > I am not in favor of obscuring Python's I/O model too much. When > working with C extensions, it is critical to have access to normal I/O > mechanisms such as 'FILE *' or integer file descriptors. If you hide > all of this behind some sort of abstract I/O layer, it's going to make > life hell for extension writers unless you also provide a way to get > access to the raw underlying data structures. This is a major gripe > I have with the Tcl channel model--namely, there seems to be no easy > way to unravel a Tcl channel into a raw file-descriptor for use in C > (unless I'm being dense and have missed some simple way to do it). > > Also, what platforms are we talking about here? I've never come > across any normal machine that had a C compiler, but did not have stdio. > Is this really a serious problem? in a way, it is a problem today under Windows (in other words, on most of the machines where Python is used today). it's very easy to end up with different DLL's using different stdio implementations, resulting in all kinds of strange errors. a rewrite could use OS-level handles instead, and get rid of that problem. not to mention Windows CE (iirc, Mark had to write his own stdio-ish package for the CE port), maybe PalmOS, BeOS's BFile's, and all the other upcoming platforms which will make Windows look like a fairly decent Unix clone ;-) ... and in Python, any decent extension writer should write code that works with arbitrary file objects, right? "if it cannot deal with StringIO objects, it's broken"... From beazley at cs.uchicago.edu Wed Jun 16 21:53:23 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Wed, 16 Jun 1999 14:53:23 -0500 (CDT) Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161323.IAA28642@gargoyle.cs.uchicago.edu> <004b01beb82e$36d44540$f29b12c2@pythonware.com> Message-ID: <199906161953.OAA04527@gargoyle.cs.uchicago.edu> Fredrik Lundh writes: > > and in Python, any decent extension writer should write > code that works with arbitrary file objects, right? "if it > cannot deal with StringIO objects, it's broken"... I disagree. Given that a lot of people use Python as a glue language for interfacing with legacy codes, it is unacceptable for extensions to be forced to use some sort of funky non-standard I/O abstraction. Unless you are volunteering to rewrite all of these codes to use the new I/O model, you are always going to need access (in one way or another) to plain old 'FILE *' and integer file descriptors. Of course, one can always just provide a function like FILE *PyFile_AsFile(PyObject *o) That takes an I/O object and returns a 'FILE *' where supported. (Of course, if it's not supported, then it doesn't matter if this function is missing since any extension that needs a 'FILE *' wouldn't work anyways). 
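For what it's worth, the integer-descriptor half of this is already reachable from Python through the standard fileno() method, so only the 'FILE *' case needs something like the function above. A trivial sketch:

    import os

    f = open('/etc/passwd')          # any existing file will do
    fd = f.fileno()                  # the OS-level handle behind the file object
    print os.fstat(fd)[6]            # file size, straight from the descriptor
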
Cheers, Dave From fredrik at pythonware.com Wed Jun 16 22:04:54 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 22:04:54 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com><199906161323.IAA28642@gargoyle.cs.uchicago.edu><004b01beb82e$36d44540$f29b12c2@pythonware.com> <199906161953.OAA04527@gargoyle.cs.uchicago.edu> Message-ID: <009d01beb833$80d15d40$f29b12c2@pythonware.com> > > and in Python, any decent extension writer should write > > code that works with arbitrary file objects, right? "if it > > cannot deal with StringIO objects, it's broken"... > > I disagree. Given that a lot of people use Python as a glue language > for interfacing with legacy codes, it is unacceptable for extensions > to be forced to use some sort of funky non-standard I/O abstraction. oh, you're right, of course. should have added that extra smiley to that last line. cut and paste from this mail if necessary: ;-) > Unless you are volunteering to rewrite all of these codes to use the > new I/O model, you are always going to need access (in one way or > another) to plain old 'FILE *' and integer file descriptors. Of > course, one can always just provide a function like > > FILE *PyFile_AsFile(PyObject *o) > > That takes an I/O object and returns a 'FILE *' where supported. exactly my idea. when scanning the code, PyFile_AsFile immediately popped up as a potential pothole (if you need the fileno, there's already a method for that in the "standard file object interface"). btw, an "abstract file object" could actually make it much easier to support arbitrary file objects from C/C++ extensions. just map the calls back to Python. or add a tp_file slot, and things get really interesting... > (Of course, if it's not supported, then it doesn't matter if this > function is missing since any extension that needs a 'FILE *' wouldn't > work anyways). yup. I suspect some legacy code may have a hard time running under CE et al. but of course, with a little macro trickery, no- thing stops you from recompiling such code so it uses Python's new "abstract file... okay, okay, I'll stop now ;-) From beazley at cs.uchicago.edu Wed Jun 16 22:13:42 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Wed, 16 Jun 1999 15:13:42 -0500 (CDT) Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161323.IAA28642@gargoyle.cs.uchicago.edu> <004b01beb82e$36d44540$f29b12c2@pythonware.com> <199906161953.OAA04527@gargoyle.cs.uchicago.edu> <009d01beb833$80d15d40$f29b12c2@pythonware.com> Message-ID: <199906162013.PAA04781@gargoyle.cs.uchicago.edu> Fredrik Lundh writes: > > > and in Python, any decent extension writer should write > > > code that works with arbitrary file objects, right? "if it > > > cannot deal with StringIO objects, it's broken"... > > > > I disagree. Given that a lot of people use Python as a glue language > > for interfacing with legacy codes, it is unacceptable for extensions > > to be forced to use some sort of funky non-standard I/O abstraction. > > oh, you're right, of course. should have added that extra smiley > to that last line. cut and paste from this mail if necessary: ;-) > Good. You had me worried there for a second :-). > > yup. I suspect some legacy code may have a hard time running > under CE et al. but of course, with a little macro trickery, no- > thing stops you from recompiling such code so it uses Python's > new "abstract file... okay, okay, I'll stop now ;-) Macro trickery? 
Oh yes, we could use that too... (one can never have too much macro trickery if you ask me :-) Cheers, Dave From arw at ifu.net Thu Jun 17 16:12:16 1999 From: arw at ifu.net (Aaron Watters) Date: Thu, 17 Jun 1999 10:12:16 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] Message-ID: <37690240.66F601E1@ifu.net> > no-positive-suggestions-just-grousing-ly y'rs - tim On the contrary. I think this is definitively a bad idea. Retracted. A double negative is a positive. -- Aaron Watters === "Criticism serves the same purpose as pain. It's not pleasant but it suggests that something is wrong." -- Churchill (paraphrased from memory) From da at ski.org Thu Jun 17 19:50:20 1999 From: da at ski.org (David Ascher) Date: Thu, 17 Jun 1999 10:50:20 -0700 (Pacific Daylight Time) Subject: [Python-Dev] org.python.org Message-ID: Not all that revolutionary, but an interesting migration path. FWIW, I think the underlying issue is a real one. We're starting to have more and more conflicts, even among package names. (Of course the symlink solution doesn't work on Win32, but that's a detail =). --david ---------- Forwarded message ---------- Date: Thu, 17 Jun 1999 13:44:33 -0400 (EDT) From: Andy Dustman To: Gordon McMillan Cc: M.-A. Lemburg , Crew List Subject: Re: [Crew] Wizards' Resolution to Zope/PIL/mxDateTime conflict? On Thu, 17 Jun 1999, Gordon McMillan wrote: > M.A.L. wrote: > > > Or maybe we should start the com.domain.mypackage thing ASAP. > > I know many are against this proposal (makes Python look Feudal? > Reminds people of the J language?), but I think it's the only thing > that makes sense. It does mean you have to do some ugly things to get > Pickle working properly. Actually, it can be done very easily. I just tried this, in fact: cd /usr/lib/python1.5 mkdir -p org/python (cd org/python; ln -s ../.. core) touch __init__.py org/__init__.py org/python/__init__.py >>> from org.python.core import rfc822 >>> import profile So this seems to make things nice and backwards compatible. My only concern was having __init__.py in /usr/lib/python1.5, but this doesn't seem to break anything. Of course, if you are using some trendy new atrocity like Windoze, this might not work. -- andy dustman | programmer/analyst | comstar communications corporation telephone: 770.485.6025 / 706.549.7689 | icq: 32922760 | pgp: 0xc72f3f1d _______________________________________________ Crew maillist - Crew at starship.python.net http://starship.python.net/mailman/listinfo/crew From gmcm at hypernet.com Thu Jun 17 21:36:49 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 17 Jun 1999 14:36:49 -0500 Subject: [Python-Dev] org.python.org In-Reply-To: Message-ID: <1282474031-114884629@hypernet.com> David forwards from Starship Crew list: > Not all that revolutionary, but an interesting migration path. > FWIW, I think the underlying issue is a real one. We're starting to > have more and more conflicts, even among package names. (Of course > the symlink solution doesn't work on Win32, but that's a detail =). > > --david > > ---------- Forwarded message ---------- > Date: Thu, 17 Jun 1999 13:44:33 -0400 (EDT) > From: Andy Dustman > To: Gordon McMillan > Cc: M.-A. Lemburg , Crew List > Subject: Re: [Crew] Wizards' Resolution to > Zope/PIL/mxDateTime conflict? > > On Thu, 17 Jun 1999, Gordon McMillan wrote: > > > M.A.L. wrote: > > > > > Or maybe we should start the com.domain.mypackage thing ASAP. > > > > I know many are against this proposal (makes Python look Feudal? 
> > Reminds people of the J language?), but I think it's the only thing > > that makes sense. It does mean you have to do some ugly things to get > > Pickle working properly. > > Actually, it can be done very easily. I just tried this, in fact: > > cd /usr/lib/python1.5 > mkdir -p org/python > (cd org/python; ln -s ../.. core) > touch __init__.py org/__init__.py org/python/__init__.py > > >>> from org.python.core import rfc822 > >>> import profile > > So this seems to make things nice and backwards compatible. My only > concern was having __init__.py in /usr/lib/python1.5, but this > doesn't seem to break anything. Of course, if you are using some > trendy new atrocity like Windoze, this might not work. In vanilla cases it's backwards compatible. I try packag-izing almost everything I install. Sometimes it works, sometimes it doesn't. In your example, rfc822 uses only builtins at the top level. It's main will import os. Would that work if os lived in org.python.core? Though I really don't think we need to packagize the std distr, (if that happens, I would think it would be for a different reason). The 2 main problems I run across in packagizing things are intra-package imports (where M.A.L's proposal for relative names in dotted imports might ease the pain) and Pickle / cPickle (where the ugliness of the workarounds has often made me drop back to marshal). - Gordon From MHammond at skippinet.com.au Fri Jun 18 10:31:21 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Fri, 18 Jun 1999 18:31:21 +1000 Subject: [Python-Dev] Merge the string_methods tag? Message-ID: <015601beb964$f37a4fa0$0801a8c0@bobcat> Ive been running the string_methods tag (term?) under CVS for quite some time now, and it seems to work perfectly. I admit that I havent stressed the string methods much, but I feel confident that Barry's patches havent broken existing string code. Also, I find using that tag with CVS a bit of a pain. A few updates have been checked into the main branch, and you tend to miss these (its a pity CVS can't be told "only these files are affected by this tag, so the rest should follow the main branch." I know I can do that personally, but that means I personally need to know all files possibly affected by the branch.) Anyway, I digress... I propose that these extensions be merged into the main branch. The main advantage is that we force more people to bash on it, rather than allowing them to make that choice . If the Unicode type is also considered highly experimental, we can make a new tag for that change, but that is really quite independant of the string methods. Mark. From fredrik at pythonware.com Fri Jun 18 10:56:47 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 18 Jun 1999 10:56:47 +0200 Subject: [Python-Dev] cvs problems References: <015601beb964$f37a4fa0$0801a8c0@bobcat> Message-ID: <001d01beb968$7fd47540$f29b12c2@pythonware.com> maybe not the right forum, but I suppose everyone here is using CVS, so... ...could anyone explain why I keep getting this error? $ cvs -z6 up -P -d ... cvs server: Updating dist/src/Tools/ht2html cvs [server aborted]: cannot open directory /projects/cvsroot/python/dist/src/Tools/ht2html: No such file or directory it used to work... From tismer at appliedbiometrics.com Fri Jun 18 11:47:15 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 18 Jun 1999 11:47:15 +0200 Subject: [Python-Dev] Flat Python in Linux Weekly Message-ID: <376A15A3.3968EADE@appliedbiometrics.com> Howdy, Who would have thought this... 
Linux Weekly took notice. http://lwn.net/bigpage.phtml derangedly yours - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From mal at lemburg.com Fri Jun 18 12:05:52 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 18 Jun 1999 12:05:52 +0200 Subject: [Python-Dev] Relative package imports Message-ID: <376A1A00.3099DE99@lemburg.com> Although David has already copy-posted a message regarding this issue to the list, I would like to restate the problem to get a discussion going (and then maybe take it to c.l.p for general flaming ;). The problem we have run into on starship is that some well-known packages have introduced naming conflicts leading to the unfortunate situation that they can't be all installed on the same default path: 1. Zope has a module named DateTime which also is the base name of the package mxDateTime. 2. Both Zope and PIL have a top-level module named ImageFile.py (different ones of course). Now the problem is how to resolve these issues. One possibility is turning Zope and PIL into proper packages altogether. To ease this transition, one would need a way to specify relative intra-package imports and a way to tell pickle where to look for modules/packages. The next problem we'd probably run into sooner or later is that there are quite a few useful top-level modules with generic names that will conflict with package names and other modules with the same name. I guess we'd need at least three things to overcome this situation once and for all ;-): 1. Provide a way to do relative imports, e.g. a single dot could be interpreted as "parent package": modA.py modD.py [A] modA.py modB.py [B] modC.py modD.py In modC.py: from modD import * (works as usual: import A.B.modD) from .modA import * (imports A.modA) from ..modA import * (import the top-level modA) 2. Establish a general vendor based naming scheme much like the one used in the Java world: from org.python.core import time,os,string from org.zope.core import * from com.lemburg import DateTime from com.pythonware import PIL 3. Add a way to prevent double imports of the same file. This is the mayor gripe I have with pickle currently, because intra- package imports often lead to package modules being imported twice leading to many strange problems (e.g. splitting class hierarchies, problems with isinstance() and issubclass(), etc.), e.g. from org.python.core import UserDict u = UserDict.UserDict() import UserDict v = UserDict.UserDict() Now u and v will point to two different classes: >>> u.__class__ >>> v.__class__ 4. Add some kind of redirection or lookup hook to pickle et al. so that imports done during unpickling can be redirected to the correct (possibly renamed) package. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 196 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fredrik at pythonware.com Fri Jun 18 12:47:49 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 18 Jun 1999 12:47:49 +0200 Subject: [Python-Dev] Flat Python in Linux Weekly References: <376A15A3.3968EADE@appliedbiometrics.com> Message-ID: <001901beb978$0312a440$f29b12c2@pythonware.com> flat eric, flat beat, flat python? 
http://www.flateric-online.de (best viewed through babelfish.altavista.com, of course ;-) should-flat-eric-in-the-routeroute-route-along-ly yrs /F From fredrik at pythonware.com Fri Jun 18 12:51:21 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 18 Jun 1999 12:51:21 +0200 Subject: [Python-Dev] Relative package imports References: <376A1A00.3099DE99@lemburg.com> Message-ID: <001f01beb978$8177aab0$f29b12c2@pythonware.com> > 2. Both Zope and PIL have a top-level module named ImageFile.py > (different ones of course). > > Now the problem is how to resolve these issues. One possibility > is turning Zope and PIL into proper packages altogether. To > ease this transition, one would need a way to specify relative > intra-package imports and a way to tell pickle where to look > for modules/packages. fwiw, PIL 1.0b1 can already be used as a package, but you have to explicitly import the file format handlers you need: from PIL import Image import PIL.GifImagePlugin import PIL.PngImagePlugin import PIL.JpegImagePlugin etc. this has been fixed in PIL 1.0 final. From guido at CNRI.Reston.VA.US Fri Jun 18 16:51:16 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 18 Jun 1999 10:51:16 -0400 Subject: [Python-Dev] Merge the string_methods tag? In-Reply-To: Your message of "Fri, 18 Jun 1999 18:31:21 +1000." <015601beb964$f37a4fa0$0801a8c0@bobcat> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> Message-ID: <199906181451.KAA11549@eric.cnri.reston.va.us> > Ive been running the string_methods tag (term?) under CVS for quite some > time now, and it seems to work perfectly. I admit that I havent stressed > the string methods much, but I feel confident that Barry's patches havent > broken existing string code. > > Also, I find using that tag with CVS a bit of a pain. A few updates have > been checked into the main branch, and you tend to miss these (its a pity > CVS can't be told "only these files are affected by this tag, so the rest > should follow the main branch." I know I can do that personally, but that > means I personally need to know all files possibly affected by the branch.) > Anyway, I digress... > > I propose that these extensions be merged into the main branch. The main > advantage is that we force more people to bash on it, rather than allowing > them to make that choice . If the Unicode type is also considered > highly experimental, we can make a new tag for that change, but that is > really quite independant of the string methods. Hmm... This would make it hard to make a patch release for 1.5.2 (possible called 1.5.3?). I *really* don't want the string methods to end up in a release yet -- there are too many rough edges (e.g. some missing methods, should join str() or not, etc.). I admit that managing CVS branches is painful. We may find that it works better to create a branch for patch releases and to do all new development on the main release... But right now I don't want to change anything yet. In any case Barry just went on vacation so we'll have to wait 10 days... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Fri Jun 18 16:55:45 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 18 Jun 1999 10:55:45 -0400 Subject: [Python-Dev] cvs problems In-Reply-To: Your message of "Fri, 18 Jun 1999 10:56:47 +0200." 
<001d01beb968$7fd47540$f29b12c2@pythonware.com> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <001d01beb968$7fd47540$f29b12c2@pythonware.com> Message-ID: <199906181455.KAA11564@eric.cnri.reston.va.us> > maybe not the right forum, but I suppose everyone > here is using CVS, so... > > ...could anyone explain why I keep getting this error? > > $ cvs -z6 up -P -d > ... > cvs server: Updating dist/src/Tools/ht2html > cvs [server aborted]: cannot open directory /projects/cvsroot/python/dist/src/Tools/ht2html: No such > file or directory > > it used to work... EXPLANATION: For some reason that directory existed on the mirror server but not in the master CVS tree repository. It was created once but quickly deleted -- not quickly enough apparently to prevent it to leak to the slave. Then we did a global resync from the master to the mirror and that wiped out the mirror version. Good riddance. FIX: Edit Tools/CVS/Entries and delete the line that mentions ht2html, then do another cvs update. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Fri Jun 18 17:41:54 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 18 Jun 1999 11:41:54 -0400 Subject: [Python-Dev] cvs problems In-Reply-To: <001d01beb968$7fd47540$f29b12c2@pythonware.com> Message-ID: <000901beb9a1$179d2380$b79e2299@tim> [/F] > ...could anyone explain why I keep getting this error? > > $ cvs -z6 up -P -d > ... > cvs server: Updating dist/src/Tools/ht2html > cvs [server aborted]: cannot open directory > /projects/cvsroot/python/dist/src/Tools/ht2html: No such > file or directory > > it used to work... It stopped working a week ago Thursday, and Guido & Barry know about it. The directory in question vanished from the server under mysterious circumstances. You can get going again by deleting the ht2html line in your local Tools/CVS/Entries file. From da at ski.org Fri Jun 18 19:09:27 1999 From: da at ski.org (David Ascher) Date: Fri, 18 Jun 1999 10:09:27 -0700 (Pacific Daylight Time) Subject: [Python-Dev] automatic wildcard expansion on Win32 Message-ID: A python-help poster finally convinced me that there was a way to enable automatic wildcard expansion on win32. This is done by linking in "setargv.obj" along with all of the other MS libs. Quick testing shows that it works. Is this a feature we want to add? I can see both sides of that coin. --david PS: I saw a RISKS digest posting last week which had a horror story about wildcard expansion on some flavor of Windows. The person had two files with long filenames: verylongfile1.txt and verylongfile2.txt But Win32 stored them in 8.3 format, so they were stored as verylo~2.txt and verylo~1.txt (Yes, the 1 and 2 were swapped!). So when he did del *1.txt he removed the wrong file. Neat, eh? (This is actually relevant -- it's possible that setargv.obj and glob.glob could give different answers). --david From guido at CNRI.Reston.VA.US Fri Jun 18 20:09:29 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 18 Jun 1999 14:09:29 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Your message of "Fri, 18 Jun 1999 10:09:27 PDT." References: Message-ID: <199906181809.OAA12090@eric.cnri.reston.va.us> > A python-help poster finally convinced me that there was a way to enable > automatic wildcard expansion on win32. This is done by linking in > "setargv.obj" along with all of the other MS libs. Quick testing shows > that it works. > > Is this a feature we want to add? I can see both sides of that coin. 
I don't see big drawbacks except minor b/w compat problems. Should it be done for both python.exe and pythonw.exe? --Guido van Rossum (home page: http://www.python.org/~guido/) From da at ski.org Fri Jun 18 22:06:09 1999 From: da at ski.org (David Ascher) Date: Fri, 18 Jun 1999 13:06:09 -0700 (Pacific Daylight Time) Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: <199906181809.OAA12090@eric.cnri.reston.va.us> Message-ID: On Fri, 18 Jun 1999, Guido van Rossum wrote: > I don't see big drawbacks except minor b/w compat problems. > > Should it be done for both python.exe and pythonw.exe? Sure. From MHammond at skippinet.com.au Sat Jun 19 02:56:42 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Sat, 19 Jun 1999 10:56:42 +1000 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Message-ID: <016e01beb9ee$99e1a710$0801a8c0@bobcat> > A python-help poster finally convinced me that there was a > way to enable > automatic wildcard expansion on win32. This is done by linking in > "setargv.obj" along with all of the other MS libs. Quick > testing shows > that it works. This has existed since I have been using C on Windows. I personally would vote against it. AFAIK, common wisdom on Windows is to not use this. Indeed, if people felt that this behaviour was an improvement, MS would have enabled it by default at some stage over the last 10 years it has existed, and provided a way of disabling it! This behaviour causes subtle side effects; effects Unix users are well aware of, due to every single tool using it. Do the tricks needed to get the wildcard down to the program exist? Will any windows users know what they are? IMO, Windows "fixed" the Unix behaviour by dropping this, and they made a concession to die-hards by providing a rarely used way of enabling it. Windows C programmers dont expect it, VB programmers dont expect it, even batch file programmers dont expect it. I dont think we should use it. > (This is actually relevant -- it's possible that setargv.obj > and glob.glob > could give different answers). Exactly. As may win32api.FindFiles(). Give the user the wildcard, and let them make sense of it. The trivial case of using glob() is so simple I dont believe it worth hiding. Your horror story of the incorrect file being deleted could then only be blamed on the application, not on Python! Mark. From tim_one at email.msn.com Sat Jun 19 03:00:46 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 18 Jun 1999 21:00:46 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Message-ID: <000501beb9ef$2ac61720$a69e2299@tim> [David Ascher] > A python-help poster finally convinced me that there was a way to enable > automatic wildcard expansion on win32. This is done by linking in > "setargv.obj" along with all of the other MS libs. Quick testing shows > that it works. > > Is this a feature we want to add? I can see both sides of that coin. The only real drawback I see is that we're then under some obligation to document Python's behavior. Which is then inherited from the MS setargv.obj, which is in turn only partially documented in developer-only docs, and incorrectly documented at that. > PS: I saw a RISKS digest posting last week which had a horror story about > wildcard expansion on some flavor of Windows. 
The person had two files > with long filenames: > > verylongfile1.txt > and > verylongfile2.txt > > But Win32 stored them in 8.3 format, so they were stored as > verylo~2.txt > and > verylo~1.txt > > (Yes, the 1 and 2 were swapped!). So when he did > > del *1.txt > > he removed the wrong file. Neat, eh? > > (This is actually relevant -- it's possible that setargv.obj and > glob.glob could give different answers). Yes, and e.g. it works this way under Win95: D:\Python>dir *~* Volume in drive D is DISK1PART2 Volume Serial Number is 1DFF-0F59 Directory of D:\Python PYCLBR~1 PAT 5,765 06-07-99 11:41p pyclbr.patch KJBUCK~1 PYD 34,304 03-31-98 3:07a kjbuckets.pyd WIN32C~1 05-16-99 12:10a win32comext PYTHON~1 05-16-99 12:10a Pythonwin TEXTTO~1 01-15-99 11:35p TextTools UNWISE~1 EXE 109,056 07-03-97 8:35a UnWisePW32.exe 3 file(s) 149,125 bytes 3 dir(s) 1,502,511,104 bytes free Here's the same thing in an argv-spewing console app whipped up to link setargv.obj: D:\Python>garp\debug\garp *~* 0: D:\PYTHON\GARP\DEBUG\GARP.EXE 1: kjbuckets.pyd 2: pyclbr.patch 3: Pythonwin 4: TextTools 5: UnWisePW32.exe 6: win32comext D:\Python> setargv.obj is apparently consistent with what native wildcard expansion does (although you won't find that promise made anywhere!), and it's definitely surprising in the presence of non-8.3 names. The quoting rules too are impossible to explain, seemingly random: D:\Python>garp\debug\garp "\\a\\" 0: D:\PYTHON\GARP\DEBUG\GARP.EXE 1: \\a\ D:\Python> Before I was on the Help list, I used to believe it would work to just say "well, it does what Windows does" . magnification-of-ignorance-ly y'rs - tim From tim_one at email.msn.com Sat Jun 19 03:26:42 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 18 Jun 1999 21:26:42 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: <016e01beb9ee$99e1a710$0801a8c0@bobcat> Message-ID: <000701beb9f2$c95b9880$a69e2299@tim> [MarkH, with *the* killer argument <0.3 wink>] > Your horror story of the incorrect file being deleted could then > only be blamed on the application, not on Python! Sold! Some years ago in the Perl world, they solved this by making regular old perl.exe not expand wildcards on Windows, but also supplying perlglob.exe which did. Don't know what they're doing today, but they apparently changed their minds at least once, as the couple-years-old version of perl.exe on my machine does do wildcard expansion, and does the wrong (i.e., the Windows ) thing. screw-it-ly y'rs - tim From tim_one at email.msn.com Sat Jun 19 20:45:16 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 19 Jun 1999 14:45:16 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <199906101411.KAA29962@eric.cnri.reston.va.us> Message-ID: <000801beba83$df719e80$c49e2299@tim> Backtracking: [Aaron] > I've always considered it a major shame that Python ints and floats > and chars and stuff have anything to do with dynamic allocation ... [Guido] > What you're describing is very close to what I recall I once read > about the runtime organization of Icon. Perl may also use a variant > on this (it has fixed-length object headers). ... I've rarely been able to make sense of Perl's source code, but gave it another try anyway. An hour later I gave up unenlightened, so cruised the web. Turns out there's a *terrific* writeup of Perl's type representation at: http://home.sol.no/~aas/perl/guts/ Pictures and everything . 
Header is 3 words: An 8-bit "type" field, 24 baffling flag bits (e.g., flag #14 is "BREAK -- refcnt is artificially low"(!)), 32 refcount bits, and a 32-bit pointer field. Appears that the pointer field is always a real (although possibly NULL) pointer. Plain ints have type code SvIV, and the pointer then points to a bogus address, but where that address + 3 words points to the actual integer value. Why? Because then they can use the same offset to get to the int as when the type is SvPVIV, which is the combined string/integer type, and needs three words (to point to the string start address, current len and allocated len) in addition to the integer value at the end. So why is the integer value at the end? So the same offsets work for the SvPV type, which is solely a string descriptor. So why is it important that SvPVIV, SvPV and SvIV all have the same layout? So that either of the latter types can be dynamically "upgraded" to SvPVIV (when a string is converted to int or vice versa; Perl then holds on to both representations internally) by plugging in a new type code and fiddling some of the baffling flag bits. Brr. I have no idea how they manage to keep Perl running! and-not-entirely-sure-that-they-do-ly y'rs - tim From mal at lemburg.com Mon Jun 21 11:54:50 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 21 Jun 1999 11:54:50 +0200 Subject: [Python-Dev] Relative package imports References: <376A1A00.3099DE99@lemburg.com> Message-ID: <376E0BEA.60F22945@lemburg.com> It seems that there is not much interest in the topic... I'll be offline for the next two weeks -- maybe someone could pick the thread up and toss it around a bit while I'm away. Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 193 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From MHammond at skippinet.com.au Mon Jun 21 13:23:34 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Mon, 21 Jun 1999 21:23:34 +1000 Subject: [Python-Dev] Relative package imports In-Reply-To: <376E0BEA.60F22945@lemburg.com> Message-ID: <000501bebbd8$80f56b10$0801a8c0@bobcat> > It seems that there is not much interest in the topic... > > I'll be offline for the next two weeks -- maybe someone could > pick the thread up and toss it around a bit while I'm away. OK - here are my 2c on it: Unless I am mistaken, this problem could be solved with 2 steps: * Code moves to Python packages. * The standard Python library move to a package. If all non-trivial Python program used packages, and some agreement on a standard namespace could be met, I think it would be addressed. There was a thread on the newsgroup about the potential naming of the standard library. You did state as much in your proposal - indeed, you state "to ease the transition". Personally, I dont think it is worth it, mainly because we end up with a half-baked scheme purely for the transition, but one that can never be removed. To me, the question is one of: * Why arent Zope/PIL capable of being used as packages. * If they are (as I understand to be the case) why do people choose not to use them as such, or why do the authors not recommend this? * Is there a deficiency in the package scheme that makes it hard to use? Eg, should "__" that ni used for the parent package be reinstated? Mark. 
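One concrete way to see the double-import problem Marc-Andre describes: when the same file is reachable both as a top-level module and through a package (for instance via the org.python.core symlink trick quoted earlier), Python creates two separate module objects and, with them, two separate classes. A minimal sketch, assuming that layout is installed:

    from org.python.core import UserDict
    u = UserDict.UserDict()             # class object from the package path

    import UserDict                     # rebinds the name to the top-level module
    v = UserDict.UserDict()             # same source file, second module object

    print u.__class__ is v.__class__    # 0 -- two distinct class objects
    print isinstance(u, v.__class__)    # 0 -- isinstance() checks now fail

Pickling bites the same way: an instance pickled under one module name will not load cleanly if the class is only importable under the other, which is why the thread keeps coming back to pickle.
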
From fredrik at pythonware.com Mon Jun 21 14:41:27 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 21 Jun 1999 14:41:27 +0200 Subject: [Python-Dev] Relative package imports References: <000501bebbd8$80f56b10$0801a8c0@bobcat> Message-ID: <006501bebbe3$6189e570$f29b12c2@pythonware.com> Mark Hammond wrote: > * Why arent Zope/PIL capable of being used as packages. PIL can be used as a package ("from PIL import Image"), assuming that it's installed under a directory in your path. there's one pro- blem in 1.0b1, though: you have to explicitly import the file format handlers you need: import PIL.JpegImagePlugin import PIL.PngImagePlugin this has been fixed in 1.0 final. > * If they are (as I understand to be the case) why do people choose not to > use them as such, or why do the authors not recommend this? inertia, and compatibility concerns. we've decided that all official material related to PIL 1.0 will use the old syntax (and all 1.X releases will be possible to install using the PIL.pth approach). too many users out there... now, PIL 2.0 is a completely different thing... > * Is there a deficiency in the package scheme that makes it hard to use? not that I'm aware... From mal at lemburg.com Mon Jun 21 16:36:58 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 21 Jun 1999 16:36:58 +0200 Subject: [Python-Dev] Relative package imports References: <000501bebbd8$80f56b10$0801a8c0@bobcat> Message-ID: <376E4E0A.3B714BAB@lemburg.com> Mark Hammond wrote: > > > It seems that there is not much interest in the topic... > > > > I'll be offline for the next two weeks -- maybe someone could > > pick the thread up and toss it around a bit while I'm away. > > OK - here are my 2c on it: > > Unless I am mistaken, this problem could be solved with 2 steps: > * Code moves to Python packages. > * The standard Python library move to a package. > > If all non-trivial Python program used packages, and some agreement on a > standard namespace could be met, I think it would be addressed. There was > a thread on the newsgroup about the potential naming of the standard > library. > > You did state as much in your proposal - indeed, you state "to ease the > transition". Personally, I dont think it is worth it, mainly because we > end up with a half-baked scheme purely for the transition, but one that can > never be removed. With "easing the transition" I ment introducing a way to do relative package imports: you don't need relative imports if you can be sure that the package name will never change (with a fixed naming scheme, a la com.domain.product.package...). The smarter import mechanism is needed to work-around the pickle problems you face (because pickle uses absolute package names). > To me, the question is one of: > > * Why arent Zope/PIL capable of being used as packages. > * If they are (as I understand to be the case) why do people choose not to > use them as such, or why do the authors not recommend this? > * Is there a deficiency in the package scheme that makes it hard to use? > Eg, should "__" that ni used for the parent package be reinstated? I guess this would help a great deal; although I'd personally wouldn't like yet another underscore in the language. Simply leave the name empty as in '.submodule' or '..subpackage.submodule'. 
Cheers, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 193 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Tue Jun 22 00:44:24 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 21 Jun 1999 18:44:24 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Your message of "Fri, 18 Jun 1999 21:26:42 EDT." <000701beb9f2$c95b9880$a69e2299@tim> References: <000701beb9f2$c95b9880$a69e2299@tim> Message-ID: <199906212244.SAA18866@eric.cnri.reston.va.us> > Some years ago in the Perl world, they solved this by making regular old > perl.exe not expand wildcards on Windows, but also supplying perlglob.exe > which did. This seems a reasonable way out. Just like we have pythonw.exe, we could add pythong.exe and pythongw.exe (or pythonwg.exe?). I guess it's time for a README.txt file to be installed explaining all the different executables... By default the g versions would not be used unless invoked explicitly. --Guido van Rossum (home page: http://www.python.org/~guido/) From Vladimir.Marangozov at inrialpes.fr Thu Jun 24 14:23:48 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Thu, 24 Jun 1999 14:23:48 +0200 (DFT) Subject: [Python-Dev] ob_refcnt access Message-ID: <199906241223.OAA46222@pukapuka.inrialpes.fr> How about introducing internal macros for explicit ob_refcnt accesses in the core? Actually, there are a number of places where one can see "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), _Py_SETREF(op, n) thus decoupling completely the low level refcount management defined in object.h: #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) Comments? I've contributed myself to the mess in intobject.c & floatobject.c, so I thought that such macros would make the code cleaner. Here's the current state of affairs: python/dist/src>find . -name "*.[c]" -exec grep ob_refcnt {} \; -print (void *) v, ((PyObject *) v)->ob_refcnt)) ./Modules/_tkinter.c if (self->arg->ob_refcnt > 1) { \ if (ob->ob_refcnt < 2 || self->fast) if (args->ob_refcnt > 1) { ./Modules/cPickle.c if (--inst->ob_refcnt > 0) { ./Objects/classobject.c if (result->ob_refcnt == 1) ./Objects/fileobject.c if (PyFloat_Check(p) && p->ob_refcnt != 0) if (!PyFloat_Check(p) || p->ob_refcnt == 0) { if (PyFloat_Check(p) && p->ob_refcnt != 0) { p, p->ob_refcnt, buf); ./Objects/floatobject.c if (PyInt_Check(p) && p->ob_refcnt != 0) if (!PyInt_Check(p) || p->ob_refcnt == 0) { if (PyInt_Check(p) && p->ob_refcnt != 0) p, p->ob_refcnt, p->ob_ival); ./Objects/intobject.c assert(v->ob_refcnt == 1); /* Since v will be used as accumulator! 
*/ ./Objects/longobject.c if (op->ob_refcnt <= 0) op->ob_refcnt, (long)op); op->ob_refcnt = 1; if (op->ob_refcnt < 0) fprintf(fp, "[%d] ", op->ob_refcnt); ./Objects/object.c if (!PyString_Check(v) || v->ob_refcnt != 1) { if (key->ob_refcnt == 2 && key == value) { ./Objects/stringobject.c if (!PyTuple_Check(op) || op->ob_refcnt != 1) { if (v == NULL || !PyTuple_Check(v) || v->ob_refcnt != 1) { ./Objects/tupleobject.c if (PyList_Check(seq) && seq->ob_refcnt == 1) { if (args->ob_refcnt > 1) { ./Python/bltinmodule.c if (value->ob_refcnt != 1) ./Python/import.c return PyInt_FromLong((long) arg->ob_refcnt); ./Python/sysmodule.c -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From guido at CNRI.Reston.VA.US Thu Jun 24 17:30:45 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 24 Jun 1999 11:30:45 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Your message of "Thu, 24 Jun 1999 14:23:48 +0200." <199906241223.OAA46222@pukapuka.inrialpes.fr> References: <199906241223.OAA46222@pukapuka.inrialpes.fr> Message-ID: <199906241530.LAA27887@eric.cnri.reston.va.us> > How about introducing internal macros for explicit ob_refcnt accesses > in the core? What problem does this solve? > Actually, there are a number of places where one can see > "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), > _Py_SETREF(op, n) thus decoupling completely the low level refcount > management defined in object.h: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) Why the cast? It loses some type-safety, e.g. _Py_GETREF(0) will now cause a core dump instead of a compile-time error. > Comments? I don't see how it's cleaner or saves typing: op->ob_refcnt _Py_GETREF(op) op->ob_refcnt = 1 _Py_SETREF(op, 1) --Guido van Rossum (home page: http://www.python.org/~guido/) From Vladimir.Marangozov at inrialpes.fr Thu Jun 24 18:33:31 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Thu, 24 Jun 1999 18:33:31 +0200 (DFT) Subject: [Python-Dev] Re: ob_refcnt access In-Reply-To: from "marangoz" at "Jun 24, 99 02:23:47 pm" Message-ID: <199906241633.SAA44314@pukapuka.inrialpes.fr> marangoz wrote: > > > How about introducing internal macros for explicit ob_refcnt accesses > in the core? Actually, there are a number of places where one can see > "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), > _Py_SETREF(op, n) thus decoupling completely the low level refcount > management defined in object.h: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) > > Comments? Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot to mention that if this detail doesn't hurt code aesthetics, one (I) could experiment more easily all sort of weird things with refcounting... I formulated the same wish for malloc & friends some time ago, that is, use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be defined for now as malloc, free, but nobody seems to be very excited about a smooth transition to other kinds of malloc. Hence, I reiterate this wish, 'cause switching to macros means preparing the code for the future, even if in the future it remains intact ;-). 
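(For experimenting, note that refcounts can already be watched from the Python side with sys.getrefcount(), e.g.:

    import sys

    x = []
    print sys.getrefcount(x)    # 2: the name 'x' plus the temporary argument reference
    y = x
    print sys.getrefcount(x)    # 3
    del y
    print sys.getrefcount(x)    # 2

which is handy for checking that whatever scheme sits behind such macros still balances.)
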
Defining these basic interfaces is clearly Guido's job :-) as he points out in his summary of the last Open Source summit, but nevertheless, I'm raising the issue to let him see what other people think about this and allow him to make decisions easier :-) -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From ping at lfw.org Thu Jun 24 19:29:19 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Thu, 24 Jun 1999 10:29:19 -0700 (PDT) Subject: [Python-Dev] ob_refcnt access In-Reply-To: <199906241530.LAA27887@eric.cnri.reston.va.us> Message-ID: On Thu, 24 Jun 1999, Guido van Rossum wrote: > > How about introducing internal macros for explicit ob_refcnt accesses > > in the core? > > What problem does this solve? I assume Vladimir was trying to leave the door open for further ob_refcnt manipulation hooks later, like having objects manage their own refcounts. Until there's an actual problem to solve that requires this, though, i'm not sure it's necessary. Are there obvious reasons to want to allow this? * * * While we're talking about refcounts and all, i've had the argument quite successfully made to me that a reasonably written garbage collector can be both (a) simple and (b) more efficient than refcounting. Having spent a good number of work days doing nothing but debugging crashes by tracing refcounting bugs, i was easily converted into a believer once a friend dispelled the notion that garbage collectors were either slow or horribly complicated. I had always been scared of them before, but less so now. Is an incremental GC being considered for a future Python? I've idly been pondering various tricks by which it could be made to work with existing extension modules -- here are some possibilities: 1. Keep the refcounts and let existing code do the usual thing; introduce a new variant of PyObject_NEW that puts an object into the "gc-able" pool rather than the "refcounted" pool. 2. Have Py_DECREF and Py_INCREF just do nothing, and let the garbage collector guess from the contents of the structure where the pointers are. (I'm told it's possible to do this safely, since you can only have false positives, never false negatives.) 3. Have Py_DECREF and Py_INCREF just do nothing, and ask the extension module to just provide (in its type object) a table of where the pointers are in its struct. And so on; mix and match. What are everyone's thoughts on this one? -- ?!ng "All models are wrong; some models are useful." -- George Box From tim_one at email.msn.com Fri Jun 25 08:38:11 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 25 Jun 1999 02:38:11 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Message-ID: <000c01bebed5$4b8d1040$d29e2299@tim> [Ka-Ping Yee, opines about GC] Ping, I think you're not getting any responses because this has been beaten to death on c.l.py over the last month (for the 53rd time, no less ). A hefty percentage of CPython users *like* the reliably timely destruction refcounting yields, and some clearly rely on it. Guido recently (10 June) posted the start of a "add GC on top of RC" scheme, in a thread with the unlikely name "fork()". The combination of cycles, destructors and resurrection is quite difficult to handle in a way both principled and useful (Java's way is principled but by most accounts unhelpful to the point of uselessness). 
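Concretely, the trash at issue is nothing exotic -- a plain reference cycle is enough to defeat pure refcounting, and a __del__ anywhere in the cycle is what makes the cleanup policy contentious. A minimal 1.5.2-flavoured sketch:

    class Node:
        def __init__(self, name):
            self.name = name
            self.peer = None
        def __del__(self):
            print 'finalizing', self.name

    a = Node('a')
    b = Node('b')
    a.peer = b
    b.peer = a          # a <-> b: each keeps the other's refcount above zero
    del a, b            # no names left, but the cycle lives on; __del__ never runs
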
Python experience with the Boehm collector can be found in the FAQ; note that the Boehm collector deals with finalizers in cycles by letting cycles with finalizers leak! > ... > While we're talking about refcounts and all, i've had the > argument quite successfully made to me that a reasonably > written garbage collector can be both (a) simple and (b) more > efficient than refcounting. That's a dubious claim. Sophisticated mark-and-sweep (with or without compaction) is almost universally acknowledged to beat RC, but simple M&S has terrible cache behavior (you fill up the address space before reclaiming anything, then leap all over the address space repeatedly cleaning it up). Don't discount that, in Python unlike as in most other languages, the simple loop for i in xrange(1000000): pass creates a huge amount of trash at a furious pace. Under RC it can happily reuse the same little bit of storage each time around. > Having spent a good number of work days doing nothing but debugging > crashes by tracing refcounting bugs, Yes, we can trade that for tracking down M&S bugs <0.5 wink> -- instead of INCREF/DECREF macros, you end up with M&S macros marking regions where the collector must not be run (because you're in a temporarily "inconsistent" state). That's under sophisticated M&S, though, but is an absolute nightmare when you miss a pair (the bugs only show up "sometimes", and not always the same ways -- depends on when M&S happens to run, and "how inconsistent" you happen to be at the time). > ... > And so on; mix and match. What are everyone's thoughts on this one? I think Python probably needs to clean up cycles, but by some variant of Guido's scheme on top of RC; I very much dislike the property of his scheme that objects with destructors may be get destroyed without their destructors getting invoked, but it seems hard to fix. Alternatives include Java's scheme (which really has nothing going for it other than that Java does it <0.3 wink>); Scheme's "guardian" scheme (which would let the user "get at" cyclic trash with destructors, but refuses to do anything with them on its own); following Boehm by saying that cycles with destructors are immortal; following goofier historical precedent by e.g. destroying such objects in reverse order of creation; or maybe just raising an exception if a trash cycle containing a destructor is found. All of those seem a comparative pain to implement, with Java's being the most painful -- and quite possibly the least satisfying! it's-a-whale-of-a-lot-easier-in-a-self-contained-universe-or-even-an- all-c-one-ly y'rs - tim From Vladimir.Marangozov at inrialpes.fr Fri Jun 25 13:27:43 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Fri, 25 Jun 1999 13:27:43 +0200 (DFT) Subject: [Python-Dev] Re: ob_refcnt access (fwd) Message-ID: <199906251127.NAA27464@pukapuka.inrialpes.fr> FYI, my second message on this issue didn't reach the list because of a stupid error of mine, so Guido and I exchanged two mails in private. His response to the msg below was that he thinks that tweaking the refcount scheme at this level wouldn't contribute much and that he doesn't intend to change anything on this until 2.0 which will be rewritten from scratch. Besides, if I want to satisfy my curiosity in hacking the refcounts I can do it with a small patch because I've already located the places where the ob_refcnt slot is accessed directly. 
----- Forwarded message ----- From Vladimir.Marangozov at inrialpes.fr Thu Jun 24 18:33:31 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir.Marangozov at inrialpes.fr) Date: Thu, 24 Jun 1999 18:33:31 +0200 (DFT) Subject: ob_refcnt access In-Reply-To: from "marangoz" at "Jun 24, 99 02:23:47 pm" Message-ID: marangoz wrote: > > > How about introducing internal macros for explicit ob_refcnt accesses > in the core? Actually, there are a number of places where one can see > "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), > _Py_SETREF(op, n) thus decoupling completely the low level refcount > management defined in object.h: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) > > Comments? Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot to mention that if this detail doesn't hurt code aesthetics, one (I) could experiment more easily all sort of weird things with refcounting... I formulated the same wish for malloc & friends some time ago, that is, use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be defined for now as malloc, free, but nobody seems to be very excited about a smooth transition to other kinds of malloc. Hence, I reiterate this wish, 'cause switching to macros means preparing the code for the future, even if in the future it remains intact ;-). Defining these basic interfaces is clearly Guido's job :-) as he points out in his summary of the last Open Source summit, but nevertheless, I'm raising the issue to let him see what other people think about this and allow him to make decisions easier :-) -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 ----- End of forwarded message ----- -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From tismer at appliedbiometrics.com Fri Jun 25 20:47:51 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 25 Jun 1999 20:47:51 +0200 Subject: [Python-Dev] Re: ob_refcnt access (fwd) References: <199906251127.NAA27464@pukapuka.inrialpes.fr> Message-ID: <3773CED7.B87D055C@appliedbiometrics.com> Vladimir Marangozov wrote: > > FYI, my second message on this issue didn't reach the list because > of a stupid error of mine, so Guido and I exchanged two mails > in private. His response to the msg below was that he thinks > that tweaking the refcount scheme at this level wouldn't contribute > much and that he doesn't intend to change anything on this until 2.0 > which will be rewritten from scratch. > > Besides, if I want to satisfy my curiosity in hacking the refcounts > I can do it with a small patch because I've already located the places > where the ob_refcnt slot is accessed directly. Well, one Euro on that issue: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) > > > > Comments? > > Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot > to mention that if this detail doesn't hurt code aesthetics, one (I) could > experiment more easily all sort of weird things with refcounting... I think if at all, this should be no typecast to stay safe. As long as every PyObject has a refcount, this would be correct and checked by the compiler. Why loose it? 
#define _Py_GETREF(op) ((op)->ob_refcnt) This carries the same semantics, the same compiler check, but adds a level of abstraction for future changes. > I formulated the same wish for malloc & friends some time ago, that is, > use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be > defined for now as malloc, free, but nobody seems to be very excited > about a smooth transition to other kinds of malloc. Hence, I reiterate > this wish, 'cause switching to macros means preparing the code for the > future, even if in the future it remains intact ;-). I wish to incref this wish by mine. In order to be able to try different memory allocation strategies, I would go even further and give every object type its own allocation macro which carries info about the object type about to be allocated. This costs nothing but a little macro expansion for the C compiler, but would allow to try new schemes, without always patching the Python source. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Fri Jun 25 20:56:39 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 25 Jun 1999 20:56:39 +0200 Subject: [Python-Dev] ob_refcnt access References: <000c01bebed5$4b8d1040$d29e2299@tim> Message-ID: <3773D0E7.458E00F1@appliedbiometrics.com> Tim Peters wrote: > > [Ka-Ping Yee, opines about GC] > > Ping, I think you're not getting any responses because this has been beaten > to death on c.l.py over the last month (for the 53rd time, no less ). > > A hefty percentage of CPython users *like* the reliably timely destruction > refcounting yields, and some clearly rely on it. [CG issue dropped, I know the thread] I know how much of a pain in the .. proper refcounting can be. Sometimes, after long debugging, I wished it would go. But finally, I think it is a *really good thing* to have to do proper refcounting. The reason is that this causes a lot of discipline, which improves the whole program. I guess with GC always there, quite a number of errors stay undetected. I can say this, since I have been through a week of debugging now, and I can now publish full blown first class continuations for Python yes I'm happy - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip at mojam.com Mon Jun 28 00:11:28 1999 From: skip at mojam.com (Skip Montanaro) Date: Sun, 27 Jun 1999 18:11:28 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? In-Reply-To: <199906181451.KAA11549@eric.cnri.reston.va.us> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> Message-ID: <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> Guido> Hmm... This would make it hard to make a patch release for 1.5.2 Guido> (possible called 1.5.3?). I *really* don't want the string Guido> methods to end up in a release yet -- there are too many rough Guido> edges (e.g. some missing methods, should join str() or not, Guido> etc.). 
Sorry for the delayed response. I've been out of town. When Barry returns would it be possible to merge the string methods in conditionally (#ifdef STRING_METHODS) and add a --with-string-methods configure option? How hard would it be to modify string.py, stringobject.c and stropmodule.c to carry that around? Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/~skip/ 518-372-5583 From tim_one at email.msn.com Mon Jun 28 04:27:06 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 27 Jun 1999 22:27:06 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: <3773D0E7.458E00F1@appliedbiometrics.com> Message-ID: <000501bec10d$b6f1fb40$e19e2299@tim> [Christian Tismer] > ... > I can say this, since I have been through a week of debugging > now, and I can now publish > > full blown first class continuations for Python > > yes I'm happy - chris You should be! So how come nobody else is ? Let's fire some imagination here: without the stinkin' C stack snaking its way thru everything, then with the exception of external system objects (like open files), the full state of a running Python program is comprised of objects Python understands and controls. So with some amount of additional pain we could pickle them. And unpickle them. Painlessly checkpoint a long computation for possible restarting? Freeze a program while it's running on your mainframe, download it to your laptop and resume it while you're on the road? Ship a bug report with the computation frozen right before the error occurs? Take an app with gobs of expensive initialization, freeze it after it's "finally ready to go", and ship the latter instead? Capture the state of an interactive session for later resumption? Etc. Not saying those are easy, but getting the C stack out of the way means they move from impossible to plausible. Maybe it would help get past the Schemeophobia if, instead of calling them "continuations", you called 'em "platform-independent potentially picklable threads". pippt-sounds-as-good-as-it-reads-ly y'rs - tim From tim_one at email.msn.com Mon Jun 28 05:13:15 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 27 Jun 1999 23:13:15 -0400 Subject: [Python-Dev] ActiveState & fork & Perl Message-ID: <000601bec114$2a2929c0$e19e2299@tim> Moving back in time ... [GordonM] > Perhaps Christian's stackless Python would enable green threads... [Guido] > This has been suggested before... While this seems possible at first, > all blocking I/O calls would have to be redone to pass control to the > thread scheduler, before this would be useful -- a huge task! I didn't understand this. If I/O calls are left alone, and a green thread hit one, the whole program just sits there waiting for the call to complete, right? But if the same thing happens using "real threads" today, the same thing happens today anyway . That is, if a thread doesn't release the global lock before a blocking call today, the whole program just sits there etc. Or do you have some other kind of problem in mind here? unconvincedly y'rs - tim From MHammond at skippinet.com.au Mon Jun 28 06:29:29 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Mon, 28 Jun 1999 14:29:29 +1000 Subject: [Python-Dev] ob_refcnt access In-Reply-To: <000501bec10d$b6f1fb40$e19e2299@tim> Message-ID: <003301bec11e$d0cfc6d0$0801a8c0@bobcat> > > yes I'm happy - chris > > You should be! So how come nobody else is ? 
I'm a little unhappy as this will break the Active Debugging stuff - ie, the ability for Python, Java, Perl, VBScript etc to all exist in the same process, each calling each other, and each being debuggable (makes a _great_ demo :-) I'm not _really_ unhappy, I'm just throwing this in as an FYI. The Active Debugging interfaces need some way of sorting a call stack. As many languages may be participating in a debugging session, there is no implicit ordering available. Inter-language calls are not made via the debugger, so it has no chance to intercept. So the solution MS came up with was, surprise surprise, the machine stack! :-) The assumption is that all languages will make _some_ use of the stack, so they ask a language to report its "stack base address" and "stack size". Using this information, the debugger sorts into the correct call sequence. Indeed, getting this information (even the half of it I did manage :-) was painful, and hard to get right. Ahh, the joys of bleeding-edge technologies :-) > Let's fire some imagination here: without the stinkin' C > stack snaking its I tried, and look what happened :-) Seriously, some of this stuff would be way cool. But I also understand completely the silence on this issue. When the thread started, there was much discussion about exactly what the hell these continuation/coroutine thingies even were. However, there were precious few real-world examples where they could be used. A few academic, theoretical places, but the only real contender I have seen brought up was Medusa. There were certainly no clear examples of "as soon as we have this, I could change abc to take advantage, and this would give us the very cool xyz" So, if anyone else is feeling at all like me about this issue, they are feeling all warm and fuzzy knowing that a few smart people are giving us the facility to do something we hope we never, ever have to do. :-) Mark.
From rushing at nightmare.com Mon Jun 28 11:53:21 1999 From: rushing at nightmare.com (Sam Rushing) Date: Mon, 28 Jun 1999 02:53:21 -0700 (PDT) Subject: [Python-Dev] ob_refcnt access In-Reply-To: <41219828@toto.iv> Message-ID: <14199.13497.439332.366329@seattle.nightmare.com> Mark Hammond writes: > I tried, and look what happened :-) Seriously, some of this stuff > would be way cool. > > But I also understand completely the silence on this issue. When > the thread started, there was much discussion about exactly what > the hell these continuation/coroutine thingies even were. However, > there were precious few real-world examples where they could be > used. A few academic, theoretical places, but the only real > contender I have seen brought up was Medusa. There were certainly > no clear examples of "as soon as we have this, I could change abc > to take advantage, and this would give us the very cool xyz" Part of the problem is that we didn't have the feature to play with. Many of the possibilities are showing up now that it's here... The basic advantage to coroutines is they allow you to turn any event-driven/state-machine problem into one that is managed with 'normal' control state; i.e., for loops, while loops, nested procedure calls, etc... Here are a few possible real-world uses: ================================================== Parsing. I remember a discussion from a few years back about the distinction between 'push' and 'pull' model parsers. Coroutines let you have it both ways; you can write a parser in the most natural way (pull), but use it as a 'push'; i.e. for a web browser.
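A minimal sketch of that pull-as-push idea, assuming Python generators as a stand-in for the coroutines being discussed (names made up for illustration):

    def line_parser():
        # Written in the natural "pull" style: suspend whenever more data
        # is needed, instead of returning to an event loop with saved state.
        buf = ""
        while True:
            chunk = yield              # wait for the driver to push a chunk
            buf += chunk
            while "\n" in buf:
                line, buf = buf.split("\n", 1)
                print("parsed:", line)

    parser = line_parser()
    next(parser)                       # advance to the first yield
    for chunk in ("GET / HTT", "P/1.0\nHost: x", "\n"):
        parser.send(chunk)             # "push" side: feed data as it arrives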
================================================== "http sessions". A single 'thread' of control that is re-entered whenever a hit from a particular user ('session') comes in to the web server: [Apologies to those that have already seen this cheezy example] def ecommerce (session): session.login() # sends a login form, waits for it to return basket = [] while 1: item = session.shop_for_item() if item: basket.append (item) else: break if basket: session.get_shipping_info() session.get_payment_info() session.transact() 'session.shop_for_item()' will resume the main coroutine, which will resume this coroutine only when a new hit comes in from that session/user, and 'return' this hit to the while loop. I have a little web server that uses this idea to play blackjack: http://www.nightmare.com:7777/ http://www.nightmare.com/stuff/blackjack_httpd.py [though I'm a little fuzzy on the rules]. Rather than building a state machine that keeps track of where the user has been, and what they're doing, you can keep all the state in local variables (like 'basket' above) - in other words, it's a much more natural style of programming. ================================================== One of the areas I'm most excited about is GUI coding. All GUI's are event driven. All GUI code is therefore written in a really twisted, state-machine fashion; interactions are very complex. OO helps a bit, but doesn't change the basic difficulty - past a certain point interesting things become too complex to try... Mr. Fuchs' paper ("Escaping the event loop: an alternative control structure for multi-threaded GUIs") does a much better job of describing this than I can: http://cs.nyu.edu/phd_students/fuchs/ http://cs.nyu.edu/phd_students/fuchs/gui.ps ================================================== Tim's example of 'dumping' a computation in the middle and storing it on disk (or sending it over a network), is not a fantasy... I have a 'stackless' Scheme system that does this right now. ================================================== Ok, final example. Isn't there an interface in Python to call a certain function after every so many vm insns? Using coroutines you could hook into this and provide non-preemptive 'threads' for those platforms that don't have them. [And the whole thing would be written in Python, not in C!] ================================================== > So, if anyone else if feeling at all like me about this issue, they > are feeling all warm and fuzzy knowing that a few smart people are > giving us the facility to do something we hope we never, ever have > to do. :-) "When the only tool you have is a hammer, everything looks like a nail". I saw the guys over in the Scheme shop cutting wood with a power saw; now I feel like a schmuck with my hand saw. You are right to be frightened by the strangeness of the underlying machinery; hopefully a simple and easy-to-understand interface can be built for the C level as well as Python. I think Christian's 'frame dispatcher' is fairly clear, and not *that* much of a departure from the current VM; it's amazing to me how little work really had to be done! -Sam From tismer at appliedbiometrics.com Mon Jun 28 14:07:33 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 28 Jun 1999 14:07:33 +0200 Subject: [Python-Dev] ob_refcnt access References: <003301bec11e$d0cfc6d0$0801a8c0@bobcat> Message-ID: <37776585.17B78DD1@appliedbiometrics.com> Mark Hammond wrote: > > > > yes I'm happy - chris > > > > You should be! So how come nobody else is ? 
(to Tim) I believe this comes simply since following me would force people to change their way of thinking. I am through this already, but it was hard for me. And after accepting to be stackless, there is no way to go back. Today I'm wondering about my past: "how could I think of stacks when thinking of programs?" This is so wrong. The truth is: Programs are just some data, part of it called code, part of it is local state, and! its future of computation. Out, over, roger. All the rest is artificial showstoppers. > I'm a little unhappy as this will break the Active Debugging stuff - ie, the > ability for Python, Java, Perl, VBScript etc to all exist in the same > process, each calling each other, and each being debuggable (makes a > _great_ demo :-) > > I'm not _really_ unhappy, I'm just throwing this in as an FYI. Well, yet I see no problem. > The Active Debugging interfaces need some way of sorting a call stack. As > many languages may be participating in a debugging session, there is no > implicit ordering available. Inter-language calls are not made via the > debugger, so it has no chance to intercept. > > So the solution MS came up with was, surprise surprise, the machine stack! > :-) The assumption is that all languages will make _some_ use of the > stack, so they ask a language to report its "stack base address" and "stack > size". Using this information, the debugger sorts into the correct call > sequence. Now, I can give it a machine stack. There is just a frame dispatcher sitting on the stack, and it grabs frames from the current thread state. > Indeed, getting this information (even the half of it I did manage :-) was > painful, and hard to get right. I would have to see the AX interface. But for sure there will be some method hooks with which I can tell AX how to walk the frame chain. And why don't I simply publish frames as COM objects? This would give you much more than everything else, I guess. BTW, as it is now, there is no need to use AX debugging for Python, since Python can do it alone now. Of course it makes sense to have it all in the AX environment. You will be able to modify a running program's local variables, its evaluation stack, change its code, change where it returns to, all is doable. ... > But I also understand completely the silence on this issue. When the > thread started, there was much discussion about exactly what the hell these > continuation/coroutine thingies even were. However, there were precious > few real-world examples where they could be used. A few academic, > theoretical places, but the only real contender I have seen brought up was > Medusa. There were certainly no clear examples of "as soon as we have > this, I could change abc to take advantage, and this would give us the very > cool xyz" The problem for me was that I also had no understanding of what I was doing, actually. I implemented continuations without an idea how they work. But Tim and Sam said they were the most powerful control structure possible, so I used all my time to find this out. Now I'm beginning to understand. And my continuation-based coroutine example turns out to be twenty lines of Python code. Coming soon, after I served my whining customers. > So, if anyone else is feeling at all like me about this issue, they are > feeling all warm and fuzzy knowing that a few smart people are giving us > the facility to do something we hope we never, ever have to do. :-) Think of it as just a flare gun in your hands.
By reading the fine print, you will realize that you actually hold an atom bomb, with a little code taming it for you. :-) back-to-the-future - ly y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip at mojam.com Mon Jun 28 15:13:31 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 28 Jun 1999 09:13:31 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000601bec114$2a2929c0$e19e2299@tim> References: <000601bec114$2a2929c0$e19e2299@tim> Message-ID: <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com> Still trying to make the brain shift from out-of-town to back-to-work... Tim> [GordonM] >> Perhaps Christian's stackless Python would enable green threads... What's a green thread? Skip From fredrik at pythonware.com Mon Jun 28 15:37:30 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 28 Jun 1999 15:37:30 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <000601bec114$2a2929c0$e19e2299@tim> <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com> Message-ID: <00ca01bec16b$5eef11e0$f29b12c2@secret.pythonware.com> > What's a green thread? a user-level thread (essentially what you can implement yourself by swapping stacks, etc). it's enough to write smoothly running threaded programs, but not enough to support true concurrency on multiple processors. also see: http://www.sun.com/solaris/java/wp-java/4.html From tismer at appliedbiometrics.com Mon Jun 28 18:11:43 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 28 Jun 1999 18:11:43 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <000601bec114$2a2929c0$e19e2299@tim> <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com> Message-ID: <37779EBF.A146D355@appliedbiometrics.com> Skip Montanaro wrote: > > Still trying to make the brain shift from out-of-town to back-to-work... > > Tim> [GordonM] > >> Perhaps Christian's stackless Python would enable green threads... > > What's a green thread? Nano-Threads. Threadless threads, solely Python driven, no system threads needed but possible. Think of the "big" system threads where each can run any number of tiny Python threads. Powered by snake oil - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From akuchlin at mems-exchange.org Mon Jun 28 19:55:16 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Mon, 28 Jun 1999 13:55:16 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 Message-ID: <14199.46852.932030.576094@amarok.cnri.reston.va.us> Paul Prescod sent the following note to the XML-SIG mailing list. Thoughts? --amk -------------- next part -------------- An embedded message was scrubbed... From: Paul Prescod Subject: [XML-SIG] [Fwd: Re: parsers for Palm?] 
Date: Mon, 28 Jun 1999 12:00:50 -0400 Size: 2535 URL: From guido at CNRI.Reston.VA.US Mon Jun 28 21:35:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 15:35:04 -0400 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: Your message of "Mon, 28 Jun 1999 13:55:16 EDT." <14199.46852.932030.576094@amarok.cnri.reston.va.us> References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> Message-ID: <199906281935.PAA01439@eric.cnri.reston.va.us> > Paul Prescod sent the following note to the XML-SIG mailing list. > Thoughts? I don't know any of the acronyms, and I'm busy writing a funding proposal plus two talks for the Monterey conference, so I don't have any thoughts to spare at the moment. Perhaps someone could present the case with some more background info? (It does sounds intriguing, but then again I'm not sure how many people *really* need to parse XML -- it doesn't strike me as something of the same generality as regular expressions yet.) --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at digicool.com Mon Jun 28 21:51:00 1999 From: jim at digicool.com (Jim Fulton) Date: Mon, 28 Jun 1999 15:51:00 -0400 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> Message-ID: <3777D224.6936B890@digicool.com> "Andrew M. Kuchling" wrote: > > Paul Prescod sent the following note to the XML-SIG mailing list. > Thoughts? > When I brought up some ideas for adding a separate validation mechanism for PyExpat, some folks suggested that I should look at some other C libraries, including one from the ILU folks and some other one that I can't remember the name of off hand. Should we (used loosely ;) look into the other libraries before including expat in the Python dist? Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From guido at CNRI.Reston.VA.US Mon Jun 28 22:07:50 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 16:07:50 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Your message of "Mon, 28 Jun 1999 02:53:21 PDT." <14199.13497.439332.366329@seattle.nightmare.com> References: <14199.13497.439332.366329@seattle.nightmare.com> Message-ID: <199906282007.QAA01570@eric.cnri.reston.va.us> > Part of the problem is that we didn't have the feature to play with. > Many of the possibilities are showing up now that it's here... > > The basic advantage to coroutines is they allow you to turn any > event-driven/state-machine problem into one that is managed with > 'normal' control state; i.e., for loops, while loops, nested procedure > calls, etc... > > Here are a few possible real-world uses: Thanks, Sam! Very useful collection of suggestions. (How come I'm not surprised to see these coming from you ;-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Mon Jun 28 22:08:42 1999 From: akuchlin at mems-exchange.org (Andrew M. 
Kuchling) Date: Mon, 28 Jun 1999 16:08:42 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <199906281935.PAA01439@eric.cnri.reston.va.us> References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> <199906281935.PAA01439@eric.cnri.reston.va.us> Message-ID: <14199.54858.464165.381344@amarok.cnri.reston.va.us> Guido van Rossum writes: >any thoughts to spare at the moment. Perhaps someone could present >the case with some more background info? (It does sounds intriguing, Paul is probably suggesting this so that Python comes with a fast, standardized XML parser out of the box. On the other hand, where do you draw the line? Paul suggests including PyExpat and easySAX (a small SAX implementation), but why not full SAX, and why not DOM? My personal leaning is that we can get more bang for the buck by working on the Distutils effort, so that installing a package like PyExpat becomes much easier, rather than piling more things into the core distribution. -- A.M. Kuchling http://starship.python.net/crew/amk/ The Law, in its majestic equality, forbids the rich, as well as the poor, to sleep under the bridges, to beg in the streets, and to steal bread. -- Anatole France From guido at CNRI.Reston.VA.US Mon Jun 28 22:17:41 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 16:17:41 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Sun, 27 Jun 1999 23:13:15 EDT." <000601bec114$2a2929c0$e19e2299@tim> References: <000601bec114$2a2929c0$e19e2299@tim> Message-ID: <199906282017.QAA01592@eric.cnri.reston.va.us> [Tim] > Moving back in time ... > > [GordonM] > > Perhaps Christian's stackless Python would enable green threads... > > [Guido] > > This has been suggested before... While this seems possible at first, > > all blocking I/O calls would have to be redone to pass control to the > > thread scheduler, before this would be useful -- a huge task! > > I didn't understand this. If I/O calls are left alone, and a green thread > hit one, the whole program just sits there waiting for the call to complete, > right? > > But if the same thing happens using "real threads" today, the same thing > happens today anyway . That is, if a thread doesn't release the > global lock before a blocking call today, the whole program just sits there > etc. > > Or do you have some other kind of problem in mind here? OK, I'll explain. Suppose there's a wrapper for a read() call whose essential code looks like this: Py_BEGIN_ALLOW_THREADS n = read(fd, buffer, size); Py_END_ALLOW_THREADS When the read() call is made, other threads can run. However in green threads (e.g. using Christian's stackless Python, where a thread switcher is easily added) the whole program would block at this point. The way to fix this is to have a way to tell the scheduler "come back to this thread when there's input ready on this fd". The scheduler has to combine such calls from all threads into a single giant select. It gets more complicated when you have blocking I/O wrapped in library functions, e.g. gethostbyname() or fread(). Then, you need to have a way to implement sleep() by talking to the thread schedule (remember, this is the thread scheduler we have to write ourselves). Oh, and of course the thread scheduler must also have a select() lookalike API so I can still implement the select module. Does this help? Or am I misunderstanding your complaint? Or is a missing? 
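A minimal sketch of that "giant select" scheduler, assuming generator-based green threads that yield the file descriptor they want to wait on (illustrative only; the names and API are made up, not a design proposal):

    import select

    def run(threads):
        # threads: generators that yield a file descriptor whenever they would block
        runnable = list(threads)
        waiting = {}                      # fd -> thread parked on that fd (one waiter per fd)
        while runnable or waiting:
            for t in runnable:
                try:
                    fd = next(t)          # run the thread until it "blocks" or finishes
                except StopIteration:
                    continue              # thread finished; drop it
                waiting[fd] = t
            runnable = []
            if waiting:
                ready, _, _ = select.select(list(waiting), [], [])
                runnable = [waiting.pop(fd) for fd in ready]

A thread in this sketch replaces a blocking read with "yield fd" and performs the real read only once it is resumed, knowing the descriptor is ready.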
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Mon Jun 28 22:23:57 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 16:23:57 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Your message of "Sun, 27 Jun 1999 22:27:06 EDT." <000501bec10d$b6f1fb40$e19e2299@tim> References: <000501bec10d$b6f1fb40$e19e2299@tim> Message-ID: <199906282023.QAA01605@eric.cnri.reston.va.us> > > yes I'm happy - chris > > You should be! So how come nobody else is ? Chris and I have been through this in private, but it seems that as long as I don't fess up in public I'm afraid it will come back and I'll get pressure coming at me to endorse Chris' code. I have no problem with the general concept (see my response to Sam's post of exciting examples). But I have a problem with a megapatch like this that affects many places including very sensitive areas like the main loop in ceval.c. The problem is simply that I know this is very intricate code, and I can't accept a patch of this scale to this code before I understand every little detail of the patch. I'm just too worried otherwise that there's a reference count bug in it that will very subtly break stuff and that will take forever to track down; I feel that when I finally have the time to actually understand the whole patch I'll be able to prevent that (famous last words). Please don't expect action or endorsement of Chris' patch from me any time soon, I'm too busy. However I'd love it if others used the patch in a real system and related their experiences regarding performance, stability etc. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Mon Jun 28 22:24:46 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 28 Jun 1999 16:24:46 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <14199.54858.464165.381344@amarok.cnri.reston.va.us> References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> <199906281935.PAA01439@eric.cnri.reston.va.us> <14199.54858.464165.381344@amarok.cnri.reston.va.us> Message-ID: <14199.55737.544299.718558@cm-24-29-94-19.nycap.rr.com> Andrew> My personal leaning is that we can get more bang for the buck by Andrew> working on the Distutils effort, so that installing a package Andrew> like PyExpat becomes much easier, rather than piling more things Andrew> into the core distribution. Amen to that. See Guido's note and my response regarding soundex in the Doc-SIG. Perhaps you could get away with a very small core distribution that only contained the stuff necessary to pull everything else from the net via http or ftp... Skip From bwarsaw at cnri.reston.va.us Mon Jun 28 23:20:05 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 28 Jun 1999 17:20:05 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> Message-ID: <14199.59141.447168.107784@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> Sorry for the delayed response. I've been out of town. When SM> Barry returns would it be possible to merge the string methods SM> in conditionally (#ifdef STRING_METHODS) and add a SM> --with-string-methods configure option? How hard would it be SM> to modify string.py, stringobject.c and stropmodule.c to carry SM> that around? How clean do you want this separation to be? 
Just disabling the actual string methods would be easy, and I'm sure I can craft a string.py that would work in either case (remember stropmodule.c wasn't even touched). There are a few other miscellaneous changes mostly having to do with some code cleaning, but those are probably small (and uncontroversial?) enough that they can either stay in, or be easily understood and accepted (optimistic aren't I? :) by Guido during the merge. I'll see what I can put together in the next 1/2 hour or so. -Barry From skip at mojam.com Mon Jun 28 23:37:03 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 28 Jun 1999 17:37:03 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? In-Reply-To: <14199.59141.447168.107784@anthem.cnri.reston.va.us> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> Message-ID: <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> >>>>> "BAW" == Barry A Warsaw writes: >>>>> "SM" == Skip Montanaro writes: SM> would it be possible to merge the string methods in conditionally SM> (#ifdef STRING_METHODS) ... BAW> How clean do you want this separation to be? Just disabling the BAW> actual string methods would be easy, and I'm sure I can craft a BAW> string.py that would work in either case (remember stropmodule.c BAW> wasn't even touched). Barry, I would be happy with having to manually #define STRING_METHODS in stringobject.c. Forget about the configure flag at first. I think the main point for experimenters like myself is that it is a hell of a lot easier to twiddle a #define than to try merging different CVS branches to get access to the functionality. Most of us have probably advanced far enough on the Emacs, vi or Notepad learning curves to handle that change, while most of us are probably not CVS wizards. Once it's in the main CVS branch, you can announce the change or not on the main list as you see fit (perhaps on python-dev sooner and on python-list later after some more experience has been gained with the patches). Skip From tismer at appliedbiometrics.com Mon Jun 28 23:41:28 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 28 Jun 1999 23:41:28 +0200 Subject: [Python-Dev] ob_refcnt access References: <000501bec10d$b6f1fb40$e19e2299@tim> <199906282023.QAA01605@eric.cnri.reston.va.us> Message-ID: <3777EC08.42C15478@appliedbiometrics.com> Guido van Rossum wrote: > > > > yes I'm happy - chris > > > > You should be! So how come nobody else is ? > > Chris and I have been through this in private, but it seems that as > long as I don't fess up in public I'm afraid it will come back and > I'll get pressure coming at me to endorse Chris' code. Please let me add a few comments. > I have no problem with the general concept (see my response to Sam's > post of exciting examples). This is the most worthful statement I can get. And see below. > But I have a problem with a megapatch like this that affects many > places including very sensitive areas like the main loop in ceval.c. Actually it is a rather small patch, but the implicit semantic change is rather hefty. > The problem is simply that I know this is very intricate code, and I > can't accept a patch of this scale to this code before I understand > every little detail of the patch. 
I'm just too worried otherwise that > there's a reference count bug in it that will very subtly break stuff > and that will take forever to track down; I feel that when I finally > have the time to actually understand the whole patch I'll be able to > prevent that (famous last words). I never expected to see this patch go into Python right now. The current public version is an alpha 0.2. Meanwhile I have 0.3, with again new patches, and a completely reworked policy of frame refcounting. Even worse, there is a nightmare of more work which I simply had no time for. All the instance and object code must be carefully changed, since they still need to call back in a recursive way. This is hard to change until I have a better mechanism to generate all the callbacks. For instance, I cannot switch tasks in an __init__ at this time. Although I can do so in regular methods. But this is all half-baked. In other words, the danger is by far not over, but still in the growing phase. I believe I should work on and maintain this until I'm convinced that there are not more refcount bugs than before, and until I have evicted every recursion that has a serious impact. This is still months of work. When I release the final version, I will pay $100 to the first person who finds a refcount bug which I introduced. But not before. I don't want to waste Guido's time, and for sure not now with this bloody fresh code. What I needed to know is whether I am on the right track or if I'm wasting my time. But since I have users already, it is no waste at all. What I really could use were some hints about API design. Guido, thank you for Python - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home
From bwarsaw at cnri.reston.va.us Tue Jun 29 00:04:05 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 28 Jun 1999 18:04:05 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> Message-ID: <14199.61781.695240.71428@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> I would be happy with having to manually #define SM> STRING_METHODS in stringobject.c. Forget about the configure SM> flag at first. Oh, I agree -- I wasn't going to add the configure flag anyway :) What I meant was how much of my changes should be ifdef-out-able? Just the methods on string objects? All my changes? -Barry
From skip at mojam.com Tue Jun 29 00:30:55 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 28 Jun 1999 18:30:55 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag?
In-Reply-To: <14199.61781.695240.71428@anthem.cnri.reston.va.us> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> <14199.61781.695240.71428@anthem.cnri.reston.va.us> Message-ID: <14199.63115.58129.480522@cm-24-29-94-19.nycap.rr.com> BAW> Oh, I agree -- I wasn't going to add the configure flag anyway :) BAW> What I meant was how much of my changes should be ifdef-out-able? BAW> Just the methods on string objects? All my changes? Well, when the CPP macro is undefined, the behavior from Python should be unchanged, yes? Am I missing something? There are string methods and what else involved in the changes? If string.py has to test to see if "".capitalize yields an AttributeError to decide what to do, I think that sort of change will be simple enough to accommodate. Any new code that gets well-exercised now before string methods become widely available is all to the good in my opinion. It's not fixing something that ain't broke, more like laying the groundwork for new directions. Skip From bwarsaw at cnri.reston.va.us Tue Jun 29 01:04:55 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 28 Jun 1999 19:04:55 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> <14199.61781.695240.71428@anthem.cnri.reston.va.us> <14199.63115.58129.480522@cm-24-29-94-19.nycap.rr.com> Message-ID: <14199.65431.161001.730247@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> Well, when the CPP macro is undefined, the behavior from SM> Python should be unchanged, yes? Am I missing something? SM> There are string methods and what else involved in the SM> changes? There are a few additions to the C API, but these probably don't need to be ifdef'd, since they don't change the existing semantics or interfaces. abstract.c has some code cleaning and reorganization, but the public API and semantics should be unchanged. Builtin long() and int() have grown an extra optional argument, which specifies the base to use. If this extra argument isn't given then they should work the same as in the main branch. Should we ifdef out the extra argument? SM> If string.py has to test to see if "".capitalize yields an SM> AttributeError to decide what to do, I think that sort of SM> change will be simple enough to accommodate. Basically what I've got is to move the main-branch string.py to stringold.py and if you get an attribute error on ''.upper I do a "from stringold import *". I've also got some hackarounds for test_string.py to make it work with or without string methods. SM> Any new code that gets well-exercised now before string SM> methods become widely available is all to the good in my SM> opinion. It's not fixing something that ain't broke, more SM> like laying the groundwork for new directions. Agreed. I'll check my changes in shortly. The ifdef will only disable the string methods. long() and int() will still accept the option argument. 
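A minimal sketch of the string.py fallback described above, assuming the stringold arrangement (illustrative only, not the actual merged file):

    # string.py -- illustrative sketch only
    try:
        ''.upper                   # raises AttributeError on a build without string methods
    except AttributeError:
        from stringold import *    # fall back to the old function implementations
    else:
        # thin wrappers over the new methods; the real module covers the full API
        def upper(s):
            return s.upper()
        def lower(s):
            return s.lower()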
Stay tuned, -Barry From tim_one at email.msn.com Tue Jun 29 06:16:34 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 29 Jun 1999 00:16:34 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199906282017.QAA01592@eric.cnri.reston.va.us> Message-ID: <000201bec1e6$2c496940$229e2299@tim> [Tim, claims not to understand Guido's > While this seems possible at first, all blocking I/O calls would > have to be redone to pass control to the thread scheduler, before > this would be useful -- a huge task! ] [Guido replies, sketching an elaborate scheme for making threads that are fake nevertheless act like real threads in the particular case of potentially blocking I/O calls] > ... > However in green threads (e.g. using Christian's stackless Python, > where a thread switcher is easily added) the whole program would block > at this point. The way to fix this is [very painful ]. > ... > Does this help? Or am I misunderstanding your complaint? Or is a > missing? No missing wink; I think it hinges on a confusion about the meaning of your original word "useful". Threads can be very useful purely as a means for algorithm structuring, due to independent control flows. Indeed, I use threads in Python most often these days without any hope or even *use* for potential parallelism (overlapped I/O or otherwise). It's the only non-brain-busting way to write code now that requires advanced control of the iterator, generator, coroutine, or even independent-agents-in-a-pipeline flavors. Fake threads would allow code like that to run portably, and also likely faster than with the overheads of OS-level threads. For pedagogical and debugging purposes too, fake threads could be very much friendlier than the real thing. Heck, we could even run them on a friendly old Macintosh . If all fake threads block when any hits an I/O call, waiting for the latter to return, we're no worse off than in a single-threaded program. Being "fake threads", it *is* a single-threaded program, so it's not even a surprise . Maybe in your Py_BEGIN_ALLOW_THREADS n = read(fd, buffer, size); Py_END_ALLOW_THREADS you're assuming that some other Python thread needs to run in order for the read implementation to find something to read? Then that's a dead program for sure, as it would be for a single-threaded run today too. I can live with that! I don't expect fake threads to act like real threads in all cases. My assumption was that the BEGIN/END macros would do nothing under fake threads -- since there isn't a real thread backing it up, a fake thread can't yield in the middle of random C code (Python has no way to capture/restore the C state). I didn't picture fake threads working except as a Python-level feature, with context switches limited to bytecode boundaries (which a stackless ceval can handle with ease; the macro context switch above is "in the middle of" some bytecode's interpretation, and while "green threads" may be interested in simulating the that, Tim's "fake threads" aren't). different-threads-for-different-heads-ly y'rs - tim From guido at CNRI.Reston.VA.US Tue Jun 29 14:01:30 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 29 Jun 1999 08:01:30 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Your message of "Tue, 29 Jun 1999 00:16:34 EDT." 
<000201bec1e6$2c496940$229e2299@tim> References: <000201bec1e6$2c496940$229e2299@tim> Message-ID: <199906291201.IAA02535@eric.cnri.reston.va.us> > [Tim, claims not to understand Guido's > > > While this seems possible at first, all blocking I/O calls would > > have to be redone to pass control to the thread scheduler, before > > this would be useful -- a huge task! > > ] > > [Guido replies, sketching an elaborate scheme for making threads that > are fake nevertheless act like real threads in the particular case of > potentially blocking I/O calls] [Tim responds, explaining that without this threads are quite useful.] I guess it's all in the perspective. 99.99% of all thread apps I've ever written use threads primarily to overlap I/O -- if there wasn't I/O to overlap I wouldn't use a thread. I think I share this perspective with most of the thread community (after all, threads originate in the OS world where they were invented as a replacement for I/O completion routines). (And no, I don't use threads to get the use of multiple CPUs, since I almost never have had more than one of those. And no, I wasn't expecting the read() to be fed from another thread.) As far as I can tell, all the examples you give are easily done using coroutines. Can we call whatever you're asking for coroutines instead of fake threads? I think that when you mention threads, green or otherwise colored, most people who are at all familiar with the concept will assume they provide I/O overlapping, except perhaps when they grew up in the parallel machine world. Certainly all examples I give in my never-completed thread tutorial (still available at http://www.python.org/doc/essays/threads.html) use I/O as the primary motivator -- this kind of example appeals to simples souls (e.g. downloading more than one file in parallel, which they probably have already seen in action in their web browser), as opposed to generators or pipelines or coroutines (for which you need to have some programming theory background to appreciate the powerful abstraction possibillities they give). Another good use of threads (suggested by Sam) is for GUI programming. An old GUI system, News by David Rosenthal at Sun, used threads programmed in PostScript -- very elegant (and it failed for other reasons -- if only he had used Python instead :-). On the other hand, having written lots of GUI code using Tkinter, the event-driven version doesn't feel so bad to me. Threads would be nice when doing things like rubberbanding, but I generally agree with Ousterhout's premise that event-based GUI programming is more reliable than thread-based. Every time your Netscape freezes you can bet there's a threading bug somewhere in the code. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Jun 30 02:03:37 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Tue, 29 Jun 1999 19:03:37 -0500 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199906291201.IAA02535@eric.cnri.reston.va.us> References: Your message of "Tue, 29 Jun 1999 00:16:34 EDT." <000201bec1e6$2c496940$229e2299@tim> Message-ID: <1281421591-30373695@hypernet.com> I've been out of town, too (not with Skip), but I'll jump back in here... [Guido] > When the read() call is made, other threads can run. However in > green threads (e.g. using Christian's stackless Python, where a > thread switcher is easily added) the whole program would block at > this point. 
The way to fix this is to have a way to tell the > scheduler "come back to this thread when there's input ready on > this fd". The scheduler has to combine such calls from all > threads into a single giant select. It gets more complicated when > you have blocking I/O I suppose, in the best of all possible worlds, this is true. But I'm fairly sure there are a number of well-used green thread implementations which go only part way - eg, if this is a "selectable" fd, do a select with a timeout of 0 on this one fd and choose to read/write or swap accordingly. That's a fair amount of bang for the buck, I think... [Tim] > Threads can be very useful purely as a means for algorithm > structuring, due to independent control flows. Spoken like a true schizo, Tim me boyos! Actually, you and Guido are saying almost the same thing - threads are useful when more than one thing is "driving" your processing. It's just that in the real world, that's almost always I/O, not some sick, tortured internal dialogue... I think the real question is: how useful would this be on a Mac? On Win31? (I'll answer that - useful, though I've finally got my last Win31 client to promise to upgrade, RSN ). - Gordon From MHammond at skippinet.com.au Wed Jun 30 01:47:26 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 30 Jun 1999 09:47:26 +1000 Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? Message-ID: <006f01bec289$bf1e3a90$0801a8c0@bobcat> This probably isnt the correct list, but I really dont want to start a philosophical discussion - hopefully people here are both "in the know" and able to resist a huge thread :-) Especially given the recent slashdot flamefest between RMS and ESR, I thought it worth getting correct. I just read a statement early in our book - "Python is an Open Source tool, ...". Is this "near enough"? Should I avoid this term in preference for something more generic (ie, even simply dropping the caps?) - but the OS(tm) idea seems doomed anyway... Just-hoping-to-avoid-flame-mail-from-rabid-devotees-of-either-religion :-) Mark. From da at ski.org Wed Jun 30 08:16:01 1999 From: da at ski.org (David Ascher) Date: Tue, 29 Jun 1999 23:16:01 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? In-Reply-To: <006f01bec289$bf1e3a90$0801a8c0@bobcat> Message-ID: On Wed, 30 Jun 1999, Mark Hammond wrote: > I just read a statement early in our book - "Python is an Open Source tool, > ...". > > Is this "near enough"? Should I avoid this term in preference for > something more generic (ie, even simply dropping the caps?) - but the > OS(tm) idea seems doomed anyway... It's not certified Open Source, but my understanding is that ESR believes the Python license would qualify if GvR applied for certification. BTW, you won't be able to avoid flames about something or other, and given that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and pseudo-RMSs, all Anonymous Cowards. =) --david From fredrik at pythonware.com Wed Jun 30 10:42:15 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 30 Jun 1999 10:42:15 +0200 Subject: [Python-Dev] Is Python Free Software, free software, Open Source,open source, etc? 
References: Message-ID: <012601bec2d4$74c315b0$f29b12c2@secret.pythonware.com> > BTW, you won't be able to avoid flames about something or other, and given > that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and > pseudo-RMSs, all Anonymous Cowards. =) just check the latest "learning python" review on Amazon... surely proves that perlers are weird people ;-) From guido at CNRI.Reston.VA.US Wed Jun 30 14:06:21 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 30 Jun 1999 08:06:21 -0400 Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? In-Reply-To: Your message of "Tue, 29 Jun 1999 23:16:01 PDT." References: Message-ID: <199906301206.IAA04619@eric.cnri.reston.va.us> > On Wed, 30 Jun 1999, Mark Hammond wrote: > > > I just read a statement early in our book - "Python is an Open Source tool, > > ...". > > > > Is this "near enough"? Should I avoid this term in preference for > > something more generic (ie, even simply dropping the caps?) - but the > > OS(tm) idea seems doomed anyway... > > It's not certified Open Source, but my understanding is that ESR believes > the Python license would qualify if GvR applied for certification. I did, months ago, and haven't heard back yet. My current policy is to drop the initial caps and say "open source" -- most people don't know the difference anyway. > BTW, you won't be able to avoid flames about something or other, and given > that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and > pseudo-RMSs, all Anonymous Cowards. =) I don't have the time to read slashdot -- can anyone summarize what ESR and RMS were flaming about? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Wed Jun 30 14:22:09 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 30 Jun 1999 14:22:09 +0200 Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? References: <199906301206.IAA04619@eric.cnri.reston.va.us> Message-ID: <000701bec2f3$2df78430$f29b12c2@secret.pythonware.com> > I did, months ago, and haven't heard back yet. My current policy is > to drop the initial caps and say "open source" -- most people don't > know the difference anyway. and "Open Source" cannot be trademarked anyway... > I don't have the time to read slashdot -- can anyone summarize what > ESR and RMS were flaming about? the usual; RMS wrote in saying that 1) he's not part of the open source movement, 2) open source folks don't under- stand the real meaning of the word freedom, and 3) he's not a communist. ESR response is here: http://www.tuxedo.org/~esr/writings/shut-up-and-show-them.html ... OSI's tactics work. That's the easy part of the lesson. The hard part is that the FSF's tactics don't work, and never did. ... So the next time RMS, or anybody else, urges you to "talk about freedom", I urge you to reply "Shut up and show them the code." imo, the best thing is of course to ignore them both, and continue to ship great stuff under a truly open license... From bwarsaw at cnri.reston.va.us Wed Jun 30 14:54:06 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Wed, 30 Jun 1999 08:54:06 -0400 (EDT) Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? 
References: <199906301206.IAA04619@eric.cnri.reston.va.us> <000701bec2f3$2df78430$f29b12c2@secret.pythonware.com> Message-ID: <14202.4974.162380.284749@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> imo, the best thing is of course to ignore them both, and FL> continue to ship great stuff under a truly open license... Agreed, of course. I think given the current state of affairs (i.e. the non-trademarkability of "Open Source", but also the mind share that little-oh, little-ess has gotten), we should say that Python (and JPython) are "open source" projects and let people make up their own minds about what that means. waiting-for-guido's-inevitable-faq-entry-ly y'rs, -Barry From tismer at appliedbiometrics.com Tue Jun 29 20:17:51 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 29 Jun 1999 20:17:51 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000201bec1e6$2c496940$229e2299@tim> <199906291201.IAA02535@eric.cnri.reston.va.us> Message-ID: <37790DCF.7C0E8FA@appliedbiometrics.com> Guido van Rossum wrote: > [Guido and Tim, different opinions named misunderstanding :] > > I guess it's all in the perspective. 99.99% of all thread apps I've > ever written use threads primarily to overlap I/O -- if there wasn't > I/O to overlap I wouldn't use a thread. I think I share this > perspective with most of the thread community (after all, threads > originate in the OS world where they were invented as a replacement > for I/O completion routines). > > (And no, I don't use threads to get the use of multiple CPUs, since I > almost never have had more than one of those. And no, I wasn't > expecting the read() to be fed from another thread.) > > As far as I can tell, all the examples you give are easily done using > coroutines. Can we call whatever you're asking for coroutines instead > of fake threads? I don't think this would match it. These threads can be implemented by coroutines which always run apart, and have some scheduling running. When there is polled I/O available, they can of course give a threaded feeling. If an application polls the kbhit function instead of reading, the other "threads" can run nicely. Can be quite useful for very small computers like CE. Many years before, I had my own threads under Turbo Pascal (I had no idea that these are called so). Ok, this was DOS, but it was enough of threading to have a "process" which smoothly updated a graphics screen, while another (single! :) "process" wrote data to the disk, a third one handled keyboard input, and a fourth drove a multichannel A/D sampling device. ? Oops, I just realized that these were *true* threads. The disk process would not run smooth, I agree. All the rest would be fine with green threads. ... > On the other hand, having written lots of GUI code using Tkinter, the > event-driven version doesn't feel so bad to me. Threads would be nice > when doing things like rubberbanding, but I generally agree with > Ousterhout's premise that event-based GUI programming is more reliable > than thread-based. Every time your Netscape freezes you can bet > there's a threading bug somewhere in the code. Right. But with a traceback instead of a machine hang, this could be more attractive to do. Green threads/coroutines are incredibly fast (one c call per switch). And since they have local state, you can save most of the attribute lookups which are needed with event based programming. (But this is all theory until we tried it). 
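A minimal sketch of that polling idea, and of the zero-timeout select() trick Gordon describes earlier in the thread, in present-day Python. The socket handling and task names are invented for illustration; this is not code from the patches under discussion:

    import select

    def readable_now(sock):
        # Zero-timeout select: ask whether this one fd is readable right
        # now, without ever blocking the scheduler.
        ready, _, _ = select.select([sock], [], [], 0)
        return bool(ready)

    def reader_task(sock):
        # A cooperative "thread": read when data is there, otherwise yield
        # control back to whatever is scheduling the tasks.
        while True:
            if readable_now(sock):
                data = sock.recv(4096)
                if not data:
                    return
                print("got %d bytes" % len(data))
            else:
                yield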
ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home
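Christian's "coroutines which always run apart, with some scheduling running" can be sketched with present-day Python generators. This is only an illustration of the idea, not the Stackless implementation; the task bodies are invented, and any genuinely blocking call inside a task would stall all of them, which is exactly the blocking-I/O caveat raised elsewhere in this thread:

    from collections import deque

    def run(tasks):
        # Round-robin scheduler: each task is a generator that yields
        # whenever it is willing to give up control.
        ready = deque(tasks)
        while ready:
            task = ready.popleft()
            try:
                next(task)           # run the task up to its next yield
            except StopIteration:
                continue             # finished; drop it
            ready.append(task)       # otherwise, back in line

    def counter(name, limit):
        for i in range(limit):
            print(name, i)
            yield                    # cooperative switch point

    # Two "green threads" sharing one real thread of control.
    run([counter("a", 3), counter("b", 3)])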
From da at ski.org Mon Jun 7 18:51:45 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 09:51:45 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071649.MAA12619@eric.cnri.reston.va.us> Message-ID: On Mon, 7 Jun 1999, Guido van Rossum wrote: > When I saw this, my own response was simply "those poor Perl suckers > are relying too much of fork()." Am I wrong, and is this also a habit > of Python programmers? Well, I find the fork() model to be a very simple one to use, much easier to manage than threads or full-fledged IPC. So, while I don't rely on it in any crucial way, it's quite convenient at times. --david From guido at CNRI.Reston.VA.US Mon Jun 7 18:56:22 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 12:56:22 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 09:51:45 PDT." References: Message-ID: <199906071656.MAA12642@eric.cnri.reston.va.us> > Well, I find the fork() model to be a very simple one to use, much easier > to manage than threads or full-fledged IPC. So, while I don't rely on it > in any crucial way, it's quite convenient at times. Can you give a typical example where you use it, or is this just a gut feeling? It's also dangerous -- e.g. unexpected errors may percolate down the wrong stack (many mailman bugs had to do with forking), GUI apps generally won't be cloned, and some extension libraries don't like to be cloned either (e.g. ILU). --Guido van Rossum (home page: http://www.python.org/~guido/) From da at ski.org Mon Jun 7 19:02:31 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 10:02:31 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071656.MAA12642@eric.cnri.reston.va.us> Message-ID: On Mon, 7 Jun 1999, Guido van Rossum wrote: > Can you give a typical example where you use it, or is this just a gut > feeling? Well, the latest example was that I wanted to spawn a Python process to do viewing of NumPy arrays with Tk from within the Python interactive shell (without using a shell wrapper). It's trivial with a fork(), and non-trivial with threads. The solution I had to finalize on was to branch based on OS and do threads where threads are available and fork() otherwise. Likely 2.05 times as many errors as with a single solution =). > It's also dangerous -- e.g. unexpected errors may percolate down the > wrong stack (many mailman bugs had to do with forking), GUI apps > generally won't be cloned, and some extension libraries don't like to > be cloned either (e.g. ILU). More dangerous than threads? Bwaaahaahaa! =). fork() might be "deceivingly simple in appearance", I grant you that. But sometimes that's good enough. It's also possible that fork() without all of its process-handling relatives isn't useful enough to warrant the effort.
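A rough sketch of the branch David describes (use a thread where the build has one, fall back to fork() elsewhere), with invented names; Guido's caveats about forking GUI code apply just as much to this illustration:

    import os

    def launch_viewer(view_func, data):
        # Run the viewer without blocking the interactive shell.
        try:
            import threading
        except ImportError:
            threading = None

        if threading is not None:
            t = threading.Thread(target=view_func, args=(data,))
            t.start()
            return t
        pid = os.fork()               # Unix-only fallback
        if pid == 0:                  # child: show the data, then exit
            view_func(data)
            os._exit(0)
        return pid                    # parent: carries on immediately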
--david From bwarsaw at cnri.reston.va.us Mon Jun 7 19:05:20 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 7 Jun 1999 13:05:20 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl References: <199906071656.MAA12642@eric.cnri.reston.va.us> Message-ID: <14171.64464.805578.325069@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> It's also dangerous -- e.g. unexpected errors may percolate Guido> down the wrong stack (many mailman bugs had to do with Guido> forking), GUI apps generally won't be cloned, and some Guido> extension libraries don't like to be cloned either Guido> (e.g. ILU). Rambling mode on... Okay, so you can't guarantee that fork will be everywhere you might want to run an application. For example, that's one of the main reasons Mailman hasn't been ported off of Un*x. But you also can't guarantee that threads will be everywhere either. One of the things I'd (eventually) like to do is to re-architect Mailman so that it uses a threaded central server instead of the current one-shot process model. But there's been debate among the developers because 1) threads aren't supported everywhere, and 2) thread support isn't built-in by default anyway. I wonder if it's feasible or useful to promote threading support in Python? Thoughts would include building threads in by default if possible on the platform, integrating Greg's free threading mods, etc. Providing more integrated support for threads might encourage programmers to reach for that particular tool instead of fork, which is crude, but pretty damn handy and easy to use. Rambling mode off... -Barry From jim at digicool.com Mon Jun 7 19:07:59 1999 From: jim at digicool.com (Jim Fulton) Date: Mon, 07 Jun 1999 13:07:59 -0400 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <375BFC6F.BF779796@digicool.com> David Ascher wrote: > > On Mon, 7 Jun 1999, Guido van Rossum wrote: > > > When I saw this, my own response was simply "those poor Perl suckers > > are relying too much of fork()." Am I wrong, and is this also a habit > > of Python programmers? > > Well, I find the fork() model to be a very simple one to use, much easier > to manage than threads or full-fledged IPC. So, while I don't rely on it > in any crucial way, it's quite convenient at times. Interesting. I prefer threads because they eliminate the *need* for an IPC. I find locks and the various interesting things you can build from them to be much easier to deal with and more elegant than IPC. I wonder if the perl folks are also going to emulate doing IPC in the same process. Hee hee. :) Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From da at ski.org Mon Jun 7 19:10:56 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 10:10:56 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14171.64464.805578.325069@anthem.cnri.reston.va.us> Message-ID: On Mon, 7 Jun 1999, Barry A. Warsaw wrote: > I wonder if it's feasible or useful to promote threading support in > Python? 
Thoughts would include building threads in by default if > possible on the platform, That seems a good idea to me. It's a relatively safe thing to enable by default, no? > Providing more integrated support for threads might encourage > programmers to reach for that particular tool instead of fork, which > is crude, but pretty damn handy and easy to use. While we're at it, it'd be nice if we could provide a better answer when someone asks (as "they" often do) "how do I program with threads in Python" than our usual "the way you'd do it in C". Threading tutorials are very hard to come by, I've found (I got the ORA multi-threaded programming in win32, but it's such a monster I've barely looked at it). I suggest that we allocate about 10% of TimBot's time to that task. If necessary, we can upgrade it to a dual-CPU setup. With Greg's threading patches, we could even get it to run on both CPUs efficiently. It could write about itself. --david From akuchlin at mems-exchange.org Mon Jun 7 19:20:15 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Mon, 7 Jun 1999 13:20:15 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> Message-ID: <14171.65359.306743.276505@amarok.cnri.reston.va.us> David Ascher writes: >While we're at it, it'd be nice if we could provide a better answer when >someone asks (as "they" often do) "how do I program with threads in >Python" than our usual "the way you'd do it in C". Threading tutorials >are very hard to come by, I've found (I got the ORA multi-threaded Agreed; I'd love to see a HOWTO on thread programming. I really liked Andrew Birrell's introduction to threads for Modula-3; see http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html (Postscript and PDF versions available.) Translating its approach to Python would be an excellent starting point. -- A.M. Kuchling http://starship.python.net/crew/amk/ "If you had stayed with us, we could have given you life until death." "Don't I get that anyway?" -- Stheno and Lyta Hall, in SANDMAN #61: "The Kindly Ones:5" From guido at CNRI.Reston.VA.US Mon Jun 7 19:24:45 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 13:24:45 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 13:20:15 EDT." <14171.65359.306743.276505@amarok.cnri.reston.va.us> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14171.65359.306743.276505@amarok.cnri.reston.va.us> Message-ID: <199906071724.NAA12743@eric.cnri.reston.va.us> > David Ascher writes: > >While we're at it, it'd be nice if we could provide a better answer when > >someone asks (as "they" often do) "how do I program with threads in > >Python" than our usual "the way you'd do it in C". Threading tutorials > >are very hard to come by, I've found (I got the ORA multi-threaded Andrew Kuchling chimes in: > Agreed; I'd love to see a HOWTO on thread programming. I really > liked Andrew Birrell's introduction to threads for Modula-3; see > http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html > (Postscript and PDF versions available.) Translating its approach to > Python would be an excellent starting point. Another idea is for someone to finish the thread tutorial that I started early 1998 (and never finished because I realized that it needed the threading module and some thread-safety patches to urllib for the examples I had in mind to work). 
It's actually on the website (but unlinked-to): http://www.python.org/doc/essays/threads.html --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at cnri.reston.va.us Mon Jun 7 19:28:57 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Mon, 7 Jun 1999 13:28:57 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071724.NAA12743@eric.cnri.reston.va.us> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14171.65359.306743.276505@amarok.cnri.reston.va.us> <199906071724.NAA12743@eric.cnri.reston.va.us> Message-ID: <14172.289.552901.264826@bitdiddle.cnri.reston.va.us> Indeed, it might be better to start with the threading module for the first tutorial. While I'm also a fan of Birrell's paper, it would encourage people to start with the low-level thread module, instead of the higher-level threading module. So the right answer, of course, is to do both! Jeremy From bwarsaw at cnri.reston.va.us Mon Jun 7 19:36:05 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 7 Jun 1999 13:36:05 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> Message-ID: <14172.773.807413.412693@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: >> I wonder if it's feasible or useful to promote threading >> support in Python? Thoughts would include building threads in >> by default if possible on the platform, DA> That seems a good idea to me. It's a relatively safe thing to DA> enable by default, no? Don't know how hard it would be to write the appropriate configure tests, but then again, if it was easy I'd'a figured Guido would have done it already. A simple thing would be to change the default sense of "Do we build in thread support?". Make this true by default, and add a --without-threads configure flag people can use to turn them off. -Barry From skip at mojam.com Tue Jun 8 00:37:38 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 7 Jun 1999 18:37:38 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.773.807413.412693@anthem.cnri.reston.va.us> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14172.773.807413.412693@anthem.cnri.reston.va.us> Message-ID: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> BAW> A simple thing would be to change the default sense of "Do we build BAW> in thread support?". Make this true by default, and add a BAW> --without-threads configure flag people can use to turn them off. True enough, but as Guido pointed out, enabling threads by default would immediately make the Mac a second-class citizen. Test cases and demos would eventually find their way into the distribution that Mac users could not run, etc., etc. It may not account for a huge fraction of the Python development seats, but it seems a shame to leave it out in the cold. Has there been an assessment of how hard it would be to add thread support to the Mac? On a scale of 1 to 10 (1: we know how, but it's not implemented because nobody's needed it so far, 10: drilling for oil on the sun would be easier), how hard would it be? I assume Jack Jansen is on this list. Jack, any thoughts? Alpha code? Pre-alpha code? 
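For reference, the higher-level style Jeremy is arguing for looks roughly like this in present-day Python; a toy example, not taken from any of the tutorials mentioned:

    import threading

    counter = 0
    counter_lock = threading.Lock()

    def worker(n):
        global counter
        for _ in range(n):
            # threading.Lock replaces hand-rolled use of the low-level
            # thread.allocate_lock() primitive.
            with counter_lock:
                counter += 1

    threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter)    # 4000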
Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From da at ski.org Tue Jun 8 00:43:32 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 15:43:32 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: On Mon, 7 Jun 1999, Skip Montanaro wrote: > True enough, but as Guido pointed out, enabling threads by default would > immediately make the Mac a second-class citizen. Test cases and demos would > eventually find their way into the distribution that Mac users could not > run, etc., etc. It may not account for a huge fraction of the Python > development seats, but it seems a shame to leave it out in the cold. I'm not sure I buy that argument. There are already thread demos in the current directory, and no one complains. The windows builds are already threaded by default, and it's not caused any problems that I know of. Think of it like enabling the *new* module. =) > Has there been an assessment of how hard it would be to add thread > support to the Mac? That's an interesting question, especially since ActiveState lists it as a machine w/ threads and w/o fork(). --david From skip at mojam.com Tue Jun 8 00:49:12 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 7 Jun 1999 18:49:12 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com> David> I'm not sure I buy that argument. Think of it like enabling the David> *new* module. =) That's not quite the same thing. The new module simply exposes some normally closed-from-Python-code data structures to the Python programmer. Enabling threads requires some support from the underlying runtime system. If that was already in place, I suspect the Mac binaries would come with the thread module enabled by default, yes? Skip From da at ski.org Tue Jun 8 00:58:22 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 15:58:22 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com> Message-ID: On Mon, 7 Jun 1999, Skip Montanaro wrote: > That's not quite the same thing. The new module simply exposes some > normally closed-from-Python-code data structures to the Python programmer. > Enabling threads requires some support from the underlying runtime system. > If that was already in place, I suspect the Mac binaries would come with the > thread module enabled by default, yes? I'm not denying that. It's just that there are lots of things which fall into that category, like (to take a pointed example =), os.fork(). We don't have a --with-fork configure flag. We expose to the Python programmer all of the underlying OS that is 'wrapped' as long as it's reasonably portable. I think that most unices + win32 is a reasonable approximation of 'reasonably portable'. And in fact, this change might motivate someone with Mac fervor to explore adding Python support of Mac threads. 
--david From gmcm at hypernet.com Tue Jun 8 02:01:56 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 7 Jun 1999 19:01:56 -0500 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: <1283322126-63868517@hypernet.com> David Ascher wrote: > On Mon, 7 Jun 1999, Skip Montanaro wrote: > > > True enough, but as Guido pointed out, enabling threads by default would > > immediately make the Mac a second-class citizen. > I'm not sure I buy that argument. There are already thread demos in > the current directory, and no one complains. The windows builds are > already threaded by default, and it's not caused any problems that I > know of. Think of it like enabling the *new* module. =) > > > Has there been an assessment of how hard it would be to add thread > > support to the Mac? > > That's an interesting question, especially since ActiveState lists > it as a machine w/ threads and w/o fork(). Not a Mac programmer, but I recall that when Steve Jobs came back, they published a schedule that said threads would be available a couple releases down the road. Schedules only move one way, so I'd guess ActiveState is premature. Perhaps Christian's stackless Python would enable green threads... (And there are a number of things in the standard distribution which don't work on Windows, either; fork and select()ing on file fds). - Gordon From skip at mojam.com Tue Jun 8 01:06:34 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 7 Jun 1999 19:06:34 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: References: <14172.19387.516361.546698@cm-24-29-94-19.nycap.rr.com> Message-ID: <14172.20567.40217.703269@cm-24-29-94-19.nycap.rr.com> David> I think that most unices + win32 is a reasonable approximation of David> 'reasonably portable'. And in fact, this change might motivate David> someone with Mac fervor to explore adding Python support of Mac David> threads. One can hope... ;-) Skip From MHammond at skippinet.com.au Tue Jun 8 01:06:37 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 09:06:37 +1000 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906071649.MAA12619@eric.cnri.reston.va.us> Message-ID: <000501beb13a$9eec2c10$0801a8c0@bobcat> > > In case you haven't heard about it, ActiveState has > recently signed a > > contract with Microsoft to do some work on Perl on win32. > > Have I ever heard of it! :-) David Grove pulled me into one of his > bouts of paranoia. I think he's calmed down for the moment. It sounds like a :-), but I'm afraid I don't understand that reference. When I first heard this, two things sprung to mind: a) Why shouldn't Python push for a similar deal? b) Something more interesting in the MS/Python space is happening anyway, so nyah nya nya ;-) Getting some modest funds to (say) put together and maintain single core+win32 installers to place on the NT resource kit could only help Python. Sometimes I wish we had a few less good programmers, and a few more good marketing type people ;-) > Anyway, I doubt that we coould use their code, as it undoubtedly > refers to reimplementing fork() at the Perl level, not at the C level > (which would be much harder). Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate fork using the Win32 extensions?
Python has basically all of the native Win32 process API exposed, and writing a "fork" in Python that only forked Python scripts (for example) may be feasable and not too difficult. It would have obvious limitations, including the fact that it is not available standard with Python on Windows (just like a working popen now :-) but if we could follow the old 80-20 rule, and catch 80% of the uses with 20% of the effort it may be worth investigating. My knowledge of fork is limited to muttering "something about cloning the current process", so I may be naive in the extreme - but is this feasible? Mark. From fredrik at pythonware.com Tue Jun 8 01:21:15 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:21:15 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <000501beb13a$9eec2c10$0801a8c0@bobcat> Message-ID: <001601beb13c$70ff5b90$f29b12c2@pythonware.com> Mark wrote: > Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate > fork using the Win32 extensions? Python has basically all of the native > Win32 process API exposed, and writing a "fork" in Python that only forked > Python scripts (for example) may be feasable and not too difficult. > > It would have obvious limitations, including the fact that it is not > available standard with Python on Windows (just like a working popen now > :-) but if we could follow the old 80-20 rule, and catch 80% of the uses > with 20% of the effort it may be worth investigating. > > My knowledge of fork is limited to muttering "something about cloning the > current process", so I may be naive in the extreme - but is this feasible? as an aside, GvR added Windows' "spawn" API in 1.5.2, so you can at least emulate some common variants of fork+exec. this means that if someone writes a spawn for Unix, we would at least catch >0% of the uses with ~0% of the effort ;-) fwiw, I'm more interested in the "unicode all the way down" parts of the activestate windows project. more on that later. From gstein at lyra.org Tue Jun 8 01:10:38 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 16:10:38 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <375C516E.76EC8ED4@lyra.org> David Ascher wrote: >... > I'm not denying that. It's just that there are lots of things which fall > into that category, like (to take a pointed example =), os.fork(). We > don't have a --with-fork configure flag. We expose to the Python > programmer all of the underlying OS that is 'wrapped' as long as it's > reasonably portable. I think that most unices + win32 is a reasonable > approximation of 'reasonably portable'. And in fact, this change might > motivate someone with Mac fervor to explore adding Python support of Mac > threads. Agreed. Python isn't a least-common-demoninator language. It tries to make things easy for people. Why should we kill all platforms because of a lack on one? Having threads by default will make a lot of things much simpler (in terms of knowing the default platform). Can't tell you how many times I curse to find that the default RedHat distribution (as of 5.x) did not use threads, even though they are well-supported on Linux. And about stuff creeping into the distribution: gee... does that mean that SocketServer doesn't work on the Mac? Threads *and* fork are not available on Python/Mac, so all you would get is a single-threaded server. icky. I can't see how adding threads to other platforms will *hurt* the Macintosh platform... it can only help others. 
About the only reason that I can see to *not* make them the default is the slight speed loss. But that seems a bit bogus, as the interpreter loop doesn't spend *that* much time mucking with the interp_lock to allow thread switches. There have also been some real good suggestions for making it take near-zero time until you actually create that second thread. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Tue Jun 8 01:26:08 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:26:08 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <1283322126-63868517@hypernet.com> Message-ID: <002a01beb13d$1fa23c80$f29b12c2@pythonware.com> > Not a Mac programmer, but I recall that when Steve Jobs came back, > they published a schedule that said threads would be available a > couple releases down the road. Schedules only move one way, so I'd > guess ActiveState is premature. http://www.computerworld.com/home/print.nsf/all/990531AAFA > Perhaps Christian's stackless Python would enable green threads... > > (And there are a number of things in the standard distribution which > don't work on Windows, either; fork and select()ing on file fds). time to implement channels? (Tcl's unified abstraction for all kinds of streams that you could theoretically use something like select on. sockets, pipes, asynchronous disk I/O, etc). does select really work on ordinary files under Unix, btw? From fredrik at pythonware.com Tue Jun 8 01:30:57 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:30:57 +0200 Subject: [Python-Dev] ActiveState & fork & Perl Message-ID: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> I wrote: > > Not a Mac programmer, but I recall that when Steve Jobs came back, > > they published a schedule that said threads would be available a > > couple releases down the road. Schedules only move one way, so I'd > > guess ActiveState is premature. > > http://www.computerworld.com/home/print.nsf/all/990531AAFA which was just my way of saying that "did he perhaps refer to OS X ?". or are they adding real threads to good old MacOS too? From fredrik at pythonware.com Tue Jun 8 01:38:02 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 01:38:02 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <375C516E.76EC8ED4@lyra.org> Message-ID: <003f01beb13e$c95a2750$f29b12c2@pythonware.com> > Having threads by default will make a lot of things much simpler > (in terms of knowing the default platform). Can't tell you how > many times I curse to find that the default RedHat distribution > (as of 5.x) did not use threads, even though they are well- > supported on Linux. I have a vague memory that once upon a time, the standard X libraries shipped with RedHat weren't thread safe, and Tkinter didn't work if you compiled Python with threads. but I might be wrong and/or that may have changed... From MHammond at skippinet.com.au Tue Jun 8 01:42:38 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 09:42:38 +1000 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> Message-ID: <000801beb13f$6e118310$0801a8c0@bobcat> > > http://www.computerworld.com/home/print.nsf/all/990531AAFA > > which was just my way of saying that "did he perhaps > refer to OS X ?". > > or are they adding real threads to good old MacOS too? 
Oh, /F, please don't start adding annotations to your collection of incredibly obscure URLs - takes away half the fun ;-) Mark. From gstein at lyra.org Tue Jun 8 02:01:41 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 17:01:41 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: <375C516E.76EC8ED4@lyra.org> <003f01beb13e$c95a2750$f29b12c2@pythonware.com> Message-ID: <375C5D65.6E6CD6F@lyra.org> Fredrik Lundh wrote: > > > Having threads by default will make a lot of things much simpler > > (in terms of knowing the default platform). Can't tell you how > > many times I curse to find that the default RedHat distribution > > (as of 5.x) did not use threads, even though they are well- > > supported on Linux. > > I have a vague memory that once upon a time, the standard > X libraries shipped with RedHat weren't thread safe, and > Tkinter didn't work if you compiled Python with threads. > > but I might be wrong and/or that may have changed... Yes, it has changed. RedHat now ships with a thread-safe X so that they can use GTK and Gnome (which use threads quite a bit). There may be other limitations, however, as I haven't tried to do any threaded GUI programming, especially on a recent RedHat (I'm using a patched/hacked RH 4.1 system). RedHat 6.0 may even ship with a threaded Python, but I dunno... -g -- Greg Stein, http://www.lyra.org/ From da at ski.org Tue Jun 8 02:43:27 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 17:43:27 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000501beb13a$9eec2c10$0801a8c0@bobcat> Message-ID: On Tue, 8 Jun 1999, Mark Hammond wrote: > When I first heard this, two things sprung to mind: > a) Why shouldn't Python push for a similar deal? > b) Something more interesting in the MS/Python space is happening anyway, > so nyah nya nya ;-) > > Getting some modest funds to (say) put together and maintain single > core+win32 installers to place on the NT resource kit could only help > Python. How much money are we talking about (no, I'm not offering =)? I wonder if one problem we have is that the folks with $$'s don't want to advertise that they have $$'s because they don't want to be swamped with vultures (and because "that isn't done"), and the people with skills but no $$'s don't want to advertise that fact for a variety of reasons (modesty, fear of being labeled 'commercial', fear of exposing that they're not 100% busy, so "can't be good", etc.). I've been wondering if a broker service like sourceXchange for Python could work -- whether there are enough people who want something done to Python and are willing to pay for an Open Source project (and whether there are enough "worker bees", although I suspect there are). I can think of several items on various TODO lists which could probably be tackled this way. (doing things *within* sourceXchange is clearly a possibility in the long term -- in the short term they seem focused on Linux, but time will tell). Guido, you're probably the point-man for such 'angels' -- do you get those kinds of requests periodically? How about you, Mark? One thing that ActiveState has going for it which doesn't exist in the Python world is a corporate entity devoted to software development and distribution. PPSI is a support company, or at least markets itself that way.
--david From gstein at lyra.org Tue Jun 8 03:05:15 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 18:05:15 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <375C6C4B.617138AB@lyra.org> David Ascher wrote: > > On Tue, 8 Jun 1999, Mark Hammond wrote: > > > When I first heard this, two things sprung to mind: > > a) Why shouldnt Python push for a similar deal? As David points out, I believe this is simply because ActiveState is unique in their business type, products, and model. We don't have anything like that in the Python world (although Pythonware could theoretically go in a similar direction). >... > I've been wondering if a broker service like sourceXchange for Python > could work -- whether there are enough people who want something done to > Python and are willing to pay for an Open Soure project (and whether there > are enough "worker bees", although I suspect there are). I can think of > several items on various TODO lists which could probably be tackled this > way. (doing things *within* sourceXchange is clearly a possibility in the > long term -- in the short term they seem focused on Linux, but time will > tell). sourceXchange should work fine. I don't see it being Linux-only by any means. Heck, the server is a FreeBSD box, and Brian Behlendorf comes from the Apache world (and is a FreeBSD guy mostly). > Guido, you're probably the point-man for such 'angels' -- do you get those > kinds of requests periodically? How about you, Mark? > > One thing that ActiveState has going for it which doesn't exist in the > Python world is a corporate entity devoted to software development and > distribution. PPSI is a support company, or at least markets itself that > way. Yup. That's all we are. We are specifically avoiding any attempts to be a product company. ActiveState is all about products and support-type products. I met with Dick Hardt (ActiveState founder/president) just a couple weeks ago. Great guy. We spoke about ActiveState, what they're doing, and what they'd like to do. They might be looking for good Python people, too... Cheers, -g -- Greg Stein, http://www.lyra.org/ From akuchlin at mems-exchange.org Tue Jun 8 03:22:59 1999 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Mon, 7 Jun 1999 21:22:59 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14172.773.807413.412693@anthem.cnri.reston.va.us> <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> Message-ID: <14172.28787.399827.929220@newcnri.cnri.reston.va.us> Skip Montanaro writes: >True enough, but as Guido pointed out, enabling threads by default would >immediately make the Mac a second-class citizen. Test cases and demos would One possibility might be NSPR, the Netscape Portable Runtime, which provides platform-independent threads and I/O on Mac, Win32, and Unix. Perhaps a thread implementation could be written that sat on top of NSPR, in addition to the existing pthreads implementation. See http://www.mozilla.org/docs/refList/refNSPR/. (You'd probably only use NSPR on the Mac, though; there seems no point in adding another layer of complexity to Unix and Windows.) -- A.M. Kuchling http://starship.python.net/crew/amk/ When religion abandons poetic utterance, it cuts its own throat. 
-- Robertson Davies, _Marchbanks' Garland_ From tim_one at email.msn.com Tue Jun 8 03:24:47 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 7 Jun 1999 21:24:47 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Message-ID: <000901beb14d$b2759100$aaa02299@tim> [David Ascher] > In case you haven't heard about it, ActiveState has recently signed a > contract with Microsoft to do some work on Perl on win32. I'm astonished at the reaction this has provoked "out there". Here: D:\Python>perl -v This is perl, version 5.001 Unofficial patchlevel 1m. Copyright 1987-1994, Larry Wall Win32 port Copyright (c) 1995 Microsoft Corporation. All rights reserved. Developed by hip communications inc., http://info.hip.com/info/ Perl for Win32 Build 107 Built Apr 16 1996 at 14:47:22 Perl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the Perl 5.0 source kit. D:\Python> Notice the MS copyright? From 1995?! Perl for Win32 has *always* been funded by MS, even back when half of ActiveState was named "hip communications" <0.5 wink>. Thank Perl's dominance in CGI scripting -- MS couldn't sell NT Server if it didn't run Perl. MS may be vicious, but they're not stupid . > ... > fork() > ... > Any guesses as to whether we could hijack this work if/when it is released > as Open Source? It's proven impossible so far to reuse anything from the Perl source -- the code is an incestuous nightmare. From time to time the Perl-Porters talk about splitting some of it into reusable libraries, but that never happens; and the less they feel Perl's dominance is assured, the less they even talk about it. So I'm pessimistic (what else is new ?). I'd rather see the work put into threads anyway. The "Mac OS" problem will go away eventually; time to turn the suckers on by default. it's-not-like-millions-of-programmers-will-start-writing-thread-code-then- who-don't-now-ly y'rs - tim From guido at CNRI.Reston.VA.US Tue Jun 8 03:34:59 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 21:34:59 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 19:01:56 CDT." <1283322126-63868517@hypernet.com> References: <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <1283322126-63868517@hypernet.com> Message-ID: <199906080134.VAA13480@eric.cnri.reston.va.us> > Perhaps Christian's stackless Python would enable green threads... This has been suggested before... While this seems possible at first, all blocking I/O calls would have to be redone to pass control to the thread scheduler, before this would be useful -- a huge task! I believe SunOS 4.x's LWP (light-weight processes) library used this method. It was a drop-in replacement for the standard libc, containing changed versions of all system calls. I recall that there were one or two missing, which of course upset the posix module because it references almost *all* system calls... --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Tue Jun 8 03:38:38 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 7 Jun 1999 21:38:38 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> Message-ID: <000e01beb14f$a16d9a40$aaa02299@tim> [/F] > http://www.computerworld.com/home/print.nsf/all/990531AAFA > > which was just my way of saying that "did he perhaps > refer to OS X ?". 
> > or are they adding real threads to good old MacOS too? Dragon is doing a port of its speech recog software to "good old MacOS" and "OS X", and best we can tell the former is as close to an impossible target as we've ever seen. OS X looks like a pleasant romp, in comparison. I don't think they're going to do anything with "good old MacOS" except let it die. it-was-a-reasonable-architecture-15-years-ago-ly y'rs - tim From gstein at lyra.org Tue Jun 8 03:31:08 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 18:31:08 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: <14171.64464.805578.325069@anthem.cnri.reston.va.us> <14172.773.807413.412693@anthem.cnri.reston.va.us> <14172.18567.631562.79944@cm-24-29-94-19.nycap.rr.com> <14172.28787.399827.929220@newcnri.cnri.reston.va.us> Message-ID: <375C725C.5A86D05B@lyra.org> Andrew Kuchling wrote: > > Skip Montanaro writes: > >True enough, but as Guido pointed out, enabling threads by default would > >immediately make the Mac a second-class citizen. Test cases and demos would > > One possibility might be NSPR, the Netscape Portable Runtime, > which provides platform-independent threads and I/O on Mac, Win32, and > Unix. Perhaps a thread implementation could be written that sat on > top of NSPR, in addition to the existing pthreads implementation. > See http://www.mozilla.org/docs/refList/refNSPR/. > > (You'd probably only use NSPR on the Mac, though; there seems no > point in adding another layer of complexity to Unix and Windows.) NSPR is licensed under the MPL, which is quite a bit more restrictive than Python's license. Of course, you could separately point Mac users to it to say "if you get NSPR, then you can have threads". Apache ran into the licensing issue and punted NSPR in favor of a home-grown runtime (which is not as ambitious as NSPR). Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm at hypernet.com Tue Jun 8 04:37:34 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 7 Jun 1999 21:37:34 -0500 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <002a01beb13d$1fa23c80$f29b12c2@pythonware.com> Message-ID: <1283312788-64430290@hypernet.com> Fredrik Lundh writes: > > time to implement channels? (Tcl's unified abstraction > for all kinds of streams that you could theoretically use > something like select on. sockets, pipes, asynchronous > disk I/O, etc). I have mixed feelings about those types of things. I've recently run across a number of them in some C/C++ libs. On the "pro" side, they can give acceptable behavior and adequate performance and thus suffice for the majority of use. On the "con" side, they're usually an order of magnitude slower than the raw interface, don't quite behave correctly in borderline situations, and tend to produce "One True Path" believers. Of course, so do OSes, editors, languages, GUIs, browsers and colas. > does select really work on ordinary files under Unix, > btw? Sorry, should've said "where a socket is a real fd" or some such... just-like-God-intended-ly y'rs - Gordon From guido at CNRI.Reston.VA.US Tue Jun 8 03:46:40 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 21:46:40 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Mon, 07 Jun 1999 17:43:27 PDT." References: Message-ID: <199906080146.VAA13572@eric.cnri.reston.va.us> > Guido, you're probably the point-man for such 'angels' -- do you get those > kinds of requests periodically? 
No, as far as I recall, nobody has ever offered me money for Python code to be donated to the body of open source. People sometimes seek to hire me, but primarily to further their highly competitive proprietary business goals... --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Tue Jun 8 03:41:32 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 18:41:32 -0700 Subject: [Python-Dev] licensing Message-ID: <375C74CC.2947E4AE@lyra.org> Speaking of licensing issues... I seem to have read somewhere that the two Medusa files are under a separate license. Although, reading the files now, it seems they are not. The issue that I'm really raising is that Python should ship with a single license that covers everything. Otherwise, it will become very complicated for somebody to figure out which pieces fall under what restrictions. Is there anything in the distribution that is different than the normal license? For example, can I take the async modules and build a commercial product on them? Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at CNRI.Reston.VA.US Tue Jun 8 03:56:03 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 21:56:03 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Tue, 08 Jun 1999 09:06:37 +1000." <000501beb13a$9eec2c10$0801a8c0@bobcat> References: <000501beb13a$9eec2c10$0801a8c0@bobcat> Message-ID: <199906080156.VAA13612@eric.cnri.reston.va.us> [me] > > Have I ever heard of it! :-) David Grove pulled me into one of his > > bouts of paranoia. I think he's calmed down for the moment. [Mark] > It sounds like a :-), but I'm afraid I don't understand that reference. David Grove occasionally posts to Perl lists with accusations that ActiveState is making Perl proprietary. He once announced a program editor to the Python list which upon inspection by me didn't contain any Python support, for which I flamed him. He then explained to me that he was in a hurry because ActiveState was taking over the Perl world. A couple of days ago, I received an email from him (part of a conversation on the perl5porters list apparently) where he warned me that ActiveState was planning a similar takeover of Python. After some comments from tchrist ("he's a loon") I decided to ignore David. > Sometimes I wish we had a few less good programmers, and a few more good > marketing type people ;-) Ditto... It sure ain't me! > Excuse my ignorance, but how hard would it be to simulate/emulate/ovulate :-) > fork using the Win32 extensions? Python has basically all of the native > Win32 process API exposed, and writing a "fork" in Python that only forked > Python scripts (for example) may be feasible and not too difficult. > > It would have obvious limitations, including the fact that it is not > available standard with Python on Windows (just like a working popen now > :-) but if we could follow the old 80-20 rule, and catch 80% of the uses > with 20% of the effort it may be worth investigating. > > My knowledge of fork is limited to muttering "something about cloning the > current process", so I may be naive in the extreme - but is this feasible? I think it's not needed that much, but David has argued otherwise. I haven't heard much support either way from others. But I think it would be a huge task, because it would require taking control of all file descriptors (given the semantics that upon fork, file descriptors are shared, but if one half closes an fd it is still open in the other half).
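A small Unix-only illustration of the descriptor semantics Guido describes: after fork(), both processes hold copies of every open descriptor, and closing one copy does not close the other. Any fork() emulation on Win32 would have to reproduce exactly this behaviour for every open file (the example itself is invented for illustration):

    import os

    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                              # child
        os.close(r)                           # closes only the child's copy of r
        os.write(w, b"hello from the child\n")
        os._exit(0)
    else:                                     # parent
        os.close(w)                           # closes only the parent's copy of w
        os.waitpid(pid, 0)
        print(os.read(r, 100))                # parent's r is still open and readable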
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Tue Jun 8 04:58:59 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Mon, 7 Jun 1999 21:58:59 -0500 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000e01beb14f$a16d9a40$aaa02299@tim> References: <003501beb13d$cbadf7d0$f29b12c2@pythonware.com> Message-ID: <1283311503-64507593@hypernet.com> [Tim] > Dragon is doing a port of its speech recog software to "good old > MacOS" and "OS X", and best we can tell the former is as close to an > impossible target as we've ever seen. OS X looks like a pleasant > romp, in comparison. I don't think they're going to do anything > with "good old MacOS" except let it die. > > it-was-a-reasonable-architecture-15-years-ago-ly y'rs - tim Don't Macs have another CPU in the keyboard already? Maybe you could just require a special microphone . that's-not-a-mini-tower-that's-a-um--subwoofer-ly y'rs - Gordon From guido at CNRI.Reston.VA.US Tue Jun 8 04:09:02 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 07 Jun 1999 22:09:02 -0400 Subject: [Python-Dev] licensing In-Reply-To: Your message of "Mon, 07 Jun 1999 18:41:32 PDT." <375C74CC.2947E4AE@lyra.org> References: <375C74CC.2947E4AE@lyra.org> Message-ID: <199906080209.WAA13806@eric.cnri.reston.va.us> > Speaking of licensing issues... > > I seem to have read somewhere that the two Medusa files are under a > separate license. Although, reading the files now, it seems they are > not. > > The issue that I'm really raising is that Python should ship with a > single license that covers everything. Otherwise, it will become very > complicated for somebody to figure out which pieces fall under what > restrictions. > > Is there anything in the distribution that is different than the normal > license? There are pieces with different licenses but they only differ in the names of the beneficiaries, not in the conditions (although the words aren't always exactly the same). As far as I can tell, this is the situation for asyncore.py and asynchat.py: they have a copyright notice of their own (see the 1.5.2 source for the exact text) with Sam Rushing's copyright. > For example, can I take the async modules and build a commercial product > on them? As far as I know, yes. Sam Rushing promised me this when he gave them to me for inclusion. (I've had a complaint that they aren't the latest -- can someone confirm this?) --Guido van Rossum (home page: http://www.python.org/~guido/) From MHammond at skippinet.com.au Tue Jun 8 05:11:57 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 13:11:57 +1000 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <199906080156.VAA13612@eric.cnri.reston.va.us> Message-ID: <000b01beb15c$abd84ea0$0801a8c0@bobcat> [Please dont copy this out of this list :-] > world. A couple of days ago, I received an email from him (part of a > conversation on the perl5porters list apparently) where he warned me > that ActiveState was planning a similar takeover of Python. After > some comments from tchrist ("he's a loon") I decided to ignore David. I believe this to be true - at least "take over" in the same way they have "taken over" Perl. I have it on very good authority that Active State's medium term business plan includes expanding out of Perl alone, and Python is very high on their list. I also believe they would like to recruit people to help with this goal. 
They are of the opinion that Python alone could not support such a business quite yet, so attaching it to existing infrastructure could fly. On one hand I tend to agree, but on the other hand I think that we do a pretty damn good job as it is, so maybe a Python could fly all alone? And Ive got to say that personally, such an offer would be highly attractive. Depending on the terms (and I must admit I have not had a good look at the ActiveState Perl licenses) this could provide a real boost to the Python world. If the business model is open source software with paid-for support, it seems a win-win situation to me. However, it is very unclear to me, and the industry, that this model alone can work generally. A business-plan that involves withholding sources or technologies until a fee has been paid certainly moves quickly away from win-win to, to quote Guido, "highly competitive proprietary business goals". May be some interesting times ahead. For some time now I have meant to pass this on to PPSI as a heads-up, just incase they intend playing in that space in the future. So consider this it ;-) Mark. From gstein at lyra.org Tue Jun 8 05:13:42 1999 From: gstein at lyra.org (Greg Stein) Date: Mon, 07 Jun 1999 20:13:42 -0700 Subject: [Python-Dev] ActiveState & fork & Perl References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> Message-ID: <375C8A66.56B3F26B@lyra.org> Mark Hammond wrote: > > [Please dont copy this out of this list :-] It's in the archives now... :-) >...[well-said comments about open source and businesses]... > > May be some interesting times ahead. For some time now I have meant to > pass this on to PPSI as a heads-up, just incase they intend playing in that > space in the future. So consider this it ;-) I've already met Dick Hardt and spoken with him at length. Both on an individual basis, and as the President of PPSI. Nothing to report... (yet) Cheers, -g p.s. PPSI is a bit different, as we intend to fill the "support gap" rather than move into real products; ActiveState does products, along with support type stuff and other miscellaneous (I don't recall Dick's list offhand). -- Greg Stein, http://www.lyra.org/ From tim_one at email.msn.com Tue Jun 8 07:14:36 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 01:14:36 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000b01beb15c$abd84ea0$0801a8c0@bobcat> Message-ID: <000401beb16d$cd88d180$f29e2299@tim> [MarkH] > ... > And Ive got to say that personally, such an offer would be highly > attractive. Depending on the terms (and I must admit I have not > had a good look at the ActiveState Perl licenses) this could provide > a real boost to the Python world. I find the ActivePerl license to be quite confusing: http://www.activestate.com/ActivePerl/commlic.htm It appears to say flatly that you can't distribute it yourself, although other pages on the site say "sure, go ahead!". Also seems to imply you can't modify their code (they explicitly allow you to install patches obtained from ActiveState -- but that's all they mention). OTOH, they did a wonderful job on the Perl for Win32 port (a difficult port in the face of an often-hostile Perl community), and gave all the code back to the Perl folk. I've got no complaints about them so far. > If the business model is open source software with paid-for support, it > seems a win-win situation to me. "Part of our business model is to sell value added, proprietary components."; e.g., they sell a Perl Development Kit for $100, and so on. Fine by me! 
If I could sell tabnanny ... well, I wouldn't do that to anyone . would-like-to-earn-$1-from-python-before-he-dies-ly y'rs - tim From skip at mojam.com Tue Jun 8 07:37:22 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 8 Jun 1999 01:37:22 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <375C516E.76EC8ED4@lyra.org> References: <375C516E.76EC8ED4@lyra.org> Message-ID: <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> Greg> About the only reason that I can see to *not* make them the Greg> default is the slight speed loss. But that seems a bit bogus, as Greg> the interpreter loop doesn't spend *that* much time mucking with Greg> the interp_lock to allow thread switches. There have also been Greg> some real good suggestions for making it take near-zero time until Greg> you actually create that second thread. Okay, everyone has convinced me that holding threads hostage to the Mac is a red herring. I have other fish to fry. (It's 1:30AM and I haven't had dinner yet. Can you tell? ;-) Is there a way with configure to determine whether or not particular Unix variants should have threads enabled or not? If so, I think that's the way to go. I think it would be unfortunate to enable it by default, have it appear to work on some known to be unsupported platforms, but then bite the programmer in an inconvenient place at an inconvenient time. Such a self-deciding configure script should exit with some information about thread enablement: Yes, we support threads on RedHat Linux 6.0. No, you stinking Minix user, you will never have threads. Rhapsody, huh? I never heard of that. Some weird OS from Sunnyvale, you say? I don't know how to do threads there yet, but when you figure it out, send patches along to python-dev at python.org. Of course, users should be able to override anything using --with-thread or without-thread and possibly specify compile-time and link-time flags through arguments or the environment. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From skip at mojam.com Tue Jun 8 07:49:19 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 8 Jun 1999 01:49:19 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000b01beb15c$abd84ea0$0801a8c0@bobcat> References: <199906080156.VAA13612@eric.cnri.reston.va.us> <000b01beb15c$abd84ea0$0801a8c0@bobcat> Message-ID: <14172.44596.528927.548722@cm-24-29-94-19.nycap.rr.com> Okay, folks. I must have missed the memo. Who are ActiveState and sourceXchange? I can't be the only person on python-dev who never heard of either of them before this evening. I guess I'm the only one who's not shy about exposing their ignorance. but-i-can-tell-you-where-to-find-spare-parts-for-your-Triumph-ly 'yrs, Skip Montanaro 518-372-5583 See my car: http://www.musi-cal.com/~skip/ From da at ski.org Tue Jun 8 08:12:11 1999 From: da at ski.org (David Ascher) Date: Mon, 7 Jun 1999 23:12:11 -0700 (Pacific Daylight Time) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <14172.44596.528927.548722@cm-24-29-94-19.nycap.rr.com> Message-ID: > Okay, folks. I must have missed the memo. Who are ActiveState and > sourceXchange? I can't be the only person on python-dev who never heard of > either of them before this evening. I guess I'm the only one who's not shy > about exposing their ignorance. 
Well, one answer is to look at www.activestate.com and www.sourcexchange.com, of course =) ActiveState "does" the win32 perl port, for money. (it's a little controversial within the Perl community, which has inherited some of RMS's "Microsoft OS? Ha!" attitude). sourceXchange is aiming to match open source programmers with companies who want open source work done for $$'s, in a 'market' format. It was started by Brian Behlendorf, now at O'Reilly, and of Apache fame. Go get dinner. =) --david From rushing at nightmare.com Tue Jun 8 02:10:18 1999 From: rushing at nightmare.com (Sam Rushing) Date: Mon, 7 Jun 1999 17:10:18 -0700 (PDT) Subject: [Python-Dev] licensing In-Reply-To: <9403621@toto.iv> Message-ID: <14172.23937.83700.673653@seattle.nightmare.com> Guido van Rossum writes: > Greg Stein writes: > > For example, can I take the async modules and build a commercial > > product on them? Yes, my intent was that they go under the normal Python 'do what thou wilt' license. If I goofed in any way, please let me know! > As far as I know, yes. Sam Rushing promised me this when he gave > them to me for inclusion. (I've had a complaint that they aren't > the latest -- can someone confirm this?) Guilty as charged. I've been tweaking them a bit lately, for performance, but anyone can grab the very latest versions out of the medusa CVS repository: CVSROOT=:pserver:medusa at seattle.nightmare.com:/usr/local/cvsroot (the password is 'medusa') Or download one of the snapshots. BTW, those particular files have always had the Python copyright/license. -Sam From gstein at lyra.org Tue Jun 8 09:09:00 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 08 Jun 1999 00:09:00 -0700 Subject: [Python-Dev] licensing References: <14172.23937.83700.673653@seattle.nightmare.com> Message-ID: <375CC18C.1DB5E9F2@lyra.org> Sam Rushing wrote: > > Greg Stein writes: > > > For example, can I take the async modules and build a commercial > > > product on them? > > Yes, my intent was that they go under the normal Python 'do what thou > wilt' license. If I goofed in any way, please let me know! Nope... you haven't goofed. I was thrown off when a certain person (nudge, nudge) goofed in their upcoming book, which I recently reviewed. thx! -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Tue Jun 8 10:08:08 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 10:08:08 +0200 Subject: [Python-Dev] licensing References: <375C74CC.2947E4AE@lyra.org> Message-ID: <00c501beb186$0c6d3450$f29b12c2@pythonware.com> > I seem to have read somewhere that the two Medusa files are under a > separate license. Although, reading the files now, it seems they are > not. the medusa server has restrictive license, but the asyncore and asynchat modules use the standard Python license, with Sam Rushing as the copyright owner. just use the source... > The issue that I'm really raising is that Python should ship with a > single license that covers everything. Otherwise, it will become very > complicated for somebody to figure out which pieces fall under what > restrictions. > > Is there anything in the distribution that is different than the normal > license? > > For example, can I take the async modules and build a commercial product > on them? surely hope so -- we're using them in everything we do. and my upcoming book is 60% about doing weird things with tkinter, and 40% about doing weird things with asynclib... 
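(For anyone who hasn't met "the async modules" being discussed: asyncore drives many non-blocking sockets from a single select() loop. The sketch below is a minimal echo server, purely illustrative and not taken from Medusa or from anyone's book; the port number is arbitrary, and it assumes a Python that still ships the asyncore module.)

import asyncore
import socket

class EchoHandler(asyncore.dispatcher_with_send):
    def handle_read(self):
        data = self.recv(8192)        # echo back whatever arrived
        if data:
            self.send(data)

class EchoServer(asyncore.dispatcher):
    def __init__(self, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind(('', port))
        self.listen(5)

    def handle_accept(self):
        pair = self.accept()          # hand each new connection to a handler
        if pair is not None:
            conn, addr = pair
            EchoHandler(conn)

EchoServer(8007)
asyncore.loop()                       # one select() loop drives everything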
From MHammond at skippinet.com.au Tue Jun 8 10:46:33 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 18:46:33 +1000 Subject: [Python-Dev] licensing In-Reply-To: <375CC18C.1DB5E9F2@lyra.org> Message-ID: <001101beb18b$6a049bd0$0801a8c0@bobcat> > Nope... you haven't goofed. I was thrown off when a certain person > (nudge, nudge) goofed in their upcoming book, which I > recently reviewed. I now feel for the other Mark and David, Aaron et al, etc. Our book is out of date in a number of ways before the tech reviewers even saw it. Medusa wasn't a good example - I should have known better when I wrote it. But Pythonwin is a _real_ problem. Just as I start writing the book, Neil sends me a really cool editor control and it leads me down a path of IDLE/Pythonwin integration. So almost _everything_ I have already written on "IDEs for Python" is already out of date - and printing is not scheduled for a number of months. [This may help explain to Guido and Tim my recent fervour in this area - I want to get the "new look" Pythonwin ready for the book. I just yesterday got a dockable interactive window happening. Now adding a splitter window to each window to expose a pyclbr based tree control and then it is time to stop (and re-write that chapter :-] Mark. From fredrik at pythonware.com Tue Jun 8 12:25:47 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 8 Jun 1999 12:25:47 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: Message-ID: <004d01beb199$4fe171c0$f29b12c2@pythonware.com> > (modesty, fear of being labeled 'commercial', fear of exposing that > they're not 100% busy, so "can't be good", etc.). fwiw, we're seeing an endless stream of mails from moral crusaders even before we have opened the little PythonWare shoppe (coming soon, coming soon). some of them are quite nasty, to say the least... I usually tell them to raise their concerns on c.l.python instead. they never do. > One thing that ActiveState has going for it which doesn't exist in the > Python world is a corporate entity devoted to software development and > distribution. saying that there is NO such entity is a bit harsh, I think ;-) but different "scripting" companies are using different strategies, for various reasons. Scriptics, ActiveState, PythonWare, UserLand, Harlequin, Rebol, etc. are all doing similar things, but in different ways (due to markets, existing communities, and probably most important: different funding strategies). But we're all corporate entities devoted to software development... ... by the way, if someone thinks there's no money in Python, consider this: --- Google is looking to expand its operations and needs talented engineers to develop the next generation search engine. If you have a need to bring order to a chaotic web, contact us. Requirements: Several years of industry or hobby-based experience B.S. in Computer Science or equivalent (M.S. a plus) Extensive experience programming in C or C++ Extensive experience programming in the UNIX environment Knowledge of TCP/IP and network programming Experience developing/designing large software systems Experience programming in Python a plus --- Google Inc., a year-old Internet search-engine company, said it has attracted $25 million in venture-capital funding and will add two of Silicon Valley's best-known financiers, Michael Moritz and L. John Doerr, to its board. Even by Internet standards, Google has attracted an unusually large amount of money for a company still in its infancy.
--- looks like anyone on this list could get a cool Python job for an unusually over-funded startup within minutes ;-) From skip at mojam.com Tue Jun 8 13:12:02 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 8 Jun 1999 07:12:02 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <004d01beb199$4fe171c0$f29b12c2@pythonware.com> References: <004d01beb199$4fe171c0$f29b12c2@pythonware.com> Message-ID: <14172.63947.54638.275348@cm-24-29-94-19.nycap.rr.com> Fredrik> Even by Internet standards, Google has attracted an un- Fredrik> usually large amount of money for a company still in its Fredrik> infancy. And it's a damn good search engine to boot, so I think it probably deserves the funding (most of it will, I suspect, be used to muscle its way into a crowded market). It is *always* my first stop when I need a general-purpose search engine these days. I never use InfoSeek/Go, Lycos or HotBot for anything other than to check that Musi-Cal is still in their database. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From guido at CNRI.Reston.VA.US Tue Jun 8 14:46:51 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 08 Jun 1999 08:46:51 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Tue, 08 Jun 1999 01:37:22 EDT." <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> References: <375C516E.76EC8ED4@lyra.org> <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> Message-ID: <199906081246.IAA14302@eric.cnri.reston.va.us> > Is there a way with configure to determine whether or not particular Unix > variants should have threads enabled or not? If so, I think that's the way > to go. I think it would be unfortunate to enable it by default, have it > appear to work on some known to be unsupported platforms, but then bite the > programmer in an inconvenient place at an inconvenient time. That's not so much the problem: if you can get a threaded program to compile and link that probably means sufficient support exists. There currently are checks in the configure script that try to find out which thread library to use -- these could be expanded to disable threads when none of the known ones work. Anybody care enough to try hacking configure.in, or should I add this to my tired TODO list? --Guido van Rossum (home page: http://www.python.org/~guido/) From jack at oratrix.nl Tue Jun 8 14:47:44 1999 From: jack at oratrix.nl (Jack Jansen) Date: Tue, 08 Jun 1999 14:47:44 +0200 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Message by Andrew Kuchling , Mon, 7 Jun 1999 21:22:59 -0400 (EDT) , <14172.28787.399827.929220@newcnri.cnri.reston.va.us> Message-ID: <19990608124745.3136B303120@snelboot.oratrix.nl> > One possibility might be NSPR, the Netscape Portable Runtime, > which provides platform-independent threads and I/O on Mac, Win32, and > Unix. Perhaps a thread implementation could be written that sat on > top of NSPR, in addition to the existing pthreads implementation. > See http://www.mozilla.org/docs/refList/refNSPR/. NSPR looks rather promising! Does anyone have any experience with it? What I'd also be interested in is experiences in how it interacts with the "real" I/O system, i.e. can you mix and match NSPR calls with normal os calls, or will that break things?
The latter is important for Python, because there are lots of external libraries, and while some are user-built (image libraries, gdbm, etc) and could conceivably be converted to use NSPR others are not... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido at CNRI.Reston.VA.US Tue Jun 8 15:28:02 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 08 Jun 1999 09:28:02 -0400 Subject: [Python-Dev] Python-dev archives going public In-Reply-To: Your message of "Mon, 07 Jun 1999 20:13:42 PDT." <375C8A66.56B3F26B@lyra.org> References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> <375C8A66.56B3F26B@lyra.org> Message-ID: <199906081328.JAA14584@eric.cnri.reston.va.us> > > [Please dont copy this out of this list :-] > > It's in the archives now... :-) Which reminds me... A while ago, Greg made some noises about the archives being public, and temporarily I made them private. In the following brief flurry of messages everybody who spoke up said they preferred the archives to be public (even though the list remains invitation-only). But I never made the change back, waiting for Greg to agree, but after returning from his well deserved tequilla-splashed vacation, he never gave a peep about this, and I "conveniently forgot". I still like the archives to be public. I hope Mark's remark there was a joke? --Guido van Rossum (home page: http://www.python.org/~guido/) From MHammond at skippinet.com.au Tue Jun 8 15:38:03 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 8 Jun 1999 23:38:03 +1000 Subject: [Python-Dev] Python-dev archives going public In-Reply-To: <199906081328.JAA14584@eric.cnri.reston.va.us> Message-ID: <003101beb1b4$22786de0$0801a8c0@bobcat> > I still like the archives to be public. I hope Mark's remark there > was a joke? Well, not really a joke, but I am not naive to think this is a "private" forum even in the absence of archives. What I meant was closer to "please don't make public statements based purely on this information". I never agreed to keep it private, but by the same token didnt want to start the rumour mills and get bad press for either Dick or us ;-) Mark. From bwarsaw at cnri.reston.va.us Tue Jun 8 17:09:24 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Tue, 8 Jun 1999 11:09:24 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl References: <375C516E.76EC8ED4@lyra.org> <14172.43513.272949.148790@cm-24-29-94-19.nycap.rr.com> <199906081246.IAA14302@eric.cnri.reston.va.us> Message-ID: <14173.12836.616873.953134@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Anybody care enough to try hacking configure.in, or should Guido> I add this to my tired TODO list? I'll give it a look. I've done enough autoconf hacking that it shouldn't be too hard. I also need to get my string meths changes into the tree... -Barry From gstein at lyra.org Tue Jun 8 20:11:56 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 08 Jun 1999 11:11:56 -0700 Subject: [Python-Dev] Python-dev archives going public References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> <375C8A66.56B3F26B@lyra.org> <199906081328.JAA14584@eric.cnri.reston.va.us> Message-ID: <375D5CEC.340E2531@lyra.org> Guido van Rossum wrote: > > > > [Please dont copy this out of this list :-] > > > > It's in the archives now... :-) > > Which reminds me... 
A while ago, Greg made some noises about the > archives being public, and temporarily I made them private. In the > following brief flurry of messages everybody who spoke up said they > preferred the archives to be public (even though the list remains > invitation-only). But I never made the change back, waiting for Greg > to agree, but after returning from his well deserved tequilla-splashed > vacation, he never gave a peep about this, and I "conveniently > forgot". I appreciate the consideration, but figured it was a done deal based on feedback. My only consideration in keeping them private was the basic, human fact that people could feel left out. For example, if they read the archives, thought it was neat, and attempted to subscribe only to be refused. It is a bit easier to avoid engendering those bad feelings if the archives aren't public. Cheers, -g -- Greg Stein, http://www.lyra.org/ From jim at digicool.com Tue Jun 8 20:41:11 1999 From: jim at digicool.com (Jim Fulton) Date: Tue, 08 Jun 1999 18:41:11 +0000 Subject: [Python-Dev] Python-dev archives going public References: <000b01beb15c$abd84ea0$0801a8c0@bobcat> <375C8A66.56B3F26B@lyra.org> <199906081328.JAA14584@eric.cnri.reston.va.us> <375D5CEC.340E2531@lyra.org> Message-ID: <375D63C7.6BB6697E@digicool.com> Greg Stein wrote: > > My only consideration in keeping them private was the basic, human fact > that people could feel left out. For example, if they read the archives, > thought it was neat, and attempted to subscribe only to be refused. It > is a bit easier to avoid engendering those bad feelings if the archives > aren't public. I agree. Jim -- Jim Fulton mailto:jim at digicool.com Technical Director (540) 371-6909 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From tismer at appliedbiometrics.com Tue Jun 8 21:37:21 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 08 Jun 1999 21:37:21 +0200 Subject: [Python-Dev] Stackless Preview References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> Message-ID: <375D70F1.37007192@appliedbiometrics.com> Christian Tismer wrote: [a lot] > fearing the feedback :-) ciao - chris I expected everything but forgot to fear "no feedback". :-) About 5 or 6 people seem to have taken the .zip file. Now I'm wondering why nobody complains. Was my code so wonderful, so disgustingly bad, or is this just boring :-? If it's none of the three above, I'd be happy to get a hint if I should continue, or if and what I should change. Maybe it would make sense to add some documentation now, and also to come up with an application which makes use of the stackless implementation, since there is now not much to wonder about than that it seems to work :-) yes-call-me-impatient - ly chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From jeremy at cnri.reston.va.us Tue Jun 8 22:09:15 1999 From: jeremy at cnri.reston.va.us (Jeremy Hylton) Date: Tue, 8 Jun 1999 16:09:15 -0400 (EDT) Subject: [Python-Dev] Stackless Preview In-Reply-To: <375D70F1.37007192@appliedbiometrics.com> References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> <375D70F1.37007192@appliedbiometrics.com> Message-ID: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> >>>>> "CT" == Christian Tismer writes: CT> Christian Tismer wrote: [a lot] >> fearing the feedback :-) ciao - chris CT> I expected everything but forgot to fear "no feedback". :-) CT> About 5 or 6 people seem to have taken the .zip file. Now I'm CT> wondering why nobody complains. Was my code so wonderful, so CT> disgustingly bad, or is this just boring :-? CT> If it's none of the three above, I'd be happy to get a hint if I CT> should continue, or if and what I should change. I'm one of the silent 5 or 6. My reasons fall under "None of the above." They are three in number: 1. No time (the perennial excuse; next 2 weeks are quite hectic) 2. I tried to use ndiff to compare old and new ceval.c, but ran into some problems with that tool. (Tim, it looks like the line endings are identical -- all '\012'.) 3. Wasn't sure what to look at first My only suggestion would be to have an executive summary. If there was a short README file -- no more than 150 lines -- that described the essentials of the approach and told me what to look at first, I would be able to comment more quickly. Jeremy From tismer at appliedbiometrics.com Tue Jun 8 22:15:04 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 08 Jun 1999 22:15:04 +0200 Subject: [Python-Dev] Stackless Preview References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> <375D70F1.37007192@appliedbiometrics.com> <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> Message-ID: <375D79C8.90B3E721@appliedbiometrics.com> Jeremy Hylton wrote: [...] > I'm one of the silent 5 or 6. My reasons fall under "None of the > above." They are three in number: > 1. No time (the perennial excuse; next 2 weeks are quite hectic) > 2. I tried to use ndiff to compare old and new ceval.c, but > ran into some problems with that tool. (Tim, it looks > like the line endings are identical -- all '\012'.) Yes, there are a lot of changes. As a hint: windiff from VC++ does a great job here. You can see both sources in one, in a very readable colored form. > 3. Wasn't sure what to look at first > > My only suggestion would be to have an executive summary. If there > was a short README file -- no more than 150 lines -- that described > the essentials of the approach and told me what to look at first, I > would be able to comment more quickly. Thanks a lot. Will do this tomorrow moaning as my first task. feeling much better - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! 
Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From Vladimir.Marangozov at inrialpes.fr Wed Jun 9 00:29:27 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Wed, 9 Jun 1999 00:29:27 +0200 (DFT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <19990608124745.3136B303120@snelboot.oratrix.nl> from "Jack Jansen" at "Jun 8, 99 02:47:44 pm" Message-ID: <199906082229.AAA48646@pukapuka.inrialpes.fr> Jack Jansen wrote: > > NSPR looks rather promising! Does anyone has any experiences with it? What I'd > also be interested in is experiences in how it interacts with the "real" I/O > system, i.e. can you mix and match NSPR calls with normal os calls, or will > that break things? I've looked at it in the past. From memory, NSPR is a fairly big chunk of code and it seemed to me that it's self contained for lots of system stuff. Don't know about I/O, but I played with it to replace the BSD malloc it uses with pymalloc and I was pleased to see the resulting speed & mem stats after rebuilding one of the past Mozilla distribs. This is all the experience I have with it. > > The latter is important for Python, because there are lots of external > libraries, and while some are user-built (image libraries, gdbm, etc) and > could conceivably be converted to use NSPR others are not... I guess that this one would be hard... -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From Vladimir.Marangozov at inrialpes.fr Wed Jun 9 00:45:48 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Wed, 9 Jun 1999 00:45:48 +0200 (DFT) Subject: [Python-Dev] Stackless Preview In-Reply-To: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> from "Jeremy Hylton" at "Jun 8, 99 04:09:15 pm" Message-ID: <199906082245.AAA48828@pukapuka.inrialpes.fr> Jeremy Hylton wrote: > > CT> If it's none of the three above, I'd be happy to get a hint if I > CT> should continue, or if and what I should change. > > I'm one of the silent 5 or 6. My reasons fall under "None of the > above." They are three in number: > ... > My only suggestion would be to have an executive summary. If there > was a short README file -- no more than 150 lines -- that described > the essentials of the approach and told me what to look at first, I > would be able to comment more quickly. Same here + a small wish: please save me the stripping of the ^M line endings typical for MSW, so that I can load the files directly in Xemacs on a Unix box. Otherwise, like Jeremy, I was a bit lost trying to read ceval.c which is already too hairy. -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From tim_one at email.msn.com Wed Jun 9 04:27:37 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 22:27:37 -0400 Subject: [Python-Dev] Stackless Preview In-Reply-To: <199906082245.AAA48828@pukapuka.inrialpes.fr> Message-ID: <000d01beb21f$a3daac20$2fa22299@tim> [Vladimir Marangozov] > ... > please save me the stripping of the ^M line endings typical for MSW, > so that I can load the files directly in Xemacs on a Unix box. 
Vlad, get linefix.py from Python FTP contrib's System area; converts among Unix, Windows and Mac line conventions; to Unix by default. For that matter, do a global replace of ^M in Emacs . buncha-lazy-whiners-ly y'rs - tim From tim_one at email.msn.com Wed Jun 9 04:27:35 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 22:27:35 -0400 Subject: [Python-Dev] Stackless Preview In-Reply-To: <14173.30485.113456.830246@bitdiddle.cnri.reston.va.us> Message-ID: <000c01beb21f$a2bd5540$2fa22299@tim> [Christian Tismer] > ... > If it's none of the three above, I'd be happy to get a hint if I > should continue, or if and what I should change. Sorry, Chris! Just a case of "no time" here. Of *course* you should continue, and Guido should pop in with an encouraging word too -- or a "forget it". I think this design opens the doors to a world of interesting ideas, but that's based on informed prejudice rather than careful study of your code. Cheer up: if everyone thought you were a lame ass, we all would have studied your code intensely by now . [Jeremy] > 2. I tried to use ndiff to compare old and new ceval.c, but > ran into some problems with that tool. (Tim, it looks > like the line endings are identical -- all '\012'.) Then let's treat this like a real bug : which version of Python did you use? And ship me the files in a tarball (I'll find a way to extract them intact). And does that specific Python+ndiff combo work OK on *other* files? Or does it fail to find any lines in common no matter what you feed it (a 1-line test case would be a real help )? I couldn't provoke a problem with the stock 1.5.2 ndiff under the stock 1.5.2 Windows Python, using the then-current CVS snapshot of ceval.c as file1 and the ceval.c from Christian's stackless_990606.zip file as file2. Both files have \r\n line endings for me, though (one thanks to CVS line translation, and the other thanks to WinZip line translation). or-were-you-running-ndiff-under-the-stackless-python?-ly y'rs - tim From tim_one at email.msn.com Wed Jun 9 04:27:40 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 8 Jun 1999 22:27:40 -0400 Subject: [Python-Dev] licensing In-Reply-To: <001101beb18b$6a049bd0$0801a8c0@bobcat> Message-ID: <000f01beb21f$a5e2ff40$2fa22299@tim> [Mark Hammond] > ... > [This may help explain to Guido and Tim my recent fervour in this area > - I want to get the "new look" Pythonwin ready for the book. I just > yesterday got a dockable interactive window happening. Now adding a > splitter window to each window to expose a pyclbr based tree control and > then it is time to stop (and re-write that chapter :-] All right! Do get the latest CVS versions of these files: pyclbr has been sped up a lot over the past two days, and is much less likely to get baffled now. And AutoIndent.py now defaults usetabs to 1 (which, of course, means it still uses spaces in new files ). From guido at CNRI.Reston.VA.US Wed Jun 9 05:31:11 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 08 Jun 1999 23:31:11 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: Your message of "Tue, 08 Jun 1999 22:27:35 EDT." <000c01beb21f$a2bd5540$2fa22299@tim> References: <000c01beb21f$a2bd5540$2fa22299@tim> Message-ID: <199906090331.XAA23066@eric.cnri.reston.va.us> Tim wrote: > Sorry, Chris! Just a case of "no time" here. Of *course* you > should continue, and Guido should pop in with an encouraging word > too -- or a "forget it". 
I think this design opens the doors to a > world of interesting ideas, but that's based on informed prejudice > rather than careful study of your code. Cheer up: if everyone > thought you were a lame ass, we all would have studied your code > intensely by now . No time here either... I did try to have a quick peek and my first impression is that it's *very* tricky code! You know what I think of that... Here's what I think we should do first (I've mentioned this before but nobody cheered me on :-). I'd like to see this as the basis for 1.6. We should structurally split the Python Virtual Machine and related code up into different parts -- both at the source code level and at the runtime level. The core PVM becomes a replaceable component, and so do a few other parts like the parser, the bytecode compiler, the import code, and the interactive read-eval-print loop. Most object implementations are shared between all -- or at least the interfaces are interchangeable. Clearly, a few object types are specific to one or another PVM (e.g. frames). The collection of builtins is also a separate component (though some builtins may again be specific to a PVM -- details, details!). The goal of course, is to create a market for 3rd party components here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's importer, and so on. Thoughts? --Guido van Rossum (home page: http://www.python.org/~guido/) From da at ski.org Wed Jun 9 05:37:36 1999 From: da at ski.org (David Ascher) Date: Tue, 8 Jun 1999 20:37:36 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Splitting up the PVM In-Reply-To: <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: On Tue, 8 Jun 1999, Guido van Rossum wrote: > We should structurally split the Python Virtual Machine and related > code up into different parts -- both at the source code level and at > the runtime level. The core PVM becomes a replaceable component, and > so do a few other parts like the parser, the bytecode compiler, the > import code, and the interactive read-eval-print loop. > The goal of course, is to create a market for 3rd party components > here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's > importer, and so on. > > Thoughts? If I understand it correctly, it means that I can fit in a third-party read-eval-print loop, which is my biggest area of frustration with the current internal structure. Sounds like a plan to me, and one which (lucky for me) I'm not qualified for! --david From skip at mojam.com Wed Jun 9 05:45:33 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 8 Jun 1999 23:45:33 -0400 (EDT) Subject: [Python-Dev] Stackless Preview In-Reply-To: <375D70F1.37007192@appliedbiometrics.com> References: <000901beafcc$424ec400$639e2299@tim> <375AD1DC.19C1C0F6@appliedbiometrics.com> <375D70F1.37007192@appliedbiometrics.com> Message-ID: <14173.58054.869171.927699@cm-24-29-94-19.nycap.rr.com> Chris> If it's none of the three above, I'd be happy to get a hint if I Chris> should continue, or if and what I should change. Chris, My vote is for you to keep at it. I haven't looked at it because I have absolutely zero free time available. This will probably continue until at least the end of July, perhaps until Labor Day. Big doings at Musi-Cal and in the Montanaro household (look for an area code change in a month or so). 
Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From tismer at appliedbiometrics.com Wed Jun 9 14:58:40 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 09 Jun 1999 14:58:40 +0200 Subject: [Python-Dev] Splitting up the PVM References: <000c01beb21f$a2bd5540$2fa22299@tim> <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: <375E6500.307EF39E@appliedbiometrics.com> Guido van Rossum wrote: > > Tim wrote: > > > Sorry, Chris! Just a case of "no time" here. Of *course* you > > should continue, and Guido should pop in with an encouraging word > > too -- or a "forget it". I think this design opens the doors to a > > world of interesting ideas, but that's based on informed prejudice > > rather than careful study of your code. Cheer up: if everyone > > thought you were a lame ass, we all would have studied your code > > intensely by now . > > No time here either... > > I did try to have a quick peek and my first impression is that it's > *very* tricky code! You know what I think of that... Thanks for looking into it, thanks for saying it's tricky. Since I failed to supply proper documentation yet, this impression must come up. But it is really not true. The code is not tricky but just straightforward and consequent, after one has understood what it means to work without a stack, under the precondition to avoid too much changes. I didn't want to rewrite the world, and I just added the tiny missing bits. I will write up my documentation now, and you will understand what the difficulties were. These will not vanish, "stackless" is a brainteaser. My problem was not how to change the code, but finally it was how to change my brain. Now everything is just obvious. > Here's what I think we should do first (I've mentioned this before but > nobody cheered me on :-). I'd like to see this as the basis for 1.6. > > We should structurally split the Python Virtual Machine and related > code up into different parts -- both at the source code level and at > the runtime level. The core PVM becomes a replaceable component, and > so do a few other parts like the parser, the bytecode compiler, the > import code, and the interactive read-eval-print loop. Most object > implementations are shared between all -- or at least the interfaces > are interchangeable. Clearly, a few object types are specific to one > or another PVM (e.g. frames). The collection of builtins is also a > separate component (though some builtins may again be specific to a > PVM -- details, details!). Good idea, and a lot of work. Having different frames for different PVM's was too much for me. Instead, I tried to adjust frames in a way where a lot of machines can work with. I tried to show the concept of having different VM's by implementing a stackless map. Stackless map is a very tiny one which uses frames again (and yes, this was really hacked). Well, different frame flavors would make sense, perhaps. But I have a central routine which handles all calls to frames, and this is what I think is needed. I already *have* pluggable interpreters here, since a function can produce a frame which is bound to an interpreter, and push it to the frame stack. > The goal of course, is to create a market for 3rd party components > here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's > importer, and so on. I'm with that component goal, of course. Much work, not for one persone, but great. 
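(A toy sketch of the dispatcher idea Christian describes above, a central loop that runs whichever interpreter a frame is bound to. Every name here is invented for illustration and none of this is the actual C code in the stackless patch; the point is only that interpreters return the next frame to run instead of calling each other, so the C stack never grows with the call depth.)

class Frame:
    def __init__(self, interp, data, back=None):
        self.interp = interp          # the interpreter bound to this frame
        self.data = data
        self.back = back              # the calling frame, i.e. the frame chain

def run(frame):
    # The central dispatcher: each interpreter does a little work and
    # returns the next frame (a callee, itself, or its caller), never
    # making a nested call of its own.
    while frame is not None:
        frame = frame.interp(frame)

def counting_interp(frame):
    if frame.data:
        print(frame.data.pop(0))      # one "step" of work
        return frame                  # more to do, stay on this frame
    return frame.back                 # finished, "return" to the caller

run(Frame(counting_interp, [1, 2, 3]))    # prints 1, 2, 3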
While I don't think it makes sense to make a flat PVM pluggable. I would start with a flat PVM, since that opens a world of possibilities. You can hardly plug flatness in after you started with a wrong stack layout. Vice versa, plugging the old machine would be possible. later - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Wed Jun 9 15:08:38 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 09 Jun 1999 15:08:38 +0200 Subject: [Python-Dev] Stackless Preview References: <000c01beb21f$a2bd5540$2fa22299@tim> Message-ID: <375E6756.370BA78E@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer] > > ... > > If it's none of the three above, I'd be happy to get a hint if I > > should continue, or if and what I should change. > > Sorry, Chris! Just a case of "no time" here. Of *course* you should > continue, and Guido should pop in with an encouraging word too -- or a > "forget it". Yup, I know this time problem just too good. Well, I think I got something in between. I was warned before, so I didn't try to write final code, but I managed to prove the concept. I *will* continue, regardless what anybody says. > or-were-you-running-ndiff-under-the-stackless-python?-ly y'rs - tim I didn't use ndiff, but regular "diff", and it worked. But since theere is not much change to the code, but some significant change to the control flow, I found the diff output too confusing. Windiff was always open when I wrote that, to be sure that I didn't trample on things which I didn't want to mess up. A good tool! ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Wed Jun 9 16:48:34 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Wed, 9 Jun 1999 10:48:34 -0400 (EDT) Subject: [Python-Dev] Stackless Preview References: <199906082245.AAA48828@pukapuka.inrialpes.fr> <000d01beb21f$a3daac20$2fa22299@tim> Message-ID: <14174.32450.29368.914458@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Vlad, get linefix.py from Python FTP contrib's System area; TP> converts among Unix, Windows and Mac line conventions; to Unix TP> by default. For that matter, do a global replace of ^M in TP> Emacs . I forgot to follow up to Vlad's original message, but in XEmacs (dunno about FSFmacs), you can visit DOS-eol files without seeing the ^M's. You will see a "DOS" in the modeline, and when you go to write the file it'll ask you if you want to write it in "plain text". 
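(For those without linefix.py or XEmacs handy, the same conversion takes only a few lines of Python. This is a stand-in sketch in the spirit of linefix.py, not its actual code, shown in present-day syntax; it rewrites the named files in place with Unix line endings.)

import sys

def to_unix_eol(path):
    # Read raw bytes, turn CRLF and bare CR into LF, write the file back.
    with open(path, 'rb') as f:
        data = f.read()
    data = data.replace(b'\r\n', b'\n').replace(b'\r', b'\n')
    with open(path, 'wb') as f:
        f.write(data)

for name in sys.argv[1:]:
    to_unix_eol(name)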
I use XEmacs all the time to convert between DOS-eol and eol-The-Way-God-Intended :) To enable this, add the following to your .emacs file: (require 'crypt) -Barry From tismer at appliedbiometrics.com Wed Jun 9 19:58:52 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Wed, 09 Jun 1999 19:58:52 +0200 Subject: [Python-Dev] First Draft on Stackless Python References: <199906082245.AAA48828@pukapuka.inrialpes.fr> <000d01beb21f$a3daac20$2fa22299@tim> <14174.32450.29368.914458@anthem.cnri.reston.va.us> Message-ID: <375EAB5C.138D32CF@appliedbiometrics.com> Howdy, I've begun with a first draft on Stackless Python. Didn't have enough time to finish it, but something might already be useful. (Should I better drop the fish idea?) Will write the rest tomorrow. ciao - chris http://www.pns.cc/stackless/stackless.htm -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tim_one at email.msn.com Thu Jun 10 07:25:11 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 01:25:11 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: <375E6500.307EF39E@appliedbiometrics.com> Message-ID: <001401beb301$9cf20b00$af9e2299@tim> [Christian Tismer, replying to Guido's enthusiasm ] > Thanks for looking into it, thanks for saying it's tricky. > Since I failed to supply proper documentation yet, this > impression must come up. > > But it is really not true. The code is not tricky > but just straightforward and consequent, after one has understood > what it means to work without a stack, under the precondition > to avoid too much changes. I didn't want to rewrite > the world, and I just added the tiny missing bits. > > I will write up my documentation now, and you will > understand what the difficulties were. These will not > vanish, "stackless" is a brainteaser. My problem was not how > to change the code, but finally it was how to change > my brain. Now everything is just obvious. FWIW, I believe you! There's something *inherently* tricky about maintaining the effect of a stack without using the stack C supplies implicitly, and from all you've said and what I've learned of your code, it really isn't the code that's tricky here. You're making formerly-hidden connections explicit, which means more stuff is visible, but also means more power and flexibility *because* "more stuff is visible". Agree too that this clearly moves in the direction of making the VM pluggable. > ... > I *will* continue, regardless what anybody says. Ah, if that's how this works, then STOP! Immediately! Don't you dare waste more of our time with this crap . want-some-money?-ly y'rs - tim From tim_one at email.msn.com Thu Jun 10 07:44:50 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 01:44:50 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: <001701beb304$5b8a8b80$af9e2299@tim> [Guido van Rossum] > ... > Here's what I think we should do first (I've mentioned this before but > nobody cheered me on :-). I'd like to see this as the basis for 1.6. > > We should structurally split the Python Virtual Machine and related > code up into different parts -- both at the source code level and at > the runtime level. 
The core PVM becomes a replaceable component, and > so do a few other parts like the parser, the bytecode compiler, the > import code, and the interactive read-eval-print loop. Most object > implementations are shared between all -- or at least the interfaces > are interchangeable. Clearly, a few object types are specific to one > or another PVM (e.g. frames). The collection of builtins is also a > separate component (though some builtins may again be specific to a > PVM -- details, details!). > > The goal of course, is to create a market for 3rd party components > here, e.g. Chris' flat PVM, or Skip's bytecode optimizer, or Greg's > importer, and so on. > > Thoughts? The idea of major subsystems getting reworked to conform to well-defined and well-controlled interfaces is certainly appealing. I'm just more comfortable squeezing another 1.7% out of list.sort() <0.9 wink>. trying-to-reduce-my-ambitions-to-match-my-time-ly y'rs - tim From jack at oratrix.nl Thu Jun 10 10:49:31 1999 From: jack at oratrix.nl (Jack Jansen) Date: Thu, 10 Jun 1999 10:49:31 +0200 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: Message by Guido van Rossum , Tue, 08 Jun 1999 23:31:11 -0400 , <199906090331.XAA23066@eric.cnri.reston.va.us> Message-ID: <19990610084931.55882303120@snelboot.oratrix.nl> > Here's what I think we should do first (I've mentioned this before but > nobody cheered me on :-). Go, Guido, GO!!!! What I'd like in the split you propose is to see which of the items would be implementable in Python, and try to do the split in such a way that such a Python implementation isn't ruled out. Am I correct in guessing that after factoring out the components you mention the only things that aren't in a "replaceable component" are the builtin objects, and a little runtime glue (malloc and such)? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From tismer at appliedbiometrics.com Thu Jun 10 14:16:20 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 10 Jun 1999 14:16:20 +0200 Subject: [Python-Dev] Splitting up the PVM References: <001401beb301$9cf20b00$af9e2299@tim> Message-ID: <375FAC94.D17D43A7@appliedbiometrics.com> Tim Peters wrote: > > [Christian Tismer, replying to Guido's enthusiasm ] ... > > I will write up my documentation now, and you will still under some work :) > > understand what the difficulties were. These will not > > vanish, "stackless" is a brainteaser. My problem was not how > > to change the code, but finally it was how to change > > my brain. Now everything is just obvious. > > FWIW, I believe you! There's something *inherently* tricky about > maintaining the effect of a stack without using the stack C supplies > implicitly, and from all you've said and what I've learned of your code, it > really isn't the code that's tricky here. You're making formerly-hidden > connections explicit, which means more stuff is visible, but also means more > power and flexibility *because* "more stuff is visible". I knew you would understand me. Feeling much, much better now :-)) After this is finalized, restartable exceptions might be interesting to explore. No, Chris, do the doco... > > I *will* continue, regardless what anybody says. > > Ah, if that's how this works, then STOP! Immediately! Don't you dare waste > more of our time with this crap . Thanks, you fired me a continuation. 
Here the way to get me into an endless loop: Give me an unsolvable problem and claim I can't do that. :) (just realized that I'm just another pluggable interpreter) > want-some-money?-ly y'rs - tim No, but meet you at least once in my life. -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From arw at ifu.net Thu Jun 10 15:40:51 1999 From: arw at ifu.net (Aaron Watters) Date: Thu, 10 Jun 1999 09:40:51 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] Message-ID: <375FC062.62850DE5@ifu.net> While we're talking about stacks... I've always considered it a major shame that Python ints and floats and chars and stuff have anything to do with dynamic allocation, and I always suspected it might be a major speed performance boost if there was some way they could be manipulated without the need for dynamic memory management. One conceivable alternative approach would change the basic manipulation of objects so that instead of representing objects via pyobject pointers everywhere represent them using two "slots" in a structure for each object, one of which is a type descriptor pointer and the other being a (void *) which could contain the data directly for small objects such as ints, floats, chars. In this case, for example, integer addition would never require any memory management, as it shouldn't, I think, in a perfect world. IE instead of C-stack or static: Heap: (pyobject *) ------------> (refcount, typedescr, data ...) in general you get (typedescr repr* ----------------------> (refcount, data, ...) ) or for small objects like ints and floats and chars simply (typedescr, value) with no dereferencing or memory management required. My feeling is that common things like arithmetic and indexing lists of integers and stuff could be much faster under this approach since it reduces memory management overhead and fragmentation, dereferencing, etc... One bad thing, of course, is that this might be a drastic assault on the way existing code works... Unless I'm just not being creative enough with my thinking. Is this a good idea? If so, is there any way to add it to the interpreter without breaking extension modules and everything else? If Python 2.0 will break stuff anyway, would this be an good change to the internals? Curious... -- Aaron Watters ps: I suppose another gotcha is "when do you do increfs/decrefs?" because they no longer make sense for ints in this case... maybe add a flag to the type descriptor "increfable" and assume that the typedescriptors are always in the CPU cache (?). This would slow down increfs by a couple cycles... Would it be worth it? Only the benchmark knows... Another fix would be to put the refcount in the static side with no speed penalty (typedescr repr* ----------------------> data refcount ) but would that be wasteful of space? From guido at CNRI.Reston.VA.US Thu Jun 10 15:45:51 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 10 Jun 1999 09:45:51 -0400 Subject: [Python-Dev] Splitting up the PVM In-Reply-To: Your message of "Thu, 10 Jun 1999 10:49:31 +0200." 
<19990610084931.55882303120@snelboot.oratrix.nl> References: <19990610084931.55882303120@snelboot.oratrix.nl> Message-ID: <199906101345.JAA29917@eric.cnri.reston.va.us> [me] > > Here's what I think we should do first (I've mentioned this before but > > nobody cheered me on :-). [Jack] > Go, Guido, GO!!!! > > What I'd like in the split you propose is to see which of the items would be > implementable in Python, and try to do the split in such a way that such a > Python implementation isn't ruled out. Indeed. The importing code and the read-eval-print loop are obvious candidates (in fact IDLE shows how the latter can be done today). I'm not sure if it makes sense to have a parser/compiler or the VM written in Python, because of the expected slowdown (plus, the VM would present a chicken-egg problem :-) although for certain purposes one might want to do this. An optimizing pass would certainly be a good candidate. > Am I correct in guessing that after factoring out the components you mention > the only things that aren't in a "replaceable component" are the builtin > objects, and a little runtime glue (malloc and such)? I guess (although how much exactly will only become clear when it's done). I guess that things like thread-safety and GC policy are also pervasive. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Thu Jun 10 16:11:23 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 10 Jun 1999 10:11:23 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: Your message of "Thu, 10 Jun 1999 09:40:51 EDT." <375FC062.62850DE5@ifu.net> References: <375FC062.62850DE5@ifu.net> Message-ID: <199906101411.KAA29962@eric.cnri.reston.va.us> [Aaron] > I've always considered it a major shame that Python ints and floats > and chars and stuff have anything to do with dynamic allocation, and > I always suspected it might be a major speed performance boost if > there was some way they could be manipulated without the need for > dynamic memory management. What you're describing is very close to what I recall I once read about the runtime organization of Icon. Perl may also use a variant on this (it has fixed-length object headers). On the other hand, I believe Smalltalks typically uses something like the following ABC trick: In ABC, we used a variation: objects were represented by pointers as in Python, except when the low bit was 1, in which case the remaining 31 bits were a "small int". My experience with this approach was that it probably saved some memory, but perhaps not time (since almost all operations on objects were slowed down by the check "is it an int?" before the pointer could be accessed); and that because of this it was a major hassle in keeping the implementation code correct. There was always the temptation to make a check early in a piece of code and then skip the check later on, which sometimes didn't work when objects switched places. Plus in general the checks made the code less readable, and it was just one more thing to remember to do. The Icon approach (i.e. yours) seems to require a complete rethinking of all object implementations and all APIs at the C level -- perhaps we could think about it for Python 2.0. Some ramifications: - Uses more memory for highly shared objects (there are as many copies of the type pointer as there are references). - Thus, lists take double the memory assuming they reference objects that also exist elsewhere. This affects the performance of slices etc. 
- On the other hand, a list of ints takes half the memory (given that most of those ints are not shared). - *Homogeneous* lists (where all elements have the same type -- i.e. arrays) can be represented more efficiently by having only one copy of the type pointer. This was an idea for ABC (whose type system required all container types to be homogenous) that was never implemented (because in practice the type check wasn't always applied, and the top-level namespace used by the interactive command interpreter violated all the rules). - Reference count manipulations could be done by a macro (or C++ behind-the-scense magic using copy constructors and destructors) that calls a function in the type object -- i.e. each object could decide on its own reference counting implementation :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Thu Jun 10 20:02:30 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 10 Jun 1999 14:02:30 -0400 (EDT) Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: <375FC062.62850DE5@ifu.net> <199906101411.KAA29962@eric.cnri.reston.va.us> Message-ID: <14175.64950.720465.456133@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> In ABC, we used a variation: objects were represented by Guido> pointers as in Python, except when the low bit was 1, in Guido> which case the remaining 31 bits were a "small int". Very similar to how Emacs Lisp manages its type system, to which XEmacs extended. The following is from the XEmacs Internals documentation[1]. XEmacs' object representation (on a 32 bit machine) uses the top bit as a GC mark bit, followed by three type tag bits, followed by a pointer or an integer: [ 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 ] [ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 ] ^ <---> <------------------------------------------------------> | tag a pointer to a structure, or an integer | `---> mark bit One of the 8 possible types representable by the tag bits, one is a "record" type, which essentially allows an unlimited (well, 2^32) number of data types. As you might guess there are lots of interesting details and limitations to this scheme, with lots of interesting macros in the C code :). Reading and debugging the C implementation gets fun too (we'll ignore for the moment all the GCPRO'ing going on -- if you think INCREF/DECREF is trouble prone, hah!). Whether or not this is at all relevent for Python 2.0, it all seems to work pretty well in (X)Emacs. >>>>> "AW" == Aaron Watters writes: AW> ps: I suppose another gotcha is "when do you do AW> increfs/decrefs?" because they no longer make sense for ints AW> in this case... maybe add a flag to the type descriptor AW> "increfable" and assume that the typedescriptors are always in AW> the CPU cache (?). This would slow down increfs by a couple AW> cycles... Would it be worth it? Only the benchmark knows... AW> Another fix would be to put the refcount in the static side AW> with no speed penalty | (typedescr | repr* ----------------------> data | refcount | ) AW> but would that be wasteful of space? Once again, you can move the refcount out of the objects, a la NextStep. Could save space and improve LOC for read-only objects. -Barry [1] The Internals documentation comes with XEmacs's Info documetation. 
Hit: C-h i m Internals RET m How RET From tismer at appliedbiometrics.com Thu Jun 10 21:53:10 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Thu, 10 Jun 1999 21:53:10 +0200 Subject: [Python-Dev] Stackless Preview References: <000d01beb21f$a3daac20$2fa22299@tim> Message-ID: <376017A6.DC619723@appliedbiometrics.com> Howdy, I worked a little more on the docs and figured out that I could use a hint. http://www.pns.cc/stackless/stackless.htm Trying to give an example how coroutines could work, some weaknesses showed up. I wanted to write some function coroutine_transfer which swaps two frame chains. This function should return my unwind token, but unfortunately in that case a real result would be needed as well. Well, I know of several ways out, but it's a matter of design, and I'd like to find the most elegant solution for this. Could perhaps someone of those who encouraged me have a look into the problem? Do I have to add yet another field for return values and handle that in the dispatcher? thanks - chris (tired of thinking) -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Fri Jun 11 01:32:26 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 10 Jun 1999 19:32:26 -0400 (EDT) Subject: [Python-Dev] String methods... finally Message-ID: <14176.19210.146525.172100@anthem.cnri.reston.va.us> I've finally checked my string methods changes into the source tree, albeit on a CVS branch (see below). These changes are outgrowths of discussions we've had on the string-sig, with I think Greg Stein giving lots of very useful early feedback. I'll call these changes controversial (hence the branch) because Guido hasn't had much opportunity to play with them. Now that he -- and you -- can check them out, I'm sure I'll get lots more feedback! First, to check them out you need to switch to the string_methods CVS branch. On Un*x: cvs update -r string_methods You might want to do this in a separate tree because this will sticky tag your tree to this branch. If so, try cvs checkout -r string_methods python Here's a brief summary of the changes (as best I can restore the state -- its been a while since I actually made all these changes ;) Strings now have as methods most of the functions that were previously only in the string module. If you've played with JPython, you've already had this feature for a while. So you can do: Python 1.5.2+ (#1, Jun 10 1999, 18:22:14) [GCC 2.8.1] on sunos5 Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam >>> s = 'Hello There Devheads' >>> s.lower() 'hello there devheads' >>> s.upper() 'HELLO THERE DEVHEADS' >>> s.split() ['Hello', 'There', 'Devheads'] >>> 'hello'.upper() 'HELLO' that sort of thing. Some of the string module functions don't make sense as string methods, like join, and others never had a C implementation so weren't added, like center. Two new methods startswith and endswith act like their Java cousins. The string module has been rewritten to be completely (I hope) backwards compatible. No code should break, though they could be slower. Guido and I decided that was acceptable. What else? Some cleaning up of the internals based on Greg's suggestions. A couple of new C API additions. 
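(Illustrative aside, not taken from the checkin: assuming the Java-like semantics described above, the two new methods would read roughly like this -- 1/0 results, since there's no boolean type:)

>>> s = 'Hello There Devheads'
>>> s.startswith('Hello')
1
>>> s.endswith('Devheads')
1
>>> s.endswith('devheads')
0
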
Builtin int(), long(), and float() have grown a few new features. I believe they are essentially interchangable with string.atoi(), string.atol(), and string.float() now. After you guys get to toast me (in either sense of the word) for a while and these changes settle down, I'll make a wider announcement. Enjoy, -Barry From da at ski.org Fri Jun 11 01:37:54 1999 From: da at ski.org (David Ascher) Date: Thu, 10 Jun 1999 16:37:54 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: On Thu, 10 Jun 1999, Barry A. Warsaw wrote: > I've finally checked my string methods changes into the source tree, Great! > ... others never had a C implementation so weren't added, like center. I assume that's not a design decision but a "haven't gotten around to it yet" statement, right? > Two new methods startswith and endswith act like their Java cousins. aaaah... . --david From MHammond at skippinet.com.au Fri Jun 11 01:59:17 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Fri, 11 Jun 1999 09:59:17 +1000 Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <003101beb39d$41b1c7c0$0801a8c0@bobcat> > I've finally checked my string methods changes into the source tree, > albeit on a CVS branch (see below). These changes are outgrowths of Yay! Would this also be a good opportunity to dust-off the Unicode implementation the string-sig recently came up with (as implemented by Fredrik) and get this in as a type? Although we still have the unresolved issue of how to use PyArg_ParseTuple etc to convert to/from Unicode and 8bit, it would still be nice to have Unicode and String objects capable of being used interchangably at the Python level. Of course, the big problem with attempting to test out these sorts of changes is that you must do so in code that will never see the public for a good 12 months. I suppose a 1.5.25 is out of the question ;-) Mark. From guido at CNRI.Reston.VA.US Fri Jun 11 03:40:07 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 10 Jun 1999 21:40:07 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Fri, 11 Jun 1999 09:59:17 +1000." <003101beb39d$41b1c7c0$0801a8c0@bobcat> References: <003101beb39d$41b1c7c0$0801a8c0@bobcat> Message-ID: <199906110140.VAA02180@eric.cnri.reston.va.us> > Would this also be a good opportunity to dust-off the Unicode > implementation the string-sig recently came up with (as implemented by > Fredrik) and get this in as a type? > > Although we still have the unresolved issue of how to use PyArg_ParseTuple > etc to convert to/from Unicode and 8bit, it would still be nice to have > Unicode and String objects capable of being used interchangably at the > Python level. Yes, yes, yes! Even if it's not supported everywhere, at least having the Unicode type in the source tree would definitely help! > Of course, the big problem with attempting to test out these sorts of > changes is that you must do so in code that will never see the public for a > good 12 months. I suppose a 1.5.25 is out of the question ;-) We'll see about that... (I sometimes wished I wasn't in the business of making releases. I've asked for help with making essential patches to 1.5.2 available but nobody volunteered... 
:-( ) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Fri Jun 11 05:08:28 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 23:08:28 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <14175.64950.720465.456133@anthem.cnri.reston.va.us> Message-ID: <000a01beb3b7$adda3b20$329e2299@tim> Jumping in to opine that mixing tag/type bits with native pointers is a Really Bad Idea. Put the bits on the low end and word-addressed machines are screwed. Put the bits on the high end and you've made severe assumptions about how the platform parcels out address space. In any case you're stuck with ugly macros everywhere. This technique was pioneered by Lisps, and was beautifully exploited by the Symbolics Lisp Machine and TI Lisp Explorer hardware. Lisp people don't want to admit those failed, so continue simulating the HW design by hand at comparatively sluggish C speed <0.6 wink>. BTW, I've never heard this approach argued as a speed optimization (except in the HW implementations): software mask-test-branch around every inc/dec-ref to exempt ints is a nasty new repeated expense. The original motivation was to save space, and that back in the days when a 128Mb RAM chip wasn't even conceivable, let alone under $100 . once-wrote-a-functional-language-interpreter-in-8085-assembler-that-ran- in-24Kb-cuz-that's-all-there-was-but-don't-feel-i-need-to-repeat-the- experience-today-wink>-ly y'rs - tim From bwarsaw at cnri.reston.va.us Fri Jun 11 05:13:29 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Thu, 10 Jun 1999 23:13:29 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <14176.32473.408675.992145@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: >> ... others never had a C implementation so weren't added, like >> center. DA> I assume that's not a design decision but a "haven't gotten DA> around to it yet" statement, right? I think we decided that they weren't used enough to implement in C. >> Two new methods startswith and endswith act like their Java >> cousins. DA> aaaah... . Tell me about it! -Barry From tim_one at email.msn.com Fri Jun 11 05:33:25 1999 From: tim_one at email.msn.com (Tim Peters) Date: Thu, 10 Jun 1999 23:33:25 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <000b01beb3bb$29ccdaa0$329e2299@tim> > Two new methods startswith and endswith act like their Java cousins. Barry, suggest that both of these grow optional start and end slice indices. Why? It's Pythonic . Really, I'm forever marching over huge strings a slice-pair at a time, and it's important that searches and matches never give me false hits due to slobbering over the current slice bounds. regexp objects in general, and string.find/.rfind in particular, support this beautifully. Java feels less need since sub-stringing is via cheap descriptor there. The optional indices wouldn't hurt Java, but would help Python. then-again-if-strings-were-so-great-i'd-switch-to-tcl-ly y'rs - tim From bwarsaw at cnri.reston.va.us Fri Jun 11 05:41:55 1999 From: bwarsaw at cnri.reston.va.us (Barry A. 
Warsaw) Date: Thu, 10 Jun 1999 23:41:55 -0400 (EDT) Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: <14175.64950.720465.456133@anthem.cnri.reston.va.us> <000a01beb3b7$adda3b20$329e2299@tim> Message-ID: <14176.34179.125397.282079@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Jumping in to opine that mixing tag/type bits with native TP> pointers is a Really Bad Idea. Put the bits on the low end TP> and word-addressed machines are screwed. Put the bits on the TP> high end and you've made severe assumptions about how the TP> platform parcels out address space. In any case you're stuck TP> with ugly macros everywhere. Ah, so you /have/ read the Emacs source code! I'll agree that it's just an RBI for Emacs, but for Python, it'd be a RFSI. TP> This technique was pioneered by Lisps, and was beautifully TP> exploited by the Symbolics Lisp Machine and TI Lisp Explorer TP> hardware. Lisp people don't want to admit those failed, so TP> continue simulating the HW design by hand at comparatively TP> sluggish C speed <0.6 wink>. But of course, the ghosts live on at the FSF and xemacs.org (couldn't tell ya much about how modren Lisps do it). -Barry From skip at mojam.com Fri Jun 11 06:26:49 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 11 Jun 1999 00:26:49 -0400 (EDT) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.19210.146525.172100@anthem.cnri.reston.va.us> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> Barry> Some of the string module functions don't make sense as string Barry> methods, like join, and others never had a C implementation so Barry> weren't added, like center. I take it string.capwords falls into that category. It's one of those things that's so easy to write in Python and there's no real speed gain in going to C, that it didn't make much sense to add it to the strop module, right? I see the following functions in string.py that could reasonably be methodized: ljust, rjust, center, expandtabs, capwords That's not very many, and it would appear that this stuff won't see widespread use for quite some time. I think for completeness sake we should bite the bullet on them. BTW, I built it and think it is very cool. Tipping my virtual hat to Barry, I am... Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From skip at mojam.com Fri Jun 11 06:57:15 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 11 Jun 1999 00:57:15 -0400 (EDT) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> Message-ID: <14176.38521.124491.987817@cm-24-29-94-19.nycap.rr.com> Skip> I see the following functions in string.py that could reasonably be Skip> methodized: Skip> ljust, rjust, center, expandtabs, capwords It occurred to me just a few minutes after sending my previous message that it might make sense to make string.join a method for lists and tuples. They'd obviously have to make the same type checks that string.join does. That would leave the string/strip modules implementing just a couple functions. 
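(Purely illustrative, nothing like this is in the patch -- the spelling being proposed would look something like:

    ['Hello', 'There', 'Devheads'].join(' ')     # hypothetical list/tuple method
    string.join(['Hello', 'There', 'Devheads'])  # today's spelling

both giving 'Hello There Devheads'.)
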
Skip From da at ski.org Fri Jun 11 07:09:46 1999 From: da at ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:09:46 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14176.38521.124491.987817@cm-24-29-94-19.nycap.rr.com> Message-ID: On Fri, 11 Jun 1999, Skip Montanaro wrote: > It occurred to me just a few minutes after sending my previous message that > it might make sense to make string.join a method for lists and tuples. > They'd obviously have to make the same type checks that string.join does. as in: >>> ['spam!', 'eggs!'].join() 'spam! eggs!' ? I like the notion, but I think it would naturally migrate towards genericity, at which point it might be called "reduce", so that: >>> ['spam!', 'eggs!'].reduce() 'spam!eggs!' >>> ['spam!', 'eggs!'].reduce(' ') 'spam! eggs!' >>> [1,2,3].reduce() 6 # 1 + 2 + 3 >>> [1,2,3].reduce(10) 26 # 1 + 10 + 2 + 10 + 3 note that string.join(foo) == foo.reduce(' ') and string.join(foo, '') == foo.reduce() --david From guido at CNRI.Reston.VA.US Fri Jun 11 07:16:29 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 01:16:29 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Thu, 10 Jun 1999 22:09:46 PDT." References: Message-ID: <199906110516.BAA02520@eric.cnri.reston.va.us> > On Fri, 11 Jun 1999, Skip Montanaro wrote: > > > It occurred to me just a few minutes after sending my previous message that > > it might make sense to make string.join a method for lists and tuples. > > They'd obviously have to make the same type checks that string.join does. > > as in: > > >>> ['spam!', 'eggs!'].join() > 'spam! eggs!' Note that this is not as powerful as string.join(); the latter works on any sequence, not just on lists and tuples. (Though that may not be a big deal.) I also find it slightly objectionable that this is a general list method but only works if the list contains only strings; Dave Ascher's generalization to reduce() is cute but strikes me are more general than useful, and the name will forever present a mystery to most newcomers. Perhaps join() ought to be a built-in function? --Guido van Rossum (home page: http://www.python.org/~guido/) From da at ski.org Fri Jun 11 07:23:06 1999 From: da at ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:23:06 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Guido van Rossum wrote: > Perhaps join() ought to be a built-in function? Would it do the moral equivalent of a reduce(operator.add, ...) or of a string.join? I think it should do the former (otherwise something about 'string' should be in the name), and as a consequence I think it shouldn't have the default whitespace spacer. cute-but-general'ly y'rs, david From da at ski.org Fri Jun 11 07:35:42 1999 From: da at ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:35:42 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Aside: apply syntax Message-ID: I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core to allow * and ** in function calls, so that: class SubFoo(Foo): def __init__(self, *args, **kw): apply(Foo, (self, ) + args, kw) ... could be written class SubFoo(Foo): def __init__(self, *args, **kw): Foo(self, *args, **kw) ... I really like this notion, but before I poke around trying to see if it's doable, I'd like to get feedback on whether y'all think it's a good idea or not. 
And if someone else wants to do it, feel free -- I am of course swamped, and I won't get to it until after rich comparisons. FWIW, apply() is one of my least favorite builtins, aesthetically speaking. --david From da at ski.org Fri Jun 11 07:36:30 1999 From: da at ski.org (David Ascher) Date: Thu, 10 Jun 1999 22:36:30 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Re: Aside: apply syntax In-Reply-To: Message-ID: On Thu, 10 Jun 1999, David Ascher wrote: > class SubFoo(Foo): > def __init__(self, *args, **kw): > apply(Foo, (self, ) + args, kw) > ... > > could be written > > class SubFoo(Foo): > def __init__(self, *args, **kw): > Foo(self, *args, **kw) Of course I meant Foo.__init__ in both of the above! --david From skip at mojam.com Fri Jun 11 09:07:09 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 11 Jun 1999 03:07:09 -0400 (EDT) Subject: [Python-Dev] String methods... finally In-Reply-To: References: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: <14176.45761.801671.880774@cm-24-29-94-19.nycap.rr.com> David> I think it should do the former (otherwise something about David> 'string' should be in the name), and as a consequence I think it David> shouldn't have the default whitespace spacer. Perhaps "joinstrings" would be an appropriate name (though it seems gratuitously long) or join should call str() on non-string elements. My thought here is that we have left in the string module a couple functions that ought to be string object methods but aren't yet mostly for convenience or time constraints, and one (join) that is 99.9% of the time used on lists or tuples of strings. That leaves a very small handful of methods that don't naturally fit somewhere else. You can, of course, complete the picture and add a join method to string objects, which would be useful to explode them into individual characters. That would complete the join-as-a-sequence-method picture I think. If you don't somebody else (and not me, cuz I'll know why already!) is bound to ask why capwords, join, ljust, etc got left behind in the string module while all the other functions got promotions to object methods. Oh, one other thing I forgot. Split (join) and splitfields (joinfields) used to be different. They've been the same for a long time now, long enough that I no longer recall how they used to differ. In making the leap from string module to string methods, I suggest dropping the long names altogether. There's no particular compatibility reason to keep them and they're not really any more descriptive than their shorter siblings. It's not like you'll be preserving backward compatibility for anyone's code by having them. However, if you release this code to the larger public, then you'll be stuck with both in perpetuity. Skip From fredrik at pythonware.com Fri Jun 11 09:06:58 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 11 Jun 1999 09:06:58 +0200 Subject: [Python-Dev] String methods... finally References: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: <008701beb3da$5e2db9d0$f29b12c2@pythonware.com> Guido wrote: > Note that this is not as powerful as string.join(); the latter works > on any sequence, not just on lists and tuples. (Though that may not > be a big deal.) > > I also find it slightly objectionable that this is a general list > method but only works if the list contains only strings; Dave Ascher's > generalization to reduce() is cute but strikes me are more general > than useful, and the name will forever present a mystery to most > newcomers. 
> > Perhaps join() ought to be a built-in function? come to think of it, the last design I came up with (inspired by a mail from you which I cannot find right now), was this: def join(sequence, sep=None): # built-in if not sequence: return "" sequence[0].__join__(sequence, sep) string.join => join and __join__ methods in the unicode and string classes. Guido? From fredrik at pythonware.com Fri Jun 11 09:03:19 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 11 Jun 1999 09:03:19 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <008601beb3da$5e0a7a60$f29b12c2@pythonware.com> Barry wrote: > Some of the string module functions don't make sense as > string methods, like join, and others never had a C > implementation so weren't added, like center. fwiw, the Unicode module available from pythonware.com implements them all, and more importantly, it can be com- piled for either 8-bit or 16-bit characters... join is a special problem; IIRC, Guido came up with what I at that time thought was an excellent solution, but I don't recall what it was right now ;-) anyway, maybe we should start by figuring out what methods we really want in there, and then figure out whether we should have one or two independent string implementations in the core... From mal at lemburg.com Fri Jun 11 10:15:33 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jun 1999 10:15:33 +0200 Subject: [Python-Dev] String methods... finally References: Message-ID: <3760C5A5.43FB1658@lemburg.com> David Ascher wrote: > > On Fri, 11 Jun 1999, Guido van Rossum wrote: > > > Perhaps join() ought to be a built-in function? > > Would it do the moral equivalent of a reduce(operator.add, ...) or of a > string.join? > > I think it should do the former (otherwise something about 'string' should > be in the name), and as a consequence I think it shouldn't have the > default whitespace spacer. AFAIK, Guido himself proposed something like this on c.l.p a few months ago. I think something like the following written in C and optimized for lists of strings might be useful: def join(sequence,sep=None): x = sequence[0] if sep: for y in sequence[1:]: x = x + sep + y else: for y in sequence[1:]: x = x + y return x >>> join(('a','b')) 'ab' >>> join(('a','b'),' ') 'a b' >>> join((1,2,3),3) 12 >>> join(((1,2),(3,))) (1, 2, 3) Also, while we're at string functions/methods. Some of the stuff in mxTextTools (see Python Pages link below) might be of general use as well, e.g. splitat(), splitlines() and charsplit(). -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 203 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Jun 11 14:31:51 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 08:31:51 -0400 Subject: [Python-Dev] Aside: apply syntax In-Reply-To: Your message of "Thu, 10 Jun 1999 22:35:42 PDT." References: Message-ID: <199906111231.IAA02774@eric.cnri.reston.va.us> > I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core > to allow * and ** in function calls, so that: > > class SubFoo(Foo): > def __init__(self, *args, **kw): > apply(Foo, (self, ) + args, kw) > ... > > could be written > > class SubFoo(Foo): > def __init__(self, *args, **kw): > Foo(self, *args, **kw) > ... 
> > I really like this notion, but before I poke around trying to see if it's > doable, I'd like to get feedback on whether y'all think it's a good idea > or not. And if someone else wants to do it, feel free -- I am of course > swamped, and I won't get to it until after rich comparisons. > > FWIW, apply() is one of my least favorite builtins, aesthetically > speaking. I like the idea, but it would mean a major reworking of the grammar and the parser. Can I persuade you to keep this on ice until 2.0? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Fri Jun 11 14:54:30 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 11 Jun 1999 14:54:30 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> Message-ID: <004601beb409$8c535750$f29b12c2@pythonware.com> > Two new methods startswith and endswith act like their Java cousins. is it just me, or do those method names suck? begin? starts_with? startsWith? (ouch) has_prefix? From arw at ifu.net Fri Jun 11 15:05:17 1999 From: arw at ifu.net (Aaron Watters) Date: Fri, 11 Jun 1999 09:05:17 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: <199906110342.XAA07977@python.org> Message-ID: <3761098D.A56F58A8@ifu.net> From: "Tim Peters" >Jumping in to opine that mixing tag/type bits with native pointers is a >Really Bad Idea. Put the bits on the low end and word-addressed machines >are screwed. Put the bits on the high end and you've made severe >assumptions about how the platform parcels out address space. In any case >you're stuck with ugly macros everywhere. Agreed. Never ever mess with pointers. This mistake has been made over and over again by each new generation of computer hardware and software and it's still a mistake. I thought it would be good to be able to do the following loop with Numeric arrays for x in array1: array2[x] = array3[x] + array4[x] without any memory management being involved. Right now, I think the for loop has to continually dynamically allocate each new x and intermediate sum (and immediate deallocate them) and that makes the loop piteously slow. The idea replacing pyobject *'s with a struct [typedescr *, data *] was a space/time tradeoff to speed up operations like the above by eliminating any need for mallocs or other memory management.. I really can't say whether it'd be worth it or not without some sort of real testing. Just a thought. -- Aaron Watters From mal at lemburg.com Fri Jun 11 15:11:20 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jun 1999 15:11:20 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> Message-ID: <37610AF8.3EC610FD@lemburg.com> Fredrik Lundh wrote: > > > Two new methods startswith and endswith act like their Java cousins. > > is it just me, or do those method names suck? > > begin? starts_with? startsWith? (ouch) > has_prefix? In mxTextTools I used the names prefix() and suffix() for much the same thing except that those functions accept a list of strings and return the (first) matching string instead of just 1 or 0. 
Details are available at: http://starship.skyport.net/~lemburg/mxTextTools.html -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 203 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Fri Jun 11 15:58:10 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 09:58:10 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Fri, 11 Jun 1999 15:11:20 +0200." <37610AF8.3EC610FD@lemburg.com> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> <37610AF8.3EC610FD@lemburg.com> Message-ID: <199906111358.JAA02836@eric.cnri.reston.va.us> > > > Two new methods startswith and endswith act like their Java cousins. > > > > is it just me, or do those method names suck? It's just you. > > begin? starts_with? startsWith? (ouch) > > has_prefix? Those are all painful to type, except "begin", which isn't expressive. > In mxTextTools I used the names prefix() and suffix() for much The problem with those is that it's arbitrary (==> harder to remember) whether A.prefix(B) means that A is a prefix of B or that A has B for a prefix. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Fri Jun 11 16:55:14 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 11 Jun 1999 16:55:14 +0200 Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> <37610AF8.3EC610FD@lemburg.com> <199906111358.JAA02836@eric.cnri.reston.va.us> Message-ID: <37612352.227FCA4B@lemburg.com> Guido van Rossum wrote: > > > > > Two new methods startswith and endswith act like their Java cousins. > > > > > > is it just me, or do those method names suck? > > It's just you. > > > > begin? starts_with? startsWith? (ouch) > > > has_prefix? > > Those are all painful to type, except "begin", which isn't expressive. > > > In mxTextTools I used the names prefix() and suffix() for much > > The problem with those is that it's arbitrary (==> harder to remember) > whether A.prefix(B) means that A is a prefix of B or that A has B for > a prefix. True. These are functions in mxTextTools and take a sequence as second argument, so the order is clear there... has_prefix() has_suffix() would probably be appropriate as methods (you don't type them that often ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 203 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From jack at oratrix.nl Fri Jun 11 17:55:36 1999 From: jack at oratrix.nl (Jack Jansen) Date: Fri, 11 Jun 1999 17:55:36 +0200 Subject: [Python-Dev] Aside: apply syntax In-Reply-To: Message by Guido van Rossum , Fri, 11 Jun 1999 08:31:51 -0400 , <199906111231.IAA02774@eric.cnri.reston.va.us> Message-ID: <19990611155536.944FA303120@snelboot.oratrix.nl> > > > > class SubFoo(Foo): > > def __init__(self, *args, **kw): > > Foo(self, *args, **kw) > > ... Guido: > I like the idea, but it would mean a major reworking of the grammar > and the parser. Can I persuade you to keep this on ice until 2.0? What exactly would the semantics be? While I hate the apply() loops you have to jump through nowadays to get this behaviour I don't funny understand how this would work in general (as opposed to in this case). For instance, would Foo(self, 12, *args, **kw) be allowed? 
And Foo(self, *args, x=12, **kw) ? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From da at ski.org Fri Jun 11 18:57:37 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 09:57:37 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Aside: apply syntax In-Reply-To: <199906111231.IAA02774@eric.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Guido van Rossum wrote: > > I've seen repeatedly on c.l.p a suggestion to modify the syntax & the core > > to allow * and ** in function calls, so that: > > > I like the idea, but it would mean a major reworking of the grammar > and the parser. Can I persuade you to keep this on ice until 2.0? Sure. That was hard. =) From da at ski.org Fri Jun 11 19:02:49 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 10:02:49 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Aside: apply syntax In-Reply-To: <19990611155536.944FA303120@snelboot.oratrix.nl> Message-ID: On Fri, 11 Jun 1999, Jack Jansen wrote: > What exactly would the semantics be? While I hate the apply() loops you have > to jump through nowadays to get this behaviour I don't funny understand how > this would work in general (as opposed to in this case). For instance, would > Foo(self, 12, *args, **kw) > be allowed? And > Foo(self, *args, x=12, **kw) Following the rule used for argument processing now, if it's unambiguous, it should be allowed, and not otherwise. So, IMHO, the above two should be allowed, and I suspect Foo.__init__(self, *args, *args2) could be too, but Foo.__init__(self, **kw, **kw2) should not, as dictionary addition is not allowed. However, I could live with the more restricted version as well. --david From bwarsaw at cnri.reston.va.us Fri Jun 11 19:17:20 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:17:20 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <000b01beb3bb$29ccdaa0$329e2299@tim> Message-ID: <14177.17568.637272.328126@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: >> Two new methods startswith and endswith act like their Java >> cousins. TP> Barry, suggest that both of these grow optional start and end TP> slice indices. 'Course it'll make the Java implementations of these extra args a little more work. Right now they just forward off to the underlying String methods. No biggie though. I've got new implementations to check in -- let me add a few new tests to cover 'em and watch your checkin emails. -Barry From guido at CNRI.Reston.VA.US Fri Jun 11 19:20:57 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 13:20:57 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Fri, 11 Jun 1999 13:17:20 EDT." <14177.17568.637272.328126@anthem.cnri.reston.va.us> References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <000b01beb3bb$29ccdaa0$329e2299@tim> <14177.17568.637272.328126@anthem.cnri.reston.va.us> Message-ID: <199906111720.NAA03746@eric.cnri.reston.va.us> > From: "Barry A. Warsaw" > > 'Course it'll make the Java implementations of these extra args a > little more work. Right now they just forward off to the underlying > String methods. No biggie though. Which reminds me -- are you tracking this in JPython too? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Fri Jun 11 19:39:41 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:39:41 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <000b01beb3bb$29ccdaa0$329e2299@tim> <14177.17568.637272.328126@anthem.cnri.reston.va.us> <199906111720.NAA03746@eric.cnri.reston.va.us> Message-ID: <14177.18909.980174.55751@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Which reminds me -- are you tracking this in JPython too? That's definitely my plan. From bwarsaw at cnri.reston.va.us Fri Jun 11 19:43:35 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:43:35 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <199906110516.BAA02520@eric.cnri.reston.va.us> <14176.45761.801671.880774@cm-24-29-94-19.nycap.rr.com> Message-ID: <14177.19143.463951.778491@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> Oh, one other thing I forgot. Split (join) and splitfields SM> (joinfields) used to be different. They've been the same for SM> a long time now, long enough that I no longer recall how they SM> used to differ. I think it was only in the number of arguments they'd accept (at least that's what's implied by the module docos). SM> In making the leap from string module to SM> string methods, I suggest dropping the long names altogether. I agree. Thinking about it, I'm also inclined to not include startswith and endswith in the string module. -Barry From da at ski.org Fri Jun 11 19:42:59 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 10:42:59 -0700 (Pacific Daylight Time) Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <3761098D.A56F58A8@ifu.net> Message-ID: On Fri, 11 Jun 1999, Aaron Watters wrote: > I thought it would be good to be able to do the following loop with Numeric > arrays > > for x in array1: > array2[x] = array3[x] + array4[x] > > without any memory management being involved. Right now, I think the FYI, I think it should be done by writing: array2[array1] = array3[array1] + array4[array1] and doing "the right thing" in NumPy. In other words, I don't think the core needs to be involved. --david PS: I'm in the process of making the NumPy array objects ExtensionClasses, which will make the above much easier to do. From bwarsaw at cnri.reston.va.us Fri Jun 11 19:58:36 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 13:58:36 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <004601beb409$8c535750$f29b12c2@pythonware.com> Message-ID: <14177.20044.69731.219173@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: >> Two new methods startswith and endswith act like their Java >> cousins. FL> is it just me, or do those method names suck? FL> begin? starts_with? startsWith? (ouch) FL> has_prefix? The inspiration was Java string objects, while trying to remain as Pythonic as possible (no mixed case). startswith and endswith doen't seem as bad as issubclass to me :) -Barry From bwarsaw at cnri.reston.va.us Fri Jun 11 20:06:22 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:06:22 -0400 (EDT) Subject: [Python-Dev] String methods... 
finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <008601beb3da$5e0a7a60$f29b12c2@pythonware.com> Message-ID: <14177.20510.818041.110989@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> fwiw, the Unicode module available from pythonware.com FL> implements them all, and more importantly, it can be com- FL> piled for either 8-bit or 16-bit characters... Are these separately available? I don't see them under downloads. Send me a URL, and if I can figure out how to get CVS to add files to the branch :/, maybe I can check this in so people can play with it. -Barry From tismer at appliedbiometrics.com Fri Jun 11 20:17:46 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 11 Jun 1999 20:17:46 +0200 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] References: Message-ID: <376152CA.B46A691E@appliedbiometrics.com> David Ascher wrote: > > On Fri, 11 Jun 1999, Aaron Watters wrote: > > > I thought it would be good to be able to do the following loop with Numeric > > arrays > > > > for x in array1: > > array2[x] = array3[x] + array4[x] > > > > without any memory management being involved. Right now, I think the > > FYI, I think it should be done by writing: > > array2[array1] = array3[array1] + array4[array1] > > and doing "the right thing" in NumPy. In other words, I don't think the > core needs to be involved. For NumPy, this is very ok, dealing with arrays in an array world. Without trying to repeat myself, I'd like to say that I still consider it an unsolved problem which is worth to be solved or to be proven unsolvable: How to do simple things in an efficient way with many tiny Python objects, without writing an extension, without rethinking a problem into APL like style, and without changing the language. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Fri Jun 11 20:22:36 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:22:36 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <199906110516.BAA02520@eric.cnri.reston.va.us> Message-ID: <14177.21484.126155.939932@anthem.cnri.reston.va.us> >> Perhaps join() ought to be a built-in function? IMO, builtin join ought to str()ify all the elements in the sequence, concatenating the results. That seems an intuitive interpretation of 'join'ing a sequence. Here's my Python prototype: def join(seq, sep=''): if not seq: return '' x = str(seq[0]) for y in seq[1:]: x = x + sep + str(y) return x Guido? -Barry From da at ski.org Fri Jun 11 20:24:34 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 11:24:34 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.21484.126155.939932@anthem.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Barry A. Warsaw wrote: > IMO, builtin join ought to str()ify all the elements in the sequence, > concatenating the results. That seems an intuitive interpretation of > 'join'ing a sequence. Here's my Python prototype: I don't get it -- why? I'd expect join(((1,2,3), (4,5,6))) to yield (1,2,3,4,5,6), not anything involving strings. 
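To make the two readings concrete (hypothetical behaviour -- this builtin doesn't exist yet):

    join([1, 2, 3], '-')            # Barry's reading: str() each element -> '1-2-3'
    join(((1, 2, 3), (4, 5, 6)))    # the sequence reading: splice -> (1, 2, 3, 4, 5, 6)
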
--david From bwarsaw at cnri.reston.va.us Fri Jun 11 20:26:48 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:26:48 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> Message-ID: <14177.21736.100540.221487@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> I see the following functions in string.py that could SM> reasonably be methodized: SM> ljust, rjust, center, expandtabs, capwords Also zfill. What do you think, are these important enough to add? Maybe we can just drop in /F's implementation for these. -Barry From bwarsaw at cnri.reston.va.us Fri Jun 11 20:34:08 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:34:08 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14177.21484.126155.939932@anthem.cnri.reston.va.us> Message-ID: <14177.22176.328185.872134@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: DA> On Fri, 11 Jun 1999, Barry A. Warsaw wrote: >> IMO, builtin join ought to str()ify all the elements in the >> sequence, concatenating the results. That seems an intuitive >> interpretation of 'join'ing a sequence. Here's my Python >> prototype: DA> I don't get it -- why? DA> I'd expect join(((1,2,3), (4,5,6))) to yield (1,2,3,4,5,6), DA> not anything involving strings. Oh, just because I think it might useful, and would provide something that isn't easily provided with other constructs. Without those semantics join(((1,2,3), (4,5,6))) isn't much different than (1,2,3) + (4,5,6), or reduce(operator.add, ((1,2,3), (4,5,6))) as you point out. Since those latter two are easy enough to come up with, but str()ing the elements would require painful lambdas, I figured make the new built in do something new. -Barry From bwarsaw at cnri.reston.va.us Fri Jun 11 20:36:54 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:36:54 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <14176.36294.902853.594510@cm-24-29-94-19.nycap.rr.com> <14177.21736.100540.221487@anthem.cnri.reston.va.us> Message-ID: <14177.22342.320993.969742@anthem.cnri.reston.va.us> One other thing to think about. Where should this new methods be documented? I suppose we should reword the appropriate entries in modules-string and move them to typesseq-strings. What do you think Fred? -Barry From da at ski.org Fri Jun 11 20:36:32 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 11:36:32 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.22176.328185.872134@anthem.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Barry A. Warsaw wrote: Barry: > >> IMO, builtin join ought to str()ify all the elements in the > >> sequence, concatenating the results. Me: > I don't get it -- why? Barry: > Oh, just because I think it might useful, and would provide something > that isn't easily provided with other constructs. I do map(str, ...) all the time. My real concern is that there is nothing about the word 'join' which implies string conversion. Either call it joinstrings or don't do the conversion, I say. --david From bwarsaw at cnri.reston.va.us Fri Jun 11 20:42:27 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 14:42:27 -0400 (EDT) Subject: [Python-Dev] String methods... 
finally References: <14177.22176.328185.872134@anthem.cnri.reston.va.us> Message-ID: <14177.22675.716917.331314@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: DA> My real concern is that there is nothing about the word 'join' DA> which implies string conversion. Either call it joinstrings DA> or don't do the conversion, I say. Can you say mapconcat() ? :) Or instead of join, just call it concat? -Barry From da at ski.org Fri Jun 11 20:46:19 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 11:46:19 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.22675.716917.331314@anthem.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Barry A. Warsaw wrote: > >>>>> "DA" == David Ascher writes: > > DA> My real concern is that there is nothing about the word 'join' > DA> which implies string conversion. Either call it joinstrings > DA> or don't do the conversion, I say. > > Can you say mapconcat() ? :) > > Or instead of join, just call it concat? Again, no. Concatenating sequences is what I think the + operator does. I think you need the letters S, T, and R in there... But I'm still not convinced of its utility. From guido at CNRI.Reston.VA.US Fri Jun 11 20:51:18 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 14:51:18 -0400 Subject: [Python-Dev] join() Message-ID: <199906111851.OAA04105@eric.cnri.reston.va.us> Given the heat in this discussion, I'm not sure if I endorse *any* of the proposals so far any more... How would Java do this? A static function in the String class, probably. The Python equivalent is... A function in the string module. So maybe string.join() it remains. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Fri Jun 11 21:08:11 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Fri, 11 Jun 1999 15:08:11 -0400 (EDT) Subject: [Python-Dev] join() References: <199906111851.OAA04105@eric.cnri.reston.va.us> Message-ID: <14177.24219.94236.485421@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Given the heat in this discussion, I'm not sure if I Guido> endorse *any* of the proposals so far any more... Oh I dunno. David and I aren't throwing rocks at each other yet :) Guido> How would Java do this? A static function in the String Guido> class, probably. The Python equivalent is... A function Guido> in the string module. So maybe string.join() it remains. The only reason for making it a builtin would be to avoid pulling in all of string just to get join. But I guess we need to get some more experience using the methods before we know whether this is a real problem or not. as-good-as-a-from-string-import-join-and-easier-to-implement-ly y'rs, -Barry From skip at mojam.com Fri Jun 11 21:38:33 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 11 Jun 1999 15:38:33 -0400 (EDT) Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.21484.126155.939932@anthem.cnri.reston.va.us> References: <199906110516.BAA02520@eric.cnri.reston.va.us> <14177.21484.126155.939932@anthem.cnri.reston.va.us> Message-ID: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com> Barry> IMO, builtin join ought to str()ify all the elements in the Barry> sequence, concatenating the results. That seems an intuitive Barry> interpretation of 'join'ing a sequence. Any reason why join should be a builtin and not a method available just to sequences? Would there some valid interpretation of join( {'a': 1} ) join( 1 ) ? 
If not, I vote for method-hood, not builtin-hood. Seems like you'd avoid some confusion (and some griping by Graham Matthews about how unpure it is ;-). Skip From skip at mojam.com Fri Jun 11 21:42:11 1999 From: skip at mojam.com (Skip Montanaro) Date: Fri, 11 Jun 1999 15:42:11 -0400 (EDT) Subject: [Python-Dev] join() In-Reply-To: <14177.24219.94236.485421@anthem.cnri.reston.va.us> References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us> Message-ID: <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com> BAW> The only reason for making it a builtin would be to avoid pulling BAW> in all of string just to get join. I still don't understand the motivation for making it a builtin instead of a method of the types it operates on. Making it a builtin seems very un-object-oriented to me. Skip From guido at CNRI.Reston.VA.US Fri Jun 11 21:44:28 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 11 Jun 1999 15:44:28 -0400 Subject: [Python-Dev] join() In-Reply-To: Your message of "Fri, 11 Jun 1999 15:42:11 EDT." <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com> References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us> <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com> Message-ID: <199906111944.PAA04277@eric.cnri.reston.va.us> > I still don't understand the motivation for making it a builtin instead of a > method of the types it operates on. Making it a builtin seems very > un-object-oriented to me. Because if you make it a method, every sequence type needs to know about joining strings. (This wouldn't be a problem in Smalltalk where sequence types inherit this stuff from an abstract sequence class, but in Python unfortunately that doesn't exist.) --Guido van Rossum (home page: http://www.python.org/~guido/) From da at ski.org Fri Jun 11 22:11:11 1999 From: da at ski.org (David Ascher) Date: Fri, 11 Jun 1999 13:11:11 -0700 (Pacific Daylight Time) Subject: [Python-Dev] join() In-Reply-To: <199906111944.PAA04277@eric.cnri.reston.va.us> Message-ID: On Fri, 11 Jun 1999, Guido van Rossum wrote: > > I still don't understand the motivation for making it a builtin instead of a > > method of the types it operates on. Making it a builtin seems very > > un-object-oriented to me. > > Because if you make it a method, every sequence type needs to know > about joining strings. It still seems to me that we could do something like F/'s proposal, where sequences can define a join() method, which could be optimized if the first element is a string to do what string.join, by placing the class method in an instance method of strings, since string joining clearly has to involve at least one string. Pseudocode: class SequenceType: def join(self, separator=None): if hasattr(self[0], '__join__') # covers all types which can be efficiently joined if homogeneous return self[0].__join__(self, separator) # for the rest: if separator is None: return map(operator.add, self) result = self[0] for element in self[1:]: result = result + separator + element return result where the above would have to be done in abstract.c, with error handling, etc. and with strings (regular and unicode) defining efficient __join__'s as in: class StringType: def join(self, separator): raise AttributeError, ... def __join__(self, sequence): return string.join(sequence) # obviously not literally that =) class UnicodeStringType: def __join__(self, sequence): return unicode.join(sequence) (in C, of course). 
Yes, it's strange to fake class methods with instance methods, but it's been done before =). Yes, this means expanding what it means to "be a sequence" -- is that impossible without breaking lots of code? --david From gmcm at hypernet.com Fri Jun 11 23:30:10 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Fri, 11 Jun 1999 16:30:10 -0500 Subject: [Python-Dev] String methods... finally In-Reply-To: References: <14177.22675.716917.331314@anthem.cnri.reston.va.us> Message-ID: <1282985631-84109501@hypernet.com> David Ascher wrote: > Barry Warsaw wrote: > > Or instead of join, just call it concat? > > Again, no. Concatenating sequences is what I think the + operator > does. I think you need the letters S, T, and R in there... But I'm > still not convinced of its utility. But then Q will feel left out, and since Q doesn't go anywhere without U, pretty soon you'll have the whole damn alphabet in there. I-draw-the-line-at-$-well-$-&- at -but-definitely-not-#-ly y'rs - Gordon From MHammond at skippinet.com.au Sat Jun 12 00:49:29 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Sat, 12 Jun 1999 08:49:29 +1000 Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.20510.818041.110989@anthem.cnri.reston.va.us> Message-ID: <006801beb45c$aab5baa0$0801a8c0@bobcat> > Are these separately available? I don't see them under downloads. > Send me a URL, and if I can figure out how to get CVS to add files to > the branch :/, maybe I can check this in so people can play with it. Fredrik and I have spoken about this. He will dust it off and integrate some patches in the next few days. He will then send it to me to make sure the patches I made for Windows CE all made it OK, then one of us will integrate it with the branch and send it on... Mark. From tim_one at email.msn.com Sat Jun 12 02:56:03 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 11 Jun 1999 20:56:03 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.21736.100540.221487@anthem.cnri.reston.va.us> Message-ID: <000401beb46e$58b965a0$5ba22299@tim> [Skip Montanaro] > I see the following functions in string.py that could > reasonably be methodized: > > ljust, rjust, center, expandtabs, capwords > > Also zfill. > [Barry A. Warsaw] > What do you think, are these important enough to add? I think lack-of-surprise (gratuitous orthogonality ) was the motive here. If Guido could drop string functions in 2.0, which would he be happy to forget? Give him a head start. ljust and rjust were used often a long time ago, before the "%" sprintf-like operator was introduced; don't think I've seen new code use them in years. center was a nice convenience in the pre-HTML world, but probably never speed-critical and easy to write yourself. expandtabs is used frequently in IDLE and even pyclbr.py now. Curiously, though, they almost never want the tab-expanded string, but rather its len. capwords could become an absolute nightmare in a Unicode world <0.5 wink>. > Maybe we can just drop in /F's implementation for these. Sounds like A Plan to me. Wouldn't mourn the passing of the first three. and-i-even-cried-at-my-father's-funeral-ly y'rs - tim From tim_one at email.msn.com Sat Jun 12 08:19:33 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 02:19:33 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <199906110140.VAA02180@eric.cnri.reston.va.us> Message-ID: <000001beb49b$8a94f120$b19e2299@tim> [GvR] > (I sometimes wished I wasn't in the business of making releases. 
I've > asked for help with making essential patches to 1.5.2 available but > nobody volunteered... :-( ) It's kinda baffling "out here" -- checkin comments usually say what a patch does, but rarely make a judgment about a patch's importance. Sorting thru hundreds of patches without a clue is a pretty hopeless task. Perhaps future checkins that the checker-inner feels are essential could be commented as such in a machine-findable way? an-ounce-of-foresight-is-worth-a-sheet-of-foreskin-or-something-like-that-ly y'rs - tim From tim_one at email.msn.com Sat Jun 12 08:19:37 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 02:19:37 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <199906101411.KAA29962@eric.cnri.reston.va.us> Message-ID: <000101beb49b$8c27c620$b19e2299@tim> [Aaron, describes a scheme where objects are represented by a fixed-size (typecode, variant) pair, where if the typecode is e.g. INT or FLOAT the variant is the value directly instead of a pointer to the value] [Guido] > What you're describing is very close to what I recall I once read > about the runtime organization of Icon. At the lowest level it's exactly what Icon does. It does *not* exempt ints from Icon's flavor of dynamic memory management, but Icon doesn't use refcounting -- it uses compacting mark-&-sweep across some 5 distinct regions each with their own finer-grained policies (e.g., strings are central to Icon and so it manages the string region a little differently; and Icon coroutines save away pieces of the platform's C stack so need *very* special treatment). So: 1) There are no incref/decref expenses anywhere in Icon. 2) Because of compaction, all allocations cost the same and are dirt cheap: just increment the appropriate region's "avail" pointer by the number of bytes you need. If there aren't enough bytes, run GC and try again. If there still aren't enough bytes, Icon usually shuts down (it's not good at asking the OS for more memory! it carves up its initial memory in pretty rigid ways, and relies on tricks like comparing storage addresses to speed M&S and compaction -- those "regions" are in a fixed order relative to each other, so new memory can't be tacked on to a region except at the low and high ends). 3) All the expense is in finding and compacting live objects, so in an odd literal sense cleaning up trash comes for free. 4) Icon has no finalizers, so it doesn't need to identify or preserve trash -- compaction simply overwrites "the holes" where the trash used to be. Icon is nicely implemented, but it's a "self-contained universe" view of the world and its memory approach makes life hard for the tiny handful of folks who have *tried* to make it extendable via C. Icon is also purely procedural -- no OO, no destructors, no resurrection. Irony: one reason I picked up Python in '91 is that my int-fiddling code was too slow in Icon! Even Python 0.9.0 ran int algorithms significantly faster than the 10-years-refined Icon implementation of that time. Never looked into why, but now that Aaron brought up the issue I find it very surprising! Those algorithms had a huge rate of int trash creation, but very few persistent objects, so Icon's M&S should have run like the wind. And Icon's allocation is dirt-cheap (at least as fast as Python's fastest special-purpose allocators), and didn't have any refcounting expenses either. There's an important lesson *somewhere* in that . 
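Purely as an illustration of the region allocation pattern described a few paragraphs up (bump an "avail" pointer, collect on failure, give up if collection doesn't help), here is a toy sketch in Python; Region and toy_collect are invented names and this is nothing like Icon's actual code:

    class Region:
        """Toy bump-pointer region: allocation is just 'avail += nbytes'."""
        def __init__(self, size):
            self.size = size
            self.avail = 0          # offset of the next free byte

        def allocate(self, nbytes, collect):
            if self.avail + nbytes > self.size:
                collect(self)       # compacting GC: live data slides down, avail drops
                if self.avail + nbytes > self.size:
                    raise MemoryError("region exhausted even after collection")
            offset = self.avail
            self.avail += nbytes
            return offset           # a real allocator would hand back an address

    def toy_collect(region):
        # Pretend half of the region was garbage and compaction squeezed it out.
        region.avail //= 2

    r = Region(100)
    print(r.allocate(60, toy_collect))   # 0
    print(r.allocate(60, toy_collect))   # triggers the toy "GC", then allocates at 30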
Maybe it was the fault of Icon's "goal-directed" expression evaluation, constantly asking "did this int succeed or fail?", "did that add suceed or fail?", etc. > ... > The Icon approach (i.e. yours) seems to require a complete rethinking > of all object implementations and all APIs at the C level -- perhaps > we could think about it for Python 2.0. Some ramifications: > > - Uses more memory for highly shared objects (there are as many copies > of the type pointer as there are references). Actually more than that in Icon: if the "variant" part is a pointer, the first word of the block it points to is also a copy of the typecode (turns out the redundancy speeds the GC). > - Thus, lists take double the memory assuming they reference objects > that also exist elsewhere. This affects the performance of slices > etc. > > - On the other hand, a list of ints takes half the memory (given that > most of those ints are not shared). Isn't this 2/3 rather than 1/2? I'm picturing a list element today as essentially a pointer to a type object pointer + int (3 units in all), and a type object pointer + int (2 units in all) "tomorrow". Throw in refcounts too and the ratio likely gets closer to 1. > - *Homogeneous* lists (where all elements have the same type -- > i.e. arrays) can be represented more efficiently by having only one > copy of the type pointer. This was an idea for ABC (whose type system > required all container types to be homogenous) that was never > implemented (because in practice the type check wasn't always applied, > and the top-level namespace used by the interactive command > interpreter violated all the rules). Well, Python already has homogeneous int lists (array.array), and while they save space they suffer in speed due to needing to wrap raw ints "in an object" upon reference and unwrap them upon storage. > - Reference count manipulations could be done by a macro (or C++ > behind-the-scense magic using copy constructors and destructors) that > calls a function in the type object -- i.e. each object could decide > on its own reference counting implementation :-) You don't need to switch representations to get that, though, right? That is, I don't see anything stopping today's type objects from growing __incref__ and __decref__ slots -- except for common sense . An apparent ramification I don't see above that may actually be worth something : - In "i = j + k", the eval stack could contain the ints directly, instead of pointers to the ints. So fetching the value of i takes two loads (get the type pointer + the variant) from adjacent stack locations, instead of today's load-the-pointer + follow-the-pointer (to some other part of memory); similarly for fetching the value of j. Then the sum can be stored *directly* into the stack too, without today's need for allocating and wrapping it in "an int object" first. Possibly happy variant: on top of the above, *don't* exempt ints from refcounting. Let 'em incref and decref like everything else. Give them an intial refcount of max_count/2, and in the exceedingly unlikely event a decref on an int ever sees zero, the int "destructor" simply resets the refcount to max_count/2 and is otherwise a nop. semi-thinking-semi-aloud-ly y'rs - tim From ping at lfw.org Sat Jun 12 10:05:06 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Sat, 12 Jun 1999 01:05:06 -0700 (PDT) Subject: [Python-Dev] String methods... 
finally In-Reply-To: <004601beb409$8c535750$f29b12c2@pythonware.com> Message-ID:

On Fri, 11 Jun 1999, Fredrik Lundh wrote:
> > Two new methods startswith and endswith act like their Java cousins.
>
> is it just me, or do those method names suck?
>
> begin? starts_with? startsWith? (ouch)
> has_prefix?

I'm quite happy with "startswith" and "endswith". I mean, they're a bit long, i suppose, but i can't think of anything better. You definitely want to avoid has_prefix, as that compounds the has_key vs. hasattr issue.

    x.startswith("foo")     x[:3] == "foo"
    x.startswith(y)         x[:len(y)] == y

Hmm. I guess it doesn't save you much typing until y is an expression. But it's still a lot easier to read.

!ping

From ping at lfw.org Sat Jun 12 10:12:38 1999
From: ping at lfw.org (Ka-Ping Yee)
Date: Sat, 12 Jun 1999 01:12:38 -0700 (PDT)
Subject: [Python-Dev] join()
In-Reply-To: <14177.26195.71364.77248@cm-24-29-94-19.nycap.rr.com>
Message-ID:

On Fri, 11 Jun 1999, Skip Montanaro wrote:
>
> BAW> The only reason for making it a builtin would be to avoid pulling
> BAW> in all of string just to get join.
>
> I still don't understand the motivation for making it a builtin instead of a
> method of the types it operates on. Making it a builtin seems very
> un-object-oriented to me.

Builtin-hood makes it possible for one method to apply to many types (or a heterogeneous list of things). I think i'd support the

    def join(list, sep=None):
        if sep is None:
            result = list[0]
            for item in list[1:]:
                result = result + item
        else:
            result = list[0]
            for item in list[1:]:
                result = result + sep + item
        return result

idea, basically a reduce(operator.add...) with an optional separator -- *except* my main issue would be to make sure that the actual implementation optimizes the case of joining a list of strings. string.join() currently seems like the last refuge for those wanting to avoid O(n^2) time when assembling many small pieces in string buffers, and i don't want to see it go away.

!ping

From fredrik at pythonware.com Sat Jun 12 11:13:59 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 12 Jun 1999 11:13:59 +0200
Subject: [Python-Dev] String methods... finally
References: <14176.19210.146525.172100@anthem.cnri.reston.va.us><008601beb3da$5e0a7a60$f29b12c2@pythonware.com> <14177.20510.818041.110989@anthem.cnri.reston.va.us>
Message-ID: <00c301beb4b3$e84e3de0$f29b12c2@pythonware.com>

> FL> fwiw, the Unicode module available from pythonware.com
> FL> implements them all, and more importantly, it can be com-
> FL> piled for either 8-bit or 16-bit characters...
>
> Are these separately available? I don't see them under downloads.
> Send me a URL, and if I can figure out how to get CVS to add files to
> the branch :/, maybe I can check this in so people can play with it.

it's under:

    http://www.pythonware.com/madscientist/index.htm

but I've teamed up with Mark H. to update the stuff a bit, test it with his CE port, and produce a set of patches. I'm working on this in this very moment.

btw, as for the "missing methods in the string type" issue, my suggestion is to merge the source code into a unified string module, which is compiled twice (or three times, the day we find that we need a 32-bit string type). don't waste any time cutting and pasting until we've sorted that one out...

From fredrik at pythonware.com Sat Jun 12 11:31:08 1999
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 12 Jun 1999 11:31:08 +0200
Subject: [Python-Dev] String methods...
finally References: <000401beb46e$58b965a0$5ba22299@tim> Message-ID: <00fb01beb4b6$4df59420$f29b12c2@pythonware.com> > expandtabs is used frequently in IDLE and even pyclbr.py now. Curiously, > though, they almost never want the tab-expanded string, but rather its len. looked in stropmodule.c lately: static PyObject * strop_expandtabs(self, args) ... /* First pass: determine size of output string */ ... /* Second pass: create output string and fill it */ ... (btw, I originally wrote that code for pythonworks ;-) how about an "expandtabslength" method? or maybe we should add lazy evaluation of strings! From fredrik at pythonware.com Sat Jun 12 11:49:07 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat, 12 Jun 1999 11:49:07 +0200 Subject: [Python-Dev] join() References: <199906111851.OAA04105@eric.cnri.reston.va.us> <14177.24219.94236.485421@anthem.cnri.reston.va.us> Message-ID: <014001beb4b9$63f1e820$f29b12c2@pythonware.com> > The only reason for making it a builtin would be to avoid pulling in > all of string just to get join. another reason is that you might be able to avoid a unicode module... From tismer at appliedbiometrics.com Sat Jun 12 15:27:45 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Sat, 12 Jun 1999 15:27:45 +0200 Subject: [Python-Dev] More flexible namespaces. References: <008d01be92b2$c56ef5d0$0801a8c0@bobcat> <199904300300.XAA00608@eric.cnri.reston.va.us> <37296096.D0C9C2CC@appliedbiometrics.com> <199904301517.LAA01422@eric.cnri.reston.va.us> Message-ID: <37626051.C1EA8AE0@appliedbiometrics.com> Guido van Rossum wrote: > > > From: Christian Tismer > > > I'd really like to look into that. > > Also I wouldn't worry too much about speed, since this is > > such a cool feature. It might even be a speedup in some cases > > which otherwise would need more complex handling. > > > > May I have a look? > > Sure! > > (I've forwarded Christian the files per separate mail.) > > I'm also interested in your opinion on how well thought-out and robust > the patches are -- I've never found the time to do a good close > reading of them. Coming back from the stackless task with is finished now, I popped this task from my stack. I had a look and it seems well-thought and robust so far. To make a more trustable claim, I would need to build and test it. Is this still of interest, or should I drop it? The follow-ups in this thread indicated that the opinions about flexible namespaces were quite mixed. So, should I waste time in building and testing or better save it? chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Sat Jun 12 19:16:28 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Sat, 12 Jun 1999 13:16:28 -0400 (EDT) Subject: [Python-Dev] String methods... 
finally References: <14176.19210.146525.172100@anthem.cnri.reston.va.us> <008601beb3da$5e0a7a60$f29b12c2@pythonware.com> <14177.20510.818041.110989@anthem.cnri.reston.va.us> <00c301beb4b3$e84e3de0$f29b12c2@pythonware.com> Message-ID: <14178.38380.734976.164568@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> btw, as for the "missing methods in the string type" FL> issue, my suggestion is to merge the source code into FL> a unified string module, which is compiled twice (or FL> three times, the day we find that we need a 32-bit FL> string type). don't waste any time cutting and FL> pasting until we've sorted that one out... Very good. Give me the nod when the sorting algorithm halts. From tim_one at email.msn.com Sat Jun 12 20:28:13 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 14:28:13 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com> Message-ID: <000101beb501$55fb9b60$ce9e2299@tim> [Skip Montanaro] > Any reason why join should be a builtin and not a method available just > to sequences? Would there some valid interpretation of > > join( {'a': 1} ) > join( 1 ) > > ? If not, I vote for method-hood, not builtin-hood. Same here, except as a method we've got it twice backwards : it should be a string method, but a method of the *separator*: sep.join(seq) same as convert each elt in seq to a string of the same flavor as sep, then paste the converted strings together with sep between adjacent elements So " ".join(list) delivers the same result as today's string.join(map(str, list), " ") and L" ".join(list) does much the same tomorrow but delivers a Unicode string (or is the "L" for Lundh string ?). It looks odd at first, but the more I play with it the more I think it's "the right thing" to do: captures everything that's done today, plus the most common idiom (mapping str first across the sequence) on top of that, adapts seamlessly (from the user's view) to new string types, and doesn't invite uselessly redundant generalization to non-sequence types. One other attraction perhaps unique to me: I can never remember whether string.join's default separator is a blank or a null string! Explicit is better than implicit . the-heart-of-a-join-is-the-glue-ly y'rs - tim From tim_one at email.msn.com Sat Jun 12 20:28:18 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 14:28:18 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <00fb01beb4b6$4df59420$f29b12c2@pythonware.com> Message-ID: <000201beb501$578548a0$ce9e2299@tim> [Tim] > expandtabs is used frequently in IDLE and even pyclbr.py now. > Curiously, though, they almost never want the tab-expanded string, > but rather its len. [/F] > looked in stropmodule.c lately: > > static PyObject * > strop_expandtabs(self, args) > ... > /* First pass: determine size of output string */ > ... > /* Second pass: create output string and fill it */ > ... > > (btw, I originally wrote that code for pythonworks ;-) Yes, it's nice code! The irony was the source of my "curiously" . > how about an "expandtabslength" method? Na, it's very specialized, easy to spell by hand, and even IDLE/pyclbr don't really need more speed in this area. 
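As a concrete illustration of the sep.join(seq) semantics Tim proposes a few messages up ("convert each elt in seq to a string of the same flavor as sep, then paste the converted strings together with sep between adjacent elements"), here is a small emulation in present-day Python; sepjoin is just an illustrative name, and note that the str.join method that was eventually adopted does not do the str() conversion step by itself:

    def sepjoin(sep, seq):
        # sep.join(seq) as proposed: convert each element to sep's flavour
        # of string, then paste the pieces together with sep between them.
        return sep.join([str(item) for item in seq])

    print(sepjoin(" ", ["a", "b", "c"]))        # -> 'a b c'
    print(sepjoin(", ", [1, 2.5, "three"]))     # -> '1, 2.5, three'

    # The all-strings case is what the real method covers directly:
    print("-".join(["x", "y", "z"]))            # -> 'x-y-z'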
From tim_one at email.msn.com Sat Jun 12 23:37:08 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 17:37:08 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <3761098D.A56F58A8@ifu.net> Message-ID: <000501beb51b$b9cb3780$ce9e2299@tim> [Aaron Watters] > ... > I thought it would be good to be able to do the following loop > with Numeric arrays > > for x in array1: > array2[x] = array3[x] + array4[x] > > without any memory management being involved. Right now, I think the > for loop has to continually dynamically allocate each new x Actually not, it just binds x to the sequence of PyObject*'s already in array1, one at a time. It does bump & drop the refcount on that object a lot. Also irksome is that it keeps allocating/deallocating a little integer on each trip, for the under-the-covers loop index! Marc-Andre (I think) had/has a patch to worm around that, but IIRC it didn't make much difference (wouldn't expect it to, though -- not if the loop body does any real work). One thing a smarter Python compiler could do is notice the obvious : the *internal* incref/decref operations on the object denoted by x in the loop above must cancel out, so there's no need to do any of them. "internal" == those due to the routine actions of the PVM itself, while pushing and popping the eval stack. Exploiting that is tedious; e.g., inventing a pile of opcode variants that do the same thing as today's except skip an incref here and a decref there. > and intermediate sum (and immediate deallocate them) The intermediate sum is allocated each time, but not deallocated (the pre-existing object at array2[x] *may* be deallocated, though). > and that makes the loop piteously slow. A lot of things conspire to make it slow. David is certainly right that, in this particular case, array2[array1] = array3[array1] + etc worms around the worst of them. > The idea replacing pyobject *'s with a struct [typedescr *, data *] > was a space/time tradeoff to speed up operations like the above > by eliminating any need for mallocs or other memory management.. Fleshing out details may make it look less attractive. For machines where ints are no wider than pointers, the "data *" can be replaced with the int directly and then there's real potential. If for a float the "data*" really *is* a pointer, though, what does it point *at*? Some dynamically allocated memory to hold the float appears to be the only answer, and you're right back at the problem you were hoping to avoid. Make the "data*" field big enough to hold a Python float directly, and the descriptor likely zooms to 128 bits (assuming float is IEEE double and the machine requires natural alignment). Let's say we do that. Where does the "+" implementation get the 16 bytes it needs to store its result? The space presumably already exists in the slot indexed by array2[x], but the "+" implementation has no way to *know* that. Figuring it out requires non-local analysis, which is quite a few steps beyond what Python's compiler can do today. Easiest: internal functions all grow a new PyDescriptor* argument into which they are to write their result's descriptor. The PVM passes "+" the address of the slot indexed by array2[x] if it's smart enough; or, if it's not, the address of the stack slot descriptor into which today's PVM *would* push the result. In the latter case the PVM would need to copy those 16 bytes into the slot indexed by array2[x] later. 
Neither of those are simple as they sound, though, at least because if array2[x] holds a descriptor with a real pointer in its variant half, the thing to which it points needs to get decref'ed iff the add succeeds. It can get very messy! > I really can't say whether it'd be worth it or not without some sort of > real testing. Just a thought. It's a good thought! Just hard to make real. but-if-michael-hudson-keeps-hacking-at-bytecodes-and-christian- keeps-trying-to-prove-he's-crazier-than-michael-by-2001- we'll-be-able-to-generate-optimized-vector-assembler-for- it-ly y'rs - tim From tim_one at email.msn.com Sat Jun 12 23:37:14 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 12 Jun 1999 17:37:14 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <375FC062.62850DE5@ifu.net> Message-ID: <000601beb51b$bc723ba0$ce9e2299@tim> [Aaron Watters] > ... > Another fix would be to put the refcount in the static side with > no speed penalty > > (typedescr > repr* ----------------------> data > refcount > ) > > but would that be wasteful of space? The killer is for types where repr* is a real pointer: x = [Whatever()] y = x[:] Now we have two physically distinct descriptors pointing at the same thing, and so also two distinct refcounts for that thing -- impossible to keep them in synch efficiently; "del y" has no way efficient way to find the refcount hiding in x. tbings-and-and-their-refcounts-are-monogamous-ly y'rs - tim From bwarsaw at cnri.reston.va.us Sun Jun 13 19:56:33 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Sun, 13 Jun 1999 13:56:33 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14177.25698.40807.786489@cm-24-29-94-19.nycap.rr.com> <000101beb501$55fb9b60$ce9e2299@tim> Message-ID: <14179.61649.286195.248429@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: TP> Same here, except as a method we've got it twice backwards TP> : it should be a string method, but a method of the TP> *separator*: TP> sep.join(seq) TP> same as | convert each elt in seq to a string of the same flavor as | sep, then paste the converted strings together with sep | between adjacent elements TP> So TP> " ".join(list) TP> delivers the same result as today's TP> string.join(map(str, list), " ") TP> and TP> L" ".join(list) TP> does much the same tomorrow but delivers a Unicode string (or TP> is the "L" for Lundh string ?). TP> It looks odd at first, but the more I play with it the more I TP> think it's "the right thing" to do At first glance, I like this proposal a lot. I'd be happy to code it up if David'll stop throwing those rocks. Whether or not they hit me, they still hurt :) -Barry From tim_one at email.msn.com Sun Jun 13 21:34:57 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 13 Jun 1999 15:34:57 -0400 Subject: [Python-Dev] String methods... 
finally In-Reply-To: <14179.61649.286195.248429@anthem.cnri.reston.va.us> Message-ID: <000801beb5d3$d1fd06e0$ae9e2299@tim> > >>>>> "TP" == Tim Peters writes: > > TP> Same here, except as a method we've got it twice backwards > TP> : it should be a string method, but a method of the > TP> *separator*: > > TP> sep.join(seq) > > TP> same as > > | convert each elt in seq to a string of the same flavor as > | sep, then paste the converted strings together with sep > | between adjacent elements > > TP> So > > TP> " ".join(list) > > TP> delivers the same result as today's > > TP> string.join(map(str, list), " ") > > TP> and > > TP> L" ".join(list) > > TP> does much the same tomorrow but delivers a Unicode string (or > TP> is the "L" for Lundh string ?). > > TP> It looks odd at first, but the more I play with it the more I > TP> think it's "the right thing" to do Barry, did it ever occur to you to that this fancy Emacs quoting is pig ugly ? [Barry A. Warsaw] > At first glance, I like this proposal a lot. That's a bit scary -- even I didn't like it at first glance. It kept growing on me, though, especially after a trivial naming trick: space, tab, null = ' ', '\t', '' ... sentence = space.join(list) table = tab.join(list) squashed = null.join(list) That's so beautifully self-descriptive I cried! Well, I actually jerked my leg and stubbed my little toe badly, but it's healing nicely, thank you. Note the naturalness too of creating zippier bound method objects for the kinds of join you're doing most often: spacejoin = ' '.join tabjoin = '\t'.join etc. I still like it more the more I play with it. > I'd be happy to code it up if David'll stop throwing those rocks. David warmed up to it in pvt email (his first response was the expected one-liner "Wacky!"). Other issues: + David may want C.join(T) generalized to other classes C and argument types T. So far my response to all such generalizations has been "wacky!" , but I don't think that bears one way or t'other on whether StringType.join(SequenceType) makes good sense on its own. + string.join(seq) doesn't currently convert seq elements to string type, and in my vision it would. At least three of us admit to mapping str across seq anyway before calling string.join, and I think it would be a nice convenience: I think there's no confusion because there's nothing sensible string.join *could* do with a non-string seq element other than convert it to string. The primary effect of string.join griping about a non-string seq element today is that my if not ok: sys.__stdout__.write("not ok, args are " + string.join(args) + "\n") debugging output blows up instead of being helpful <0.8 wink>. If Guido is opposed to being helpful, though , the auto-convert bit isn't essential. > Whether or not they hit me, they still hurt :) I know they do, Barry. That's why I never throw rocks at you. If you like, I'll have a word with David's ISP. if-this-was-a-flame-war-we're-too-civilized-to-live-long-enough-to- reproduce-ly y'rs - tim From da at ski.org Sun Jun 13 21:48:59 1999 From: da at ski.org (David Ascher) Date: Sun, 13 Jun 1999 12:48:59 -0700 (Pacific Daylight Time) Subject: [Python-Dev] String methods... finally In-Reply-To: <14179.61649.286195.248429@anthem.cnri.reston.va.us> Message-ID: On Sun, 13 Jun 1999, Barry A. Warsaw wrote: > At first glance, I like this proposal a lot. I'd be happy to code it > up if David'll stop throwing those rocks. Whether or not they hit me, > they still hurt :) I like it too, since you ask. 
=) (When you get a chance, could you bring the rocks back? I only have a limited supply. Thanks). --david From guido at CNRI.Reston.VA.US Mon Jun 14 16:46:34 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 14 Jun 1999 10:46:34 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Sat, 12 Jun 1999 14:28:13 EDT." <000101beb501$55fb9b60$ce9e2299@tim> References: <000101beb501$55fb9b60$ce9e2299@tim> Message-ID: <199906141446.KAA00733@eric.cnri.reston.va.us> > Same here, except as a method we've got it twice backwards : it > should be a string method, but a method of the *separator*: > > sep.join(seq) Funny, but it does seem right! Barry, go for it... --Guido van Rossum (home page: http://www.python.org/~guido/) From klm at digicool.com Mon Jun 14 17:09:58 1999 From: klm at digicool.com (Ken Manheimer) Date: Mon, 14 Jun 1999 11:09:58 -0400 Subject: [Python-Dev] String methods... finally Message-ID: <613145F79272D211914B0020AFF640191D1BAF@gandalf.digicool.com> > [Skip Montanaro] > > I see the following functions in string.py that could > > reasonably be methodized: > > > > ljust, rjust, center, expandtabs, capwords > > > > Also zfill. > > > > [Barry A. Warsaw] > > What do you think, are these important enough to add? I think expandtabs is worthwhile. Though i wouldn't say i use it frequently, when i do use it i'm thankful it's there - it's something i'm really glad to have precooked, since i'm generally not looking for the distraction when i do happen to need it... Ken klm at digicool.com From guido at CNRI.Reston.VA.US Mon Jun 14 17:12:33 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 14 Jun 1999 11:12:33 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: Your message of "Sat, 12 Jun 1999 02:19:37 EDT." <000101beb49b$8c27c620$b19e2299@tim> References: <000101beb49b$8c27c620$b19e2299@tim> Message-ID: <199906141512.LAA00793@eric.cnri.reston.va.us> [me] > > - Thus, lists take double the memory assuming they reference objects > > that also exist elsewhere. This affects the performance of slices > > etc. > > > > - On the other hand, a list of ints takes half the memory (given that > > most of those ints are not shared). [Tim] > Isn't this 2/3 rather than 1/2? I'm picturing a list element today as > essentially a pointer to a type object pointer + int (3 units in all), and a > type object pointer + int (2 units in all) "tomorrow". Throw in refcounts > too and the ratio likely gets closer to 1. An int is currently 3 units: type, refcnt, value. (The sepcial int allocator means that there's no malloc overhead.) A list item is one unit. So a list of N ints is 4N units (+ overhead). In the proposed scheme, there would be 2 units. That makes a factor of 1/2 for me... > Well, Python already has homogeneous int lists (array.array), and while they > save space they suffer in speed due to needing to wrap raw ints "in an > object" upon reference and unwrap them upon storage. Which would become faster with the proposed scheme since it would not require any heap allocation (presuming 2-unit structs can be passed around as function results). > > - Reference count manipulations could be done by a macro (or C++ > > behind-the-scense magic using copy constructors and destructors) that > > calls a function in the type object -- i.e. each object could decide > > on its own reference counting implementation :-) > > You don't need to switch representations to get that, though, right? 
That > is, I don't see anything stopping today's type objects from growing > __incref__ and __decref__ slots -- except for common sense . Eh, indeed . > An apparent ramification I don't see above that may actually be worth > something : > > - In "i = j + k", the eval stack could contain the ints directly, instead of > pointers to the ints. So fetching the value of i takes two loads (get the > type pointer + the variant) from adjacent stack locations, instead of > today's load-the-pointer + follow-the-pointer (to some other part of > memory); similarly for fetching the value of j. Then the sum can be stored > *directly* into the stack too, without today's need for allocating and > wrapping it in "an int object" first. I though this was assumed all the time? I mentioned "no heap allocation" above before I read this. I think this is the reason why it was proposed at all: things for which the value fits in a unit don't live on the heap at all, *without* playing tricks with pointer representations. > Possibly happy variant: on top of the above, *don't* exempt ints from > refcounting. Let 'em incref and decref like everything else. Give them an > intial refcount of max_count/2, and in the exceedingly unlikely event a > decref on an int ever sees zero, the int "destructor" simply resets the > refcount to max_count/2 and is otherwise a nop. Don't get this -- there's no object on the heap to hold the refcnt. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Mon Jun 14 20:47:32 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 14 Jun 1999 14:47:32 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us> <000801beb5d3$d1fd06e0$ae9e2299@tim> Message-ID: <14181.20036.857729.999835@anthem.cnri.reston.va.us> >>>>> "TP" == Tim Peters writes: Timbot> Barry, did it ever occur to you to that this fancy Emacs Timbot> quoting is pig ugly ? wink> + string.join(seq) doesn't currently convert seq elements to wink> string type, and in my vision it would. At least three of wink> us admit to mapping str across seq anyway before calling wink> string.join, and I think it would be a nice convenience: Check the CVS branch. It does seem pretty cool! From bwarsaw at cnri.reston.va.us Mon Jun 14 20:48:10 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 14 Jun 1999 14:48:10 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us> Message-ID: <14181.20074.728230.764485@anthem.cnri.reston.va.us> >>>>> "DA" == David Ascher writes: DA> (When you get a chance, could you bring the rocks back? I DA> only have a limited supply. Thanks). Sorry, I need them to fill up the empty spaces in my skull. -Barry From tim_one at email.msn.com Tue Jun 15 04:50:08 1999 From: tim_one at email.msn.com (Tim Peters) Date: Mon, 14 Jun 1999 22:50:08 -0400 Subject: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <14181.20036.857729.999835@anthem.cnri.reston.va.us> Message-ID: <000001beb6d9$c82e7980$069e2299@tim> >> wink> + string.join(seq) [etc] [Barry] > Check the CVS branch. It does seem pretty cool! It's even more fun to play with than to argue about . Thank you, Barry! 
A bug: >>> 'ab'.endswith('b',0,1) # right 0 >>> 'ab'.endswith('ab',0,1) # wrong 1 >>> 'ab'.endswith('ab',0,0) # wrong 1 >>> Two legit compiler warnings from a previous checkin: Objects\intobject.c(236) : warning C4013: 'isspace' undefined; assuming extern returning int Objects\intobject.c(243) : warning C4013: 'isalnum' undefined; assuming extern returning int One docstring glitch ("very" -> "every"): >>> print ''.join.__doc__ S.join(sequence) -> string Return a string which is the concatenation of the string representation of very element in the sequence. The separator between elements is S. >>> "-".join("very nice indeed! ly".split()) + " y'rs - tim" From MHammond at skippinet.com.au Tue Jun 15 05:13:03 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 15 Jun 1999 13:13:03 +1000 Subject: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <000001beb6d9$c82e7980$069e2299@tim> Message-ID: <00e901beb6dc$fc830d60$0801a8c0@bobcat> > "-".join("very nice indeed! ly".split()) + " y'rs - tim" But now the IDLE "CallTips" extenion seems lame. Typing >>> " ".join( doesnt yield the help, where: >>> s=" "; s.join( does :-) Very cute, I must say. The biggest temptation is going to be, as I mentioned, avoiding the use of this stuff for "general" code. Im still unconvinced the "sep.join" concept is natural, but string methods in general sure as hell are. Guido almost hinted that post 1.5.2 interim release(s?) would be acceptable, so long as he didnt have to do it! Im tempted to volunteer to agree to do something for Windows, and if no other platform biggots volunteer, I wont mind in the least :-) I realize it still needs settling down, but this is too good to keep to "ourselves" (being CVS enabled people) for too long ;-) Mark. From tim_one at email.msn.com Tue Jun 15 07:29:03 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 01:29:03 -0400 Subject: [Python-Dev] RE: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <199906141512.LAA00793@eric.cnri.reston.va.us> Message-ID: <000a01beb6ef$fac66ea0$069e2299@tim> [Guido] >>> - On the other hand, a list of ints takes half the memory (given that >>> most of those ints are not shared). [Tim] >> Isn't this 2/3 rather than 1/2? [yadda yadda] [Guido] > An int is currently 3 units: type, refcnt, value. (The sepcial int > allocator means that there's no malloc overhead.) A list item is one > unit. So a list of N ints is 4N units (+ overhead). In the proposed > scheme, there would be 2 units. That makes a factor of 1/2 for me... Well, if you count the refcount, sure . Moving on, implies you're not contemplating making the descriptor big enough to hold a float (else it would still be 4 units assuming natural alignment), in turn implying that *only* ints would get the space advantage in lists/tuples? Plus maybe special-casing the snot out of short strings? >> Well, Python already has homogeneous int lists (array.array), >> and while they save space they suffer in speed ... > Which would become faster with the proposed scheme since it would not > require any heap allocation (presuming 2-unit structs can be passed > around as function results). They can be in any std (even reasonable) C (or C++). If this gets serious, though, strongly suggest timing it on important compiler + platform combos, especially RISC. 
You can probably *count* on a PyObject* result getting returned in a register, but depressed C++ compiler jockeys have been known to treat struct/class returns via an unoptimized chain of copy constructors. Probably better to allocate "result space" in the caller and pass that via reference to the callee. With care, you can get the result written into its final resting place efficiently then, more efficiently than even a gonzo globally optimizing compiler could figure out (A calls B call C calls D, and A can tell D exactly where to store the result if it's explicit). >> [other ramifications for >> "i = j + k" >> ] > I though this was assumed all the time? Apparently it was! At least by you . Now by me too; no problem. >> [refcount-on-int drivel] > Don't get this -- there's no object on the heap to hold the refcnt. I don't get it either. Desperation? The idea that incref/decref may need to be treated as virtual methods (in order to exempt ints or other possible direct values) really disturbs me -- incref/decref happen *all* the time, explicit integer ops only some of the time. Turning incref/decref into indirected function calls doesn't sound promising at all. Injecting a test-branch guard via macro sounds faster but still icky, and especially if the set of exempt types isn't a singleton. no-positive-suggestions-just-grousing-ly y'rs - tim From tim_one at email.msn.com Tue Jun 15 08:17:02 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 02:17:02 -0400 Subject: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <00e901beb6dc$fc830d60$0801a8c0@bobcat> Message-ID: <001201beb6f6$af0987c0$069e2299@tim> [Mark Hammond] > ... > But now the IDLE "CallTips" extenion seems lame. > > Typing > >>> " ".join( > > doesnt yield the help, where: > >>> s=" "; s.join( > > does :-) No Windows Guy will be stymied by how to hack that! Hint: string literals always end with one of two characters . > Very cute, I must say. The biggest temptation is going to be, as I > mentioned, avoiding the use of this stuff for "general" code. Im still > unconvinced the "sep.join" concept is natural, but string methods in > general sure as hell are. sep.join bothered me until I gave the separator a name (a la the "space.join, tab.join", etc examples earlier). Then it looked *achingly* natural! Using a one-character literal instead still rubs me the wrong way, although for some reason e.g. ", ".join(seq) no longer does. I can't account for any of it, but I know what I like . > Guido almost hinted that post 1.5.2 interim release(s?) would be > acceptable, so long as he didnt have to do it! Im tempted to volunteer to > agree to do something for Windows, and if no other platform biggots > volunteer, I wont mind in the least :-) I realize it still > needs settling down, but this is too good to keep to "ourselves" (being > CVS enabled people) for too long ;-) Yes, I really like the new string methods too! And I want to rewrite all of IDLE to use them ASAP . damn-the-users-let's-go-nuts-ly y'rs - tim From fredrik at pythonware.com Tue Jun 15 09:10:28 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 15 Jun 1999 09:10:28 +0200 Subject: [Python-Dev] Re: [Python-Dev] String methods... 
finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> Message-ID: <006801beb6fe$27490d80$f29b12c2@pythonware.com> > wink> + string.join(seq) doesn't currently convert seq elements to > wink> string type, and in my vision it would. At least three of > wink> us admit to mapping str across seq anyway before calling > wink> string.join, and I think it would be a nice convenience: hmm. consider the following: space = " " foo = L"foo" bar = L"bar" result = space.join((foo, bar)) what should happen if you run this: a) Python raises an exception b) result is an ordinary string object c) result is a unicode string object From ping at lfw.org Tue Jun 15 09:24:33 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Tue, 15 Jun 1999 00:24:33 -0700 (PDT) Subject: [Python-Dev] Re: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: <000001beb6d9$c82e7980$069e2299@tim> Message-ID: On Mon, 14 Jun 1999, Tim Peters wrote: > > A bug: > > >>> 'ab'.endswith('b',0,1) # right > 0 > >>> 'ab'.endswith('ab',0,1) # wrong > 1 > >>> 'ab'.endswith('ab',0,0) # wrong > 1 > >>> I assumed you meant that the extra arguments should be slices on the string being searched, i.e. specimen.startswith(text, start, end) is equivalent to specimen[start:end].startswith(text) without the overhead of slicing the specimen? Or did i understand you correctly? > Return a string which is the concatenation of the string representation > of very element in the sequence. The separator between elements is S. > >>> > > "-".join("very nice indeed! ly".split()) + " y'rs - tim" Yes, i have to agree that this (especially once you name the separator string) is a pretty nice way to present the "join" functionality. !ping "Is it so small a thing, To have enjoyed the sun, To have lived light in the Spring, To have loved, to have thought, to have done; To have advanced true friends, and beat down baffling foes-- That we must feign bliss Of a doubtful future date, And while we dream on this, Lose all our present state, And relegate to worlds... yet distant our repose?" -- Matthew Arnold From MHammond at skippinet.com.au Tue Jun 15 10:28:55 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Tue, 15 Jun 1999 18:28:55 +1000 Subject: [Python-Dev] RE: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <006801beb6fe$27490d80$f29b12c2@pythonware.com> Message-ID: <00f801beb709$1c874b90$0801a8c0@bobcat> > hmm. consider the following: > > space = " " > foo = L"foo" > bar = L"bar" > result = space.join((foo, bar)) > > what should happen if you run this: > > a) Python raises an exception > b) result is an ordinary string object > c) result is a unicode string object Well, we could take this to the extreme, and allow _every_ object to grow a join method, where join attempts to cooerce to the same type. Thus: " ".join([L"foo", L"bar"]) -> "foo bar" L" ".join(["foo", "bar"]) -> L"foo bar" " ".join([1,2]) -> "1 2" 0.join(['1',2']) -> 102 [].join([...]) # exercise for the reader ;-) etc. Mark. From ping at lfw.org Tue Jun 15 10:50:34 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Tue, 15 Jun 1999 01:50:34 -0700 (PDT) Subject: [Python-Dev] Re: [Python-Dev] RE: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <00f801beb709$1c874b90$0801a8c0@bobcat> Message-ID: On Tue, 15 Jun 1999, Mark Hammond wrote: > > hmm. 
consider the following: > > > > space = " " > > foo = L"foo" > > bar = L"bar" > > result = space.join((foo, bar)) > > > > what should happen if you run this: > > > > a) Python raises an exception > > b) result is an ordinary string object > > c) result is a unicode string object > > Well, we could take this to the extreme, and allow _every_ object to grow a > join method, where join attempts to cooerce to the same type. I think i'd agree with Mark's answer for this situation, though i don't know about adding 'join' methods to other types. I see two arguments that can be made here: For b): the result should match the type of the object on which the method was called. This way the type of the result more easily determinable by the programmer or reader. Also, since the type of the result is immediately known to the "join" code, each member of the passed-in sequence need only be fetched once, and a __getitem__-style generator can easily stand in for the sequence. For c): the result should match the "biggest" type among the operands. This behaviour is consistent with what you would get if you added all the operands together. Unfortunately this means you have to see all the operands before you know the type of the result, which means you either scan twice or convert potentially the whole result. b) weighs more strongly in my opinion, so i think the right thing to do is to match the type of the separator. (But if a Unicode string contains characters outside of the Latin-1 range, is it supposed to raise an exception on an attempt to convert to an ordinary string? In that case, the actual behaviour of the above example would be a) and i'm not sure if that would get annoying fast.) -- ?!ng "In the sciences, we are now uniquely privileged to sit side by side with the giants on whose shoulders we stand." -- Gerald Holton From gstein at lyra.org Tue Jun 15 11:05:43 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 15 Jun 1999 02:05:43 -0700 Subject: [Python-Dev] Re: String methods... finally References: Message-ID: <37661767.37D8E370@lyra.org> Ka-Ping Yee wrote: >... > (But if a Unicode string contains characters outside of > the Latin-1 range, is it supposed to raise an exception > on an attempt to convert to an ordinary string? In that > case, the actual behaviour of the above example would be > a) and i'm not sure if that would get annoying fast.) I forget the "last word" on this, but (IMO) str(unicode_object) should return a UTF-8 encoded string. Cheers, -g p.s. what's up with Mailman... it seems to have broken badly on the [Python-Dev] insertion... I just stripped a bunch of 'em -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Tue Jun 15 11:48:40 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 15 Jun 1999 11:48:40 +0200 Subject: [Python-Dev] Re: String methods... finally References: Message-ID: <003e01beb714$55d7fd80$f29b12c2@pythonware.com> > > > a) Python raises an exception > > > b) result is an ordinary string object > > > c) result is a unicode string object > > > > Well, we could take this to the extreme, and allow _every_ object to grow a > > join method, where join attempts to cooerce to the same type. well, I think that unicode strings and ordinary strings should behave like "strings" where possible, just like integers, floats, long integers and complex values be- have like "numbers" in many (but not all) situations. 
if we make unicode strings easier to mix with ordinary strings, we don't necessarily have to make integers and lists easier to mix with strings too... (people who want that can use Tcl instead ;-) > I think i'd agree with Mark's answer for this situation, though > i don't know about adding 'join' methods to other types. I see two > arguments that can be made here: > > For b): the result should match the type of the object > on which the method was called. This way the type of > the result more easily determinable by the programmer > or reader. Also, since the type of the result is > immediately known to the "join" code, each member of the > passed-in sequence need only be fetched once, and a > __getitem__-style generator can easily stand in for the > sequence. > > For c): the result should match the "biggest" type among > the operands. This behaviour is consistent with what > you would get if you added all the operands together. > Unfortunately this means you have to see all the operands > before you know the type of the result, which means you > either scan twice or convert potentially the whole result. > > b) weighs more strongly in my opinion, so i think the right > thing to do is to match the type of the separator. > > (But if a Unicode string contains characters outside of > the Latin-1 range, is it supposed to raise an exception > on an attempt to convert to an ordinary string? In that > case, the actual behaviour of the above example would be > a) and i'm not sure if that would get annoying fast.) exactly. there are some major issues hidden in here, including: 1) what should "str" do for unicode strings? 2) should join really try to convert its arguments? 3) can "str" really raise an exception for a built-in type? 4) should code written by americans fail when used in other parts of the world? based on string-sig input, the unicode class currently solves (1) by returning a UTF-8 encoded version of the unicode string contents. this was chosen to make sure that the answer to (3) is "no, never", and that the an- swer (4) is "not always, at least" -- we've had enough of that, thank you: http://www.lysator.liu.se/%e5ttabitars/7bit-example.txt if (1) is a reasonable solution (I think it is), I think the answer to (2) should be no, based on the rule of least surprise. Python has always required me to explicitly state when I want to convert things in a way that may radically change their meaning. I see little reason to abandon that in 1.6. From gstein at lyra.org Tue Jun 15 12:01:09 1999 From: gstein at lyra.org (Greg Stein) Date: Tue, 15 Jun 1999 03:01:09 -0700 Subject: [Python-Dev] Re: [Python-Dev] Re: String methods... finally References: <003e01beb714$55d7fd80$f29b12c2@pythonware.com> Message-ID: <37662465.682FA81B@lyra.org> Fredrik Lundh wrote: >... > if (1) is a reasonable solution (I think it is), I think the > answer to (2) should be no, based on the rule of least > surprise. Python has always required me to explicitly > state when I want to convert things in a way that may > radically change their meaning. I see little reason to > abandon that in 1.6. Especially because it is such a simple translation: sep.join(sequence) becomes sep.join(map(str, sequence)) Very obvious what is happening. It isn't hard to read, and it doesn't take a lot out of a person to insert that extra phrase. And hey... people can always do: def strjoin(sep, seq): return sep.join(map(str, seq)) And just use strjoin() everywhere if they hate the typing. 
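A quick demonstration, under a present-day Python, of the distinction Greg is drawing: the bare join method does not auto-convert (roughly Fredrik's "no" answer to issue 2 above), while the explicit map(str, ...) wrapper does. The example values are arbitrary:

    def strjoin(sep, seq):
        return sep.join(map(str, seq))

    print(strjoin(", ", [1, 2.5, "three"]))     # -> '1, 2.5, three'

    try:
        ", ".join([1, 2.5, "three"])            # no auto-conversion...
    except TypeError as exc:
        print("TypeError:", exc)                # ...so this raises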
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gmcm at hypernet.com Tue Jun 15 15:08:08 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Tue, 15 Jun 1999 08:08:08 -0500 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: String methods... finally In-Reply-To: <37662465.682FA81B@lyra.org> Message-ID: <1282670144-103087754@hypernet.com> Greg Stein wrote: ... > And hey... people can always do: > > def strjoin(sep, seq): > return sep.join(map(str, seq)) > > And just use strjoin() everywhere if they hate the typing. Those who hate typing regard it as great injury that they have to define this. Of course, they'll gladly type huge long posts on the subject. But, I agree. string.join(['a', 'b', 3]) currently barfs. L" ".join(seq) should complain if seq isn't all unicode, and same for good old strings. - Gordon From guido at CNRI.Reston.VA.US Tue Jun 15 14:39:09 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 08:39:09 -0400 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: Your message of "Tue, 15 Jun 1999 09:10:28 +0200." <006801beb6fe$27490d80$f29b12c2@pythonware.com> References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> Message-ID: <199906151239.IAA02917@eric.cnri.reston.va.us> > hmm. consider the following: > > space = " " > foo = L"foo" > bar = L"bar" > result = space.join((foo, bar)) > > what should happen if you run this: > > a) Python raises an exception > b) result is an ordinary string object > c) result is a unicode string object The same should happen as for L"foo" + " " + L"bar". --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Tue Jun 15 14:50:59 1999 From: skip at mojam.com (Skip Montanaro) Date: Tue, 15 Jun 1999 08:50:59 -0400 (EDT) Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <199906151239.IAA02917@eric.cnri.reston.va.us> References: <14179.61649.286195.248429@anthem.cnri.reston.va.us> <000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <14182.19420.462788.15633@cm-24-29-94-19.nycap.rr.com> Guido> The same should happen as for L"foo" + " " + L"bar". Remind me again, please. What mnemonic is "L" supposed to evoke? Long? Lundh? Are we talking about Unicode strings? If so, why not "U"? Apologies for my increased density. Skip Montanaro | Mojam: "Uniting the World of Music" http://www.mojam.com/ skip at mojam.com | Musi-Cal: http://www.musi-cal.com/ 518-372-5583 From jack at oratrix.nl Tue Jun 15 14:58:05 1999 From: jack at oratrix.nl (Jack Jansen) Date: Tue, 15 Jun 1999 14:58:05 +0200 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: Message by Guido van Rossum , Tue, 15 Jun 1999 08:39:09 -0400 , <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <19990615125805.8CF03303120@snelboot.oratrix.nl> > The same should happen as for L"foo" + " " + L"bar". This is probably the most reasonable solution. 
Unfortunately it breaks Marks truly novel suggestion that 0.join(1, 2) becomes 102, but I guess we'll have to live with that:-) -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From fredrik at pythonware.com Tue Jun 15 16:28:17 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 15 Jun 1999 16:28:17 +0200 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <00c201beb73b$5fa27b70$f29b12c2@pythonware.com> > > hmm. consider the following: > > > > space = " " > > foo = L"foo" > > bar = L"bar" > > result = space.join((foo, bar)) > > > > what should happen if you run this: > > > > a) Python raises an exception > > b) result is an ordinary string object > > c) result is a unicode string object > > The same should happen as for L"foo" + " " + L"bar". which is? (alright; for the moment, it's (a) for both: >>> import unicode >>> u = unicode.unicode >>> u("foo") + u(" ") + u("bar") Traceback (innermost last): File "", line 1, in ? TypeError: illegal argument type for built-in operation >>> u("foo") + " " + u("bar") Traceback (innermost last): File "", line 1, in ? TypeError: illegal argument type for built-in operation >>> u(" ").join(("foo", "bar")) Traceback (innermost last): File "", line 1, in ? TypeError: first argument must be sequence of unicode strings but that can of course be changed...) From guido at CNRI.Reston.VA.US Tue Jun 15 16:38:32 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 10:38:32 -0400 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: Your message of "Tue, 15 Jun 1999 16:28:17 +0200." <00c201beb73b$5fa27b70$f29b12c2@pythonware.com> References: <14179.61649.286195.248429@anthem.cnri.reston.va.us><000801beb5d3$d1fd06e0$ae9e2299@tim> <14181.20036.857729.999835@anthem.cnri.reston.va.us> <006801beb6fe$27490d80$f29b12c2@pythonware.com> <199906151239.IAA02917@eric.cnri.reston.va.us> <00c201beb73b$5fa27b70$f29b12c2@pythonware.com> Message-ID: <199906151438.KAA03355@eric.cnri.reston.va.us> > > The same should happen as for L"foo" + " " + L"bar". > > which is? Whatever it is -- I think we did a lot of reasoning about this, and perhaps we're not quite done -- but I truly believe that whatever is decided, join() should follow. --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Tue Jun 15 17:28:11 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Tue, 15 Jun 1999 11:28:11 -0400 (EDT) Subject: [Python-Dev] Re: String methods... finally References: <37661767.37D8E370@lyra.org> Message-ID: <14182.28939.509040.125174@anthem.cnri.reston.va.us> >>>>> "GS" == Greg Stein writes: GS> p.s. what's up with Mailman... it seems to have broken badly GS> on the [Python-Dev] insertion... I just stripped a bunch of GS> 'em Harald Meland just checked in a fix for this, which I'm installing now, so the breakage should be just temporary. 
-Barry From tim_one at email.msn.com Tue Jun 15 17:33:38 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 11:33:38 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <006801beb6fe$27490d80$f29b12c2@pythonware.com> Message-ID: <000601beb744$70c6f9e0$979e2299@tim> > hmm. consider the following: > > space = " " > foo = L"foo" > bar = L"bar" > result = space.join((foo, bar)) > > what should happen if you run this: > > a) Python raises an exception > b) result is an ordinary string object > c) result is a unicode string object The proposal said #b, or, in general, that the resulting string be of the same flavor as the separator. From tim_one at email.msn.com Tue Jun 15 17:33:40 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 11:33:40 -0400 Subject: [Python-Dev] RE: [Python-Dev] String methods... finally In-Reply-To: Message-ID: <000701beb744$71e450c0$979e2299@tim> >> A bug: >> >> >>> 'ab'.endswith('b',0,1) # right >> 0 >> >>> 'ab'.endswith('ab',0,1) # wrong >> 1 >> >>> 'ab'.endswith('ab',0,0) # wrong >> 1 >> >>> [Ka-Ping] > I assumed you meant that the extra arguments should be slices > on the string being searched, i.e. > > specimen.startswith(text, start, end) > > is equivalent to > > specimen[start:end].startswith(text) > > without the overhead of slicing the specimen? Or did i understand > you correctly? Yes, and e.g. 'ab'[0:1] == 'a', which does not end with 'ab'. So these are inconsistent today, and the second is a bug: >>> 'ab'[0:1].endswith('ab') 0 >>> 'ab'.endswith('ab', 0, 1) 1 >>> Or did I misunderstand you ? From gward at cnri.reston.va.us Tue Jun 15 17:41:39 1999 From: gward at cnri.reston.va.us (Greg Ward) Date: Tue, 15 Jun 1999 11:41:39 -0400 Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <19990615125805.8CF03303120@snelboot.oratrix.nl>; from Jack Jansen on Tue, Jun 15, 1999 at 02:58:05PM +0200 References: <19990615125805.8CF03303120@snelboot.oratrix.nl> Message-ID: <19990615114139.A3697@cnri.reston.va.us> On 15 June 1999, Jack Jansen said: > > The same should happen as for L"foo" + " " + L"bar". > > This is probably the most reasonable solution. Unfortunately it breaks Marks > truly novel suggestion that 0.join(1, 2) becomes 102, but I guess we'll have > to live with that:-) Careful -- it actually works this way in Perl (well, except that join isn't a method of strings...): $ perl -de 1 [...] DB<2> $sep = 0 DB<3> @list = (1, 2) DB<4> p join ($sep, @list) 102 Cool! Who needs type-checking anyways? Greg -- Greg Ward - software developer gward at cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive voice: +1-703-620-8990 Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913 From tim_one at email.msn.com Tue Jun 15 17:58:48 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 15 Jun 1999 11:58:48 -0400 Subject: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <199906151239.IAA02917@eric.cnri.reston.va.us> Message-ID: <000901beb747$f4531840$979e2299@tim> >> space = " " >> foo = L"foo" >> bar = L"bar" >> result = space.join((foo, bar)) > The same should happen as for L"foo" + " " + L"bar". Then " ".join([" ", 42]) should blow up, and auto-conversion for non-string types needs to be removed from the implementation. 
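Spelled out as code, the two readings of join look roughly like this. This is only a sketch, not the actual string-method implementation; the converting variant is essentially Greg's strjoin() from earlier in the thread:

    import string

    def strict_join(sep, seq):
        # no auto-conversion: " ".join([" ", 42]) blows up, as argued above
        for item in seq:
            if type(item) is not type(''):
                raise TypeError("non-string in sequence: %s" % repr(item))
        return string.join(seq, sep)

    def converting_join(sep, seq):
        # auto-conversion: every item goes through str() first
        return string.join(map(str, seq), sep)

With these, strict_join(" ", [" ", 42]) raises TypeError while converting_join(" ", [" ", 42]) quietly returns "  42".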
The attraction of auto-conversion for me is that I had never once seen string.join blow up where the exception revealed a conceptual error; in every case conversion to string was the intent, and an obvious one at that. Just anal nagging. How about dropping Unicode instead ? Anyway, I'm already on record as saying auto-convert wasn't essential, and join should first and foremost make good sense for string arguments. off-to-work-ly y'rs - tim From MHammond at skippinet.com.au Wed Jun 16 00:29:32 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 16 Jun 1999 08:29:32 +1000 Subject: [Python-Dev] Re: String methods... finally In-Reply-To: <003e01beb714$55d7fd80$f29b12c2@pythonware.com> Message-ID: <010101beb77e$8af64430$0801a8c0@bobcat> > well, I think that unicode strings and ordinary strings > should behave like "strings" where possible, just like > integers, floats, long integers and complex values be- > have like "numbers" in many (but not all) situations. I obviously missed a few smileys in my post. I was serious that: L" ".join -> Unicode result " ".join -> String result and even " ".join([1,2]) -> "1 2" But integers and lists growing "join" methods was a little tounge in cheek :-) Mark. From da at ski.org Wed Jun 16 00:48:41 1999 From: da at ski.org (David Ascher) Date: Tue, 15 Jun 1999 15:48:41 -0700 (Pacific Daylight Time) Subject: [Python-Dev] mmap Message-ID: Another topic: what are the chances of adding the mmap module to the core distribution? It's restricted to a smallish set of platforms (modern Unices and Win32, I think), but it's quite small, and would be a nice thing to have available in the core, IMHO. (btw, the buffer object needs more documentation) --david From MHammond at skippinet.com.au Wed Jun 16 00:53:00 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 16 Jun 1999 08:53:00 +1000 Subject: [Python-Dev] String methods... finally In-Reply-To: <000901beb747$f4531840$979e2299@tim> Message-ID: <010201beb781$d1febf30$0801a8c0@bobcat> [Before I start: Skip mentioned "why L, not U". I know C/C++ uses L, presumably to denote a "long" string (presumably keeping the analogy between int and long ints). I guess Java has no such indicator, being native Unicode? Is there any sort of agreement that Python will use L"..." to denote Unicode strings? I would be happy with it. Also, should: print L"foo" -> 'foo' and print `L"foo"` -> L'foo' I would like to know if there is agreement for this, so I can change the Pythonwin implementation of Unicode now to make things more seamless later. ] > >> space = " " > >> foo = L"foo" > >> bar = L"bar" > >> result = space.join((foo, bar)) > > > The same should happen as for L"foo" + " " + L"bar". I must admit Guido's position has real appeal, even if just from a documentation POV. Eg, join can be defined as: sep.join([s1, ..., sn]) Returns s1 + sep + s2 + sep + ... + sepn Nice and simple to define and understand. Thus, if you can't add 2 items, you can't join them. Assuming the Unicode changes allow us to say: assert " " == L" ", "eek" assert L" " + "" == L" " assert " " + L"" == L" " # or even if this == " " Then this still works well in a Unicode environment; Unicode and strings could be mixed in the list, and as long as you understand what L" " + "" returns, you will understand immediately what the result of join() is going to be. 
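Taken literally, that definition fits in a few lines. A minimal sketch (again, not the proposed implementation), in which the result type simply falls out of whatever "+" does for the items involved:

    def join_by_addition(sep, seq):
        # sep.join([s1, ..., sn]) == s1 + sep + s2 + ... + sep + sn:
        # whatever "+" does for the types involved decides the result,
        # and anything that cannot be added simply raises TypeError
        if not seq:
            return sep[:0]      # empty result of the separator's type
        result = seq[0]
        for item in seq[1:]:
            result = result + sep + item
        return result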
> The attraction of auto-conversion for me is that I had never once seen > string.join blow up where the exception revealed a conceptual > error; in > every case conversion to string was the intent, and an > obvious one at that. OTOH, my gut tells me this is better - that an implicit conversion to the seperator type be performed. Also, it appears that this technique will never surprise anyone in a bad way. It seems the rule above, while simple, basically means "sep.join can only take string/Unicode objects", as all other objects will currently fail the add test. So, given that our rule is that the objects must all be strings, how can it hurt to help the user conform? > off-to-work-ly y'rs - tim where-i-should-be-instead-of-writing-rambling-mails-ly, Mark. From guido at CNRI.Reston.VA.US Wed Jun 16 00:54:42 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 18:54:42 -0400 Subject: [Python-Dev] mmap In-Reply-To: Your message of "Tue, 15 Jun 1999 15:48:41 PDT." References: Message-ID: <199906152254.SAA05114@eric.cnri.reston.va.us> > Another topic: what are the chances of adding the mmap module to the core > distribution? It's restricted to a smallish set of platforms (modern > Unices and Win32, I think), but it's quite small, and would be a nice > thing to have available in the core, IMHO. If it works on Linux, Solaris, Irix and Windows, and is reasonably clean, I'll take it. Please send it. > (btw, the buffer object needs more documentation) That's for Jack & Greg... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Wed Jun 16 01:04:17 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 19:04:17 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat> References: <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <199906152304.TAA05136@eric.cnri.reston.va.us> > Is there any sort of agreement that Python will use L"..." to denote > Unicode strings? I would be happy with it. I don't know of any agreement, but it makes sense. > Also, should: > print L"foo" -> 'foo' > and > print `L"foo"` -> L'foo' Yes, I think this should be the way. Exactly what happens to non-ASCII characters is up to the implementation. Do we have agreement on escapes like \xDDDD? Should \uDDDD be added? The difference between the two is that according to the ANSI C standard, which I follow rather strictly for string literals, '\xABCDEF' is a single character whose value is the lower bits (however many fit in a char) of 0xABCDEF; this makes it cumbersome to write a string consisting of a hex escape followed by a digit or letter a-f or A-F; you would have to use another hex escape or split the literal in two, like this: "\xABCD" "EF". (This is true for 8-bit chars as well as for long char in ANSI C.) The \u escape takes up to 4 bytes but is not ANSI C. In Java, \u has the additional funny property that it is recognized *everywhere* in the source code, not just in string literals, and I believe that this complicates the interpretation of things like "\\uffff" (is the \uffff interpreted before regular string \ processing happens?). I don't think we ought to copy this behavior, although JPython users or developers might disagree. (I don't know anyone who *uses* Unicode strings much, so it's hard to gauge the importance of these issues.) 
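For what it's worth, the literal-splitting workaround is just ordinary adjacent-literal concatenation, and it behaves the same under the ANSI-style rule; a quick check:

    # "\xAB12" would be read as ONE escape under the ANSI-style rule
    # (all following hex digits are consumed), so the literal is split;
    # adjacent literals are concatenated at compile time.
    s = "\xAB" "12"
    assert len(s) == 3          # one 0xAB character, then '1' and '2'
    assert ord(s[0]) == 0xAB
    assert s[1:] == "12"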
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Jun 16 02:09:15 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Tue, 15 Jun 1999 19:09:15 -0500 Subject: [Python-Dev] String methods... finally In-Reply-To: <199906152304.TAA05136@eric.cnri.reston.va.us> References: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <1282630485-105472998@hypernet.com> Guido asks: > Do we have agreement on escapes like \xDDDD? Should \uDDDD be > added? > ... The \u escape > takes up to 4 bytes but is not ANSI C. How do endian issues fit in with \u? - Gordon From guido at CNRI.Reston.VA.US Wed Jun 16 01:20:07 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 15 Jun 1999 19:20:07 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: Your message of "Tue, 15 Jun 1999 19:09:15 CDT." <1282630485-105472998@hypernet.com> References: Your message of "Wed, 16 Jun 1999 08:53:00 +1000." <010201beb781$d1febf30$0801a8c0@bobcat> <1282630485-105472998@hypernet.com> Message-ID: <199906152320.TAA05211@eric.cnri.reston.va.us> > How do endian issues fit in with \u? I would assume that it uses the same rules as hex and octal numeric literals: these are always *written* in big-endian notation, since that is also what we use for decimal numbers. Thus, on a little-endian machine, the short integer 0x1234 would be stored as the bytes {0x34, 0x12} and so would the string literal "\x1234". --Guido van Rossum (home page: http://www.python.org/~guido/) From bwarsaw at cnri.reston.va.us Wed Jun 16 01:27:44 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Tue, 15 Jun 1999 19:27:44 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <000901beb747$f4531840$979e2299@tim> <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <14182.57712.380574.385164@anthem.cnri.reston.va.us> >>>>> "MH" == Mark Hammond writes: MH> OTOH, my gut tells me this is better - that an implicit MH> conversion to the seperator type be performed. Right now, the implementation of join uses PyObject_Str() to str-ify the elements in the sequence. I can't remember, but in our Unicode worldview doesn't PyObject_Str() return a narrowed string if it can, and raise an exception if not? So maybe narrow-string's join shouldn't be doing it this way because that'll autoconvert to the separator's type, which breaks the symmetry. OTOH, we could promote sep to the type of sequence[0] and forward the call to it's join if it were a widestring. That should retain the symmetry. -Barry From bwarsaw at cnri.reston.va.us Wed Jun 16 01:46:24 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Tue, 15 Jun 1999 19:46:24 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <010201beb781$d1febf30$0801a8c0@bobcat> <199906152304.TAA05136@eric.cnri.reston.va.us> Message-ID: <14182.58832.140587.711978@anthem.cnri.reston.va.us> >>>>> "Guido" == Guido van Rossum writes: Guido> Should \uDDDD be added? That'd be nice! :) Guido> In Java, \u has the additional funny property that it is Guido> recognized *everywhere* in the source code, not just in Guido> string literals, and I believe that this complicates the Guido> interpretation of things like "\\uffff" (is the \uffff Guido> interpreted before regular string \ processing happens?). No. 
JLS section 3.3 says[1] In addition to the processing implied by the grammar, for each raw input character that is a backslash \, input processing must consider how many other \ characters contiguously precede it, separating it from a non-\ character or the start of the input stream. If this number is even, then the \ is eligible to begin a Unicode escape; if the number is odd, then the \ is not eligible to begin a Unicode escape. and this is born out by example. -------------------- snip snip --------------------Uni.java public class Uni { static public void main(String[] args) { System.out.println("\\u00a9"); System.out.println("\u00a9"); } } -------------------- snip snip --------------------outputs \u00a9 ? -------------------- snip snip -------------------- -Barry [1] http://java.sun.com/docs/books/jls/html/3.doc.html#44591 PS. it is wonderful having the JLS online :) From ping at lfw.org Tue Jun 15 18:05:40 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Tue, 15 Jun 1999 09:05:40 -0700 (PDT) Subject: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] Re: [Python-Dev] String methods... finally In-Reply-To: <19990615114139.A3697@cnri.reston.va.us> Message-ID: On Tue, 15 Jun 1999, Greg Ward wrote: > Careful -- it actually works this way in Perl (well, except that join > isn't a method of strings...): > > $ perl -de 1 > [...] > DB<2> $sep = 0 > > DB<3> @list = (1, 2) > > DB<4> p join ($sep, @list) > 102 > > Cool! Who needs type-checking anyways? Cool! So then >>> def f(x): return x ** 2 ... >>> def g(x): return x - 5 ... >>> h = join((f, g)) ... >>> h(8) 59 Right? Right? (Just kidding.) -- ?!ng "Any nitwit can understand computers. Many do." -- Ted Nelson From tim_one at email.msn.com Wed Jun 16 06:02:46 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 16 Jun 1999 00:02:46 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <199906152304.TAA05136@eric.cnri.reston.va.us> Message-ID: <000401beb7ad$175193c0$2ca22299@tim> [Guido] > Do we have agreement on escapes like \xDDDD? I think we have to agree to leave that alone -- it affects what e.g. the regular expression parser does too. > Should \uDDDD be added? Yes, but only in string literals. You don't want to be within 10 miles of Barry if you tell him that Emacs pymode has to treat the Unicode escape for a newline as if it were-- as Java treats it outside literals --an actual line break <0.01 wink>. > ... > The \u escape takes up to 4 bytes Not in Java: it requires exactly 4 hex characters after == exactly 2 bytes, and it's an error if it's followed by fewer than 4 hex characters. That's a good rule (simple!), while ANSI C's is too clumsy to live with if people want to take Unicode seriously. So what does it mean for a Unicode escape to appear in a non-L string? aha-the-secret-escape-to-ucs4-ly y'rs - tim From tim_one at email.msn.com Wed Jun 16 06:02:44 1999 From: tim_one at email.msn.com (Tim Peters) Date: Wed, 16 Jun 1999 00:02:44 -0400 Subject: [Python-Dev] String methods... finally In-Reply-To: <010201beb781$d1febf30$0801a8c0@bobcat> Message-ID: <000301beb7ad$1635c380$2ca22299@tim> [MarkH agonizes, over whether to auto-convert or not] Well, the rule *could* be that the result type is the widest string type among the separator and the sequences' string elements (if any), and other types convert to the result type along the way. I'd be more specific, except I'm not sure which flavor of string str() returns (or, indeed, whether that's up to each __str__ implementation). 
In any case, widening to Unicode should always be possible, and if "widest wins" it doesn't require a multi-pass algorithm regardless (although the partial result so far may need to be widened once -- but that's true even if auto-convert of non-string types isn't implemented). Or, IOW, sep.join([a, b, c]) == f(a) + sep + f(b) + sep + f(c) where I don't know how to spell f, but f(x) *means* x' = if x has a string type then x else x.__str__() return x' coerced to the widest string type seen so far So I think everyone can get what they want -- except that those who want auto-convert are at direct odds with those who prefer to wag Guido's fingers and go "tsk, tsk, we know what you want but you didn't say 'please' so your program dies" . master-of-fair-summaries-ly y'rs - tim From mal at lemburg.com Wed Jun 16 10:29:27 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Wed, 16 Jun 1999 10:29:27 +0200 Subject: [Python-Dev] String methods... finally References: <010201beb781$d1febf30$0801a8c0@bobcat> <199906152304.TAA05136@eric.cnri.reston.va.us> Message-ID: <37676067.62E272F4@lemburg.com> Guido van Rossum wrote: > > > Is there any sort of agreement that Python will use L"..." to denote > > Unicode strings? I would be happy with it. > > I don't know of any agreement, but it makes sense. The u"..." looks more intuitive too me. While inheriting C/C++ constructs usually makes sense I think usage in the C community is not that wide-spread yet and for a Python freak, the small u will definitely remind him of Unicode whereas the L will stand for (nearly) unlimited length/precision. Not that this is important, but... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 198 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fredrik at pythonware.com Wed Jun 16 11:53:23 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 11:53:23 +0200 Subject: [Python-Dev] String methods... finally References: <000401beb7ad$175193c0$2ca22299@tim> Message-ID: <00f701beb7de$cdb422f0$f29b12c2@pythonware.com> > > The \u escape takes up to 4 bytes > > Not in Java: it requires exactly 4 hex characters after == exactly 2 bytes, > and it's an error if it's followed by fewer than 4 hex characters. That's a > good rule (simple!), while ANSI C's is too clumsy to live with if people > want to take Unicode seriously. > > So what does it mean for a Unicode escape to appear in a non-L string? my suggestion is to store it as UTF-8; see the patches included in the unicode package for details. this also means that an u-string literal (L-string, whatever) could be stored as an 8-bit string internally. and that the following two are equivalent: string = u"foo" string = unicode("foo") also note that: unicode(str(u"whatever")) == u"whatever" ... on the other hand, this means that we have at least four major "arrays of bytes or characters" thingies mapped on two data types: the old string type is used for: -- plain old 8-bit strings (ascii, iso-latin-1, whatever) -- byte buffers containing arbitrary data -- unicode strings stored as 8-bit characters, using the UTF-8 encoding. and the unicode string type is used for: -- unicode strings stored as 16-bit characters is this reasonable? ... yet another question is how to deal with source code. is a python 1.6 source file written in ASCII, ISO Latin 1, or UTF-8. speaking from a non-us standpoint, it would be really cool if you could write Python sources in UTF-8... 
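To make "unicode strings stored as 8-bit characters, using the UTF-8 encoding" concrete, here is a throwaway encoder for code points below 0x10000. This is a sketch for illustration only, not the proposed implementation:

    import string

    def utf8(codepoints):
        # encode a list of 16-bit code points as a plain 8-bit Python string
        out = []
        for cp in codepoints:
            if cp < 0x80:                   # 1 byte:  0xxxxxxx
                out.append(chr(cp))
            elif cp < 0x800:                # 2 bytes: 110xxxxx 10xxxxxx
                out.append(chr(0xC0 | (cp >> 6)))
                out.append(chr(0x80 | (cp & 0x3F)))
            else:                           # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
                out.append(chr(0xE0 | (cp >> 12)))
                out.append(chr(0x80 | ((cp >> 6) & 0x3F)))
                out.append(chr(0x80 | (cp & 0x3F)))
        return string.join(out, '')

    # the copyright sign from Barry's Uni.java example, code point 0x00A9,
    # would be stored as the two 8-bit characters '\302\251'
    assert utf8([0x00A9]) == "\302\251"

ASCII text encodes to itself under UTF-8, which is what makes treating plain ASCII strings and UTF-8-encoded strings interchangeably so cheap.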
From gstein at lyra.org Wed Jun 16 12:13:45 1999 From: gstein at lyra.org (Greg Stein) Date: Wed, 16 Jun 1999 03:13:45 -0700 (PDT) Subject: [Python-Dev] mmap In-Reply-To: <199906152254.SAA05114@eric.cnri.reston.va.us> Message-ID: On Tue, 15 Jun 1999, Guido van Rossum wrote: > > Another topic: what are the chances of adding the mmap module to the core > > distribution? It's restricted to a smallish set of platforms (modern > > Unices and Win32, I think), but it's quite small, and would be a nice > > thing to have available in the core, IMHO. > > If it works on Linux, Solaris, Irix and Windows, and is reasonably > clean, I'll take it. Please send it. Actually, my preference is to see a change to open() rather than a whole new module. For example, let's say that you open a file, specifying memory-mapping. Then you create a buffer against that file: f = open('foo','rm') # 'm' means mem-map b = buffer(f) print b[100:200] Disclaimer: I haven't looked at the mmap modules (AMK's and Mark's) to see what capabilities are in there. They may not be expressable soly as open() changes. (adding add'l params for mmap flags might be another way to handle this) I'd like to see mmap native in Python. I won't push, though, until I can run a test to see what kind of savings will occur when you mmap a .pyc file and open PyBuffer objects against the thing for the code bytes. My hypothesis is that you can reduce the working set of Python (i.e. amortize the cost of a .pyc's code over several processes by mmap'ing it); this depends on the proportion of code in the pyc relative to "other" stuff. > > (btw, the buffer object needs more documentation) > > That's for Jack & Greg... Quite true. My bad :-( ... That would go into the API doc, I guess... I'll put this on a todo list, but it could be a little while. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik at pythonware.com Wed Jun 16 12:53:29 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 12:53:29 +0200 Subject: [Python-Dev] mmap References: Message-ID: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Greg wrote: > Actually, my preference is to see a change to open() rather than a whole > new module. For example, let's say that you open a file, specifying > memory-mapping. Then you create a buffer against that file: > > f = open('foo','rm') # 'm' means mem-map > b = buffer(f) > print b[100:200] > > Disclaimer: I haven't looked at the mmap modules (AMK's and Mark's) to see > what capabilities are in there. They may not be expressable soly as open() > changes. (adding add'l params for mmap flags might be another way to > handle this) > > I'd like to see mmap native in Python. I won't push, though, until I can > run a test to see what kind of savings will occur when you mmap a .pyc > file and open PyBuffer objects against the thing for the code bytes. My > hypothesis is that you can reduce the working set of Python (i.e. amortize > the cost of a .pyc's code over several processes by mmap'ing it); this > depends on the proportion of code in the pyc relative to "other" stuff. yes, yes, yes! my good friend the mad scientist (the guy who writes code, not the flaming cult-ridden brainwashed script kiddie) has considered writing a whole new "abstract file" backend, to entirely get rid of stdio in the Python core. some potential advantages: -- performance (some stdio implementations are slow) -- portability (stdio doesn't exist on some platforms!) -- opens up for cool extensions (memory mapping, pluggable file handlers, etc). 
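As a reminder of what the buffer interface buys at the Python level: read-only, slice-style access to an object's bytes without copying them. A tiny illustration, assuming the buffer() builtin that appeared in 1.5.2 (an mmap'ed file would expose the same slicing interface):

    data = "GIF89a" + "\0" * 100        # stand-in for a file's contents
    b = buffer(data)                    # read-only view, no copy of the bytes
    print len(b)                        # 106
    print b[0:6]                        # prints GIF89a; slices come back as strings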
should I tell him to start hacking? or is this the same thing as PyBuffer/buffer (I've implemented PyBuffer support for the unicode class, but that doesn't mean that I understand how it works...) PS. someone once told me that Perl goes "below" the standard file I/O system. does anyone here know if that's true, and per- haps even explain how they're doing that... From guido at CNRI.Reston.VA.US Wed Jun 16 14:19:10 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 16 Jun 1999 08:19:10 -0400 Subject: [Python-Dev] mmap In-Reply-To: Your message of "Wed, 16 Jun 1999 03:13:45 PDT." References: Message-ID: <199906161219.IAA05802@eric.cnri.reston.va.us> [me] > > If it works on Linux, Solaris, Irix and Windows, and is reasonably > > clean, I'll take it. Please send it. [Greg] > Actually, my preference is to see a change to open() rather than a whole > new module. For example, let's say that you open a file, specifying > memory-mapping. Then you create a buffer against that file: > > f = open('foo','rm') # 'm' means mem-map > b = buffer(f) > print b[100:200] Buh. Changes of this kind to builtins are painful, especially since we expect that this feature may or may not be supported. And imagine the poor reader who comes across this for the first time... What's wrong with import mmap f = mmap.open('foo', 'r') ??? > I'd like to see mmap native in Python. I won't push, though, until I can > run a test to see what kind of savings will occur when you mmap a .pyc > file and open PyBuffer objects against the thing for the code bytes. My > hypothesis is that you can reduce the working set of Python (i.e. amortize > the cost of a .pyc's code over several processes by mmap'ing it); this > depends on the proportion of code in the pyc relative to "other" stuff. We've been through this before. I still doubt it will help much. Anyway, it's a completely independent feature from making the mmap module(any mmap module) available to users. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Wed Jun 16 14:24:26 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 16 Jun 1999 08:24:26 -0400 Subject: [Python-Dev] mmap In-Reply-To: Your message of "Wed, 16 Jun 1999 12:53:29 +0200." <015b01beb7e6$79b61610$f29b12c2@pythonware.com> References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Message-ID: <199906161224.IAA05815@eric.cnri.reston.va.us> > my good friend the mad scientist (the guy who writes code, > not the flaming cult-ridden brainwashed script kiddie) has > considered writing a whole new "abstract file" backend, to > entirely get rid of stdio in the Python core. some potential > advantages: > > -- performance (some stdio implementations are slow) > -- portability (stdio doesn't exist on some platforms!) You have this backwards -- you'd have to port the abstract backend first! Also don't forget that a *good* stdio might be using all sorts of platform-specific tricks that you'd have to copy to match its performance. > -- opens up for cool extensions (memory mapping, > pluggable file handlers, etc). > > should I tell him to start hacking? Tcl/Tk does this. I see some advantages (e.g. you have more control over and knowledge of how much data is buffered) but also some disadvantages (more work to port, harder to use from C), plus tons of changes needed in the rest of Python. I'd say wait until Python 2.0 and let's keep stdio for 1.6. > PS. someone once told me that Perl goes "below" the standard > file I/O system. 
does anyone here know if that's true, and per- > haps even explain how they're doing that... Probably just means that they use the C equivalent of os.open() and friends. --Guido van Rossum (home page: http://www.python.org/~guido/) From gward at cnri.reston.va.us Wed Jun 16 14:25:34 1999 From: gward at cnri.reston.va.us (Greg Ward) Date: Wed, 16 Jun 1999 08:25:34 -0400 Subject: [Python-Dev] mmap In-Reply-To: <015b01beb7e6$79b61610$f29b12c2@pythonware.com>; from Fredrik Lundh on Wed, Jun 16, 1999 at 12:53:29PM +0200 References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Message-ID: <19990616082533.A4142@cnri.reston.va.us> On 16 June 1999, Fredrik Lundh said: > my good friend the mad scientist (the guy who writes code, > not the flaming cult-ridden brainwashed script kiddie) has > considered writing a whole new "abstract file" backend, to > entirely get rid of stdio in the Python core. some potential > advantages: [...] > PS. someone once told me that Perl goes "below" the standard > file I/O system. does anyone here know if that's true, and per- > haps even explain how they're doing that... My understanding (mainly from folklore -- peeking into the Perl source has been known to turn otherwise staid, solid programmers into raving lunatics) is that yes, Perl does grovel around in the internals of stdio implementations to wring a few extra cycles out. However, what's probably of more interest to you -- I mean your mad scientist alter ego -- is Perl's I/O abstraction layer: a couple of years ago, somebody hacked up Perl's guts to do basically what you're proposing for Python. The main result was a half-baked, unfinished (at least as of last summer, when I actually asked an expert in person at the Perl Conference) way of building Perl with AT&T's sfio library instead of stdio. I think the other things you mentioned, eg. more natural support for memory-mapped files, have also been bandied about as advantages of this scheme. The main problem with Perl's I/O abstraction layer is that extension modules now have to call e.g. PerlIO_open(), PerlIO_printf(), etc. in place of their stdio counterparts. Surprise surprise, many extension modules have not adapted to the new way of doing things, even though it's been in Perl since version 5.003 (I think). Even more surprisingly, the fourth-party C libraries that those extension modules often interface to haven't switched to using Perl's I/O abstraction layer. This doesn't make a whit of difference if Perl is built in either the "standard way" (no abstraction layer, just direct stdio) or with the abstraction layer on top of stdio. But as soon as some poor fool decides Perl on top of sfio would be neat, lots of extension modules break -- their I/O calls go nowhere. I'm sure there is some sneaky way to make it all work using sfio's binary compatibility layer and some clever macros. This might even have been done. However, AFAIK it's not been documented anywhere. This is not merely to bitch about unfinished business in the Perl core; it's to warn you that others have walked down the road you propose to tread, and there may be potholes. Now if the Python source really does get even more modularized for 1.6, you might have a much easier job of it. ("Modular" is not the word that jumps to mind when one looks at the Perl source code.) 
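As an aside, the descriptor-level calls Guido alludes to (os.open() and friends) already bypass stdio from Python today; any buffered abstraction layer would end up sitting on top of something like this:

    import os

    # descriptor-level, unbuffered I/O -- stdio never enters the picture
    fd = os.open("spam.txt", os.O_RDONLY)     # any existing file
    os.lseek(fd, 0, 0)                        # 0 == SEEK_SET
    data = os.read(fd, 512)                   # read up to 512 bytes
    os.close(fd)
    print len(data)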
Greg /* * "Far below them they saw the white waters pour into a foaming bowl, and * then swirl darkly about a deep oval basin in the rocks, until they found * their way out again through a narrow gate, and flowed away, fuming and * chattering, into calmer and more level reaches." */ -- Tolkein, by way of perl/doio.c -- Greg Ward - software developer gward at cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive voice: +1-703-620-8990 Reston, Virginia, USA 20191-5434 fax: +1-703-620-0913 From beazley at cs.uchicago.edu Wed Jun 16 15:23:32 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Wed, 16 Jun 1999 08:23:32 -0500 (CDT) Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> Message-ID: <199906161323.IAA28642@gargoyle.cs.uchicago.edu> Fredrik Lundh writes: > > my good friend the mad scientist (the guy who writes code, > not the flaming cult-ridden brainwashed script kiddie) has > considered writing a whole new "abstract file" backend, to > entirely get rid of stdio in the Python core. some potential > advantages: > > -- performance (some stdio implementations are slow) > -- portability (stdio doesn't exist on some platforms!) > -- opens up for cool extensions (memory mapping, > pluggable file handlers, etc). > > should I tell him to start hacking? > I am not in favor of obscuring Python's I/O model too much. When working with C extensions, it is critical to have access to normal I/O mechanisms such as 'FILE *' or integer file descriptors. If you hide all of this behind some sort of abstract I/O layer, it's going to make life hell for extension writers unless you also provide a way to get access to the raw underlying data structures. This is a major gripe I have with the Tcl channel model--namely, there seems to be no easy way to unravel a Tcl channel into a raw file-descriptor for use in C (unless I'm being dense and have missed some simple way to do it). Also, what platforms are we talking about here? I've never come across any normal machine that had a C compiler, but did not have stdio. Is this really a serious problem? Cheers, Dave From MHammond at skippinet.com.au Wed Jun 16 15:47:44 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 16 Jun 1999 23:47:44 +1000 Subject: [Python-Dev] mmap In-Reply-To: <19990616082533.A4142@cnri.reston.va.us> Message-ID: <011c01beb7fe$d213c600$0801a8c0@bobcat> [Greg writes] > The main problem with Perl's I/O abstraction layer is that extension > modules now have to call e.g. PerlIO_open(), PerlIO_printf(), etc. in > place of their stdio counterparts. Surprise surprise, many extension Interestingly, Python _nearly_ suffers this problem now. Although Python does use native FILE pointers, this scheme still assumes that Python and the extensions all use the same stdio. I understand that on most Unix system this can be taken for granted. However, to be truly cross-platform, this assumption may not be valid. A case in point is (surprise surprise :-) Windows. Windows has a number of C RTL options, and Python and its extensions must be careful to select the one that shares FILE * and the heap across separately compiled and linked modules. In-fact, Windows comes with an excellent debug version of the C RTL, but this gets in Python's way - if even one (but not all) Python extension attempts to use these debugging features, we die in a big way. and-dont-even-talk-to-me-about-Windows-CE ly, Mark. 
From bwarsaw at cnri.reston.va.us Wed Jun 16 16:42:01 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Wed, 16 Jun 1999 10:42:01 -0400 (EDT) Subject: [Python-Dev] String methods... finally References: <010201beb781$d1febf30$0801a8c0@bobcat> <199906152304.TAA05136@eric.cnri.reston.va.us> <37676067.62E272F4@lemburg.com> Message-ID: <14183.47033.656933.642197@anthem.cnri.reston.va.us> >>>>> "M" == M writes: M> The u"..." looks more intuitive too me. While inheriting C/C++ M> constructs usually makes sense I think usage in the C community M> is not that wide-spread yet and for a Python freak, the small u M> will definitely remind him of Unicode whereas the L will stand M> for (nearly) unlimited length/precision. I don't think I've every seen C code with L"..." strings in them. Here's my list in no particular order. U"..." -- reminds Java/JPython users of Unicode. Alternative mnemonic: Unamerican-strings L"..." -- long-strings, Lundh-strings, ... W"..." -- wide-strings, Warsaw-strings (just trying to take credit where credit's not due :), what-the-heck-are-these?-strings H"..." -- happy-strings, Hammond-strings, hey-you-just-made-my-extension-module-crash-strings F"..." -- funky-stuff-in-these-hyar-strings A"..." -- ain't-strings S"..." -- strange-strings, silly-strings M> Not that this is important, but... Agreed. -Barry From fredrik at pythonware.com Wed Jun 16 21:11:02 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 21:11:02 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <19990616082533.A4142@cnri.reston.va.us> Message-ID: <001901beb82b$fab54200$f29b12c2@pythonware.com> Greg Ward wrote: > This is not merely to bitch about unfinished business in the Perl core; > it's to warn you that others have walked down the road you propose to > tread, and there may be potholes. oh, the mad scientist have rushed down that road a few times before. we'll see if he's prepared to do that again; it sure won't happen before the unicode stuff is in place... From fredrik at pythonware.com Wed Jun 16 21:16:56 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 21:16:56 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161224.IAA05815@eric.cnri.reston.va.us> Message-ID: <004a01beb82e$36ba54a0$f29b12c2@pythonware.com> > > -- performance (some stdio implementations are slow) > > -- portability (stdio doesn't exist on some platforms!) > > You have this backwards -- you'd have to port the abstract backend > first! Also don't forget that a *good* stdio might be using all sorts > of platform-specific tricks that you'd have to copy to match its > performance. well, if the backend layer is good enough, I don't think a stdio-based standard version will be much slower than todays stdio-only implementation. > > PS. someone once told me that Perl goes "below" the standard > > file I/O system. does anyone here know if that's true, and per- > > haps even explain how they're doing that... > > Probably just means that they use the C equivalent of os.open() and > friends. hopefully. my original source described this as "digging around in the innards of the stdio package" (and so did greg). and the same source claimed it wasn't yet ported to Linux. sounds weird, to say the least, but maybe he referred to that sfio package greg mentioned. I'll do some digging, but not today. 
From fredrik at pythonware.com Wed Jun 16 21:27:02 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 21:27:02 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161323.IAA28642@gargoyle.cs.uchicago.edu> Message-ID: <004b01beb82e$36d44540$f29b12c2@pythonware.com> David Beazley wrote: > I am not in favor of obscuring Python's I/O model too much. When > working with C extensions, it is critical to have access to normal I/O > mechanisms such as 'FILE *' or integer file descriptors. If you hide > all of this behind some sort of abstract I/O layer, it's going to make > life hell for extension writers unless you also provide a way to get > access to the raw underlying data structures. This is a major gripe > I have with the Tcl channel model--namely, there seems to be no easy > way to unravel a Tcl channel into a raw file-descriptor for use in C > (unless I'm being dense and have missed some simple way to do it). > > Also, what platforms are we talking about here? I've never come > across any normal machine that had a C compiler, but did not have stdio. > Is this really a serious problem? in a way, it is a problem today under Windows (in other words, on most of the machines where Python is used today). it's very easy to end up with different DLL's using different stdio implementations, resulting in all kinds of strange errors. a rewrite could use OS-level handles instead, and get rid of that problem. not to mention Windows CE (iirc, Mark had to write his own stdio-ish package for the CE port), maybe PalmOS, BeOS's BFile's, and all the other upcoming platforms which will make Windows look like a fairly decent Unix clone ;-) ... and in Python, any decent extension writer should write code that works with arbitrary file objects, right? "if it cannot deal with StringIO objects, it's broken"... From beazley at cs.uchicago.edu Wed Jun 16 21:53:23 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Wed, 16 Jun 1999 14:53:23 -0500 (CDT) Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161323.IAA28642@gargoyle.cs.uchicago.edu> <004b01beb82e$36d44540$f29b12c2@pythonware.com> Message-ID: <199906161953.OAA04527@gargoyle.cs.uchicago.edu> Fredrik Lundh writes: > > and in Python, any decent extension writer should write > code that works with arbitrary file objects, right? "if it > cannot deal with StringIO objects, it's broken"... I disagree. Given that a lot of people use Python as a glue language for interfacing with legacy codes, it is unacceptable for extensions to be forced to use some sort of funky non-standard I/O abstraction. Unless you are volunteering to rewrite all of these codes to use the new I/O model, you are always going to need access (in one way or another) to plain old 'FILE *' and integer file descriptors. Of course, one can always just provide a function like FILE *PyFile_AsFile(PyObject *o) That takes an I/O object and returns a 'FILE *' where supported. (Of course, if it's not supported, then it doesn't matter if this function is missing since any extension that needs a 'FILE *' wouldn't work anyways). 
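From the pure Python side, the "arbitrary file objects" point is easy to demonstrate: anything with the right methods can stand in for a file, which is exactly what an extension hard-wired to FILE * can never accept. A small sketch:

    from StringIO import StringIO

    def write_report(f):
        # written against "any file-like object": all it needs is write()
        f.write("total: 42\n")
        f.write("errors: 0\n")

    out = StringIO()            # in-memory stand-in for a real file
    write_report(out)
    print out.getvalue()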
Cheers, Dave From fredrik at pythonware.com Wed Jun 16 22:04:54 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 16 Jun 1999 22:04:54 +0200 Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com><199906161323.IAA28642@gargoyle.cs.uchicago.edu><004b01beb82e$36d44540$f29b12c2@pythonware.com> <199906161953.OAA04527@gargoyle.cs.uchicago.edu> Message-ID: <009d01beb833$80d15d40$f29b12c2@pythonware.com> > > and in Python, any decent extension writer should write > > code that works with arbitrary file objects, right? "if it > > cannot deal with StringIO objects, it's broken"... > > I disagree. Given that a lot of people use Python as a glue language > for interfacing with legacy codes, it is unacceptable for extensions > to be forced to use some sort of funky non-standard I/O abstraction. oh, you're right, of course. should have added that extra smiley to that last line. cut and paste from this mail if necessary: ;-) > Unless you are volunteering to rewrite all of these codes to use the > new I/O model, you are always going to need access (in one way or > another) to plain old 'FILE *' and integer file descriptors. Of > course, one can always just provide a function like > > FILE *PyFile_AsFile(PyObject *o) > > That takes an I/O object and returns a 'FILE *' where supported. exactly my idea. when scanning the code, PyFile_AsFile immediately popped up as a potential pothole (if you need the fileno, there's already a method for that in the "standard file object interface"). btw, an "abstract file object" could actually make it much easier to support arbitrary file objects from C/C++ extensions. just map the calls back to Python. or add a tp_file slot, and things get really interesting... > (Of course, if it's not supported, then it doesn't matter if this > function is missing since any extension that needs a 'FILE *' wouldn't > work anyways). yup. I suspect some legacy code may have a hard time running under CE et al. but of course, with a little macro trickery, no- thing stops you from recompiling such code so it uses Python's new "abstract file... okay, okay, I'll stop now ;-) From beazley at cs.uchicago.edu Wed Jun 16 22:13:42 1999 From: beazley at cs.uchicago.edu (David Beazley) Date: Wed, 16 Jun 1999 15:13:42 -0500 (CDT) Subject: [Python-Dev] mmap References: <015b01beb7e6$79b61610$f29b12c2@pythonware.com> <199906161323.IAA28642@gargoyle.cs.uchicago.edu> <004b01beb82e$36d44540$f29b12c2@pythonware.com> <199906161953.OAA04527@gargoyle.cs.uchicago.edu> <009d01beb833$80d15d40$f29b12c2@pythonware.com> Message-ID: <199906162013.PAA04781@gargoyle.cs.uchicago.edu> Fredrik Lundh writes: > > > and in Python, any decent extension writer should write > > > code that works with arbitrary file objects, right? "if it > > > cannot deal with StringIO objects, it's broken"... > > > > I disagree. Given that a lot of people use Python as a glue language > > for interfacing with legacy codes, it is unacceptable for extensions > > to be forced to use some sort of funky non-standard I/O abstraction. > > oh, you're right, of course. should have added that extra smiley > to that last line. cut and paste from this mail if necessary: ;-) > Good. You had me worried there for a second :-). > > yup. I suspect some legacy code may have a hard time running > under CE et al. but of course, with a little macro trickery, no- > thing stops you from recompiling such code so it uses Python's > new "abstract file... okay, okay, I'll stop now ;-) Macro trickery? 
Oh yes, we could use that too... (one can never have too much macro trickery if you ask me :-) Cheers, Dave From arw at ifu.net Thu Jun 17 16:12:16 1999 From: arw at ifu.net (Aaron Watters) Date: Thu, 17 Jun 1999 10:12:16 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] Message-ID: <37690240.66F601E1@ifu.net> > no-positive-suggestions-just-grousing-ly y'rs - tim On the contrary. I think this is definitively a bad idea. Retracted. A double negative is a positive. -- Aaron Watters === "Criticism serves the same purpose as pain. It's not pleasant but it suggests that something is wrong." -- Churchill (paraphrased from memory) From da at ski.org Thu Jun 17 19:50:20 1999 From: da at ski.org (David Ascher) Date: Thu, 17 Jun 1999 10:50:20 -0700 (Pacific Daylight Time) Subject: [Python-Dev] org.python.org Message-ID: Not all that revolutionary, but an interesting migration path. FWIW, I think the underlying issue is a real one. We're starting to have more and more conflicts, even among package names. (Of course the symlink solution doesn't work on Win32, but that's a detail =). --david ---------- Forwarded message ---------- Date: Thu, 17 Jun 1999 13:44:33 -0400 (EDT) From: Andy Dustman To: Gordon McMillan Cc: M.-A. Lemburg , Crew List Subject: Re: [Crew] Wizards' Resolution to Zope/PIL/mxDateTime conflict? On Thu, 17 Jun 1999, Gordon McMillan wrote: > M.A.L. wrote: > > > Or maybe we should start the com.domain.mypackage thing ASAP. > > I know many are against this proposal (makes Python look Feudal? > Reminds people of the J language?), but I think it's the only thing > that makes sense. It does mean you have to do some ugly things to get > Pickle working properly. Actually, it can be done very easily. I just tried this, in fact: cd /usr/lib/python1.5 mkdir -p org/python (cd org/python; ln -s ../.. core) touch __init__.py org/__init__.py org/python/__init__.py >>> from org.python.core import rfc822 >>> import profile So this seems to make things nice and backwards compatible. My only concern was having __init__.py in /usr/lib/python1.5, but this doesn't seem to break anything. Of course, if you are using some trendy new atrocity like Windoze, this might not work. -- andy dustman | programmer/analyst | comstar communications corporation telephone: 770.485.6025 / 706.549.7689 | icq: 32922760 | pgp: 0xc72f3f1d _______________________________________________ Crew maillist - Crew at starship.python.net http://starship.python.net/mailman/listinfo/crew From gmcm at hypernet.com Thu Jun 17 21:36:49 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Thu, 17 Jun 1999 14:36:49 -0500 Subject: [Python-Dev] org.python.org In-Reply-To: Message-ID: <1282474031-114884629@hypernet.com> David forwards from Starship Crew list: > Not all that revolutionary, but an interesting migration path. > FWIW, I think the underlying issue is a real one. We're starting to > have more and more conflicts, even among package names. (Of course > the symlink solution doesn't work on Win32, but that's a detail =). > > --david > > ---------- Forwarded message ---------- > Date: Thu, 17 Jun 1999 13:44:33 -0400 (EDT) > From: Andy Dustman > To: Gordon McMillan > Cc: M.-A. Lemburg , Crew List > Subject: Re: [Crew] Wizards' Resolution to > Zope/PIL/mxDateTime conflict? > > On Thu, 17 Jun 1999, Gordon McMillan wrote: > > > M.A.L. wrote: > > > > > Or maybe we should start the com.domain.mypackage thing ASAP. > > > > I know many are against this proposal (makes Python look Feudal? 
> > Reminds people of the J language?), but I think it's the only thing > > that makes sense. It does mean you have to do some ugly things to get > > Pickle working properly. > > Actually, it can be done very easily. I just tried this, in fact: > > cd /usr/lib/python1.5 > mkdir -p org/python > (cd org/python; ln -s ../.. core) > touch __init__.py org/__init__.py org/python/__init__.py > > >>> from org.python.core import rfc822 > >>> import profile > > So this seems to make things nice and backwards compatible. My only > concern was having __init__.py in /usr/lib/python1.5, but this > doesn't seem to break anything. Of course, if you are using some > trendy new atrocity like Windoze, this might not work. In vanilla cases it's backwards compatible. I try packag-izing almost everything I install. Sometimes it works, sometimes it doesn't. In your example, rfc822 uses only builtins at the top level. It's main will import os. Would that work if os lived in org.python.core? Though I really don't think we need to packagize the std distr, (if that happens, I would think it would be for a different reason). The 2 main problems I run across in packagizing things are intra-package imports (where M.A.L's proposal for relative names in dotted imports might ease the pain) and Pickle / cPickle (where the ugliness of the workarounds has often made me drop back to marshal). - Gordon From MHammond at skippinet.com.au Fri Jun 18 10:31:21 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Fri, 18 Jun 1999 18:31:21 +1000 Subject: [Python-Dev] Merge the string_methods tag? Message-ID: <015601beb964$f37a4fa0$0801a8c0@bobcat> Ive been running the string_methods tag (term?) under CVS for quite some time now, and it seems to work perfectly. I admit that I havent stressed the string methods much, but I feel confident that Barry's patches havent broken existing string code. Also, I find using that tag with CVS a bit of a pain. A few updates have been checked into the main branch, and you tend to miss these (its a pity CVS can't be told "only these files are affected by this tag, so the rest should follow the main branch." I know I can do that personally, but that means I personally need to know all files possibly affected by the branch.) Anyway, I digress... I propose that these extensions be merged into the main branch. The main advantage is that we force more people to bash on it, rather than allowing them to make that choice . If the Unicode type is also considered highly experimental, we can make a new tag for that change, but that is really quite independant of the string methods. Mark. From fredrik at pythonware.com Fri Jun 18 10:56:47 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 18 Jun 1999 10:56:47 +0200 Subject: [Python-Dev] cvs problems References: <015601beb964$f37a4fa0$0801a8c0@bobcat> Message-ID: <001d01beb968$7fd47540$f29b12c2@pythonware.com> maybe not the right forum, but I suppose everyone here is using CVS, so... ...could anyone explain why I keep getting this error? $ cvs -z6 up -P -d ... cvs server: Updating dist/src/Tools/ht2html cvs [server aborted]: cannot open directory /projects/cvsroot/python/dist/src/Tools/ht2html: No such file or directory it used to work... From tismer at appliedbiometrics.com Fri Jun 18 11:47:15 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 18 Jun 1999 11:47:15 +0200 Subject: [Python-Dev] Flat Python in Linux Weekly Message-ID: <376A15A3.3968EADE@appliedbiometrics.com> Howdy, Who would have thought this... 
Linux Weekly took notice. http://lwn.net/bigpage.phtml derangedly yours - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From mal at lemburg.com Fri Jun 18 12:05:52 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 18 Jun 1999 12:05:52 +0200 Subject: [Python-Dev] Relative package imports Message-ID: <376A1A00.3099DE99@lemburg.com> Although David has already copy-posted a message regarding this issue to the list, I would like to restate the problem to get a discussion going (and then maybe take it to c.l.p for general flaming ;). The problem we have run into on starship is that some well-known packages have introduced naming conflicts leading to the unfortunate situation that they can't be all installed on the same default path: 1. Zope has a module named DateTime which also is the base name of the package mxDateTime. 2. Both Zope and PIL have a top-level module named ImageFile.py (different ones of course). Now the problem is how to resolve these issues. One possibility is turning Zope and PIL into proper packages altogether. To ease this transition, one would need a way to specify relative intra-package imports and a way to tell pickle where to look for modules/packages. The next problem we'd probably run into sooner or later is that there are quite a few useful top-level modules with generic names that will conflict with package names and other modules with the same name. I guess we'd need at least three things to overcome this situation once and for all ;-): 1. Provide a way to do relative imports, e.g. a single dot could be interpreted as "parent package": modA.py modD.py [A] modA.py modB.py [B] modC.py modD.py In modC.py: from modD import * (works as usual: import A.B.modD) from .modA import * (imports A.modA) from ..modA import * (import the top-level modA) 2. Establish a general vendor based naming scheme much like the one used in the Java world: from org.python.core import time,os,string from org.zope.core import * from com.lemburg import DateTime from com.pythonware import PIL 3. Add a way to prevent double imports of the same file. This is the mayor gripe I have with pickle currently, because intra- package imports often lead to package modules being imported twice leading to many strange problems (e.g. splitting class hierarchies, problems with isinstance() and issubclass(), etc.), e.g. from org.python.core import UserDict u = UserDict.UserDict() import UserDict v = UserDict.UserDict() Now u and v will point to two different classes: >>> u.__class__ >>> v.__class__ 4. Add some kind of redirection or lookup hook to pickle et al. so that imports done during unpickling can be redirected to the correct (possibly renamed) package. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 196 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fredrik at pythonware.com Fri Jun 18 12:47:49 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 18 Jun 1999 12:47:49 +0200 Subject: [Python-Dev] Flat Python in Linux Weekly References: <376A15A3.3968EADE@appliedbiometrics.com> Message-ID: <001901beb978$0312a440$f29b12c2@pythonware.com> flat eric, flat beat, flat python? 
http://www.flateric-online.de (best viewed through babelfish.altavista.com, of course ;-) should-flat-eric-in-the-routeroute-route-along-ly yrs /F From fredrik at pythonware.com Fri Jun 18 12:51:21 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri, 18 Jun 1999 12:51:21 +0200 Subject: [Python-Dev] Relative package imports References: <376A1A00.3099DE99@lemburg.com> Message-ID: <001f01beb978$8177aab0$f29b12c2@pythonware.com> > 2. Both Zope and PIL have a top-level module named ImageFile.py > (different ones of course). > > Now the problem is how to resolve these issues. One possibility > is turning Zope and PIL into proper packages altogether. To > ease this transition, one would need a way to specify relative > intra-package imports and a way to tell pickle where to look > for modules/packages. fwiw, PIL 1.0b1 can already be used as a package, but you have to explicitly import the file format handlers you need: from PIL import Image import PIL.GifImagePlugin import PIL.PngImagePlugin import PIL.JpegImagePlugin etc. this has been fixed in PIL 1.0 final. From guido at CNRI.Reston.VA.US Fri Jun 18 16:51:16 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 18 Jun 1999 10:51:16 -0400 Subject: [Python-Dev] Merge the string_methods tag? In-Reply-To: Your message of "Fri, 18 Jun 1999 18:31:21 +1000." <015601beb964$f37a4fa0$0801a8c0@bobcat> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> Message-ID: <199906181451.KAA11549@eric.cnri.reston.va.us> > Ive been running the string_methods tag (term?) under CVS for quite some > time now, and it seems to work perfectly. I admit that I havent stressed > the string methods much, but I feel confident that Barry's patches havent > broken existing string code. > > Also, I find using that tag with CVS a bit of a pain. A few updates have > been checked into the main branch, and you tend to miss these (its a pity > CVS can't be told "only these files are affected by this tag, so the rest > should follow the main branch." I know I can do that personally, but that > means I personally need to know all files possibly affected by the branch.) > Anyway, I digress... > > I propose that these extensions be merged into the main branch. The main > advantage is that we force more people to bash on it, rather than allowing > them to make that choice . If the Unicode type is also considered > highly experimental, we can make a new tag for that change, but that is > really quite independant of the string methods. Hmm... This would make it hard to make a patch release for 1.5.2 (possible called 1.5.3?). I *really* don't want the string methods to end up in a release yet -- there are too many rough edges (e.g. some missing methods, should join str() or not, etc.). I admit that managing CVS branches is painful. We may find that it works better to create a branch for patch releases and to do all new development on the main release... But right now I don't want to change anything yet. In any case Barry just went on vacation so we'll have to wait 10 days... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Fri Jun 18 16:55:45 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 18 Jun 1999 10:55:45 -0400 Subject: [Python-Dev] cvs problems In-Reply-To: Your message of "Fri, 18 Jun 1999 10:56:47 +0200." 
<001d01beb968$7fd47540$f29b12c2@pythonware.com> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <001d01beb968$7fd47540$f29b12c2@pythonware.com> Message-ID: <199906181455.KAA11564@eric.cnri.reston.va.us> > maybe not the right forum, but I suppose everyone > here is using CVS, so... > > ...could anyone explain why I keep getting this error? > > $ cvs -z6 up -P -d > ... > cvs server: Updating dist/src/Tools/ht2html > cvs [server aborted]: cannot open directory /projects/cvsroot/python/dist/src/Tools/ht2html: No such > file or directory > > it used to work... EXPLANATION: For some reason that directory existed on the mirror server but not in the master CVS tree repository. It was created once but quickly deleted -- not quickly enough apparently to prevent it to leak to the slave. Then we did a global resync from the master to the mirror and that wiped out the mirror version. Good riddance. FIX: Edit Tools/CVS/Entries and delete the line that mentions ht2html, then do another cvs update. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Fri Jun 18 17:41:54 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 18 Jun 1999 11:41:54 -0400 Subject: [Python-Dev] cvs problems In-Reply-To: <001d01beb968$7fd47540$f29b12c2@pythonware.com> Message-ID: <000901beb9a1$179d2380$b79e2299@tim> [/F] > ...could anyone explain why I keep getting this error? > > $ cvs -z6 up -P -d > ... > cvs server: Updating dist/src/Tools/ht2html > cvs [server aborted]: cannot open directory > /projects/cvsroot/python/dist/src/Tools/ht2html: No such > file or directory > > it used to work... It stopped working a week ago Thursday, and Guido & Barry know about it. The directory in question vanished from the server under mysterious circumstances. You can get going again by deleting the ht2html line in your local Tools/CVS/Entries file. From da at ski.org Fri Jun 18 19:09:27 1999 From: da at ski.org (David Ascher) Date: Fri, 18 Jun 1999 10:09:27 -0700 (Pacific Daylight Time) Subject: [Python-Dev] automatic wildcard expansion on Win32 Message-ID: A python-help poster finally convinced me that there was a way to enable automatic wildcard expansion on win32. This is done by linking in "setargv.obj" along with all of the other MS libs. Quick testing shows that it works. Is this a feature we want to add? I can see both sides of that coin. --david PS: I saw a RISKS digest posting last week which had a horror story about wildcard expansion on some flavor of Windows. The person had two files with long filenames: verylongfile1.txt and verylongfile2.txt But Win32 stored them in 8.3 format, so they were stored as verylo~2.txt and verylo~1.txt (Yes, the 1 and 2 were swapped!). So when he did del *1.txt he removed the wrong file. Neat, eh? (This is actually relevant -- it's possible that setargv.obj and glob.glob could give different answers). --david From guido at CNRI.Reston.VA.US Fri Jun 18 20:09:29 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Fri, 18 Jun 1999 14:09:29 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Your message of "Fri, 18 Jun 1999 10:09:27 PDT." References: Message-ID: <199906181809.OAA12090@eric.cnri.reston.va.us> > A python-help poster finally convinced me that there was a way to enable > automatic wildcard expansion on win32. This is done by linking in > "setargv.obj" along with all of the other MS libs. Quick testing shows > that it works. > > Is this a feature we want to add? I can see both sides of that coin. 
I don't see big drawbacks except minor b/w compat problems. Should it be done for both python.exe and pythonw.exe? --Guido van Rossum (home page: http://www.python.org/~guido/) From da at ski.org Fri Jun 18 22:06:09 1999 From: da at ski.org (David Ascher) Date: Fri, 18 Jun 1999 13:06:09 -0700 (Pacific Daylight Time) Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: <199906181809.OAA12090@eric.cnri.reston.va.us> Message-ID: On Fri, 18 Jun 1999, Guido van Rossum wrote: > I don't see big drawbacks except minor b/w compat problems. > > Should it be done for both python.exe and pythonw.exe? Sure. From MHammond at skippinet.com.au Sat Jun 19 02:56:42 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Sat, 19 Jun 1999 10:56:42 +1000 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Message-ID: <016e01beb9ee$99e1a710$0801a8c0@bobcat> > A python-help poster finally convinced me that there was a > way to enable > automatic wildcard expansion on win32. This is done by linking in > "setargv.obj" along with all of the other MS libs. Quick > testing shows > that it works. This has existed since I have been using C on Windows. I personally would vote against it. AFAIK, common wisdom on Windows is to not use this. Indeed, if people felt that this behaviour was an improvement, MS would have enabled it by default at some stage over the last 10 years it has existed, and provided a way of disabling it! This behaviour causes subtle side effects; effects Unix users are well aware of, due to every single tool using it. Do the tricks needed to get the wildcard down to the program exist? Will any windows users know what they are? IMO, Windows "fixed" the Unix behaviour by dropping this, and they made a concession to die-hards by providing a rarely used way of enabling it. Windows C programmers dont expect it, VB programmers dont expect it, even batch file programmers dont expect it. I dont think we should use it. > (This is actually relevant -- it's possible that setargv.obj > and glob.glob > could give different answers). Exactly. As may win32api.FindFiles(). Give the user the wildcard, and let them make sense of it. The trivial case of using glob() is so simple I dont believe it worth hiding. Your horror story of the incorrect file being deleted could then only be blamed on the application, not on Python! Mark. From tim_one at email.msn.com Sat Jun 19 03:00:46 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 18 Jun 1999 21:00:46 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Message-ID: <000501beb9ef$2ac61720$a69e2299@tim> [David Ascher] > A python-help poster finally convinced me that there was a way to enable > automatic wildcard expansion on win32. This is done by linking in > "setargv.obj" along with all of the other MS libs. Quick testing shows > that it works. > > Is this a feature we want to add? I can see both sides of that coin. The only real drawback I see is that we're then under some obligation to document Python's behavior. Which is then inherited from the MS setargv.obj, which is in turn only partially documented in developer-only docs, and incorrectly documented at that. > PS: I saw a RISKS digest posting last week which had a horror story about > wildcard expansion on some flavor of Windows. 
The person had two files > with long filenames: > > verylongfile1.txt > and > verylongfile2.txt > > But Win32 stored them in 8.3 format, so they were stored as > verylo~2.txt > and > verylo~1.txt > > (Yes, the 1 and 2 were swapped!). So when he did > > del *1.txt > > he removed the wrong file. Neat, eh? > > (This is actually relevant -- it's possible that setargv.obj and > glob.glob could give different answers). Yes, and e.g. it works this way under Win95: D:\Python>dir *~* Volume in drive D is DISK1PART2 Volume Serial Number is 1DFF-0F59 Directory of D:\Python PYCLBR~1 PAT 5,765 06-07-99 11:41p pyclbr.patch KJBUCK~1 PYD 34,304 03-31-98 3:07a kjbuckets.pyd WIN32C~1 05-16-99 12:10a win32comext PYTHON~1 05-16-99 12:10a Pythonwin TEXTTO~1 01-15-99 11:35p TextTools UNWISE~1 EXE 109,056 07-03-97 8:35a UnWisePW32.exe 3 file(s) 149,125 bytes 3 dir(s) 1,502,511,104 bytes free Here's the same thing in an argv-spewing console app whipped up to link setargv.obj: D:\Python>garp\debug\garp *~* 0: D:\PYTHON\GARP\DEBUG\GARP.EXE 1: kjbuckets.pyd 2: pyclbr.patch 3: Pythonwin 4: TextTools 5: UnWisePW32.exe 6: win32comext D:\Python> setargv.obj is apparently consistent with what native wildcard expansion does (although you won't find that promise made anywhere!), and it's definitely surprising in the presence of non-8.3 names. The quoting rules too are impossible to explain, seemingly random: D:\Python>garp\debug\garp "\\a\\" 0: D:\PYTHON\GARP\DEBUG\GARP.EXE 1: \\a\ D:\Python> Before I was on the Help list, I used to believe it would work to just say "well, it does what Windows does" . magnification-of-ignorance-ly y'rs - tim From tim_one at email.msn.com Sat Jun 19 03:26:42 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 18 Jun 1999 21:26:42 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: <016e01beb9ee$99e1a710$0801a8c0@bobcat> Message-ID: <000701beb9f2$c95b9880$a69e2299@tim> [MarkH, with *the* killer argument <0.3 wink>] > Your horror story of the incorrect file being deleted could then > only be blamed on the application, not on Python! Sold! Some years ago in the Perl world, they solved this by making regular old perl.exe not expand wildcards on Windows, but also supplying perlglob.exe which did. Don't know what they're doing today, but they apparently changed their minds at least once, as the couple-years-old version of perl.exe on my machine does do wildcard expansion, and does the wrong (i.e., the Windows ) thing. screw-it-ly y'rs - tim From tim_one at email.msn.com Sat Jun 19 20:45:16 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sat, 19 Jun 1999 14:45:16 -0400 Subject: [Python-Dev] stackable ints [stupid idea (ignore) :v] In-Reply-To: <199906101411.KAA29962@eric.cnri.reston.va.us> Message-ID: <000801beba83$df719e80$c49e2299@tim> Backtracking: [Aaron] > I've always considered it a major shame that Python ints and floats > and chars and stuff have anything to do with dynamic allocation ... [Guido] > What you're describing is very close to what I recall I once read > about the runtime organization of Icon. Perl may also use a variant > on this (it has fixed-length object headers). ... I've rarely been able to make sense of Perl's source code, but gave it another try anyway. An hour later I gave up unenlightened, so cruised the web. Turns out there's a *terrific* writeup of Perl's type representation at: http://home.sol.no/~aas/perl/guts/ Pictures and everything . 
Header is 3 words: An 8-bit "type" field, 24 baffling flag bits (e.g., flag #14 is "BREAK -- refcnt is artificially low"(!)), 32 refcount bits, and a 32-bit pointer field. Appears that the pointer field is always a real (although possibly NULL) pointer. Plain ints have type code SvIV, and the pointer then points to a bogus address, but where that address + 3 words points to the actual integer value. Why? Because then they can use the same offset to get to the int as when the type is SvPVIV, which is the combined string/integer type, and needs three words (to point to the string start address, current len and allocated len) in addition to the integer value at the end. So why is the integer value at the end? So the same offsets work for the SvPV type, which is solely a string descriptor. So why is it important that SvPVIV, SvPV and SvIV all have the same layout? So that either of the latter types can be dynamically "upgraded" to SvPVIV (when a string is converted to int or vice versa; Perl then holds on to both representations internally) by plugging in a new type code and fiddling some of the baffling flag bits. Brr. I have no idea how they manage to keep Perl running! and-not-entirely-sure-that-they-do-ly y'rs - tim From mal at lemburg.com Mon Jun 21 11:54:50 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 21 Jun 1999 11:54:50 +0200 Subject: [Python-Dev] Relative package imports References: <376A1A00.3099DE99@lemburg.com> Message-ID: <376E0BEA.60F22945@lemburg.com> It seems that there is not much interest in the topic... I'll be offline for the next two weeks -- maybe someone could pick the thread up and toss it around a bit while I'm away. Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 193 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From MHammond at skippinet.com.au Mon Jun 21 13:23:34 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Mon, 21 Jun 1999 21:23:34 +1000 Subject: [Python-Dev] Relative package imports In-Reply-To: <376E0BEA.60F22945@lemburg.com> Message-ID: <000501bebbd8$80f56b10$0801a8c0@bobcat> > It seems that there is not much interest in the topic... > > I'll be offline for the next two weeks -- maybe someone could > pick the thread up and toss it around a bit while I'm away. OK - here are my 2c on it: Unless I am mistaken, this problem could be solved with 2 steps: * Code moves to Python packages. * The standard Python library move to a package. If all non-trivial Python program used packages, and some agreement on a standard namespace could be met, I think it would be addressed. There was a thread on the newsgroup about the potential naming of the standard library. You did state as much in your proposal - indeed, you state "to ease the transition". Personally, I dont think it is worth it, mainly because we end up with a half-baked scheme purely for the transition, but one that can never be removed. To me, the question is one of: * Why arent Zope/PIL capable of being used as packages. * If they are (as I understand to be the case) why do people choose not to use them as such, or why do the authors not recommend this? * Is there a deficiency in the package scheme that makes it hard to use? Eg, should "__" that ni used for the parent package be reinstated? Mark. 
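One stop-gap for the double-import and pickle problems described above is to alias the old top-level module name in sys.modules once the module has moved into a package. The sketch below borrows MAL's hypothetical org.python.core naming purely for illustration; it is only a transition trick, not a substitute for relative imports or a vendor naming scheme:

    import sys
    from org.python.core import UserDict   # module now lives inside a package
    sys.modules['UserDict'] = UserDict      # alias the old top-level name

    import UserDict                         # finds the alias in sys.modules
    # Both names now refer to a single module object, so UserDict.UserDict
    # is one class (no split hierarchies, isinstance() keeps working), and
    # unpickling instances pickled under the old name still resolves to it.
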
From fredrik at pythonware.com Mon Jun 21 14:41:27 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 21 Jun 1999 14:41:27 +0200 Subject: [Python-Dev] Relative package imports References: <000501bebbd8$80f56b10$0801a8c0@bobcat> Message-ID: <006501bebbe3$6189e570$f29b12c2@pythonware.com> Mark Hammond wrote: > * Why arent Zope/PIL capable of being used as packages. PIL can be used as a package ("from PIL import Image"), assuming that it's installed under a directory in your path. there's one pro- blem in 1.0b1, though: you have to explicitly import the file format handlers you need: import PIL.JpegImagePlugin import PIL.PngImagePlugin this has been fixed in 1.0 final. > * If they are (as I understand to be the case) why do people choose not to > use them as such, or why do the authors not recommend this? inertia, and compatibility concerns. we've decided that all official material related to PIL 1.0 will use the old syntax (and all 1.X releases will be possible to install using the PIL.pth approach). too many users out there... now, PIL 2.0 is a completely different thing... > * Is there a deficiency in the package scheme that makes it hard to use? not that I'm aware... From mal at lemburg.com Mon Jun 21 16:36:58 1999 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 21 Jun 1999 16:36:58 +0200 Subject: [Python-Dev] Relative package imports References: <000501bebbd8$80f56b10$0801a8c0@bobcat> Message-ID: <376E4E0A.3B714BAB@lemburg.com> Mark Hammond wrote: > > > It seems that there is not much interest in the topic... > > > > I'll be offline for the next two weeks -- maybe someone could > > pick the thread up and toss it around a bit while I'm away. > > OK - here are my 2c on it: > > Unless I am mistaken, this problem could be solved with 2 steps: > * Code moves to Python packages. > * The standard Python library move to a package. > > If all non-trivial Python program used packages, and some agreement on a > standard namespace could be met, I think it would be addressed. There was > a thread on the newsgroup about the potential naming of the standard > library. > > You did state as much in your proposal - indeed, you state "to ease the > transition". Personally, I dont think it is worth it, mainly because we > end up with a half-baked scheme purely for the transition, but one that can > never be removed. With "easing the transition" I ment introducing a way to do relative package imports: you don't need relative imports if you can be sure that the package name will never change (with a fixed naming scheme, a la com.domain.product.package...). The smarter import mechanism is needed to work-around the pickle problems you face (because pickle uses absolute package names). > To me, the question is one of: > > * Why arent Zope/PIL capable of being used as packages. > * If they are (as I understand to be the case) why do people choose not to > use them as such, or why do the authors not recommend this? > * Is there a deficiency in the package scheme that makes it hard to use? > Eg, should "__" that ni used for the parent package be reinstated? I guess this would help a great deal; although I'd personally wouldn't like yet another underscore in the language. Simply leave the name empty as in '.submodule' or '..subpackage.submodule'. 
Cheers, -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 193 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at CNRI.Reston.VA.US Tue Jun 22 00:44:24 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 21 Jun 1999 18:44:24 -0400 Subject: [Python-Dev] automatic wildcard expansion on Win32 In-Reply-To: Your message of "Fri, 18 Jun 1999 21:26:42 EDT." <000701beb9f2$c95b9880$a69e2299@tim> References: <000701beb9f2$c95b9880$a69e2299@tim> Message-ID: <199906212244.SAA18866@eric.cnri.reston.va.us> > Some years ago in the Perl world, they solved this by making regular old > perl.exe not expand wildcards on Windows, but also supplying perlglob.exe > which did. This seems a reasonable way out. Just like we have pythonw.exe, we could add pythong.exe and pythongw.exe (or pythonwg.exe?). I guess it's time for a README.txt file to be installed explaining all the different executables... By default the g versions would not be used unless invoked explicitly. --Guido van Rossum (home page: http://www.python.org/~guido/) From Vladimir.Marangozov at inrialpes.fr Thu Jun 24 14:23:48 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Thu, 24 Jun 1999 14:23:48 +0200 (DFT) Subject: [Python-Dev] ob_refcnt access Message-ID: <199906241223.OAA46222@pukapuka.inrialpes.fr> How about introducing internal macros for explicit ob_refcnt accesses in the core? Actually, there are a number of places where one can see "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), _Py_SETREF(op, n) thus decoupling completely the low level refcount management defined in object.h: #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) Comments? I've contributed myself to the mess in intobject.c & floatobject.c, so I thought that such macros would make the code cleaner. Here's the current state of affairs: python/dist/src>find . -name "*.[c]" -exec grep ob_refcnt {} \; -print (void *) v, ((PyObject *) v)->ob_refcnt)) ./Modules/_tkinter.c if (self->arg->ob_refcnt > 1) { \ if (ob->ob_refcnt < 2 || self->fast) if (args->ob_refcnt > 1) { ./Modules/cPickle.c if (--inst->ob_refcnt > 0) { ./Objects/classobject.c if (result->ob_refcnt == 1) ./Objects/fileobject.c if (PyFloat_Check(p) && p->ob_refcnt != 0) if (!PyFloat_Check(p) || p->ob_refcnt == 0) { if (PyFloat_Check(p) && p->ob_refcnt != 0) { p, p->ob_refcnt, buf); ./Objects/floatobject.c if (PyInt_Check(p) && p->ob_refcnt != 0) if (!PyInt_Check(p) || p->ob_refcnt == 0) { if (PyInt_Check(p) && p->ob_refcnt != 0) p, p->ob_refcnt, p->ob_ival); ./Objects/intobject.c assert(v->ob_refcnt == 1); /* Since v will be used as accumulator! 
*/ ./Objects/longobject.c if (op->ob_refcnt <= 0) op->ob_refcnt, (long)op); op->ob_refcnt = 1; if (op->ob_refcnt < 0) fprintf(fp, "[%d] ", op->ob_refcnt); ./Objects/object.c if (!PyString_Check(v) || v->ob_refcnt != 1) { if (key->ob_refcnt == 2 && key == value) { ./Objects/stringobject.c if (!PyTuple_Check(op) || op->ob_refcnt != 1) { if (v == NULL || !PyTuple_Check(v) || v->ob_refcnt != 1) { ./Objects/tupleobject.c if (PyList_Check(seq) && seq->ob_refcnt == 1) { if (args->ob_refcnt > 1) { ./Python/bltinmodule.c if (value->ob_refcnt != 1) ./Python/import.c return PyInt_FromLong((long) arg->ob_refcnt); ./Python/sysmodule.c -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From guido at CNRI.Reston.VA.US Thu Jun 24 17:30:45 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Thu, 24 Jun 1999 11:30:45 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Your message of "Thu, 24 Jun 1999 14:23:48 +0200." <199906241223.OAA46222@pukapuka.inrialpes.fr> References: <199906241223.OAA46222@pukapuka.inrialpes.fr> Message-ID: <199906241530.LAA27887@eric.cnri.reston.va.us> > How about introducing internal macros for explicit ob_refcnt accesses > in the core? What problem does this solve? > Actually, there are a number of places where one can see > "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), > _Py_SETREF(op, n) thus decoupling completely the low level refcount > management defined in object.h: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) Why the cast? It loses some type-safety, e.g. _Py_GETREF(0) will now cause a core dump instead of a compile-time error. > Comments? I don't see how it's cleaner or saves typing: op->ob_refcnt _Py_GETREF(op) op->ob_refcnt = 1 _Py_SETREF(op, 1) --Guido van Rossum (home page: http://www.python.org/~guido/) From Vladimir.Marangozov at inrialpes.fr Thu Jun 24 18:33:31 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Thu, 24 Jun 1999 18:33:31 +0200 (DFT) Subject: [Python-Dev] Re: ob_refcnt access In-Reply-To: from "marangoz" at "Jun 24, 99 02:23:47 pm" Message-ID: <199906241633.SAA44314@pukapuka.inrialpes.fr> marangoz wrote: > > > How about introducing internal macros for explicit ob_refcnt accesses > in the core? Actually, there are a number of places where one can see > "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), > _Py_SETREF(op, n) thus decoupling completely the low level refcount > management defined in object.h: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) > > Comments? Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot to mention that if this detail doesn't hurt code aesthetics, one (I) could experiment more easily all sort of weird things with refcounting... I formulated the same wish for malloc & friends some time ago, that is, use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be defined for now as malloc, free, but nobody seems to be very excited about a smooth transition to other kinds of malloc. Hence, I reiterate this wish, 'cause switching to macros means preparing the code for the future, even if in the future it remains intact ;-). 
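For concreteness, here is a fully parenthesized sketch of both families of macros -- just an illustration of the proposal, not code from any patch. The extra parentheses around the cast matter, since -> binds tighter than a cast in C:

    #define _Py_GETREF(op)     (((PyObject *)(op))->ob_refcnt)
    #define _Py_SETREF(op, n)  (((PyObject *)(op))->ob_refcnt = (n))

    /* the malloc counterparts, defined as plain malloc/free for now */
    #define PyMem_MALLOC(n)    malloc(n)
    #define PyMem_FREE(p)      free(p)

    /* so that e.g.   if (p->ob_refcnt == 0) ...
       becomes        if (_Py_GETREF(p) == 0) ...
       and either definition can later be changed in a single place. */
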
Defining these basic interfaces is clearly Guido's job :-) as he points out in his summary of the last Open Source summit, but nevertheless, I'm raising the issue to let him see what other people think about this and allow him to make decisions easier :-) -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From ping at lfw.org Thu Jun 24 19:29:19 1999 From: ping at lfw.org (Ka-Ping Yee) Date: Thu, 24 Jun 1999 10:29:19 -0700 (PDT) Subject: [Python-Dev] ob_refcnt access In-Reply-To: <199906241530.LAA27887@eric.cnri.reston.va.us> Message-ID: On Thu, 24 Jun 1999, Guido van Rossum wrote: > > How about introducing internal macros for explicit ob_refcnt accesses > > in the core? > > What problem does this solve? I assume Vladimir was trying to leave the door open for further ob_refcnt manipulation hooks later, like having objects manage their own refcounts. Until there's an actual problem to solve that requires this, though, i'm not sure it's necessary. Are there obvious reasons to want to allow this? * * * While we're talking about refcounts and all, i've had the argument quite successfully made to me that a reasonably written garbage collector can be both (a) simple and (b) more efficient than refcounting. Having spent a good number of work days doing nothing but debugging crashes by tracing refcounting bugs, i was easily converted into a believer once a friend dispelled the notion that garbage collectors were either slow or horribly complicated. I had always been scared of them before, but less so now. Is an incremental GC being considered for a future Python? I've idly been pondering various tricks by which it could be made to work with existing extension modules -- here are some possibilities: 1. Keep the refcounts and let existing code do the usual thing; introduce a new variant of PyObject_NEW that puts an object into the "gc-able" pool rather than the "refcounted" pool. 2. Have Py_DECREF and Py_INCREF just do nothing, and let the garbage collector guess from the contents of the structure where the pointers are. (I'm told it's possible to do this safely, since you can only have false positives, never false negatives.) 3. Have Py_DECREF and Py_INCREF just do nothing, and ask the extension module to just provide (in its type object) a table of where the pointers are in its struct. And so on; mix and match. What are everyone's thoughts on this one? -- ?!ng "All models are wrong; some models are useful." -- George Box From tim_one at email.msn.com Fri Jun 25 08:38:11 1999 From: tim_one at email.msn.com (Tim Peters) Date: Fri, 25 Jun 1999 02:38:11 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Message-ID: <000c01bebed5$4b8d1040$d29e2299@tim> [Ka-Ping Yee, opines about GC] Ping, I think you're not getting any responses because this has been beaten to death on c.l.py over the last month (for the 53rd time, no less ). A hefty percentage of CPython users *like* the reliably timely destruction refcounting yields, and some clearly rely on it. Guido recently (10 June) posted the start of a "add GC on top of RC" scheme, in a thread with the unlikely name "fork()". The combination of cycles, destructors and resurrection is quite difficult to handle in a way both principled and useful (Java's way is principled but by most accounts unhelpful to the point of uselessness). 
Python experience with the Boehm collector can be found in the FAQ; note that the Boehm collector deals with finalizers in cycles by letting cycles with finalizers leak! > ... > While we're talking about refcounts and all, i've had the > argument quite successfully made to me that a reasonably > written garbage collector can be both (a) simple and (b) more > efficient than refcounting. That's a dubious claim. Sophisticated mark-and-sweep (with or without compaction) is almost universally acknowledged to beat RC, but simple M&S has terrible cache behavior (you fill up the address space before reclaiming anything, then leap all over the address space repeatedly cleaning it up). Don't discount that, in Python unlike as in most other languages, the simple loop for i in xrange(1000000): pass creates a huge amount of trash at a furious pace. Under RC it can happily reuse the same little bit of storage each time around. > Having spent a good number of work days doing nothing but debugging > crashes by tracing refcounting bugs, Yes, we can trade that for tracking down M&S bugs <0.5 wink> -- instead of INCREF/DECREF macros, you end up with M&S macros marking regions where the collector must not be run (because you're in a temporarily "inconsistent" state). That's under sophisticated M&S, though, but is an absolute nightmare when you miss a pair (the bugs only show up "sometimes", and not always the same ways -- depends on when M&S happens to run, and "how inconsistent" you happen to be at the time). > ... > And so on; mix and match. What are everyone's thoughts on this one? I think Python probably needs to clean up cycles, but by some variant of Guido's scheme on top of RC; I very much dislike the property of his scheme that objects with destructors may be get destroyed without their destructors getting invoked, but it seems hard to fix. Alternatives include Java's scheme (which really has nothing going for it other than that Java does it <0.3 wink>); Scheme's "guardian" scheme (which would let the user "get at" cyclic trash with destructors, but refuses to do anything with them on its own); following Boehm by saying that cycles with destructors are immortal; following goofier historical precedent by e.g. destroying such objects in reverse order of creation; or maybe just raising an exception if a trash cycle containing a destructor is found. All of those seem a comparative pain to implement, with Java's being the most painful -- and quite possibly the least satisfying! it's-a-whale-of-a-lot-easier-in-a-self-contained-universe-or-even-an- all-c-one-ly y'rs - tim From Vladimir.Marangozov at inrialpes.fr Fri Jun 25 13:27:43 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir Marangozov) Date: Fri, 25 Jun 1999 13:27:43 +0200 (DFT) Subject: [Python-Dev] Re: ob_refcnt access (fwd) Message-ID: <199906251127.NAA27464@pukapuka.inrialpes.fr> FYI, my second message on this issue didn't reach the list because of a stupid error of mine, so Guido and I exchanged two mails in private. His response to the msg below was that he thinks that tweaking the refcount scheme at this level wouldn't contribute much and that he doesn't intend to change anything on this until 2.0 which will be rewritten from scratch. Besides, if I want to satisfy my curiosity in hacking the refcounts I can do it with a small patch because I've already located the places where the ob_refcnt slot is accessed directly. 
----- Forwarded message ----- From Vladimir.Marangozov at inrialpes.fr Thu Jun 24 18:33:31 1999 From: Vladimir.Marangozov at inrialpes.fr (Vladimir.Marangozov at inrialpes.fr) Date: Thu, 24 Jun 1999 18:33:31 +0200 (DFT) Subject: ob_refcnt access In-Reply-To: from "marangoz" at "Jun 24, 99 02:23:47 pm" Message-ID: marangoz wrote: > > > How about introducing internal macros for explicit ob_refcnt accesses > in the core? Actually, there are a number of places where one can see > "op->ob_refcnt" logic, which could be replaced with _Py_GETREF(op), > _Py_SETREF(op, n) thus decoupling completely the low level refcount > management defined in object.h: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) > > Comments? Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot to mention that if this detail doesn't hurt code aesthetics, one (I) could experiment more easily all sort of weird things with refcounting... I formulated the same wish for malloc & friends some time ago, that is, use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be defined for now as malloc, free, but nobody seems to be very excited about a smooth transition to other kinds of malloc. Hence, I reiterate this wish, 'cause switching to macros means preparing the code for the future, even if in the future it remains intact ;-). Defining these basic interfaces is clearly Guido's job :-) as he points out in his summary of the last Open Source summit, but nevertheless, I'm raising the issue to let him see what other people think about this and allow him to make decisions easier :-) -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 ----- End of forwarded message ----- -- Vladimir MARANGOZOV | Vladimir.Marangozov at inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252 From tismer at appliedbiometrics.com Fri Jun 25 20:47:51 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 25 Jun 1999 20:47:51 +0200 Subject: [Python-Dev] Re: ob_refcnt access (fwd) References: <199906251127.NAA27464@pukapuka.inrialpes.fr> Message-ID: <3773CED7.B87D055C@appliedbiometrics.com> Vladimir Marangozov wrote: > > FYI, my second message on this issue didn't reach the list because > of a stupid error of mine, so Guido and I exchanged two mails > in private. His response to the msg below was that he thinks > that tweaking the refcount scheme at this level wouldn't contribute > much and that he doesn't intend to change anything on this until 2.0 > which will be rewritten from scratch. > > Besides, if I want to satisfy my curiosity in hacking the refcounts > I can do it with a small patch because I've already located the places > where the ob_refcnt slot is accessed directly. Well, one Euro on that issue: > > #define _Py_GETREF(op) (((PyObject *)op)->ob_refcnt) > > #define _Py_SETREF(op, n) (((PyObject *)op)->ob_refcnt = (n)) > > > > Comments? > > Of course, the above should be (PyObject *)(op)->ob_refcnt. Also, I forgot > to mention that if this detail doesn't hurt code aesthetics, one (I) could > experiment more easily all sort of weird things with refcounting... I think if at all, this should be no typecast to stay safe. As long as every PyObject has a refcount, this would be correct and checked by the compiler. Why loose it? 
#define _Py_GETREF(op) ((op)->ob_refcnt) This carries the same semantics, the same compiler check, but adds a level of abstraction for future changes. > I formulated the same wish for malloc & friends some time ago, that is, > use everywhere in the core PyMem_MALLOC, PyMem_FREE etc, which would be > defined for now as malloc, free, but nobody seems to be very excited > about a smooth transition to other kinds of malloc. Hence, I reiterate > this wish, 'cause switching to macros means preparing the code for the > future, even if in the future it remains intact ;-). I wish to incref this wish by mine. In order to be able to try different memory allocation strategies, I would go even further and give every object type its own allocation macro which carries info about the object type about to be allocated. This costs nothing but a little macro expansion for the C compiler, but would allow to try new schemes, without always patching the Python source. ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From tismer at appliedbiometrics.com Fri Jun 25 20:56:39 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Fri, 25 Jun 1999 20:56:39 +0200 Subject: [Python-Dev] ob_refcnt access References: <000c01bebed5$4b8d1040$d29e2299@tim> Message-ID: <3773D0E7.458E00F1@appliedbiometrics.com> Tim Peters wrote: > > [Ka-Ping Yee, opines about GC] > > Ping, I think you're not getting any responses because this has been beaten > to death on c.l.py over the last month (for the 53rd time, no less ). > > A hefty percentage of CPython users *like* the reliably timely destruction > refcounting yields, and some clearly rely on it. [CG issue dropped, I know the thread] I know how much of a pain in the .. proper refcounting can be. Sometimes, after long debugging, I wished it would go. But finally, I think it is a *really good thing* to have to do proper refcounting. The reason is that this causes a lot of discipline, which improves the whole program. I guess with GC always there, quite a number of errors stay undetected. I can say this, since I have been through a week of debugging now, and I can now publish full blown first class continuations for Python yes I'm happy - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip at mojam.com Mon Jun 28 00:11:28 1999 From: skip at mojam.com (Skip Montanaro) Date: Sun, 27 Jun 1999 18:11:28 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? In-Reply-To: <199906181451.KAA11549@eric.cnri.reston.va.us> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> Message-ID: <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> Guido> Hmm... This would make it hard to make a patch release for 1.5.2 Guido> (possible called 1.5.3?). I *really* don't want the string Guido> methods to end up in a release yet -- there are too many rough Guido> edges (e.g. some missing methods, should join str() or not, Guido> etc.). 
Sorry for the delayed response. I've been out of town. When Barry returns would it be possible to merge the string methods in conditionally (#ifdef STRING_METHODS) and add a --with-string-methods configure option? How hard would it be to modify string.py, stringobject.c and stropmodule.c to carry that around? Skip Montanaro | http://www.mojam.com/ skip at mojam.com | http://www.musi-cal.com/~skip/ 518-372-5583 From tim_one at email.msn.com Mon Jun 28 04:27:06 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 27 Jun 1999 22:27:06 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: <3773D0E7.458E00F1@appliedbiometrics.com> Message-ID: <000501bec10d$b6f1fb40$e19e2299@tim> [Christian Tismer] > ... > I can say this, since I have been through a week of debugging > now, and I can now publish > > full blown first class continuations for Python > > yes I'm happy - chris You should be! So how come nobody else is ? Let's fire some imagination here: without the stinkin' C stack snaking its way thru everything, then with the exception of external system objects (like open files), the full state of a running Python program is comprised of objects Python understands and controls. So with some amount of additional pain we could pickle them. And unpickle them. Painlessly checkpoint a long computation for possible restarting? Freeze a program while it's running on your mainframe, download it to your laptop and resume it while you're on the road? Ship a bug report with the computation frozen right before the error occurs? Take an app with gobs of expensive initialization, freeze it after it's "finally ready to go", and ship the latter instead? Capture the state of an interactive session for later resumption? Etc. Not saying those are easy, but getting the C stack out of the way means they move from impossible to plausible. Maybe it would help get past the Schemeophobia if, instead of calling them "continuations", you called 'em "platform-independent potentially picklable threads". pippt-sounds-as-good-as-it-reads-ly y'rs - tim From tim_one at email.msn.com Mon Jun 28 05:13:15 1999 From: tim_one at email.msn.com (Tim Peters) Date: Sun, 27 Jun 1999 23:13:15 -0400 Subject: [Python-Dev] ActiveState & fork & Perl Message-ID: <000601bec114$2a2929c0$e19e2299@tim> Moving back in time ... [GordonM] > Perhaps Christian's stackless Python would enable green threads... [Guido] > This has been suggested before... While this seems possible at first, > all blocking I/O calls would have to be redone to pass control to the > thread scheduler, before this would be useful -- a huge task! I didn't understand this. If I/O calls are left alone, and a green thread hit one, the whole program just sits there waiting for the call to complete, right? But if the same thing happens using "real threads" today, the same thing happens today anyway . That is, if a thread doesn't release the global lock before a blocking call today, the whole program just sits there etc. Or do you have some other kind of problem in mind here? unconvincedly y'rs - tim From MHammond at skippinet.com.au Mon Jun 28 06:29:29 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Mon, 28 Jun 1999 14:29:29 +1000 Subject: [Python-Dev] ob_refcnt access In-Reply-To: <000501bec10d$b6f1fb40$e19e2299@tim> Message-ID: <003301bec11e$d0cfc6d0$0801a8c0@bobcat> > > yes I'm happy - chris > > You should be! So how come nobody else is ? 
Im a little unhappy as this will break the Active Debugging stuff - ie, the ability for Python, Java, Perl, VBScript etc to all exist in the same process, each calling each other, and each being debuggable (makes a _great_ demo :-) Im not _really_ unhappy, Im just throwing this in as an FYI. The Active Debugging interfaces need some way of sorting a call stack. As many languages may be participating in a debugging session, there is no implicit ordering available. Inter-language calls are not made via the debugger, so it has no chance to intercept. So the solution MS came up with was, surprise surprise, the machine stack! :-) The assumption is that all languages will make _some_ use of the stack, so they ask a language to report its "stack base address" and "stack size". Using this information, the debugger sorts into the correct call sequence. Indeed, getting this information (even the half of it I did manage :-) was painful, and hard to get right. Ahh, the joys of bleeding-edge technologies :-) > Let's fire some imagination here: without the stinkin' C > stack snaking its I tried, and look what happened :-) Seriously, some if this stuff would be way cool. Bit I also understand completely the silence on this issue. When the thread started, there was much discussion about exactly what the hell these continuation/coroutine thingies even were. However, there were precious few real-world examples where they could be used. A few acedemic, theoretical places, but the only real contender I have seen brought up was Medusa. There were certainly no clear examples of "as soon as we have this, I could change abc to take advantage, and this would give us the very cool xyz" So, if anyone else if feeling at all like me about this issue, they are feeling all warm and fuzzy knowing that a few smart people are giving us the facility to do something we hope we never, ever have to do. :-) Mark. From rushing at nightmare.com Mon Jun 28 11:53:21 1999 From: rushing at nightmare.com (Sam Rushing) Date: Mon, 28 Jun 1999 02:53:21 -0700 (PDT) Subject: [Python-Dev] ob_refcnt access In-Reply-To: <41219828@toto.iv> Message-ID: <14199.13497.439332.366329@seattle.nightmare.com> Mark Hammond writes: > I tried, and look what happened :-) Seriously, some if this stuff > would be way cool. > > Bit I also understand completely the silence on this issue. When > the thread started, there was much discussion about exactly what > the hell these continuation/coroutine thingies even were. However, > there were precious few real-world examples where they could be > used. A few acedemic, theoretical places, but the only real > contender I have seen brought up was Medusa. There were certainly > no clear examples of "as soon as we have this, I could change abc > to take advantage, and this would give us the very cool xyz" Part of the problem is that we didn't have the feature to play with. Many of the possibilities are showing up now that it's here... The basic advantage to coroutines is they allow you to turn any event-driven/state-machine problem into one that is managed with 'normal' control state; i.e., for loops, while loops, nested procedure calls, etc... Here are a few possible real-world uses: ================================================== Parsing. I remember a discussion from a few years back about the distinction between 'push' and 'pull' model parsers. Coroutines let you have it both ways; you can write a parser in the most natural way (pull), but use it as a 'push'; i.e. for a web browser. 
================================================== "http sessions". A single 'thread' of control that is re-entered whenever a hit from a particular user ('session') comes in to the web server: [Apologies to those that have already seen this cheezy example] def ecommerce (session): session.login() # sends a login form, waits for it to return basket = [] while 1: item = session.shop_for_item() if item: basket.append (item) else: break if basket: session.get_shipping_info() session.get_payment_info() session.transact() 'session.shop_for_item()' will resume the main coroutine, which will resume this coroutine only when a new hit comes in from that session/user, and 'return' this hit to the while loop. I have a little web server that uses this idea to play blackjack: http://www.nightmare.com:7777/ http://www.nightmare.com/stuff/blackjack_httpd.py [though I'm a little fuzzy on the rules]. Rather than building a state machine that keeps track of where the user has been, and what they're doing, you can keep all the state in local variables (like 'basket' above) - in other words, it's a much more natural style of programming. ================================================== One of the areas I'm most excited about is GUI coding. All GUI's are event driven. All GUI code is therefore written in a really twisted, state-machine fashion; interactions are very complex. OO helps a bit, but doesn't change the basic difficulty - past a certain point interesting things become too complex to try... Mr. Fuchs' paper ("Escaping the event loop: an alternative control structure for multi-threaded GUIs") does a much better job of describing this than I can: http://cs.nyu.edu/phd_students/fuchs/ http://cs.nyu.edu/phd_students/fuchs/gui.ps ================================================== Tim's example of 'dumping' a computation in the middle and storing it on disk (or sending it over a network), is not a fantasy... I have a 'stackless' Scheme system that does this right now. ================================================== Ok, final example. Isn't there an interface in Python to call a certain function after every so many vm insns? Using coroutines you could hook into this and provide non-preemptive 'threads' for those platforms that don't have them. [And the whole thing would be written in Python, not in C!] ================================================== > So, if anyone else if feeling at all like me about this issue, they > are feeling all warm and fuzzy knowing that a few smart people are > giving us the facility to do something we hope we never, ever have > to do. :-) "When the only tool you have is a hammer, everything looks like a nail". I saw the guys over in the Scheme shop cutting wood with a power saw; now I feel like a schmuck with my hand saw. You are right to be frightened by the strangeness of the underlying machinery; hopefully a simple and easy-to-understand interface can be built for the C level as well as Python. I think Christian's 'frame dispatcher' is fairly clear, and not *that* much of a departure from the current VM; it's amazing to me how little work really had to be done! -Sam From tismer at appliedbiometrics.com Mon Jun 28 14:07:33 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 28 Jun 1999 14:07:33 +0200 Subject: [Python-Dev] ob_refcnt access References: <003301bec11e$d0cfc6d0$0801a8c0@bobcat> Message-ID: <37776585.17B78DD1@appliedbiometrics.com> Mark Hammond wrote: > > > > yes I'm happy - chris > > > > You should be! So how come nobody else is ? 
(to Tim) I believe this comes simply since following me would force people to change their way of thinking. I am through this already, but it was hard for me. And after accepting to be stackless, there is no way to go back. Today I'm wondering about my past: "how could I think of stacks when thinking of programs?" This is so wrong. The truth is: Programs are just some data, part of it called code, part of it is local state, and! its future of computation. Out, over, roger. All the rest is artificial showstoppers. > Im a little unhappy as this will break the Active Debugging stuff - ie, the > ability for Python, Java, Perl, VBScript etc to all exist in the same > process, each calling each other, and each being debuggable (makes a > _great_ demo :-) > > Im not _really_ unhappy, Im just throwing this in as an FYI. Well, yet I see no problem. > The Active Debugging interfaces need some way of sorting a call stack. As > many languages may be participating in a debugging session, there is no > implicit ordering available. Inter-language calls are not made via the > debugger, so it has no chance to intercept. > > So the solution MS came up with was, surprise surprise, the machine stack! > :-) The assumption is that all languages will make _some_ use of the > stack, so they ask a language to report its "stack base address" and "stack > size". Using this information, the debugger sorts into the correct call > sequence. Now, I can give it a machine stack. There is just a frame dispatcher sitting on the stack, and it grabs frames from the current thread state. > Indeed, getting this information (even the half of it I did manage :-) was > painful, and hard to get right. I would have to see the AX interface. But for sure there will be some method hooks with which I can tell AX how to walk the frame chain. And why don't I simply publish frames as COM objects? This would give you much more than everything else, I guess. BTW, as it is now, there is no need to use AX debugging for Python, since Python can do it alone now. Of course it makes sense to have it all in the AX environment. You will be able to modify a running programs local variables, its evaluation stack, change its code, change where it returns to, all is doable. ... > Bit I also understand completely the silence on this issue. When the > thread started, there was much discussion about exactly what the hell these > continuation/coroutine thingies even were. However, there were precious > few real-world examples where they could be used. A few acedemic, > theoretical places, but the only real contender I have seen brought up was > Medusa. There were certainly no clear examples of "as soon as we have > this, I could change abc to take advantage, and this would give us the very > cool xyz" The problem was for me, that I had also no understanding what I was doing, actually. Implemented continuations without an idea how they work. But Tim and Sam said they were the most powerful control strucure possible, so I used all my time to find this out. Now I'm beginning to understand. And my continuation based coroutine example turns out to be twenty lines of Python code. Coming soon, after I served my whining customers. > So, if anyone else if feeling at all like me about this issue, they are > feeling all warm and fuzzy knowing that a few smart people are giving us > the facility to do something we hope we never, ever have to do. :-) Think of it as just a flare gun in your hands. 
By reading the fine print, you will realize that you actually hold an atom bomb, with a little code taming it for you. :-) back-to-the-future - ly y'rs - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From skip at mojam.com Mon Jun 28 15:13:31 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 28 Jun 1999 09:13:31 -0400 (EDT) Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: <000601bec114$2a2929c0$e19e2299@tim> References: <000601bec114$2a2929c0$e19e2299@tim> Message-ID: <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com> Still trying to make the brain shift from out-of-town to back-to-work... Tim> [GordonM] >> Perhaps Christian's stackless Python would enable green threads... What's a green thread? Skip From fredrik at pythonware.com Mon Jun 28 15:37:30 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 28 Jun 1999 15:37:30 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <000601bec114$2a2929c0$e19e2299@tim> <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com> Message-ID: <00ca01bec16b$5eef11e0$f29b12c2@secret.pythonware.com> > What's a green thread? a user-level thread (essentially what you can implement yourself by swapping stacks, etc). it's enough to write smoothly running threaded programs, but not enough to support true concurrency on multiple processors. also see: http://www.sun.com/solaris/java/wp-java/4.html From tismer at appliedbiometrics.com Mon Jun 28 18:11:43 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 28 Jun 1999 18:11:43 +0200 Subject: [Python-Dev] ActiveState & fork & Perl References: <000601bec114$2a2929c0$e19e2299@tim> <14199.29856.78030.445795@cm-24-29-94-19.nycap.rr.com> Message-ID: <37779EBF.A146D355@appliedbiometrics.com> Skip Montanaro wrote: > > Still trying to make the brain shift from out-of-town to back-to-work... > > Tim> [GordonM] > >> Perhaps Christian's stackless Python would enable green threads... > > What's a green thread? Nano-Threads. Threadless threads, solely Python driven, no system threads needed but possible. Think of the "big" system threads where each can run any number of tiny Python threads. Powered by snake oil - ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From akuchlin at mems-exchange.org Mon Jun 28 19:55:16 1999 From: akuchlin at mems-exchange.org (Andrew M. Kuchling) Date: Mon, 28 Jun 1999 13:55:16 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 Message-ID: <14199.46852.932030.576094@amarok.cnri.reston.va.us> Paul Prescod sent the following note to the XML-SIG mailing list. Thoughts? --amk -------------- next part -------------- An embedded message was scrubbed... From: Paul Prescod Subject: [XML-SIG] [Fwd: Re: parsers for Palm?] 
Date: Mon, 28 Jun 1999 12:00:50 -0400 Size: 2535 URL: From guido at CNRI.Reston.VA.US Mon Jun 28 21:35:04 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 15:35:04 -0400 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: Your message of "Mon, 28 Jun 1999 13:55:16 EDT." <14199.46852.932030.576094@amarok.cnri.reston.va.us> References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> Message-ID: <199906281935.PAA01439@eric.cnri.reston.va.us> > Paul Prescod sent the following note to the XML-SIG mailing list. > Thoughts? I don't know any of the acronyms, and I'm busy writing a funding proposal plus two talks for the Monterey conference, so I don't have any thoughts to spare at the moment. Perhaps someone could present the case with some more background info? (It does sounds intriguing, but then again I'm not sure how many people *really* need to parse XML -- it doesn't strike me as something of the same generality as regular expressions yet.) --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at digicool.com Mon Jun 28 21:51:00 1999 From: jim at digicool.com (Jim Fulton) Date: Mon, 28 Jun 1999 15:51:00 -0400 Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> Message-ID: <3777D224.6936B890@digicool.com> "Andrew M. Kuchling" wrote: > > Paul Prescod sent the following note to the XML-SIG mailing list. > Thoughts? > When I brought up some ideas for adding a separate validation mechanism for PyExpat, some folks suggested that I should look at some other C libraries, including one from the ILU folks and some other one that I can't remember the name of off hand. Should we (used loosely ;) look into the other libraries before including expat in the Python dist? Jim -- Jim Fulton mailto:jim at digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From guido at CNRI.Reston.VA.US Mon Jun 28 22:07:50 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 16:07:50 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Your message of "Mon, 28 Jun 1999 02:53:21 PDT." <14199.13497.439332.366329@seattle.nightmare.com> References: <14199.13497.439332.366329@seattle.nightmare.com> Message-ID: <199906282007.QAA01570@eric.cnri.reston.va.us> > Part of the problem is that we didn't have the feature to play with. > Many of the possibilities are showing up now that it's here... > > The basic advantage to coroutines is they allow you to turn any > event-driven/state-machine problem into one that is managed with > 'normal' control state; i.e., for loops, while loops, nested procedure > calls, etc... > > Here are a few possible real-world uses: Thanks, Sam! Very useful collection of suggestions. (How come I'm not surprised to see these coming from you ;-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Mon Jun 28 22:08:42 1999 From: akuchlin at mems-exchange.org (Andrew M. 
Kuchling) Date: Mon, 28 Jun 1999 16:08:42 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <199906281935.PAA01439@eric.cnri.reston.va.us> References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> <199906281935.PAA01439@eric.cnri.reston.va.us> Message-ID: <14199.54858.464165.381344@amarok.cnri.reston.va.us> Guido van Rossum writes: >any thoughts to spare at the moment. Perhaps someone could present >the case with some more background info? (It does sound intriguing, Paul is probably suggesting this so that Python comes with a fast, standardized XML parser out of the box. On the other hand, where do you draw the line? Paul suggests including PyExpat and easySAX (a small SAX implementation), but why not full SAX, and why not DOM? My personal leaning is that we can get more bang for the buck by working on the Distutils effort, so that installing a package like PyExpat becomes much easier, rather than piling more things into the core distribution. -- A.M. Kuchling http://starship.python.net/crew/amk/ The Law, in its majestic equality, forbids the rich, as well as the poor, to sleep under the bridges, to beg in the streets, and to steal bread. -- Anatole France From guido at CNRI.Reston.VA.US Mon Jun 28 22:17:41 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 16:17:41 -0400 Subject: [Python-Dev] ActiveState & fork & Perl In-Reply-To: Your message of "Sun, 27 Jun 1999 23:13:15 EDT." <000601bec114$2a2929c0$e19e2299@tim> References: <000601bec114$2a2929c0$e19e2299@tim> Message-ID: <199906282017.QAA01592@eric.cnri.reston.va.us> [Tim] > Moving back in time ... > > [GordonM] > > Perhaps Christian's stackless Python would enable green threads... > > [Guido] > > This has been suggested before... While this seems possible at first, > > all blocking I/O calls would have to be redone to pass control to the > > thread scheduler, before this would be useful -- a huge task! > > I didn't understand this. If I/O calls are left alone, and a green thread > hit one, the whole program just sits there waiting for the call to complete, > right? > > But if the same thing happens using "real threads" today, the same thing > happens today anyway . That is, if a thread doesn't release the > global lock before a blocking call today, the whole program just sits there > etc. > > Or do you have some other kind of problem in mind here? OK, I'll explain. Suppose there's a wrapper for a read() call whose essential code looks like this:

    Py_BEGIN_ALLOW_THREADS
    n = read(fd, buffer, size);
    Py_END_ALLOW_THREADS

When the read() call is made, other threads can run. However in green threads (e.g. using Christian's stackless Python, where a thread switcher is easily added) the whole program would block at this point. The way to fix this is to have a way to tell the scheduler "come back to this thread when there's input ready on this fd". The scheduler has to combine such calls from all threads into a single giant select. It gets more complicated when you have blocking I/O wrapped in library functions, e.g. gethostbyname() or fread(). Then, you need to have a way to implement sleep() by talking to the thread scheduler (remember, this is the thread scheduler we have to write ourselves). Oh, and of course the thread scheduler must also have a select() lookalike API so I can still implement the select module. Does this help? Or am I misunderstanding your complaint? Or is a <wink> missing?
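[To make the idea concrete: a minimal sketch of such a "one giant select" scheduler, in modern Python, with generators standing in for green threads. The names and structure here are illustrative assumptions, not code from stackless Python or CPython.]

    # Sketch only: a cooperative scheduler that parks "green threads" on fds
    # and wakes them with a single combined select() call.
    import select
    import socket

    class Scheduler:
        def __init__(self):
            self.ready = []        # runnable green threads (generators)
            self.waiting = {}      # fd -> green thread blocked until fd is readable

        def spawn(self, thread):
            self.ready.append(thread)

        def run(self):
            while self.ready or self.waiting:
                if not self.ready:
                    # Every thread is blocked on I/O: one giant select over all fds.
                    readable, _, _ = select.select(list(self.waiting), [], [])
                    for fd in readable:
                        self.ready.append(self.waiting.pop(fd))
                thread = self.ready.pop(0)
                try:
                    request = next(thread)          # run until the thread yields
                except StopIteration:
                    continue                        # thread finished
                if request is None:
                    self.ready.append(thread)       # bare yield: just reschedule
                else:
                    self.waiting[request] = thread  # yielded an fd: wake when readable

    def reader(name, sock):
        while True:
            yield sock.fileno()        # "come back when there's input ready on this fd"
            data = sock.recv(4096)     # guaranteed not to block now
            if not data:
                break
            print(name, "read", len(data), "bytes")

    if __name__ == "__main__":
        a, b = socket.socketpair()     # any readable fd would do
        sched = Scheduler()
        sched.spawn(reader("r1", a))
        b.send(b"hello")
        b.close()
        sched.run()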
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at CNRI.Reston.VA.US Mon Jun 28 22:23:57 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Mon, 28 Jun 1999 16:23:57 -0400 Subject: [Python-Dev] ob_refcnt access In-Reply-To: Your message of "Sun, 27 Jun 1999 22:27:06 EDT." <000501bec10d$b6f1fb40$e19e2299@tim> References: <000501bec10d$b6f1fb40$e19e2299@tim> Message-ID: <199906282023.QAA01605@eric.cnri.reston.va.us> > > yes I'm happy - chris > > You should be! So how come nobody else is ? Chris and I have been through this in private, but it seems that as long as I don't fess up in public I'm afraid it will come back and I'll get pressure coming at me to endorse Chris' code. I have no problem with the general concept (see my response to Sam's post of exciting examples). But I have a problem with a megapatch like this that affects many places including very sensitive areas like the main loop in ceval.c. The problem is simply that I know this is very intricate code, and I can't accept a patch of this scale to this code before I understand every little detail of the patch. I'm just too worried otherwise that there's a reference count bug in it that will very subtly break stuff and that will take forever to track down; I feel that when I finally have the time to actually understand the whole patch I'll be able to prevent that (famous last words). Please don't expect action or endorsement of Chris' patch from me any time soon, I'm too busy. However I'd love it if others used the patch in a real system and related their experiences regarding performance, stability etc. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at mojam.com Mon Jun 28 22:24:46 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 28 Jun 1999 16:24:46 -0400 (EDT) Subject: [Python-Dev] Paul Prescod: add Expat to 1.6 In-Reply-To: <14199.54858.464165.381344@amarok.cnri.reston.va.us> References: <14199.46852.932030.576094@amarok.cnri.reston.va.us> <199906281935.PAA01439@eric.cnri.reston.va.us> <14199.54858.464165.381344@amarok.cnri.reston.va.us> Message-ID: <14199.55737.544299.718558@cm-24-29-94-19.nycap.rr.com> Andrew> My personal leaning is that we can get more bang for the buck by Andrew> working on the Distutils effort, so that installing a package Andrew> like PyExpat becomes much easier, rather than piling more things Andrew> into the core distribution. Amen to that. See Guido's note and my response regarding soundex in the Doc-SIG. Perhaps you could get away with a very small core distribution that only contained the stuff necessary to pull everything else from the net via http or ftp... Skip From bwarsaw at cnri.reston.va.us Mon Jun 28 23:20:05 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 28 Jun 1999 17:20:05 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> Message-ID: <14199.59141.447168.107784@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> Sorry for the delayed response. I've been out of town. When SM> Barry returns would it be possible to merge the string methods SM> in conditionally (#ifdef STRING_METHODS) and add a SM> --with-string-methods configure option? How hard would it be SM> to modify string.py, stringobject.c and stropmodule.c to carry SM> that around? How clean do you want this separation to be? 
Just disabling the actual string methods would be easy, and I'm sure I can craft a string.py that would work in either case (remember stropmodule.c wasn't even touched). There are a few other miscellaneous changes mostly having to do with some code cleaning, but those are probably small (and uncontroversial?) enough that they can either stay in, or be easily understood and accepted (optimistic aren't I? :) by Guido during the merge. I'll see what I can put together in the next 1/2 hour or so. -Barry From skip at mojam.com Mon Jun 28 23:37:03 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 28 Jun 1999 17:37:03 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? In-Reply-To: <14199.59141.447168.107784@anthem.cnri.reston.va.us> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> Message-ID: <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> >>>>> "BAW" == Barry A Warsaw writes: >>>>> "SM" == Skip Montanaro writes: SM> would it be possible to merge the string methods in conditionally SM> (#ifdef STRING_METHODS) ... BAW> How clean do you want this separation to be? Just disabling the BAW> actual string methods would be easy, and I'm sure I can craft a BAW> string.py that would work in either case (remember stropmodule.c BAW> wasn't even touched). Barry, I would be happy with having to manually #define STRING_METHODS in stringobject.c. Forget about the configure flag at first. I think the main point for experimenters like myself is that it is a hell of a lot easier to twiddle a #define than to try merging different CVS branches to get access to the functionality. Most of us have probably advanced far enough on the Emacs, vi or Notepad learning curves to handle that change, while most of us are probably not CVS wizards. Once it's in the main CVS branch, you can announce the change or not on the main list as you see fit (perhaps on python-dev sooner and on python-list later after some more experience has been gained with the patches). Skip From tismer at appliedbiometrics.com Mon Jun 28 23:41:28 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Mon, 28 Jun 1999 23:41:28 +0200 Subject: [Python-Dev] ob_refcnt access References: <000501bec10d$b6f1fb40$e19e2299@tim> <199906282023.QAA01605@eric.cnri.reston.va.us> Message-ID: <3777EC08.42C15478@appliedbiometrics.com> Guido van Rossum wrote: > > > > yes I'm happy - chris > > > > You should be! So how come nobody else is ? > > Chris and I have been through this in private, but it seems that as > long as I don't fess up in public I'm afraid it will come back and > I'll get pressure coming at me to endorse Chris' code. Please let me add a few comments. > I have no problem with the general concept (see my response to Sam's > post of exciting examples). This is the most worthful statement I can get. And see below. > But I have a problem with a megapatch like this that affects many > places including very sensitive areas like the main loop in ceval.c. Actually it is a rather small patch, but the implicit semantic change is rather hefty. > The problem is simply that I know this is very intricate code, and I > can't accept a patch of this scale to this code before I understand > every little detail of the patch. 
I'm just too worried otherwise that > there's a reference count bug in it that will very subtly break stuff > and that will take forever to track down; I feel that when I finally > have the time to actually understand the whole patch I'll be able to > prevent that (famous last words). I never expected to see this patch go into Python right now. The current public version is an alpha 0.2. Meanwhile I have 0.3, with again new patches, and a completely reworked policy of frame refcounting. Even worse, there is a nightmare of more work which I simply had no time for. All the instance and object code must be carefully changed, since they still need to call back in a recursive way. This is hard to change until I have a better mechanism to generate all the callbacks. For instance, I cannot switch tasks in an __init__ at this time. Although I can do so in regular methods. But this is all half-baked. In other words, the danger is by far not over, but still in the growing phase. I believe I should work on and maintain this until I'm convinced that there are not more refcount bugs than before, and until I have evicted every recursion which is a serious impact. This is still months of work. When I release the final version, I will pay $100 to the first person who finds a refcount bug which I introduced. But not before. I don't want to waste Guido's time, and for sure not now with this bloody fresh code. What I needed to know is whether I am on the right track or if I'm wasting my time. But since I have users already, it is no waste at all. What I really could use were some hints about API design. Guido, thank you for Python - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home From bwarsaw at cnri.reston.va.us Tue Jun 29 00:04:05 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 28 Jun 1999 18:04:05 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> Message-ID: <14199.61781.695240.71428@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> I would be happy with having to manually #define SM> STRING_METHODS in stringobject.c. Forget about the configure SM> flag at first. Oh, I agree -- I wasn't going to add the configure flag anyway :) What I meant was how much of my changes should be ifdef-out-able? Just the methods on string objects? All my changes? -Barry From skip at mojam.com Tue Jun 29 00:30:55 1999 From: skip at mojam.com (Skip Montanaro) Date: Mon, 28 Jun 1999 18:30:55 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag?
In-Reply-To: <14199.61781.695240.71428@anthem.cnri.reston.va.us> References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> <14199.61781.695240.71428@anthem.cnri.reston.va.us> Message-ID: <14199.63115.58129.480522@cm-24-29-94-19.nycap.rr.com> BAW> Oh, I agree -- I wasn't going to add the configure flag anyway :) BAW> What I meant was how much of my changes should be ifdef-out-able? BAW> Just the methods on string objects? All my changes? Well, when the CPP macro is undefined, the behavior from Python should be unchanged, yes? Am I missing something? There are string methods and what else involved in the changes? If string.py has to test to see if "".capitalize yields an AttributeError to decide what to do, I think that sort of change will be simple enough to accommodate. Any new code that gets well-exercised now before string methods become widely available is all to the good in my opinion. It's not fixing something that ain't broke, more like laying the groundwork for new directions. Skip From bwarsaw at cnri.reston.va.us Tue Jun 29 01:04:55 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Mon, 28 Jun 1999 19:04:55 -0400 (EDT) Subject: [Python-Dev] Merge the string_methods tag? References: <015601beb964$f37a4fa0$0801a8c0@bobcat> <199906181451.KAA11549@eric.cnri.reston.va.us> <14198.41195.968785.37763@cm-24-29-94-19.nycap.rr.com> <14199.59141.447168.107784@anthem.cnri.reston.va.us> <14199.59831.285431.966321@cm-24-29-94-19.nycap.rr.com> <14199.61781.695240.71428@anthem.cnri.reston.va.us> <14199.63115.58129.480522@cm-24-29-94-19.nycap.rr.com> Message-ID: <14199.65431.161001.730247@anthem.cnri.reston.va.us> >>>>> "SM" == Skip Montanaro writes: SM> Well, when the CPP macro is undefined, the behavior from SM> Python should be unchanged, yes? Am I missing something? SM> There are string methods and what else involved in the SM> changes? There are a few additions to the C API, but these probably don't need to be ifdef'd, since they don't change the existing semantics or interfaces. abstract.c has some code cleaning and reorganization, but the public API and semantics should be unchanged. Builtin long() and int() have grown an extra optional argument, which specifies the base to use. If this extra argument isn't given then they should work the same as in the main branch. Should we ifdef out the extra argument? SM> If string.py has to test to see if "".capitalize yields an SM> AttributeError to decide what to do, I think that sort of SM> change will be simple enough to accommodate. Basically what I've got is to move the main-branch string.py to stringold.py and if you get an attribute error on ''.upper I do a "from stringold import *". I've also got some hackarounds for test_string.py to make it work with or without string methods. SM> Any new code that gets well-exercised now before string SM> methods become widely available is all to the good in my SM> opinion. It's not fixing something that ain't broke, more SM> like laying the groundwork for new directions. Agreed. I'll check my changes in shortly. The ifdef will only disable the string methods. long() and int() will still accept the option argument. 
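[A small, purely illustrative sketch of the fallback pattern Barry describes above -- the probe on ''.upper and the wholesale import from stringold are as outlined in his note, but the wrapper bodies and the asserts are guesses, not the checked-in code.]

    # Sketch of a string.py that works with or without compiled-in string methods.
    try:
        "".upper                       # probe: were string methods compiled in?
    except AttributeError:
        from stringold import *        # no: fall back to the old string module wholesale
    else:
        # yes: the module functions can become thin wrappers around the new methods
        def upper(s):
            return s.upper()

        def split(s, sep=None, maxsplit=-1):
            return s.split(sep, maxsplit)

    # The extended built-ins work either way: int() (and long(), back then)
    # grew an optional base argument.
    assert int("ff", 16) == 255
    assert int("755", 8) == 493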
Stay tuned, -Barry From tim_one at email.msn.com Tue Jun 29 06:16:34 1999 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 29 Jun 1999 00:16:34 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199906282017.QAA01592@eric.cnri.reston.va.us> Message-ID: <000201bec1e6$2c496940$229e2299@tim> [Tim, claims not to understand Guido's > While this seems possible at first, all blocking I/O calls would > have to be redone to pass control to the thread scheduler, before > this would be useful -- a huge task! ] [Guido replies, sketching an elaborate scheme for making threads that are fake nevertheless act like real threads in the particular case of potentially blocking I/O calls] > ... > However in green threads (e.g. using Christian's stackless Python, > where a thread switcher is easily added) the whole program would block > at this point. The way to fix this is [very painful ]. > ... > Does this help? Or am I misunderstanding your complaint? Or is a > <wink> missing? No missing wink; I think it hinges on a confusion about the meaning of your original word "useful". Threads can be very useful purely as a means for algorithm structuring, due to independent control flows. Indeed, I use threads in Python most often these days without any hope or even *use* for potential parallelism (overlapped I/O or otherwise). It's the only non-brain-busting way to write code now that requires advanced control of the iterator, generator, coroutine, or even independent-agents-in-a-pipeline flavors. Fake threads would allow code like that to run portably, and also likely faster than with the overheads of OS-level threads. For pedagogical and debugging purposes too, fake threads could be very much friendlier than the real thing. Heck, we could even run them on a friendly old Macintosh . If all fake threads block when any hits an I/O call, waiting for the latter to return, we're no worse off than in a single-threaded program. Being "fake threads", it *is* a single-threaded program, so it's not even a surprise . Maybe in your

    Py_BEGIN_ALLOW_THREADS
    n = read(fd, buffer, size);
    Py_END_ALLOW_THREADS

you're assuming that some other Python thread needs to run in order for the read implementation to find something to read? Then that's a dead program for sure, as it would be for a single-threaded run today too. I can live with that! I don't expect fake threads to act like real threads in all cases. My assumption was that the BEGIN/END macros would do nothing under fake threads -- since there isn't a real thread backing it up, a fake thread can't yield in the middle of random C code (Python has no way to capture/restore the C state). I didn't picture fake threads working except as a Python-level feature, with context switches limited to bytecode boundaries (which a stackless ceval can handle with ease; the macro context switch above is "in the middle of" some bytecode's interpretation, and while "green threads" may be interested in simulating that, Tim's "fake threads" aren't). different-threads-for-different-heads-ly y'rs - tim From guido at CNRI.Reston.VA.US Tue Jun 29 14:01:30 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Tue, 29 Jun 1999 08:01:30 -0400 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: Your message of "Tue, 29 Jun 1999 00:16:34 EDT."
<000201bec1e6$2c496940$229e2299@tim> References: <000201bec1e6$2c496940$229e2299@tim> Message-ID: <199906291201.IAA02535@eric.cnri.reston.va.us> > [Tim, claims not to understand Guido's > > > While this seems possible at first, all blocking I/O calls would > > have to be redone to pass control to the thread scheduler, before > > this would be useful -- a huge task! > > ] > > [Guido replies, sketching an elaborate scheme for making threads that > are fake nevertheless act like real threads in the particular case of > potentially blocking I/O calls] [Tim responds, explaining that without this threads are quite useful.] I guess it's all in the perspective. 99.99% of all thread apps I've ever written use threads primarily to overlap I/O -- if there wasn't I/O to overlap I wouldn't use a thread. I think I share this perspective with most of the thread community (after all, threads originate in the OS world where they were invented as a replacement for I/O completion routines). (And no, I don't use threads to get the use of multiple CPUs, since I almost never have had more than one of those. And no, I wasn't expecting the read() to be fed from another thread.) As far as I can tell, all the examples you give are easily done using coroutines. Can we call whatever you're asking for coroutines instead of fake threads? I think that when you mention threads, green or otherwise colored, most people who are at all familiar with the concept will assume they provide I/O overlapping, except perhaps when they grew up in the parallel machine world. Certainly all examples I give in my never-completed thread tutorial (still available at http://www.python.org/doc/essays/threads.html) use I/O as the primary motivator -- this kind of example appeals to simple souls (e.g. downloading more than one file in parallel, which they probably have already seen in action in their web browser), as opposed to generators or pipelines or coroutines (for which you need to have some programming theory background to appreciate the powerful abstraction possibilities they give). Another good use of threads (suggested by Sam) is for GUI programming. An old GUI system, News by David Rosenthal at Sun, used threads programmed in PostScript -- very elegant (and it failed for other reasons -- if only he had used Python instead :-). On the other hand, having written lots of GUI code using Tkinter, the event-driven version doesn't feel so bad to me. Threads would be nice when doing things like rubberbanding, but I generally agree with Ousterhout's premise that event-based GUI programming is more reliable than thread-based. Every time your Netscape freezes you can bet there's a threading bug somewhere in the code. --Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm at hypernet.com Wed Jun 30 02:03:37 1999 From: gmcm at hypernet.com (Gordon McMillan) Date: Tue, 29 Jun 1999 19:03:37 -0500 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) In-Reply-To: <199906291201.IAA02535@eric.cnri.reston.va.us> References: Your message of "Tue, 29 Jun 1999 00:16:34 EDT." <000201bec1e6$2c496940$229e2299@tim> Message-ID: <1281421591-30373695@hypernet.com> I've been out of town, too (not with Skip), but I'll jump back in here... [Guido] > When the read() call is made, other threads can run. However in > green threads (e.g. using Christian's stackless Python, where a > thread switcher is easily added) the whole program would block at > this point.
The way to fix this is to have a way to tell the > scheduler "come back to this thread when there's input ready on > this fd". The scheduler has to combine such calls from all > threads into a single giant select. It gets more complicated when > you have blocking I/O I suppose, in the best of all possible worlds, this is true. But I'm fairly sure there are a number of well-used green thread implementations which go only part way - eg, if this is a "selectable" fd, do a select with a timeout of 0 on this one fd and choose to read/write or swap accordingly. That's a fair amount of bang for the buck, I think... [Tim] > Threads can be very useful purely as a means for algorithm > structuring, due to independent control flows. Spoken like a true schizo, Tim me boyos! Actually, you and Guido are saying almost the same thing - threads are useful when more than one thing is "driving" your processing. It's just that in the real world, that's almost always I/O, not some sick, tortured internal dialogue... I think the real question is: how useful would this be on a Mac? On Win31? (I'll answer that - useful, though I've finally got my last Win31 client to promise to upgrade, RSN ). - Gordon From MHammond at skippinet.com.au Wed Jun 30 01:47:26 1999 From: MHammond at skippinet.com.au (Mark Hammond) Date: Wed, 30 Jun 1999 09:47:26 +1000 Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? Message-ID: <006f01bec289$bf1e3a90$0801a8c0@bobcat> This probably isnt the correct list, but I really dont want to start a philosophical discussion - hopefully people here are both "in the know" and able to resist a huge thread :-) Especially given the recent slashdot flamefest between RMS and ESR, I thought it worth getting correct. I just read a statement early in our book - "Python is an Open Source tool, ...". Is this "near enough"? Should I avoid this term in preference for something more generic (ie, even simply dropping the caps?) - but the OS(tm) idea seems doomed anyway... Just-hoping-to-avoid-flame-mail-from-rabid-devotees-of-either-religion :-) Mark. From da at ski.org Wed Jun 30 08:16:01 1999 From: da at ski.org (David Ascher) Date: Tue, 29 Jun 1999 23:16:01 -0700 (Pacific Daylight Time) Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? In-Reply-To: <006f01bec289$bf1e3a90$0801a8c0@bobcat> Message-ID: On Wed, 30 Jun 1999, Mark Hammond wrote: > I just read a statement early in our book - "Python is an Open Source tool, > ...". > > Is this "near enough"? Should I avoid this term in preference for > something more generic (ie, even simply dropping the caps?) - but the > OS(tm) idea seems doomed anyway... It's not certified Open Source, but my understanding is that ESR believes the Python license would qualify if GvR applied for certification. BTW, you won't be able to avoid flames about something or other, and given that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and pseudo-RMSs, all Anonymous Cowards. =) --david From fredrik at pythonware.com Wed Jun 30 10:42:15 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 30 Jun 1999 10:42:15 +0200 Subject: [Python-Dev] Is Python Free Software, free software, Open Source,open source, etc? 
References: Message-ID: <012601bec2d4$74c315b0$f29b12c2@secret.pythonware.com> > BTW, you won't be able to avoid flames about something or other, and given > that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and > pseudo-RMSs, all Anonymous Cowards. =) just check the latest "learning python" review on Amazon... surely proves that perlers are weird people ;-) From guido at CNRI.Reston.VA.US Wed Jun 30 14:06:21 1999 From: guido at CNRI.Reston.VA.US (Guido van Rossum) Date: Wed, 30 Jun 1999 08:06:21 -0400 Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? In-Reply-To: Your message of "Tue, 29 Jun 1999 23:16:01 PDT." References: Message-ID: <199906301206.IAA04619@eric.cnri.reston.va.us> > On Wed, 30 Jun 1999, Mark Hammond wrote: > > > I just read a statement early in our book - "Python is an Open Source tool, > > ...". > > > > Is this "near enough"? Should I avoid this term in preference for > > something more generic (ie, even simply dropping the caps?) - but the > > OS(tm) idea seems doomed anyway... > > It's not certified Open Source, but my understanding is that ESR believes > the Python license would qualify if GvR applied for certification. I did, months ago, and haven't heard back yet. My current policy is to drop the initial caps and say "open source" -- most people don't know the difference anyway. > BTW, you won't be able to avoid flames about something or other, and given > that you're writing a Win32 book, you'll be flamed by both pseudo-ESRs and > pseudo-RMSs, all Anonymous Cowards. =) I don't have the time to read slashdot -- can anyone summarize what ESR and RMS were flaming about? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Wed Jun 30 14:22:09 1999 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 30 Jun 1999 14:22:09 +0200 Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc? References: <199906301206.IAA04619@eric.cnri.reston.va.us> Message-ID: <000701bec2f3$2df78430$f29b12c2@secret.pythonware.com> > I did, months ago, and haven't heard back yet. My current policy is > to drop the initial caps and say "open source" -- most people don't > know the difference anyway. and "Open Source" cannot be trademarked anyway... > I don't have the time to read slashdot -- can anyone summarize what > ESR and RMS were flaming about? the usual; RMS wrote in saying that 1) he's not part of the open source movement, 2) open source folks don't understand the real meaning of the word freedom, and 3) he's not a communist. ESR response is here: http://www.tuxedo.org/~esr/writings/shut-up-and-show-them.html ... OSI's tactics work. That's the easy part of the lesson. The hard part is that the FSF's tactics don't work, and never did. ... So the next time RMS, or anybody else, urges you to "talk about freedom", I urge you to reply "Shut up and show them the code." imo, the best thing is of course to ignore them both, and continue to ship great stuff under a truly open license... From bwarsaw at cnri.reston.va.us Wed Jun 30 14:54:06 1999 From: bwarsaw at cnri.reston.va.us (Barry A. Warsaw) Date: Wed, 30 Jun 1999 08:54:06 -0400 (EDT) Subject: [Python-Dev] Is Python Free Software, free software, Open Source, open source, etc?
References: <199906301206.IAA04619@eric.cnri.reston.va.us> <000701bec2f3$2df78430$f29b12c2@secret.pythonware.com> Message-ID: <14202.4974.162380.284749@anthem.cnri.reston.va.us> >>>>> "FL" == Fredrik Lundh writes: FL> imo, the best thing is of course to ignore them both, and FL> continue to ship great stuff under a truly open license... Agreed, of course. I think given the current state of affairs (i.e. the non-trademarkability of "Open Source", but also the mind share that little-oh, little-ess has gotten), we should say that Python (and JPython) are "open source" projects and let people make up their own minds about what that means. waiting-for-guido's-inevitable-faq-entry-ly y'rs, -Barry From tismer at appliedbiometrics.com Tue Jun 29 20:17:51 1999 From: tismer at appliedbiometrics.com (Christian Tismer) Date: Tue, 29 Jun 1999 20:17:51 +0200 Subject: Fake threads (was [Python-Dev] ActiveState & fork & Perl) References: <000201bec1e6$2c496940$229e2299@tim> <199906291201.IAA02535@eric.cnri.reston.va.us> Message-ID: <37790DCF.7C0E8FA@appliedbiometrics.com> Guido van Rossum wrote: > [Guido and Tim, different opinions named misunderstanding :] > > I guess it's all in the perspective. 99.99% of all thread apps I've > ever written use threads primarily to overlap I/O -- if there wasn't > I/O to overlap I wouldn't use a thread. I think I share this > perspective with most of the thread community (after all, threads > originate in the OS world where they were invented as a replacement > for I/O completion routines). > > (And no, I don't use threads to get the use of multiple CPUs, since I > almost never have had more than one of those. And no, I wasn't > expecting the read() to be fed from another thread.) > > As far as I can tell, all the examples you give are easily done using > coroutines. Can we call whatever you're asking for coroutines instead > of fake threads? I don't think this would match it. These threads can be implemented by coroutines which always run apart, and have some scheduling running. When there is polled I/O available, they can of course give a threaded feeling. If an application polls the kbhit function instead of reading, the other "threads" can run nicely. Can be quite useful for very small computers like CE. Many years before, I had my own threads under Turbo Pascal (I had no idea that these are called so). Ok, this was DOS, but it was enough of threading to have a "process" which smoothly updated a graphics screen, while another (single! :) "process" wrote data to the disk, a third one handled keyboard input, and a fourth drove a multichannel A/D sampling device. ? Oops, I just realized that these were *true* threads. The disk process would not run smooth, I agree. All the rest would be fine with green threads. ... > On the other hand, having written lots of GUI code using Tkinter, the > event-driven version doesn't feel so bad to me. Threads would be nice > when doing things like rubberbanding, but I generally agree with > Ousterhout's premise that event-based GUI programming is more reliable > than thread-based. Every time your Netscape freezes you can bet > there's a threading bug somewhere in the code. Right. But with a traceback instead of a machine hang, this could be more attractive to do. Green threads/coroutines are incredibly fast (one c call per switch). And since they have local state, you can save most of the attribute lookups which are needed with event based programming. (But this is all theory until we tried it). 
ciao - chris -- Christian Tismer :^) Applied Biometrics GmbH : Have a break! Take a ride on Python's Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net 10553 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF we're tired of banana software - shipped green, ripens at home
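[A purely illustrative sketch of the "nano-threads" idea from Christian's note above: a round-robin scheduler in modern Python, with generators standing in for the cheap coroutine switching he describes. Nothing here comes from stackless Python itself; the point is only that each task keeps its local state across switches, with no callback/state-machine bookkeeping.]

    # Sketch only: cooperative "nano-threads" via generators.
    def task(name, n):
        total = 0                      # local state survives every switch
        for i in range(n):
            total += i
            yield                      # cooperative switch point
        print(name, "done, total =", total)

    def run(tasks):
        queue = list(tasks)
        while queue:
            t = queue.pop(0)
            try:
                next(t)                # resume the task until its next switch point
                queue.append(t)        # still alive: back of the queue
            except StopIteration:
                pass                   # task finished

    run([task("A", 3), task("B", 5)])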