From tismer@tismer.com Wed Jan 1 02:44:04 2003
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 01 Jan 2003 03:44:04 +0100
Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors
References: <200212311738.SAA02244@proton.lysator.liu.se> <200212311822.gBVIMJM24712@odiug.zope.com>
Message-ID: <3E1255F4.7070001@tismer.com>

Guido van Rossum wrote:

> On the one hand I find this a nice backwards-compatible extension of
> divmod(). On the other hand I've coded this use case with separate
> divmod() calls and never had any trouble getting it right. So I think
> this doesn't address a real need, and it conflicts with the need to
> avoid unnecessary growth of the language. Explaining it (especially
> why it works backward) takes some effort, and I don't want to do that
> to all the book authors etc.

I totally agree. This nice extension to divmod() turns the function into something more polynomial algebra related, and divmod is no longer the proper name of the function. Also, the specialized use of it makes it more suitable to be put into a general algebra module for Python, for sure not into the builtins.

Having said that, let's have a closer look at divmod. It appears in 36 .py files out of about 1500 .py files in the Python 2.2 source distribution. This is little more than 2 percent.

The effort to correctly remember the order of the arguments and the results tuple does not suffice to justify the existence of this function at all. Furthermore, divmod does not provide any functionality that isn't easily obtained by a simple div and mod.

Finally, and for me this is the killer argument against divmod: The standard use of divmod is splitting numbers over small integer bases. I can only see an advantage for divisions which come at high computational cost, which is factorizing polynomials or large numbers.
But for the majority of applications, in a quick time measurement of the function call overhead against doing an inline div and mod, I found that divmod compared to / and % is at least 50 percent *slower* to compute, due to the fact that a function call, followed by building and unpacking a result tuple is more expensive than the optimized interpreter code. The real computation is negligible for small numbers, compared to the overhead of the engine.

Therefore, I consider divmod a waste of code and a function to be deprecated ASAP (and since years).

Save brain cells, save computation time, and save paper and ink of the book writers by dropping divmod!

Vice versa, if at all, I suggest a built in /% operator as a faster equivalent to single / and %, but I believe this code can be used more efficiently. divmod should be a special case of an algebraic extension module and removed from the builtins.

Python has got a lot of extensions, modelled towards more user-friendliness. I do think that this goal can be achieved not only by extending, but also by deprecating functions which have little or no benefits over built-in operators. The existence of divmod gives the wrong feeling of optimizing one's code, which is a lie to the user.

divmod does not save computation time, does not reduce programming time, does not simplify programs, eats paper and code space. divmod is also not even found in common compiled languages any longer. Even there, division isn't expensive enough to justify yet another function to remember and to implement.

Please drop divmod. We will not miss it.

from-__past__-import-divmod - ly y'rs -- chris

-- Christian Tismer :^) Mission Impossible 5oftware : Have a break!
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From guido@python.org Wed Jan 1 04:22:00 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 31 Dec 2002 23:22:00 -0500 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors In-Reply-To: Your message of "Wed, 01 Jan 2003 03:44:04 +0100." <3E1255F4.7070001@tismer.com> References: <200212311738.SAA02244@proton.lysator.liu.se> <200212311822.gBVIMJM24712@odiug.zope.com> <3E1255F4.7070001@tismer.com> Message-ID: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> > Therefore, I consider divmod a waste of code and > a function to be deprecated ASAP (and since years). Maybe it wasn't such a good idea. > Save brain cells, save computation time, and save > paper and ink of the book writers by dropping divmod! At this point, any change causes waste, so unless it's truly broken (which I don't believe) I'm not for changing it. > Vice versa, if at all, I suggest a built in /% operator > as a faster equivalen to single / and %, but I believe > this code can be used more efficiently. > divmod should be a special case of an algebraic > extension module and removed from the builtins. That would have to be //% to be in line with the // operator, of course. But I'd rather spend the effort towards making the compiler smart enough to recognize that divmod is a built-in so it can generate more efficient code (and I *am* prepared to make the necessary -- slight -- changes to the language so that the compiler can make this deduction safely). 
> divmod does not save computation time, does not > reduce programming time, does not simplify programs, > eats paper and code space I find mm, ss = divmod(ss, 60) easier to read than mm, ss = ss//60, ss%60 --Guido van Rossum (home page: http://www.python.org/~guido/) From tismer@tismer.com Wed Jan 1 09:00:31 2003 From: tismer@tismer.com (Christian Tismer) Date: Wed, 01 Jan 2003 10:00:31 +0100 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors References: <200212311738.SAA02244@proton.lysator.liu.se> <200212311822.gBVIMJM24712@odiug.zope.com> <3E1255F4.7070001@tismer.com> <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E12AE2F.2010805@tismer.com> Guido van Rossum wrote: ... >>Vice versa, if at all, I suggest a built in /% operator >>as a faster equivalen to single / and %, but I believe >>this code can be used more efficiently. >>divmod should be a special case of an algebraic >>extension module and removed from the builtins. > > > That would have to be //% to be in line with the // operator, of > course. But I'd rather spend the effort towards making the compiler > smart enough to recognize that divmod is a built-in so it can generate > more efficient code (and I *am* prepared to make the necessary -- > slight -- changes to the language so that the compiler can make this > deduction safely). I stand corrected. And yes, catching certain builtins sounds like an optimzation path that is still open. Hey, len would have an incredible boost!!! >>divmod does not save computation time, does not >>reduce programming time, does not simplify programs, >>eats paper and code space > > > I find > > mm, ss = divmod(ss, 60) > > easier to read than > > mm, ss = ss//60, ss%60 Sure it is. Until Silvester night, I even had no idea that divmod is so drastically slower. Hee hee. Which gave me a nice chance to start a whole rant towards shrinking the language, which I found an attractive new way to shake you up. 
Don't take it too serious :-)) I-wish-you-a-divmod-ly-new-2003-and-all-the-best -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer@tismer.com Wed Jan 1 11:25:05 2003 From: tismer@tismer.com (Christian Tismer) Date: Wed, 01 Jan 2003 12:25:05 +0100 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors References: <200212311738.SAA02244@proton.lysator.liu.se> <200212311822.gBVIMJM24712@odiug.zope.com> <3E1255F4.7070001@tismer.com> <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E12D011.5050104@tismer.com> Guido van Rossum wrote: ... > I find > > mm, ss = divmod(ss, 60) > > easier to read than > > mm, ss = ss//60, ss%60 Before I retract completely, I have a small addition. mm = ss//60; ss %= 60 is very readable to me, very efficient, and doesn't use extra tuples. It is also 2 characters shorter than the divmod version (both with the usual spacing, of course). On the other hand, divmod clearly says what is going to happen, and in fact it is a higher level approach. On dropping features, I came to this in the morning: You said you might optimize certain builtins by the compiler, instead of removing divmod. How about this: If divmod is supported by the compiler, there isn't any reason to keep it as a compiled C module. Instead, divmod could be a piece of Python code, which is injected into builtins somewhere at startup time. Since he compiler supports it, it just needs to be there at all, but does not need to waste space in the Python executable. I'm not saying this just to fight divmod out of the language. This is ridiculous. 
But in the long term, there might be quite a lot of redundancy introduced by enhanced versions of the compiler. Instead of letting the code grow on and on, I'd like to propose some shrinking attempt like in this case. If the compiler eats 95 percent of calls to some code already, I believe it makes sense to replace that piece of code by Python code and drop the C version. The smaller the C code base, the better for Python. Less static code also improves the effect of compilers like Psyco. Let's make a smaller and better kernel. Get rid of ballast and have less code to maintain. This is a real wish and *no* Silvester joke :-) cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From ben@algroup.co.uk Wed Jan 1 12:58:43 2003 From: ben@algroup.co.uk (Ben Laurie) Date: Wed, 01 Jan 2003 12:58:43 +0000 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors References: <200212311738.SAA02244@proton.lysator.liu.se> <20021231184457.GE31681@laranja.org> Message-ID: <3E12E603.9080809@algroup.co.uk> Andrew Koenig wrote: > Lalo> I'm +0 on this (reasons below), but since Guido already said no: > Lalo> I find this function pretty useful, as it serves to collapse > Lalo> ugly code, and I hate ugly python code. However, the name > Lalo> sucks. With your changes, 'divmod' doesn't describe it anymore. > > Lalo> So I suggest you come up with a better name and offer it as a > Lalo> contribution to, say, the math module. In that case I would > Lalo> prefer to have the reverse function too. > > If it helps, APL uses the names "base" and "represent" for these > two functions. 
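For readers unfamiliar with the pair of operations being named here, a minimal Python sketch follows. The function names `represent` and `base` are simply borrowed from the wording above and are illustrative only; neither is a builtin, and this is not taken from the PEP's own implementation.

```python
def represent(value, radices):
    """Split value into mixed-radix digits, most significant first.

    radices gives the size of every position except the leading one,
    e.g. (24, 60, 60) for days / hours / minutes / seconds.
    """
    digits = []
    # Work from the least significant position upwards.
    for radix in reversed(radices):
        value, digit = divmod(value, radix)
        digits.append(digit)
    digits.append(value)  # whatever is left is the leading digit
    digits.reverse()
    return tuple(digits)


def base(digits, radices):
    """Inverse of represent(): recombine mixed-radix digits into one number."""
    value = digits[0]
    for digit, radix in zip(digits[1:], radices):
        value = value * radix + digit
    return value


# 93784 seconds is 1 day, 2 hours, 3 minutes and 4 seconds.
parts = represent(93784, (24, 60, 60))
total = base(parts, (24, 60, 60))
```

The splitting loop has to start with the last divisor, which is one way to see the "works backward" remark made earlier in the thread.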
I thought it used "radix" and not "base"? Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff From guido@python.org Wed Jan 1 13:56:25 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 01 Jan 2003 08:56:25 -0500 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors In-Reply-To: Your message of "Wed, 01 Jan 2003 12:25:05 +0100." <3E12D011.5050104@tismer.com> References: <200212311738.SAA02244@proton.lysator.liu.se> <200212311822.gBVIMJM24712@odiug.zope.com> <3E1255F4.7070001@tismer.com> <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <3E12D011.5050104@tismer.com> Message-ID: <200301011356.h01DuP517028@pcp02138704pcs.reston01.va.comcast.net> > > I find > > > > mm, ss = divmod(ss, 60) > > > > easier to read than > > > > mm, ss = ss//60, ss%60 > > Before I retract completely, I have a small addition. > > mm = ss//60; ss %= 60 > > is very readable to me, very efficient, and doesn't use > extra tuples. A good optimizing compiler could also get rid of extra tuples. > It is also 2 characters shorter than the > divmod version (both with the usual spacing, of course). Shame on you. Would you rather write Perl? :-) > On the other hand, divmod clearly says what is going to > happen, and in fact it is a higher level approach. Yes. > On dropping features, I came to this in the morning: > > You said you might optimize certain builtins by the > compiler, instead of removing divmod. > > How about this: > If divmod is supported by the compiler, there isn't > any reason to keep it as a compiled C module. Instead, > divmod could be a piece of Python code, which is > injected into builtins somewhere at startup time. > Since he compiler supports it, it just needs to be > there at all, but does not need to waste space in > the Python executable. 
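As an illustration of the idea quoted above, a pure-Python stand-in could look roughly like this. The name `py_divmod` is hypothetical, and the sketch is only claimed to match the builtin for integer operands.

```python
def py_divmod(a, b):
    """Pure-Python stand-in for the builtin divmod() (illustrative only)."""
    # Two separate operations: one floor division, one modulo.
    return a // b, a % b


# The spellings compared earlier in the thread agree for integers.
ss = 3725
mm, rest = py_divmod(ss, 60)
same_as_builtin = py_divmod(ss, 60) == divmod(ss, 60)
```

Note that this version divides twice, which costs little for small integers but becomes noticeable when the operands are very large.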
But what about long division, where divmod can (in extreme cases) really save time by doing the division only once? > I'm not saying this just to fight divmod out of the > language. This is ridiculous. But in the long term, > there might be quite a lot of redundancy introduced > by enhanced versions of the compiler. Instead of > letting the code grow on and on, I'd like to propose > some shrinking attempt like in this case. > If the compiler eats 95 percent of calls to some code > already, I believe it makes sense to replace that > piece of code by Python code and drop the C version. > > The smaller the C code base, the better for Python. > Less static code also improves the effect of compilers > like Psyco. Let's make a smaller and better kernel. > Get rid of ballast and have less code to maintain. Let's first make some steps towards the better compiler I alluded to. Then we can start cutting. > This is a real wish and *no* Silvester joke :-) If only I knew what Silvester was. Is it a German holiday celebrating the invention of root beer? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Wed Jan 1 17:35:34 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 01 Jan 2003 12:35:34 -0500 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors In-Reply-To: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Christian Tismer] >> Therefore, I consider divmod a waste of code and >> a function to be deprecated ASAP (and since years). [Guido] > Maybe it wasn't such a good idea. You can remove divmod() when I'm dead. Before then, it stays. I'm sure all will agree that's a reasonable compromise. > ... > But what about long division, where divmod can (in extreme cases) > really save time by doing the division only once? On my box, divmod(x, y) is already 40% faster than "x//y; x%y" when x is a 63-bit int and y is 137354. 
In the hundreds of places I've used it in programs slinging multi-thousand bit integers, it's essentially twice as fast. But I don't care so much about that as that divmod() is the right conceptual spelling for the operation it performs.

If we have to drop a builtin, I never liked reduce , although Jeremy pointed out that its most common use case no longer requires writing a lambda, or importing operator.add:

>>> reduce(int.__add__, range(11))
55
>>>

This is a little surprising if you toss a mix of types into it, though, since int.__add__ isn't operator.add:

>>> reduce(int.__add__, [1, 2, 3.0])
NotImplemented
>>>

OTOH,

>>> reduce(float.__add__, [1, 2, 3.0])
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: descriptor '__add__' requires a 'float' object but received a 'int'
>>>

It's a good way to reverse-engineer the internals .

"on-topic"-is-my-middle-name-ly y'rs - tim

From barry@python.org Wed Jan 1 19:36:32 2003
From: barry@python.org (Barry A. Warsaw)
Date: Wed, 1 Jan 2003 14:36:32 -0500
Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors
References: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15891.17216.321848.816687@gargle.gargle.HOWL>

>>>>> "TP" == Tim Peters writes:

TP> If we have to drop a builtin, I never liked reduce ,
TP> although Jeremy pointed out that its most common use case no
TP> longer requires writing a lambda, or importing operator.add:

Hey, if we'll killing off builtins, I vote for apply().
-Barry From python@rcn.com Wed Jan 1 19:59:04 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 1 Jan 2003 14:59:04 -0500 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors References: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <15891.17216.321848.816687@gargle.gargle.HOWL> Message-ID: <004401c2b1d0$3d320f40$125ffea9@oemcomputer> [Tim] > TP> If we have to drop a builtin, I never liked reduce , > TP> although Jeremy pointed out that its most common use case no > TP> longer requires writing a lambda, or importing operator.add: [Barry] > Hey, if we'll killing off builtins, I vote for apply(). buffer() and intern() are two candidates for least understood, least used, and most likely not to be missed. Raymond Hettinger From tismer@tismer.com Wed Jan 1 20:09:10 2003 From: tismer@tismer.com (Christian Tismer) Date: Wed, 01 Jan 2003 21:09:10 +0100 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors References: <200212311738.SAA02244@proton.lysator.liu.se> <200212311822.gBVIMJM24712@odiug.zope.com> <3E1255F4.7070001@tismer.com> <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <3E12D011.5050104@tismer.com> <200301011356.h01DuP517028@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E134AE6.4070903@tismer.com> Guido van Rossum wrote: ... >> mm = ss//60; ss %= 60 >> >>is very readable to me, very efficient, and doesn't use >>extra tuples. > > A good optimizing compiler could also get rid of extra tuples. Yes. It could also figure out how to save one division if it sees the above line instead of divmod. But do you have it? ... >>On dropping features, I came to this in the morning: >> >>You said you might optimize certain builtins by the >>compiler, instead of removing divmod. [proposal to replace divmod by Python code] > But what about long division, where divmod can (in extreme cases) > really save time by doing the division only once? 
Would this case be special-cased in the interpreter code? If not, the long divmod version would probably not be replaced. ... > Let's first make some steps towards the better compiler I alluded to. > Then we can start cutting. If this is a promise, I'm happy. Enhancing the compiler *and* cutting code really improves the ovrall quality. >>This is a real wish and *no* Silvester joke :-) > > > If only I knew what Silvester was. Is it a German holiday celebrating > the invention of root beer? :-) Silvester is New Year's Eve; Hogmanay in Scotland. We have lots of celebration, drinks, and many fire works at midnight, to bomb the bad ghosts away for the new year. We also define new goals, things to do better. One of my goals is to enhance Python while reducing its C code base. Another one is to get the portable part of Stackless 3.0 into the Python core. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer@tismer.com Wed Jan 1 20:16:38 2003 From: tismer@tismer.com (Christian Tismer) Date: Wed, 01 Jan 2003 21:16:38 +0100 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors References: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <15891.17216.321848.816687@gargle.gargle.HOWL> <004401c2b1d0$3d320f40$125ffea9@oemcomputer> Message-ID: <3E134CA6.7000004@tismer.com> Raymond Hettinger wrote: > [Tim] > >> TP> If we have to drop a builtin, I never liked reduce , >> TP> although Jeremy pointed out that its most common use case no >> TP> longer requires writing a lambda, or importing operator.add: > > > [Barry] > >>Hey, if we'll killing off builtins, I vote for apply(). 
> > > buffer() and intern() are two candidates for least understood, > least used, and most likely not to be missed. Hey, this is the positive response I was hoping for. You got the point. We do have functions which might perhaps not be dropped, but removed from C code. There is also a side effect for apply(), for instance: Controversely to divmod, apply is still found in 166 .py files of the 2.2 source dist. This is still more than 10 %, although apply() can be completely replaced by the new calling patterns. I believe that in the code base of the average user, this will look even worse, since nobody cares about changing working code. If we now quasi-deprecate apply by making it *slower*, simply by replacing the C code by a Python function that itself uses the new calling style, then we have less C code, still compatibility, *and* a good argument for everybody to replace apply by the new way to go. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From mal@lemburg.com Wed Jan 1 23:34:35 2003 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 02 Jan 2003 00:34:35 +0100 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors In-Reply-To: <004401c2b1d0$3d320f40$125ffea9@oemcomputer> References: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <15891.17216.321848.816687@gargle.gargle.HOWL> <004401c2b1d0$3d320f40$125ffea9@oemcomputer> Message-ID: <3E137B0B.2040101@lemburg.com> Raymond Hettinger wrote: > [Tim] > >> TP> If we have to drop a builtin, I never liked reduce , >> TP> although Jeremy pointed out that its most common use case no >> TP> longer requires writing a lambda, or importing operator.add: > > [Barry] > >>Hey, if we'll killing off builtins, I vote for apply(). > > buffer() and intern() are two candidates for least understood, > least used, and most likely not to be missed. Hey, this is (or was) January 1st, not April 1st. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tismer@tismer.com Thu Jan 2 00:23:50 2003 From: tismer@tismer.com (Christian Tismer) Date: Thu, 02 Jan 2003 01:23:50 +0100 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors References: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <15891.17216.321848.816687@gargle.gargle.HOWL> <004401c2b1d0$3d320f40$125ffea9@oemcomputer> <3E137B0B.2040101@lemburg.com> Message-ID: <3E138696.2070006@tismer.com> M.-A. Lemburg wrote: > Raymond Hettinger wrote: > >> [Tim] >> >>> TP> If we have to drop a builtin, I never liked reduce , >>> TP> although Jeremy pointed out that its most common use case no >>> TP> longer requires writing a lambda, or importing operator.add: >> >> >> [Barry] >> >>> Hey, if we'll killing off builtins, I vote for apply(). 
>> >> >> buffer() and intern() are two candidates for least understood, >> least used, and most likely not to be missed. > > > Hey, this is (or was) January 1st, not April 1st. Right. On April 1st, the code size of Python will be fixed to a certain amount, and every additional C code must come with an equivalent amount of code to be dropped. Until then, starting with January 1st, you can get bonus code by proposing code obsoletion in advance. Later code reductions will be laid out in a PEP. The Python C code should be shrunk to 50% by end of 2004. The goal is to get it down to the pure bootstrap code of a JIT until 2007. This should be doable within at most 8 KB of binary code. -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tim@zope.com Thu Jan 2 00:46:00 2003 From: tim@zope.com (Tim Peters) Date: Wed, 1 Jan 2003 19:46:00 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <3E123925.9060901@zope.com> Message-ID: The datetime module underwent a number of last-minute changes to make writing tzinfo classes a pleasure instead of an expert-level nightmare (which latter is what it turned to be -- it's not anymore). Realistic examples can be found in the Python CVS datetime sandbox, at http://tinyurl.com/3zr7 US.py Some US time zones EU.py Some European time zones PSF.py Essential if you're a PSF Director dateutil.py Examples of using the datetime module to compute other kinds of practical stuff (like the 2nd-last Tuesday in August) We still have two time zone conversion problems, and always will (they come with the territory -- they're inherent problems, not artifacts of the implementation). 
They only arise in a tzinfo class that's trying to model both standard and daylight time. In effect, that's one class trying to pretend it's two different time zones, and at the transition points there are "impossible problems".

For concreteness, I'll use US Eastern here, UTC-0500. On some day in April, DST starts at the minute following the local wall clock's 1:59. On some day in October, DST ends at the minute following the local wall clock's 1:59. Here's a picture:

           UTC   3:MM  4:MM  5:MM  6:MM  7:MM  8:MM
      standard  22:MM 23:MM  0:MM  1:MM  2:MM  3:MM
      daylight  23:MM  0:MM  1:MM  2:MM  3:MM  4:MM

    wall start  22:MM 23:MM  0:MM  1:MM  3:MM  4:MM

      wall end  23:MM  0:MM  1:MM  1:MM  2:MM  3:MM

UTC, EST and EDT are all self-consistent and trivial. It's only wall time that's a problem, and only at the transition points:

1. When DST starts (the "wall start" line), the wall clock leaps from 1:59 to 3:00. A wall time of the form 2:MM doesn't really make sense on that day. The example classes do what I believe is the best that can be done: since 2:MM is "after 2" on that day, it's taken as daylight time, and so as an alias for 1:MM standard == 1:MM wall on that day, which is a time that does make sense on the wall clock that day. The astimezone() function ensures that the "impossible hour" on that day is never the result of a conversion (you'll get the standard-time spelling instead). If you don't think that's the best that can be done, speak up now.

2. When DST ends (the "wall end" line), we have a worse problem: there's an hour that can't be spelled *at all* in wall time. It's the hour beginning at the moment DST ends; in the example, that's times of the form 6:MM UTC on the day daylight time ends. The local wall clock leaps from 1:59 (daylight time) back to 1:00 again (but the second time as a standard time). The hour 6:MM UTC looks like 1:MM, but so does the hour 5:MM UTC on that day. A reasonable tzinfo class should take 1:MM as being daylight time on that day, since it's "before 2".
As a consequence, the hour 6:MM UTC has no wall-clock spelling at all. This can't be glossed over. If you code a tzinfo class to take 1:MM as being standard time on that day instead, then the UTC hour 5:MM becomes unspellable in wall time instead. No matter how you cut it, the redundant spellings of an hour on the day DST starts means there's an hour that can't be spelled at all on the day DST ends (so in that sense, they're the two sides of a single problem). What to do? The current implementation of dt.astimezone(tz) raises ValueError if dt can't be expressed as a local time in tz. That's the "errors should never pass silently" school, which I briefly attended in college . If you don't like that, what would you rather see happen? Try to be precise, and remember that getting a "correct" time in tz is flatly impossible in this case. Two other debatable edge cases in the implementation of dt.astimezone(tz): 3. If dt.tzinfo.dst(dt) returns None, the current implementation takes that as a synonym for 0. Perhaps it should raise an exception instead. 4. If dt.tzinfo.utcoffset(dt) first returns an offset, and on a subsequent call (while still trying to figure out the same conversion)returns None, an exception is raised. Those who don't want unspellable hours to raise an exception may also want inconsistent tzinfo implementations to go without complaint. If so, what do you want it to do instead? From aahz@pythoncraft.com Thu Jan 2 01:02:35 2003 From: aahz@pythoncraft.com (Aahz) Date: Wed, 1 Jan 2003 20:02:35 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: References: <3E123925.9060901@zope.com> Message-ID: <20030102010235.GA25005@panix.com> On Wed, Jan 01, 2003, Tim Peters wrote: > > This can't be glossed over. If you code a tzinfo class to take 1:MM as > being standard time on that day instead, then the UTC hour 5:MM becomes > unspellable in wall time instead. 
No matter how you cut it, the redundant > spellings of an hour on the day DST starts means there's an hour that can't > be spelled at all on the day DST ends (so in that sense, they're the two > sides of a single problem). > > What to do? The current implementation of dt.astimezone(tz) raises > ValueError if dt can't be expressed as a local time in tz. That's the > "errors should never pass silently" school, which I briefly attended in > college . If you don't like that, what would you rather see happen? > Try to be precise, and remember that getting a "correct" time in tz is > flatly impossible in this case. What exactly do you mean by "wall time"? Is it supposed to be a timezone-less value? If yes, you can't convert between wall time and any timezone-based value without specifying the time zone the wall clock lives in, which has to be relative to UTC. This only becomes a serious problem for round-trip conversions. Does it make any sense to include a flag if a time value ever gets converted between wall time and any other kind of time? Every other datetime package seems to live with the wiggle involved in round-trip conversions, why not Python? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "There are three kinds of lies: Lies, Damn Lies, and Statistics." --Disraeli From tim@zope.com Thu Jan 2 01:24:00 2003 From: tim@zope.com (Tim Peters) Date: Wed, 1 Jan 2003 20:24:00 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <20030102010235.GA25005@panix.com> Message-ID: [Aahz] > What exactly do you mean by "wall time"? Is it supposed to be a > timezone-less value? The diagram in the original gave agonizingly detailed answers to these questions. US Eastern as a tzinfo class tries to combine both EDT and EST; please read the original msg again, and note that the two "wall time" lines in the diagram show what US Eastern displays at all the relevant UTC times. 
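The two wall-time lines of that diagram can be reproduced with a few lines of arithmetic. This is only a sketch under the assumptions stated in the original message (US Eastern, UTC-0500 standard and UTC-0400 daylight, switch points as in the picture); the function names are made up for illustration, and nothing here uses or describes the datetime implementation.

```python
STD_OFFSET = -5  # assumed: US Eastern standard time, hours from UTC
DST_OFFSET = -4  # assumed: US Eastern daylight time, hours from UTC


def wall_hour_on_start_day(utc_hour):
    """Wall-clock hour on the day DST starts (switch at 7:00 UTC)."""
    offset = DST_OFFSET if utc_hour >= 7 else STD_OFFSET
    return (utc_hour + offset) % 24


def wall_hour_on_end_day(utc_hour):
    """Wall-clock hour on the day DST ends (switch at 6:00 UTC)."""
    offset = DST_OFFSET if utc_hour < 6 else STD_OFFSET
    return (utc_hour + offset) % 24


utc_hours = range(3, 9)  # the six UTC columns of the diagram
wall_start = [wall_hour_on_start_day(h) for h in utc_hours]
wall_end = [wall_hour_on_end_day(h) for h in utc_hours]

print("wall start", wall_start)
print("wall end  ", wall_end)
```

The start-day row never shows hour 2, and the end-day row shows hour 1 for two different UTC hours, which is why a wall-clock reading alone cannot tell 5:MM UTC and 6:MM UTC apart on that day.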
> If yes, you can't convert between wall time and any timezone-based > value without specifying the time zone the wall clock lives in, > which has to be relative to UTC. See the original; the relationship to UTC was explicit at all points there. > This only becomes a serious problem for round-trip conversions. The major problem here was in one-way conversion; see the original; two-way conversion T1 -> T2 -> T1 is impossible if the time in T1 can't be represented at all in T2 (which is the case here). > Does it make any sense to include a flag if a time value ever gets > converted between wall time and any other kind of time? > > Every other datetime package seems to live with the > wiggle involved in round-trip conversions, why not Python? Round-trip conversions aren't at issue here, except to the extent that the "unspellable hour" at the end of DST makes some one-way conversions impossible. From tim.one@comcast.net Thu Jan 2 01:38:59 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 01 Jan 2003 20:38:59 -0500 Subject: [Python-Dev] bugs in stdlib In-Reply-To: <20021231162406.GA22850@thyrsus.com> Message-ID: [Eric S. Raymond] > ... > You may not be alone, I fear. I almost blew past mail from Dennis > Ritchie last night, just barely caught myself in time. I'm wondering > now what other important bits I lost... :-( :-( :-( Linus sent a bunch of frenzied questions to Python-Dev about CML2, and told us to "piss off, the bastard hasn't responded in weeks" when we said we were sure you'd respond soon. Other than that, probably not much. Since you surely missed my happy new year wishes too, let me repeat them: Happy New Year, Eric! From tim.one@comcast.net Thu Jan 2 01:43:48 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 01 Jan 2003 20:43:48 -0500 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules _randommodule.c,1.1,1.2 In-Reply-To: <20021231222823.GB20404@epoch.metaslash.com> Message-ID: >> ... 
>> DEFERRED(PyObject_GenericGetAttr), /*tp_getattro*/ [NealN] > I agree, this makes sense. The only problem is that the > functions/methods will have to be specified twice. It would be nice > if we only had to specify each once. Any ideas? None beyond not making A Project out of it -- tricks with the C preprocessor are usually worse than the diseases they're trying to cure, so KISS. > There are several places this has already been done (and at least one > more coming). Do you want a global Py_DEFERRED() or some other name? Py_DEFERRED would be fine. So would a pile of local DEFERREDs: a trivial one-line one-token macro that can't be screwed up isn't really aching for generalization or factorization, although it may be nice to have a longer comment explaining the need for it in just one place. From python@rcn.com Thu Jan 2 01:56:07 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 1 Jan 2003 20:56:07 -0500 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules _randommodule.c,1.1,1.2 References: Message-ID: <003b01c2b202$1e1eaa00$125ffea9@oemcomputer> How about having specific markers such as DEFERRED_GenericGetAttr and then having PyType_Ready scan for them and replace them with the appropriate pointers?
From guido@python.org Thu Jan 2 02:19:39 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 01 Jan 2003 21:19:39 -0500 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors In-Reply-To: Your message of "Wed, 01 Jan 2003 14:36:32 EST." <15891.17216.321848.816687@gargle.gargle.HOWL> References: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <15891.17216.321848.816687@gargle.gargle.HOWL> Message-ID: <200301020219.h022Jea18255@pcp02138704pcs.reston01.va.comcast.net> > Hey, if we'll killing off builtins, I vote for apply(). Agreed, it's redundant. You can help by checking in documentation that marks it as deprecated and code that adds a PendingDeprecationWarning to it (unfortunately it's so common that I wouldn't want to risk a plain DeprecationWarning). BTW, I recently find myself longing for an extension of the *args, **kw syntax, as follows: foo(1, 2, 3, *args, 5, 6, 7, **kw) ^^^^^^^ This part is currently not allowed. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 2 02:51:37 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 01 Jan 2003 21:51:37 -0500 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules _randommodule.c,1.1,1.2 In-Reply-To: Your message of "Wed, 01 Jan 2003 20:56:07 EST."
<003b01c2b202$1e1eaa00$125ffea9@oemcomputer> References: <003b01c2b202$1e1eaa00$125ffea9@oemcomputer> Message-ID: <200301020251.h022pbb18427@pcp02138704pcs.reston01.va.comcast.net> > How about having specific markers such as DEFERRED_GenericGetAttr > and then having PyType_Ready scan for them and replace them with the > appropriate pointers? -1. Violates the KISS principle in many ways. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 2 03:20:05 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 01 Jan 2003 22:20:05 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Wed, 01 Jan 2003 19:46:00 EST." References: Message-ID: <200301020320.h023K5318509@pcp02138704pcs.reston01.va.comcast.net> > 1. When DST starts (the "wall start" line), the wall clock leaps from 1:59 > to 3:00. A wall time of the form 2:MM doesn't really make sense on > that day. The example classes do what I believe is the best that > can be done: since 2:MM is "after 2" on that day, it's taken as > daylight time, and so as an alias for 1:MM standard == 1:MM wall on > that day, which is a time that does make sense on the wall clock > that day. The astimezone() function ensures that the "impossible > hour" on that day is never the result of a conversion (you'll get > the standard-time spelling instead). If you don't think that's the > best that can be done, speak up now. I don't particularly care one way or the other, but when I'm *awake* during this hour, 2:MM more likely means that I forgot to move my clock forward, so it may make more sense to interpret it as standard time after all (meaning it should be switched to 3:MM DST rather than 1:MM STD). > 2. When DST ends (the "wall end" line), we have a worse problem: > there'a an hour that can't be spelled *at all* in wall time. It's > the hour beginning at the moment DST ends; in the example, that's > times of the form 6:MM UTC on the day daylight time ends. 
The local > wall clock leaps from 1:59 (daylight time) back to 1:00 again (but > the second time as a standard time). The hour 6:MM UTC looks like > 1:MM, but so does the hour 5:MM UTC on that day. A reasonable > tzinfo class should take 1:MM as being daylight time on that day, > since it's "before 2". As a consequence, the hour 6:MM UTC has no > wall-clock spelling at all. > > This can't be glossed over. If you code a tzinfo class to take 1:MM > as being standard time on that day instead, then the UTC hour 5:MM > becomes unspellable in wall time instead. No matter how you cut it, > the redundant spellings of an hour on the day DST starts means > there's an hour that can't be spelled at all on the day DST ends (so > in that sense, they're the two sides of a single problem). > > What to do? The current implementation of dt.astimezone(tz) raises > ValueError if dt can't be expressed as a local time in tz. That's > the "errors should never pass silently" school, which I briefly > attended in college . If you don't like that, what would you > rather see happen? Try to be precise, and remember that getting a > "correct" time in tz is flatly impossible in this case. One (perhaps feeble) argument against raising ValueError here is that this introduces a case where a calculation that normally never raises an error (assuming sane timezones) can raise an exception for one hour a year. If you run a webserver that e.g. tries to render the current time at the server (which is represented in UTC of course) in the end user's local time, it would be embarrassing if this caused an error page when the end user's local time happens to be in the unrepresentable hour (i.e. one hour per year). You really want to render it as 1:MM, since the user should know whether his DST has already ended or not yet. While in an ideal world the programmer would have read the docs for .astimezone() and heeded the warning to catch ValueError (and then display what? 
UTC?), realistically if the programmer is a mere mortal and found no problems during testing, she will be embarrassed by the error page. (For worse effect, imagine the server running in a space probe :-) The counterargument is of course a use case where the time displayed is of real importance, and we would rather die than show the wrong time. The C standard library (following the original Unix treatment of timezones) has a solution for this: it adds a three-valued flag to the local time which indicates whether DST is in effect or not. Normally, you can set this flag to -1 ("don't know") in which case the proper value is calculated from the DST rules. But for ambiguous local times (the hour at the end of DST), setting it to 0 or 1 makes the times unambiguous again. This would mean that we'd have to add an "isdst" field to datetimetz objects and ways to set it; it would default to -1 (or perhaps to the proper value based on DST rules) but .astimezone() could set it explicitly to 0 or 1 to differentiate between the two ambiguous times. > Two other debatable edge cases in the implementation of > dt.astimezone(tz): > > 3. If dt.tzinfo.dst(dt) returns None, the current implementation > takes that as a synonym for 0. Perhaps it should raise an exception > instead. Why would dst() return None for a tzinfo object whose utcoffset() returns a definite value? I think that's a poor tzinfo implementation and an exception would be appropriate. > 4. If dt.tzinfo.utcoffset(dt) first returns an offset, and on a > subsequent call (while still trying to figure out the same > conversion) returns None, an exception is raised. Those who don't > want unspellable hours to raise an exception may also want > inconsistent tzinfo implementations to go without complaint. If so, > what do you want it to do instead? I think this could only happen if a tzinfo's utcoffset() returns None for *some* times but not for others, right? I don't think such a tzinfo should be considered sane. 
Hmm, perhaps that's how a tzinfo object would signal that a particular local time is illegal (e.g. the hour at the start of DST). Then astimezone() would have to worm around that in some other way. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Thu Jan 2 03:42:02 2003 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 1 Jan 2003 22:42:02 -0500 Subject: [Python-Dev] Holes in time References: <200301020320.h023K5318509@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15891.46346.346014.611286@gargle.gargle.HOWL> >>>>> "GvR" == Guido van Rossum writes: GvR> The C standard library (following the original Unix treatment GvR> of timezones) has a solution for this: it adds a three-valued GvR> flag to the local time which indicates whether DST is in GvR> effect or not. Normally, you can set this flag to -1 ("don't GvR> know") in which case the proper value is calculated from the GvR> DST rules. But for ambiguous local times (the hour at the GvR> end of DST), setting it to 0 or 1 makes the times unambiguous GvR> again. This would mean that we'd have to add an "isdst" GvR> field to datetimetz objects and ways to set it; it would GvR> default to -1 (or perhaps to the proper value based on DST GvR> rules) but .astimezone() could set it explicitly to 0 or 1 to GvR> differentiate between the two ambiguous times. +1, but let's use True, False, and None. :) -Barry From paul-python@svensson.org Thu Jan 2 03:55:24 2003 From: paul-python@svensson.org (Paul Svensson) Date: Wed, 1 Jan 2003 22:55:24 -0500 (EST) Subject: [Python-Dev] Holes in time In-Reply-To: <15891.46346.346014.611286@gargle.gargle.HOWL> Message-ID: <20030101224621.E79442-100000@familjen.svensson.org> On Wed, 1 Jan 2003, Barry A. 
Warsaw wrote: > >>>>>> "GvR" == Guido van Rossum writes: > > GvR> The C standard library (following the original Unix treatment > GvR> of timezones) has a solution for this: it adds a three-valued > GvR> flag to the local time which indicates whether DST is in > GvR> effect or not. Normally, you can set this flag to -1 ("don't > GvR> know") in which case the proper value is calculated from the > GvR> DST rules. But for ambiguous local times (the hour at the > GvR> end of DST), setting it to 0 or 1 makes the times unambiguous > GvR> again. This would mean that we'd have to add an "isdst" > GvR> field to datetimetz objects and ways to set it; it would > GvR> default to -1 (or perhaps to the proper value based on DST > GvR> rules) but .astimezone() could set it explicitly to 0 or 1 to > GvR> differentiate between the two ambiguous times. > >+1, but let's use True, False, and None. :) This is a typical example of where bool is wrong. Google for "double daylight savings", and you'll understand why. So it would make more sense to use: 3600, 0, and None. /Paul From barry@python.org Thu Jan 2 03:59:27 2003 From: barry@python.org (Barry A. Warsaw) Date: Wed, 1 Jan 2003 22:59:27 -0500 Subject: [Python-Dev] Holes in time References: <15891.46346.346014.611286@gargle.gargle.HOWL> <20030101224621.E79442-100000@familjen.svensson.org> Message-ID: <15891.47391.424238.596094@gargle.gargle.HOWL> >>>>> "PS" == Paul Svensson writes: PS> This is a typical example of where bool is wrong. Google for PS> "double daylight savings", and you'll understand why. So it PS> would make more sense to use: 3600, 0, and None. Then you have to call it something other than "isdst()". 
-Barry From tim@zope.com Thu Jan 2 04:13:05 2003 From: tim@zope.com (Tim Peters) Date: Wed, 1 Jan 2003 23:13:05 -0500 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time In-Reply-To: <200301020320.h023K5318509@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido, on the DST start case: in US Eastern, 2:MM doesn't exist on the wall clock, which jumps from 1:MM to 3:MM; the example tzinfo classes take 2:MM as being daylight then, since it's "after 2", and astimezone() delivers the equivalent standard-time 1:MM spelling when it has to deliver a result in this hour] > I don't particularly care one way or the other, but when I'm *awake* > during this hour, 2:MM more likely means that I forgot to move my > clock forward, so it may make more sense to interpret it as standard > time after all (meaning it should be switched to 3:MM DST rather than > 1:MM STD). Note that this issue is almost entirely about how users code *their* tzinfo classes: the interpretation of 2:MM is up to them. If you would like to call 2:MM standard time instead, then write your dst() method accordingly. Note that the example time zone classes in US.py call 2:MM daylight time "because it's after 2:00", and the time zone classes in EU.py do an equivalent thing for the time zones they implement; it's the most natural thing to code. If this hour happens to be the result of an astimezone() conversion, it picks the unambiguous standard-time spelling (which is 1:MM whether you moved your clock forward at exactly the right moment, or waited until the next day; the astimezone() part of this case is simply that it doesn't deliver an ambiguous result in this case -- an Eastern tzinfo subclass would be insane if it called 1:MM daylight time on this day).
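As a rough illustration of the "after 2 is daylight" coding style described above, here is a minimal sketch of a combined standard/daylight tzinfo subclass. It is not the US.py example class: the name EasternSketch is hypothetical, and the transition dates are hard-coded assumptions for 2003 only (first Sunday in April, last Sunday in October).

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)

# Assumed transition points for 2003 only; a real class would compute
# them for whatever year dt falls in.
DST_START = datetime(2003, 4, 6, 2)    # wall clock jumps 1:59 -> 3:00
DST_END = datetime(2003, 10, 26, 2)    # wall clock falls 1:59 -> 1:00


class EasternSketch(tzinfo):
    """Hypothetical combined EST/EDT class, valid for 2003 only."""

    def utcoffset(self, dt):
        # Standard offset plus whatever dst() says for this wall time.
        return timedelta(hours=-5) + self.dst(dt)

    def dst(self, dt):
        # Compare naive wall-clock fields. 2:MM on the start day counts
        # as daylight time because it is "after 2"; 1:MM on the end day
        # counts as daylight time because it is "before 2".
        naive = dt.replace(tzinfo=None)
        if DST_START <= naive < DST_END:
            return HOUR
        return timedelta(0)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


tz = EasternSketch()
print(tz.tzname(datetime(2003, 4, 6, 1, 30)))   # last standard-time hour
print(tz.tzname(datetime(2003, 4, 6, 2, 30)))   # the nonexistent 2:MM hour
```

Reading 2:MM as standard time instead would only need the lower bound in dst() moved from 2:00 to 3:00 on the start day.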
[on the unspellable hour at the end of DST] > One (perhaps feeble) argument against raising ValueError here is that > this introduces a case where a calculation that normally never raises > an error (assuming sane timezones) can raise an exception for one hour > a year. If you run a webserver that e.g. tries to render the current > time at the server (which is represented in UTC of course) in the end > user's local time, it would be embarrassing if this caused an error > page when the end user's local time happens to be in the > unrepresentable hour (i.e. one hour per year). You really want to > render it as 1:MM, since the user should know whether his DST has > already ended or not yet. While in an ideal world the programmer > would have read the docs for .astimezone() and heeded the warning to > catch ValueError (and then display what? UTC?), The parenthetical questions have to be answered even if the implementation changes: if astimezone() doesn't raise an exception here, it has to set *some* time values in the result object, and the server will display them. Probably 1:MM in the Eastern case -- it would take some thought to figure out whether it could at least promise to deliver the ambiguous wall-clock spelling in this case (so that it's never worse than an off-by-one (hour) error). > realistically if the programmer is a mere mortal and found no > problems during testing, she will be embarrassed by the error page. > (For worse effect, imagine the server running in a space probe :-) > > The counterargument is of course a use case where the time displayed > is of real importance, and we would rather die than show the wrong > time. Displaying times across time zones is a pretty harmless case. 
The reason I worry more about this here is that one-zone (intrazone, as opposed to your interzone use case) DST-aware arithmetic can't be done in the datetime module without converting to UTC (or some other fixed reference), doing the arithmetic there, and converting back again. In that set of use cases, getting an incorrect result seems much more likely to have bad consequences. If the Russian space probe is programmed in some hybrid Russian std+daylight time zone, you tell it to turn away from the sun 6 hours from now, and it doesn't actually turn away for 7 hours and fries itself as a result, it will be a lead story on comp.risks. Although I hope they have sense enough to pick on it more for programming a space probe in some hybrid Russian std+daylight time zone . Perhaps astimezone() could grow an optional "error return" value, a datetimetz to be used if the time is unrepresentable in the target zone. > The C standard library (following the original Unix treatment of > timezones) has a solution for this: it adds a three-valued flag to the > local time which indicates whether DST is in effect or not. Normally, > you can set this flag to -1 ("don't know") in which case the proper > value is calculated from the DST rules. But for ambiguous local times > (the hour at the end of DST), setting it to 0 or 1 makes the times > unambiguous again. This would mean that we'd have to add an "isdst" > field to datetimetz objects and ways to set it; it would default to -1 > (or perhaps to the proper value based on DST rules) but .astimezone() > could set it explicitly to 0 or 1 to differentiate between the two > ambiguous times. This could work too. >> 3. If dt.tzinfo.dst(dt) returns None, the current implementation >> takes that as a synonym for 0. Perhaps it should raise an exception >> instead. > Why would dst() return None for a tzinfo object whose utcoffset() > returns a definite value? I think that's a poor tzinfo implementation > and an exception would be appropriate. 
Only because the first three times I wrote a tzinfo class, I didn't give a rip about DST so implemented dst() to return None . I'm afraid you still have to care *enough* to read the docs, and heed the advice that dst() should return 0 if DST isn't in effect. I'll change this. >> 4. If dt.tzinfo.utcoffset(dt) first returns an offset, and on a >> subsequent call (while still trying to figure out the same >> conversion) returns None, an exception is raised. Those who don't >> want unspellable hours to raise an exception may also want >> inconsistent tzinfo implementations to go without complaint. If so, >> what do you want it to do instead? > I think this could only happen if a tzinfo's utcoffset() returns None > for *some* times but not for others, right? Correct. When this happens, the implementation currently raises ValueError, with a msg complaining that the utcoffset() method is inconsistent. > I don't think such a tzinfo should be considered sane. Then you don't hate the current implementation on this count . > Hmm, perhaps that's how a tzinfo object would signal that a particular > local time is illegal (e.g. the hour at the start of DST). I'm not worried about that case: astimezone() never *produces* that spelling (not even internally -- see the long new correctness proof for the gory details). > Then astimezone() would have to worm around that in some other way. If the input hour to astimezone() is of that ambiguous form, astimezone() already takes a None return from utcoffset() as meaning the input datetimetz is naive, and simply attaches the new timezone to the input date and time fields without altering the latter. On the other end (when DST ends), since there's no way to spell the "impossible hour" in local time, a tzinfo class knows nothing about that hour (in particular, it can't detect it -- astimezone() can, based on the otherwise impossible sequence of results it gets from calling utcoffset() more than once). 
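The end-of-DST problem discussed in this thread can be made concrete with plain fixed-offset arithmetic. The sketch below is an illustration only, not the datetime module's implementation: the helper names utc_to_wall and wall_to_utc are hypothetical, and the transition moment (2:00 EDT on October 26, 2003, i.e. 6:00 UTC) is an assumption for US Eastern in that year.

```python
from datetime import datetime, timedelta

# Assumed for illustration: US Eastern in 2003, with daylight time
# ending at 2:00 EDT on October 26, which is 6:00 UTC.
DST_END_UTC = datetime(2003, 10, 26, 6)
DST_END_WALL = datetime(2003, 10, 26, 2)
EDT = timedelta(hours=-4)
EST = timedelta(hours=-5)


def utc_to_wall(utc):
    """What the wall clock shows at a given naive UTC time (autumn only)."""
    return utc + (EDT if utc < DST_END_UTC else EST)


def wall_to_utc(wall):
    """Inverse mapping under the "before 2 is daylight" rule (autumn only)."""
    return wall - (EDT if wall < DST_END_WALL else EST)


first = utc_to_wall(datetime(2003, 10, 26, 5, 30))
second = utc_to_wall(datetime(2003, 10, 26, 6, 30))
print(first, second)          # two different UTC hours, one wall reading
print(wall_to_utc(second))    # the wall reading maps back to 5:30 UTC only
```

Under these assumptions nothing on the wall clock maps back to 6:30 UTC, which is the "unspellable hour": a flag in the style of C's tm_isdst is one way to tell the two 1:30 readings apart.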
From python@rcn.com Thu Jan 2 04:21:12 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 1 Jan 2003 23:21:12 -0500 Subject: [Python-Dev] Holes in time References: <15891.46346.346014.611286@gargle.gargle.HOWL> <20030101224621.E79442-100000@familjen.svensson.org> <15891.47391.424238.596094@gargle.gargle.HOWL> Message-ID: <009501c2b216$62d2c500$125ffea9@oemcomputer> > >>>>> "PS" == Paul Svensson writes: > > PS> This is a typical example of where bool is wrong. Google for > PS> "double daylight savings", and you'll understand why. So it > PS> would make more sense to use: 3600, 0, and None. > > Then you have to call it something other than "isdst()". > -Barry Invert the test and have the best of both worlds: isStandardTime() --> bool Raymond Hettinger From guido@python.org Thu Jan 2 04:40:29 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 01 Jan 2003 23:40:29 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Wed, 01 Jan 2003 22:55:24 EST." <20030101224621.E79442-100000@familjen.svensson.org> References: <20030101224621.E79442-100000@familjen.svensson.org> Message-ID: <200301020440.h024eTe18762@pcp02138704pcs.reston01.va.comcast.net> > >+1, but let's use True, False, and None. :) > > This is a typical example of where bool is wrong. No it isn't. > Google for "double daylight savings", and you'll understand why. > So it would make more sense to use: 3600, 0, and None. Unless congressman Sherman proposes to move the clock forward and/or back more than once a year, this is not necessary: the flag only indicates *whether* DST is in effect or not; to know *how much* the DST adjustment is you should call the tzinfo's dst() method. The conventional value of one hour is never implied in any of these calculations -- that's always up to the tzinfo object. (However, it's assumed that dst() returns a positive number -- otherwise it wouldn't be daylight *savings*.
On the southern hemisphere, the seasons are reversed, but DST still moves the clock forward by some amount, not back.) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Thu Jan 2 04:43:21 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 1 Jan 2003 22:43:21 -0600 Subject: [Python-Dev] Holes in time In-Reply-To: <009501c2b216$62d2c500$125ffea9@oemcomputer> References: <15891.46346.346014.611286@gargle.gargle.HOWL> <20030101224621.E79442-100000@familjen.svensson.org> <15891.47391.424238.596094@gargle.gargle.HOWL> <009501c2b216$62d2c500$125ffea9@oemcomputer> Message-ID: <15891.50025.383328.125053@montanaro.dyndns.org> Barry> Then you have to call it something other than "isdst()". Raymond> Invert the test and have the best of both worlds: Raymond> isStandardTime() --> bool Changing the sense of the test doesn't get rid of the fact that there are three possible answers, not two: "standard time", "daylight time", and "i don't know". Skip From tim@zope.com Thu Jan 2 05:16:41 2003 From: tim@zope.com (Tim Peters) Date: Thu, 2 Jan 2003 00:16:41 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <200301020440.h024eTe18762@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > ... > The conventional value of one hour of is never implied in any of these > calculations -- that's always up to the tzinfo object. This is so. > (However, it's assumed that dst() returns a positive number -- > otherwise it wouldn't be daylight *savings*. On the southern > hemisphere, the seasons are reversed, but DST still moves the clock > forward by some amount, not back.) If you dig into the long new correctness proof, you'll find that the astimezone() implementation doesn't actually need the assumption that dst() returns a positive number. 
It does rely on the sanity requirement that dst() return a non-zero result if and only if daylight time is in effect, and on the subtler consistency requirement that dt.tzinfo.utcoffset(dt) - dt.tzinfo.dst(dt) is a fixed value across all datetimetz dt with the same dt.year member (hmm -- I think I'm assuming there too that a DST switch doesn't occur within a day or so of Jan 1). The flexibility to rely on only one-year-at-a-time consistency is important for people who need to model that Chile delayed its changeover date for the Pope's visit in 1987 . From max@malva.ua Thu Jan 2 07:51:23 2003 From: max@malva.ua (Max Ischenko) Date: Thu, 2 Jan 2003 09:51:23 +0200 Subject: [Python-Dev] Encoding file for KOI8-U In-Reply-To: <3E0F0970.6060104@lemburg.com> References: <3E0F0970.6060104@lemburg.com> Message-ID: <20030102075123.GA2287@malva.ua> M.-A. Lemburg wrote: > > I wonder, whether koi8-u.py would be included in Python 2.3? > Such a codec is already included in Python CVS. > Maxim submitted this codec a while ago. It has a few bugs. The > version in CVS is compliant to the RFC and also faster: Great! I just want to make sure a codec would be in Python 2.3. -- Bst rgrds, M.A.X.: Mechanical Artificial Xenomorph. From paul-python@svensson.org Thu Jan 2 07:52:21 2003 From: paul-python@svensson.org (Paul Svensson) Date: Thu, 2 Jan 2003 02:52:21 -0500 (EST) Subject: [Python-Dev] Holes in time In-Reply-To: <200301020440.h024eTe18762@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030102023615.T2511-100000@familjen.svensson.org> On Wed, 1 Jan 2003, Guido van Rossum wrote: >> >+1, but let's use True, False, and None. :) >> >> This is a typical example of where bool is wrong. > >No it isn't. > >> Google for "double daylight savings", and you'll understand why. >> So it would make more sense to use: 3600, 0, and None. 
> >Unless congressman Sherman proposes to move the clock forward and/or >back more than once a year, this is not necessary: the flag only >indicates *whether* DST is in effect or not; to know *how much* the >DST adjustment is you should call the tzinfo's dst() method. The >conventional value of one hour is never implied in any of these >calculations -- that's always up to the tzinfo object. (However, it's >assumed that dst() returns a positive number -- otherwise it wouldn't >be daylight *savings*. On the southern hemisphere, the seasons are >reversed, but DST still moves the clock forward by some amount, not >back.) Congressman Sherman is totally irrelevant. In 1945, Britain used GMT+1 until April 2nd, then GMT+2 until July 15th, then went back to GMT+1 until October 7th, and then GMT the rest of the year. Again in Britain, in 1947, DST extended from March 16th to November 2nd, with double DST from April 13th to August 10th. /Paul From max@malva.ua Thu Jan 2 08:03:58 2003 From: max@malva.ua (Max Ischenko) Date: Thu, 2 Jan 2003 10:03:58 +0200 Subject: [Python-Dev] GC at exit? In-Reply-To: <20021231005736.GA15066@panix.com> References: <20021230034551.GA18622@panix.com> <20021231005736.GA15066@panix.com> Message-ID: <20030102080358.GC2287@malva.ua> Aahz wrote: > > The language reference manual doesn't promise that any garbage will be > > collected, ever. So, no, "supposed to" doesn't apply. > Thanks. I'll tell people to run gc.collect() at the end of their > applications if they care. Why? When application exits the OS would reclaim its memory pages anyway. -- Bst rgrds, M.A.X.: Mechanical Artificial Xenomorph. From mal@lemburg.com Thu Jan 2 09:18:10 2003 From: mal@lemburg.com (M.-A.
Lemburg) Date: Thu, 02 Jan 2003 10:18:10 +0100 Subject: [Python-Dev] Holes in time In-Reply-To: References: Message-ID: <3E1403D2.3090605@lemburg.com> Tim Peters wrote: > We still have two time zone conversion problems, and always will (they come > with the territory -- they're inherent problems, not artifacts of the > implementation). They only arise in a tzinfo class that's trying to model > both standard and daylight time. In effect, that's one class trying to > pretend it's two different time zones, and at the transition points there > are "impossible problems". > ... > UTC, EST and EDT are all self-consistent and trivial. It's only wall time > that's a problem, and only at the transition points: > > 1. When DST starts (the "wall start" line), the wall clock leaps from 1:59 > to 3:00. A wall time of the form 2:MM doesn't really make sense on that > day. The example classes do what I believe is the best that can be done: > since 2:MM is "after 2" on that day, it's taken as daylight time, and so as > an alias for 1:MM standard == 1:MM wall on that day, which is a time that > does make sense on the wall clock that day. The astimezone() function > ensures that the "impossible hour" on that day is never the result of a > conversion (you'll get the standard-time spelling instead). If you don't > think that's the best that can be done, speak up now. > > 2. When DST ends (the "wall end" line), we have a worse problem: there'a an > hour that can't be spelled *at all* in wall time. It's the hour beginning > at the moment DST ends; in the example, that's times of the form 6:MM UTC on > the day daylight time ends. The local wall clock leaps from 1:59 (daylight > time) back to 1:00 again (but the second time as a standard time). The hour > 6:MM UTC looks like 1:MM, but so does the hour 5:MM UTC on that day. A > reasonable tzinfo class should take 1:MM as being daylight time on that day, > since it's "before 2". 
As a consequence, the hour 6:MM UTC has no > wall-clock spelling at all. The only right way to deal with these problems is to raise ValueErrors. Calculations resulting in these local times are simply doomed and should be done in UTC instead. DST and local times are not mathematical properties, so you shouldn't expect them to behave in that way. For some fun reading have a look at the tzarchive package docs at: ftp://elsie.nci.nih.gov/pub/ -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Thu Jan 2 12:51:50 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 07:51:50 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Thu, 02 Jan 2003 00:16:41 EST." References: Message-ID: <200301021251.h02Cpog19722@pcp02138704pcs.reston01.va.comcast.net> > If you dig into the long new correctness proof, you'll find that the > astimezone() implementation doesn't actually need the assumption that dst() > returns a positive number. It does rely on the sanity requirement that > dst() return a non-zero result if and only if daylight time is in effect, > and on the subtler consistency requirement that > > dt.tzinfo.utcoffset(dt) - dt.tzinfo.dst(dt) > > is a fixed value across all datetimetz dt with the same dt.year member > (hmm -- I think I'm assuming there too that a DST switch doesn't occur > within a day or so of Jan 1). The flexibility to rely on only > one-year-at-a-time consistency is important for people who need to model > that Chile delayed its changeover date for the Pope's visit in 1987 . Unless a country changes timezones permanently, I'd assume that this expression is constant regardless of the year. And *if* a country switches timezones, it doesn't necessarily happen on Jan 1st. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 2 12:55:29 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 07:55:29 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Thu, 02 Jan 2003 02:52:21 EST." <20030102023615.T2511-100000@familjen.svensson.org> References: <20030102023615.T2511-100000@familjen.svensson.org> Message-ID: <200301021255.h02CtTe19745@pcp02138704pcs.reston01.va.comcast.net> > Congressman Sherman is totally irrelevant. Then why did you waste everybody's time by suggesting we Google for something that returned him as first and second hits, rather than just explaining the issue? > In 1945, Britain used GMT+1 until April 2nd, then GMT+2 until July > 15th, then went back GMT+1 until October 7th, and then GMT the rest > of he year. Again in Britain, in 1947, DST extended from March 16th > to November 2nd, with double DST from April 13th to August 10th. OK, I'm convinced. I believe the C99 standard also changes the tm_isdst flag to indicate the value of the DST adjustment; now I understand why. I wonder if this affects an assumption of Tim's correctness proof? --Guido van Rossum (home page: http://www.python.org/~guido/) From ben@algroup.co.uk Thu Jan 2 13:07:39 2003 From: ben@algroup.co.uk (Ben Laurie) Date: Thu, 02 Jan 2003 13:07:39 +0000 Subject: [Python-Dev] Holes in time References: <20030102023615.T2511-100000@familjen.svensson.org> <200301021255.h02CtTe19745@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E14399B.5040306@algroup.co.uk> Guido van Rossum wrote: >>Congressman Sherman is totally irrelevant. > > > Then why did you waste everybody's time by suggesting we Google for > something that returned him as first and second hits, rather than just > explaining the issue? > > >>In 1945, Britain used GMT+1 until April 2nd, then GMT+2 until July >>15th, then went back GMT+1 until October 7th, and then GMT the rest >>of he year. 
Again in Britain, in 1947, DST extended from March 16th >>to November 2nd, with double DST from April 13th to August 10th. > > > OK, I'm convinced. I believe the C99 standard also changes the > tm_isdst flag to indicate the value of the DST adjustment; now I > understand why. > > I wonder if this affects an assumption of Tim's correctness proof? Since we're on the subject, its always distressed me (in a purely intellectual way) that Unix time doesn't _really_ convert to wall-clock time properly, because it ignores leap seconds. The effect of this is that times in the past get converted incorrectly by several seconds (how many depending on exactly when in the past, of course). I don't suppose there's interest in fixing that? Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff From guido@python.org Thu Jan 2 13:22:43 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 08:22:43 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Thu, 02 Jan 2003 13:07:39 GMT." <3E14399B.5040306@algroup.co.uk> References: <20030102023615.T2511-100000@familjen.svensson.org> <200301021255.h02CtTe19745@pcp02138704pcs.reston01.va.comcast.net> <3E14399B.5040306@algroup.co.uk> Message-ID: <200301021322.h02DMh802898@pcp02138704pcs.reston01.va.comcast.net> > Since we're on the subject, its always distressed me (in a purely > intellectual way) that Unix time doesn't _really_ convert to wall-clock > time properly, because it ignores leap seconds. The effect of this is > that times in the past get converted incorrectly by several seconds (how > many depending on exactly when in the past, of course). > > I don't suppose there's interest in fixing that? Given how timestamps in a typical Unix system are used, I think ignoring leap seconds is the only sensible thing. The POSIX standard agrees. 
If by "fixing" you mean taking leap seconds into account in any way, I would strongly object that. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@pythoncraft.com Thu Jan 2 13:55:44 2003 From: aahz@pythoncraft.com (Aahz) Date: Thu, 2 Jan 2003 08:55:44 -0500 Subject: [Python-Dev] GC at exit? In-Reply-To: <20030102080358.GC2287@malva.ua> References: <20021230034551.GA18622@panix.com> <20021231005736.GA15066@panix.com> <20030102080358.GC2287@malva.ua> Message-ID: <20030102135544.GA2891@panix.com> On Thu, Jan 02, 2003, Max Ischenko wrote: > Aahz wrote: >> Tim Peters: >>> >>> The language reference manual doesn't promise that any garbage will be >>> collected, ever. So, no, "supposed to" doesn't apply. >> >> Thanks. I'll tell people to run gc.collect() at the end of their >> applications if they care. > > Why? > When application exits the OS would reclaim it's memory pages anyway. Because garbage cycles can point at non-garbage; when the garbage is reclaimed, __del__() methods will run. You could argue that this is another reason against using __del__(), but since this is part of the way CPython works, I'm documenting it in my book. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "There are three kinds of lies: Lies, Damn Lies, and Statistics." --Disraeli From Paul Hughett Thu Jan 2 14:03:41 2003 From: Paul Hughett (Paul Hughett) Date: Thu, 2 Jan 2003 09:03:41 -0500 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors In-Reply-To: <200301020219.h022Jea18255@pcp02138704pcs.reston01.va.comcast.net> (message from Guido van Rossum on Wed, 01 Jan 2003 21:19:39 -0500) References: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <15891.17216.321848.816687@gargle.gargle.HOWL> <200301020219.h022Jea18255@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200301021403.h02E3fx19873@mercur.uphs.upenn.edu> Guido wrote: > > Hey, if we'll killing off builtins, I vote for apply(). > Agreed, it's redundant. 
You can help by checking in documentation > that marks it as deprecated and code that adds a > PendingDeprecationWarning to it (unfortunately it's so common that I > wouldn't want to risk a plain DeprecationWarning). You've lost me here. I've recently written a piece of code that uses a lookup table on the name of a file to find the right function to apply to it; if I don't use apply for this, what should I use? An explicit case statement cannot be dynamically modified; using eval() requires a conversion to string (and is arguably even uglier than apply). Paul Hughett From aleaxit@yahoo.com Thu Jan 2 14:06:23 2003 From: aleaxit@yahoo.com (Alex Martelli) Date: Thu, 2 Jan 2003 15:06:23 +0100 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors In-Reply-To: <200301021403.h02E3fx19873@mercur.uphs.upenn.edu> References: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <200301020219.h022Jea18255@pcp02138704pcs.reston01.va.comcast.net> <200301021403.h02E3fx19873@mercur.uphs.upenn.edu> Message-ID: <200301021506.23881.aleaxit@yahoo.com> On Thursday 02 January 2003 03:03 pm, Paul Hughett wrote: > Guido wrote: > > > Hey, if we'll killing off builtins, I vote for apply(). > > > > Agreed, it's redundant. You can help by checking in documentation > > that marks it as deprecated and code that adds a > > PendingDeprecationWarning to it (unfortunately it's so common that I > > wouldn't want to risk a plain DeprecationWarning). > > You've lost me here. I've recently written a piece of code that uses > a lookup table on the name of a file to find the right function to > apply to it; if I don't use apply for this, what should I use? An thetable[thefilename](thefilename) should do just as well as apply(thetable[thefilename], (thefilename,)) and more compactly and readably, too. 
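A minimal sketch of the table-lookup dispatch discussed here, written for a current Python in which apply() no longer exists as a builtin. The handler functions and the extension-keyed table are assumptions made up for illustration; the thread does not show the actual table.

```python
import os

def handle_text(filename):
    """Hypothetical handler for text files."""
    return "text:" + filename

def handle_image(filename):
    """Hypothetical handler for image files."""
    return "image:" + filename

# The table maps a key derived from the file name to a plain function object.
# It is an ordinary dict, so entries can be added or replaced at run time.
handlers = {
    ".txt": handle_text,
    ".png": handle_image,
}

def process(filename):
    """Look the handler up and call it directly; no apply() is needed."""
    extension = os.path.splitext(filename)[1]
    handler = handlers[extension]
    return handler(filename)
```

Because the table is just a dict, the "dynamically modified" requirement is met by assignment, for example `handlers[".log"] = handle_text`.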
Alex From ben@algroup.co.uk Thu Jan 2 14:09:51 2003 From: ben@algroup.co.uk (Ben Laurie) Date: Thu, 02 Jan 2003 14:09:51 +0000 Subject: [Python-Dev] Holes in time References: <20030102023615.T2511-100000@familjen.svensson.org> <200301021255.h02CtTe19745@pcp02138704pcs.reston01.va.comcast.net> <3E14399B.5040306@algroup.co.uk> <200301021322.h02DMh802898@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E14482F.7040007@algroup.co.uk> Guido van Rossum wrote: >>Since we're on the subject, its always distressed me (in a purely >>intellectual way) that Unix time doesn't _really_ convert to wall-clock >>time properly, because it ignores leap seconds. The effect of this is >>that times in the past get converted incorrectly by several seconds (how >>many depending on exactly when in the past, of course). >> >>I don't suppose there's interest in fixing that? > > > Given how timestamps in a typical Unix system are used, I think > ignoring leap seconds is the only sensible thing. The POSIX standard > agrees. If by "fixing" you mean taking leap seconds into account in > any way, I would strongly object that. Actually, now I think about it again, its not the human readable version that's wrong (though there is this silly issue that there are some seconds that you can't represent as Unix timestamps), its the interval between two timestamps that's wrong. Anyway, its not an issue I'm hugely passionate about - though I imagine it might matter to some scientists - I just thought I'd mention it, since we're on the subject. Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff From pinard@iro.umontreal.ca Thu Jan 2 14:47:01 2003 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois_Pinard?=) Date: 02 Jan 2003 09:47:01 -0500 Subject: [Python-Dev] Re: GC at exit? 
In-Reply-To: <20030102135544.GA2891@panix.com> References: <20021230034551.GA18622@panix.com> <20021231005736.GA15066@panix.com> <20030102080358.GC2287@malva.ua> <20030102135544.GA2891@panix.com> Message-ID: [Aahz] > Because garbage cycles can point at non-garbage; when the garbage is > reclaimed, __del__() methods will run. When cycles contain `__del__()', they are wholly added to gc.garbage if I understand the documentation correctly, and so, `__del__()' will not be run. -- François Pinard http://www.iro.umontreal.ca/~pinard From aleaxit@yahoo.com Thu Jan 2 14:50:21 2003 From: aleaxit@yahoo.com (Alex Martelli) Date: Thu, 2 Jan 2003 15:50:21 +0100 Subject: [Python-Dev] Re: GC at exit? In-Reply-To: References: <20021230034551.GA18622@panix.com> <20030102135544.GA2891@panix.com> Message-ID: <200301021550.21693.aleaxit@yahoo.com> On Thursday 02 January 2003 03:47 pm, François Pinard wrote: > [Aahz] > > > Because garbage cycles can point at non-garbage; when the garbage is > > reclaimed, __del__() methods will run. > > When cycles contain `__del__()', they are wholly added to gc.garbage if I > understand the documentation correctly, and so, `__del__()' will not be > run. Yes, but that doesn't affect Aahz's argument -- consider, e.g.: class Nodel: pass class Hasdel: def __del__(self): print 'del(%s)'%self a, b, c = Nodel(), Nodel(), Nodel() a.x = b; b.x = c; c.x = a b.y = Hasdel() del a, b, c import gc # gc.collect() Here, the __del__ method is never run, but, if you uncomment the gc.collect call, then the __del__ method IS run -- just as Aahz said. Alex From guido@python.org Thu Jan 2 15:07:16 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 10:07:16 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Thu, 02 Jan 2003 14:09:51 GMT." 
<3E14482F.7040007@algroup.co.uk> References: <20030102023615.T2511-100000@familjen.svensson.org> <200301021255.h02CtTe19745@pcp02138704pcs.reston01.va.comcast.net> <3E14399B.5040306@algroup.co.uk> <200301021322.h02DMh802898@pcp02138704pcs.reston01.va.comcast.net> <3E14482F.7040007@algroup.co.uk> Message-ID: <200301021507.h02F7GB18232@odiug.zope.com> > Actually, now I think about it again, its not the human readable version > that's wrong (though there is this silly issue that there are some > seconds that you can't represent as Unix timestamps), its the interval > between two timestamps that's wrong. Few clocks can represent those seconds. Few clocks need to. > Anyway, its not an issue I'm hugely passionate about - though I imagine > it might matter to some scientists - I just thought I'd mention it, > since we're on the subject. Let's drop it please. I'm hugely passionate about this -- leap seconds have no reason to be accounted for in most people's lives. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 2 15:14:16 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 10:14:16 -0500 Subject: [Python-Dev] Re: GC at exit? In-Reply-To: Your message of "02 Jan 2003 09:47:01 EST." References: <20021230034551.GA18622@panix.com> <20021231005736.GA15066@panix.com> <20030102080358.GC2287@malva.ua> <20030102135544.GA2891@panix.com> Message-ID: <200301021514.h02FEHc18274@odiug.zope.com> > [Aahz] > > > Because garbage cycles can point at non-garbage; when the garbage is > > reclaimed, __del__() methods will run. > [François Pinard] > When cycles contain `__del__()', they are wholly added to gc.garbage if I > understand the documentation correctly, and so, `__del__()' will not be run. True, but objects that are not *in* a cycle but that were kept alive because the cycle referenced them will have their __del__ methods run when the cycle is GC'ed. 
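A small sketch of the distinction drawn here, assuming CPython (reference counting plus the cyclic collector). The class names are illustrative only. The object with __del__ is not part of the cycle; it is merely kept alive by it, so its finalizer runs once the cycle is collected. Automatic collection is switched off for the duration so the before/after comparison does not depend on when the collector happens to run.

```python
import gc

class Node:
    """Plain object without __del__; three of these form the cycle."""

class HasDel:
    """Object outside the cycle whose finalizer records that it ran."""
    finalized = []

    def __init__(self, name):
        self.name = name

    def __del__(self):
        HasDel.finalized.append(self.name)

def demo():
    """Return the finalizer log before and after an explicit gc.collect()."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector from running mid-demo
    try:
        a, b, c = Node(), Node(), Node()
        a.x = b
        b.x = c
        c.x = a                    # a -> b -> c -> a is the garbage cycle
        b.y = HasDel("payload")    # referenced by the cycle, not part of it
        del a, b, c                # the cycle is now unreachable
        before = list(HasDel.finalized)
        gc.collect()               # reclaims the cycle, releasing the payload
        after = list(HasDel.finalized)
    finally:
        if was_enabled:
            gc.enable()
    return before, after
```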
--Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Thu Jan 2 15:26:50 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 02 Jan 2003 16:26:50 +0100 Subject: [Python-Dev] GC at exit? In-Reply-To: <20030102135544.GA2891@panix.com> References: <20021230034551.GA18622@panix.com> <20021231005736.GA15066@panix.com> <20030102080358.GC2287@malva.ua> <20030102135544.GA2891@panix.com> Message-ID: <3E145A3A.4060202@v.loewis.de> Aahz wrote: > Because garbage cycles can point at non-garbage; when the garbage is > reclaimed, __del__() methods will run. You could argue that this is > another reason against using __del__(), but since this is part of the > way CPython works, I'm documenting it in my book. Documenting that you can call gc.collect() at the end is good; documenting that you should call it is not. I would expect that in many cases, it will be irrelevant whether __del__ is called at the end or not, as the system or the underlying libraries will reclaim whatever resources have been acquired. If you need to guarantee that __del__ is called at the end for all objects, you have probably much bigger problems in your application than that, and it is likely better to explicitly break any remaining cycles than to invoke gc.collect. Regards, Martin From tismer@tismer.com Thu Jan 2 15:57:25 2003 From: tismer@tismer.com (Christian Tismer) Date: Thu, 02 Jan 2003 16:57:25 +0100 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors References: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <15891.17216.321848.816687@gargle.gargle.HOWL> <200301020219.h022Jea18255@pcp02138704pcs.reston01.va.comcast.net> <200301021403.h02E3fx19873@mercur.uphs.upenn.edu> Message-ID: <3E146165.6030507@tismer.com> Paul Hughett wrote: > Guido wrote: > > >>>Hey, if we'll killing off builtins, I vote for apply(). >> > >>Agreed, it's redundant. 
You can help by checking in documentation >>that marks it as deprecated and code that adds a >>PendingDeprecationWarning to it (unfortunately it's so common that I >>wouldn't want to risk a plain DeprecationWarning). > > > You've lost me here. I've recently written a piece of code that uses > a lookup table on the name of a file to find the right function to > apply to it; if I don't use apply for this, what should I use? An > explicit case statement cannot be dynamically modified; using eval() > requires a conversion to string (and is arguably even uglier than > apply). Since a function is a first class callable object, you just pick it out of your lookup table func = look[key] and call it with the args and kwds which you got, using the new asterisk syntax: ret = func(*args, **kwds) ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? 
http://www.stackless.com/ From Paul Hughett Thu Jan 2 16:15:16 2003 From: Paul Hughett (Paul Hughett) Date: Thu, 2 Jan 2003 11:15:16 -0500 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors In-Reply-To: <3E146165.6030507@tismer.com> (message from Christian Tismer on Thu, 02 Jan 2003 16:57:25 +0100) References: <200301010422.h014M0B15913@pcp02138704pcs.reston01.va.comcast.net> <15891.17216.321848.816687@gargle.gargle.HOWL> <200301020219.h022Jea18255@pcp02138704pcs.reston01.va.comcast.net> <200301021403.h02E3fx19873@mercur.uphs.upenn.edu> <3E146165.6030507@tismer.com> Message-ID: <200301021615.h02GFGu20026@mercur.uphs.upenn.edu> Chris Tismer wrote: > Since a function is a first class callable object, > you just pick it out of your lookup table > func = look[key] > and call it with the args and kwds which you got, > using the new asterisk syntax: > ret = func(*args, **kwds) Okay, that makes sense. I've been programming in too many of the wrong languages lately and am not used to thinking of functions as first class objects. [Alex made the same point in slightly different language.] Thanks for the explanations. Paul Hughett From mwh@python.net Thu Jan 2 16:26:25 2003 From: mwh@python.net (Michael Hudson) Date: 02 Jan 2003 16:26:25 +0000 Subject: [Python-Dev] PEP 303: Extend divmod() for Multiple Divisors In-Reply-To: Tim Peters's message of "Wed, 01 Jan 2003 12:35:34 -0500" References: Message-ID: <2mfzsbo6vi.fsf@starship.python.net> Tim Peters writes: > If we have to drop a builtin, I never liked reduce , I could do with forgetting about map and filter. I killed a fair few brain cells on new year's eve, but I can still remember them. will-try-harder-next-year-ly y'rs M. -- Also, remember to put the galaxy back when you've finished, or an angry mob of astronomers will come round and kneecap you with a small telescope for littering. 
-- Simon Tatham, ucam.chat, from Owen Dunn's review of the year From skip@pobox.com Thu Jan 2 16:32:59 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 2 Jan 2003 10:32:59 -0600 Subject: [Python-Dev] map, filter, reduce, lambda Message-ID: <15892.27067.231224.233677@montanaro.dyndns.org> Given the recent sentiment expressed about Python's various functional builtins, I suspect they will all be absent in Py3k. Skip From esr@thyrsus.com Thu Jan 2 16:31:23 2003 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 2 Jan 2003 11:31:23 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <15892.27067.231224.233677@montanaro.dyndns.org> References: <15892.27067.231224.233677@montanaro.dyndns.org> Message-ID: <20030102163123.GB5874@thyrsus.com> Skip Montanaro : > Given the recent sentiment expressed about Python's various functional > builtins, I suspect they will all be absent in Py3k. Arrrgghhh!! Nooooo!! Don't take away my lambdas!!! They're far too useful for little callback snippets. -- Eric S. Raymond From tismer@tismer.com Thu Jan 2 16:50:39 2003 From: tismer@tismer.com (Christian Tismer) Date: Thu, 02 Jan 2003 17:50:39 +0100 Subject: [Python-Dev] map, filter, reduce, lambda References: <15892.27067.231224.233677@montanaro.dyndns.org> Message-ID: <3E146DDF.80002@tismer.com> Skip Montanaro wrote: > Given the recent sentiment expressed about Python's various functional > builtins, I suspect they will all be absent in Py3k. Really absent? I suspect they will exist as a Python function builtin surrogate for compatibility. This allows to continue using them and can be optimized away, but frees code space for other stuff with higher priority to be coded in C. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From skip@pobox.com Thu Jan 2 17:15:45 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 2 Jan 2003 11:15:45 -0600 Subject: [Python-Dev] no expected test output for test_sort? Message-ID: <15892.29633.664617.560348@montanaro.dyndns.org> Lib/test/test_sort.py isn't coded using unittest, but I don't see an expected output file in the .../test/output directory. I just added a test to test_sort.py and verified manually that it fails with a version of the interpreter without my cmpfunc=None patch and succeeds with one that has the patch ("python" is 2.3a0, "./python.exe" is 2.3a1+): % python -E -tt ../Lib/test/test_sort.py ... Testing bug 453523 -- list.sort() crasher. Testing None as a comparison function. Passing None as cmpfunc failed. Test failed 1 % ./python.exe -E -tt ../Lib/test/test_sort.py ... Testing bug 453523 -- list.sort() crasher. Testing None as a comparison function. Test passed -- no errors. However, if I run test_sort.py via regrtest, it always succeeds: % ./python.exe -E -tt ../Lib/test/regrtest.py test_sort test_sort 1 test OK. % python -E -tt ../Lib/test/regrtest.py test_sort test_sort 1 test OK. Running regrtest.py with the -g flag doesn't seem to be generating a test_sort file anyplace obvious: % ./python.exe ../Lib/test/regrtest.py -g test_sort test_sort 1 test OK. % cd .. % find . -name '*test_sort*' ./Lib/test/test_sort.py ./Lib/test/test_sort.pyc Something seems amiss, but I'm at a loss to figure out what's wrong. 
Skip

From aahz@pythoncraft.com Thu Jan 2 17:42:01 2003
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 2 Jan 2003 12:42:01 -0500
Subject: [Python-Dev] map, filter, reduce, lambda
In-Reply-To: <15892.27067.231224.233677@montanaro.dyndns.org>
References: <15892.27067.231224.233677@montanaro.dyndns.org>
Message-ID: <20030102174201.GA29771@panix.com>

On Thu, Jan 02, 2003, Skip Montanaro wrote:
>
> Given the recent sentiment expressed about Python's various functional
> builtins, I suspect they will all be absent in Py3k.

If map() goes, we need a replacement:

>>> a = [8, 3, 5, 11]
>>> b = ('eggs', 'spam')
>>> zip(a,b)
[(8, 'eggs'), (3, 'spam')]
>>> map(None, a, b)
[(8, 'eggs'), (3, 'spam'), (5, None), (11, None)]

Maybe zip() should take a "pad" keyword argument?
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

"There are three kinds of lies: Lies, Damn Lies, and Statistics." --Disraeli

From michalszabelski@o2.pl Thu Jan 2 18:32:34 2003
From: michalszabelski@o2.pl (michalszabelski@o2.pl)
Date: Thu, 2 Jan 2003 18:32:34
Subject: [Python-Dev] Embedded Python - how do I implement a class?
Message-ID: <20030102173235.4CE606EEE8@rekin.go2.pl>

Hello.

I'm embedding Python in C++. I already know how to define modules, pass PyCObjects, define my own type with methods by implementing getattr and getattro functions (I use PyCFunction_New to make objects these functions return). This type is class-like, it has methods, and a pointer to C++ class instance.

But I'd like to implement a full blown, inheritable (and/or inheriting) Python class in C++. In the includes I found PyClass_New and PyMethod_New, but I can't guess what arguments they take (these 3 PyObject pointers). In wxPython I found that they use SWIG somehow - I don't know how to use it. If it makes any difference - I do the embedding under Windows.

I hope this is a right list. I already tried help@python.org, they have no idea.
Michal Szabelski

From guido@python.org Thu Jan 2 17:54:35 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 02 Jan 2003 12:54:35 -0500
Subject: [Python-Dev] map, filter, reduce, lambda
In-Reply-To: Your message of "Thu, 02 Jan 2003 12:42:01 EST." <20030102174201.GA29771@panix.com>
References: <15892.27067.231224.233677@montanaro.dyndns.org>
 <20030102174201.GA29771@panix.com>
Message-ID: <200301021754.h02HsZr19162@odiug.zope.com>

> If map() goes, we need a replacement:
>
> >>> a = [8, 3, 5, 11]
> >>> b = ('eggs', 'spam')
> >>> zip(a,b)
> [(8, 'eggs'), (3, 'spam')]
> >>> map(None, a, b)
> [(8, 'eggs'), (3, 'spam'), (5, None), (11, None)]
>
> Maybe zip() should take a "pad" keyword argument?

Only if there's ever anybody who needs that feature of map().

--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin@v.loewis.de Thu Jan 2 18:04:15 2003
From: martin@v.loewis.de ("Martin v. Löwis")
Date: Thu, 02 Jan 2003 19:04:15 +0100
Subject: [Python-Dev] Embedded Python - how do I implement a class?
In-Reply-To: <20030102173235.4CE606EEE8@rekin.go2.pl>
References: <20030102173235.4CE606EEE8@rekin.go2.pl>
Message-ID: <3E147F1F.4050905@v.loewis.de>

michalszabelski@o2.pl wrote:
> I hope this is a right list. I already tried help@python.org, they
> have no idea.

It depends on what your message is; the message itself did not include a question. If you are asking the question in the subject, then this is not the right list - "how-do-I" questions are, in general, inappropriate for python-dev, as this list is for development of Python, not for development with Python.

Python itself defines a number of types that you can inherit from, e.g. list, file, and dict. I recommend you study the source of these types and find out what you are doing incorrectly. If you then need further help, please ask on comp.lang.python (i.e. python-list).
Regards, Martin From aahz@pythoncraft.com Thu Jan 2 18:07:29 2003 From: aahz@pythoncraft.com (Aahz) Date: Thu, 2 Jan 2003 13:07:29 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: References: <20030102010235.GA25005@panix.com> Message-ID: <20030102180729.GA6262@panix.com> On Wed, Jan 01, 2003, Tim Peters wrote: > [Aahz] >> >> What exactly do you mean by "wall time"? Is it supposed to be a >> timezone-less value? > > The diagram in the original gave agonizingly detailed answers to these > questions. US Eastern as a tzinfo class tries to combine both EDT and > EST; please read the original msg again, and note that the two "wall > time" lines in the diagram show what US Eastern displays at all the > relevant UTC times. Ah! Okay, didn't read the code and didn't realize that "US Eastern" was a concrete representation. >> Every other datetime package seems to live with the wiggle involved >> in round-trip conversions, why not Python? > > Round-trip conversions aren't at issue here, except to the extent > that the "unspellable hour" at the end of DST makes some one-way > conversions impossible. I'm not sure I agree. As I see it, "wall time" is for users. On the display side, I believe that users *expect* to see 1:59:57 am 1:59:58 am 1:59:59 am 1:00:00 am 1:00:01 am I therefore see no problem with the UTC->wall clock conversion. Going the other direction requires an explicit statement of which timezone you're in at the point of conversion (a real timezone, not a virtual one like "US Eastern"). Presumably that only occurs as a result of user input, and when you redisplay the input as a wall clock, it should be obvious to the user if the wrong time zone was selected because the time will be an hour (or whatever) off. The only way this is a problem seems to be if you want to do round-trip conversions purely programmatically. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "There are three kinds of lies: Lies, Damn Lies, and Statistics." 
--Disraeli From tim.one@comcast.net Thu Jan 2 18:21:13 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 02 Jan 2003 13:21:13 -0500 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: <15892.29633.664617.560348@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > Lib/test/test_sort.py isn't coded using unittest, but I don't see an > expected output file in the .../test/output directory. "Expected output" files are out of favor, and the majority of tests no longer have one. As a debugging aid, it's fine to display msgs in (only) verbose mode. > I just added a test to test_sort.py You didn't show us the actual test code you added, and it's too painful to try to guess what you did. > and verified manually that it fails with a version of the > interpreter without my cmpfunc=None patch and succeeds with one > that has the patch ("python" is 2.3a0, "./python.exe" is 2.3a1+): > > % python -E -tt ../Lib/test/test_sort.py > ... > Testing bug 453523 -- list.sort() crasher. > Testing None as a comparison function. > Passing None as cmpfunc failed. > Test failed 1 > > % ./python.exe -E -tt ../Lib/test/test_sort.py > ... > Testing bug 453523 -- list.sort() crasher. > Testing None as a comparison function. > Test passed -- no errors. > > However, if I run test_sort.py via regrtest, it always succeeds: > > % ./python.exe -E -tt ../Lib/test/regrtest.py test_sort > test_sort > 1 test OK. > % python -E -tt ../Lib/test/regrtest.py test_sort > test_sort > 1 test OK. Perhaps you're not incrementing test_sort's nerrors variable *unless* you're in verbose mode -- as above, without seeing what you actually did, it's a guessing game. > Running regrtest.py with the -g flag doesn't seem to be generating a > test_sort file anyplace obvious: That's expected. If a test doesn't produce output, -g deliberately avoids creating an expected-output file. 
Else the output directory would be littered with useless files for unittest- and doctest-based tests, as well as for tests like test_sort that do a simpler thing.

From barry@python.org Thu Jan 2 18:23:43 2003
From: barry@python.org (Barry A. Warsaw)
Date: Thu, 2 Jan 2003 13:23:43 -0500
Subject: [Python-Dev] map, filter, reduce, lambda
References: <15892.27067.231224.233677@montanaro.dyndns.org>
 <20030102174201.GA29771@panix.com>
 <200301021754.h02HsZr19162@odiug.zope.com>
Message-ID: <15892.33711.250237.271636@gargle.gargle.HOWL>

>>>>> "GvR" == Guido van Rossum writes:

    >> Maybe zip() should take a "pad" keyword argument?

    GvR> Only if there's ever anybody who needs that feature of
    GvR> map().

We talked about adding a pad argument to zip() at the time, but rejected it as yagni. And in fact, since then ihnni.

-Barry

From exarkun@intarweb.us Thu Jan 2 18:31:34 2003
From: exarkun@intarweb.us (Jp Calderone)
Date: Thu, 2 Jan 2003 13:31:34 -0500
Subject: [Python-Dev] map, filter, reduce, lambda
In-Reply-To: <200301021754.h02HsZr19162@odiug.zope.com>
References: <15892.27067.231224.233677@montanaro.dyndns.org>
 <20030102174201.GA29771@panix.com>
 <200301021754.h02HsZr19162@odiug.zope.com>
Message-ID: <20030102183134.GA23582@meson.dyndns.org>

On Thu, Jan 02, 2003 at 12:54:35PM -0500, Guido van Rossum wrote:
> > If map() goes, we need a replacement:
> >
> > >>> a = [8, 3, 5, 11]
> > >>> b = ('eggs', 'spam')
> > >>> zip(a,b)
> > [(8, 'eggs'), (3, 'spam')]
> > >>> map(None, a, b)
> > [(8, 'eggs'), (3, 'spam'), (5, None), (11, None)]
> >
> > Maybe zip() should take a "pad" keyword argument?
>
> Only if there's ever anybody who needs that feature of map().

FWIW,

    exarkun:~/projects/python$ grep -r "map(None," ./ -r | wc -l
    41

I'm more worried about lambda going away, though. Is:

    class Foo:
        def setupStuff(self):
            def somethingOne():
                return self.something(1)
            registerStuff('1', somethingOne)

really preferable to:

    class Foo:
        def setupStuff(self):
            registerStuff('1', lambda: self.something(1))

Nested scopes make it better than it once would have been, but I can still see a lot of my GUI code getting a lot hairier if I don't have lambda at my disposal. I don't see a decent way to implement it in Python, either.

Jp

From martin@v.loewis.de Thu Jan 2 18:48:05 2003
From: martin@v.loewis.de ("Martin v. Löwis")
Date: Thu, 02 Jan 2003 19:48:05 +0100
Subject: [Python-Dev] map, filter, reduce, lambda
In-Reply-To: <20030102183134.GA23582@meson.dyndns.org>
References: <15892.27067.231224.233677@montanaro.dyndns.org>
 <20030102174201.GA29771@panix.com>
 <200301021754.h02HsZr19162@odiug.zope.com>
 <20030102183134.GA23582@meson.dyndns.org>
Message-ID: <3E148965.307@v.loewis.de>

Jp Calderone wrote:
> class Foo:
>     def setupStuff(self):
>         registerStuff('1', lambda: self.something(1))
>
> Nested scopes make it better than it once would have been, but I can still
> see a lot of my GUI code getting a lot hairier if I don't have lambda at my
> disposal. I don't see a decent way to implement it in Python, either.
For the specific example, you can easily do without lambda:

    from functional import bind

    class Foo:
        def setupStuff(self):
            registerStuff('1', bind(self.something, 1))

where bind could be defined as

    class bind:
        def __init__(self, f, *args):
            self.f = f
            self.args = args
        def __call__(self, *moreargs):
            return self.f(*(self.args+moreargs))

Regards,
Martin

From guido@python.org Thu Jan 2 18:53:34 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 02 Jan 2003 13:53:34 -0500
Subject: [Python-Dev] map, filter, reduce, lambda
In-Reply-To: Your message of "Thu, 02 Jan 2003 13:31:34 EST." <20030102183134.GA23582@meson.dyndns.org>
References: <15892.27067.231224.233677@montanaro.dyndns.org>
 <20030102174201.GA29771@panix.com>
 <200301021754.h02HsZr19162@odiug.zope.com>
 <20030102183134.GA23582@meson.dyndns.org>
Message-ID: <200301021853.h02IrYk20504@odiug.zope.com>

> I'm more worried about lambda going away, though. Is:
>
>     class Foo:
>         def setupStuff(self):
>             def somethingOne():
>                 return self.something(1)
>             registerStuff('1', somethingOne)
>
> really preferable to:
>
>     class Foo:
>         def setupStuff(self):
>             registerStuff('1', lambda: self.something(1))
>
> Nested scopes make it better than it once would have been, but I can still
> see a lot of my GUI code getting a lot hairier if I don't have lambda at my
> disposal. I don't see a decent way to implement it in Python, either.

Good question. Would this be acceptable?

    class Foo:
        def setupStuff(self):
            registerStuff('1', curry(self.something, 1))

curry() is something you could write now, and seems mildly useful. Of course it's more functional cruft, but I'd rather add more "function algebra" than more APL-style operations.
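One possible way to write the curry() helper mentioned above, offered as a sketch rather than as anything the thread settled on. The registry dict and the body of registerStuff() are hypothetical stand-ins, since the messages never show what registerStuff() does. Later Python versions ship functools.partial in the standard library, which covers the same ground.

```python
def curry(func, *args, **kwds):
    """Return a callable that calls func with args/kwds filled in up front."""
    def curried(*more_args, **more_kwds):
        # Arguments given at call time follow the pre-bound positional ones;
        # call-time keywords override pre-bound keywords of the same name.
        merged = dict(kwds)
        merged.update(more_kwds)
        return func(*(args + more_args), **merged)
    return curried

registry = {}  # hypothetical: where registerStuff() keeps its callbacks

def registerStuff(key, callback):
    """Hypothetical stand-in for the registration call used in the thread."""
    registry[key] = callback

class Foo:
    def something(self, n):
        return "something(%d)" % n

    def setupStuff(self):
        # Equivalent to: registerStuff('1', lambda: self.something(1))
        registerStuff('1', curry(self.something, 1))
```

Calling `Foo().setupStuff()` and then `registry['1']()` invokes `something(1)` on that instance, which is what the lambda version in the thread does.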
--Guido van Rossum (home page: http://www.python.org/~guido/)

From exarkun@intarweb.us Thu Jan 2 19:22:00 2003
From: exarkun@intarweb.us (Jp Calderone)
Date: Thu, 2 Jan 2003 14:22:00 -0500
Subject: [Python-Dev] map, filter, reduce, lambda
In-Reply-To: <200301021853.h02IrYk20504@odiug.zope.com>
References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com>
Message-ID: <20030102192200.GA23667@meson.dyndns.org>

--82I3+IH0IqGh5yIs
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Jan 02, 2003 at 01:53:34PM -0500, Guido van Rossum wrote:
> > I'm more worried about lambda going away, though. Is:
> > [snip example]
>
>     class Foo:
>         def setupStuff(self):
>             registerStuff('1', curry(self.something, 1))
>
> curry() is something you could write now, and seems mildly useful. Of
> course it's more functional cruft, but I'd rather add more "function
> algebra" than more APL-style operations.

True (and I have implemented curry ;). For this kind of thing, I think it's
a totally acceptable replacement.

But, another example: sometimes you want to conform to a particular
interface, but return a constant or invoke another function with only some
of the arguments provided. For example, overriding a Request object's
getSession in the case where it hasn't got one:

    request.getSession = lambda interface = None: None

Or in Tk, I use "lambda x: 'break'" as a callback, to indicate that an event
should be propagated no further.
Or if I want only part of a result passed on to a later callback, or want it
transformed somehow (or both), I often use something like:

    lambda result: tuple(result[0])

Obviously any of these could simply be a function created with 'def', but it
seems so much less elegant that way: more lines of code are required,
information about what's happening is moved away from where it happens to
some potentially distant location, *and* worst of all, I have to come up
with a name for these new functions ;)

Losing lambda would be a huge inconvenience, but probably not a crippling
loss. I would be happier if there were a better reason than that some
people don't like functional elements or that it might save a few kilobytes
in the interpreter core (if it would save more than a few, excuse me, I
haven't investigated this part of the interpreter yet).

Jp

--82I3+IH0IqGh5yIs
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.0 (GNU/Linux)

iD8DBQE+FJFYedcO2BJA+4YRAoorAJ9LC1QmTPUq8Vz7CXdEW1p6HFefkgCfQj84
ah+R90zeATrgQl6vc3UCp2Y=
=ym2j
-----END PGP SIGNATURE-----

--82I3+IH0IqGh5yIs--

From esr@thyrsus.com Thu Jan 2 19:14:40 2003
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 2 Jan 2003 14:14:40 -0500
Subject: [Python-Dev] map, filter, reduce, lambda
In-Reply-To: <20030102183134.GA23582@meson.dyndns.org>
References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org>
Message-ID: <20030102191440.GB7150@thyrsus.com>

--nFreZHaLTZJo0R7j
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Jp Calderone :
> Nested scopes make it better than it once would have been, but I can still
> see a lot of my GUI code getting a lot hairier if I don't have lambda at my
> disposal. I don't see a decent way to implement it in Python, either.
Exactly my pragmatic objection. As distinct from my emotional LISP-head
reaction, which is "AAAAARRGGHHH!!!"
--
Eric S. Raymond

--nFreZHaLTZJo0R7j
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+FI+grfUW04Qh8RwRAt3CAKCcAOWXCYxWGmVJaEDiYTwFHKJgqQCgswwp
TIsnkS5WvT/qFpVNHXNzhas=
=6NjA
-----END PGP SIGNATURE-----

--nFreZHaLTZJo0R7j--

From skip@pobox.com Thu Jan 2 19:27:11 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 2 Jan 2003 13:27:11 -0600
Subject: [Python-Dev] no expected test output for test_sort?
In-Reply-To:
References: <15892.29633.664617.560348@montanaro.dyndns.org>
Message-ID: <15892.37519.804602.556008@montanaro.dyndns.org>

    Tim> Perhaps you're not incrementing test_sort's nerrors variable
    Tim> *unless* you're in verbose mode -- as above, without seeing what
    Tim> you actually did, it's a guessing game.

Sorry, my mistake. Here's the function I added to test_sort.py. It goes
right after the call to bug453523():

    def cmpNone():
        global nerrors

        if verbose:
            print "Testing None as a comparison function."

        L = range(50)
        random.shuffle(L)
        try:
            L.sort(None)
        except TypeError:
            print " Passing None as cmpfunc failed."
            nerrors += 1
        else:
            if L != range(50):
                print " Passing None as cmpfunc failed."
                nerrors += 1

    cmpNone()

As I said, when I run test_sort.py directly, it does the right thing (fails
in an unpatched interpreter and succeeds in a patched interpreter).

    >> Running regrtest.py with the -g flag doesn't seem to be generating a
    >> test_sort file anyplace obvious:

    Tim> That's expected. If a test doesn't produce output, -g deliberately
    Tim> avoids creating an expected-output file.

Ah, thanks. I didn't realize that.

Skip

From tim.one@comcast.net Thu Jan 2 19:43:54 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 02 Jan 2003 14:43:54 -0500
Subject: [Python-Dev] no expected test output for test_sort?
In-Reply-To: <15892.37519.804602.556008@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > ... > Here's the function I added to test_sort.py. It goes right after the call > to bug453523(): > > def cmpNone(): > global nerrors > > if verbose: > print "Testing None as a comparison function." > > L = range(50) > random.shuffle(L) > try: > L.sort(None) > except TypeError: > print " Passing None as cmpfunc failed." > nerrors += 1 > else: > if L != range(50): > print " Passing None as cmpfunc failed." > nerrors += 1 > > cmpNone() Sorry, this works fine for me. That is, I added this code to test_sort.py, and then tried it (on my unpatched 2.3 build): C:\Code\python\PCbuild>python ../lib/test/regrtest.py test_sort test_sort test test_sort produced unexpected output: ********************************************************************** *** lines 2-3 of actual output doesn't appear in expected output after line 1: + Passing None as cmpfunc failed. + Test failed 1 ********************************************************************** 1 test failed: test_sort C:\Code\python\PCbuild> Check your PATH and PYTHONPATH etc -- the evidence suggests you're not actually doing what you think you're doing. From skip@pobox.com Thu Jan 2 20:18:14 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 2 Jan 2003 14:18:14 -0600 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: References: <15892.37519.804602.556008@montanaro.dyndns.org> Message-ID: <15892.40582.284715.290847@montanaro.dyndns.org> Tim> Check your PATH and PYTHONPATH etc -- the evidence suggests you're Tim> not actually doing what you think you're doing. Ah, you know what? "python" is the installed Python (no cmpfunc patch). "./python.exe" is in the source tree (with the cmpfunc patch). I'll bet even though I was manually executing regrtest.py from the source tree, it was finding test_sort.py in the installed tree. Yup, here's the proof: % python -v ../Lib/test/regrtest.py -g test_sort ... 
    # /Users/skip/local/lib/python2.3/test/test_sort.pyc matches /Users/skip/local/lib/python2.3/test/test_sort.py
    import test.test_sort # precompiled from /Users/skip/local/lib/python2.3/test/test_sort.py
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                             this is the install directory

It never occurred to me that regrtest would use the normal import mechanism
to find test files. I guess I assumed it would either tweak sys.path to
guarantee its directory was searched first or use execfile.

Thanks for the tip.

Skip

From martin@v.loewis.de Thu Jan 2 21:01:36 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 02 Jan 2003 22:01:36 +0100
Subject: [Python-Dev] _tkinter, Tcl, threads, and deadlocks
Message-ID:

It turns out that using a threaded Tcl is even more tricky; I'm now
facing a problem where I need some advice.

Tcl uses "apartment threading", meaning that you can use the
interpreter only in the thread that has created it. _tkinter
traditionally supports calls from arbitrary threads. To make these
work together, invoking commands from another thread now sends
messages (through Tcl_ThreadQueueEvent). The interpreter thread
fetches these messages (in Tcl_DoOneEvent), and then calls the command
in the context of the right thread.

For this to work, the interpreter thread must actually dispatch
events; if it doesn't, the caller would block. I consider this
undesirable, since no blocking occurred in the past, and since people
will have problems diagnosing the source of the deadlock. So I now
raise a RuntimeError("main thread not in main loop").

Unfortunately, this produces a race condition. The typical scenario is

    window = None

    def gui_thread():
        global window
        window = Tkinter.Toplevel()
        # ... create more widgets ...
        tk.mainloop()

    def main_thread():
        start_new_thread(gui_thread, ())
        while window is None:
            sleep(0.1)
        window.firstlabel['text'] = "new message"

Here, the last statement of main_thread may raise that RuntimeError,
as window is initialized, but mainloop is not yet entered.
To solve this problem, I considered the following options.

A. The application creates a threading.Event tk_event. The
   main_thread uses tk_event.wait(). gui_thread first builds up the
   GUI, then invokes tk_event.set() immediately before invoking
   tk.mainloop. Unfortunately, this still gives a small window for a
   race condition, between setting the event and actually entering
   the mainloop.

B. The RuntimeError is removed; the calling thread will just block.
   As I said, this might give inexplicable deadlocks for applications
   that worked just fine in the past. Errors should never pass
   silently.

C. main_thread does not immediately raise the RuntimeError, but waits
   for some time for the mainloop to come up. I'm proposing 1s. This
   is still a heuristic, since 1s may still be too short (as might be
   any other value).

D. gui_thread can indicate that it will invoke the mainloop by
   invoking tk.willdispatch(). With this indication, RuntimeError
   won't be raised anymore, but main_thread will block. This is
   different from B, as the application author made a deliberate
   change to indicate that the interpreter will start dispatching at
   some point in the near future.

So for Python 2.3a1, I have implemented both C and D; Guido commented
that he considers D to be ugly. Any alternative suggestions are
encouraged; comments on the choices taken are appreciated.

In case you wonder whether this race condition is real: in both
applications in Debian that crash under Python 2.2 because of Tcl
apartment threading, this RuntimeError was observed after backporting
my 2.3 changes to 2.2.

Regards,
Martin

From guido@python.org Thu Jan 2 21:10:23 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 02 Jan 2003 16:10:23 -0500
Subject: [Python-Dev] _tkinter, Tcl, threads, and deadlocks
In-Reply-To: Your message of "02 Jan 2003 22:01:36 +0100."
References:
Message-ID: <200301022110.h02LANb00647@odiug.zope.com>

Are we sure we want to use threaded Tcl/Tk? Why?
--Guido van Rossum (home page: http://www.python.org/~guido/) From ark@research.att.com Thu Jan 2 21:32:54 2003 From: ark@research.att.com (Andrew Koenig) Date: 02 Jan 2003 16:32:54 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301021853.h02IrYk20504@odiug.zope.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> Message-ID: Guido> curry() is something you could write now, and seems mildly Guido> useful. Of course it's more functional cruft, but I'd rather Guido> add more "function algebra" than more APL-style operations. I'm presently working on a pattern-matching library that uses lambda as a way of representing circular data structures. The basic idea is that I build up these data structures from components, and if one of those components is a function, it signifies a request to call that function while the data structure is being traversed. Hence the circularity. Here's an example that defines a pattern that matches a parenthesis-balanced string: bal = Arbno(Notany("()") | "(" + Call(lambda: bal) + ")") I really, really don't want to have to define a separate function each time I write one of these expressions. -- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark From jason@tishler.net Thu Jan 2 21:40:21 2003 From: jason@tishler.net (Jason Tishler) Date: Thu, 02 Jan 2003 16:40:21 -0500 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules _randommodule.c,1.1,1.2 In-Reply-To: References: Message-ID: <20030102214021.GC1996@tishler.net> --Boundary_(ID_meN2sFazSQzs1Ot5QBXrAg) Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT Content-disposition: inline On Tue, Dec 31, 2002 at 05:14:17PM -0500, Tim Peters wrote: > Please don't make this kind of change -- it makes the code so much > harder to follow. 
If this is needed for Cygwin, then, e.g., do > > [snip] I believe that I have found a cleaner solution to this problem. Cygwin's ld can auto-import functions: http://www.cygwin.com/ml/cygwin-apps/2001-08/msg00024.html Specifically, the following snippet is the most pertinent: We "always" have allowed 'auto-import' of *functions* that are exported by the DLL (as long as the DLL contains the appropriate symbols). Note I don't believe that "always" pertained when I first started down this path in the Python 2.0 time frame. Anyway, with the attached patch to pyport.h, I was able to build Cygwin Python without any errors. Note this includes the new datetime module from CVS -- not the patched one in sandbox. I feel this is the best approach because modules should build under Cygwin without the standard Cygwin style patch that I have been submitting for years. Do others concur? If so, then I will begin to clean up the "mess" that I have created. Now if SF could search for patches by the submitter, my job would be a little easier... 
Jason

--
PGP/GPG Key: http://www.tishler.net/jason/pubkey.asc or key servers
Fingerprint: 7A73 1405 7F2B E669 C19D 8784 1AFD E4CC ECF4 8EF6

--Boundary_(ID_meN2sFazSQzs1Ot5QBXrAg)
Content-type: text/plain; charset=us-ascii; NAME=pyport.h.diff
Content-transfer-encoding: 7BIT
Content-disposition: attachment; filename=pyport.h.diff

Index: pyport.h
===================================================================
RCS file: /cvsroot/python/python/dist/src/Include/pyport.h,v
retrieving revision 2.57
diff -u -p -r2.57 pyport.h
--- pyport.h 28 Dec 2002 21:56:07 -0000 2.57
+++ pyport.h 2 Jan 2003 20:51:50 -0000
@@ -429,7 +429,11 @@ and both these use __declspec()
 # else /* Py_BUILD_CORE */
 /* Building an extension module, or an embedded situation */
 /* public Python functions and data are imported */
-# define PyAPI_FUNC(RTYPE) __declspec(dllimport) RTYPE
+# if defined(__CYGWIN__)
+# define PyAPI_FUNC(RTYPE) RTYPE
+# else /* __CYGWIN__ */
+# define PyAPI_FUNC(RTYPE) __declspec(dllimport) RTYPE
+# endif /* __CYGWIN__ */
 # define PyAPI_DATA(RTYPE) extern __declspec(dllimport) RTYPE
 /* module init functions outside the core must be exported */
 # if defined(__cplusplus)

--Boundary_(ID_meN2sFazSQzs1Ot5QBXrAg)--

From guido@python.org Thu Jan 2 21:48:05 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 02 Jan 2003 16:48:05 -0500
Subject: [Python-Dev] map, filter, reduce, lambda
In-Reply-To: Your message of "02 Jan 2003 16:32:54 EST."
References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com>
Message-ID: <200301022148.h02Lm6K03766@odiug.zope.com>

> I'm presently working on a pattern-matching library that uses lambda
> as a way of representing circular data structures.
The basic idea is > that I build up these data structures from components, and if one of > those components is a function, it signifies a request to call that > function while the data structure is being traversed. Hence the > circularity. > > Here's an example that defines a pattern that matches a > parenthesis-balanced string: > > bal = Arbno(Notany("()") | "(" + Call(lambda: bal) + ")") > > I really, really don't want to have to define a separate function each > time I write one of these expressions. If they really are all that simple and for this specific purpose, I wonder if you can't embed a reference to a variable directly, e.g. bal = Arbno(Notany("()") | "(" + Local("bal") + ")") --Guido van Rossum (home page: http://www.python.org/~guido/) From ark@research.att.com Thu Jan 2 21:57:46 2003 From: ark@research.att.com (Andrew Koenig) Date: Thu, 2 Jan 2003 16:57:46 -0500 (EST) Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301022148.h02Lm6K03766@odiug.zope.com> (message from Guido van Rossum on Thu, 02 Jan 2003 16:48:05 -0500) References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> Message-ID: <200301022157.h02Lvkq25988@europa.research.att.com> Guido> If they really are all that simple and for this specific purpose, I Guido> wonder if you can't embed a reference to a variable directly, e.g. Guido> bal = Arbno(Notany("()") | "(" + Local("bal") + ")") I will sometimes want the lambda to yield something more complicated than a variable. I also use lambda heavily in the implementation of the library. 
A sample code fragment:

    def evaluate(self):
        data, end = \
            self.pat.traverse(lambda obj, *args: obj.evaluate(self.seq, *args),
                              self.begin, self.data)
        return data

Yes, I realize I could rewrite it this way:

    def evaluate(self):
        def foo(obj, *args):
            return obj.evaluate(self.seq, *args)
        data, end = self.pat.traverse(foo, self.begin, self.data)
        return data

but I'd rather not.

(Actually, what I really want here is to be able to write this:

    def evaluate(self):
        data, end = self.pat.traverse(virtual evaluate, self.begin, self.data)
        return data

where "virtual evaluate" is a made-up notation for "the evaluate method of
whatever object I wind up getting my hands on.")

From pedronis@bluewin.ch Thu Jan 2 22:08:33 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Thu, 2 Jan 2003 23:08:33 +0100
Subject: [Python-Dev] map, filter, reduce, lambda
References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com>
Message-ID: <071c01c2b2ab$7e0677a0$6d94fea9@newmexico>

From: "Guido van Rossum"

> > I'm presently working on a pattern-matching library that uses lambda
> > as a way of representing circular data structures. The basic idea is
> > that I build up these data structures from components, and if one of
> > those components is a function, it signifies a request to call that
> > function while the data structure is being traversed. Hence the
> > circularity.
> >
> > Here's an example that defines a pattern that matches a
> > parenthesis-balanced string:
> >
> >     bal = Arbno(Notany("()") | "(" + Call(lambda: bal) + ")")
> >
> > I really, really don't want to have to define a separate function each
> > time I write one of these expressions.
>
> If they really are all that simple and for this specific purpose, I
> wonder if you can't embed a reference to a variable directly, e.g.
>
>     bal = Arbno(Notany("()") | "(" + Local("bal") + ")")

I find that playing with callers' stack frames is more evil than lambda :)

Although this whole thing is about Py3K, it seems rather a silly "waste" of
time.

I don't know how lambda really partitions the user base. I know some are
vocally against it as a kind of sin.

But I find as easy to argument about how much a conceptual dead-weight
lambda is, a waste of precious users' memory/learning capabilities and the
non purity of lambda, as to argument that "practicality beats purity" and
yes we have lambda but we are not a Lisp, and yes it's limited (crippled in
Lisper's eyes) but for that just fine and pythonic.

don't-sweat-ly y'rs

From martin@v.loewis.de Thu Jan 2 22:21:30 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 02 Jan 2003 23:21:30 +0100
Subject: [Python-Dev] _tkinter, Tcl, threads, and deadlocks
In-Reply-To: <200301022110.h02LANb00647@odiug.zope.com>
References: <200301022110.h02LANb00647@odiug.zope.com>
Message-ID:

Guido van Rossum writes:

> Are we sure we want to use threaded Tcl/Tk? Why?

It's not a choice. It is a configure option when you build Tcl. If the
"system Tcl" is built with threads, we somehow have to accommodate.
_tkinter determines at runtime whether Tcl was built in threaded or
non-threaded mode.

In the specific case, the Debian Tcl maintainer chose to build Tcl 8.4
in threaded mode for Debian-unstable (what will become Debian 3.1),
apparently as some Debian Tcl users had requested that Tcl be built
that way. The Debian Python maintainer now has the choice to link with
Tcl 8.3 to get a non-threaded Tcl, or to link with Tcl 8.4. Using the
old version is not satisfactory.

I expect that other Linux distributions will follow, as threaded Tcl
apparently has advantages for Tcl users; it also quite nicely
cooperates with threaded Perl, which uses the same threading model
(use Perl interpreter only in a single thread, use multiple
interpreters if you want multiple threads).
Regards, Martin From skip@pobox.com Thu Jan 2 22:24:14 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 2 Jan 2003 16:24:14 -0600 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> Message-ID: <15892.48142.993092.418042@montanaro.dyndns.org> >>>>> "Andrew" == Andrew Koenig writes: Andrew> I'm presently working on a pattern-matching library that uses Andrew> lambda as a way of representing circular data structures. ... Andrew> Here's an example ... Andrew> bal = Arbno(Notany("()") | "(" + Call(lambda: bal) + ")") That looks very cool. To avoid lambda I suspect you could have Call() take a string ("bal") which is later eval'd in the correct context to get the actual function object to be called. (This eval() could be performed once, caching the result to avoid repeated lookups.) In this particular case, "lambda: bal" is simply deferring evaluation of bal anyway. My observation about map and friends was just that. It seems there is little support for apply, it having already been deprecated. Map and reduce seem iffy. Nobody's stepped up to castigate filter, though I don't recall seeing any defenders either. Lambda seems to be the only one of the bunch with any strong support. It's perfect in those situations where it works, but too limited everywhere else. Couldn't lambda be more-or-less supplanted by a thin wrapper around new.function()? I took a quick crack at it but failed. Probably caffeine or sugar deficit. 
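
The string-based alternative to "lambda: bal" suggested above (look the name up later, in the right context, and cache the result) could be sketched roughly as follows. LazyName is a hypothetical helper invented for this illustration; it is not part of Andrew's library, whose Call() interface is not shown in the thread, and it assumes the caller passes the namespace explicitly rather than having it discovered from a stack frame.

```python
class LazyName:
    """Hypothetical helper: resolve a name in a namespace on first use."""

    def __init__(self, name, namespace):
        self.name = name
        self.namespace = namespace
        self._resolved = False
        self._value = None

    def get(self):
        # Evaluate the name once, then reuse the cached object.
        if not self._resolved:
            self._value = eval(self.name, self.namespace)
            self._resolved = True
        return self._value


# The reference can be created before the name it refers to exists,
# which is the same deferral that "lambda: bal" provides.
namespace = {}
ref = LazyName("bal", namespace)
namespace["bal"] = ["stand-in for a pattern object"]
pattern = ref.get()
```

Unlike the lambda, the cached reference will not notice a later rebinding of the name, and the caller has to supply the namespace by hand.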
Skip From neal@metaslash.com Thu Jan 2 22:25:24 2003 From: neal@metaslash.com (Neal Norwitz) Date: Thu, 2 Jan 2003 17:25:24 -0500 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules _randommodule.c,1.1,1.2 In-Reply-To: <20030102214021.GC1996@tishler.net> References: <20030102214021.GC1996@tishler.net> Message-ID: <20030102222524.GB29873@epoch.metaslash.com> On Thu, Jan 02, 2003 at 04:40:21PM -0500, Jason Tishler wrote: > > Anyway, with the attached patch to pyport.h, I was able to build Cygwin > Python without any errors. Note this includes the new datetime module > from CVS -- not the patched one in sandbox. I think you can simplify the patch by doing: #if !defined(__CYGWIN__) #define PyAPI_FUNC(RTYPE) __declspec(dllimport) RTYPE #endif (ie, just don't define PyAPI_FUNC on line 432) PyAPI_FUNC will still get defined in the next block (445). > Now if SF could search for patches by the submitter, my job would be a > little easier... You can look for patches closed and assigned to you. That should help. But the line below (which has 35 hits) is probably faster/easier: egrep '\.tp_.* =' */*.c It may not be everything, but should be a pretty good start. Or you could grep all your checkins. 
:-) Neal -- > Index: pyport.h > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Include/pyport.h,v > retrieving revision 2.57 > diff -u -p -r2.57 pyport.h > --- pyport.h 28 Dec 2002 21:56:07 -0000 2.57 > +++ pyport.h 2 Jan 2003 20:51:50 -0000 > @@ -429,7 +429,11 @@ and both these use __declspec() > # else /* Py_BUILD_CORE */ > /* Building an extension module, or an embedded situation */ > /* public Python functions and data are imported */ > -# define PyAPI_FUNC(RTYPE) __declspec(dllimport) RTYPE > +# if defined(__CYGWIN__) > +# define PyAPI_FUNC(RTYPE) RTYPE > +# else /* __CYGWIN__ */ > +# define PyAPI_FUNC(RTYPE) __declspec(dllimport) RTYPE > +# endif /* __CYGWIN__ */ > # define PyAPI_DATA(RTYPE) extern __declspec(dllimport) RTYPE > /* module init functions outside the core must be exported */ > # if defined(__cplusplus) From martin@v.loewis.de Thu Jan 2 22:29:33 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 02 Jan 2003 23:29:33 +0100 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules _randommodule.c,1.1,1.2 In-Reply-To: <20030102214021.GC1996@tishler.net> References: <20030102214021.GC1996@tishler.net> Message-ID: Jason Tishler writes: > I feel this is the best approach because modules should build under > Cygwin without the standard Cygwin style patch that I have been > submitting for years. Do others concur? If so, then I will begin to > clean up the "mess" that I have created. This is what I thought a reasonable operating system and compiler should do by default, without even asking. I certainly agree that it is desirable that you can put function pointers into static structures, so if it takes additional compiler flags to make it so, then use those flags. I'm unclear why you have to *omit* the declspec, though, to make it work - I thought that __declspec(dllimport) is precisely the magic incantation that makes the compiler emit the necessary thunks. 
> Now if SF could search for patches by the submitter, my job would be a > little easier... Doing a full-text search on Cygwin should give a pretty good hit ratio. regards, Martin From guido@python.org Thu Jan 2 22:34:16 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 17:34:16 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Your message of "Thu, 02 Jan 2003 16:57:46 EST." <200301022157.h02Lvkq25988@europa.research.att.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <200301022157.h02Lvkq25988@europa.research.att.com> Message-ID: <200301022234.h02MYGB04029@odiug.zope.com> I'll have to think about this more. lambda is annoying to me because it adds so little compared to writing a one-line def. But it makes sense for certain programming styles. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 2 22:36:33 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 17:36:33 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Your message of "Thu, 02 Jan 2003 23:08:33 +0100." 
<071c01c2b2ab$7e0677a0$6d94fea9@newmexico> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> Message-ID: <200301022236.h02MaXj04046@odiug.zope.com> > But I find as easy to argument about how much a conceptual > dead-weight lambda is, a waste of precious users' memory/learning > capabilities and the non purity of lambda, as to argument that > "practicality beats purity" and yes we have lambda but we are not a > Lisp, and yes it's limited (crippled in Lisper's eyes) but for that > just fine and pythonic. Good point. PBP beats TOOWTDI. :-) BTW, I don't think it's crippled any more -- we now have nested scopes, remember. --Guido van Rossum (home page: http://www.python.org/~guido/) From ark@research.att.com Thu Jan 2 22:43:45 2003 From: ark@research.att.com (Andrew Koenig) Date: Thu, 2 Jan 2003 17:43:45 -0500 (EST) Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <15892.48142.993092.418042@montanaro.dyndns.org> (message from Skip Montanaro on Thu, 2 Jan 2003 16:24:14 -0600) References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <15892.48142.993092.418042@montanaro.dyndns.org> Message-ID: <200301022243.h02Mhjr02102@europa.research.att.com> Andrew> bal = Arbno(Notany("()") | "(" + Call(lambda: bal) + ")") Skip> That looks very cool. I hope to be able to give a talk about it at the Oxford conference in April. From bac@OCF.Berkeley.EDU Thu Jan 2 22:41:43 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Thu, 2 Jan 2003 14:41:43 -0800 (PST) Subject: [Python-Dev] no expected test output for test_sort? 
In-Reply-To: <15892.40582.284715.290847@montanaro.dyndns.org> References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > It never occurred to me that regrtest would use the normal import mechanism > to find test files. I guess I assumed it would either tweak sys.path to > guarantee its directory was searched first or use execfile. > I have been bitten by that myself. Should this be changed? Adding ``.`` to the front of ``sys.path`` wouldn't be that big of a deal. Also, is there any desire to try to move all of the regression tests over to PyUnit? Or is the general consensus that it just isn't worth the time to move them over? -Brett From pedronis@bluewin.ch Thu Jan 2 22:45:09 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Thu, 2 Jan 2003 23:45:09 +0100 Subject: [Python-Dev] map, filter, reduce, lambda References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> Message-ID: <07f601c2b2b0$9ad47bc0$6d94fea9@newmexico> From: "Guido van Rossum" > > But I find as easy to argument about how much a conceptual > > dead-weight lambda is, a waste of precious users' memory/learning > > capabilities and the non purity of lambda, as to argument that > > "practicality beats purity" and yes we have lambda but we are not a > > Lisp, and yes it's limited (crippled in Lisper's eyes) but for that > > just fine and pythonic. > > Good point. PBP beats TOOWTDI. :-) > > BTW, I don't think it's crippled any more -- we now have nested > scopes, remember. just expressions and no statements, that's quite crippled for a Lisper :-). 
The whole statement/expression distinction is evil for "them", enforce readability for "us". From esr@thyrsus.com Thu Jan 2 22:48:29 2003 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 2 Jan 2003 17:48:29 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301022234.h02MYGB04029@odiug.zope.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <200301022157.h02Lvkq25988@europa.research.att.com> <200301022234.h02MYGB04029@odiug.zope.com> Message-ID: <20030102224829.GB8035@thyrsus.com> Guido van Rossum : > I'll have to think about this more. lambda is annoying to me because > it adds so little compared to writing a one-line def. But it makes > sense for certain programming styles. Indeed it does. There's a reason old LISP-heads like me have a tendency to snarl "You'll take my lambdas when you pry them from my cold, dead fingers", and it's not just orneriness. It comes from a view of the world in which the property of "having a name" is orthogonal to the property of "being a function". If that's one's view of the world, writing lots of little one-line defs is not just annoying, it's a kind of coincidental cohesion or pollution. -- Eric S. Raymond From esr@thyrsus.com Thu Jan 2 22:52:17 2003 From: esr@thyrsus.com (Eric S. 
Raymond) Date: Thu, 2 Jan 2003 17:52:17 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301022236.h02MaXj04046@odiug.zope.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> Message-ID: <20030102225217.GC8035@thyrsus.com> Guido van Rossum : > Good point. PBP beats TOOWTDI. :-) Is that a ukase of the Tsar? I'll remember it. :-) > BTW, I don't think it's crippled any more -- we now have nested > scopes, remember. Yes, that helps a lot. In particular, it greatly strengthens lambda for its practical role in anonymous callbacks for GUIs. There is still the problem that lambdas can only capture expressions, not statements. But I've given up on that one, alas. -- Eric S. Raymond From esr@thyrsus.com Thu Jan 2 22:59:18 2003 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 2 Jan 2003 17:59:18 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <15892.48142.993092.418042@montanaro.dyndns.org> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <15892.48142.993092.418042@montanaro.dyndns.org> Message-ID: <20030102225918.GD8035@thyrsus.com> Skip Montanaro : > My observation about map and friends was just that. It seems there is > little support for apply, it having already been deprecated. Map and reduce > seem iffy. Nobody's stepped up to castigate filter, though I don't recall > seeing any defenders either. Lambda seems to be the only one of the bunch > with any strong support. It's perfect in those situations where it works, > but too limited everywhere else. 
Speaking for me...I would miss apply() and reduce() very little, and filter() only somewhat -- but losing map() and lambda would be *very* painful. I think that we shouldn't nibble at functional programming piecemeal, though. Either Python is going to support this or it's not. If it's not, shoot 'em all. If it is, I say keep 'em all and add curry(). I'm in favor of supporting functional programming. This is not mere zealotry on my part; I can imagine taking the opposite position if the complexity or implementation cost of those primitives were high. But my impression is that it is not. -- Eric S. Raymond From guido@python.org Thu Jan 2 23:15:29 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 18:15:29 -0500 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: Your message of "Thu, 02 Jan 2003 14:41:43 PST." References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> Message-ID: <200301022315.h02NFTP04449@odiug.zope.com> > > It never occurred to me that regrtest would use the normal import mechanism > > to find test files. I guess I assumed it would either tweak sys.path to > > guarantee its directory was searched first or use execfile. > > I have been bitten by that myself. Should this be changed? Adding ``.`` > to the front of ``sys.path`` wouldn't be that big of a deal. I agree it's surprising. Propose a patch! > Also, is there any desire to try to move all of the regression tests > over to PyUnit? Or is the general consensus that it just isn't > worth the time to move them over? I see little value in trying to move all the tests to PyUnit just for the sake of it -- some tests are better written in another way, some tests are using doctest which has its own place, and there are so many tests that it would be utterly tedious boring work. But many of the old tests suck badly (they don't test much of interest beyond the existence of API functions). 
It would be valuable to improve these tests, and I would recommend converting them to PyUnit (or doctest, in some cases) in the process. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 2 23:16:46 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 18:16:46 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Your message of "Thu, 02 Jan 2003 17:52:17 EST." <20030102225217.GC8035@thyrsus.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <20030102225217.GC8035@thyrsus.com> Message-ID: <200301022316.h02NGkU04469@odiug.zope.com> > Guido van Rossum : > > Good point. PBP beats TOOWTDI. :-) > > Is that a ukase of the Tsar? I'll remember it. :-) No, just an ad-hoc observation in this case. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Jan 2 23:12:06 2003 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Fri, 03 Jan 2003 00:12:06 +0100 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301022236.h02MaXj04046@odiug.zope.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> Message-ID: <3E14C746.60709@lemburg.com> Guido van Rossum wrote: >>But I find as easy to argument about how much a conceptual >>dead-weight lambda is, a waste of precious users' memory/learning >>capabilities and the non purity of lambda, as to argument that >>"practicality beats purity" and yes we have lambda but we are not a >>Lisp, and yes it's limited (crippled in Lisper's eyes) but for that >>just fine and pythonic. > > > Good point. PBP beats TOOWTDI. :-) > > BTW, I don't think it's crippled any more -- we now have nested > scopes, remember. ... and they were added just for this reason. I was never a fan of lambda -- a simple def is just as readable, but the complexity that nested scopes added to the Python interpreter just to make lambda users happy has to be worth something. If you rip out lambda now, you might as well go back to the good old three step lookup scheme. As for the other APIs in the subject line: I really don't understand what this discussion is all about. map(), filter(), reduce() have all proven their usefulness in the past. Anyway, just my 2 cents, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... 
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From ark@research.att.com Thu Jan 2 23:11:54 2003 From: ark@research.att.com (Andrew Koenig) Date: Thu, 2 Jan 2003 18:11:54 -0500 (EST) Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <20030102225918.GD8035@thyrsus.com> (esr@thyrsus.com) References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <15892.48142.993092.418042@montanaro.dyndns.org> <20030102225918.GD8035@thyrsus.com> Message-ID: <200301022311.h02NBsj02619@europa.research.att.com> Eric> Speaking for me...I would miss apply() and reduce() very little, and filter() Eric> only somewhat -- but losing map() and lambda would be *very* painful. You can write apply, reduce, filter, and map if you want. You can't write lambda if it's not already there. Therefore, I feel more strongly about lambda than about the others. From guido@python.org Thu Jan 2 23:11:39 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 18:11:39 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Your message of "Thu, 02 Jan 2003 17:43:45 EST." <200301022243.h02Mhjr02102@europa.research.att.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <15892.48142.993092.418042@montanaro.dyndns.org> <200301022243.h02Mhjr02102@europa.research.att.com> Message-ID: <200301022311.h02NBdE04436@odiug.zope.com> > I hope to be able to give a talk about it at the Oxford conference in > April. I guess you won't make it to PyCon in DC the week before that? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@alum.mit.edu Thu Jan 2 23:29:09 2003 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 2 Jan 2003 18:29:09 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <3E14C746.60709@lemburg.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14C746.60709@lemburg.com> Message-ID: <15892.52037.298884.948991@slothrop.zope.com> >>>>> "MAL" == mal writes: >> BTW, I don't think it's crippled any more -- we now have nested >> scopes, remember. MAL> ... and they were added just for this reason. One reason among many, yes. I've certainly written a lot of code in the last year or two that exploits nested scopes using nested functions. It's used in a lot of little corners of ZODB, like the test suite. MAL> I was never a fan MAL> of lambda -- a simple def is just as readable There seems to be a clear line here. Some folks think a def is just as readable, some folks think it is unnecessarily tedious to add a bunch of statements and invent a name. MAL> As for the other APIs in the subject line: I really don't MAL> understand what this discussion is all about. map(), filter(), MAL> reduce() have all proven their usefulness in the past. Useful enough to be builtins? They could just as easily live in a module. They could even be implemented in Python if performance weren't a big issue. Jeremy From bac@OCF.Berkeley.EDU Thu Jan 2 23:49:54 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Thu, 2 Jan 2003 15:49:54 -0800 (PST) Subject: [Python-Dev] no expected test output for test_sort? 
In-Reply-To: <200301022315.h02NFTP04449@odiug.zope.com> References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <200301022315.h02NFTP04449@odiug.zope.com> Message-ID: [Guido van Rossum] > But many of the old tests suck badly (they don't test much of interest > beyond the existence of API functions). It would be valuable to > improve these tests, and I would recommend converting them to PyUnit > (or doctest, in some cases) in the process. > Is there a bug report or feature request or something listing what tests need a good reworking? -Brett From tismer@tismer.com Fri Jan 3 00:21:01 2003 From: tismer@tismer.com (Christian Tismer) Date: Fri, 03 Jan 2003 01:21:01 +0100 Subject: [Python-Dev] map, filter, reduce, lambda References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> Message-ID: <3E14D76D.4020400@tismer.com> Guido van Rossum wrote: >>But I find as easy to argument about how much a conceptual >>dead-weight lambda is, a waste of precious users' memory/learning >>capabilities and the non purity of lambda, as to argument that >>"practicality beats purity" and yes we have lambda but we are not a >>Lisp, and yes it's limited (crippled in Lisper's eyes) but for that >>just fine and pythonic. > > > Good point. PBP beats TOOWTDI. :-) > > BTW, I don't think it's crippled any more -- we now have nested > scopes, remember. I was never against lambda. map, reduce, filter, apply: They are just abstractions which take a function and let them work on arguments in certain ways. They are almost obsolete now since they can be replaced by powerful constructs like comprehensions and the very cute asterisk calls. 
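A minimal sketch of the replacements mentioned above, assuming a current Python 3 interpreter rather than the 2.3-era one under discussion (so map() and filter() return iterators, reduce() lives in functools, and apply() no longer exists). The names words and join3 are made up for illustration.

```python
from functools import reduce

words = ["spam", "Eggs", "ham"]

# map(f, seq) versus a list comprehension
upper_map = list(map(str.upper, words))
upper_comp = [w.upper() for w in words]

# filter(pred, seq) versus a comprehension with a condition
short_filter = list(filter(lambda w: len(w) < 4, words))
short_comp = [w for w in words if len(w) < 4]


def join3(a, b, c):
    """Hypothetical three-argument function used to show argument unpacking."""
    return "-".join((a, b, c))


# apply(join3, words) in old Python versus the "asterisk call" form
joined = join3(*words)

# reduce() has no direct comprehension counterpart; it folds a sequence
total_len = reduce(lambda acc, w: acc + len(w), words, 0)
```

The comprehension forms give the same lists as map() and filter() here; reduce() is the one that does not translate into a comprehension.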
lambda, on the other hand, *is* a powerful construct, since it provides an ad-hoc functional value. It is not restricted by having to be declared in a syntactical proper place. I think this is the major point. One of Python's strengths is that declarations are seldom needed, almost all objects can be created in-place. Not so with defs: They enforce a declaration before use, while lambda denotes a functional value. I'd even think to extend and generalize lambda. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tim@zope.com Fri Jan 3 00:26:26 2003 From: tim@zope.com (Tim Peters) Date: Thu, 2 Jan 2003 19:26:26 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <3E1403D2.3090605@lemburg.com> Message-ID: [M.-A. Lemburg] > The only right way to deal with these problems is to raise > ValueErrors. That was one of the presented alternatives, yes . > Calculations resulting in these local times are > simply doomed and should be done in UTC instead. That one isn't the issue here: DST-aware calculations under the datetime module indeed *must* convert to UTC (or some other fixed reference). Arithmetic within a single time zone is "naive" under datetime, doing no adjustments whatsoever to the date and time fields you'd expect if you had no notion of time zone. The question is what happens when you get a final result in UTC (or other fixed reference), try to convert it back to your local time zone, and the astimezone() method deduces it's the hour at the end of DST that can't be spelled in local time. > DST and local times are not mathematical properties, so you > shouldn't expect them to behave in that way. 
They *have* mathematical properties, though, and very simple ones at that. The difficulty is that they're not complicated enough to give the illusion of meeting naive expectations. > For some fun reading have a look at the tzarchive package docs at: > ftp://elsie.nci.nih.gov/pub/ time zones are compressed dicts. From skip@pobox.com Fri Jan 3 00:48:13 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 2 Jan 2003 18:48:13 -0600 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> Message-ID: <15892.56781.473869.187757@montanaro.dyndns.org> Brett> Also, is there any desire to try to move all of the regression Brett> tests over to PyUnit? Or is the general consensus that it just Brett> isn't worth the time to move them over? I think this is something you probably don't want to do all at once, but as you add new tests. It's probably also best left for the earliest part of the development cycle, e.g., right after 2.3final is released. Screwing up your test suite any more than necessary during alpha/beta testing seems sort of like aiming a shotgun at your foot to me. Skip From tim@zope.com Fri Jan 3 01:04:36 2003 From: tim@zope.com (Tim Peters) Date: Thu, 2 Jan 2003 20:04:36 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <200301021255.h02CtTe19745@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido, on Britain's peculiarities in 1945 and 1947] > ... > I wonder if this affects an assumption of Tim's correctness proof? I'd say it blows it all to hell. Is 1940's Britain an important use case? Here's the most accurate and complete astimezone() method that can be written (ignoring endcases due to naive objects and insane implementations of tzinfo methods):

    def astimezone(self, tz):
        selfasutc = self.replace(tzinfo=None) - self.utcoffset()
        intz = selfasutc.replace(tzinfo=tz)
        # Make no assumptions.  Since the offset-returning
        # methods are restricted to minute granularity and
        # magnitude less than one day, the only possible equivalents
        # in tz can be enumerated exhaustively.
        results = []  # list of all equivalents in tz
        for offset in range(1-24*60, 24*60):
            candidate = intz + timedelta(minutes=offset)
            asutc = candidate - candidate.utcoffset()
            if selfasutc == asutc.replace(tzinfo=None):
                results.append(candidate)
        return results

Skipping all the subtleties and optimizations, the current astimezone() is more like this:

    def astimezone(self, tz):
        selfasutc = self.replace(tzinfo=None) - self.utcoffset()
        other = selfasutc.replace(tzinfo=tz)
        # Convert other to tz's notion of standard time.
        other += other.utcoffset() - other.dst()
        if self == other:  # comparison works in UTC time by magic
            return other
        # Else tz must want a daylight spelling.
        assert other.dst() != 0
        other += other.dst()
        if self == other:
            return other
        raise ValueError("self has no spelling in the "
                         "target time zone")

That's more efficient than trying thousands of possibilities. The crucial subtlety in the correctness proof hinges on that other.utcoffset() - other.dst() returns the same value before and after this line:

    other += other.utcoffset() - other.dst()

But if that isn't true, the implementation is wrong. Worse, it has no legs left to stand on. Here's a happy idea: punt. We could define a new tzinfo method to be implemented by the user:

    def fromutc(self, dt):
        """Convert UTC time to local time.

        dt is a plain datetime or a naive datetimetz whose date
        and time members represent a UTC time.  Return a datetimetz
        y, with y.tzinfo == self, representing the equivalent in
        tz's local time.
        """

Then I can raise an exception for unspellable hours in my Eastern class, and you can make them return your birthday in your Eastern class, and the two surviving members of the United Kingdom Memorial Double Daylight Society can eat watercress sandwiches at the local pub in peace.
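A rough sketch of how such a hook could look, written against the datetime module as it exists in current Python 3 (which has a tzinfo.fromutc() method) rather than the datetimetz prototype discussed here. FixedOffset and the free function astimezone are hypothetical names for illustration only; a fixed-offset zone sidesteps unspellable hours entirely, so this shows only the shape of the delegation, not the hard DST case.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical zone with a constant UTC offset given in minutes."""

    def __init__(self, minutes):
        self._offset = timedelta(minutes=minutes)

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "fixed"

    def fromutc(self, dt):
        # dt holds UTC date and time fields, with this zone already attached
        # as its tzinfo; shifting by the offset gives the local spelling.
        return dt + self._offset


def astimezone(dt, tz):
    """Convert an aware datetime to tz by delegating to tz.fromutc()."""
    utc_fields = (dt - dt.utcoffset()).replace(tzinfo=tz)
    return tz.fromutc(utc_fields)


eastern = FixedOffset(-300)   # UTC-5, assumed for the example
central_eu = FixedOffset(60)  # UTC+1, assumed for the example
start = datetime(2003, 1, 2, 20, 4, tzinfo=eastern)
converted = astimezone(start, central_eu)
```

A zone with DST would put its policy for the problematic hours (raise, pick one spelling, and so on) inside its own fromutc(), which is the point of punting.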
astimezone becomes:

    def astimezone(self, dt):
        return dt.fromutc((self - self.utcoffset()).replace(tzinfo=None))

From skip@pobox.com Fri Jan 3 01:08:11 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 2 Jan 2003 19:08:11 -0600 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <3E14D76D.4020400@tismer.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> Message-ID: <15892.57979.261663.613025@montanaro.dyndns.org> It just occurred to me one reason I'm not enamored of lambdas is that the keyword "lambda" has no mnemonic meaning at all the way for example that "def" is suggestive of "define". I suspect most programmers have never studied the Lambda Calculus, and other than perhaps during a brief exposure to Lisp will never have encountered the term at all. (Remember, Andrew & Eric are hardly poster children for your run-of-the-mill programmers.) Somewhat tongue-in-cheek here, and definitely not suggesting this for Python 2.x, but is there perhaps a little bit of line noise we can appropriate from Perl to replace the lambda construct? Lambda's brevity seems to be its major strength. Is there a reason def couldn't have been reused in this context? perhaps-i-should-write-a-pep-ly, y'rs, Skip From guido@python.org Fri Jan 3 01:21:18 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 20:21:18 -0500 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: Your message of "Thu, 02 Jan 2003 15:49:54 PST." 
References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <200301022315.h02NFTP04449@odiug.zope.com> Message-ID: <200301030121.h031LIC06430@pcp02138704pcs.reston01.va.comcast.net> > Is there a bug report or feature request or something listing what tests > need a good reworking? Alas, no. But just look at any random test and you've got 50% to hit a bad apple. Look for tests that still have an expected output file in Lib/test/output, and the probability shoots way up. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 3 01:23:00 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 20:23:00 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Your message of "Fri, 03 Jan 2003 01:21:01 +0100." <3E14D76D.4020400@tismer.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> Message-ID: <200301030123.h031N0a06472@pcp02138704pcs.reston01.va.comcast.net> > I think this is the major point. One of Python's > strengths is that declarations are seldom needed, > almost all objects can be created in-place. > Not so with defs: They enforce a declaration before > use, while lambda denotes a functional value. You think of a def as a declaration. I don't: to me, it's just an assignment, no more obtrusive than using a variable to hold a subexpression that's too long to comfortably fit on a line. 
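A small sketch of that point, assuming a current Python 3 interpreter; the names lower_key, lower_key_lambda, and names are invented for illustration. A def binds a name to a function object in the same way an assignment binds a name to a lambda, and either name can then be rebound or passed around.

```python
def lower_key(s):
    """Sort key written as a def: binds the name lower_key to a function."""
    return s.lower()


# The same function value bound to a name by an ordinary assignment.
lower_key_lambda = lambda s: s.lower()

names = ["bob", "Alice", "carol"]
sorted_with_def = sorted(names, key=lower_key)
sorted_with_lambda = sorted(names, key=lower_key_lambda)

# A def'd name is just a variable: it can be aliased like any other value.
alias = lower_key
```

The visible difference is that the def records its name on the function object, while the lambda is reported as "<lambda>" in tracebacks.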
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim@zope.com Fri Jan 3 01:27:16 2003 From: tim@zope.com (Tim Peters) Date: Thu, 2 Jan 2003 20:27:16 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <20030102180729.GA6262@panix.com> Message-ID: [Aahz] > I'm not sure I agree. As I see it, "wall time" is for users. On the > display side, I believe that users *expect* to see > > 1:59:57 am > 1:59:58 am > 1:59:59 am > 1:00:00 am > 1:00:01 am > > I therefore see no problem with the UTC->wall clock conversion. That's defensible for some class of uses, although it may not be implementable. The tzinfo classes here are arbitrary pieces of user-written Python code, and all astimezone() can do is (1) build objects and pass them to its methods, to see what they return; and, (2) make assumptions. In particular, astimezone() has no knowledge of when DST begins or ends, or even of whether a tzinfo class believes there *is* such a thing as DST. In this case it can detect the unspellable hour, so I suppose it could add more code trying to infer what the local clock believes about it. > Going the other direction requires an explicit statement of which > timezone you're in at the point of conversion (a real timezone, > not a virtual one like "US Eastern"). As others have pointed out, C's struct tm tm_isdst flag serves this purpose for "US Eastern", and such a flag *could* be added to datetimetz objects too. The practical reality appears to be that people want hybrid classes like this, ill-defined or not. > Presumably that only occurs as a result of user input, and when you > redisplay the input as a wall clock, it should be obvious to the user > if the wrong time zone was selected because the time will be an hour > (or whatever) off. The only way this is a problem seems to be if you > want to do round-trip conversions purely programmatically. 
If you view Guido's appointment calendar over the web, which he enters in US Eastern these days, and want it displayed in your local time, then (a) there's nothing roundtrip about it -- it's a one-way conversion; and (b) you'll have no idea whether some items are an hour off. For a web-based group scheduling application, this isn't even a stretch. From pnorvig@google.com Fri Jan 3 01:41:26 2003 From: pnorvig@google.com (Peter Norvig) Date: Thu, 02 Jan 2003 17:41:26 -0800 Subject: [Python-Dev] map, filter, reduce, lambda References: <20030102222701.30434.69825.Mailman@mail.python.org> Message-ID: <3E14EA46.9010400@google> I use a function I call "method" for what Andrew wants (roughly):

    def method(name, *args):
        """Return a function that invokes the named method with the optional args.
        Ex: method('upper')('a') ==> 'A'; method('count', 't')('test') ==> 2"""
        return lambda x: getattr(x, name)(*args)

although if Andrew really wants to squeeze in an extra arg (which he didn't mention in his "virtual evaluate" syntax), then he needs something like

    def method(name, *static_args):
        return lambda x, *dyn_args: getattr(x, name)(*(dyn_args + static_args))

but the problem I have with this (as with curry) is that I can never remember the order of arguments. -Peter ark@research.att.com wrote:

>Yes, I realize I could rewrite it this way:
>
>    def evaluate(self):
>        def foo(obj, *args):
>            return obj.evaluate(self.seq, *args)
>        data, end = self.pat.traverse(foo, self.begin, self.data)
>        return data
>
>but I'd rather not.
>
>(Actually, what I really want here is to be able to write this:
>
>    def evaluate(self):
>        data, end = self.pat.traverse(virtual evaluate, self.begin, self.data)
>        return data
>

From guido@python.org Fri Jan 3 01:45:53 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 20:45:53 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Thu, 02 Jan 2003 20:04:36 EST." 
References: Message-ID: <200301030145.h031jr506617@pcp02138704pcs.reston01.va.comcast.net>

> Here's a happy idea: punt.

I'd be happy to punt on the multiple DST switch points for astimezone().

> We could define a new tzinfo method to be implemented by the user:
>
>     def fromutc(self, dt):
>         """Convert UTC time to local time.
>
>         dt is a plain datetime or a naive datetimetz whose date
>         and time members represent a UTC time.  Return a datetimetz
>         y, with y.tzinfo == self, representing the equivalent in
>         tz's local time.
>         """
>
> Then I can raise an exception for unspellable hours in my Eastern class, and
> you can make them return your birthday in your Eastern class, and the two
> surviving members of the United Kingdom Memorial Double Daylight Society can
> eat watercress sandwiches at the local pub in peace.
>
> astimezone becomes:
>
>     def astimezone(self, dt):
>         return dt.fromutc((self - self.utcoffset()).replace(tzinfo=None))

But now even the simplest tzinfo implementations (those with a fixed offset) have to implement something a bit tricky:

    ZERO = timedelta()
    class fixed(tzinfo):
        def __init__(self, offset):
            self.offset = timedelta(minutes=offset)
        def utcoffset(self, dt):
            return self.offset
        def dst(self, dt):
            return ZERO
        def fromutc(self, dt):
            return datetimetz.combine(dt.date(), dt.time(), tzinfo=self) + self.offset

I find that last line cumbersome. For variable-dst zones it goes way up: the DST transition boundaries have to be translated to UTC in order to be able to make sense out of them in fromutc(), but they're still needed in local time for dst(). But I agree that we could probably improve life for all involved if we changed the tzinfo implementation. Perhaps we should constrain DST-aware timezones to the most common model, where there are two offsets (standard and DST) and two transition datetime points in local standard time (DST on and DST off). 
A tzinfo would have methods that would let you inquire these things directly; the offsets would be simple attributes, and the DST on and off points would be returned by a method that only takes the year as input (astimezone() would have to be careful to expect DST on to be > DST off for the southern hemisphere). I suppose the Israeli implementation would have to use a table of transition points for past years and always return standard time for future years (since the Knesseth decides on the switch points each year). The implementation for British time during WWII would have to tell a little lie -- big deal. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 3 01:52:18 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 20:52:18 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Your message of "Thu, 02 Jan 2003 19:08:11 CST." <15892.57979.261663.613025@montanaro.dyndns.org> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> Message-ID: <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> > Somewhat tongue-in-cheek here, and definitely not suggesting this > for Python 2.x, but is there perhaps a little bit of line noise we > can appropriate from Perl to replace the lambda construct? Lambda's > brevity seems to be its major strength. Is there a reason def > couldn't have been reused in this context? 
You couldn't reuse def, because lambda can start an expression which can occur at the start of a line, so a line starting with def would be ambiguous (Python's parser is intentionally simple-minded and doesn't like having to look ahead more than one token). Moreover, def is short for define, and we don't really define something here. Maybe callback would work? I'd add the parentheses back to the argument list for more uniformity with def argument lists, so we'd get:

    lst.sort(callback(a, b): cmp(a.lower(), b.lower()))

(Yes, I'm familiar with Schwartzian transform, but it's not worth the complexity if the list is short.) But I still think that to the casual programmer a two-liner looks better:

    def callback(a, b): return cmp(a.lower(), b.lower())
    lst.sort(callback)

Maybe the Lisp folks are more tolerant of nested parentheses. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From bac@OCF.Berkeley.EDU Fri Jan 3 02:02:59 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Thu, 2 Jan 2003 18:02:59 -0800 (PST) Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: <200301030121.h031LIC06430@pcp02138704pcs.reston01.va.comcast.net> References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <200301022315.h02NFTP04449@odiug.zope.com> <200301030121.h031LIC06430@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > Is there a bug report or feature request or something listing what tests > > need a good reworking? > > Alas, no. But just look at any random test and you've got 50% to hit > a bad apple. Look for tests that still have an expected output file > in Lib/test/output, and the probability shoots way up. > But I did that and the testing suite was rejected (rewrote test_thread to use the dummy_thread testing suite), so I managed to still miss. 
=) -Brett From tim@zope.com Fri Jan 3 02:12:11 2003 From: tim@zope.com (Tim Peters) Date: Thu, 2 Jan 2003 21:12:11 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <200301030145.h031jr506617@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Tim suggests a tzinfo.fromutc() method] [Guido] > But now even the simplest tzinfo implementations (those with a fixed > offset) have to implement something a bit tricky: > > ZERO = timedelta() > class fixed(tzinfo): > def __init__(self, offset): > self.offset = timedelta(minutes=offset) > def utcoffset(self, dt): > return self.offset > def dst(self, dt): > return ZERO > def fromutc(self, dt): > return datetimetz.combine(dt.date(), dt.time(), tzinfo=self) > + self.offset > > I find that last line cumbersome. It is. Since astimezone() will only call it with a datetimetz argument, it could be simplified to return dt.replace(tzinfo=self) + self.offset For that matter, astimezone could attach self to dt before calling this, so that the user implementation becomes return dt + self.offset > For variable-dst zones it goes way up: the DST transition > boundaries have to be translated to UTC in order to be able to make > sense out of them in fromutc(), but they're still needed in local time > for dst(). Some time zones have to translate them to UTC in dst() anyway; your EU.py does this, for example. > But I agree that we could probably improve life for all involved if we > changed the tzinfo implementation. Perhaps we should constrain > DST-aware timezones to the most common model, where there are two > offsets (standard and DST) and two transition datetime points in local > standard time (DST on and DST off). A tzinfo would have methods that > would let you inquire these things directly; the offsets would be > simple attributes, and the DST on and off points would be returned by > a method that only takes the year as input (astimezone() would have to > be careful to expect DST on to be > DST off for the southern > hemisphere). 
I suppose the Israeli implementation would have to use a > table of transition points for past years and always return standard > time for future years (since the Knesseth decides on the switch points > each year). The implementation for British time during WWII would > have to tell a little lie -- big deal. Well, I'm not sure that adding a bunch of new methods is going to make life easier for users. Another idea: fromutc() is just a defined hook. If a tzinfo object has it, astimezone() will use it and believe whatever it returns. If it's not defined, then the current implementation is used -- and it should work for any tz where tz.utcoffset(dt)-tz.dst(dt) is invariant wrt dt. That covers the only worlds I live in . If we constrain what a tzinfo class *can* do to fit a one-size-fits-all implementation, then we may as well steal Java's SimpleTimeZone API and let users get away without coding anything (in return, they get to spend days trying to guess what all the constructor arguments really mean ). From aahz@pythoncraft.com Fri Jan 3 02:13:50 2003 From: aahz@pythoncraft.com (Aahz) Date: Thu, 2 Jan 2003 21:13:50 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: References: <20030102180729.GA6262@panix.com> Message-ID: <20030103021350.GA14851@panix.com> On Thu, Jan 02, 2003, Tim Peters wrote: > [Aahz] >> >> I'm not sure I agree. As I see it, "wall time" is for users. On the >> display side, I believe that users *expect* to see >> >> 1:59:57 am >> 1:59:58 am >> 1:59:59 am >> 1:00:00 am >> 1:00:01 am >> >> I therefore see no problem with the UTC->wall clock conversion. > > That's defensible for some class of uses, although it may not be > implementable. The tzinfo classes here are arbitrary pieces of > user-written Python code, and all astimezone() can do is (1) build > objects and pass them to its methods, to see what they return; and, > (2) make assumptions. 
In particular, astimezone() has no knowledge of > when DST begins or ends, or even of whether a tzinfo class believes > there *is* such a thing as DST. > > In this case it can detect the unspellable hour, so I suppose it could > add more code trying to infer what the local clock believes about it. It sure sounds like you're saying that the DateTime module can't handle timezone conversions at all. If it can, I don't understand what extra ambiguity (in human terms) is introduced by DST, as long as there are no round-trip conversions. >> Presumably that only occurs as a result of user input, and when you >> redisplay the input as a wall clock, it should be obvious to the user >> if the wrong time zone was selected because the time will be an hour >> (or whatever) off. The only way this is a problem seems to be if you >> want to do round-trip conversions purely programmatically. > > If you view Guido's appointment calendar over the web, which he > enters in US Eastern these days, and want it displayed in your local > time, then (a) there's nothing roundtrip about it -- it's a one-way > conversion; and (b) you'll have no idea whether some items are an hour > off. For a web-based group scheduling application, this isn't even a > stretch. I'm confused: are you saying that Guido's calendar doesn't get internally converted to UTC? If not, then the chart from your message starting this thread doesn't apply, because here you're talking about converting from wall clock to wall clock, which I think is a conversion that makes no sense -- that perhaps would be a suitable occasion for an exception. If Guido's appointment does get converted to UTC, then there's always a sensible (if possibly ambiguous) conversion to wall time, and if he looks at his calendar, he should be able to see if his appointment got converted to UTC correctly. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "There are three kinds of lies: Lies, Damn Lies, and Statistics." 
--Disraeli From guido@python.org Fri Jan 3 02:15:50 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 21:15:50 -0500 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: Your message of "Thu, 02 Jan 2003 18:02:59 PST." References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <200301022315.h02NFTP04449@odiug.zope.com> <200301030121.h031LIC06430@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200301030215.h032Fo006797@pcp02138704pcs.reston01.va.comcast.net> > > > Is there a bug report or feature request or something listing what tests > > > need a good reworking? > > > > Alas, no. But just look at any random test and you've got 50% to hit > > a bad apple. Look for tests that still have an expected output file > > in Lib/test/output, and the probability shoots way up. > > But I did that and the testing suite was rejected (rewrote test_thread to > use the dummy_thread testing suite), so I managed to still miss. =) Maybe that was just a poor choice? A test suite for threads that can run just as well using dummy_thread seems to me one that doesn't really exercise threads that much... A good example of a lousy test is test_cmath.py. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Fri Jan 3 02:19:37 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 2 Jan 2003 20:19:37 -0600 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15892.62265.348753.317968@montanaro.dyndns.org> >> Is there a reason def couldn't have been reused in this context? Guido> You couldn't reuse def, because lambda can start an expression Guido> which can occur at the start of a line, so a line starting with Guido> def would be ambiguous (Python's parser is intentionally Guido> simple-minded and doesn't like having to look ahead more than one Guido> token). I'll leave that gauntlet thrown, since I have no interest in rewriting Python's parser. Maybe it will spark John Aycock's interest though. ;-) Guido> Moreover, def is short for define, and we don't really define Guido> something here. Yes we do, we define a function, we just don't associate it with a name. In theory the following two function definitions could be equivalent: def f(a,b): return a+b f = def(a,b): a+b and except for that pesky two-token lookahead would be possible. Guido> Maybe callback would work? I'd add the parentheses back to the Guido> argument list for more uniformity with def argument lists, so Guido> we'd get: I agree about adding back the parens, but not the name. Lambdas are used for more than callbacks. 
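Skip's claim that a def and an anonymous function define the same kind of object can be checked in present-day Python. This is a minimal sketch that uses lambda, since the `f = def(a,b): a+b` spelling discussed here is only a proposal in this thread; the names named_add and anon_add are illustrative and not from the original messages.

```python
def named_add(a, b):
    return a + b

# lambda builds the same kind of function object; only the name binding
# (and the function's __name__) differ from the def above.
anon_add = lambda a, b: a + b

print(type(named_add) is type(anon_add))
print(named_add.__name__, anon_add.__name__)
print(named_add(1, 2) == anon_add(1, 2))
```

The only visible difference is the recorded name: the def version carries its own name, while the anonymous one is labelled `<lambda>`.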
Returning to a previous gauntlet I threw which nobody picked up, I came up with this replacement for lambda. I call it "lamduh".

    import new
    import sys

    def lamduh(args, expr):
        argstr = ", ".join(map(str, args))
        uplevel = sys._getframe(1)    # for you tcl fans...
        g = {}
        exec "def f(%s): return (%s)" % (argstr, expr) in g
        f = g['f']
        return new.function(f.func_code, uplevel.f_globals, "<lambda>",
                            f.func_defaults, f.func_closure)

And here's some example usage:

    >>> f = lamduh.lamduh((), "bal")
    >>> f
    <function <lambda> at 0xca0bc8>
    >>> bal = 2
    >>> f()
    2
    >>> bal = 3
    >>> f()
    3
    >>> bal = f
    >>> f()
    <function <lambda> at 0xca0bc8>
    >>> g = lamduh.lamduh(('a','b'), "a+b")
    >>> g(1,2)
    3
    >>> g("arm", "oir")
    'armoir'

Skip

From guido@python.org Fri Jan 3 02:29:06 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 02 Jan 2003 21:29:06 -0500
Subject: [Python-Dev] Holes in time
In-Reply-To: Your message of "Thu, 02 Jan 2003 21:13:50 EST." <20030103021350.GA14851@panix.com>
References: <20030102180729.GA6262@panix.com> <20030103021350.GA14851@panix.com>
Message-ID: <200301030229.h032T6O06847@pcp02138704pcs.reston01.va.comcast.net>

> I'm confused: are you saying that Guido's calendar doesn't get
> internally converted to UTC?

That's correct. I use a Palmpilot, which has no knowledge of timezones or DST at all. In fact, I write down appointments in whatever local time I will be in on a specific date, i.e. if I were to fly to California to meet with you at 1pm on some date, I'd put that down for 1pm on that date; on the plane I'll set the Palmpilot's clock back three hours.

I've heard of a calendar application on Windows (Outlook?) that records all your appointments in UTC and displays them in local time. That means that when you are temporarily in a different timezone and you change your computer's clock to match local time, all your appointments are displayed in that local time -- which means that if you want to look up an appointment next week when you're back home, it is shown wrong! (E.g.
a 1pm appointment in Washington DC would show up as 10am when the user is in San Francisco -- very confusing when you're in SF talking on the phone to a potential customer on the east coast!) > If not, then the chart from your message starting this thread > doesn't apply, because here you're talking about converting from > wall clock to wall clock, which I think is a conversion that makes > no sense -- that perhaps would be a suitable occasion for an > exception. You can get from wall clock + tzinfo to UTC quite easily using the utcoffset() method of the tzinfo -- that's what it is for. So wall clock to wall clock conversion reduces to UTC to wall clock conversion anyway. > If Guido's appointment does get converted to UTC, then there's always a > sensible (if possibly ambiguous) conversion to wall time, and if he > looks at his calendar, he should be able to see if his appointment got > converted to UTC correctly. No -- why would I care about UTC or even know the UTC offset of my wall clock time? If I make an appointment for early next April, why should I have to know whether DST applies or not at that particular date? --Guido van Rossum (home page: http://www.python.org/~guido/) From esr@thyrsus.com Fri Jan 3 02:32:11 2003 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 2 Jan 2003 21:32:11 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> References: <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030103023211.GA9050@thyrsus.com> Guido van Rossum : > Maybe the Lisp folks are more tolerant of nested parentheses. 
:-) Um. Well. Yes. "callback" is longer than "lambda". This makes it Not An Improvement. -- Eric S. Raymond From guido@python.org Fri Jan 3 02:47:28 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 21:47:28 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Thu, 02 Jan 2003 21:12:11 EST." References: Message-ID: <200301030247.h032lSc06970@pcp02138704pcs.reston01.va.comcast.net> > [Guido] > > But now even the simplest tzinfo implementations (those with a fixed > > offset) have to implement something a bit tricky: > > > > ZERO = timedelta() > > class fixed(tzinfo): > > def __init__(self, offset): > > self.offset = timedelta(minutes=offset) > > def utcoffset(self, dt): > > return self.offset > > def dst(self, dt): > > return ZERO > > def fromutc(self, dt): > > return datetimetz.combine(dt.date(), dt.time(), tzinfo=self) > > + self.offset > > > > I find that last line cumbersome. > > It is. Since astimezone() will only call it with a datetimetz > argument, That's not how I read your previous mail, but it's certainly more sensible, given that astimezone() is primarily a method on a datetimetz. :-) > it could be simplified to > > return dt.replace(tzinfo=self) + self.offset > > For that matter, astimezone could attach self to dt before calling > this, so that the user implementation becomes > > return dt + self.offset > > > For variable-dst zones it goes way up: the DST transition > > boundaries have to be translated to UTC in order to be able to make > > sense out of them in fromutc(), but they're still needed in local time > > for dst(). > > Some time zones have to translate them to UTC in dst() anyway; your > EU.py does this, for example. Agreed. DST-aware tzinfo classes will always be complex. > > But I agree that we could probably improve life for all involved > > if we changed the tzinfo implementation. 
Perhaps we should > > constrain DST-aware timezones to the most common model, where > > there are two offsets (standard and DST) and two transition > > datetime points in local standard time (DST on and DST off). A > > tzinfo would have methods that would let you inquire these things > > directly; the offsets would be simple attributes, and the DST on > > and off points would be returned by a method that only takes the > > year as input (astimezone() would have to be careful to expect DST > > on to be > DST off for the southern hemisphere). I suppose the > > Israeli implementation would have to use a table of transition > > points for past years and always return standard time for future > > years (since the Knesseth decides on the switch points each year). > > The implementation for British time during WWII would have to tell > > a little lie -- big deal. > > Well, I'm not sure that adding a bunch of new methods is going to > make life easier for users. I meant to use these *only* and throw away the other methods. (tzname() would have to be replaced by two string attributes, e.g. stdname and dstname, or perhaps a sequence of two strings, like the time module does in its tzname variable.) > Another idea: fromutc() is just a defined hook. If a tzinfo object > has it, astimezone() will use it and believe whatever it returns. > If it's not defined, then the current implementation is used -- and > it should work for any tz where tz.utcoffset(dt)-tz.dst(dt) is > invariant wrt dt. That covers the only worlds I live in . > > If we constrain what a tzinfo class *can* do to fit a > one-size-fits-all implementation, then we may as well steal Java's > SimpleTimeZone API and let users get away without coding anything > (in return, they get to spend days trying to guess what all the > constructor arguments really mean ). I want to be able to model at least table-based historic DST transitions, which SimpleTimeZone doesn't do. 
Its abstract base class, TimeZone, allows you to write a concrete subclass that *does* support historic DST transitions. But the mess Java made here is obvious in the number of deprecated methods and constructors, and I'd like to stay away from it as far as possible. (January==0? Gimme a break! :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From tim@zope.com Fri Jan 3 02:56:16 2003 From: tim@zope.com (Tim Peters) Date: Thu, 2 Jan 2003 21:56:16 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <20030103021350.GA14851@panix.com> Message-ID: [Aahz] > It sure sounds like you're saying that the DateTime module can't handle > timezone conversions at all. Nope, overall it does very well. Go back to the original msg: there are glitches during two hours per year, for tzinfo classes that try to model both daylight and standard time. That's it. > If it can, I don't understand what extra ambiguity (in human terms) > is introduced by DST, as long as there are no round-trip conversions. Sorry, I can't explain this any better than it's already been explained. There's no ambiguity in EST or EDT on their own, there is ambiguity twice a year in a single class that tries to combine both. > ... > I'm confused: are you saying that Guido's calendar doesn't get > internally converted to UTC? I'd say that whether it is is irrelevant to the example. Just as there are hours in UTC that can't be spelled unambiguously in US Eastern, there are hours in US Eastern that can't be spelled unambiguously in any other hybrid (daylight+standard) time zone. These problems have to do with the target time system, not with the originating time system. > If not, then the chart from your message starting this thread doesn't > apply, because here you're talking about converting from wall clock to > wall clock, which I think is a conversion that makes no sense Of course it does: at any moment, you can call Guido, and if he answers the phone you can ask him what his wall clock says.
He doesn't have to convert his local time to UTC to answer the question, and neither do you. There's "almost never" a problem with this, either. > -- that perhaps would be a suitable occasion for an exception. It seems to depend on the app a person has in mind. > If Guido's appointment does get converted to UTC, then there's always a > sensible (if possibly ambiguous) conversion to wall time, and if he > looks at his calendar, he should be able to see if his appointment got > converted to UTC correctly. Fine, pretend Guido lives in Greenwich and UTC is his local time. The ambiguity remains, and it doesn't matter whether Guido can detect it: by hypothesis, *you're* the one looking at his calendar, and in your local time. If an appointment then shows up as starting at 1:00 in your local time on the day daylight time ends for you, you're most likely to believe that's your daylight time (assuming you live in the US and in an area that observes DST). It may or may not be in reality, and whether that matters depends on the use you make of the information. From aahz@pythoncraft.com Fri Jan 3 02:57:03 2003 From: aahz@pythoncraft.com (Aahz) Date: Thu, 2 Jan 2003 21:57:03 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <200301030229.h032T6O06847@pcp02138704pcs.reston01.va.comcast.net> References: <20030102180729.GA6262@panix.com> <20030103021350.GA14851@panix.com> <200301030229.h032T6O06847@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030103025703.GA29127@panix.com> On Thu, Jan 02, 2003, Guido van Rossum wrote: >Aahz: >> >> I'm confused: are you saying that Guido's calendar doesn't get >> internally converted to UTC? > > That's correct. I use a Palmpilot, which has no knowledge of > timezones or DST at all. In fact, I write down appointments in > whatever local time I will be in on a specific date, i.e. 
if I were to > fly to California to meet with you at 1pm on some date, I'd put that > down for 1pm on that date; on the plane I'll set the Palmpilot's clock > back three hours. > > I've heard of a calendar application on Windows (Outlook?) that > records all your appointments in UTC and displays them in local time. > That means that when you are temporarily in a different timezone and > you change your computer's clock to match local time, all your > appointments are displayed in that local time -- which means that if > you want to look up an appointment next week when you're back home, it > is shown wrong! (E.g. a 1pm appointment in Washington DC would show > up as 10am when the user is in San Francisco -- very confusing when > you're in SF talking on the phone to a potential customer on the east > coast!) All that is true and makes perfect sense -- but then Tim's comment about converting your calendar to someone else's wall clock makes no sense. >> If not, then the chart from your message starting this thread >> doesn't apply, because here you're talking about converting from >> wall clock to wall clock, which I think is a conversion that makes >> no sense -- that perhaps would be a suitable occasion for an >> exception. > > You can get from wall clock + tzinfo to UTC quite easily using the > utcoffset() method of the tzinfo -- that's what it is for. So wall > clock to wall clock conversion reduces to UTC to wall clock > conversion anyway. No, because there's no information about *which* wall clock is being used. As you yourself said above, you casually enter times for appointments in the local time that you expect to use them, without recording the extra information needed to make conversions. >> If Guido's appointment does get converted to UTC, then there's always a >> sensible (if possibly ambiguous) conversion to wall time, and if he >> looks at his calendar, he should be able to see if his appointment got >> converted to UTC correctly. 
> > No -- why would I care about UTC or even know the UTC offset of my > wall clock time? If I make an appointment for early next April, why > should I have to know whether DST applies or not at that particular > date? You don't -- but when your application displays wall time, you'd see a discrepancy from the time you entered. This only applies when there's a consistent system of timezone information. You can't do date/time arithmetic and conversions with pure wall clock. Period. Python should raise exceptions if you try. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "There are three kinds of lies: Lies, Damn Lies, and Statistics." --Disraeli From barry@python.org Fri Jan 3 03:02:38 2003 From: barry@python.org (Barry A. Warsaw) Date: Thu, 2 Jan 2003 22:02:38 -0500 Subject: [Python-Dev] map, filter, reduce, lambda References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> Message-ID: <15892.64846.212924.977128@gargle.gargle.HOWL> >>>>> "SM" == Skip Montanaro writes: SM> Somewhat tongue-in-cheek here, and definitely not suggesting SM> this for Python 2.x, but is there perhaps a little bit of line SM> noise we can appropriate from Perl to replace the lambda SM> construct? Lambda's brevity seems to be its major strength. SM> Is there a reason def couldn't have been reused in this SM> context? For Py3K, I might suggest "anon" instead of lambda, especially if the construct were expanded to allow statements.
-Barry From aahz@pythoncraft.com Fri Jan 3 03:08:40 2003 From: aahz@pythoncraft.com (Aahz) Date: Thu, 2 Jan 2003 22:08:40 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: References: <20030103021350.GA14851@panix.com> Message-ID: <20030103030840.GB29127@panix.com> On Thu, Jan 02, 2003, Tim Peters wrote: > [Aahz] >> >> It sure sounds like you're saying that the DateTime module can't handle >> timezone conversions at all. > > Nope, overall it does very well. Go back to the original msg: there > are glitches during two hours per year, for tzinfo classes that try to > model both daylight and standard time. That's it. > >> If it can, I don't understand what extra ambiguity (in human terms) >> is introduced by DST, as long as there are no round-trip conversions. > > Sorry, I can't explain this any better than it's already been > explained. There's no ambiguity in EST or EDT on their own, there is > ambiguity twice a year in a single class that tries to combine both. So there's ambiguity. So what? What I don't understand is why it's a problem. More precisely, I see these problems existing in the absence of computers, and I don't see where creating a Python DateTime class creates any more problems or makes the existing problems worse -- as long as you don't try to convert between timezones in the absence of sufficient information. >> If not, then the chart from your message starting this thread doesn't >> apply, because here you're talking about converting from wall clock to >> wall clock, which I think is a conversion that makes no sense > > Of course it does: at any moment, you can call Guido, and if he > answers the phone you can ask him what his wall clock says. He > doesn't have to convert his local time to UTC to answer the question, > and neither do you. There's "almost never" a problem with this, > either. But that's not a conversion. 
>> If Guido's appointment does get converted to UTC, then there's always a >> sensible (if possibly ambiguous) conversion to wall time, and if he >> looks at his calendar, he should be able to see if his appointment got >> converted to UTC correctly. > > Fine, pretend Guido lives in Greenwich and UTC is his local time. The > ambiguity remains, and it doesn't matter whether Guido can detect it: > by hypothesis, *you're* the one looking at his calendar, and in your > local time. If an appointment then shows up as starting at 1:00 in > your local time on the day daylight time ends for you, you're most > likely to believe that's your daylight time (assuming you live in > the US and in an area that observes DST). It may or may not be in > reality, and whether that matters depends on the use you make of the > information. From my POV, this problem exists regardless of whether a computer mediates the transaction. The most likely error (if one happens) is that someone shows up an hour early for the appointment, and presumably that person knows that the day is a DST transition. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "There are three kinds of lies: Lies, Damn Lies, and Statistics." --Disraeli From esr@thyrsus.com Fri Jan 3 03:02:16 2003 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 2 Jan 2003 22:02:16 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <15892.64846.212924.977128@gargle.gargle.HOWL> References: <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <15892.64846.212924.977128@gargle.gargle.HOWL> Message-ID: <20030103030216.GB9221@thyrsus.com> Barry A.
Warsaw : > For Py3K, I might suggest "anon" instead of lambda, especially if the > construct were expanded to allow statements. Speaking as one of the unregenerate LISP-heads, I don't care what it's called as long as it's *there*. -- Eric S. Raymond From Anthony Baxter Fri Jan 3 03:15:58 2003 From: Anthony Baxter (Anthony Baxter) Date: Fri, 03 Jan 2003 14:15:58 +1100 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <15892.57979.261663.613025@montanaro.dyndns.org> Message-ID: <200301030315.h033Fwv03939@localhost.localdomain> >>> Skip Montanaro wrote > > It just occurred to me one reason I'm not enamored of lambdas Well, I know the reason _I'm_ not enamored of them is that I keep spelling the little sucker 'lamdba'. From barry@python.org Fri Jan 3 03:19:32 2003 From: barry@python.org (Barry A. Warsaw) Date: Thu, 2 Jan 2003 22:19:32 -0500 Subject: [Python-Dev] map, filter, reduce, lambda References: <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <15892.64846.212924.977128@gargle.gargle.HOWL> <20030103030216.GB9221@thyrsus.com> Message-ID: <15893.324.121003.351526@gargle.gargle.HOWL> >>>>> "ESR" == Eric S Raymond writes: >> For Py3K, I might suggest "anon" instead of lambda, especially >> if the construct were expanded to allow statements. ESR> Speaking as one of the unregenerate LISP-heads, I don't care ESR> what it's called as long as it's *there*. Oh well if that's the case, then I'll suggest floob_boober_bab_boober_bubs. :) -Barry From guido@python.org Fri Jan 3 04:12:58 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 02 Jan 2003 23:12:58 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Thu, 02 Jan 2003 21:29:06 EST." 
Message-ID: <200301030412.h034Cw207328@pcp02138704pcs.reston01.va.comcast.net>

Let me present the issue differently.

On Sunday, Oct 27, 2002, both 5:30 UTC and 6:30 UTC map to 1:30 am US Eastern time: 5:30 UTC maps to 1:30 am EDT, and 6:30 UTC maps to 1:30 am EST. (UTC uses a 24 hour clock.)

We have a tzinfo subclass representing the US Eastern (hybrid) timezone whose primary responsibility is to translate from local time in that timezone to UTC: Eastern.utcoffset(dt). It can also tell us how much of that offset is due to DST: Eastern.dst(dt).

It is crucial to understand that with "Eastern" as tzinfo, there is only *one* value for 1:30 am on Oct 27, 2002. The Eastern tzinfo object arbitrarily decides this is EDT, so it maps to 5:30 UTC. (But the problem would still exist if it arbitrarily decided that it was EST.)

It is also crucial to understand that we have no direct way to translate UTC to Eastern. We only have a direct way to translate Eastern to UTC (by subtracting Eastern.utcoffset(dt)).

Usually, the utcoffset() for times that differ only a few hours is the same, so we can approximate the reverse mapping (i.e. from UTC to Eastern) by using the utcoffset() for the input time and assuming that it is the same as that for the output time. Code for this is:

    # Initially dt is a datetimetz expressed in UTC whose tzinfo is None
    dt = dt.replace(tzinfo=Eastern)
    dt = dt + Eastern.utcoffset(dt)

This is not sufficient, however, close to the DST switch. For example, let's try this with an initial dt value of 4:30 am UTC on Oct 27, 2002. The code applies the UTC offset corresponding to 4:30 am Eastern, which is -5 hours (EST), so the result is 11:30 pm the previous day. But this is wrong! 11:30 pm Eastern that day is in DST, so the UTC offset should be -4 hours.

We can know we must make a correction, because we can compare the UTC offset of the result to the UTC offset of the input, and see that they differ. But what correction to make?
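The failure described above can be reproduced in present-day Python. This is a minimal sketch under stated assumptions, not the sandbox datetime.py code: the Eastern2002 class, the naive_from_utc helper, and the hard-coded 2002 switch dates are all hypothetical stand-ins for the "Eastern" tzinfo in the message, and today's datetime class is used in place of datetimetz.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)
ZERO = timedelta(0)
# Assumption for this sketch: only the 2002 US rules are modelled, with DST
# running from 2:00 am on Apr 7 to 2:00 am on Oct 27 (local wall-clock time).
DST_START = datetime(2002, 4, 7, 2)
DST_END = datetime(2002, 10, 27, 2)


class Eastern2002(tzinfo):
    """Hypothetical hybrid EST/EDT zone, valid for 2002 only."""

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        # 1:30 am on Oct 27 is arbitrarily taken to be EDT, as in the message.
        return HOUR if DST_START <= wall < DST_END else ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


EASTERN = Eastern2002()


def naive_from_utc(utc_dt):
    """Apply the two-line reverse mapping: reuse the offset of the input."""
    dt = utc_dt.replace(tzinfo=EASTERN)
    return dt + EASTERN.utcoffset(dt)


for hour in (3, 4, 5, 6, 7):
    utc = datetime(2002, 10, 27, hour, 30)
    local = naive_from_utc(utc)
    applied = EASTERN.utcoffset(utc.replace(tzinfo=EASTERN))
    # A mismatch between the offset applied and the offset of the result
    # is the signal that a correction is needed.
    print(utc.time(), "UTC ->", local.strftime("%b %d %H:%M"), local.tzname(),
          "| offsets agree:", applied == local.utcoffset())
```

Under these assumptions the 4:30 UTC input comes out as 11:30 pm the previous day with disagreeing offsets, and the 6:30 UTC input lands on 1:30 am, which the class reports as EDT. Only the 7:30 UTC input in this loop produces agreeing offsets.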
The problem is that when the input is 6:30 UTC, the result is 1:30 am Eastern, which is still taken to be EDT. If we apply the same correction as we did for 4:30 UTC, we get 2:30 am Eastern, but that's wrong, because that's in EST, corresponding to 7:30 UTC. But if we don't apply a correction, and stick with 1:30 am Eastern, we've got a time that corresponds to 5:30 UTC. So what time corresponds to 6:30 UTC? The problem for astimezone() is to come up with the correct result whenever it can, and yet somehow to fudge things so that 6:30 UTC gets translated to 1:30 Eastern. And astimezone() must not make many assumptions about the nature of DST. It can assume that the DST correction is >= 0 and probably less than 10 hours or so, and that DST changes don't occur more frequently than twice a year (once on and once off), and that the DST correction is constant during the DST period, and that the only variation in UTC offset is due to DST. But Tim has already taken all of that into account -- read his proof at the end of datetime.py: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/sandbox/datetime/datetime.py?rev=1.140&content-type=text/vnd.viewcvs-markup Can you do better? --Guido van Rossum (home page: http://www.python.org/~guido/) From bbum@codefab.com Fri Jan 3 03:40:17 2003 From: bbum@codefab.com (Bill Bumgarner) Date: Thu, 2 Jan 2003 22:40:17 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <20030103030915.11917.29684.Mailman@mail.python.org> Message-ID: <138B18FE-1ECD-11D7-A03A-000393877AE4@codefab.com> On Thursday, Jan 2, 2003, at 22:09 US/Eastern, python-dev-request@python.org wrote: > For Py3K, I might suggest "anon" instead of lambda, especially if the > construct were expanded to allow statements. If I weren't reading this list via digests on a 9,600 bps link that is only up three times a day, I would have beat Barry to that suggestion about 10 messages ago...
;-) +1 When I originally had lambda calculus thoroughly pounded into my head w/the anonymous clue-by-four, it didn't really click until someone said 'lambda functions are just anonymous functions'. Right. So why the heck are they called 'lambda'? Merely to make the less clueful CS students feel even more lost than they otherwise might? 'anon' sounds like a great name -- unlikely to be used, shorter than 'lambda', and a heck of lot more indicative as to what is going on. I'd just as soon live without the parens and *I* came from a Lisp/Scheme environment when I learned All About the Wonderful World of Lambda. b.bum From bbum@codefab.com Fri Jan 3 04:03:07 2003 From: bbum@codefab.com (Bill Bumgarner) Date: Thu, 2 Jan 2003 23:03:07 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <20030103021301.24528.81186.Mailman@mail.python.org> Message-ID: <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com> On Thursday, Jan 2, 2003, at 21:13 US/Eastern, python-dev-request@python.org wrote: > map, reduce, filter, apply: They are just abstractions > which take a function and let them work on arguments > in certain ways. They are almost obsolete now > since they can be replaced by powerful constructs > like comprehensions and the very cute asterisk calls. Short answer: -1 for removing map/reduce/filter/apply unless doing so will vastly improve the Python core (i.e. less size, greater speed, smaller footprint -- some combination in a usefully large fashion). Personally, I find comprehensions and very cute asterisk calls to be incredibly unreadable. map(), reduce(), filter() and apply() are all easy to read and provide a keyword via which determining their meaning in the documentation is a trivial task. Comprehensions were incomprehensible the first 10 times I ran into them. A few years ago and spanning a number of years before that, I did a huge amount of Python programming. 
I ended up taking about a 2.5 year break from serious Python programming-- not for any particularly good reason, in hindsight-- and returned to Python in the last year. Somewhere in that break, list comprehensions were added. This is comprehensible, but looks alien in light of everything Python I had learned in the past (substitute a simple, but real world, comprehension into here): [x for x in range(0,3)] This hurt when I had to figure it out and the essence of perl seemed rife within: [(x,y) for x in range(0,3) for y in range(0,3)] And this just seemed to be a positively perlescent way of doing things: [(x,y) for x in range(0,4) if x is not 2 for y in range(0,4) if y is not 2] Yuck, yuck, yuck! And it isn't just me! I have a number of peers who have deep roots in CS, QA, and/or Project Management with whom I tend to discuss random computing issues.... every single one agreed that list comprehensions are mighty powerful, but have gnarly syntax that is not in line with the general "zen" of Python. The asterisk stuff seems to be a similar bit of syntactic magic that is equally baffling to both the novice and to experienced Python programmers that have missed a year or two of the language's evolution. As a friend would say-- and not in a good way-- "That's positively Perlescent!". I know there isn't a snowball's chance in hell of these things changing anytime in the future... but before other similar features/extensions are added [dictionary comprehensions, for example], I beg of the community to step back and ask how much power is really gained in balance against the total confusion that may be caused? Python is a brilliant teaching tool-- I have taught OO programming to a number of people through the use of Python without them feeling overtly challenged by mechanics and esoterica that was outside of the focus of learning. Comprehensions and the asterisk notation have definitely taken away from the elegant simplicity of the language.
Both very powerful constructs that I have grown relatively comfortable with. With power, comes a price... the result of using such constructs is that my code has become significantly less approachable by relative newcomers to Python-- including my co-workers. It also means that /usr/lib/python*/*.py has become less of a source of learning for folks new to python. There is a lot of power in the core set of functional API found in the builtins. Probably the most powerful feature of things like map/filter/apply/reduce are that they are easily approached and consumed by the Python newcomer, thereby making the language more attractive by making it easier to do powerful things without learning esoteric/alien syntax. (The other construct that bent my brain for a bit were generators-- only because of what they did, not because of the pythonic implementation. Very cool. I like generators.) b.bum From gisle@ActiveState.com Fri Jan 3 05:42:31 2003 From: gisle@ActiveState.com (Gisle Aas) Date: 02 Jan 2003 21:42:31 -0800 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <138B18FE-1ECD-11D7-A03A-000393877AE4@codefab.com> References: <138B18FE-1ECD-11D7-A03A-000393877AE4@codefab.com> Message-ID: Bill Bumgarner writes: > When I originally had lambda calculus thoroughly pounded into my head > w/the anonymous clue-by-four, it didn't really click until someone > said 'lambda functions are just anonymous functions'. > > Right. So why the heck are they called 'lambda'? Merely to make the > less clueful CS students feel even more lost than they otherwise might? > > 'anon' sounds like a great name -- unlikely to be used, shorter than > 'lambda', and a heck of lot more indicative as to what is going on. 'anon' does not sound that great to me. Anon what? There is lots of anonymous stuff. Arc is going for 'fn'. 
I would vote for 'sub' :) Regards, Gisle @ ActiveState.com From brett@python.org Fri Jan 3 06:23:25 2003 From: brett@python.org (Brett Cannon) Date: Thu, 2 Jan 2003 22:23:25 -0800 (PST) Subject: [Python-Dev] python-dev Summary for 2002-12-16 through 2002-12-31 Message-ID: Sorry this is late; enjoyed the holidays. =) As usual, people have about a day to comment and point out how imperfect I am before I send this off to the big world out there. +++++++++++++++++++++++++++++++++++++++++++++++++++++ python-dev Summary for 2002-12-15 through 2002-12-31 +++++++++++++++++++++++++++++++++++++++++++++++++++++ ====================== Summary Announcements ====================== `Python 2.3a1`_ has been released! Please download it and try it out. The testing of the code helps python-dev a lot. So at the very least download it and run the test suite to see if any errors or failures come up. I did skip a thread on some proposed changes to ``ConfigParser``. If you care about that module you can read the thread at http://mail.python.org/pipermail/python-dev/2002-December/031505.html . Those of you viewing this on the web or by running it through Docutils will have noticed the table of contents preceding this section. I am giving this a try to see if I like it. If you have an opinion, let me know. Go to PyCon_ (and yes, I am going to say this every summary until PyCon is upon us)! .. _Python 2.3a1: http://www.python.org/2.3/ .. _PyCon: http://www.python.org/pycon/ ====================================================== `Adding decimal (aka FixedPoint) numbers to Python`__ ====================================================== __ http://mail.python.org/pipermail/python-dev/2002-December/031171.html Michael McLay brought to the attention of the list `patch #653938I`_ which not only adds the FixedPoint_ module (originally by Tim Peters) to the library but the ``d`` suffix for numeric constants to be used for numbers to be of the FixedPoint type. 
For those of you unfamiliar with fixed-point math, it is basically decimal math where the issues of rounding and representation are severely cut back. It would allow 0.3 + 0.3 to actually equal 0.6 and not 0.59999999999. Consensus was quickly reached that Michael should first work on getting the module accepted into the library. This is done so that python-dev can gauge the usefulness of the module to the rest of the world and thus see if integrating it into the language is called for. It was also said that a PEP should be written (which Michael's first email practically is). There was a discussion of what a constructor method should be named that finally came down to Guido saying fixedpoint, fixed, or fixpoint all are acceptable. If you want to get into the nitty-gritty of this discussion read Michael's and Tim's emails from the threads. You can also look at http://www2.hursley.ibm.com/decimal/ for another way to do fixed-point. .. _patch #653938I: http://www.python.org/sf/653938I .. _FixedPoint: http://fixedpoint.sf.net/ ========================================= `New Import Hooks PEP, a first draft`__ ========================================= __ http://mail.python.org/pipermail/python-dev/2002-December/031322.html Related threads: - `Zip Import and sys.path manipulation `__ - `Zipping Zope3 `__ - `Write All New Import Hooks (PEP 302) in Python, Not C `__ - `Packages and __path__ `__ - `PEP 203 and __path__ `__ So the thread that I am supposedly summarizing here is not the first thread in terms of chronology, but it has the best title, so I am using that as the thread to be summarized. Once again, the new import hooks mechanism was a big topic of discussion. For the latter half of this month most of the discussions were around `PEP 302`_ and its rough draft. It was accepted and was merged into the CVS tree in time for `Python 2.3a1`_.
Since the implementation has already been accepted I will not summarize it or the objections to it here, since the PEP does an admirable job of covering all the bases. The other big discussion that was brought up was whether ``__path__`` should be removed, or its modification at least discouraged. Just van Rossum pushed for the removal since it would simplify the import hooks and he didn't see the use. But Guido said that Zope used it and that it can be useful, so it is staying. This discussion is what caused the creation of the pkgutil_ module. .. _PEP 302: http://www.python.org/peps/pep-0302.html .. _pkgutil: http://www.python.org/doc/2.3a1/lib/module-pkgutil.html ======================================== `known obvious thing or bug (rexec)?`__ ======================================== __ http://mail.python.org/pipermail/python-dev/2002-December/031160.html A question about something in ``rexec_`` ended up leading to a discussion over the value of rexec since it is not close to being truly secure. The module is still in the library but I would not expect it to be in there forever; Py3k will most likely be its undoing unless someone gets the jump now and rewrites the module to actually make it do its job well. .. _rexec: http://www.python.org/doc/current/lib/module-rexec.html ===================== `deprecating APIs`__ ===================== __ http://mail.python.org/pipermail/python-dev/2002-December/031255.html Neal Norwitz came up with a way to deprecate APIs by having them emit warnings during compile-time (at least for gcc). It can be found in `pyport.h`_ and the macro is called ``Py_DEPRECATED()``. ..
_pyport.h: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Include/pyport.h =============================== `Mersenne Twister copyright`__ =============================== __ http://mail.python.org/pipermail/python-dev/2002-December/031403.html Splinter Threads: - `Mesenne Twister copyright notice `__ - `Third-Party Licenses `__ Raymond Hettinger asked how he should handle the copyright requirement for the Mersenne Twister (now a part of the ``random_`` module) that the authors be given credit in the documentation. Various ideas were thrown around from adding a ``__copyright__`` value to any module requiring it and having that contain the required notice to having a text file for each file that needed it. The last thing on this thread was Raymond suggesting a license directory that contained the required notices (this has not been done yet, though). .. _random: http://www.python.org/dev/doc/devel/lib/module-random.html ============================================== `Extension modules, Threading, and the GIL`__ ============================================== __ http://mail.python.org/pipermail/python-dev/2002-December/031424.html David Abrahams brought up a question about the GIL and extension modules calling Python and not knowing it's Python and thus dealing with the GIL (the "summary" David gives is four paragraphs). This whole thread has still yet to be worked out, but if threading in extension modules interests you, have a read (time constraints prevent me from doing a thorough summary of this thread since it is so complicated). ================ `GC at exit?`__ ================ __ http://mail.python.org/pipermail/python-dev/2002-December/031429.html Aahz pointed out that a cycle of objects will not have their respective ``__del__()`` methods called unless you break the cycle or call ``gc.collect()``. But as Martin v.
Löwis said, "If you need to guarantee that __del__ is called at the end for all objects, you have probably much bigger problems". So just watch out for those cycles. =================================================== `PEP 303: Extend divmod() for Multiple Divisors`__ =================================================== __ http://mail.python.org/pipermail/python-dev/2002-December/031511.html `PEP 303`_ by Thomas Bellman proposes changing ``divmod()`` so as to allow it to take an arbitrary number of arguments to chain together a bunch of ``divmod()`` calls. Guido has said that he does not like the change because it causes the function to act in a way that is not necessary and since it is a built-in that goes against keeping the language simple. This thread has started a big discussion on what built-ins are needed, but that started after January 1 and thus will be covered in the next summary. .. _PEP 303: http://www.python.org/peps/pep-0303.html From skip@pobox.com Fri Jan 3 06:33:07 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 3 Jan 2003 00:33:07 -0600 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com> References: <20030103021301.24528.81186.Mailman@mail.python.org> <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com> Message-ID: <15893.11939.64080.784736@montanaro.dyndns.org> Bill> This hurt when I had to figure it out and the essence of perl seemed Bill> rife within: Bill> [(x,y) for x in range(0,3) for y in range(0,3)] I find it helps if you indent them:

    [(x,y)
     for x in range(0,3)
     for y in range(0,3)]

or something similar. Bill> [(x,y) for x in range(0,4) if x is not 2 for y in range(0,4) if y is not 2] Bill> Yuck, yuck, yuck! And it isn't just me! Well, yeah.
So don't write list comprehensions that are that complex, or at least not without a little indentation, and not until you see them in your dreams:

    [(x,y)
     for x in range(0,4)
     if x is not 2
     for y in range(0,4)
     if y is not 2]

It does seem a bit Perlish with all the trailing conditions when they get that complex. I'm pretty sure I've never written a listcomp with more than a single for and a single if. My list elements (the (x,y) part in your examples) tend to get a bit more complicated than your examples though, involving function calls or arithmetic expressions. Bill> Comprehensions and the asterisk notation have definitely taken Bill> away from the elegant simplicity of the language. Both very Bill> powerful constructs that I have grown relatively comfortable with. By "asterisk notation" I assume you mean the alternate to apply(). Note that apply() has been deprecated, so I suspect you are in the minority (at least within the python-dev community). Most people seem to prefer *-notation to apply(). I know I do. Bill> There is a lot of power in the core set of functional API found in Bill> the builtins. Probably the most powerful feature of things like Bill> map/filter/apply/reduce are that they are easily approached and Bill> consumed by the Python newcomer, thereby making the language more Bill> attractive by making it easier to do powerful things without Bill> learning esoteric/alien syntax. One thing I think people on this list have to keep in mind is that we are not average programmers as a whole (I'm fairly certain I drag down the class average a bit...), so what might seem second nature to us may well not make any sense to a programmer whose entire pre-Python experience was with Excel and VBA. I tend to think the functional stuff is harder to approach as a newcomer. It certainly was harder for me than many other topics when I took a programming languages course in college that dabbled in Lisp (and SNOBOL, and APL, and Algol).
In any case, none of *-notation, list comprehensions, or functional builtins have to be taught to rank beginners. They are all like espresso, best consumed in fairly small quantities. Bill> (The other construct that bent my brain for a bit were generators Bill> -- only because of what they did, not because of the pythonic Bill> implementation. Very cool. I like generators.) Which I still have next to no experience with other than adding the occasional yield to Tim's spambayes tokenizer. I've yet to write one because I needed or wanted it. I guess it's to each his own. Skip From tim@zope.com Fri Jan 3 07:49:24 2003 From: tim@zope.com (Tim Peters) Date: Fri, 3 Jan 2003 02:49:24 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <20030103030840.GB29127@panix.com> Message-ID: [Aahz] > So there's ambiguity. So what? That was my question in the original msg, and hasn't changed: there are problems at two points, they're inherent to the current design (so can't be wished away), but the current implementation could change what it does in these cases -- what do people want to see done? > What I don't understand is why it's a problem. It depends on the app; like any other "surprise", depending on the app it may never be noticed, or may cost lives, or anything in between. Both you and Marc-Andre have been disturbed enough at the *possibilities* for bad consequences to claim that such cases should raise exceptions (although you seem to spend more energy arguing that there's no problem at all ). Guido made a case on the other side, that once-a-year exceptions unlikely to be caught by testing carry a different kind of real risk. > More precisely, I see these problems existing in the absence of > computers, and I don't see where creating a Python DateTime class > creates any more problems or makes the existing problems worse -- I don't think any of that matters: an API needs to define what it does.
> as long as you don't try to convert between timezones in the absence > of sufficient information. The current design doesn't allow the possibility for sufficient information in these cases. One plausible response to that is to insist that the current design is fatally flawed. Another is to shrug "so it goes", and define what it does do, as best as can be done. >> ... at any moment, you can call Guido, and ... > But that's not a conversion. ? It's clearly a way to convert your local time to Guido's local time. The only real difference is that when you call Guido at 6:30 UTC on the day daylight time ends for him, and he replies "oh, it's 1:30 here", you have the further possibility to ask him whether he meant 1:30 EDT or 1:30 EST. Apart from that, dt.astimezone(Eastern) will give you answers identical to his every time you try it (assuming you start with dt.tzinfo=Pacific (from US.py), that you still live in the Pacific time zone, and that you and Guido both scrupulously adjust your clocks at the politically defined DST transition times). > ... > From my POV, this problem exists regardless of whether a computer > mediates the transaction. Of course it does. but that doesn't excuse Python from defining its own behavior in these cases. Apart from that, the consequences of ambiguity in a program are often called "bugs", but the consequences of ambiguity in real life are merely called "real life" <0.9 wink>. > The most likely error (if one happens) is that someone shows up an > hour early for the appointment, and presumably that person knows > that the day is a DST transition. This sounds like you don't want it to raise an exception, then. From python@rcn.com Fri Jan 3 09:57:52 2003 From: python@rcn.com (Raymond Hettinger) Date: Fri, 3 Jan 2003 04:57:52 -0500 Subject: [Python-Dev] Holes in time References: Message-ID: <00e001c2b30e$95152460$125ffea9@oemcomputer> Wow, this whole subject is still astonishing me. What an education. 
Two months ago, I would have thought that times were only complicated by UTC plus or minus a timezone and the international dateline. Also, I would have thought that dates were only complicated by 400 year cycles and the Julian/Gregorian/Mayan thing. Being a private pilot, ex-military, and having written perpetual calendar programs did not even suggest the complexity or depth of this subject. But, that was before watching the Timbot get ensnarled in 5600 lines of code development, testing, research, and documentation to create a little order out of chaos. Raymond From oren-py-d@hishome.net Fri Jan 3 10:29:45 2003 From: oren-py-d@hishome.net (Oren Tirosh) Date: Fri, 3 Jan 2003 05:29:45 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <15892.57979.261663.613025@montanaro.dyndns.org> References: <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> Message-ID: <20030103102945.GA3630@hishome.net> On Thu, Jan 02, 2003 at 07:08:11PM -0600, Skip Montanaro wrote: > > It just occurred to me one reason I'm not enamored of lambdas is that the > keyword "lambda" has no mnemonic meaning at all the way for example that > "def" is suggestive of "define". I suspect most programmers have never > studied the Lambda Calculus, And I suspect most programmers have never heard of George Boole. The concepts of boolean and lambda and their strange names are both equally alien to newbies. Some people happen to know one but not the other based on their programming background, that's all. I can be very particular about terminology. 
I have been known to spend half an hour with a thesaurus trying to find a term for a concept that is suggestive of its semantics yet not overloaded with other meanings that may lead to wrong assumptions. But when that fails, I sometimes choose a name that is intentionally meaningless. I think the name 'lambda' serves that purpose quite well. I don't think any mnemonic name could possibly convey what lambda really means. Cute line noise like * and ** calls is no better. As Bill Bumgarner pointed out, a name at least gives you something you can look up in the documentation. I won't mind too much if the map, filter, reduce and apply builtins are gone in some future version. You can always import them from the __past__ for compatibility. But let the lambda stay. Oren From mwh@python.net Fri Jan 3 11:05:59 2003 From: mwh@python.net (Michael Hudson) Date: 03 Jan 2003 11:05:59 +0000 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Skip Montanaro's message of "Thu, 2 Jan 2003 20:19:37 -0600" References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> <15892.62265.348753.317968@montanaro.dyndns.org> Message-ID: <2mof6ywl0o.fsf@starship.python.net> Skip Montanaro writes: > >> Is there a reason def couldn't have been reused in this context?
> > Guido> You couldn't reuse def, because lambda can start an expression > Guido> which can occur at the start of a line, so a line starting with > Guido> def would be ambiguous (Python's parser is intentionally > Guido> simple-minded and doesn't like having to look ahead more than one > Guido> token). > > I'll leave that gauntlet thrown, since I have no interest in rewriting > Python's parser. Maybe it will spark John Aycock's interest though. ;-) Did you miss the "intentionally"? Cheers, M. -- at any rate, I'm satisfied that not only do they know which end of the pointy thing to hold, but where to poke it for maximum effect. -- Eric The Read, asr, on google.com From mal@lemburg.com Fri Jan 3 11:21:26 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 03 Jan 2003 12:21:26 +0100 Subject: [Python-Dev] Holes in time In-Reply-To: <200301030412.h034Cw207328@pcp02138704pcs.reston01.va.comcast.net> References: <200301030412.h034Cw207328@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E157236.4000004@lemburg.com> Guido van Rossum wrote: > Let me present the issue differently. > > On Sunday, Oct 27, 2002, both 5:30 UTC and 6:30 UTC map to 1:30 am US > Eastern time: 5:30 UTC maps to 1:30 am EDT, and 6:30 UTC maps to 1:30 > am EST. (UTC uses a 24 hour clock.) > > We have a tzinfo subclass representing the US Eastern (hybrid) > timezone whose primary responsibility is to translate from local time > in that timezone to UTC: Eastern.utcoffset(dt). It can also tell us > how much of that offset is due to DST: Eastern.dst(dt). > > It is crucial to understand that with "Eastern" as tzinfo, there is > only *one* value for 1:30 am on Oct 27, 2002. The Eastern tzinfo > object arbitrarily decides this is EDT, so it maps to 5:30 UTC. (But > the problem would still exist if it arbitrarily decided that it was > EST.) > > It is also crucial to understand that we have no direct way to > translate UTC to Eastern. 
We only have a direct way to translate > Eastern to UTC (by subtracting Eastern.utcoffset(dt)). Why don't you take a look at how this is done in mxDateTime ? It has support for the C lib API timegm() (present in many C libs) and includes a work-around which works for most cases; even close to the DST switch time. BTW, you should also watch out for broken mktime() implementations and whether the C lib support leap seconds or not. That has bitten me a few times too. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tismer@tismer.com Fri Jan 3 10:04:03 2003 From: tismer@tismer.com (Christian Tismer) Date: Fri, 03 Jan 2003 11:04:03 +0100 Subject: [Python-Dev] map, filter, reduce, lambda References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <200301030123.h031N0a06472@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E156013.5010002@tismer.com> Guido van Rossum wrote: >>I think this is the major point. One of Python's >>strengths is that declarations are seldom needed, >>almost all objects can be created in-place. >>Not so with defs: They enforce a declaration before >>use, while lambda denotes a functional value. > > > You think of a def as a declaration. I don't: to me, it's just an > assignment, no more obtrusive than using a variable to hold a > subexpression that's too long to comfortably fit on a line. I think this is only half of the truth. Sure, a def is an expression. 
A class is an expression as well, since everything in a Python source file gets executed, with some parts producing immediate output and other parts creating functions, classes and methods. On the other hand, every analysing tool for Python treats classes and functions as declarations, and this is also how people usually think of it. The fact that execution of code that contains a "declaration" results in creating that object, is a somehow elegant implementation detail but doesn't change the fact that people are declaring a function; they are not assigning an expression result to a name. Your argument about using a variable to hold a subexpression that doesn't fit a line does not compare well, because def doesn't give you a chance to write a function body inline. You have to have a name, proper indentation, all of that. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer@tismer.com Fri Jan 3 10:19:33 2003 From: tismer@tismer.com (Christian Tismer) Date: Fri, 03 Jan 2003 11:19:33 +0100 Subject: [Python-Dev] map, filter, reduce, lambda References: <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com> Message-ID: <3E1563B5.5060506@tismer.com> Bill Bumgarner wrote: > On Thursday, Jan 2, 2003, at 21:13 US/Eastern, > python-dev-request@python.org wrote: > >> map, reduce, filter, apply: They are just abstractions >> which take a function and let them work on arguments >> in certain ways. They are almost obsolete now >> since they can be replaced by powerful constructs >> like comprehensions and the very cute asterisk calls. 
> > > Short answer: -1 for removing map/reduce/filter/apply unless doing so > will vastly improve the Python core (i.e. less size, greater speed, > smaller footprint -- some combination in a usefully large fashion). > > Personally, I find comprehensions and very cute asterisk calls to be > incredibly unreadable. I think it is a little too late to recognize this *now*. Also I agree that comprehensions are not my favorite construct, since I personally like the functional approach. Where I absolutely cannot follow you is why you dislike the asterisk notation so much? I see this as one of the most elegant addition to Python of the last years, since it creates a symmetric treatment of argument definition and argument passing. Anyway, this is not the place to discuss personal taste. My message only tried to spell that map, filter and reduce can be easily emulated, while lambda cannot. It is a unique feature. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From mwh@python.net Fri Jan 3 11:31:01 2003 From: mwh@python.net (Michael Hudson) Date: 03 Jan 2003 11:31:01 +0000 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Bill Bumgarner's message of "Thu, 2 Jan 2003 23:03:07 -0500" References: <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com> Message-ID: <2mlm22wjuy.fsf@starship.python.net> Bill Bumgarner writes: > Personally, I find comprehensions and very cute asterisk calls to be > incredibly unreadable. OTOH the most incomprehensible Python code I have *ever seen* involved sustained map & filter abuse. I think the trick with list comps (as Skip basically said) is not to go overboard with them. 
Moderation in all things! The same is true of making code using map & filter readable, of course. [...] > (The other construct that bent my brain for a bit were generators-- > only because of what they did, not because of the pythonic > implementation. Very cool. I like generators.) Generators are wonderful. Cheers, M. -- Imagine if every Thursday your shoes exploded if you tied them the usual way. This happens to us all the time with computers, and nobody thinks of complaining. -- Jeff Raskin From mal@lemburg.com Fri Jan 3 11:32:40 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 03 Jan 2003 12:32:40 +0100 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301030118.h031I9l06355@pcp02138704pcs.reston01.va.comcast.net> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14C746.60709@lemburg.com> <200301030118.h031I9l06355@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E1574D8.9060508@lemburg.com> Guido van Rossum wrote: >>>BTW, I don't think it's crippled any more -- we now have nested >>>scopes, remember. >> >>... and they were added just for this reason. > > Not true. Nested scopes were added because people expected them in > lots of situations, not just for lambdas. I wouldn't add *any feature* > to the language just to make lambdas better -- but my grudge isn't big > enough to veto a feature just because it makes lambda better too. :-) At the time we were discussing nested scopes, the lambda argument was the initial one. >>As for the other APIs in the subject line: I really don't >>understand what this discussion is all about. map(), filter(), >>reduce() have all proven their usefulness in the past. 
> > > Name one useful use for reduce(). Here are three: average = reduce(operator.add, values, 0.0) / len(values) hash = reduce(operator.xor, values, 0) passestests = reduce(operator.and_, map(condition, values), 1) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Fri Jan 3 11:35:48 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 03 Jan 2003 12:35:48 +0100 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <15892.52037.298884.948991@slothrop.zope.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14C746.60709@lemburg.com> <15892.52037.298884.948991@slothrop.zope.com> Message-ID: <3E157594.2060000@lemburg.com> Jeremy Hylton wrote: >>>>>>"MAL" == mal writes: > > >> BTW, I don't think it's crippled any more -- we now have nested > >> scopes, remember. > > MAL> ... and they were added just for this reason. > > One reason among many, yes. > > I've certainly written a lot of code in the last year or two that > exploits nested scopes using nested functions. It's used in a lot of > little corners of ZODB, like the test suite. > > MAL> I was never a fan > MAL> of lambda -- a simple def is just as readable > > There seems to be a clear line here. Some folks think a def is just as > readable, some folks think it is unnecessarily tedious to add a bunch > of statements and invent a name.
> > MAL> As for the other APIs in the subject line: I really don't > MAL> understand what this discussion is all about. map(), filter(), > MAL> reduce() have all proven their usefulness in the past. > > Useful enough to be builtins? They could just as easily live in a > module. They could even be implemented in Python if performance > weren't a big issue. I think that if they were added to Python now, they'd probably better be off in a separate "functional" module; perhaps we should move them there and only keep the reference in the builtins to them ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From ben@algroup.co.uk Fri Jan 3 11:47:19 2003 From: ben@algroup.co.uk (Ben Laurie) Date: Fri, 03 Jan 2003 11:47:19 +0000 Subject: [Python-Dev] map, filter, reduce, lambda References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> <15892.62265.348753.317968@montanaro.dyndns.org> Message-ID: <3E157847.6040500@algroup.co.uk> Skip Montanaro wrote: > Yes we do, we define a function, we just don't associate it with a name. In > theory the following two function definitions could be equivalent: > > def f(a,b): > return a+b > > f = def(a,b): a+b > > and except for that pesky two-token lookahead would be possible. Huh? Where's two token lookahead required? The token after "def" is either a word, or its "(". 
Doesn't look very ambiguous to me! Doesn't even require one-token lookahead, does it? Cheers, Ben. -- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff From mwh@python.net Fri Jan 3 11:51:56 2003 From: mwh@python.net (Michael Hudson) Date: 03 Jan 2003 11:51:56 +0000 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules _randommodule.c,1.1,1.2 In-Reply-To: Jason Tishler's message of "Thu, 02 Jan 2003 16:40:21 -0500" References: <20030102214021.GC1996@tishler.net> Message-ID: <2misx6wiw3.fsf@starship.python.net> Jason Tishler writes: > Now if SF could search for patches by the submitter, my job would be a > little easier... Ram the output of cvs log through Tools/scripts/logmerge.py and search through that (an option for fast network connections only...). Cheers, M. -- (Unfortunately, while you get Tom Baker saying "then we were attacked by monsters", he doesn't flash and make "neeeeooww-sploot" noises.) -- Gareth Marlow, ucam.chat, from Owen Dunn's review of the year From tismer@tismer.com Fri Jan 3 11:47:02 2003 From: tismer@tismer.com (Christian Tismer) Date: Fri, 03 Jan 2003 12:47:02 +0100 Subject: [Python-Dev] map, filter, reduce, lambda References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14C746.60709@lemburg.com> <15892.52037.298884.948991@slothrop.zope.com> <3E157594.2060000@lemburg.com> Message-ID: <3E157836.6020904@tismer.com> M.-A. Lemburg wrote: ... 
> I think that if they were added to Python now, they'd probably > better be off in a separate "functional" module; perhaps we should > move them there and only keep the reference in the builtins to > them ?! That's what I'm proposing. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From bbum@codefab.com Fri Jan 3 14:05:02 2003 From: bbum@codefab.com (Bill Bumgarner) Date: Fri, 3 Jan 2003 09:05:02 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Message-ID: <5A3A8B72-1F24-11D7-A03A-000393877AE4@codefab.com> On Friday, Jan 3, 2003, at 00:42 US/Eastern, Gisle Aas wrote: >> 'anon' sounds like a great name -- unlikely to be used, shorter than >> 'lambda', and a heck of a lot more indicative as to what is going on. > 'anon' does not sound that great to me. Anon what? There is lots of > anonymous stuff. Arc is going for 'fn'. I would vote for 'sub' :) Either makes more sense than 'lambda'. I prefer 'anon' because it is a very common abbreviation for 'anonymous' and because it would have reduced the scarring during the learning of lambda calculus in CS so many years ago. 'fn' and 'sub' don't seem to be much differentiated from 'def'. 'fun' would be better than 'fn', though, because that's what lambda functions are...
b.bum From jeremy@alum.mit.edu Fri Jan 3 14:36:04 2003 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 3 Jan 2003 09:36:04 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <3E1574D8.9060508@lemburg.com> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14C746.60709@lemburg.com> <200301030118.h031I9l06355@pcp02138704pcs.reston01.va.comcast.net> <3E1574D8.9060508@lemburg.com> Message-ID: <15893.40916.251179.642643@slothrop.zope.com> >>>>> "MAL" == mal writes: MAL> Guido van Rossum wrote: >>>> BTW, I don't think it's crippled any more -- we now have nested >>>> scopes, remember. >>> >>> ... and they were added just for this reason. >> >> Not true. Nested scopes were added because people expected them >> in lots of situations, not just for lambdas. I wouldn't add *any >> feature* to the language just to make lambdas better -- but my >> grudge isn't big enough to veto a feature just because it makes >> lambda better too. :-) MAL> At the time we were discussing nested scopes, the lambda MAL> argument was the initial one. No. My desire for nested scopes had nothing to do with lambda in particular, although nested scopes in general made lambda much more pleasant. 
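For readers following the nested-scopes point, here is a minimal sketch of why nested scopes make lambda "more pleasant". It is not taken from the thread: the names make_adder_old and make_adder_new are invented for illustration. Without nested scopes, a lambda could only reach a local of the enclosing function through the default-argument idiom; with them, it can refer to that local directly.

```python
def make_adder_old(n):
    # Idiom needed without nested scopes: pass n in as a default argument,
    # because the lambda body cannot see the enclosing function's locals.
    return lambda x, n=n: x + n


def make_adder_new(n):
    # With nested scopes, the lambda body can refer to the enclosing n directly.
    return lambda x: x + n


add3_old = make_adder_old(3)
add3_new = make_adder_new(3)

print(add3_old(4))
print(add3_new(4))
```

Both callables behave the same for a single argument; the difference is only that the older form exposes an extra, overridable n parameter.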
Jeremy From jeremy@alum.mit.edu Fri Jan 3 14:51:04 2003 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 3 Jan 2003 09:51:04 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <15892.64846.212924.977128@gargle.gargle.HOWL> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <15892.64846.212924.977128@gargle.gargle.HOWL> Message-ID: <15893.41816.963078.165603@slothrop.zope.com> >>>>> "BAW" == Barry A Warsaw writes: BAW> For Py3K, I might suggest "anon" instead of lambda, especially BAW> if the construct were expanded to allow statements. The problem with anon is that the name doesn't suggest a function. Jeremy From nas@python.ca Fri Jan 3 15:04:54 2003 From: nas@python.ca (Neil Schemenauer) Date: Fri, 3 Jan 2003 07:04:54 -0800 Subject: [Python-Dev] Holes in time In-Reply-To: <00e001c2b30e$95152460$125ffea9@oemcomputer> References: <00e001c2b30e$95152460$125ffea9@oemcomputer> Message-ID: <20030103150454.GA9495@glacier.arctrix.com> Raymond Hettinger wrote: > Wow, this whole subject is still astonishing me. What an education. Here's a nice paper on the history of time http://www.naggum.no/lugm-time.html . Someone once said that time was invented by the gods to confuse man (roughly quoted). 
Neil From nas@python.ca Fri Jan 3 15:14:43 2003 From: nas@python.ca (Neil Schemenauer) Date: Fri, 3 Jan 2003 07:14:43 -0800 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules _randommodule.c,1.1,1.2 In-Reply-To: <2misx6wiw3.fsf@starship.python.net> References: <20030102214021.GC1996@tishler.net> <2misx6wiw3.fsf@starship.python.net> Message-ID: <20030103151443.GB9495@glacier.arctrix.com> Michael Hudson wrote: > Jason Tishler writes: > > > Now if SF could search for patches by the submitter, my job would be a > > little easier... > > Ram the output of cvs log through Tools/scripts/logmerge.py and search > through that (an option for fast network connections only...). I use cvschanges.¹ After looking at logmerge.py I see I basically reimplemented it (didn't know it existed). Still, cvschanges is a little easier to use and also allows a username to be specified. Neil ¹ http://arctrix.com/nas/python/cvschanges From ark@research.att.com Fri Jan 3 15:34:31 2003 From: ark@research.att.com (Andrew Koenig) Date: 03 Jan 2003 10:34:31 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com> References: <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com> Message-ID: Bill> This hurt when I had to figure it out and the essence of perl seemed Bill> rife within: Bill> [(x,y) for x in range(0,3) for y in range(0,3)] Reminds me more of Fortran, except that the loop indices nest in the opposite direction...
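As a small illustration of how the quoted comprehension nests (a sketch added for clarity, not part of the original message; the variable names pairs and expected are made up), the first "for" clause acts as the outermost loop and the last as the innermost:

```python
pairs = [(x, y) for x in range(0, 3) for y in range(0, 3)]

# Equivalent explicit loops: the clauses nest left to right,
# so x is the outer loop and y the inner one.
expected = []
for x in range(0, 3):
    for y in range(0, 3):
        expected.append((x, y))

assert pairs == expected
print(pairs[:4])  # [(0, 0), (0, 1), (0, 2), (1, 0)]
```

The second index varies fastest, which is presumably the contrast with Fortran's implied-DO lists being drawn here.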
-- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark From pinard@iro.umontreal.ca Fri Jan 3 15:26:11 2003 From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois_Pinard?=) Date: 03 Jan 2003 10:26:11 -0500 Subject: [Python-Dev] Re: map, filter, reduce, lambda In-Reply-To: <15892.57979.261663.613025@montanaro.dyndns.org> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > It just occurred to me one reason I'm not enamored of lambdas is that the > keyword "lambda" has no mnemonic meaning at all the way for example that > "def" is suggestive of "define". I suspect most programmers have never > studied the Lambda Calculus, and other than perhaps during a brief exposure > to Lisp will never have encountered the term at all. (Remember, Andrew & > Eric are hardly poster children for your run-of-the-mill programmers.) We might merely consider that `lambda' is not for children. :-). Most Lisp users have never studied Lambda Calculus either, and this does not create a problem to them for using it. And lambda expressions allow for statements, at least from a semantic viewpoint. So, if Python `lambda' were ever extended to include statements, I see nothing especially shocking with calling the result `lambda'. I'm not especially prone for `lambda' in Python, but if it stays anyway, why not just stick with `lambda'? What do we gain by changing the keyword? 
Some might question if it was judicious or not adding `lambda' to Python, but the questioning might be more related to the existence or specifications of anonymous functions than to the choice of the keyword, and changing the keyword will likely not get rid of the questioning. -- François Pinard http://www.iro.umontreal.ca/~pinard From bbum@codefab.com Fri Jan 3 14:25:39 2003 From: bbum@codefab.com (Bill Bumgarner) Date: Fri, 3 Jan 2003 09:25:39 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <15893.11939.64080.784736@montanaro.dyndns.org> Message-ID: <3BCD4C2A-1F27-11D7-A03A-000393877AE4@codefab.com> On Friday, Jan 3, 2003, at 01:33 US/Eastern, Skip Montanaro wrote: > Well, yeah. So don't write list comprehensions that are that complex, > or at > least not without a little indentation, and not until you see them in > your > dreams: This isn't completely about *my* coding style. One of the reasons why I was so attracted to Python back in, what, 1993 or 1994 or so was that it didn't matter *who* wrote the code, the syntax of the language and the design of the libraries was such that it just about anyone's code was comprehensible unless they actively worked to make it incomprehensible. Given that I had just spent 18 months doing very hard core OO perl with a relational backing store [Sybase], Python was like a bolt from the blue. Obfuscation was not a feature, the library code was readable, and I could immediately see that I would have a chance of being able to actually read my code six months after I had written it (and, yes, I succumbed to posting a 'python is superior to perl; i have seen the light' type message somewhere not long thereafter). List comprehensions and the */** notation are very powerful, but they also make the language less approachable to the newcomer to Python. At least, I think they do and most of the people I have run into think they do, as well. 
I'm not, for a moment, advocating that such features be deprecated or changed -- they are here and here to stay. At my experience level, I am comfortable with said constructs and use them heavily [though, I don't use */** as often as I probably should]. But this isn't just about me. Python is being used as a teaching tool and every experienced Python programmer was once a novice. The language itself is harder to learn now than it was 8 years ago. That is balanced by better documentation. It would be helpful to have documentation with a syntactic table of contents. I.e. something like the tutorial, but with sections with two-line titles -- the first being a title as found in the Tutorial [List Comprehensions] and a subtitle that is an actual example of a List Comprehension (don't know how you would do generators in such a fashion). I would have found this to be an incredibly useful learning tool when returning to python after a couple of years' hiatus. b.bum From bbum@codefab.com Fri Jan 3 14:45:36 2003 From: bbum@codefab.com (Bill Bumgarner) Date: Fri, 3 Jan 2003 09:45:36 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <3E1563B5.5060506@tismer.com> Message-ID: <0536D5FE-1F2A-11D7-A03A-000393877AE4@codefab.com> On Friday, Jan 3, 2003, at 05:19 US/Eastern, Christian Tismer wrote: > Where I absolutely cannot follow you is why > you dislike the asterisk notation so much? > I see this as one of the most elegant additions > to Python of the last years, since it creates > a symmetric treatment of argument definition > and argument passing. Elegance does not always mean easy to understand. What is the old adage: Something significantly advanced will seem like pure magic to the layman? I just don't want to see the future directions of Python lose the incredible strength of being such a straightforward, yet still very powerful, language. To the newcomer, map/reduce/filter/apply, list comprehensions, lambda, and */** are all relatively imponderable.
An * or ** in the arglist of a function call/definition is extremely easy to gloss over because the newcomer has no idea what they mean. Doing so, leaves the newcomer without a clue as to what is going on in the call/definition. At least with map/reduce/filter/apply, there is a WORD to latch onto and spoken/read (not computer) languages are generally all about words with punctuation taking a somewhat secondary role. Most people do not learn languages by studying the tutorials/documentation at great length-- the first exposure is often under circumstances where too little time is alloted to learning the language. map/filter/reduce may be less powerful and less elegant, but they are a heck of a lot easier to look up in the documentation. More and more commonly, that means going to google and typing in 'python filter documentation'. Try going to google and typing in 'python [] documentation' or 'python * documentation'.... b.bum From skip@pobox.com Fri Jan 3 14:58:20 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 3 Jan 2003 08:58:20 -0600 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <2mof6ywl0o.fsf@starship.python.net> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> <15892.62265.348753.317968@montanaro.dyndns.org> <2mof6ywl0o.fsf@starship.python.net> Message-ID: <15893.42252.365991.851873@montanaro.dyndns.org> Guido> (Python's parser is intentionally simple-minded ...). >> I'll leave that gauntlet thrown, since I have no interest in >> rewriting Python's parser. 
Maybe it will spark John Aycock's >> interest though. ;-) Michael> Did you miss the "intentionally"? No. I just assumed it meant Guido didn't personally want to burn lots of brain cells on parsing Python. I doubt he'd be as protective of your neurons. ;-) Skip From aahz@pythoncraft.com Fri Jan 3 15:47:55 2003 From: aahz@pythoncraft.com (Aahz) Date: Fri, 3 Jan 2003 10:47:55 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: References: <20030103030840.GB29127@panix.com> Message-ID: <20030103154755.GA14091@panix.com> On Fri, Jan 03, 2003, Tim Peters wrote: > [Aahz] >> >> So there's ambiguity. So what? > > That was my question in the orginal msg, and hasn't changed: there are > problems at two points, they're inherent to the current design (so > can't be wished away), but the current implementation could change > what it does in these cases -- what do people want to see done? Hmmm.... Maybe we don't have an argument. What *does* the current implementation do when it hits the witching hour of DST->ST? So far, I've been saying that it should return 1:MM twice when converting from UTC; if it already does that, I'm fine. >> What I don't understand is why it's a problem. > > It depends on the app; like any other "surprise", depending on the > app it may never be noticed, or may cost lives, or anything in > between. Both you and Marc-Andre have been disturbed enough at the > *possibilities* for bad consequences to claim that such cases should > raise exceptions (although you seem to spend more energy arguing that > there's no problem at all ). Guido made a case on the other > side, that once-a-year exceptions unlikely to be caught by testing > carry a different kind of real risk. The only case where I've advocated raising an exception was attempting to convert a pure wall clock time to any timezone-based time. (As in the case of Guido's calendar entries.) That would raise an exception for all times, not just DST change days. 
>> as long as you don't try to convert between timezones in the absence >> of sufficient information. > > The current design doesn't allow the possibility for sufficient > information in these cases. One plausible response to that is to > insist that the current design is fatally flawed. Another is to shrug > "so it goes", and define what it does do, as best as can be done. I'm starting to think that the current design is incomplete for tzinfo classes that model internal DST changes. If you say that the current design cannot be tweaked to return 1:MM twice at negative DST changes, then I'll respond to Guido's message asking for a better way with a proposed change to make that possible. >>> ... at any moment, you can call Guido, and ... >> >> But that's not a conversion. > > ? It's clearly a way to convert your local time to Guido's local time. > The only real difference is that when you call Guido at 6:30 UTC on > the day daylight time ends for him, and he replies "oh, it's 1:30 > here", you have the further possibility to ask him whether he meant > 1:30 EDT or 1:30 EST. Apart from that, dt.astimezone(Eastern) will > give you answers identical to his every time you try it (assuming you > start with dt.tzinfo=Pacific (from US.py), that you still live in the > Pacific time zone, and that you and Guido both scrupulously adjust > your clocks at the politically defined DST transition times). Okay, I got confused by your later paragraph about Guido living in UTC time. In this case, I'd say that you're doing a heuristic conversion rather than an algebraic conversion -- and that's precisely what I'm advocating. >> The most likely error (if one happens) is that someone shows up an >> hour early for the appointment, and presumably that person knows >> that the day is a DST transition. > > This sounds like you don't want it to raise an exception, then. Yup. 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "There are three kinds of lies: Lies, Damn Lies, and Statistics." --Disraeli From guido@python.org Fri Jan 3 15:28:56 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 10:28:56 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Fri, 03 Jan 2003 12:21:26 +0100." <3E157236.4000004@lemburg.com> References: <200301030412.h034Cw207328@pcp02138704pcs.reston01.va.comcast.net> <3E157236.4000004@lemburg.com> Message-ID: <200301031528.h03FSuK12872@odiug.zope.com> > Guido van Rossum wrote: > > Let me present the issue differently. > > > > On Sunday, Oct 27, 2002, both 5:30 UTC and 6:30 UTC map to 1:30 am US > > Eastern time: 5:30 UTC maps to 1:30 am EDT, and 6:30 UTC maps to 1:30 > > am EST. (UTC uses a 24 hour clock.) > > > > We have a tzinfo subclass representing the US Eastern (hybrid) > > timezone whose primary responsibility is to translate from local time > > in that timezone to UTC: Eastern.utcoffset(dt). It can also tell us > > how much of that offset is due to DST: Eastern.dst(dt). > > > > It is crucial to understand that with "Eastern" as tzinfo, there is > > only *one* value for 1:30 am on Oct 27, 2002. The Eastern tzinfo > > object arbitrarily decides this is EDT, so it maps to 5:30 UTC. (But > > the problem would still exist if it arbitrarily decided that it was > > EST.) > > > > It is also crucial to understand that we have no direct way to > > translate UTC to Eastern. We only have a direct way to translate > > Eastern to UTC (by subtracting Eastern.utcoffset(dt)). > > Why don't you take a look at how this is done in mxDateTime ? I looked at the code, but I couldn't find where it does conversion between arbitrary timezones -- almost all timezone-related code seems to have to do with parsing timezone names and specifications. 
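The Oct 27, 2002 example quoted above can be checked with a short sketch. This is an illustration only, under stated assumptions: it uses datetime.timezone from present-day Python (which did not exist in the module under discussion), and it models EDT and EST as two separate fixed-offset zones rather than the single hybrid Eastern tzinfo class the thread is about. The names EDT, EST, utc_a and utc_b are made up.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the two halves of US Eastern time (assumed names).
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

utc_a = datetime(2002, 10, 27, 5, 30, tzinfo=timezone.utc)
utc_b = datetime(2002, 10, 27, 6, 30, tzinfo=timezone.utc)

local_a = utc_a.astimezone(EDT)  # daylight time is still in effect
local_b = utc_b.astimezone(EST)  # standard time has resumed

# Drop the tzinfo to get the bare wall-clock readings.
wall_a = local_a.replace(tzinfo=None)
wall_b = local_b.replace(tzinfo=None)

print(wall_a, wall_b)
print(local_b - local_a)
```

The two wall-clock readings are identical even though the underlying instants are an hour apart, which is why a hybrid tzinfo that sees only the naive 1:30 am has to pick one interpretation arbitrarily.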
> It has support for the C lib API timegm() (present in many C libs) > and includes a work-around which works for most cases; even close > to the DST switch time. A goal of the new datetime module is to avoid all dependency on the C library's time facilities -- we must support calculations outside the range that the C library can deal with. > BTW, you should also watch out for broken mktime() implementations > and whether the C lib supports leap seconds or not. That has bitten > me a few times too. Ditto. --Guido van Rossum (home page: http://www.python.org/~guido/) From tino.lange@isg.de Fri Jan 3 15:30:07 2003 From: tino.lange@isg.de (Tino Lange) Date: Fri, 03 Jan 2003 16:30:07 +0100 Subject: [Python-Dev] [development doc updates] In-Reply-To: <20021231183847.B367218EC36@grendel.zope.com> References: <20021231183847.B367218EC36@grendel.zope.com> Message-ID: <3E15AC7F.1020302@isg.de> Fred L. Drake wrote: > The development version of the documentation has been updated: > http://www.python.org/dev/doc/devel/ > Possibly the last update before the Python 2.3 alpha release. Hi all! In the "improved standard modules" section of the "WhatsNew" & Co. texts, the SSL support in imaplib.py is missing (submitted March 2002 by Piers Lauder and myself). Python 2.3 will be the first Python release in which this change is officially included. Cheers, Tino From ark@research.att.com Fri Jan 3 15:26:12 2003 From: ark@research.att.com (Andrew Koenig) Date: 03 Jan 2003 10:26:12 -0500 Subject: [Python-Dev] binutils In-Reply-To: <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com> Message-ID: You may recall that a while ago I mentioned that gnu binutils 2.13 has bugs in how it handles dynamic linking that can prevent Python from building on Solaris machines, and that 2.13.1 does not correct those bugs completely. Well, I am happy to report that the recently released binutils 2.13.2 appears to have fixed those bugs, and I can now build Python again without having to patch binutils first.
-- Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark From barry@python.org Fri Jan 3 15:59:53 2003 From: barry@python.org (Barry A. Warsaw) Date: Fri, 3 Jan 2003 10:59:53 -0500 Subject: [Python-Dev] map, filter, reduce, lambda References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <15892.64846.212924.977128@gargle.gargle.HOWL> <15893.41816.963078.165603@slothrop.zope.com> Message-ID: <15893.45945.424660.8781@gargle.gargle.HOWL> >>>>> "JH" == Jeremy Hylton writes: >>>>> "BAW" == Barry A Warsaw writes: BAW> For Py3K, I might suggest "anon" instead of lambda, BAW> especially if the construct were expanded to allow BAW> statements. JH> The problem with anon is that the name doesn't suggest a JH> function. And "def" does? . Maybe we should call it "abc" since creating a function comes before defining the name for the function. It's also a nod to Python's historical roots. :) anondef-ly y'rs, -Barry From tim@zope.com Fri Jan 3 15:59:47 2003 From: tim@zope.com (Tim Peters) Date: Fri, 3 Jan 2003 10:59:47 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <200301030412.h034Cw207328@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido, on astimezone() assumptions] Most of these assumptions aren't needed by the current implementation. The crucial assumption (used in every step) is that tz.utcoffset(d) - tz.dst(d) is invariant across all d with d.tzinfo == tz. Apart from that, only one other assumption is made, at the end: > ... > It can assume that the DST correction is >= 0 and probably less than > 10 hours or so, Those aren't needed (or used). 
> and that DST changes don't occur more frequently than twice a year
> (once on and once off),

Ditto -- although you'll grow another unspellable hour each time DST
ends.

> and that the DST correction is constant during the DST period,

A weaker form of that is needed at the end. I didn't get around to
writing the end of the proof yet. At that point, we've got a guess z'.
The missing part of the proof is that z' is UTC-equivalent to the input
datetime if and only if

    (z' + z'.dst()).dst() == z'.dst()

Intuitively, and because we know that z'.dst() != 0 at this point in
the algorithm, it's saying the result is correct iff "moving a little
farther into DST still leaves us in DST". For a class like Eastern, it
fails to hold iff we start with the unspellable hour at the end of
daylight time: 6:MM UTC maps to z' == 1:MM Eastern, which appears to be
daylight time. z'.dst() returns 60 minutes then, but

    (z' + z'.dst()).dst() ==
    (1:MM Eastern + 1 hour).dst() ==
    (2:MM Eastern).dst() ==
    0 != 60

then. Another way this *could* hypothetically fail is if .dst()
returned different non-zero values at different times. But the
assumption is only needed in that specific expression, so it should be
OK if distinct daylight periods have distinct dst() offsets (although
maintaining the utcoffset() - dst() is-a-constant invariant appears
unlikely then).

> and that the only variation in UTC offset is due to DST.

That's not used either -- although, again, the assumption that "the
standard offset" (utcoffset-dst) is a constant is hard to maintain in a
wacky time zone.

From Jack.Jansen@cwi.nl Fri Jan 3 16:05:10 2003
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Fri, 3 Jan 2003 17:05:10 +0100
Subject: [Python-Dev] The meaning of __all__
Message-ID: <22D7F728-1F35-11D7-8A35-0030655234CE@cwi.nl>

There's a discussion over on the pyobjc developers mailing list at the
moment about the use of __all__.
Some people suggested that for a certain module we only put "often used" items in __all__, so that "from module import *" will import only these commonly used items into your namespace. The module would contain other items which *are* useful and which should be callable from the outside, but these would have to be explicitly imported into your namespace (or used via "import objc; objc.SetVerbose()". This doesn't feel right to me, as I've always used __all__ to mean "the whole externally visible API", and I've considered anything else that the module exports basically an artefact of the implementation. But when I looked this up in the reference manual it is rather vague about this. Any opinions on the matter? -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From gward@python.net Fri Jan 3 16:09:37 2003 From: gward@python.net (Greg Ward) Date: Fri, 3 Jan 2003 11:09:37 -0500 Subject: [Python-Dev] GC at exit? In-Reply-To: <20030102135544.GA2891@panix.com> References: <20021230034551.GA18622@panix.com> <20021231005736.GA15066@panix.com> <20030102080358.GC2287@malva.ua> <20030102135544.GA2891@panix.com> Message-ID: <20030103160937.GA1761@cthulhu.gerg.ca> On 02 January 2003, Aahz said: > Because garbage cycles can point at non-garbage; when the garbage is > reclaimed, __del__() methods will run. You could argue that this is > another reason against using __del__(), but since this is part of the > way CPython works, I'm documenting it in my book. As I recall (and this knowledge dates back to 1997 or so, so could well be obsolete), Perl does a full GC run at process exit time. In normal contexts this is irrelevant; I believe the justification was to clean up resources used by an embedded interpreter. Greg -- Greg Ward http://www.gerg.ca/ From barry@python.org Fri Jan 3 16:14:33 2003 From: barry@python.org (Barry A. 
Warsaw) Date: Fri, 3 Jan 2003 11:14:33 -0500 Subject: [Python-Dev] The meaning of __all__ References: <22D7F728-1F35-11D7-8A35-0030655234CE@cwi.nl> Message-ID: <15893.46825.190020.779210@gargle.gargle.HOWL> >>>>> "JJ" == Jack Jansen writes: JJ> Some people suggested that for a certain module we only put JJ> "often used" items in __all__, so that "from module import *" JJ> will import only these commonly used items into your JJ> namespace. I hope __all__ isn't just for from-import-* support. If so, maybe we should get rid of most of them in the ongoing quest to actively discourage from-import-* (or better yet, set them to the empty list). -Barry From skip@pobox.com Fri Jan 3 16:15:30 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 3 Jan 2003 10:15:30 -0600 Subject: [Python-Dev] binutils In-Reply-To: References: <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com> Message-ID: <15893.46882.572350.837237@montanaro.dyndns.org> Andrew> You may recall that a while ago I mentioned that gnu binutils Andrew> 2.13 has bugs in how it handles dynamic linking that can prevent Andrew> Python from building on Solaris machines, and that 2.13.1 does Andrew> not correct those bugs completely. Can you narrow down the Solaris version for me? I'll update the warning in the README file of the distribution. Skip From pyth@devel.trillke.net Fri Jan 3 16:17:43 2003 From: pyth@devel.trillke.net (holger krekel) Date: Fri, 3 Jan 2003 17:17:43 +0100 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <0536D5FE-1F2A-11D7-A03A-000393877AE4@codefab.com>; from bbum@codefab.com on Fri, Jan 03, 2003 at 09:45:36AM -0500 References: <3E1563B5.5060506@tismer.com> <0536D5FE-1F2A-11D7-A03A-000393877AE4@codefab.com> Message-ID: <20030103171743.L2841@prim.han.de> Bill Bumgarner wrote: > On Friday, Jan 3, 2003, at 05:19 US/Eastern, Christian Tismer wrote: > > Where I absolutely cannot follow you is why > > you dislike the asterisk notation so much? 
> > I see this as one of the most elegant additions
> > to Python of the last years, since it creates
> > a symmetric treatment of argument definition
> > and argument passing.
>
> Elegance does not always mean easy to understand. What is the old
> adage: Something significantly advanced will seem like pure magic to
> the layman?
>
> I just don't want to see the future directions of Python lose the
> incredible strength of being such a straightforward, yet still very
> powerful, language.
>
> To the newcomer, map/reduce/filter/apply, list comprehensions, lambda,
> and */** are all relatively imponderable. An * or ** in the arglist of
> a function call/definition is extremely easy to gloss over because the
> newcomer has no idea what they mean. Doing so leaves the newcomer
> without a clue as to what is going on in the call/definition. At
> least with map/reduce/filter/apply, there is a WORD to latch onto and
> spoken/read (not computer) languages are generally all about words with
> punctuation taking a somewhat secondary role.
>
> Most people do not learn languages by studying the
> tutorials/documentation at great length -- the first exposure is often
> under circumstances where too little time is allotted to learning the
> language.
>
> map/filter/reduce may be less powerful and less elegant, but they are a
> heck of a lot easier to look up in the documentation. More and more
> commonly, that means going to google and typing in 'python filter
> documentation'.
>
> Try going to google and typing in 'python [] documentation' or 'python
> * documentation'....

i agree with your viewpoint. List Comprehensions break the uniformity
of the simple original python syntax. Bigger list comprehension
constructs and lambda/map etc. constructs are equally bad to read IMHO.
I say this although i like to do 'oneliners' where people run away
screaming.

While the '*/**' notation adds syntactic complexity it doesn't
feel too bad to me.
(although i am currently refactoring a module to not use '*/**' anymore because it gets hard to read). holger From skip@pobox.com Fri Jan 3 16:19:38 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 3 Jan 2003 10:19:38 -0600 Subject: [Python-Dev] The meaning of __all__ In-Reply-To: <22D7F728-1F35-11D7-8A35-0030655234CE@cwi.nl> References: <22D7F728-1F35-11D7-8A35-0030655234CE@cwi.nl> Message-ID: <15893.47130.90968.687565@montanaro.dyndns.org> Jack> This doesn't feel right to me, as I've always used __all__ to mean Jack> "the whole externally visible API", and I've considered anything Jack> else that the module exports basically an artefact of the Jack> implementation. But when I looked this up in the reference manual Jack> it is rather vague about this. I believe you are correct. If it's in the external API and is (or should be) documented, it belongs in __all__. Skip From ark@research.att.com Fri Jan 3 16:19:52 2003 From: ark@research.att.com (Andrew Koenig) Date: Fri, 3 Jan 2003 11:19:52 -0500 (EST) Subject: [Python-Dev] binutils In-Reply-To: <15893.46882.572350.837237@montanaro.dyndns.org> (message from Skip Montanaro on Fri, 3 Jan 2003 10:15:30 -0600) References: <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com> <15893.46882.572350.837237@montanaro.dyndns.org> Message-ID: <200301031619.h03GJqv18545@europa.research.att.com> Andrew> You may recall that a while ago I mentioned that gnu binutils Andrew> 2.13 has bugs in how it handles dynamic linking that can prevent Andrew> Python from building on Solaris machines, and that 2.13.1 does Andrew> not correct those bugs completely. Skip> Can you narrow down the Solaris version for me? I'll update the warning in Skip> the README file of the distribution. I have encountered the problem on Solaris 2.7 and 2.8. I would expect it to be present in 2.9. I don't think it's present in 2.6, but I'm not sure and can no longer test it easily. 
If you like, I can send you patches to binutils 2.13 and 2.13.1, but I suspect that for almost everyone, upgrading to 2.13.2 will be the easiest course. From jason@tishler.net Fri Jan 3 16:29:34 2003 From: jason@tishler.net (Jason Tishler) Date: Fri, 03 Jan 2003 11:29:34 -0500 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules _randommodule.c,1.1,1.2 In-Reply-To: <20030102222524.GB29873@epoch.metaslash.com> References: <20030102214021.GC1996@tishler.net> <20030102222524.GB29873@epoch.metaslash.com> Message-ID: <20030103162934.GE1712@tishler.net> On Thu, Jan 02, 2003 at 05:25:24PM -0500, Neal Norwitz wrote: > On Thu, Jan 02, 2003 at 04:40:21PM -0500, Jason Tishler wrote: > > Anyway, with the attached patch to pyport.h, I was able to build > > Cygwin Python without any errors. > > I think you can simplify the patch by doing: > > #if !defined(__CYGWIN__) > #define PyAPI_FUNC(RTYPE) __declspec(dllimport) RTYPE > #endif I will modify my patch to use the above instead. > But the line below (which has 35 hits) is probably faster/easier: > > egrep '\.tp_.* =' */*.c I will submit a patch to the SF Python collector to fix the following: Modules/arraymodule.c: Arraytype.tp_getattro = PyObject_GenericGetAttr; Modules/arraymodule.c: Arraytype.tp_alloc = PyType_GenericAlloc; Modules/arraymodule.c: Arraytype.tp_free = PyObject_Del; Modules/bz2module.c: BZ2File_Type.tp_base = &PyFile_Type; Modules/bz2module.c: BZ2File_Type.tp_new = PyFile_Type.tp_new; Modules/bz2module.c: BZ2File_Type.tp_getattro = PyObject_GenericGetAttr; Modules/bz2module.c: BZ2File_Type.tp_setattro = PyObject_GenericSetAttr; Modules/bz2module.c: BZ2File_Type.tp_alloc = PyType_GenericAlloc; Modules/bz2module.c: BZ2File_Type.tp_free = _PyObject_Del; Modules/bz2module.c: BZ2Comp_Type.tp_getattro = PyObject_GenericGetAttr; Modules/bz2module.c: BZ2Comp_Type.tp_setattro = PyObject_GenericSetAttr; Modules/bz2module.c: BZ2Comp_Type.tp_alloc = PyType_GenericAlloc; Modules/bz2module.c: 
BZ2Comp_Type.tp_new = PyType_GenericNew; Modules/bz2module.c: BZ2Comp_Type.tp_free = _PyObject_Del; Modules/bz2module.c: BZ2Decomp_Type.tp_getattro = PyObject_GenericGetAttr; Modules/bz2module.c: BZ2Decomp_Type.tp_setattro = PyObject_GenericSetAttr; Modules/bz2module.c: BZ2Decomp_Type.tp_alloc = PyType_GenericAlloc; Modules/bz2module.c: BZ2Decomp_Type.tp_new = PyType_GenericNew; Modules/bz2module.c: BZ2Decomp_Type.tp_free = _PyObject_Del; Modules/cPickle.c: Picklertype.tp_getattro = PyObject_GenericGetAttr; Modules/cPickle.c: Picklertype.tp_setattro = PyObject_GenericSetAttr; Modules/posixmodule.c: StatResultType.tp_new = statresult_new; Modules/socketmodule.c: sock_type.tp_getattro = PyObject_GenericGetAttr; Modules/socketmodule.c: sock_type.tp_alloc = PyType_GenericAlloc; Modules/socketmodule.c: sock_type.tp_free = PyObject_Del; Modules/threadmodule.c: Locktype.tp_doc = lock_doc; Modules/xxsubtype.c: spamdict_type.tp_base = &PyDict_Type; Modules/xxsubtype.c: spamlist_type.tp_base = &PyList_Type; Modules/_hotshot.c: LogReaderType.tp_getattro = PyObject_GenericGetAttr; Modules/_hotshot.c: ProfilerType.tp_getattro = PyObject_GenericGetAttr; Modules/_randommodule.c: Random_Type.tp_getattro = PyObject_GenericGetAttr; Modules/_randommodule.c: Random_Type.tp_alloc = PyType_GenericAlloc; Modules/_randommodule.c: Random_Type.tp_free = _PyObject_Del; Modules/_tkinter.c: PyTclObject_Type.tp_getattro = &PyObject_GenericGetAttr; Note that I intend to skip PC/_winreg.c unless someone feels strongly that I should change this one too. 
Thanks,
Jason

--
PGP/GPG Key: http://www.tishler.net/jason/pubkey.asc or key servers
Fingerprint: 7A73 1405 7F2B E669 C19D 8784 1AFD E4CC ECF4 8EF6

From skip@pobox.com Fri Jan 3 16:23:45 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 3 Jan 2003 10:23:45 -0600
Subject: [Python-Dev] binutils
In-Reply-To: <200301031619.h03GJqv18545@europa.research.att.com>
References: <44577C94-1ED0-11D7-A03A-000393877AE4@codefab.com>
 <15893.46882.572350.837237@montanaro.dyndns.org>
 <200301031619.h03GJqv18545@europa.research.att.com>
Message-ID: <15893.47377.50318.118642@montanaro.dyndns.org>

    Skip> Can you narrow down the Solaris version for me? I'll update the
    Skip> warning in the README file of the distribution.

    Andrew> I have encountered the problem on Solaris 2.7 and 2.8. I would
    Andrew> expect it to be present in 2.9. I don't think it's present in
    Andrew> 2.6, but I'm not sure and can no longer test it easily.

Thanks, that will do.

    Andrew> If you like, I can send you patches to binutils 2.13 and 2.13.1,
    Andrew> but I suspect that for almost everyone, upgrading to 2.13.2 will
    Andrew> be the easiest course.

Thanks for the offer, but I wouldn't know what to do with them
(thankfully). :-)

Skip

From jason@tishler.net Fri Jan 3 16:32:24 2003
From: jason@tishler.net (Jason Tishler)
Date: Fri, 03 Jan 2003 11:32:24 -0500
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules _randommodule.c,1.1,1.2
In-Reply-To:
References: <20030102214021.GC1996@tishler.net>
Message-ID: <20030103163224.GF1712@tishler.net>

On Thu, Jan 02, 2003 at 11:29:33PM +0100, Martin v. Löwis wrote:
> Jason Tishler writes:
> > I feel this is the best approach because modules should build under
> > Cygwin without the standard Cygwin style patch that I have been
> > submitting for years. Do others concur? If so, then I will begin to
> > clean up the "mess" that I have created.
>
> This is what I thought a reasonable operating system and compiler
> should do by default, without even asking.

Well then, I guess Windows, Cygwin, and/or gcc are not "reasonable." :,)

> I certainly agree that it is desirable that you can put function
> pointers into static structures, so if it takes additional compiler
> flags to make it so, then use those flags.

Back when I first submitted my Cygwin Python DLL and Shared Extension
Patch:

    http://sf.net/tracker/?group_id=5470&atid=305470&func=detail&aid=402409

there were no such options. Since then Cygwin ld has been significantly
enhanced to support the following options:

    --enable-auto-import (currently defaults to enabled)
    --enable-runtime-pseudo-reloc (currently defaults to disabled)

See the Cygwin ld man page for more details, if interested.

Since Python's source already had the __declspec(dllexport) and
__declspec(dllimport) indicators for Win32, I never pursued leveraging
off of the new functionality. That is, until Tim informed me that MSVC
could deal with __declspec(dllimport) function pointers as static
initializers. I had erroneously concluded from the following:

    http://python.org/doc/FAQ.html#3.24

that it couldn't.

> I'm unclear why you have to *omit* the declspec, though, to make it
> work - I thought that __declspec(dllimport) is precisely the magic
> incantation that makes the compiler emit the necessary thunks.

Before --enable-auto-import was added to Cygwin ld, both
__declspec(dllexport) and __declspec(dllimport) were necessary for
successful linking. After --enable-auto-import was added, the linker
could automatically import functions as long as the function was
exported by any of the various methods (e.g., __declspec(dllexport),
--export-all-symbols, .def file, etc.).

Since the __declspec(dllimport) indicators are causing compilation
problems with shared extension modules and they are no longer needed, it
seems that the simplest (and best) solution is to just remove them.

Jason

--
PGP/GPG Key: http://www.tishler.net/jason/pubkey.asc or key servers
Fingerprint: 7A73 1405 7F2B E669 C19D 8784 1AFD E4CC ECF4 8EF6

From akuchlin@mems-exchange.org Fri Jan 3 16:11:59 2003
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 03 Jan 2003 11:11:59 -0500
Subject: [Python-Dev] PEP 301 implementation checked in
Message-ID:

I've checked in Richard Jones's patches, so those of you who follow
the Python CVS trunk can now play with the PEP 301 implementation.
Changes:

* The DistributionMetadata object and the setup() function now support
  'classifiers' as a keyword.
* The 'register' distutil command has been added to upload a package's
  metadata to the PyPI server.

The registry is still running on my web server at amk.ca, a
bandwidth-limited DSL line, but it'll be moved to something at
python.org before too long. (Certainly before 2.3final ships!)

Please bang on the distutils code in any way you can think of, filing
bug reports and offering comments, so we can be sure that it's solid.
You can comment on the web site, too, but the site can be updated
independently of the Python code so there's less pressure to get it
finished before 2.3final.

Here's an example of how the new code works. A setup.py file can now
contain a 'classifiers' argument listing Trove-style classifiers:

    setup (name = "Quixote",
           version = "0.5.1",
           description = "A highly Pythonic Web application framework",
           ...
           classifiers= ['Topic :: Internet :: WWW/HTTP :: Dynamic Content',
                         'Environment :: No Input/Output (Daemon)',
                         'Intended Audience :: Developers'],
           ...
           )

I can then run 'python setup.py register' to upload an entry to the
package index, which is browseable at
http://www.amk.ca/cgi-bin/pypi.cgi .
('register --list-classifiers' will output a list of legal classifiers.)
See PEP 301 for more details.

--amk (www.amk.ca)
OLIVIA: Why, this is very midsummer madness.
-- _Twelfth Night_, III, iv From guido@python.org Fri Jan 3 16:30:26 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 11:30:26 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Your message of "Fri, 03 Jan 2003 08:58:20 CST." <15893.42252.365991.851873@montanaro.dyndns.org> References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> <15892.62265.348753.317968@montanaro.dyndns.org> <2mof6ywl0o.fsf@starship.python.net> <15893.42252.365991.851873@montanaro.dyndns.org> Message-ID: <200301031630.h03GUQs13259@odiug.zope.com> > Guido> (Python's parser is intentionally simple-minded ...). > > >> I'll leave that gauntlet thrown, since I have no interest in > >> rewriting Python's parser. Maybe it will spark John Aycock's > >> interest though. ;-) > > Michael> Did you miss the "intentionally"? > > No. I just assumed it meant Guido didn't personally want to burn lots of > brain cells on parsing Python. I doubt he'd be as protective of your > neurons. ;-) The parser is intentionally dumb so that the workings of the parser are easy to understand to users who care. Compare C++. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 3 16:32:22 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 11:32:22 -0500 Subject: [Python-Dev] [development doc updates] In-Reply-To: Your message of "Fri, 03 Jan 2003 16:30:07 +0100." 
<3E15AC7F.1020302@isg.de> References: <20021231183847.B367218EC36@grendel.zope.com> <3E15AC7F.1020302@isg.de> Message-ID: <200301031632.h03GWMO13306@odiug.zope.com> > In the "improved standard modules" - section of the "WhatsNew" & Co. > Texts the SSL support in imaplib.py is missing. (submitted March 2002 by > Piers Lauder and myself). Python 2.3 will be the first Python release in > which this change is official included. Thanks! I've added this to http://www.python.org/2.3/highlights.html --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 3 16:37:23 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 11:37:23 -0500 Subject: [Python-Dev] The meaning of __all__ In-Reply-To: Your message of "Fri, 03 Jan 2003 17:05:10 +0100." <22D7F728-1F35-11D7-8A35-0030655234CE@cwi.nl> References: <22D7F728-1F35-11D7-8A35-0030655234CE@cwi.nl> Message-ID: <200301031637.h03GbOb13373@odiug.zope.com> > There's a discussion over on the pyobjc developers mailing list at the > moment about the use of __all__. > > Some people suggested that for a certain module we only put "often > used" items in __all__, so that "from module import *" will import only > these commonly used items into your namespace. The module would contain > other items which *are* useful and which should be callable from the > outside, but these would have to be explicitly imported into your > namespace (or used via "import objc; objc.SetVerbose()". > > This doesn't feel right to me, as I've always used __all__ to mean "the > whole externally visible API", and I've considered anything else that > the module exports basically an artefact of the implementation. But > when I looked this up in the reference manual it is rather vague about > this. > > Any opinions on the matter? __all__ should contain the entire public API. It's mostly intended to avoid accidentally exporting other things you've *imported* like os, sys, and other library modules. 
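
A minimal sketch of the behaviour described here, not part of the original message and written in present-day Python 3: `from module import *` consults `__all__`, so an imported library module such as `os` and an unlisted helper stay out of the importer's namespace. The module name `demo_all_module` and its two functions are hypothetical and are built in memory purely for illustration.

```python
import sys
import types

# Source of a hypothetical module; only public_func is listed in __all__.
MODULE_SOURCE = '''
import os

__all__ = ["public_func"]

def public_func():
    return "public"

def helper():
    return "helper"
'''

# Build the module in memory and register it so it can be imported by name.
demo = types.ModuleType("demo_all_module")
exec(MODULE_SOURCE, demo.__dict__)
sys.modules["demo_all_module"] = demo

# Run a star-import in a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_all_module import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)

# Names left out of __all__ remain reachable through the module object.
print(demo.helper())
```

Only `public_func` reaches the importing namespace; `os` and `helper` do not, although both can still be reached as attributes of the module.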
Another way of doing that would be to write e.g. import sys as _sys but I don't like excessive underscoring. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 3 16:39:11 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 11:39:11 -0500 Subject: [Python-Dev] GC at exit? In-Reply-To: Your message of "Fri, 03 Jan 2003 11:09:37 EST." <20030103160937.GA1761@cthulhu.gerg.ca> References: <20021230034551.GA18622@panix.com> <20021231005736.GA15066@panix.com> <20030102080358.GC2287@malva.ua> <20030102135544.GA2891@panix.com> <20030103160937.GA1761@cthulhu.gerg.ca> Message-ID: <200301031639.h03GdB713399@odiug.zope.com> > As I recall (and this knowledge dates back to 1997 or so, so could well > be obsolete), Perl does a full GC run at process exit time. In normal > contexts this is irrelevant; I believe the justification was to clean up > resources used by an embedded interpreter. Embedded interpreters may do this, but the "main" interpreter may still not do this. The only time the "main" interpreter would benefit from this is if there are external resources that otherwise don't get cleaned up when the process exits (e.g. temp files). --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Fri Jan 3 16:59:16 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 3 Jan 2003 10:59:16 -0600 Subject: [Python-Dev] Directories w/ spaces - pita Message-ID: <15893.49508.827550.246089@montanaro.dyndns.org> Is it possible to tweak the build process so directories containing spaces aren't created. From the Unix side of things it's just a pain in the ass. (Yes, I know about find's -print0 action and xargs's -0 flag. Not everybody uses them.) For example, on my Mac OS X system, building creates these two directories: ./build/lib.darwin-6.3-Power Macintosh-2.3 ./build/temp.darwin-6.3-Power Macintosh-2.3 I assume this is a configure thing, accepting the output of uname -m without further processing. 

Skip

From jepler@unpythonic.net Fri Jan 3 17:03:22 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Fri, 3 Jan 2003 11:03:22 -0600
Subject: [Python-Dev] map, filter, reduce, lambda
In-Reply-To: <5A3A8B72-1F24-11D7-A03A-000393877AE4@codefab.com>
References: <5A3A8B72-1F24-11D7-A03A-000393877AE4@codefab.com>
Message-ID: <20030103170309.GJ17390@unpythonic.net>

On Fri, Jan 03, 2003 at 09:05:02AM -0500, Bill Bumgarner wrote:
> On Friday, Jan 3, 2003, at 00:42 US/Eastern, Gisle Aas wrote:
> >>'anon' sounds like a great name -- unlikely to be used, shorter than
> >>'lambda', and a heck of lot more indicative as to what is going on.
> >'anon' does not sound that great to me. Anon what? There is lots of
> >anonymous stuff. Arc is going for 'fn'. I would vote for 'sub' :)
>
> Either makes more sense than 'lambda'. I prefer 'anon' because it is a
> very common abbreviation for 'anonymous' and because it would have
> reduced the scarring during the learning of lambda calculus in CS so
> many years ago.
>
> 'fn' and 'sub' don't seem to be much differentiated from 'def'. 'fun'
> would be better than 'fn', though, because that's what lambda functions
> are...

Of course, you could just extend the syntax of 'def'. The 'funcdef'
statement remains as now:

    funcdef: 'def' NAME parameters ':' suite
    suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT

but the 'anon_funcdef' expression would be something like

    anon_funcdef: 'def' parameters ':' suite

No new keyword needs to be introduced, and the fact that they have the
same name is an even bigger encouragement to identify funcdef and
anon_funcdef as working in the same way.

(I don't think any of these problems are related specifically to the
use of 'def' for both anonymous and named functions, but I'll use my
own suggested spelling throughout the following)

The first kind of problem with this is figuring out how to make the
parser recognize

    set_callback(def ():
        print "I am a callback, short and stout"
        print "Here is my handle, here is my spout", ())

correctly (that comma could either end the first argument to
set_callback(), or it could mean to not print a newline after the
second print statement). Then the empty tuple could be a (rather silly)
third statement of the anon_funcdef, or a second parameter to
set_callback().

Lua handles this by having another token to end a function definition.
This is probably not going to happen in Python. Instead, you have to
find a way to recognize it based on its indentation. So the above
example would presumably be interpreted as a 1-argument call to
set_callback(), and you'd have to write something like

    set_callback(def ():
        print "I am a callback, short and stout"
        print "Here is my handle, here is my spout"
    , ())

to be parsed as a 2-argument call to set_callback.

But there are two problems with this. First, when inside grouping
characters ((), [], {}), the production of INDENT and DEDENT tokens is
suppressed. So you'll need to find a way to turn it back on inside the
anon_funcdef production. This is a kind of coupling that I don't think
yet exists between the tokenizer and parser. (Right now, 'level' is
incremented when '(', '[', or '{' is seen, and decremented when ')',
']' or '}' is seen, and INDENT/DEDENT processing is suppressed whenever
'level' is nonzero)

Second, you must correctly maintain the indentation level stack in a
somewhat new way. In

    set_callback(def (): INDENT
        print "I am a callback, short and stout"

the INDENT is not from the level of set_callback to the level of
'print', it's from some intermediate and unspecified level to the level
of the print token.
So you must add machinery to specify that there's some anonymous
indentation level. Then, on the DEDENT in

        print "Here is my handle, here is my spout"
    DEDENT , ())

you must produce the dedent token and pop that one level of unspecified
indent.

I'm not sure if this is enough to handle nested anonymous functions.
For instance, are there cases of "ambiguous indentation", similar to
the following, that would be interpreted wrongly with the suggested
method I gave above?

    def(): def(): pass return 1

You must also decide whether you want to accept

    f(def():
        pass)

in which case the ) must generate a DEDENT for the corresponding
anonymous INDENT. On the other hand, maybe you feel comfortable
requiring

    f(def():
        pass
    )

The second kind of problem is that people want to be able to write the
moral equivalent of code like

    def f():
        i=0
        with_lock o.l:
            if o.m(): i=1

but due to the way nested scopes work, you can't write

    def f():
        i=0
        with_lock(o.l, def():
            if o.m(): i=1)

just like you can't write

    def f():
        i=0
        def g(): i=1
        g()
        return i

and have the value of i in the scope of f be modified by the
assignment. So the rules of nested scopes would require changes to make
anonymous functions useful.

It seems that this also complicates some simpler uses of lambda. For
instance, right now

    f(lambda: 1, lambda: 0)

works, but

    f(def(): return 1, def(): return 0)

doesn't. The body of the first anon_funcdef is
'return 1, def (): return 0'. You have to write

    f((def(): return 1), def(): return 0)

or, for the sake of symmetry

    f((def(): return 1), (def(): return 0))

that's nearly a 50% increase in the number of characters.
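
The nested-scope behaviour described above (an assignment inside the inner function does not rebind the outer function's `i`) can be checked directly. This is a minimal sketch in present-day Python 3 syntax, not part of the original message. The second function uses the `nonlocal` statement, which was added to the language only later (Python 3.0) and is shown purely for contrast; the name `f_nonlocal` is illustrative.

```python
def f():
    i = 0

    def g():
        i = 1  # binds a new local i inside g; f's i is untouched

    g()
    return i


def f_nonlocal():
    i = 0

    def g():
        nonlocal i  # later language addition: rebind the enclosing i
        i = 1

    g()
    return i


print(f())           # 0
print(f_nonlocal())  # 1
```

The first result is the one the message is concerned with: without some way to declare the outer binding, a function body passed as an argument cannot assign to the caller's locals.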

I suppose that the production for anon_funcdef could be changed to
somewhat alleviate this:

    anon_funcdef: 'def' [parameters] ':' ( test | complex_stmt )
    complex_stmt: NEWLINE INDENT stmt+ DEDENT

(also let an empty parameter list be omitted) then I think you could
still write

    f(def: 1, def: 0)

since the 'test' form of anon_funcdef can be recognized when the token
'1' is hit (instead of NEWLINE), and the ',' is recognized as being
after the 'test' token. And, due to the shorter spelling of 'def' than
'lambda', you've even gained a few characters. But this means that a
1-liner anon_funcdef has a different syntax than a 1-liner funcdef,
which is a mark against this idea.

Well, sorry that this message got a bit long, but I think there are
some significant issues to resolve before this (cool, imo) feature can
be added to the language. Unfortunately, the difficulty implementing
some of them is likely to outweigh the cool, certainly in the eyes of
the people with the power to decide whether to include the feature.

Jeff

From jacobs@penguin.theopalgroup.com Fri Jan 3 17:02:32 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 3 Jan 2003 12:02:32 -0500 (EST)
Subject: [Python-Dev] Smart(er) language consumers
In-Reply-To: <20030103171743.L2841@prim.han.de>
Message-ID:

On Fri, 3 Jan 2003, holger krekel wrote:
> While the '*/**' notation adds syntactic complexity it doesn't
> feel too bad to me. (although i am currently refactoring a module
> to not use '*/**' anymore because it gets hard to read).

I agree: */** syntax makes for hard(er) to read code. However, I make a
point of only using it in code that _should_ be hard(er) to read,
because it is _doing_ something hard.

Any feature can be naively overused to the point of becoming a mental
blinder. This applies equally to lambda, list comprehensions,
dictionaries, objects, print, indentation, chaining methods, etc... The
key is to be a smart language-feature consumer and not overdo the
sugary features.
My rule is to ration out 'cool tricks' to at most one per 4 lines of code or 2 lines of comments. -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From nas@python.ca Fri Jan 3 17:14:58 2003 From: nas@python.ca (Neil Schemenauer) Date: Fri, 3 Jan 2003 09:14:58 -0800 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301031630.h03GUQs13259@odiug.zope.com> References: <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> <15892.62265.348753.317968@montanaro.dyndns.org> <2mof6ywl0o.fsf@starship.python.net> <15893.42252.365991.851873@montanaro.dyndns.org> <200301031630.h03GUQs13259@odiug.zope.com> Message-ID: <20030103171458.GA10262@glacier.arctrix.com> Guido van Rossum wrote: > The parser is intentionally dumb so that the workings of the parser > are easy to understand to users who care. How about the grammar? Is it simple purely so the parser was easier to write? Personally, I have a theory that the one of the main reasons Python considered readable is because parsing it doesn't require more than one token of look ahead. > Compare C++. "Run away, run away" :-) Neil From jeremy@alum.mit.edu Fri Jan 3 17:08:08 2003 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 3 Jan 2003 12:08:08 -0500 Subject: [Python-Dev] Smart(er) language consumers In-Reply-To: References: <20030103171743.L2841@prim.han.de> Message-ID: <15893.50040.569537.460855@slothrop.zope.com> >>>>> "KJ" == Kevin Jacobs writes: KJ> On Fri, 3 Jan 2003, holger krekel wrote: >> While the '*/**' notation adds syntactic complexity it doesn't >> feel too bad to me. 
(although i am currently refactoring a module >> to not use '*/**' anymore because it gets hard to read). KJ> I agree: */** syntax makes for hard(er) to read code. However, KJ> I make a point of only using it in code that _should_ be KJ> hard(er) to read, because it is _doing_ something hard. I can't recall a necessary use of apply that was easy to read. The problem that these two features solve is just messy. With apply, there's usually an explicit, hard-to-read tuple creation to pass to apply. The extended call syntax eliminates that particular mess, so I'm very happy with it. Jeremy From guido@python.org Fri Jan 3 17:21:20 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 12:21:20 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Fri, 03 Jan 2003 10:59:47 EST." References: Message-ID: <200301031721.h03HLKM13657@odiug.zope.com> Tim & I spent some more time in front of a whiteboard today. We've found a solution that takes the ValueError away. It needs to make assumptions about what the tzinfo implementation does with the impossible and ambiguous times at the DST switch points. Tim thinks that this is the same solution that Aahz arrived at with han dwaving. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 3 17:25:53 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 12:25:53 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Fri, 03 Jan 2003 12:21:20 EST." Message-ID: <200301031725.h03HPrW13718@odiug.zope.com> > Tim & I spent some more time in front of a whiteboard today. We've > found a solution that takes the ValueError away. It needs to make > assumptions about what the tzinfo implementation does with the > impossible and ambiguous times at the DST switch points. Tim thinks
> that this is the same solution that Aahz arrived at with han dwaving.
                                                           ^^^^^^^^^^^
That's not Aahz's imaginary friend, it's a typo for hand waving.
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From eric@enthought.com Fri Jan 3 17:30:08 2003 From: eric@enthought.com (eric jones) Date: Fri, 3 Jan 2003 11:30:08 -0600 Subject: [Python-Dev] Directories w/ spaces - pita In-Reply-To: <15893.49508.827550.246089@montanaro.dyndns.org> Message-ID: <000001c2b34d$c5f88f10$8901a8c0@ERICDESKTOP> I would prefer this also. scipy_distutils removes the space because there are reported problems with Fortran compilers (g77) failing to compile code when spaces are in the file name on OS X builds. The code we use in scipy_distutils/command/build.py is below.

    from distutils.command.build import build as old_build

    class build(old_build):
        ...
        def finalize_options (self):
            #-------------------------------------------------------------------
            # This line is re-factored to a function -- everything else in the
            # function is identical to the finalize_options function in the
            # standard distutils build.
            #-------------------------------------------------------------------
            plat_specifier = self.get_plat_specifier()
            ...

        def get_plat_specifier(self):
            """ Return a unique string that identifies this platform.
                The string is used to build path names and contains no
                spaces or control characters. (we hope)
            """
            plat_specifier = ".%s-%s" % (util.get_platform(), sys.version[0:3])
            #-------------------------------------------------------------------
            # get rid of spaces -- added for OS X support.
            #-------------------------------------------------------------------
            plat_specifier = plat_specifier.replace(' ','')
            return plat_specifier
        ...
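The same idea as a standalone sketch, for illustration only. `sanitize_plat_specifier` is a hypothetical helper (it is not part of distutils or scipy_distutils), its two arguments stand in for `util.get_platform()` and `sys.version[0:3]`, and it removes every whitespace character rather than only the space, which is an assumption beyond what the code above does.

```python
import re


def sanitize_plat_specifier(platform, version):
    """Return a '.<platform>-<version>' string with all whitespace removed.

    Hypothetical helper for illustration; `platform` and `version` stand in
    for util.get_platform() and sys.version[0:3] in the excerpt above.
    """
    plat_specifier = ".%s-%s" % (platform, version)
    # Drop spaces, tabs and newlines so the result is safe in path names.
    return re.sub(r"\s+", "", plat_specifier)


# A platform string that contains a space, as reported for Mac OS X.
print(sanitize_plat_specifier("darwin-6.3-Power Macintosh", "2.3"))
```

With a specifier built this way, the build directories would presumably be named without the embedded space, which is the behaviour being asked for in this thread.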
---------------------------------------------- eric jones 515 Congress Ave www.enthought.com Suite 1614 512 536-1057 Austin, Tx 78701 > -----Original Message----- > From: python-dev-admin@python.org [mailto:python-dev-admin@python.org] On > Behalf Of Skip Montanaro > Sent: Friday, January 03, 2003 10:59 AM > To: python-dev@python.org > Subject: [Python-Dev] Directories w/ spaces - pita > > > Is it possible to tweak the build process so directories containing spaces > aren't created. From the Unix side of things it's just a pain in the ass. > (Yes, I know about find's -print0 action and xargs's -0 flag. Not > everybody > uses them.) For example, on my Mac OS X system, building creates these two > directories: > > ./build/lib.darwin-6.3-Power Macintosh-2.3 > ./build/temp.darwin-6.3-Power Macintosh-2.3 > > I assume this is a configure thing, accepting the output of uname -m > without > further processing. > > Skip > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From guido@python.org Fri Jan 3 17:29:57 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 12:29:57 -0500 Subject: [Python-Dev] Smart(er) language consumers In-Reply-To: Your message of "Fri, 03 Jan 2003 12:02:32 EST." References: Message-ID: <200301031729.h03HTw613752@odiug.zope.com> > > While the '*/**' notation adds syntactic complexity it doesn't > > feel too bad to me. (although i am currently refactoring a module > > to not use '*/**' anymore because it gets hard to read). > > I agree: */** syntax makes for hard(er) to read code. However, I make a > point of only using it in code that _should_ be hard(er) to read, because it > is _doing_ somethat hard. > > Any feature can be naively overused to the point of becoming a mental > blinder. This applies equally to lambda, list comprehensions, dictionaries, > objects, print, indentation, chaining methods, etc... 
The key is to be a > smart language-feature consumer and not overdo the sugary features. My > rule is to ration out 'cool tricks' to at most one per 4 lines of code or 2 > lines of comments. I recently learned that some people use apply() unnecessarily whenever the function is something computed by an expression, e.g.
    apply(dict[key], (x, y))
rather than
    dict[key](x, y)
More recently I learned that some people use * and ** unnecessarily too:
    f(x, y, *[a, b], **{})
for
    f(x, y, a, b)
Finally I note that * and ** are slightly more flexible than apply: you can write
    f(a, *args)
where with apply you'd have to write
    apply(f, (a,)+args)
And, of course, * and ** should be no strangers to anyone who has ever coded a function *declaration* using varargs or variable keyword arguments. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 3 17:35:51 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 12:35:51 -0500 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: Your message of "Fri, 03 Jan 2003 09:14:58 PST." <20030103171458.GA10262@glacier.arctrix.com> References: <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> <15892.62265.348753.317968@montanaro.dyndns.org> <2mof6ywl0o.fsf@starship.python.net> <15893.42252.365991.851873@montanaro.dyndns.org> <200301031630.h03GUQs13259@odiug.zope.com> <20030103171458.GA10262@glacier.arctrix.com> Message-ID: <200301031735.h03HZqo13797@odiug.zope.com> > Guido van Rossum wrote: > > The parser is intentionally dumb so that the workings of the parser > > are easy to understand to users who care. > > How about the grammar? Is it simple purely so the parser was easier to > write?
Personally, I have a theory that one of the main reasons > Python is considered readable is because parsing it doesn't require more > than one token of look ahead. Good point -- the grammar is also intentionally dumb for usability (although there are a few cases where the grammar has to be complicated and a second pass is necessary to implement features that the dumb parser cannot handle, like disambiguating "x = y" from "x = y = z", and detecting keyword arguments). One of my personal theories (fed by a comment here by someone whose name I don't recall right now) is that, unlike in other languages, the fact that so little happens at compile time is a big bonus to usability. --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis@bluewin.ch Fri Jan 3 18:01:23 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Fri, 3 Jan 2003 19:01:23 +0100 Subject: [Python-Dev] map, filter, reduce, lambda References: <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> <15892.62265.348753.317968@montanaro.dyndns.org> <2mof6ywl0o.fsf@starship.python.net> <15893.42252.365991.851873@montanaro.dyndns.org> <200301031630.h03GUQs13259@odiug.zope.com> <20030103171458.GA10262@glacier.arctrix.com> <200301031735.h03HZqo13797@odiug.zope.com> Message-ID: <035401c2b352$20c88ea0$6d94fea9@newmexico> From: "Guido van Rossum" > > One of my personal theories (fed by a comment here by someone whose > name I don't recall right now) is that, unlike in other languages, the > fact that so little happens at compile time is a big bonus to > usability. > a "collective" thought stream in a thread about macros: http://aspn.activestate.com/ASPN/Mail/Message/1076878 regards From mal@lemburg.com Fri Jan 3 18:27:50 2003 From: mal@lemburg.com (M.-A.
Lemburg) Date: Fri, 03 Jan 2003 19:27:50 +0100 Subject: [Python-Dev] Holes in time In-Reply-To: <200301031528.h03FSuK12872@odiug.zope.com> References: <200301030412.h034Cw207328@pcp02138704pcs.reston01.va.comcast.net> <3E157236.4000004@lemburg.com> <200301031528.h03FSuK12872@odiug.zope.com> Message-ID: <3E15D626.2030901@lemburg.com> Guido van Rossum wrote: >>Guido van Rossum wrote: >> >>>Let me present the issue differently. >>> >>>On Sunday, Oct 27, 2002, both 5:30 UTC and 6:30 UTC map to 1:30 am US >>>Eastern time: 5:30 UTC maps to 1:30 am EDT, and 6:30 UTC maps to 1:30 >>>am EST. (UTC uses a 24 hour clock.) >>> >>>We have a tzinfo subclass representing the US Eastern (hybrid) >>>timezone whose primary responsibility is to translate from local time >>>in that timezone to UTC: Eastern.utcoffset(dt). It can also tell us >>>how much of that offset is due to DST: Eastern.dst(dt). >>> >>>It is crucial to understand that with "Eastern" as tzinfo, there is >>>only *one* value for 1:30 am on Oct 27, 2002. The Eastern tzinfo >>>object arbitrarily decides this is EDT, so it maps to 5:30 UTC. (But >>>the problem would still exist if it arbitrarily decided that it was >>>EST.) >>> >>>It is also crucial to understand that we have no direct way to >>>translate UTC to Eastern. We only have a direct way to translate >>>Eastern to UTC (by subtracting Eastern.utcoffset(dt)). >> >>Why don't you take a look at how this is done in mxDateTime ? > > > I looked at the code, but I couldn't find where it does conversion > between arbitrary timezones -- almost all timezone-related code seems > to have to do with parsing timezone names and specifications. It doesn't do conversion between time zone, but it does provide you with the offset information from UTC to local time in both directions. >>It has support for the C lib API timegm() (present in many C libs) >>and includes a work-around which works for most cases; even close >>to the DST switch time. 
> > A goal of the new datetime module is to avoid all dependency on the C > library's time facilities -- we must support calculataions outside the > range that the C library can deal with. I don't see how that can be done for time zones and DST. Timezones and even more the DST settings change more often for various locales than you think, so assumptions about the offset between UTC and local time for the future as well as for historical dates can easily be wrong. The tz data used by most C libs has tables which account for many of the known offsets in the past; they can only guess about the future. The only usable time scale for historic and future date/time is UTC. The same is true if you're interested in date/time calculations in terms of absolute time. Now, for current time zones, the C lib is a good source of information, so I don't see why you wouldn't want to use it. >>BTW, you should also watch out for broken mktime() implementations >>and whether the C lib support leap seconds or not. That has bitten >>me a few times too. > > Ditto. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Fri Jan 3 18:39:44 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 13:39:44 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: Your message of "Fri, 03 Jan 2003 19:27:50 +0100." <3E15D626.2030901@lemburg.com> References: <200301030412.h034Cw207328@pcp02138704pcs.reston01.va.comcast.net> <3E157236.4000004@lemburg.com> <200301031528.h03FSuK12872@odiug.zope.com> <3E15D626.2030901@lemburg.com> Message-ID: <200301031839.h03Idil14232@odiug.zope.com> > Now, for current time zones, the C lib is a good source > of information, so I don't see why you wouldn't want to > use it. 
It seems you haven't been following this discussion. The issue is not how to get information about timezones. The issue is, given an API that implements an almost-but-not-quite-reversible function from local time to UTC, how to invert that function. Please go read the datetime Wiki before commenting further in this thread. --Guido van Rossum (home page: http://www.python.org/~guido/) From jepler@unpythonic.net Fri Jan 3 18:53:33 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Fri, 3 Jan 2003 12:53:33 -0600 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> References: <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030103185331.GK17390@unpythonic.net> On Thu, Jan 02, 2003 at 08:52:18PM -0500, Guido van Rossum wrote: > You couldn't reuse def, because lambda can start an expression which > can occur at the start of a line, so a line starting with def would be > ambiguous (Python's parser is intentionally simple-minded and doesn't > like having to look ahead more than one token). Ah. I knew I was missing something when I made this suggestion (in a different subthread). Could the parser be taught that def-at-start-of-line is always good old fashioned funcdef instead of anon_funcdef (my name), since the statement def(): None is a useless one (you can't call it, you can't assign it, ..)? (And you could write the statement '(def(): None)' if you were really adamant that you wanted to perform that particular useless thing) The conflict may exist in the grammar, but could be resolved in a particular way in the parser. 
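A rough sketch of that parser-side resolution, under stated assumptions: the rule modelled is "a def that begins a statement is a funcdef, a def anywhere else is an anon_funcdef". `classify_defs` is an invented name, the anonymous-def syntax is the hypothetical one discussed in this thread, and the standard `tokenize` module is used only to split a line into tokens. This is not how CPython's parser is implemented.

```python
import io
import tokenize

# Token types that carry no statement content and are skipped.
_SKIP = (tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
         tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER)


def classify_defs(line):
    """Label each 'def' token in one logical line.

    Hypothetical rule: 'def' as the first token of the statement is an
    ordinary funcdef; 'def' in any later position is an anon_funcdef.
    """
    labels = []
    at_start = True
    for tok in tokenize.generate_tokens(io.StringIO(line).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME and tok.string == "def":
            labels.append("funcdef" if at_start else "anon_funcdef")
        # Any real token means we are no longer at the statement start.
        at_start = False
    return labels


print(classify_defs("def f(): return 1\n"))
print(classify_defs("f(def: 1, def: 0)\n"))
print(classify_defs("(def(): None)\n"))
```

The decision needs only the position of the `def` token, so no extra lookahead is required. Under this rule the statement `def(): None` is classified as a funcdef and would then fail for lack of a name, while the parenthesized form stays available as an expression.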
Or maybe there's even a clever way to write the grammar to make the ambiguity go away. Something along the lines of
    simple_stmt: expr_stmt_nodef | ...
    expr_stmt_nodef: testlist_nodef ...
    testlist_nodef: test_nodef (',' test)* [',']
    test_nodef: and_test ('or' and_test)*
I think the complications stop cascading at that point. Jeff From ark@research.att.com Fri Jan 3 19:06:18 2003 From: ark@research.att.com (Andrew Koenig) Date: Fri, 3 Jan 2003 14:06:18 -0500 (EST) Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <200301022311.h02NBdE04436@odiug.zope.com> (message from Guido van Rossum on Thu, 02 Jan 2003 18:11:39 -0500) References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <15892.48142.993092.418042@montanaro.dyndns.org> <200301022243.h02Mhjr02102@europa.research.att.com> <200301022311.h02NBdE04436@odiug.zope.com> Message-ID: <200301031906.h03J6Ik11646@europa.research.att.com> >> I hope to be able to give a talk about it at the Oxford conference in >> April. Guido> I guess you won't make it to PyCon in DC the week before that? Hadn't thought that far ahead, actually. Looking at the calendar now, I think there's a good chance I'll be able to make it. From erik@pythonware.com Fri Jan 3 19:03:07 2003 From: erik@pythonware.com (Erik Heneryd) Date: Fri, 3 Jan 2003 20:03:07 +0100 (CET) Subject: [Python-Dev] Directories w/ spaces - pita In-Reply-To: <15893.49508.827550.246089@montanaro.dyndns.org> Message-ID: On Fri, 3 Jan 2003, Skip Montanaro wrote: > > Is it possible to tweak the build process so directories containing spaces > aren't created. From the Unix side of things it's just a pain in the ass. > (Yes, I know about find's -print0 action and xargs's -0 flag. Not everybody > uses them.)
For example, on my Mac OS X system, building creates these two > directories: > > ./build/lib.darwin-6.3-Power Macintosh-2.3 > ./build/temp.darwin-6.3-Power Macintosh-2.3 > > I assume this is a configure thing, accepting the output of uname -m without > further processing. > > Skip i'd like to ask why there are files with names containing space in the source tarball? even though you don't use the mac stuff, and could possibly just do rm -fr Mac, sometimes you'd like to keep it for completeness. not a big issue, but it annoys me every time i have to check in a new release into the pythonware repository. i'd like to see a policy not to use anything but hmm... a-zA-Z0-9.-_ in file names. erik From neal@metaslash.com Fri Jan 3 19:33:40 2003 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 3 Jan 2003 14:33:40 -0500 Subject: [Python-Dev] bz2 problem deriving subclass in C Message-ID: <20030103193340.GG29873@epoch.metaslash.com> I just submitted a patch: http://python.org/sf/661796 (BZ2File leaking fd and memory). BZ2File inherits from PyFileObject and there's a problem when deallocating BZ2File that it doesn't call the file_dealloc also. I don't know if this "problem" has been solved before. The patch is less than elegant. It exposes the file_dealloc and BZ2File.dealloc calls file_dealloc. (names are changed to add _Py_ prefix, etc). Is there another way to solve this problem? Neal From shane@zope.com Fri Jan 3 19:33:19 2003 From: shane@zope.com (Shane Hathaway) Date: Fri, 03 Jan 2003 14:33:19 -0500 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time References: Your message of "Fri, 03 Jan 2003 19:27:50 +0100." <3E15D626.2030901@lemburg.com> <200301031839.h03Idil14232@odiug.zope.com> Message-ID: <3E15E57F.2000407@zope.com> Guido van Rossum wrote: >>Now, for current time zones, the C lib is a good source >>of information, so I don't see why you wouldn't want to >>use it. > > > It seems you haven't been following this discussion. 
The issue is not > how to get information about timezones. The issue is, given an API > that implements an almost-but-not-quite-reversible function from local > time to UTC, how to invert that function. I have to admit I've also had trouble following the discussion, until you described it that way. Correct me if I'm wrong: you need the inverse of a mathematical function.
    utc_time = f(local_time)
I'll represent the inverse as "g":
    local_time = g(utc_time)
The graph of "f" looks like an upward slope, but it has little holes and overlaps at daylight savings boundaries. The inverse is almost as strange, with periodic small jumps up and down. You'd like to use the time zone information provided by the C library in the computation of "f", but the C library doesn't quite provide all the information you need to compute "g" correctly. With those requirements, your only hope of computing "g" is to make some assumptions about "f". That sounds perfectly reasonable, but may I suggest moving the assumption by changing the interface of the tzinfo class. The utcoffset() method leads one to naively assume that functions f and g can both depend reliably on utcoffset(). Instead, tzinfo might have two methods, to_local(utc_date) and to_utc(local_date). That way, the tzinfo object encapsulates the madness. One downside is that then you can't expect normal programmers to write a correct tzinfo based on the C libraries. They'll never get it right. :-) It would have to be supplied with Python. Shane From fdrake@acm.org Fri Jan 3 19:56:58 2003 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 3 Jan 2003 14:56:58 -0500 Subject: [Python-Dev] bz2 problem deriving subclass in C In-Reply-To: <20030103193340.GG29873@epoch.metaslash.com> References: <20030103193340.GG29873@epoch.metaslash.com> Message-ID: <15893.60170.186597.293108@grendel.zope.com> Neal Norwitz writes: > BZ2File inherits from PyFileObject and there's a problem when > deallocating BZ2File that it doesn't call the file_dealloc also. > > I don't know if this "problem" has been solved before. The patch is > less than elegant. It exposes the file_dealloc and BZ2File.dealloc > calls file_dealloc. (names are changed to add _Py_ prefix, etc). > > Is there another way to solve this problem? Since you have the BZ2File type object, you have a reference to the file type object. Can't you just call the tp_dealloc slot from that? That seems a very reasonable approach from where I'm sitting. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From neal@metaslash.com Fri Jan 3 20:03:59 2003 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 3 Jan 2003 15:03:59 -0500 Subject: [Python-Dev] bz2 problem deriving subclass in C In-Reply-To: <15893.60170.186597.293108@grendel.zope.com> References: <20030103193340.GG29873@epoch.metaslash.com> <15893.60170.186597.293108@grendel.zope.com> Message-ID: <20030103200359.GH29873@epoch.metaslash.com> On Fri, Jan 03, 2003 at 02:56:58PM -0500, Fred L. Drake, Jr. wrote: > > Neal Norwitz writes: > > BZ2File inherits from PyFileObject and there's a problem when > > deallocating BZ2File that it doesn't call the file_dealloc also. > > > > I don't know if this "problem" has been solved before. The patch is > > less than elegant. It exposes the file_dealloc and BZ2File.dealloc > > calls file_dealloc. (names are changed to add _Py_ prefix, etc). > > > > Is there another way to solve this problem? > > Since you have the BZ2File type object, you have a reference to the > file type object. Can't you just call the tp_dealloc slot from that? 
> That seems a very reasonable approach from where I'm sitting. Makes sense to me too. I tried it and it didn't work. At least I think I tried it, but I may have called tp_free. I'm not sure now. Neal From guido@python.org Fri Jan 3 20:52:30 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 03 Jan 2003 15:52:30 -0500 Subject: [Python-Dev] Directories w/ spaces - pita In-Reply-To: Your message of "Fri, 03 Jan 2003 20:03:07 +0100." References: Message-ID: <200301032052.h03KqUT05587@odiug.zope.com> > i'd like to ask why there are files with names containing space in the > source tarball? even though you don't use the mac stuff, and could > possibly just do rm -fr Mac, sometimes you'd like to keep it for > completeness. not a big issue, but it annoys me every time i have to > check in a new release into the pythonware repository. i'd like to see a > policy not to use anything but hmm... a-zA-Z0-9.-_ in file names. Me too, but this is up to the Mac developers, and they really like their spaces. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) From Jack.Jansen@oratrix.com Fri Jan 3 20:59:44 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Fri, 3 Jan 2003 21:59:44 +0100 Subject: [Python-Dev] bz2 problem deriving subclass in C In-Reply-To: <20030103200359.GH29873@epoch.metaslash.com> Message-ID: <48FE8D34-1F5E-11D7-8857-000A27B19B96@oratrix.com> On vrijdag, jan 3, 2003, at 21:03 Europe/Amsterdam, Neal Norwitz wrote: >> Since you have the BZ2File type object, you have a reference to the >> file type object. Can't you just call the tp_dealloc slot from that? >> That seems a very reasonable approach from where I'm sitting. > > Makes sense to me too. I tried it and it didn't work. > At least I think I tried it, but I may have called tp_free. > I'm not sure now. I did just this last week (when I converted the Mac toolbox modules to new-style objects) and it worked. 
But if you accidentally do this with old-style objects or objects without a base type (which happened to me, because the code is generated) you get a spectacular crash (old-style objects don't have the tp_free slot). The logic I use for generating the body of xxx_dealloc is now
    def generate_dealloc(....):
        generate("cleanup my own mess")
        if basetype:
            generate("basetype.tp_dealloc(self);")
        elif new-style-object:
            generate("self->ob_type->tp_free(self);")
        else:
            generate("PyObject_Free(self);")
This seems to work. Or, at least, I haven't had a reproducible crash yet:-) I found this the hardest part of the whole PEP252/PEP253 business to get right, as the only explanation in the documentation and the PEPs is basically "do the alloc/new/init logic in reverse". -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From mal@lemburg.com Fri Jan 3 21:10:43 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 03 Jan 2003 22:10:43 +0100 Subject: [Python-Dev] Holes in time In-Reply-To: <200301031839.h03Idil14232@odiug.zope.com> References: <200301030412.h034Cw207328@pcp02138704pcs.reston01.va.comcast.net> <3E157236.4000004@lemburg.com> <200301031528.h03FSuK12872@odiug.zope.com> <3E15D626.2030901@lemburg.com> <200301031839.h03Idil14232@odiug.zope.com> Message-ID: <3E15FC53.5070006@lemburg.com> Guido van Rossum wrote: >>Now, for current time zones, the C lib is a good source >>of information, so I don't see why you wouldn't want to >>use it. > > It seems you haven't been following this discussion. The issue is not > how to get information about timezones. The issue is, given an API > that implements an almost-but-not-quite-reversible function from local > time to UTC, how to invert that function. I am not talking about how to get the current timezone; the C lib APIs provide functions to convert between local time and UTC -- that's what I was referring to.
Local time is all about timezones and DST which is why conversions between UTC and local time always have to deal with timezones and DST. > Please go read the datetime Wiki before commenting further in this > thread. Sorry to have bothered you, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tismer@tismer.com Fri Jan 3 21:14:37 2003 From: tismer@tismer.com (Christian Tismer) Date: Fri, 03 Jan 2003 22:14:37 +0100 Subject: [Python-Dev] map, filter, reduce, lambda References: <200301022148.h02Lm6K03766@odiug.zope.com> <071c01c2b2ab$7e0677a0$6d94fea9@newmexico> <200301022236.h02MaXj04046@odiug.zope.com> <3E14D76D.4020400@tismer.com> <15892.57979.261663.613025@montanaro.dyndns.org> <200301030152.h031qIM06632@pcp02138704pcs.reston01.va.comcast.net> <15892.62265.348753.317968@montanaro.dyndns.org> <2mof6ywl0o.fsf@starship.python.net> <15893.42252.365991.851873@montanaro.dyndns.org> <200301031630.h03GUQs13259@odiug.zope.com> <20030103171458.GA10262@glacier.arctrix.com> <200301031735.h03HZqo13797@odiug.zope.com> Message-ID: <3E15FD3D.7070301@tismer.com> Guido van Rossum wrote: >>Guido van Rossum wrote: >> >>>The parser is intentionally dumb so that the workings of the parser >>>are easy to understand to users who care. >> >>How about the grammar? Is it simple purely so the parser was easier to >>write? Personally, I have a theory that the one of the main reasons >>Python considered readable is because parsing it doesn't require more >>than one token of look ahead. 
> > > Good point -- the grammar is also intentionally dumb for usability > (although there are a few cases where the grammar has to be > complicated and a second pass is necessary to implement features that > the dumb parser cannot handle, like disambiguating "x = y" from "x = y > = z", and detecting keyword arguments). > > One of my personal theories (fed by a comment here by someone whose > name I don't recall right now) is that, unlike in other languages, the > fact that so little happens at compile time is a big bonus to > usability. The intentional dumbness of the parser and the grammar are not only good for people who want to understand the implementation. I believe that this simplicity is also one reason why reading Python code seems to be so easy for humans: Although a parser works differently than a human brain (which uses more look-ahead and whole-line perception of course), the lack of complicated analysis necessary while reading also consumes much less concentration than other languages do. I don't say people cannot read complicated languages. But it is a waste of resources if that cognitive power can be used for the real problem. Most probably another reason why Python is said to "Step back behind the problem". No idea whether this was a planned principle by you or if it happened to be a consequence of other simplicity, Python code is simplest to read and simplest to think. As a side note, list comprehensions with a couple of appended "for..." and "if..." phrases might in fact be a bit controversial, when I think of the above. Instead of Python's straight simplicity with "no surprises to come", these constructs are a little upside-down, with a lot to remember until getting it all. So the advice to keep these appendices as short as possible is necessary; it is a little similar to reading RPN code. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break!
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tim.one@comcast.net Fri Jan 3 21:21:34 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 03 Jan 2003 16:21:34 -0500 Subject: [Python-Dev] GC at exit? In-Reply-To: <20030103160937.GA1761@cthulhu.gerg.ca> Message-ID: [Greg Ward] > As I recall (and this knowledge dates back to 1997 or so, so could well > be obsolete), Perl does a full GC run at process exit time. In normal > contexts this is irrelevant; I believe the justification was to clean up > resources used by an embedded interpreter. See http://tinyurl.com/41wy searching down for "Two-Phased Garbage Collection". Perl's mark-&-sweep phase runs when a thread terminates, which includes the main thread exiting, but isn't really aimed at that. From jepler@unpythonic.net Fri Jan 3 21:49:39 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Fri, 3 Jan 2003 15:49:39 -0600 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <20030103170309.GJ17390@unpythonic.net> References: <5A3A8B72-1F24-11D7-A03A-000393877AE4@codefab.com> <20030103170309.GJ17390@unpythonic.net> Message-ID: <20030103214938.GL17390@unpythonic.net> On Fri, Jan 03, 2003 at 11:03:22AM -0600, Jeff Epler wrote: > On Fri, Jan 03, 2003 at 09:05:02AM -0500, Bill Bumgarner wrote: > > On Friday, Jan 3, 2003, at 00:42 US/Eastern, Gisle Aas wrote: > > >>'anon' sounds like a great name -- unlikely to be used, shorter than > > >>'lambda', and a heck of lot more indicative as to what is going on. > > >'anon' does not sound that great to me. Anon what? There is lots of > > >anonymous stuff. Arc is going for 'fn'. I would vote for 'sub' :) > > > > Either makes more sense than 'lambda'. 
I prefer 'anon' because it is a
> > very common abbreviation for 'anonymous' and because it would have
> > reduced the scarring during the learning of lambda calculus in CS so
> > many years ago.
> >
> > 'fn' and 'sub' don't seem to be much differentiated from 'def'. 'fun'
> > would be better than 'fn', though, because that's what lambda functions
> > are...
>
> Of course, you could just extend the syntax of 'def'. the 'funcdef'
> statement remains as now:
>     funcdef: 'def' NAME parameters ':' suite
>     suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
>
> but the 'anon_funcdef' expression would be
> something like
>     anon_funcdef: 'def' parameters ':' suite
> No new keyword needs to be introduced, and the fact that they have the
> same name is an even bigger encouragement to identify funcdef and
> anon_funcdef as working in the same way. (I don't think any of these
> problems are related specifically to the use of 'def' for both anonymous
> and named functions, but I'll use my own suggested spelling throughout
> the following)

Okay, apparently I'm wrong about there being no ambiguity in the
grammar. See another subthread where Guido points this out. At least,
I'm assuming he's right here.

But if we decide that we want anonymous functions that work like
expressions, mightn't we decide we want anonymous classes that work
like expressions? For instance, instead of

    def run_server(server_class, socket_class, *args):
        class Server(server_class, socket_class): pass
        Server(*args).run()
    run_server(DNSServer, UDPSocket)

you would write

    def run_server(server_class, socket_class, *args):
        (class (server_class, socket_class): pass)(*args).run()

Obviously I've crossed over into the land of the insane here, but I
lack the insight to understand why (one-liner) anonymous functions are
so useful to many Python programmers while a one-liner anonymous class
seems like such a Frankenstein monster.
Of course, right now you could write def run_server(server_class, socket_class, *args): new.classobj("", (server_class, socket_class), {})(*args).run() but I suspect my first alternative is still preferred. Not being a java user I get the impression that little anonymous classes (or at least classes whose name is not important) are sometimes declared but that this is due to some kind of language wart more than a real desire to do this. Is this executive summary accurate? ("inner classes" or somesuch name?) Jeff From tim@zope.com Fri Jan 3 21:51:58 2003 From: tim@zope.com (Tim Peters) Date: Fri, 3 Jan 2003 16:51:58 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <20030103154755.GA14091@panix.com> Message-ID: [Aahz] > ... > Hmmm.... Maybe we don't have an argument. We don't, unless you're claiming there isn't ambiguity at DST switch points. > What *does* the current implementation do when it hits the witching > hour of DST->ST? Please read the original msg again. Nothing that followed has been of much help, and I've really had nothing new to say since then -- it was all there at the start (including the answer to this question). > So far, I've been saying that it should return 1:MM twice when converting > from UTC; if it already does that, I'm fine. It doesn't now, but I believe it will be reasonably cheap to do so, and that's what Guido wants it to do too. Provided that's the defined and documented behavior, fine by me too. Thanks for pushing for it! (Right or wrong, it fosters the illusion of consensus .) > ... > The only case where I've advocated raising an exception was attempting > to convert a pure wall clock time to any timezone-based time. The datetime module doesn't have a class named "PureWallClockTime" . Seriously, the original msg posed focused questions about how a specific class method should act in its debatable endcases, and anything beyond that is out-of-scope for me here. > (As in the case of Guido's calendar entries.) 
That would raise an
> exception for all times, not just DST change days.

Under the agreement, Guido's calendar entries will display as 1:MM in
this case, if a programmer uses datetimetz.astimezone() to convert
them. It will never raise an exception merely for "I don't think you
should do that" reasons, or even for "but the hour can't be spelled
unambiguously in your time zone" reasons.

> ...
> I'm starting to think that the current design is incomplete for tzinfo
> classes that model internal DST changes.

Yes. Without a moral equivalent to struct tm's tm_isdst flag (which
datetime does not have), it's necessarily incomplete, and that has
consequences for two (and only two) hours per year in hybrid
(standard+daylight) tzinfo classes. One consequence seems trivial
(assigning "a meaning" to the non-extant 2:MM local hour at DST
start); the importance of the other (an hour that is unspellable in
local time at DST end) varies by app, and seemingly by what mood
someone is in when they think about it.

From guido@python.org Fri Jan 3 22:04:18 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 03 Jan 2003 17:04:18 -0500
Subject: [Zope3-dev] Re: [Python-Dev] Holes in time
In-Reply-To: Your message of "Fri, 03 Jan 2003 14:33:19 EST." <3E15E57F.2000407@zope.com>
References: Your message of "Fri, 03 Jan 2003 19:27:50 +0100." <3E15D626.2030901@lemburg.com> <200301031839.h03Idil14232@odiug.zope.com> <3E15E57F.2000407@zope.com>
Message-ID: <200301032204.h03M4IV26039@odiug.zope.com>

> > The issue is, given an API that implements an
> > almost-but-not-quite-reversible function from local time to UTC,
> > how to invert that function.
>
> I have to admit I've also had trouble following the discussion also,
> until you described it that way. Correct me if I'm wrong: you need the
> inverse of a mathematical function.
>
>     utc_time = f(local_time)
>
> I'll represent the inverse as "g":
>
>     local_time = g(utc_time)
>
> The graph of "f" looks like an upward slope, but it has little holes and
> overlaps at daylight savings boundaries. The inverse is almost as
> strange, with periodic small jumps up and down.

Yes, exactly. The bizarre thing is that g() is a true function, with
a shape like this:

[ASCII graph of g(): x-axis marked A, B, C, D; y-axis marked p, q, t, u; '*' marks points on the curve, 'o' marks interval end points]

Here the x-axis is UTC, and the y-axis is local time. The 'o' points
are a feeble attempt at drawing the end points of a half-open
interval.
Points A and B mark vertical lines at the DST switch points:
DST is active between A and B (and again between C and D, etc.).

This makes f(), its inverse, not quite a true function in the
mathematical sense: in [p, q) it has no value, and in [t, u) it is
two-valued. (To see f()'s graph, just transpose the above graph in
your head. :-)

But our tzinfo implementation in fact implements f(), and makes it
into a real function by assigning some output value to inputs in
[p, q) and by picking one of the two possible outputs for inputs in
the range [t, u).

Now when we want to translate from UTC to local time, we have to
recover the parameters of g() by interpreting f()!

(There's more, but I don't want to spend all day writing this.)

> You'd like to use the time zone information provided by the C
> library in the computation of "f", but the C library doesn't quite
> provide all the information you need to compute "g" correctly. With
> those requirements, your only hope of computing "g" is to make some
> assumptions about "f".

Yes, except the C library doesn't enter into it.

> That sounds perfectly reasonable, but may I suggest moving the
> assumption by changing the interface of the tzinfo class. The
> utcoffset() method leads one to naively assume that functions f and g
> can both depend reliably on utcoffset(). Instead, tzinfo might have two
> methods, to_local(utc_date) and to_utc(local_date). That way, the
> tzinfo object encapsulates the madness.

This is similar to Tim's suggestion of letting the tzinfo subclass
implement fromutc(). We may have to do this.

> One downside is that then you can't expect normal programmers to write a
> correct tzinfo based on the C libraries. They'll never get it right.
> :-) It would have to be supplied with Python.

This is one of the reasons against it; we can't possibly supply
timezone implementations for every country, so we really need people
to write their own.
Maybe this is what Marc-Andre was hinting at: apparently mxDateTime knows how to access the C library's timezone tables. --Guido van Rossum (home page: http://www.python.org/~guido/) From rjones@ekit-inc.com Fri Jan 3 22:22:52 2003 From: rjones@ekit-inc.com (Richard Jones) Date: Sat, 4 Jan 2003 09:22:52 +1100 Subject: [Python-Dev] PEP 301 implementation checked in In-Reply-To: References: Message-ID: <200301040922.53010.rjones@ekit-inc.com> On Sat, 4 Jan 2003 3:11 am, Andrew Kuchling wrote: > You can comment on the web site, too, but the site can be updated > independently of the Python code so there's less pressure to get it > finished before 2.3final. I intend to make the web interface conform to the python.org look before it goes live. Richard From DavidA@ActiveState.com Fri Jan 3 23:26:34 2003 From: DavidA@ActiveState.com (David Ascher) Date: Fri, 03 Jan 2003 15:26:34 -0800 Subject: [Python-Dev] Directories w/ spaces - pita In-Reply-To: <200301032052.h03KqUT05587@odiug.zope.com> References: <200301032052.h03KqUT05587@odiug.zope.com> Message-ID: <3E161C2A.8000702@ActiveState.com> Guido van Rossum wrote: >>i'd like to ask why there are files with names containing space in the >>source tarball? even though you don't use the mac stuff, and could >>possibly just do rm -fr Mac, sometimes you'd like to keep it for >>completeness. not a big issue, but it annoys me every time i have to >>check in a new release into the pythonware repository. i'd like to see a >>policy not to use anything but hmm... a-zA-Z0-9.-_ in file names. > > > Me too, but this is up to the Mac developers, and they really like > their spaces. :-( While we're on the topic, getting rid of the "..." in some of the mac files would make it possible to check the Mac branch of the tree in Perforce, where ... is magic. 
--david From aahz@pythoncraft.com Fri Jan 3 23:41:31 2003 From: aahz@pythoncraft.com (Aahz) Date: Fri, 3 Jan 2003 18:41:31 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: References: <20030103154755.GA14091@panix.com> Message-ID: <20030103234131.GA26356@panix.com> On Fri, Jan 03, 2003, Tim Peters wrote: > [Aahz] >> >> What *does* the current implementation do when it hits the witching >> hour of DST->ST? > > Please read the original msg again. Nothing that followed has been of > much help, and I've really had nothing new to say since then -- it was > all there at the start (including the answer to this question). Your love of mathematical precision sometimes gets in the way of clear answers to specific questions. ;-) (Yes, I saw the bit about ValueError earlier, but it was implied, not explicit.) >> So far, I've been saying that it should return 1:MM twice when converting >> from UTC; if it already does that, I'm fine. > > It doesn't now, but I believe it will be reasonably cheap to do so, > and that's what Guido wants it to do too. Provided that's the defined > and documented behavior, fine by me too. Thanks for pushing for it! > (Right or wrong, it fosters the illusion of consensus .) Cool! >> The only case where I've advocated raising an exception was attempting >> to convert a pure wall clock time to any timezone-based time. > > The datetime module doesn't have a class named "PureWallClockTime" > . Seriously, the original msg posed focused questions about > how a specific class method should act in its debatable endcases, and > anything beyond that is out-of-scope for me here. Uh, it sure looks to me like timetz defaults tzinfo to None, which I'd call "pure wall clock". But you're probably right that it's out-of-scope for this discussion -- I only brought it up because you mentioned Guido's calendar. (Nevertheless, I can't resist continuing the argument below.) >> (As in the case of Guido's calendar entries.) 
That would raise an >> exception for all times, not just DST change days. > > Under the agreement, Guido's calendar entries will display as 1:MM in > this case, if a programmer uses datetimetz.astimezone() to convert > them. ... No, they shouldn't, assuming Guido's calendar entries are built out of timetz instances with tzinfo set to None. Remember that Guido sticks in entries for 1pm in SF while he's still in DC. If you're going to handle this use case, there needs to be a way to spell it, and someone trying to convert this to a timezone ought to get an exception. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "There are three kinds of lies: Lies, Damn Lies, and Statistics." --Disraeli From shane@zope.com Fri Jan 3 23:42:12 2003 From: shane@zope.com (Shane Hathaway) Date: Fri, 03 Jan 2003 18:42:12 -0500 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time References: Your message of "Fri, 03 Jan 2003 19:27:50 +0100." <3E15D626.2030901@lemburg.com> <200301031839.h03Idil14232@odiug.zope.com> <3E15E57F.2000407@zope.com> <200301032204.h03M4IV26039@odiug.zope.com> Message-ID: <3E161FD4.6010106@zope.com> Guido van Rossum wrote: > Now when we want to translate from UTC to local time, we have to > recover the parameters of g() by interpreting f()! > > (There's more, but I don't want to spend all day writing this.) That was a great ASCII art graph, though. It made me smile. :-) > Maybe this is what Marc-Andre was hinting at: apparently mxDateTime > knows how to access the C library's timezone tables. I was working on a possible solution when I stumbled across the fact that the current tzinfo documentation doesn't seem to specify whether the dst() method expects the "dt" argument to be in terms of UTC or local time. Sometimes when working on the problem I assumed dt was in UTC, making the conversion from UTC to local time easy, and at other times I assumed dt was in local time, making the conversion from local time to UTC easy. Which is it? 
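The two readings can be put side by side in a minimal sketch. It is not taken from the datetime sandbox: the function names and the hard-coded 2002 US Eastern switch points are illustrative assumptions, and naive datetime objects stand in for both UTC and local wall-clock values. With a rule keyed on local time, local-to-UTC is a single subtraction; with a rule keyed on UTC, UTC-to-local is a single addition.

```python
from datetime import datetime, timedelta

EST = timedelta(hours=-5)
EDT = timedelta(hours=-4)

# US Eastern DST window for 2002, written both ways (illustrative constants).
LOCAL_START, LOCAL_END = datetime(2002, 4, 7, 2), datetime(2002, 10, 27, 2)
UTC_START, UTC_END = datetime(2002, 4, 7, 7), datetime(2002, 10, 27, 6)

def offset_from_local(local_dt):
    """UTC offset when the rule is keyed on local wall-clock time."""
    return EDT if LOCAL_START <= local_dt < LOCAL_END else EST

def offset_from_utc(utc_dt):
    """UTC offset when the rule is keyed on UTC."""
    return EDT if UTC_START <= utc_dt < UTC_END else EST

def local_to_utc(local_dt):
    # Easy direction for a local-time rule: one lookup, one subtraction.
    return local_dt - offset_from_local(local_dt)

def utc_to_local(utc_dt):
    # Easy direction for a UTC rule: one lookup, one addition.
    return utc_dt + offset_from_utc(utc_dt)

summer_noon_utc = local_to_utc(datetime(2002, 7, 1, 12))
before_switch = utc_to_local(datetime(2002, 10, 27, 5, 30))
after_switch = utc_to_local(datetime(2002, 10, 27, 6, 30))
```

In this sketch the two UTC instants on either side of the October switch both come out as 1:30 local time, which is the "1:MM twice" behaviour discussed earlier in the thread.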
Once that's decided, it seems like the "hard" case (whichever is the hard one) could be solved by first computing the UTC offset at the time requested, then computing the UTC offset at a time adjusted by the offset. If the two computed offsets are different, you know you've straddled a daylight savings boundary, and maybe the second offset is the correct offset. That's just a guess. feebly-trying-to-catch-up-to-tims-genius-ly y'rs, Shane From Jack.Jansen@oratrix.com Fri Jan 3 23:45:14 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Sat, 4 Jan 2003 00:45:14 +0100 Subject: [Python-Dev] Directories w/ spaces - pita In-Reply-To: <200301032052.h03KqUT05587@odiug.zope.com> Message-ID: <67D1E98C-1F75-11D7-8857-000A27B19B96@oratrix.com> On vrijdag, jan 3, 2003, at 21:52 Europe/Amsterdam, Guido van Rossum wrote: > Me too, but this is up to the Mac developers, and they really like > their spaces. :-( There's two issues, really (or maybe three, now that I've numerated them:-): - For things that are not end user visible we shouldn't use spaces. So, if distutils removes spaces from the uname output I'm all for it. - For things that are end user visible we should try very hard to keep the spaces in. This goes especially for the IDE plugins. - For some end-user visible things this is plain impossible, due to the Python build process. For example, the MacPython-OSX user-visible applications should ideally be in a folder "MacPython 2.2", but trying to explain this to "make" leads to a completely unintelligible tangle of backslashes. So, you shout "I HATE UNIX!! I HATE UNIX!!" a couple of times (don't bother replying to that unless you either know me personally or have also happily used unix for more than 25 years:-) and put a hyphen in the folder name. 
--
- Jack Jansen http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From Jack.Jansen@oratrix.com Fri Jan 3 23:49:58 2003
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Sat, 4 Jan 2003 00:49:58 +0100
Subject: [Python-Dev] Directories w/ spaces - pita
In-Reply-To: <3E161C2A.8000702@ActiveState.com>
Message-ID: <10E7366A-1F76-11D7-8857-000A27B19B96@oratrix.com>

On zaterdag, jan 4, 2003, at 00:26 Europe/Amsterdam, David Ascher wrote:
> While we're on the topic, getting rid of the "..." in some of the mac
> files would make it possible to check the Mac branch of the tree in
> Perforce, where ... is magic.

:-)
Originally, there was no three dots "..." but an ellipses "…", which is
the Mac way of saying a dialog shows up when you select that menu
entry. After much gnashing of teeth, wailing and pulling out of hair we
followed Guido's dictum that non-ascii was a no-no in the Python
repository...

--
- Jack Jansen http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From guido@python.org Sat Jan 4 00:05:38 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 03 Jan 2003 19:05:38 -0500
Subject: [Zope3-dev] Re: [Python-Dev] Holes in time
In-Reply-To: Your message of "Fri, 03 Jan 2003 18:42:12 EST." <3E161FD4.6010106@zope.com>
References: Your message of "Fri, 03 Jan 2003 19:27:50 +0100." <3E15D626.2030901@lemburg.com> <200301031839.h03Idil14232@odiug.zope.com> <3E15E57F.2000407@zope.com> <200301032204.h03M4IV26039@odiug.zope.com> <3E161FD4.6010106@zope.com>
Message-ID: <200301040005.h0405cb09856@pcp02138704pcs.reston01.va.comcast.net>

> I was working on a possible solution when I stumbled across the fact
> that the current tzinfo documentation doesn't seem to specify whether
> the dst() method expects the "dt" argument to be in terms of UTC or
> local time.
Sometimes when working on the problem I assumed dt was in > UTC, making the conversion from UTC to local time easy, and at other > times I assumed dt was in local time, making the conversion from local > time to UTC easy. Which is it? Local time. This is the source of most problems! > Once that's decided, it seems like the "hard" case (whichever is the > hard one) could be solved by first computing the UTC offset at the time > requested, then computing the UTC offset at a time adjusted by the > offset. If the two computed offsets are different, you know you've > straddled a daylight savings boundary, and maybe the second offset is > the correct offset. That's just a guess. You're slowly rediscovering the guts of datetimetz.astimezone(). Have a look at the python code in python/nondist/sandbox/datetime/dateyime.py before you go any further. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim@zope.com Sat Jan 4 00:47:02 2003 From: tim@zope.com (Tim Peters) Date: Fri, 3 Jan 2003 19:47:02 -0500 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time In-Reply-To: <3E161FD4.6010106@zope.com> Message-ID: [Shane Hathaway] > I was working on a possible solution when I stumbled across the fact > that the current tzinfo documentation doesn't seem to specify whether > the dst() method expects the "dt" argument to be in terms of UTC or > local time. I'm not keeping the plain-text docs up to date anymore, but the doc.txt under Zope3's src/datetime/ sez: An instance of (a concrete subclass of) tzinfo can be passed to the constructors for datetimetz and timetz objects. The latter objects view their fields as being in local time, and the tzinfo object supports ... ... These methods are called by a datetimetz or timetz object, in response to their methods of the same names. A datetimetz object passes itself as the argument, and a timetz object passes None as the argument. 
A tzinfo subclass's methods should therefore be prepared to accept a dt argument of None, or of class datetimetz. When None is passed, it's up to the class designer to decide the best response. For example, ... ... When a datetimetz object is passed in response to a datetimetz method, dt.tzinfo is the same object as self. tzinfo methods can rely on this, unless user code calls tzinfo methods directly. The intent is that the tzinfo methods interpret dt as being in local time, and not need to worry about objects in other timezones. So I can't tell you what *your* dst() method should expect if you call it directly, but I can (and do) tell you that whenever the implementation calls a tzinfo method by magic, the argument will be None, or a datetimetz with a matching tzinfo member and is to be viewed as local time (hmm -- perhaps the distinction between self's notion of local time and your own notion of local time remains unclear). > ... > Once that's decided, it seems like the "hard" case (whichever is the > hard one) could be solved by first computing the UTC offset at the time > requested, then computing the UTC offset at a time adjusted by the > offset. If the two computed offsets are different, you know you've > straddled a daylight savings boundary, and maybe the second offset is > the correct offset. That's just a guess. For a formal proof , see the long comment at the end of Zope3's src/datetime/_datetime.py (which I keep in synch with the Python sandbox version Guido pointed you at). > feebly-trying-to-catch-up-to-tims-genius-ly y'rs, Shane It's a lousy 3-segment step function. This isn't genius, it's just a stubborn refusal to give up <0.7 wink>. 
if-you're-on-an-irritating-project-it's-energizing-to-attack-
    a-piece-of-it-you-hate-ly y'rs - tim

From nas@python.ca Sat Jan 4 01:18:13 2003
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 3 Jan 2003 17:18:13 -0800
Subject: [Zope3-dev] Re: [Python-Dev] Holes in time
In-Reply-To:
References: <3E161FD4.6010106@zope.com>
Message-ID: <20030104011813.GA11598@glacier.arctrix.com>

Tim Peters wrote:
> It's a lousy 3-segment step function. This isn't genius, it's just a
> stubborn refusal to give up <0.7 wink>.

Maybe you should just move to Saskatchewan.

home-of-the-enlightened-ly y'rs Neil

From Raymond Hettinger

For about a month, Martijn Pieters has had a pending feature request
(with an attached patch) for a class that extends the Differ class in
the difflib module. It creates unidiffs from two sequences of lines.

I think it would be a useful addition to Py2.3. If everyone agrees,
then I'll clean up the code, integrate it into the module, add docs,
and write a thorough test suite.

See: www.python.org/sf/635144 .

Raymond Hettinger

From skip@pobox.com Sat Jan 4 04:44:55 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 3 Jan 2003 22:44:55 -0600
Subject: [Python-Dev] Misc/*-NOTES?
Message-ID: <15894.26311.9593.933548@montanaro.dyndns.org>

In looking for the place to add Andrew's info about binutils I stumbled upon
the following four files in Misc:

    Misc% ls -l *NOTES
    -rw-rw-r-- 1 skip staff 7517 Oct 26 2000 AIX-NOTES
    -rw-rw-r-- 1 skip staff 1751 Jun 11 2002 AtheOS-NOTES
    -rw-rw-r-- 1 skip staff 1436 Apr 10 2001 BeOS-NOTES
    -rw-rw-r-- 1 skip staff 1065 Jul 19 1997 HPUX-NOTES

I merged the AtheOS notes into the README file and removed the
AtheOS-NOTES file. I sent a note to Donn Cave asking him if he has any
updates for the BeOS notes. The HPUX-NOTES and AIX-NOTES files seem
particularly old. I'm tempted to simply dump them. Any comments?
Skip

From tim.one@comcast.net Sat Jan 4 06:44:27 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 04 Jan 2003 01:44:27 -0500
Subject: [Zope3-dev] Re: [Python-Dev] Holes in time
In-Reply-To: <3E15E57F.2000407@zope.com>
Message-ID:

[Shane Hathaway]
> ...
> That sounds perfectly reasonable, but may I suggest moving the
> assumption by changing the interface of the tzinfo class. The
> utcoffset() method leads one to naively assume that functions f and g
> can both depend reliably on utcoffset(). Instead, tzinfo might
> have two methods, to_local(utc_date) and to_utc(local_date). That
> way, the tzinfo object encapsulates the madness.

I think we may need from_utc() before this is over, but that most people
won't have any need for it. In the other direction, it's already the tzinfo
subclass author's responsibility to ensure that the current:

    d - d.utcoffset()

yields exactly the same date and time members as would the hypothesized:

    d.to_utc()

> One downside is that then you can't expect normal programmers to
> write a correct tzinfo based on the C libraries. They'll never get
> it right. :-) It would have to be supplied with Python.

I doubt the latter will happen, and it certainly won't happen for 2.3.

The current scheme has actually become about as easy as it can become. From
the next iteration of the docs, here's a full implementation of a class for
DST-aware major US time zones (using the rules that have been in effect for
more than a decade):

"""
from datetime import tzinfo, timedelta, datetime

ZERO = timedelta(0)
HOUR = timedelta(hours=1)

def first_sunday_on_or_after(dt):
    days_to_go = 6 - dt.weekday()
    if days_to_go:
        dt += timedelta(days_to_go)
    return dt

# In the US, DST starts at 2am (standard time) on the first Sunday in
# April.
DSTSTART = datetime(1, 4, 1, 2)
# and ends at 2am (DST time; 1am standard time) on the last Sunday
# of October, which is the first Sunday on or after Oct 25.
DSTEND = datetime(1, 10, 25, 2)

class USTimeZone(tzinfo):

    def __init__(self, hours, reprname, stdname, dstname):
        self.stdoffset = timedelta(hours=hours)
        self.reprname = reprname
        self.stdname = stdname
        self.dstname = dstname

    def __repr__(self):
        return self.reprname

    def tzname(self, dt):
        if self.dst(dt):
            return self.dstname
        else:
            return self.stdname

    def utcoffset(self, dt):
        return self.stdoffset + self.dst(dt)

    def dst(self, dt):
        if dt is None or dt.tzinfo is None:
            # An exception may be sensible here, in one or both cases.
            # It depends on how you want to treat them. The astimezone()
            # implementation always passes a datetimetz with
            # dt.tzinfo == self.
            return ZERO
        assert dt.tzinfo is self

        # Find first Sunday in April & the last in October.
        start = first_sunday_on_or_after(DSTSTART.replace(year=dt.year))
        end = first_sunday_on_or_after(DSTEND.replace(year=dt.year))

        # Can't compare naive to aware objects, so strip the timezone
        # from dt first.
        if start <= dt.replace(tzinfo=None) < end:
            return HOUR
        else:
            return ZERO

Eastern = USTimeZone(-5, "Eastern", "EST", "EDT")
Central = USTimeZone(-6, "Central", "CST", "CDT")
Mountain = USTimeZone(-7, "Mountain", "MST", "MDT")
Pacific = USTimeZone(-8, "Pacific", "PST", "PDT")
"""

The test suite beats the snot out of this class, and .astimezone() behaves
exactly as we've talked about here in all cases now, whether Eastern or
Pacific (etc) are source zones or target zones or both. But the coding is
really quite simple, doing nothing more nor less than implementing "the
plain rules". (BTW, note that no use is made of the platform C time
functions here.)

A similar class for European rules can be found in EU.py in the Python
datetime sandbox, and is just as straightforward (relative to the complexity
inherent in those rules).
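As a quick check of the date arithmetic the class relies on, here is a minimal sketch that evaluates the two switch points for one concrete year. The helper name us_dst_window is hypothetical and not part of the module; its constants mirror DSTSTART and DSTEND above, and the naive datetime values stand for local wall-clock time.

```python
from datetime import datetime, timedelta

def first_sunday_on_or_after(dt):
    # weekday() is 0 for Monday and 6 for Sunday.
    days_to_go = 6 - dt.weekday()
    if days_to_go:
        dt += timedelta(days_to_go)
    return dt

def us_dst_window(year):
    """Return (start, end) of DST in local time under the rules quoted above."""
    start = first_sunday_on_or_after(datetime(year, 4, 1, 2))
    end = first_sunday_on_or_after(datetime(year, 10, 25, 2))
    return start, end

start, end = us_dst_window(2002)
print(start)  # first Sunday in April 2002, 2am
print(end)    # last Sunday in October 2002, 2am
```

For 2002 this gives April 7 and October 27, which are the half-open interval ends that dst() compares against.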
Because the only strong assumption astimezone() makes is that tz.utcoffset(d) - tz.dst(d) # tz's "standard offset" is invariant wrt d, it should work fine for tzinfo subclasses that want to use different switch points in different years, or have multiple DST periods in a year (including none at all in some years), etc. So long as a time zone's "standard offset" depends only on a location's longitude, astimezone() is very likely to do the right thing no matter how goofy the rest of the zone is. So, at the moment, I don't have an actual use case in hand anymore that requires a from_utc() method. astimezone() could be written in terms of it, though: def astimezone(self, tz): self -= self.utcoffset() # as UTC other = self.replace(tzinfo=tz) return other.from_utc() and the tzinfo base class could supply a default from_utc() method capturing the current astimezone() implementation. Then we'd have a powerful hook tzinfo subclasses could override -- but I'm not sure anyone will find a need to! From guido@python.org Sat Jan 4 06:49:33 2003 From: guido@python.org (Guido van Rossum) Date: Sat, 04 Jan 2003 01:49:33 -0500 Subject: [Python-Dev] Misc/*-NOTES? In-Reply-To: Your message of "Fri, 03 Jan 2003 22:44:55 CST." <15894.26311.9593.933548@montanaro.dyndns.org> References: <15894.26311.9593.933548@montanaro.dyndns.org> Message-ID: <200301040649.h046nXQ10784@pcp02138704pcs.reston01.va.comcast.net> > In looking for the place to add Andrew's info about binutils I stumbled upon > the following four files in Misc: > > Misc% ls -l *NOTES > -rw-rw-r-- 1 skip staff 7517 Oct 26 2000 AIX-NOTES > -rw-rw-r-- 1 skip staff 1751 Jun 11 2002 AtheOS-NOTES > -rw-rw-r-- 1 skip staff 1436 Apr 10 2001 BeOS-NOTES > -rw-rw-r-- 1 skip staff 1065 Jul 19 1997 HPUX-NOTES > > I merged the AtheOS notes into the README file and removed it. I sent a > note to Donn Cave asking him if he has any updates for the BeOS notes. The > HPUX-NOTES and AIX-NOTES file seem particularly old. 
I'm tempted to simply > dump them. Any comments? +1 for HPUX-NOTES +0 for AIX-NOTES (some stuff there may still be relevant? some people still have very old systems) Thanks for doing this thankless work! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sat Jan 4 06:57:00 2003 From: guido@python.org (Guido van Rossum) Date: Sat, 04 Jan 2003 01:57:00 -0500 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time In-Reply-To: Your message of "Sat, 04 Jan 2003 01:44:27 EST." References: Message-ID: <200301040657.h046v0Y10818@pcp02138704pcs.reston01.va.comcast.net> > So, at the moment, I don't have an actual use case in hand anymore that > requires a from_utc() method. astimezone() could be written in terms of it, > though: > > def astimezone(self, tz): > self -= self.utcoffset() # as UTC > other = self.replace(tzinfo=tz) > return other.from_utc() > > and the tzinfo base class could supply a default from_utc() method capturing > the current astimezone() implementation. Then we'd have a powerful hook > tzinfo subclasses could override -- but I'm not sure anyone will find a need > to! That's a potentially powerful idea (but call it fromutc() since utcoffset() doesn't have an underscore either). I'd also then perhaps favor the idea of implementing utcoffset() in that base class as returning a fixed standard offset plus whatever dst() returns -- though that requires us to fix the name and type of the offset, and may require a special case for when dst() returns None, so maybe it's not worth it. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Sat Jan 4 07:02:01 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 04 Jan 2003 02:02:01 -0500 Subject: [Python-Dev] Holes in time In-Reply-To: <3E157236.4000004@lemburg.com> Message-ID: [M.-A. Lemburg] > Why don't you take a look at how this is done in mxDateTime ? 
> It has support for the C lib API timegm() (present in many C libs)
> and includes a work-around which works for most cases; even close
> to the DST switch time.
>
> BTW, you should also watch out for broken mktime() implementations
> and whether the C lib supports leap seconds or not.  That has bitten
> me a few times too.

I think there's a relevant difference in datetime:  it makes almost no use
of timestamps.  There is no datetime method that returns a timestamp, for
example.  All we've got are "backward compatibility" constructors that will
build a datetime object from a timestamp, if you insist.  Those use the
platform localtime() and gmtime() functions, and inherit whatever
limitations and problems the C libraries have.  Eek -- that reminds me, I
should add code to clamp out tm_sec values of 60 and 61.  The broken-out
year, month, etc, struct tm members aren't combined again internally either,
as dates and times in this module are stored with distinct year, month, etc
fields.

It's not clear what you mean by "broken mktime() implementations", but the
implementation of datetime never calls the platform mktime().  The test
suite does, though.

From tim.one@comcast.net Sat Jan 4 07:14:01 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 04 Jan 2003 02:14:01 -0500
Subject: [Python-Dev] Holes in time
In-Reply-To: <3E15D626.2030901@lemburg.com>
Message-ID:

[Guido]
>> A goal of the new datetime module is to avoid all dependency on the
>> C library's time facilities -- we must support calculations outside
>> the range that the C library can deal with.

[M.-A. Lemburg]
> I don't see how that can be done for time zones and DST.

You may be missing that datetime supplies no time zone classes, not even a
class for UTC.  What it does provide is an abstract base class (tzinfo), and
a protocol users can follow if they want to supply concrete time zone
subclasses of their own.
datetimetz.astimezone() is a pretty general tz conversion routine that works
with the tzinfo protocol, but datetime supplies no objects astimezone can
work *with* out of the box.  The time zone rules a user can support are thus
whatever can be expressed by arbitrary user-written Python code -- but they
have to write that code themselves (or talk someone else into writing it for
them).

> Timezones and even more the DST settings change more often
> for various locales than you think, so assumptions about the
> offset between UTC and local time for the future as well as
> for historical dates can easily be wrong.

Since the datetime module supplies no concrete time zone objects, it makes
no concrete time zone assumptions (whether about past, present, or future).

> The tz data used by most C libs has tables which account for many
> of the known offsets in the past; they can only guess about
> the future.

A user who wants to use such tables will have to write Python code to read
them.  If they want their code to search the web for updates and incorporate
them on the fly, they can do that too.

> The only usable time scale for historic and future date/time
> is UTC. The same is true if you're interested in date/time
> calculations in terms of absolute time.

Users who buy that can pay for it.  Note that datetime doesn't support years
outside the range 1-9999, so its appeal to astronomers and ancient history
buffs is limited anyway.

> Now, for current time zones, the C lib is a good source
> of information, so I don't see why you wouldn't want to
> use it.

As with all the rest here, users are free to, if that's what they want.
datetime just supplies a framework for what are essentially pluggable time
zone strategy objects.
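
To make the "pluggable time zone strategy object" idea concrete, here is a minimal sketch of a user-written tzinfo subclass. It assumes the tzinfo protocol as it ships in today's standard datetime module (where astimezone() lives on datetime rather than on the datetimetz prototype discussed above); the FixedOffset class and the UTC and EST names are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical zone with a constant UTC offset and no DST."""

    def __init__(self, hours, name):
        self._offset = timedelta(hours=hours)
        self._name = name

    def utcoffset(self, dt):
        # Standard offset plus DST; DST is always zero for this zone.
        return self._offset + self.dst(dt)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return self._name


UTC = FixedOffset(0, "UTC")
EST = FixedOffset(-5, "EST")

noon_utc = datetime(2003, 1, 4, 12, 0, tzinfo=UTC)
# astimezone() relies only on the protocol methods defined above.
noon_in_est = noon_utc.astimezone(EST)
print(noon_in_est)  # 2003-01-04 07:00:00-05:00
```

Because utcoffset() minus dst() is constant here, this zone trivially satisfies the "standard offset is invariant" assumption mentioned elsewhere in the thread.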
From tim.one@comcast.net Sat Jan 4 07:25:18 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 04 Jan 2003 02:25:18 -0500
Subject: [Python-Dev] Holes in time
In-Reply-To: <20030103234131.GA26356@panix.com>
Message-ID:

[Aahz]
> Your love of mathematical precision sometimes gets in the way of
> clear answers to specific questions.  ;-)  (Yes, I saw the bit about
> ValueError earlier, but it was implied, not explicit.)

Given

    The current implementation of dt.astimezone(tz) raises ValueError if
    dt can't be expressed as a local time in tz.

following an explicit example of that case, that's a meaning for "implied"
with which I was previously unacquainted.

> ...
> Uh, it sure looks to me like timetz defaults tzinfo to None, which I'd
> call "pure wall clock".

The docs call this "naive time", probably the same thing.

...

>> Under the agreement, Guido's calendar entries will display as 1:MM in
>> this case, if a programmer uses datetimetz.astimezone() to convert
>> them.

...

> No, they shouldn't, assuming Guido's calendar entries are built out of
> timetz instances with tzinfo set to None.  Remember that Guido sticks
> in entries for 1pm in SF while he's still in DC.

I can't deduce from that whether the generation of Palm that builds its apps
with Python 2.3 will choose to create naive or aware times in its
appointment app.

> If you're going to handle this use case, there needs to be a way to
> spell it,

There are surely many ways to spell it, depending on all sorts of design
criteria that haven't been made explicit here.

> and someone trying to convert this to a timezone ought to get an
> exception.

You can't pass a time object or a timetz object (whether naive or aware
makes no difference) to astimezone(), if that's what you're worried about.
So, sure, you'll get a TypeError exception if you try.
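
The naive/aware distinction above can be checked directly. The following is only a sketch against the datetime module as it is released today, which is an assumption relative to the time/timetz/datetimetz prototype under discussion: there, time objects simply have no astimezone() method, and mixing naive and aware datetimes in an ordering comparison raises TypeError. The variable names are hypothetical.

```python
from datetime import datetime, time, timedelta, timezone

# A time built without tzinfo is naive: tzinfo defaults to None.
appointment = time(13, 0)
is_naive = appointment.tzinfo is None

# In the released module, time objects offer no zone conversion at all.
has_conversion = hasattr(appointment, "astimezone")

naive = datetime(2003, 1, 4, 13, 0)
aware = datetime(2003, 1, 4, 13, 0, tzinfo=timezone(timedelta(hours=-8)))

# Ordering a naive datetime against an aware one is refused outright.
try:
    naive < aware
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False

print(is_naive, has_conversion, mixed_comparison_ok)
```

Whether an appointment application should store naive or aware values remains the design question raised in the message; the sketch only shows how the two kinds behave.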
From bac@OCF.Berkeley.EDU Sat Jan 4 07:34:58 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Fri, 3 Jan 2003 23:34:58 -0800 (PST)
Subject: [Python-Dev] Unidiff tool
In-Reply-To: <002001c2b39b$d0d4b960$125ffea9@oemcomputer>
References: <002001c2b39b$d0d4b960$125ffea9@oemcomputer>
Message-ID:

[Raymond Hettinger]
> For about a month, Martijn Pieters has had a pending feature request
> (with an attached patch) for a class that extends the Differ class in
> the difflib module.  It creates unidiffs from two sequences of lines.
> I think it would be a useful addition to Py2.3.  If everyone agrees,
> then I'll clean up the code, integrate it into the module, add docs,
> and write a thorough test suite.
> See: www.python.org/sf/635144 .
>

+0; no need for it but it definitely wouldn't hurt.

Has it ever been considered to make ``difflib`` output diffs that could be
passed to ``patch``?  That way there would be one less barrier for people
to get over when wanting to help in development.

-Brett

From whisper@oz.net Sat Jan 4 07:41:57 2003
From: whisper@oz.net (David LeBlanc)
Date: Fri, 3 Jan 2003 23:41:57 -0800
Subject: [Zope3-dev] Re: [Python-Dev] Holes in time
In-Reply-To:
Message-ID:

In case it's of interest:

http://www.twinsun.com/tz/tz-link.htm

David LeBlanc
Seattle, WA USA

> -----Original Message-----
> From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On
> Behalf Of Tim Peters
> Sent: Friday, January 03, 2003 22:44
> To: Shane Hathaway
> Cc: zope3-dev@zope.org; PythonDev
> Subject: RE: [Zope3-dev] Re: [Python-Dev] Holes in time
>
>
> [Shane Hathaway]
> > ...
> > That sounds perfectly reasonable, but may I suggest moving the
> > assumption by changing the interface of the tzinfo class.  The
> > utcoffset() method leads one to naively assume that functions f and g
> > can both depend reliably on utcoffset().  Instead, tzinfo might
> > have two methods, to_local(utc_date) and to_utc(local_date).  That
> > way, the tzinfo object encapsulates the madness.
>
> I think we may need from_utc() before this is over, but that most people
> won't have any need for it.  In the other direction, it's already the
> tzinfo subclass author's responsibility to ensure that the current:
>
>     d - d.utcoffset()
>
> yields exactly the same date and time members as would the hypothesized:
>
>     d.to_utc()
>
> > One downside is that then you can't expect normal programmers to
> > write a correct tzinfo based on the C libraries.  They'll never get
> > it right. :-)  It would have to be supplied with Python.
>
> I doubt the latter will happen, and it certainly won't happen for 2.3.
>
> The current scheme has actually become about as easy as it can become.
> From the next iteration of the docs, here's a full implementation of a
> class for DST-aware major US time zones (using the rules that have been
> in effect for more than a decade):
>
> """
> from datetime import tzinfo, timedelta, datetime
>
> ZERO = timedelta(0)
> HOUR = timedelta(hours=1)
>
> def first_sunday_on_or_after(dt):
>     days_to_go = 6 - dt.weekday()
>     if days_to_go:
>         dt += timedelta(days_to_go)
>     return dt
>
> # In the US, DST starts at 2am (standard time) on the first Sunday in
> # April.
> DSTSTART = datetime(1, 4, 1, 2)
> # and ends at 2am (DST time; 1am standard time) on the last Sunday
> # of October, which is the first Sunday on or after Oct 25.
> DSTEND = datetime(1, 10, 25, 2)
>
> class USTimeZone(tzinfo):
>
>     def __init__(self, hours, reprname, stdname, dstname):
>         self.stdoffset = timedelta(hours=hours)
>         self.reprname = reprname
>         self.stdname = stdname
>         self.dstname = dstname
>
>     def __repr__(self):
>         return self.reprname
>
>     def tzname(self, dt):
>         if self.dst(dt):
>             return self.dstname
>         else:
>             return self.stdname
>
>     def utcoffset(self, dt):
>         return self.stdoffset + self.dst(dt)
>
>     def dst(self, dt):
>         if dt is None or dt.tzinfo is None:
>             # An exception may be sensible here, in one or both cases.
>             # It depends on how you want to treat them.  The astimezone()
>             # implementation always passes a datetimetz with
>             # dt.tzinfo == self.
>             return ZERO
>         assert dt.tzinfo is self
>
>         # Find first Sunday in April & the last in October.
>         start = first_sunday_on_or_after(DSTSTART.replace(year=dt.year))
>         end = first_sunday_on_or_after(DSTEND.replace(year=dt.year))
>
>         # Can't compare naive to aware objects, so strip the timezone
>         # from dt first.
>         if start <= dt.replace(tzinfo=None) < end:
>             return HOUR
>         else:
>             return ZERO
>
> Eastern  = USTimeZone(-5, "Eastern",  "EST", "EDT")
> Central  = USTimeZone(-6, "Central",  "CST", "CDT")
> Mountain = USTimeZone(-7, "Mountain", "MST", "MDT")
> Pacific  = USTimeZone(-8, "Pacific",  "PST", "PDT")
> """
>
> The test suite beats the snot out of this class, and .astimezone() behaves
> exactly as we've talked about here in all cases now, whether Eastern or
> Pacific (etc) are source zones or target zones or both.  But the coding is
> really quite simple, doing nothing more nor less than implementing "the
> plain rules".  (BTW, note that no use is made of the platform C time
> functions here.)
>
> A similar class for European rules can be found in EU.py in the Python
> datetime sandbox, and is just as straightforward (relative to the
> complexity inherent in those rules).
>
> Because the only strong assumption astimezone() makes is that
>
>     tz.utcoffset(d) - tz.dst(d)  # tz's "standard offset"
>
> is invariant wrt d, it should work fine for tzinfo subclasses that want to
> use different switch points in different years, or have multiple DST
> periods in a year (including none at all in some years), etc.  So long as
> a time zone's "standard offset" depends only on a location's longitude,
> astimezone() is very likely to do the right thing no matter how goofy the
> rest of the zone is.
>
> So, at the moment, I don't have an actual use case in hand anymore that
> requires a from_utc() method.
> astimezone() could be written in terms of it, though:
>
>     def astimezone(self, tz):
>         self -= self.utcoffset()  # as UTC
>         other = self.replace(tzinfo=tz)
>         return other.from_utc()
>
> and the tzinfo base class could supply a default from_utc() method
> capturing the current astimezone() implementation.  Then we'd have a
> powerful hook tzinfo subclasses could override -- but I'm not sure anyone
> will find a need to!

From tim.one@comcast.net Sat Jan 4 07:53:32 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 04 Jan 2003 02:53:32 -0500
Subject: [Python-Dev] Unidiff tool
In-Reply-To:
Message-ID:

[Brett Cannon]
> Has it ever been considered to make ``difflib`` output diffs that
> could be passed to ``patch``?  That way there would be one less
> barrier for people to get over when wanting to help in development.

If they have CVS, they can make patches properly with cvs diff, and any
developer wannabe who doesn't have CVS is missing a more basic part of the
story than how to make patches.

From bac@OCF.Berkeley.EDU Sat Jan 4 08:34:17 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Sat, 4 Jan 2003 00:34:17 -0800 (PST)
Subject: [Python-Dev] Unidiff tool
In-Reply-To:
References:
Message-ID:

[Tim Peters]
> [Brett Cannon]
> > Has it ever been considered to make ``difflib`` output diffs that
> > could be passed to ``patch``?  That way there would be one less
> > barrier for people to get over when wanting to help in development.
>
> If they have CVS, they can make patches properly with cvs diff, and any
> developer wannabe who doesn't have CVS is missing a more basic part of the
> story than how to make patches.
>

Good point.  =)

-Brett

From martin@v.loewis.de Sat Jan 4 08:53:28 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 04 Jan 2003 09:53:28 +0100
Subject: [Python-Dev] Misc/*-NOTES?
In-Reply-To: <200301040649.h046nXQ10784@pcp02138704pcs.reston01.va.comcast.net>
References: <15894.26311.9593.933548@montanaro.dyndns.org> <200301040649.h046nXQ10784@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

Guido van Rossum writes:

> +0 for AIX-NOTES (some stuff there may still be relevant?  some people
> still have very old systems)

Sounds like an issue for PEP 11: What is the oldest AIX release that
we need to support?

It is probably a bit late for Python 2.3 to add more systems to PEP
11, but I would guess that nobody needs AIX 3 support anymore. It is
not clear whether AIX 4.1 is still used; the current version appears
to be AIX 5L Version 5.2, and IBM still mentions AIX 4.3.3.

Regards,
Martin

From tjw@omnigroup.com Sat Jan 4 12:00:24 2003
From: tjw@omnigroup.com (Timothy J. Wood)
Date: Sat, 4 Jan 2003 04:00:24 -0800
Subject: [Python-Dev] Cross compiling
Message-ID: <1BB1EE3A-1FDC-11D7-99C2-0003933F3BC2@omnigroup.com>

I've been doing some research this evening into getting Python to
cross compile from Mac OS X to MinGW.  With Python 2.2.2, I'm able to
get a ways before trouble crops up -- a parser program is compiled with
the target CC instead of the host CC and won't run locally, and the
configure script wants to include Mac OS X modules and support files
(dy loading stuff, for example).

With the recent 2.3a1 it falls over completely with the standard
'--target=i386-mingw32msvc', but if I contort into:

% CC=/Users/Shared/bungi/MinGW/bin/i386-mingw32msvc-gcc ./configure --build=powerpc-apple-darwin6.3 --host=powerpc-apple-darwin6.3 --target=i386-mingw32msvc --prefix=/tmp/python-win
checking MACHDEP... darwin
checking EXTRAPLATDIR... $(PLATMACDIRS)
checking for --without-gcc... no
checking for --with-cxx=... no
checking for c++... c++
checking for C++ compiler default output... a.out
checking whether the C++ compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for powerpc-apple-darwin6.3-gcc... /Users/Shared/bungi/MinGW/bin/i386-mingw32msvc-gcc
checking for C compiler default output... a.exe
checking whether the C compiler works... configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.

The '--host=i386-mingw32msvc' is wrong, but at least I get farther.
I still run into the 2.2.2-ish problems, though.

(All in all, I'd expect to be able to just use
'--target=i386-mingw32msvc' and have --host and --build default
properly).

At any rate, I'm wondering if anyone is working on this.  There have
been various posts by various people flailing on this for the past two
or three years, so it seems like a moderately common thing to want to
do.  I'd certainly be willing to devote some time to testing stuff, but
I'm no configure guru, so help would be appreciated :)

Thanks!

-tim

From sidnei@x3ng.com Sat Jan 4 12:03:43 2003
From: sidnei@x3ng.com (Sidnei da Silva)
Date: Sat, 4 Jan 2003 10:03:43 -0200
Subject: [Zope3-dev] Re: [Python-Dev] Holes in time
In-Reply-To: <200301032204.h03M4IV26039@odiug.zope.com>
References: <3E15D626.2030901@lemburg.com> <200301031839.h03Idil14232@odiug.zope.com> <3E15E57F.2000407@zope.com> <200301032204.h03M4IV26039@odiug.zope.com>
Message-ID: <20030104120343.GA9158@x3ng.com>

On Fri, Jan 03, 2003 at 05:04:18PM -0500, Guido van Rossum wrote:
| > One downside is that then you can't expect normal programmers to write a
| > correct tzinfo based on the C libraries.  They'll never get it right.
| > :-)  It would have to be supplied with Python.
|
| This is one of the reasons against it; we can't possibly supply
| timezone implementations for every country, so we really need people
| to write their own.

Especially when daylight saving time is not always fixed, as in
Brazil, where last year the govt simply decided that it would start
on a different date to save something like 0.001% more energy :)

--
Sidnei da Silva (dreamcatcher)
X3ng Web Technology
GNU/Linux user 257852
Debian GNU/Linux 3.0 (Sid) 2.4.18 ppc

This login session: $13.76, but for you $11.88.

From tjw@omnigroup.com Sat Jan 4 12:16:19 2003
From: tjw@omnigroup.com (Timothy J. Wood)
Date: Sat, 4 Jan 2003 04:16:19 -0800
Subject: [Python-Dev] Re: Cross compiling
In-Reply-To: <1BB1EE3A-1FDC-11D7-99C2-0003933F3BC2@omnigroup.com>
Message-ID: <5468AC24-1FDE-11D7-99C2-0003933F3BC2@omnigroup.com>

On Saturday, January 4, 2003, at 04:00 AM, Timothy J. Wood wrote:
> The '--host=i386-mingw32msvc' is wrong, but at least I get farther.
> I still run into the 2.2.2-ish problems, though.

Dur.  I've clearly been doing too many gcc cross builds and not enough
of other stuff :)

It still fails, though:

% CC=/Users/Shared/bungi/MinGW/bin/i386-mingw32msvc-gcc ./configure --build=powerpc-apple-darwin6.3 --host=i386-mingw32msvc --prefix=/tmp/python-win

Particularly worrisome portions of the configure output:

checking for --enable-toolbox-glue... yes
checking for --enable-framework... no
checking for dyld... always on for Darwin
checking SO... .so
checking LDSHARED... $(CC) $(LDFLAGS) -bundle -bundle_loader $(BINDIR)/$(PYTHON)
checking CCSHARED...
checking LINKFORSHARED... -u __dummy -u _PyMac_Error -framework System -framework CoreServices -framework Foundation
checking DYNLOADFILE... dynload_next.o

... and finally ...

checking whether setpgrp takes no argument... configure: error: cannot check setpgrp when cross compiling

So, again, if anyone else is working on this, let me know :)

-tim

From guido@python.org Sat Jan 4 14:01:14 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 04 Jan 2003 09:01:14 -0500
Subject: [Python-Dev] Unidiff tool
In-Reply-To: Your message of "Fri, 03 Jan 2003 23:34:58 PST."
References: <002001c2b39b$d0d4b960$125ffea9@oemcomputer>
Message-ID: <200301041401.h04E1Ej11746@pcp02138704pcs.reston01.va.comcast.net>

> Has it ever been considered to make ``difflib`` output diffs that could be
> passed to ``patch``?  That way there would be one less barrier for people
> to get over when wanting to help in development.

+1

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Sat Jan 4 14:18:02 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 04 Jan 2003 09:18:02 -0500
Subject: [Python-Dev] Cross compiling
In-Reply-To: Your message of "Sat, 04 Jan 2003 04:00:24 PST." <1BB1EE3A-1FDC-11D7-99C2-0003933F3BC2@omnigroup.com>
References: <1BB1EE3A-1FDC-11D7-99C2-0003933F3BC2@omnigroup.com>
Message-ID: <200301041418.h04EI2G19410@pcp02138704pcs.reston01.va.comcast.net>

> I've been doing some research this evening into getting Python to
> cross compile from Mac OS X to MinGW.  With Python 2.2.2, I'm able to
> get a ways before trouble crops up -- a parser program is compiled with
> the target CC instead of the host CC and won't run locally, and the
> configure script wants to include Mac OS X modules and support files
> (dy loading stuff, for example).
>
> With the recent 2.3a1 it falls over completely with the standard
> '--target=i386-mingw32msvc', but if I contort into:
>
> % CC=/Users/Shared/bungi/MinGW/bin/i386-mingw32msvc-gcc ./configure
> --build=powerpc-apple-darwin6.3 --host=powerpc-apple-darwin6.3
> --target=i386-mingw32msvc --prefix=/tmp/python-win
> checking MACHDEP... darwin
> checking EXTRAPLATDIR... $(PLATMACDIRS)
> checking for --without-gcc... no
> checking for --with-cxx=... no
> checking for c++... c++
> checking for C++ compiler default output... a.out
> checking whether the C++ compiler works... yes
> checking whether we are cross compiling... no
> checking for suffix of executables...
> checking for powerpc-apple-darwin6.3-gcc...
> /Users/Shared/bungi/MinGW/bin/i386-mingw32msvc-gcc
> checking for C compiler default output... a.exe
> checking whether the C compiler works... configure: error: cannot run C
> compiled programs.
> If you meant to cross compile, use `--host'.
>
> The '--host=i386-mingw32msvc' is wrong, but at least I get farther.
> I still run into the 2.2.2-ish problems, though.
>
> (All in all, I'd expect to be able to just use
> '--target=i386-mingw32msvc' and have --host and --build default
> properly).
>
> At any rate, I'm wondering if anyone is working on this.  There have
> been various post by various people flailing on this for the past two
> or three years, so it seems like a moderately common thing to want to
> do.  I'd certainly be willing to devote some time to testing stuff, but
> I'm no configure guru, so help would be appreciated :)

I remember cross-compiling for the iPAQ, and being reasonably
successful (after making some changes to the Makefile that made it
into CVS for 2.2).  I think I probably didn't bother to get the pgen
build right; I just edited the Makefile after it was generated to
point to a working pgen in a different tree, or I copied a working
pgen in from another tree, or I made sure that the pgen output (which
doesn't change unless you edit Grammar/Grammar!) was newer than
Grammar/Grammar, so the Makefile would never try to invoke pgen in the
first place.

It's quite possible though that recent additions to configure.in have
broken things again -- if you don't test this regularly, it will
succumb to bitrot quickly. :-(

If you can fix it, by all means submit a patch to SF!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From blunck@gst.com Sat Jan 4 17:46:34 2003
From: blunck@gst.com (Christopher Blunck)
Date: Sat, 4 Jan 2003 12:46:34 -0500
Subject: [Python-Dev] tutor(function|module)
Message-ID: <20030104174634.GA6906@homer.gst.com>

Hi all-

I'm new to the mailing list so I'd like to introduce myself.
My name is Christopher Blunck and I'm fairly new to the Python language.
I am a Java developer (please do not mock me) professionally, but use
Python in all of my home projects and the open source projects I
contribute to.

Now on to my question.

Not sure if this is the proper forum to ask this, or if it has even been
discussed before, so apologies in advance if this is inappropriate.

Question #1: How can I contribute to the documentation effort?  As a new
Python developer, I use "help(function)" frequently.  In a lot of cases, I
learn exactly what I need from help(function).  Other times, I don't.  In
those cases I hit google, or my "Python Cookbook", or give a holler to
friends through instant messaging and the answer usually comes.  When I do
figure out something (either by myself or with the help of one of the
resources above), I'd like to contribute that knowledge *back* to the
project by improving what is displayed by help(function).  How can I do so?

Question #2: How about examples(function) that displays multiple examples
of how to use the function provided?  Similar to above, there are times
when I look at the API and even if the parameters to the function are
fully explained, it would help to have an example (case in point is
os.path.walk - docs are great, but an example of how to make use of the
third parameter (the arg you pass) would really clarify why 'the powers
that be' chose to implement this capability).

-c

--
 12:35pm  up 75 days,  3:51,  2 users,  load average: 5.37, 5.14, 5.08

From mal@lemburg.com Sat Jan 4 18:30:26 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 04 Jan 2003 19:30:26 +0100
Subject: [Python-Dev] PEP 297: Support for System Upgrades
In-Reply-To: <3D37DFEA.9070506@lemburg.com>
References: <3D37DFEA.9070506@lemburg.com>
Message-ID: <3E172842.8070501@lemburg.com>

Any chance of getting one of the proposed solutions into Python 2.3?

http://www.python.org/peps/pep-0297.html

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/

From tim.one@comcast.net Sat Jan 4 19:02:00 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 04 Jan 2003 14:02:00 -0500
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib random.py,1.26.6.7,1.26.6.8
In-Reply-To:
Message-ID:

[rhettinger@users.sourceforge.net]
> Modified Files:
>       Tag: release22-maint
> 	random.py
> Log Message:
> Correct long standing bugs in the methods for random distributions.
> The range of u=random() is [0,1), so log(u) and 1/x can fail.
> Fix by setting u=1-random() or by reselecting for a usable value.

These were intentional at the time:  the Wichmann-Hill generator couldn't
actually ever return 0.0 (or 1.0, although proving it couldn't return 1.0
took a bit of effort; that it couldn't return 0.0 was obvious).  Can the
Twister?  I don't know.  It's obvious that it can't return 1.0 on a platform
where C double has at least 53 bits, but it's not obvious that it can't
return 1.0 on a feebler box, and it's not obvious to me whether it can or
can't return 0.0 on any box.  Empirical evidence:  in 10 tries, I didn't see
it produce 0.0 even once.

From neal@metaslash.com Sat Jan 4 19:04:28 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Sat, 4 Jan 2003 14:04:28 -0500
Subject: [Python-Dev] Misc/*-NOTES?
In-Reply-To: <15894.26311.9593.933548@montanaro.dyndns.org>
References: <15894.26311.9593.933548@montanaro.dyndns.org>
Message-ID: <20030104190428.GT29873@epoch.metaslash.com>

On Fri, Jan 03, 2003 at 10:44:55PM -0600, Skip Montanaro wrote:
>
> Misc% ls -l *NOTES
> -rw-rw-r--  1 skip  staff  7517 Oct 26  2000 AIX-NOTES
> -rw-rw-r--  1 skip  staff  1751 Jun 11  2002 AtheOS-NOTES
> -rw-rw-r--  1 skip  staff  1436 Apr 10  2001 BeOS-NOTES
> -rw-rw-r--  1 skip  staff  1065 Jul 19  1997 HPUX-NOTES
>
> The HPUX-NOTES and AIX-NOTES files seem particularly old.  I'm
> tempted to simply dump them.  Any comments?

HPUX-NOTES should go.  Python builds cleanly on the snake-farm HP-UX
boxes.

AIX-NOTES still has some useful info.  I know one place that has
3.2.5 boxes, although I don't know if they build recent Python
versions on 3.2.5.  The notes should probably be updated, but since
the snake-farm AIX boxes can't build Python, I can't update them.

Neal

From guido@python.org Sat Jan 4 19:20:51 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 04 Jan 2003 14:20:51 -0500
Subject: [Python-Dev] PEP 297: Support for System Upgrades
In-Reply-To: Your message of "Sat, 04 Jan 2003 19:30:26 +0100." <3E172842.8070501@lemburg.com>
References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com>
Message-ID: <200301041920.h04JKpm20430@pcp02138704pcs.reston01.va.comcast.net>

> Any chance of getting one of the proposed solution into Python 2.3 ?
>
> http://www.python.org/peps/pep-0297.html

I'd be in favor of having an extra path item in front of the default
path.  Maybe instead of system-packages you could call it
site-upgrades?

We could also add this to Python 2.2.3 when it comes out and to Python
2.1.4 if and when it is released.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From python@rcn.com Sat Jan 4 19:34:47 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sat, 4 Jan 2003 14:34:47 -0500
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib random.py,1.26.6.7,1.26.6.8
References:
Message-ID: <004401c2b428$57cadf80$125ffea9@oemcomputer>

"Tim Peters"
> [rhettinger@users.sourceforge.net]
> > Modified Files:
> >       Tag: release22-maint
> > random.py
> > Log Message:
> > Correct long standing bugs in the methods for random distributions.
> > The range of u=random() is [0,1), so log(u) and 1/x can fail.
> > Fix by setting u=1-random() or by reselecting for a usable value.
>
> These were intentional at the time:  the Wichmann-Hill generator couldn't
> actually ever return 0.0 (or 1.0, although proving it couldn't return 1.0
> took a bit of effort; that it couldn't return 0.0 was obvious).

Shhh, don't tell Fred that it differed from the documented [0,1) range.

Interestingly, the Py2.2 code contained a number of places that used the
1.0-random() step or code for reselecting whenever u<1e-7.  Someone
thought it was important.

Also, folks were supposed to be able to subclass with a different [0,1)
generator and not have it fail.  It happened to me last night and that is
how I found the problem.

> Can the
> Twister?  I don't know.  It's obvious that it can't return 1.0 on a platform
> where C double has at least 53 bits, but it's not obvious that it can't
> return 1.0 on a feebler box, and it's not obvious to me whether it can or
> can't return 0.0 on any box.

Look again, the tempering steps assure that there are about 2000 ways to
produce a pure zero.

> Empirical evidence:  in 10 tries, I didn't see
> it produce 0.0 even once.

Oh, I wish I had your luck ;)

Raymond Hettinger

From martin@v.loewis.de Sat Jan 4 20:51:38 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 04 Jan 2003 21:51:38 +0100
Subject: [Python-Dev] Cross compiling
In-Reply-To: <1BB1EE3A-1FDC-11D7-99C2-0003933F3BC2@omnigroup.com>
References: <1BB1EE3A-1FDC-11D7-99C2-0003933F3BC2@omnigroup.com>
Message-ID:

"Timothy J. Wood" writes:

> At any rate, I'm wondering if anyone is working on this.

To my knowledge, no.

> I'd certainly be willing to devote some time to testing stuff, but
> I'm no configure guru, so help would be appreciated :)

I guess you will have to become one.  Do all the analysis of each
problem you encounter (what is it doing, what should it be doing
instead), and try to find out a way to correct this.  If you feel that
the solutions you come up with are more involved than they should be,
feel free to discuss them on this list.

Regards,
Martin

From bac@OCF.Berkeley.EDU Sat Jan 4 21:05:27 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Sat, 4 Jan 2003 13:05:27 -0800 (PST)
Subject: [Python-Dev] tutor(function|module)
In-Reply-To: <20030104174634.GA6906@homer.gst.com>
References: <20030104174634.GA6906@homer.gst.com>
Message-ID:

[Christopher Blunck]
> I'm new to the mailing list so I'd like to introduce myself.  My name is
> Christopher Blunck and I'm fairly new to the Python language.

Welcome!

> I am a Java
> developer (please do not mock me) professionally, but use Python in all of
> my home projects and the open source projects I contribute to.

The latter half of that sentence will stem the mocking.  =)

> Not sure if this is the proper forum to ask this, or if it has even been
> discussed before, so apologies in advance if this is inappropriate.
>

It's OK.  Question #1 kind of isn't, since how to go about this is already
documented, but Question #2 is okay to ask here.

> Question #1: How can I contribute to the documentation effort?

Look around at http://www.python.org/dev/ .  There is info there about how
to help with the documentation effort and how to do patches to SourceForge.

> Question #2: How about examples(function) that displays multiple examples
> of how to use the function provided?  Similar to above, there are times
> when I look at the API and even if the parameters to the function are
> fully explained, it would help to have an example (case in point is
> os.path.walk - docs are great, but an example of how to make use of the
> third parameter (the arg you pass) would really clarify why 'the powers
> that be' chose to implement this capability.
>

-1

If you look in the source distribution for Python there is a directory
called Demo/ that has just examples of how to use things.  Stuff like what
you are suggesting can go there.  You can also write an examples page for
the documentation.  But I don't think it belongs in the language itself.
If we started doing that we would take on the role of Python Cookbook and
c.l.py and that is more work than we need.  =)

-Brett

From Jack.Jansen@oratrix.com Sat Jan 4 21:16:17 2003
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Sat, 4 Jan 2003 22:16:17 +0100
Subject: [Python-Dev] Need help with test_unicode failure
Message-ID:

I need help with a test_unicode failure in MacPython-OS9.  The
following bit of the test

    try:
        u'\xe2' in 'g\xe2teau'
    except UnicodeError:
        pass
    else:
        print '*** contains operator does not propagate UnicodeErrors'

fails, i.e. it prints that unicode errors are not propagated.

Can anyone give me a hint as to where I should start looking?
--
- Jack Jansen        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From tjw@omnigroup.com Sat Jan 4 21:21:58 2003
From: tjw@omnigroup.com (Timothy J. Wood)
Date: Sat, 4 Jan 2003 13:21:58 -0800
Subject: [Python-Dev] Cross compiling
In-Reply-To: <200301041418.h04EI2G19410@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <8EA20509-202A-11D7-99C2-0003933F3BC2@omnigroup.com>

On Saturday, January 4, 2003, at 06:18 AM, Guido van Rossum wrote:
> It's quite possible though that recent additions to configure.in have
> broken things again -- if you don't test this regularly, it will
> succumb to bitrot quickly. :-(

Right now configure.in doesn't even look close:

  - It uses 'uname' instead of actually obeying --build and --host
  - The files to include in the project are selected via a bunch of
    external scripts and files (makesetup, Setup) that don't seem to
    care about the difference between a Unix and PC box
  - Other configure tests assume you are building for Unix.

All in all, it looks like it will be easier for me to just take the
route of rebuilding with VC++ and the included dsw file when I need.

This will be moderately annoying, since I'm planning on trying to make
a 'secure' Python library that links as few OS services as possible (or
redirects to hooks that I'll provide).  The eventual goal is to have
Python embedded in a game (where users can trade modules written in
Python which therefore might be malicious).

The notes I read about the current state of sandboxing in Python
didn't make me warm and fuzzy -- I probably don't understand it (which
is one reason for the not-warm/not-fuzzy feeling).  I'd feel warm and
fuzzy if the Python library routed through my OS abstraction layer for
everything rather than directly linking to the OS.

> If you can fix it, by all means submit a patch to SF!

I may yet, but since it looks like cross compiling is only marginally
working for Unix->Unix and completely not working for Unix->Win32, this
is above the level of my Python-Fu :)

Thanks!

-tim

From tjw@omnigroup.com Sat Jan 4 21:30:58 2003
From: tjw@omnigroup.com (Timothy J. Wood)
Date: Sat, 4 Jan 2003 13:30:58 -0800
Subject: [Python-Dev] Cross compiling
In-Reply-To: <8EA20509-202A-11D7-99C2-0003933F3BC2@omnigroup.com>
Message-ID:

On Saturday, January 4, 2003, at 01:21 PM, Timothy J. Wood wrote:
> The notes I read about the current state of sandboxing in Python
> didn't make me warm and fuzzy -- I probably don't understand it (which
> is one reason for the not-warm/not-fuzzy feeling).  I'd feel warm and
> fuzzy if the Python library routed through my OS abstraction layer for
> everything rather than directly linking to the OS.

Hmmm... It looks like I was right to feel unsettled by rexec:

http://www.amk.ca/python/howto/rexec/

    Python provides a rexec module for running untrusted code.  However,
    it's never been exhaustively audited for security and it hasn't been
    updated to take into account recent changes to Python such as
    new-style classes.  Therefore, the rexec module should not be trusted.
    To discourage use of rexec, this HOWTO has been withdrawn.

Yikes!

-tim

From bac@OCF.Berkeley.EDU Sat Jan 4 21:47:02 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Sat, 4 Jan 2003 13:47:02 -0800 (PST)
Subject: [Python-Dev] Cross compiling
In-Reply-To:
References:
Message-ID:

[Timothy J. Wood]
> http://www.amk.ca/python/howto/rexec/
>
> Python provides a rexec module for running untrusted code.  However,
> it's never been exhaustively audited for security and it hasn't been
> updated to take into account recent changes to Python such as
> new-style classes.  Therefore, the rexec module should not be trusted.
> To discourage use of rexec, this HOWTO has been withdrawn.
>
> Yikes!
>

Guido recently stated on python-dev that he would like to see rexec just
removed from Python completely since it gives a false sense of security.

If you want to work on it, Gustavo Niemeyer has expressed interest in
attempting to get a rexec module that works.  You might want to contact
him and see if you two could work together on something.
-Brett From cnetzer@mail.arc.nasa.gov Sat Jan 4 21:47:22 2003 From: cnetzer@mail.arc.nasa.gov (Chad Netzer) Date: Sat, 4 Jan 2003 13:47:22 -0800 Subject: [Python-Dev] tutor(function|module) In-Reply-To: References: <20030104174634.GA6906@homer.gst.com> Message-ID: <200301042147.NAA30669@mail.arc.nasa.gov> On Saturday 04 January 2003 13:05, Brett Cannon wrote: > [Christopher Blunck] > > Question #2: How about examples(function) that displays multiple examples > > of how to use the function provided? Similar to above, there are times > > when I look at the API and even if the parameters to the function are > > fully explained, it would help to have an example (case in point is > > os.path.walk - docs are great, but an example of how to make use of the > > third parameter (the arg you pass) would really clarify why 'the powers > > that be' chose to implement this capability. > If you look in the source distribution for Python there is a directory > called Demos/ that has just examples of how to use things. Stuff like > what you are suggesting can go there. Perhaps an importable "examples" module, which could have, well, examples of things. It could and would be a separately maintained module, but would probably be useful to me and others it it just had examples of all the standard library functions (contributed to on a volunteer basis). hmmmmm... Best to move to python-list for followups... -- Bay Area Python Interest Group - http://www.baypiggies.net/ Chad Netzer cnetzer@mail.arc.nasa.gov From Jack.Jansen@oratrix.com Sat Jan 4 22:03:49 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Sat, 4 Jan 2003 23:03:49 +0100 Subject: [Python-Dev] tutor(function|module) In-Reply-To: Message-ID: <6761BBA0-2030-11D7-BF12-000A27B19B96@oratrix.com> On zaterdag, jan 4, 2003, at 22:05 Europe/Amsterdam, Brett Cannon wrote: >> Question #2: How about examples(function) that displays multiple >> examples >> of how to use the function provided? 
Similar to above, there are
>> times
>> when I look at the API and even if the parameters to the function are
>> fully explained, it would help to have an example (case in point is
>> os.path.walk - docs are great, but an example of how to make use of
>> the
>> third parameter (the arg you pass) would really clarify why 'the
>> powers
>> that be' chose to implement this capability.
>>
>
> -1
>
> If you look in the source distribution for Python there is a directory
> called Demos/ that has just examples of how to use things. Stuff like
> what you are suggesting can go there.
>
> You can also write an examples page for the documentation.
>
> But I don't think it belongs in the language itself.

I agree that it doesn't belong in the language itself, but what would be nice is if the help module could point you to examples. A first stab at this could be to use a well-defined naming scheme for the examples (so that examples for the "os" module would be in "Demos/os", and examples of os.walk() would be in, say, Demos/os/walk.py or walk_1.py and walk_2.py). And this could be extended with having index files in the Demos tree which would map Python names to example files. For example, if there was a demo Demos/os/renamefilesintreetouppercase.py then an index file in Demos/os could tell the help module that this is a useful file to examine if you're interested in os.walk or os.path.
--
- Jack Jansen http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -

From martin@v.loewis.de Sat Jan 4 22:22:46 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 04 Jan 2003 23:22:46 +0100
Subject: [Python-Dev] Need help with test_unicode failure
In-Reply-To:
References:
Message-ID:

Jack Jansen writes:

> Can anyone give me a hint as to where I should start looking?

You need to verify a number of things:

1. PyUnicode_Contains is invoked.
2. the conversion PyUnicode_FromObject(container) reports an exception.
3.
that exception is propagated to the caller. It is likely that 1 works ok. I can't see who the eventual caller should be, but it is likely that 3 works as well. If the conversion fails to produce an exception, a possible explanation would be that the system encoding is not ASCII. Regards, Martin From bac@OCF.Berkeley.EDU Sat Jan 4 22:44:19 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Sat, 4 Jan 2003 14:44:19 -0800 (PST) Subject: [Python-Dev] tutor(function|module) In-Reply-To: <6761BBA0-2030-11D7-BF12-000A27B19B96@oratrix.com> References: <6761BBA0-2030-11D7-BF12-000A27B19B96@oratrix.com> Message-ID: [Jack Jansen] > I agree that it doesn't belong in the language itself, but what would > be nice is if the help module could point you to examples. A first stab > at this could be to use a well-defined naming scheme for the examples > (so that examples for the "os" module would be in "Demos/os", and > examples of os.walk() would be in, say, Demos/os/walk.py or walk_1.py > and walk_2.py). And this could be extended with having index files in > the Demos tree which would map Python names to example files. For > example, if there was a demo Demos/os/renamefilesintreetouppercase.py > then and index file in Demos/os could tell the help module that this is > a useful file to examine if you're interested in os.walk or os.path. This is more reasonable, but I still have reservations over having this kind of thing in the core. It isn't over whether it is useful or not, but whether people who do the bulk of CVS checkins want to be bothered with reading a possible deluge of example Python code on top of their usual Python work. Isn't the Cookbook supposed to cover stuff like this? And it was suggested to have a separate ``examples`` package, but that might be a barrier of entry for newbies (albeit a low barrier). -Brett From barry@python.org Sat Jan 4 22:53:24 2003 From: barry@python.org (Barry A. 
Warsaw)
Date: Sat, 4 Jan 2003 17:53:24 -0500
Subject: [Python-Dev] PEP 297: Support for System Upgrades
References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com>
Message-ID: <15895.26084.121086.435407@gargle.gargle.HOWL>

MAL, I think an idea like this is good, but I have one concern. Let's say Python 2.3 comes out with email pkg 2.5. Later I find some bugs and release email 2.5.1 and also fix the Python 2.3 maint branch. Now Python 2.3.1 comes out.

At best the email package in system-packages should be ignored. At worst, it /has/ to be ignored. E.g. Python 2.3.2 comes out with email 2.5.2.

My suggestion is that the Python micro-release number be included in the path to system-packages. IOW, system-packages must exactly match the Python version number, not just the maj.min number.

-Barry

From guido@python.org Sun Jan 5 00:33:36 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 04 Jan 2003 19:33:36 -0500
Subject: [Python-Dev] Cross compiling
In-Reply-To: Your message of "Sat, 04 Jan 2003 13:21:58 PST." <8EA20509-202A-11D7-99C2-0003933F3BC2@omnigroup.com>
References: <8EA20509-202A-11D7-99C2-0003933F3BC2@omnigroup.com>
Message-ID: <200301050033.h050Xa121052@pcp02138704pcs.reston01.va.comcast.net>

> On Saturday, January 4, 2003, at 06:18 AM, Guido van Rossum wrote:
> > It's quite possible though that recent additions to configure.in have
> > broken things again -- if you don't test this regularly, it will
> > succumb to bitrot quickly. :-(
>
> Right now configure.in doesn't even look close:
>
> - It uses 'uname' instead of actually obeying --build and --host

Is that easy to fix? I've never heard of --build and --host; I've long stopped keeping up with changes in autoconf.

> - The files to include in the project are selected via a bunch of
> external scripts and files (makesetup, Setup) that don't seem to care
> about the difference between a Unix and PC box
> - Other configure tests assume you are building for Unix.
Well, that assumption has always been true until you tried something else (although I should mention that AFAIK the CygWin build uses the configure script running in an emulated Unix-ish environment on Windows).

> All in all, it looks like it will be easier for me to just take the
> route of rebuilding with VC++ and the included dsw file when I need.

Whatever's the least work for you.

> This will be moderately annoying, since I'm planning on trying to
> make a 'secure' Python library that links as few OS services as
> possible (or redirects to hooks that I'll provide).

I feel uncomfortable hearing the word "annoying" here, since it seems to imply that the Python developers have done something wrong. At best we've not done something that would benefit you. Given that you don't have to pay for Python, perhaps you could keep your being annoyed to yourself.

> The eventual goal is to have Python embedded in a game (where users
> can trade modules written in Python which therefor might be malicious).

Cool.

> The notes I read about the current state of sandboxing in Python
> didn't make me warm and fuzzy -- I probably don't understand it (which
> is one reason for the not-warm/not-fuzzy feeling). I'd feel warm and
> fuzzy if the Python library routed through my OS abstraction layer for
> everything rather than directly linking to the OS.

Sorry, you've lost me there. What is it that you want exactly?

> > If you can fix it, by all means submit a patch to SF!
>
> I may yet, but since it looks like cross compiling is only marginally
> working for Unix->Unix and completely not working for Unix->Win32, this
> is above the level of my Python-Fu :)

Such a thing is called a learning opportunity. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Sun Jan 5 00:37:40 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 04 Jan 2003 19:37:40 -0500
Subject: [Python-Dev] Cross compiling
In-Reply-To: Your message of "Sat, 04 Jan 2003 13:30:58 PST."
References: Message-ID: <200301050037.h050beu21074@pcp02138704pcs.reston01.va.comcast.net> > Hmmm... It looks like I was right to feel unsettled by rexec: > > http://www.amk.ca/python/howto/rexec/ > > Python provides a rexec module running untrusted > code. However, it's never been exhaustively audited for > security and it hasn't been updated to take into account > recent changes to Python such as new-style classes. > Therefore, the rexec module should not be trusted. To > discourage use of rexec, this HOWTO has been withdrawn. > > Yikes! Keeping a body of code as large and evolving as Python secure is an enormous amount of work. Java had security as an explicit goal and despite many efforts, security problems are still found in JVM implementations. For most Python users, this kind of security isn't much of an issue (they run code they write themselves or that is written by someone they trust), so obviously nobody has been very motivated to spend the effort. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Jan 5 00:43:58 2003 From: guido@python.org (Guido van Rossum) Date: Sat, 04 Jan 2003 19:43:58 -0500 Subject: [Python-Dev] tutor(function|module) In-Reply-To: Your message of "Sat, 04 Jan 2003 13:47:22 PST." <200301042147.NAA30669@mail.arc.nasa.gov> References: <20030104174634.GA6906@homer.gst.com> <200301042147.NAA30669@mail.arc.nasa.gov> Message-ID: <200301050043.h050hxc21106@pcp02138704pcs.reston01.va.comcast.net> > > [Christopher Blunck] > > > > Question #2: How about examples(function) that displays multiple examples > > > of how to use the function provided? 
Similar to above, there are times > > > when I look at the API and even if the parameters to the function are > > > fully explained, it would help to have an example (case in point is > > > os.path.walk - docs are great, but an example of how to make use of the > > > third parameter (the arg you pass) would really clarify why 'the powers > > > that be' chose to implement this capability. > > > If you look in the source distribution for Python there is a directory > > called Demos/ that has just examples of how to use things. Stuff like > > what you are suggesting can go there. [Chad Netzer] > Perhaps an importable "examples" module, which could have, well, examples of > things. It could and would be a separately maintained module, but would > probably be useful to me and others it it just had examples of all the > standard library functions (contributed to on a volunteer basis). hmmmmm... I'm not sure what you would want to do after importing the examples module; it's not easy to look at the source of an imported module, and that's (presumably) what you want to do with an examples module. IMO examples belong in the documentation. The problem with the Demo directory is that few people have bothered to contribute or even weed out obsolete code, and so the collection there is quite stale. I would welcome contributions of any kind, but especially useful would be a somewhat separate project with the explicit goal of providing working examples of all the important parts of Python. Watch out for focusing too much on all the standard library functions; much of Python's power is in things that aren't functions (e.g. syntactic constructs, or methods, or classes, or modules). 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Jan 5 00:49:52 2003 From: guido@python.org (Guido van Rossum) Date: Sat, 04 Jan 2003 19:49:52 -0500 Subject: [Python-Dev] PEP 297: Support for System Upgrades In-Reply-To: Your message of "Sat, 04 Jan 2003 17:53:24 EST." <15895.26084.121086.435407@gargle.gargle.HOWL> References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com> <15895.26084.121086.435407@gargle.gargle.HOWL> Message-ID: <200301050049.h050nqa21172@pcp02138704pcs.reston01.va.comcast.net> > MAL, I think an idea like this is good, but I have one concern. Let's > say Python 2.3 comes out with email pkg 2.5. Later I find some bugs > and release email 2.5.1 and also fix the Python 2.3 maint branch. Now > Python 2.3.1 comes out. > > At best the email package in system-packages should be ignored. At > worse, it /has/ to be ignored. E.g. Python 2.3.2 comes out with email > 2.5.2. > > My suggestion is that the Python micro-release number be included in > the path to system-packages. IOW, system-packages must exactly match > the Python version number, not just the maj.min number. Excellent point. Or a micro-release should clear out the system-packages directory. --Guido van Rossum (home page: http://www.python.org/~guido/) From blunck@gst.com Sun Jan 5 01:35:39 2003 From: blunck@gst.com (Christopher Blunck) Date: Sat, 4 Jan 2003 20:35:39 -0500 Subject: [Python-Dev] tutor(function|module) In-Reply-To: References: <6761BBA0-2030-11D7-BF12-000A27B19B96@oratrix.com> Message-ID: <20030105013539.GB7675@homer.gst.com> On Sat, Jan 04, 2003 at 02:44:19PM -0800, Brett Cannon wrote: > And it was suggested to have a separate ``examples`` package, but that > might be a barrier of entry for newbies (albeit a low barrier). There seems to be diverging opinions in this thread, which I think is good. 
While I recognize (I have also personally used them) the existence of the "Demos" section in CVS, chances are strong that newbies do not. If somebody is bright enough to pull the python cvs module, build it, and use it then they probably are bright enough to discover the Demos. But the bright ones are not the folks I'm thinking about right now - I'm thinking of the common case user who gets a Linux distro with python installed out-of-the-box. They may be totally new to programming and have never even heard of CVS before (and thus wouldn't have the Demos available to them). Maybe I'm not familiar with very many py installations for platforms other than my own, so feel free to correct me if the Demos *do* come pre-installed on many different environments. :)

I also recognize that this "feature" is not core language functionality. But due to its special nature (I put this into the same area as help) I think that it does warrant thought about whether it deserves to be included as a core feature *simply because* it might really help newbies out.

A web page full of examples would be great, but again - would Joe New User know about the web page if they didn't even know about the lang (because it came pre-installed)? I don't know...

It is encouraging to read people's comments on this. It appears as tho there may be a desire for this feature.

-c

--
8:20pm up 75 days, 11:36, 3 users, load average: 4.54, 4.82, 4.89

From bac@OCF.Berkeley.EDU Sun Jan 5 01:41:55 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Sat, 4 Jan 2003 17:41:55 -0800 (PST)
Subject: [Python-Dev] tutor(function|module)
In-Reply-To: <20030105013539.GB7675@homer.gst.com>
References: <6761BBA0-2030-11D7-BF12-000A27B19B96@oratrix.com> <20030105013539.GB7675@homer.gst.com>
Message-ID:

[Christopher Blunck]
> On Sat, Jan 04, 2003 at 02:44:19PM -0800, Brett Cannon wrote:
> > And it was suggested to have a separate ``examples`` package, but that
> > might be a barrier of entry for newbies (albeit a low barrier).
> > There seems to be diverging opinions in this thread, which I think is good. Stick around; divergence happens all the time here at python-dev. =) > While I recognize (I have also personally used them) the existence of > the "Demos" section in CVS, chances are strong that newbies do not. Especially since it is not included in the Windows distro. >If > somebody is bright enough to pull the python cvs module, build it, and use > it then they probably are bright enough to discover the Demos. But the > bright ones are not the folks I'm thinking about right now - I'm thinking > of the common case user who gets a Linux distro with python installed > out-of-the-box. I wouldn't worry about people like that either. I think the question here is how do most people learn about Python. For instance, how do people even learn about the ``help()`` function? If they get it from the docs, we can point them to whatever to learn about examples. If they learn from a friend, then they probably would (and should) tell them where to get examples. And if they learn about it from launching the interpreter and typing ``help``, well that can have an added line mentioning where to go to get examples. >They may be totally new to programming and have never even > heard of CVS before (and thus wouldn't have the Demos available to them). > Maybe I'm not familiar with very many py installations for platforms other > than my own, so feel free to correct me if the Demos *do* come pre-installed > on many different environments. :) > It is not installed. You get it only with the source code. > I also recognize that this is ?feature? is not a core language functionality. > But due to its special nature (I put this into the same area as help) > I think that it does warrant thought about if it deserves to be included > as a core feature *simply because* it might really help newbies out. > I still think putting it in the documentation is the better solution. 
``help()`` should really be used as a quick reference. If that quick reference is not enough you should read the documentation which can have examples to help clarify usage. > A web page full of examples would be great, but again - would Joe New User > know about the web page if they didn't even know about the lang (because it > came pre-installed)? I don't know... > See above. -Brett From cnetzer@mail.arc.nasa.gov Sun Jan 5 02:03:37 2003 From: cnetzer@mail.arc.nasa.gov (Chad Netzer) Date: Sat, 4 Jan 2003 18:03:37 -0800 Subject: [Python-Dev] Cross compiling In-Reply-To: <200301050037.h050beu21074@pcp02138704pcs.reston01.va.comcast.net> References: <200301050037.h050beu21074@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200301050203.SAA01788@mail.arc.nasa.gov> On Saturday 04 January 2003 16:37, Guido van Rossum wrote: > > Hmmm... It looks like I was right to feel unsettled by rexec: > Keeping a body of code as large and evolving as Python secure is an > enormous amount of work. Java had security as an explicit goal and > despite many efforts, security problems are still found in JVM > implementations. Not wanting to disrespect those who are working on rexec(), my opinion is that it should be removed from the distribution (and possibly maintained separately by those who wish to do it). As you say, each new release just means there are more chances that it is broken (since the primary goal of the Python developers is NOT in producing a secure sandbox), and as it exists, it appears to offer few real guarantees of the security it is meant to provide (I've never used it, so forgive me if I'm being overly harsh on rexec(); my impression is that it was created because Perl, Java, etc were all doing it, but these days there appear to be better alternatives) People who want to maintain rexec() could do it for older releases of Python, where the people wanting to use it in a secure environment probably care less about new features and more about maturity of the code. 
Furthermore, so many operating systems are offering "sandbox" environments these days (Linux has "user-mode Linux", among others; BSD has jail(), both have virtual machine emulators like Bochs, plex86, or VMWare, as do Windows NT and XP), that the smart thing to do is simply run Python in one of these restricted operating environments. The OS makers care more about security, and have better resources to ensure it.

So, perhaps it is time to seriously consider removing rexec() from the standard distribution. Who is really using it, I wonder, and for what?

--
Bay Area Python Interest Group - http://www.baypiggies.net/
Chad Netzer
cnetzer@mail.arc.nasa.gov

From bac@OCF.Berkeley.EDU Sun Jan 5 02:30:19 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Sat, 4 Jan 2003 18:30:19 -0800 (PST)
Subject: [Python-Dev] Cross compiling
In-Reply-To: <200301050203.SAA01788@mail.arc.nasa.gov>
References: <200301050037.h050beu21074@pcp02138704pcs.reston01.va.comcast.net> <200301050203.SAA01788@mail.arc.nasa.gov>
Message-ID:

[Chad Netzer]
>
> Not wanting to disrespect those who are working on rexec(), my opinion is
> that it should be removed from the distribution (and possibly maintained
> separately by those who wish to do it).

Guido has already stated twice by my count in two separate threads revolving around rexec that he wants to rip it out. Problem with that is ripping out a module from the stdlib is not easy since some people might actually be using the module.

Perhaps we should move forward and put a PendingDeprecationWarning in rexec for 2.3 (or maybe be harsher and make it a DeprecationWarning instead)?
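
For readers who have not used the warnings machinery, here is a minimal sketch of what such a pending-deprecation notice might look like. It is written in current Python syntax and uses only the standard warnings module; the helper name warn_rexec_pending and the message wording are made up for illustration and are not taken from the rexec source.

```python
import warnings


def warn_rexec_pending():
    """Hypothetical helper: emit the kind of warning suggested above."""
    warnings.warn(
        "the rexec module may be deprecated in a future release",
        PendingDeprecationWarning,
        stacklevel=2,
    )


# PendingDeprecationWarning is ignored by the default filters,
# so a caller has to opt in to see it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_rexec_pending()

print(caught[0].category.__name__)
```

Switching the category to DeprecationWarning would be the "harsher" variant; the call is otherwise the same.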
-Brett From tim.one@comcast.net Sun Jan 5 02:39:33 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 04 Jan 2003 21:39:33 -0500 Subject: [Python-Dev] Proposing to simplify datetime In-Reply-To: Message-ID: The current datetime module class hierarchy looks like: object timedelta tzinfo time timetz date datetime datetimetz Several people have noted (on the wiki, and in email), with some bewilderment, that: 1. A timetz object with tzinfo=None acts like a time object. 2. A datetimetz object with tzinfo=None acts like a datetime object. A natural suggestion is to change the hierarchy to: object timedelta tzinfo time date datetime where the current time and datetime classes go away, the current timetz is renamed to time, and the current datetimetz to datetime. I tried that this evening for the Python implementation, and it was easy enough. No subtle problems popped up. The results are checked in to the Python datetime sandbox, as datetime2.py and test_datetime2.py; it appears to be fully functional, and allowed tossing about 10% of the code lines. IIRC, the original motivation for making a finer distinction among objects with and without tzinfo members was to save memory in the objects. Guido thinks that's a red herring, though, and I agree. The C implementation of datetime objects currently has unused "pad bytes" in the structs (due to compiler alignment of struct members), and adding a flag byte to record "does this have a non-None tzinfo member or not?" wouldn't actually consume any more memory. The C implementation would still need to have more than one way to allocate a time/datetime under the covers, in order to avoid allocating an additional pointer field (for the PyObject* tzinfo pointer) when it wasn't needed, so it wouldn't really simplify the C implementation much. It would simplify the user's mental model, and cut out a ton of near-redundant docs (those two are pretty much the same thing ). 
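
To make the proposed model concrete, here is a small sketch of a single datetime/time class with an optional tzinfo member. It assumes the datetime module as it ships in current Python releases rather than the sandbox code discussed above, and the FixedOffset class and sample values are invented for the example.

```python
from datetime import datetime, time, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical tzinfo subclass with a constant UTC offset."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def tzname(self, dt):
        return self._name

    def dst(self, dt):
        return timedelta(0)


eastern = FixedOffset(-5 * 60, "EST")

# One class covers both cases: leaving tzinfo as None gives "naive" behaviour.
naive = datetime(2003, 1, 4, 21, 39, 33)
aware = datetime(2003, 1, 4, 21, 39, 33, tzinfo=eastern)

# The same holds for time objects.
naive_time = time(21, 39, 33)
aware_time = time(21, 39, 33, tzinfo=eastern)

print(naive.tzinfo, naive.utcoffset())
print(aware.tzname(), aware.utcoffset())
```

Dropping the tzinfo with aware.replace(tzinfo=None) yields an object equal to naive, which is the sense in which the separate tz-less classes become redundant.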
From cnetzer@mail.arc.nasa.gov Sun Jan 5 03:12:03 2003 From: cnetzer@mail.arc.nasa.gov (Chad Netzer) Date: Sat, 4 Jan 2003 19:12:03 -0800 Subject: [Python-Dev] Cross compiling In-Reply-To: References: <200301050203.SAA01788@mail.arc.nasa.gov> Message-ID: <200301050312.TAA03857@mail.arc.nasa.gov> On Saturday 04 January 2003 18:30, Brett Cannon wrote: > Perhaps we should move forward and put a PendingDeprecationWarning in > rexec for 2.3 (or maybe be harsher and making DeprecationWarning instead)? Yes, I certainly would be pleased with some form of deprecation warning for the upcoming release, with perhaps a note in the docs about using an appropriate operating system facility to do the sandboxing instead. Of course, if others are truly interested in maintaining or improving rexec(), even if it were removed from Python itself, deprecation status might be a bit of a buzzkill. :) -- Bay Area Python Interest Group - http://www.baypiggies.net/ Chad Netzer cnetzer@mail.arc.nasa.gov From tim.one@comcast.net Sun Jan 5 03:37:05 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 04 Jan 2003 22:37:05 -0500 Subject: [Python-Dev] Unidiff tool In-Reply-To: <002001c2b39b$d0d4b960$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger] > For about a month, Martijn Pieters has had a pending feature > request (with attached an attached patch) for a class that extends > Differ class in the difflib module. It creates unidiffs from two > sequences of lines. I think it would be a useful addition to > Py2.3. If everyone agrees, then I'll clean-up the code, > integrate it into the module, adds docs, and write a thorough test > suite. > See: www.python.org/sf/635144 . +1 here. Reviewing that keeps falling off my horizon. Random observations: Would likely be simpler & faster if based on get_opcodes() instead of picking apart strings. The timestamp calculations are hairy so should be shuffled into a helper function. 
I believe they were incorrect if the zone offset wasn't a whole number of hours (due to i/60 returning the floor, and when i<0 that isn't what the code expected). Perhaps the datetime module's format functions (there are billions of them <0.9 wink>) could be pressed into service here instead. From tjw@omnigroup.com Sun Jan 5 03:38:05 2003 From: tjw@omnigroup.com (Timothy J. Wood) Date: Sat, 4 Jan 2003 19:38:05 -0800 Subject: [Python-Dev] Cross compiling In-Reply-To: <200301050033.h050Xa121052@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <1999844A-205F-11D7-99C2-0003933F3BC2@omnigroup.com> On Saturday, January 4, 2003, at 04:33 PM, Guido van Rossum wrote: > I feel uncomfortable hearing the word "annoying" here, since it seems > to imply that the Python developers have done something wrong. At > best we've not done something that would benefit you. Given that you > don't have to pay for Python, perhaps you could keep your being > annoyed to yourself. Sorry -- I didn't mean to imply anything of the sort. Of all the scripting languages I've evaluated Python seems the best maintained and least overloaded with cruft. If I'm annoyed by having to build on Windows instead of cross compiling from Unix, that's my problem :) >> The notes I read about the current state of sandboxing in Python >> didn't make me warm and fuzzy -- I probably don't understand it (which >> is one reason for the not-warm/not-fuzzy feeling). I'd feel warm and >> fuzzy if the Python library routed through my OS abstraction layer for >> everything rather than directly linking to the OS. > > Sorry, you've lost me there. What is it that you want exactly? Well, the general idea would be to look at the system calls that Python uses and evaluate them on a security basis (pretty much by looking at the undefined symbols in the library and C modules as reported by nm). 
Things like fork(), unlink(), being able to open random files for writing are not things that I need (or want to have to worry over whether rexec protects me from malicious use of them). So, one way to worry less would be for me to track down anywhere that one of these functions is called and either not ship with that module or instead call into my own OS abstraction layer in my game code. For example, I could just have a table of function pointers that the embedder could set -- if a entry was NULL, that would imply that the OS doesn't support that capability or that the embedding application doesn't want it used for whatever reason (security in my case). You could sort of look at this as taking the OS dependent modules and splitting them in half -- one half would contain a somewhat abstracted interface to the OS and the other half would contain the stuff to do the desired Python stuff (file iterators, for example). I don't know if this could be of general interest -- probably not, but I thought I'd mention it in case it is (since in that case I'd spend more time making sure that my work was suitable for submission back to Python instead of just hacking and slashing on my own local copy :) This would be a very rudimentary means of achieving one type of security, but it may be useful to some and fairly easy to maintain. This clearly doesn't address potential overflow problems or DoS attacks, but it make it much harder for malicious game players corrupt other people's systems. Thanks! 
-tim From jepler@unpythonic.net Sun Jan 5 03:44:15 2003 From: jepler@unpythonic.net (Jeff Epler) Date: Sat, 4 Jan 2003 21:44:15 -0600 Subject: [Python-Dev] Unidiff tool In-Reply-To: <002001c2b39b$d0d4b960$125ffea9@oemcomputer> References: <002001c2b39b$d0d4b960$125ffea9@oemcomputer> Message-ID: <20030104214402.A3504@unpythonic.net> --fdj2RfSjLxBAspz7 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Fri, Jan 03, 2003 at 09:48:50PM -0500, Raymond Hettinger wrote: > For about a month, Martijn Pieters has had a pending feature request (with attached an attached patch) for a class that extends > Differ class in the difflib module. It creates unidiffs from two sequences of lines. I think it would be a useful addition to > Py2.3. If everyone agrees, then I'll clean-up the code, integrate it into the module, adds docs, and write a thorough test suite. > See: www.python.org/sf/635144 . Here's the one that i wrote. No test-suite. probably only barely conforms to unidiff format requirements. 
jeff

--fdj2RfSjLxBAspz7
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="pyunidiff.py"

import difflib

REPLACE='replace'
DELETE='delete'
INSERT='insert'
EQUAL='equal'

def unidiff(f1, f2, ctx):
    sys.stdout.write("--- %s\n" % (f1,))
    sys.stdout.write("+++ %s\n" % (f2,))
    c1 = open(f1).readlines()
    c2 = open(f2).readlines()
    m = difflib.SequenceMatcher(None, c1, c2)
    ops = m.get_opcodes()
    i = 0
    while i < len(ops):
        while i < len(ops) and ops[i][0] == EQUAL:
            i=i+1
        if i == len(ops): break
        j = i
        st1 = max(0, ops[j][1]-ctx)
        st2 = max(0, ops[j][3]-ctx)
        while i < len(ops) and (ops[i][0]!=EQUAL or ops[i][2]-ops[i][1]<2*ctx):
            i=i+1
        en1 = min(len(c1), ops[i-1][2]+ctx)
        en2 = min(len(c2), ops[i-1][4]+ctx)
        sys.stdout.write("@@ -%d,%d +%d,%d @@\n" % (st1+1, en1-st1, st2+1, en2-st2))
        for k in range(st1, ops[j][1]):
            sys.stdout.write(" %s" % c1[k])
        for (opcode, i1, i2, j1, j2) in ops[j:i]:
            if opcode == EQUAL:
                for l in range(i1, i2):
                    sys.stdout.write(" %s" % c1[l])
            if opcode == DELETE or opcode==REPLACE:
                for l in range(i1, i2):
                    sys.stdout.write("-%s" % c1[l])
            if opcode == INSERT or opcode==REPLACE:
                for l in range(j1, j2):
                    sys.stdout.write("+%s" % c2[l])
        for k in range(ops[i-1][2], en1):
            sys.stdout.write(" %s" % c1[k])

def __main__(args):
    unidiff(args[0], args[1], 3)

if __name__ == '__main__':
    import sys
    __main__(sys.argv[1:])

--fdj2RfSjLxBAspz7--

From jdadson@ix.netcom.com Sun Jan 5 03:45:15 2003
From: jdadson@ix.netcom.com (Jive Dadson)
Date: Sat, 04 Jan 2003 19:45:15 -0800
Subject: [Python-Dev] Hello world.
Message-ID: <3E17AA4B.D0224D25@ix.netcom.com>

I would like to introduce myself to the group. I am a new-comer to Python. My first programming job was in 1971. Since 1979 I have worked fulltime in the computer field, mostly in research and development. For three years in the early 80's I was visiting associate professor of computer science at a well-known university in the midwest.
On the internet, I have posted under the nom du net of Jive Dadson for about 8 years or so. I am most at home programming in C++. I have used the language since its earliest days. My current work involves multi-threaded applications. Principle among them is an embedded robot control system. Best regards, Jive From python-list@python.org Sun Jan 5 04:03:13 2003 From: python-list@python.org (Aahz) Date: Sat, 4 Jan 2003 23:03:13 -0500 Subject: [Python-Dev] Cross compiling In-Reply-To: <1999844A-205F-11D7-99C2-0003933F3BC2@omnigroup.com> References: <200301050033.h050Xa121052@pcp02138704pcs.reston01.va.comcast.net> <1999844A-205F-11D7-99C2-0003933F3BC2@omnigroup.com> Message-ID: <20030105040313.GA25905@panix.com> [setting Reply-To: for python-list to get this off python-dev] On Sat, Jan 04, 2003, Timothy J. Wood wrote: > > This would be a very rudimentary means of achieving one type of > security, but it may be useful to some and fairly easy to maintain. > This clearly doesn't address potential overflow problems or DoS > attacks, but it make it much harder for malicious game players corrupt > other people's systems. This discussion should be moved to comp.lang.python, but here's a thought for you to address there: how do you intend to handle while 1: pass -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "There are three kinds of lies: Lies, Damn Lies, and Statistics." --Disraeli From tjw@omnigroup.com Sun Jan 5 04:09:41 2003 From: tjw@omnigroup.com (Timothy J. 
Wood) Date: Sat, 4 Jan 2003 20:09:41 -0800 Subject: [Python-Dev] Cross compiling In-Reply-To: <20030105040313.GA25905@panix.com> Message-ID: <83A59F38-2063-11D7-99C2-0003933F3BC2@omnigroup.com> On Saturday, January 4, 2003, at 08:03 PM, Aahz wrote: > This discussion should be moved to comp.lang.python, but here's a > thought for you to address there: how do you intend to handle Well, actually, I'll just probably shut up until I have something to show :) > while 1: pass Note the part where I said I wasn't trying to prevent DoS attacks. If the user's game hangs when they play module X, they haven't really lost anything -- they'll just know not to play that module again. -tim From barry@python.org Sun Jan 5 04:16:39 2003 From: barry@python.org (Barry A. Warsaw) Date: Sat, 4 Jan 2003 23:16:39 -0500 Subject: [Python-Dev] PEP 297: Support for System Upgrades References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com> <15895.26084.121086.435407@gargle.gargle.HOWL> <200301050049.h050nqa21172@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15895.45479.235214.554416@gargle.gargle.HOWL> >>>>> "GvR" == Guido van Rossum writes: GvR> Excellent point. GvR> Or a micro-release should clear out the system-packages GvR> directory. The only reason I'd rather not do that is so that if a package still needs an update for the new Python micro release, a sysadmin could at least copy the package over from one version to the next. -Barry From tim.one@comcast.net Sun Jan 5 04:37:11 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 04 Jan 2003 23:37:11 -0500 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib random.py,1.26.6.7,1.26.6.8 In-Reply-To: <004401c2b428$57cadf80$125ffea9@oemcomputer> Message-ID: [Raymond Hettinger, on that the previous random() couldn't actually return 0.0 ] > Shhh, don't tell Fred that it differed from the documented [0,1) range. This was a good doc trick: it would also have been accurate to document the range as [-1e123, 1e200]. 
We're not obligated to produce all those numbers . IOW, since many RNGs do produce 0.0, the docs always allowed for that possibility "just in case". > Interestingly, the Py2.2 code contained a number of places that used > the 1.0-random() step or code for reselecting whenever u<1e-7. > Someone thought it was important. Sure. They didn't leave behind comments, though, so we can only guess. > Also, folks were supposed to be able to subclass with a different [0,1) > generator and not have it fail. It happened to me last night and that > is how I found the problem. That's fine. I'm not objecting to making the code robust in the face of a 0.0 result, I was just explaining why some code didn't bother. [on whether the Twister can produce 0.0] > Look again, the tempering steps assure that there are about 2000 ways > to produce a pure zero. Yup, I was being brain-dead -- they couldn't have proved equidistribution if 0 wasn't a possible output. >> Empirical evidence: in 10 tries, I didn't see it produce 0.0 even >> once . > Oh, I wish I had your luck ;) I've since tried it > 300 million times, and still haven't seen a 0.0 pop out. But since there are 2**53 possible outputs now, and 0.0 is one of them, code that blows up when a 0.0 pops out will be on the losing end of a miracle. It makes me wonder whether it would be prudent to guarantee that 0.0 isn't a possible output (which wouldn't require fiddling the Twister, but would requiring trying again in random_random if 0.0 was seen internally). Na. From python@rcn.com Sun Jan 5 05:02:16 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 5 Jan 2003 00:02:16 -0500 Subject: [Python-Dev] Unidiff tool References: <002001c2b39b$d0d4b960$125ffea9@oemcomputer> <20030104214402.A3504@unpythonic.net> Message-ID: <006201c2b477$9e1b6aa0$125ffea9@oemcomputer> From: "Jeff Epler" > Here's the one that i wrote. > > No test-suite. probably only barely conforms to unidiff format > requirements. 
Will compare it to the other an take the best of both. Thx, Raymond Hettinger From skip@manatee.mojam.com Sun Jan 5 13:00:18 2003 From: skip@manatee.mojam.com (Skip Montanaro) Date: Sun, 5 Jan 2003 07:00:18 -0600 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200301051300.h05D0Idm030901@manatee.mojam.com> Bug/Patch Summary ----------------- 335 open / 3179 total bugs (+8) 105 open / 1879 total patches (+1) New Bugs -------- compileall doesn't notice syntax errors (2001-03-30) http://python.org/sf/412436 parameters for int(), str(), etc. (2002-12-30) http://python.org/sf/660022 GNU readline 4.2 prompt issue (2002-12-30) http://python.org/sf/660083 GNU readline version confusion (2002-12-30) http://python.org/sf/660095 New style classes and __hash__ (2002-12-30) http://python.org/sf/660098 typeobject provides incorrect __mul__ (2002-12-30) http://python.org/sf/660144 -hex/oct const generates wrong code (2002-12-31) http://python.org/sf/660455 readline and threads crashes (2002-12-31) http://python.org/sf/660476 ossaudiodev issues (2003-01-01) http://python.org/sf/660697 refman: importing x.y.z as m is possible, docs say otherwise (2003-01-01) http://python.org/sf/660811 datetimetz constructors behave counterintuitively (2.3a1) (2003-01-01) http://python.org/sf/660872 urllib2 and proxy (2003-01-02) http://python.org/sf/661042 inspect.getsource bug (2003-01-02) http://python.org/sf/661184 test_pep263 fails in MacPython-OS9 (2003-01-02) http://python.org/sf/661330 test_httplib fails on the mac (2003-01-02) http://python.org/sf/661340 test_strptime fails on the Mac (2003-01-02) http://python.org/sf/661354 plat-mac not on sys.path (2003-01-03) http://python.org/sf/661521 macpath.py missing ismount splitunc (2003-01-03) http://python.org/sf/661762 Add clarification of __all__ to refman? 
(2003-01-03) http://python.org/sf/661848 str.index() exception message not consistent (2003-01-03) http://python.org/sf/661913 need to skip some build tests for sunos5 (2003-01-03) http://python.org/sf/661981 urllib2 exceptions need improvement (2003-01-04) http://python.org/sf/662099 New Patches ----------- fix Makefile.pre to use config env (2002-12-29) http://python.org/sf/659809 Check for readline 2.2 features (2002-12-29) http://python.org/sf/659834 make commands.getstatusoutput work on windows (2002-12-31) http://python.org/sf/660505 Add sysexits.h EX_* symbols to posix (2003-01-02) http://python.org/sf/661368 apply() should get PendingDeprecation (2003-01-02) http://python.org/sf/661437 handle unary op of constant in transformer.py (2003-01-03) http://python.org/sf/661536 Remove old code from lib\os.py (2003-01-03) http://python.org/sf/661583 allow py_compile to re-raise exceptions (2003-01-03) http://python.org/sf/661719 Cygwin auto-import module patch (2003-01-03) http://python.org/sf/661760 BZ2File leaking fd and memory (2003-01-03) http://python.org/sf/661796 gcc 3.2 /usr/local/include patch (2003-01-03) http://python.org/sf/661869 bug 661354 fix; _strptime handle OS9's lack of timezone info (2003-01-04) http://python.org/sf/662053 Add array_contains() to arraymodule (2003-01-04) http://python.org/sf/662433 (Bug 660811: importing x.y.z as m is possible, docs say othe (2003-01-04) http://python.org/sf/662454 659188: no docs for HTMLParser (2003-01-04) http://python.org/sf/662464 642391: tempfile.mktemp() docs to include dir info (2003-01-04) http://python.org/sf/662475 Closed Bugs ----------- Thread-Support don't work with HP-UX 11 (2002-01-23) http://python.org/sf/507442 rfc822.parsedate() too strict (2002-05-04) http://python.org/sf/552345 build dumps core (binutils 2.13/solaris) (2002-08-17) http://python.org/sf/596422 compiler package and SET_LINENO (2002-08-20) http://python.org/sf/597919 ext module generation problem (2002-08-23) 
http://python.org/sf/599248 sq_concat prevents __radd__ from working (2002-10-17) http://python.org/sf/624807 FAIL: test_crlf_separation (email.test.t (2002-10-28) http://python.org/sf/629756 test_poll fails on FreeBSD (2002-12-01) http://python.org/sf/646547 more email ASCII decoding errors (2002-12-03) http://python.org/sf/648119 email test external dependencies (2002-12-08) http://python.org/sf/650441 Review libshelve.tex when possible (2002-12-09) http://python.org/sf/651149 email: huge address lines blow stack (2002-12-15) http://python.org/sf/654362 Slightly modify locals() doc? (2002-12-17) http://python.org/sf/655271 Closed Patches -------------- SocketServer: don't flush closed wfile (2002-05-20) http://python.org/sf/558547 Micro optimizations (2002-05-27) http://python.org/sf/561244 Solaris openpty() and forkpty() addition (2002-07-09) http://python.org/sf/579433 dummy_thread.py implementation (2002-10-13) http://python.org/sf/622537 Fix breakage caused when user sets OPT (2002-11-19) http://python.org/sf/640843 fix buffer overrun in pmerge (2002-11-28) http://python.org/sf/645404 Import from Zip Archive (2002-11-29) http://python.org/sf/645650 Add warnings to unsafe Cookie classes (2002-12-18) http://python.org/sf/655760 /dev/ptmx support for ptys (cygwin) (2002-12-19) http://python.org/sf/656590 Documentation support for PEP 301 (2002-12-23) http://python.org/sf/658093 PEP 301 implementation (2002-12-23) http://python.org/sf/658094 Mersenne Twister (2002-12-24) http://python.org/sf/658251 regex fixes for _strptime (2002-12-26) http://python.org/sf/658820 posixpath missing getctime (2002-12-27) http://python.org/sf/658927 Use PyArg_UnpackTuple where possible (2002-12-28) http://python.org/sf/659536 From pedronis@bluewin.ch Sun Jan 5 13:08:17 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Sun, 5 Jan 2003 14:08:17 +0100 Subject: [Python-Dev] Cross compiling References: <200301050203.SAA01788@mail.arc.nasa.gov> 
<200301050312.TAA03857@mail.arc.nasa.gov> Message-ID: <011701c2b4bb$83c4ec00$6d94fea9@newmexico> From: "Chad Netzer" > On Saturday 04 January 2003 18:30, Brett Cannon wrote: > > > Perhaps we should move forward and put a PendingDeprecationWarning in > > rexec for 2.3 (or maybe be harsher and making DeprecationWarning instead)? > > Yes, I certainly would be pleased with some form of deprecation warning for > the upcoming release, with perhaps a note in the docs about using an > appropriate operating system facility to do the sandboxing instead. > > Of course, if others are truly interested in maintaining or improving > rexec(), even if it were removed from Python itself, deprecation status might > be a bit of a buzzkill. :) It is worth to remember that rexec is _broken_. Is is broken wrt new-style classes (that means (ignoring other passing issues) since 2.2) and that's a problem in the Python C code, not the python code of rexec. regards. From bbum@codefab.com Sun Jan 5 04:56:53 2003 From: bbum@codefab.com (Bill Bumgarner) Date: Sat, 4 Jan 2003 23:56:53 -0500 Subject: [Python-Dev] PyBuffer* vs. array.array() Message-ID: <1BE17C58-206A-11D7-A03A-000393877AE4@codefab.com> I'm in the process of bridging various methods in PyObjC that take (void *) types as arguments or return (void *) buffer references. On the C side of the fence, I'm using PyBuffer_FromReadWriteMemory() or PyBuffer_FromMemory() as appropriate to create the buffer. On the Python side, I use array.array('B') to create the buffer that is passed across the bridge and mapped to the (void *) method arguments. However, I noticed that the two types are not the same. They both work as byte buffers just fine, but the result of PyBuffer*() acts as, basically, a big on the Python side whereas array.array('B') is a buffer of specifically typed unsigned chars. 
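[A minimal sketch of the mismatch described above, written for current Python 3 rather than the 2.2/2.3 interpreters under discussion. It assumes the bridge-created buffer hands back one-byte strings on indexing while array.array('B') hands back integers; the list of one-byte objects stands in for that buffer, and the helper name as_ints is hypothetical.]

```python
import array


def as_ints(data):
    """Return the items of a byte container as a list of ints.

    Hypothetical helper: accepts any sequence whose items are either
    ints or one-byte/one-character strings.
    """
    out = []
    for item in data:
        if isinstance(item, int):
            out.append(item)
        else:
            # A one-byte bytes object or one-character str.
            out.append(ord(item))
    return out


single_plane = array.array('B', [0, 127, 255])

# Stand-in for a buffer whose elements come back as one-byte strings
# (an assumption about the bridge, not something shown in the thread).
bitmap_data = [bytes([b]) for b in single_plane]

# Element-wise comparison fails even though the bytes are the same...
print(bitmap_data[0] == single_plane[0])
# ...while comparing after normalising both sides succeeds.
print(as_ints(bitmap_data) == as_ints(single_plane))
```

[Normalising both sides before comparing is only one way to write such a test; it does not settle which type the bridge ought to return.]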
In writing the unit tests, I came across a problematic situation that could easily arise in code (feel free to comment on the silliness of this code, if any... and note that I'm using the comprehension style even after that long rant I posted earlier :-):

    singlePlane = array.array('B')
    singlePlane.fromlist([0 for x in range(0, width*height*3)] )
    for i in range(0, 256*256):
        si = i * 3
        singlePlane[si] = rPlane[i]
        singlePlane[si+1] = gPlane[i]
        singlePlane[si+2] = bPlane[i]

    i2 = ... create i2 using data in singlePlane ...
    .... then ...

    bitmapData = i2.bitmapData()
    self.assertEquals(len(bitmapData), len(singlePlane))
    for i in range(0,100):
        self.assertEquals(bitmapData[i], singlePlane[i],
            "bitmapData and singlePlane differ at byte %d" % i)

The contents of singlePlane and bitmapData are identical, but the unit test fails:

    -- type of element 0 of bitmapData
    -- type of element 0 of singlePlane
    F.
    ======================================================================
    FAIL: testImageData (__main__.TestNSBitmapImageRep)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "Lib/AppKit/test/test_nsbitmapimagerep.py", line 74, in testImageData
        self.assertEquals(bitmapData[i], singlePlane[i], "bitmapData and singlePlane differ at byte %d" % i)
      File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/unittest.py", line 292, in failUnlessEqual
        raise self.failureException, \
    AssertionError: bitmapData and singlePlane differ at byte 0
    ----------------------------------------------------------------------
    Ran 2 tests in 1.564s

    FAILED (failures=1)

--

The real problem is that something created via array.array() comes back as an object that behaves differently. Clearly, a bug in my bridge code, yes, but raises a question:

What should I be using?
From the python side, it appears to only be possible to create buffer style objects-- big byte bags-- via the array module [yes, a will work, but a byte-array seems so much more correct]. On the C side, it seems that the only way to create a byte buffer like object is to use the PyBuffer* API. Advice, please? BTW: All of this lays the foundation for creating a PILImage object in Cocoa that is a seamless part of the NSImage family of classes. I.e. PIL would then be fully integrated into any Cocoa app that can dynamically load the python interpreter (another problem I'm working through). Thanks. b.bum b.bum We gladly feast on those who would subdue us. From oussoren@cistron.nl Sun Jan 5 16:06:17 2003 From: oussoren@cistron.nl (Ronald Oussoren) Date: Sun, 5 Jan 2003 17:06:17 +0100 Subject: [Python-Dev] Re: [Pyobjc-dev] PyBuffer* vs. array.array() In-Reply-To: <1BE17C58-206A-11D7-A03A-000393877AE4@codefab.com> Message-ID: <9F98A166-20C7-11D7-8AB2-0003931CFE24@cistron.nl> On Sunday, Jan 5, 2003, at 05:56 Europe/Amsterdam, Bill Bumgarner wrote: > > The real problem is that something created via array.array() comes > back as an object that behaves differently. Clearly, a bug in my > bridge code, yes, but raises a question: > > What should I be using? > > From the python side, it appears to only be possible to create buffer > style objects-- big byte bags-- via the array module [yes a will > work, but a byte-array seems so much more correct]. On the C side, it > seems that the only way to create a byte buffer like object is to use > the PyBuffer* API. You could use the array module on the C side. That is more work than just using the PyBuffer API but would solve your problem. However, what if I raw strings into the API. If I get array objects back from the API there still is an inconsistency. I agree that array objects are more suitable for 'bags of bytes' than the (builtin) alternatives. 
Note that some people seem to use Numeric python for image processing (http://www.pfdubois.com/numpy/). Adding support for using numeric arrays instead of array.array might be a usefull extension, but probably more as an item on the TODO list than something to implement right now. All things considered I'd go for using array.array from C, with a fallback to the PyBuffer API when you cannot import the array module. Ronald From neal@metaslash.com Sun Jan 5 20:14:55 2003 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 5 Jan 2003 15:14:55 -0500 Subject: [Python-Dev] bz2 problem deriving subclass in C In-Reply-To: <48FE8D34-1F5E-11D7-8857-000A27B19B96@oratrix.com> References: <20030103200359.GH29873@epoch.metaslash.com> <48FE8D34-1F5E-11D7-8857-000A27B19B96@oratrix.com> Message-ID: <20030105201455.GB29873@epoch.metaslash.com> On Fri, Jan 03, 2003 at 09:59:44PM +0100, Jack Jansen wrote: > > On vrijdag, jan 3, 2003, at 21:03 Europe/Amsterdam, Neal Norwitz wrote: > > >>Since you have the BZ2File type object, you have a reference to the > >>file type object. Can't you just call the tp_dealloc slot from that? > >>That seems a very reasonable approach from where I'm sitting. > > > >Makes sense to me too. I tried it and it didn't work. > >At least I think I tried it, but I may have called tp_free. > >I'm not sure now. > > The logic I use for generating the body of xxx_dealloc is now > def generate_dealloc(....): > generate("cleanup my own mess") > if basetype: > generate("basetype.tp_dealloc(self);") > elif new-style-object: > generate("self->ob_type->tp_free(self);") > else: > generate("PyObject_Free(self);") > > This seems to work. Or, at least, I haven't had a reproducible crash > yet:-) I think I found the correct solution, which seems to be different from Jack's code above. Perhaps, it's just my interpretation though. 
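[A runnable rendering of the quoted generate_dealloc pseudocode, offered only as a sketch. The function name dealloc_body and its parameters are hypothetical; the emitted C statements are copied from the quoted text, with the base type name substituted in.]

```python
def dealloc_body(basetype=None, new_style=False):
    """Return the C statements for the body of an xxx_dealloc function.

    basetype:  name of the C type object being subclassed, or None.
    new_style: whether the object is a new-style (type-based) object.
    """
    lines = ["/* cleanup my own mess */"]
    if basetype:
        # Subclass of another C type: hand off to the base type's dealloc.
        lines.append("%s.tp_dealloc(self);" % basetype)
    elif new_style:
        # New-style object with no C base type: free through the type slot.
        lines.append("self->ob_type->tp_free(self);")
    else:
        # Classic extension object.
        lines.append("PyObject_Free(self);")
    return lines


# A type derived from the file type takes the first branch.
for line in dealloc_body(basetype="PyFile_Type"):
    print(line)
```

[Note that only the first branch applies when a base type is given; the tp_free call is reached only when there is no base type.]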
The correct solution (well, I think it's correct, it solves my problem) was to change the code in tp_dealloc: - ((PyObject*)self)->ob_type->tp_free((PyObject *)self); + PyFile_Type.tp_dealloc((PyObject *)self); I expected the original (-) to work, which seems to correspond to Jack's example above. Neal From guido@python.org Sun Jan 5 21:16:13 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 05 Jan 2003 16:16:13 -0500 Subject: [Python-Dev] Cross compiling In-Reply-To: Your message of "Sat, 04 Jan 2003 19:12:03 PST." <200301050312.TAA03857@mail.arc.nasa.gov> References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> Message-ID: <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> > > Perhaps we should move forward and put a PendingDeprecationWarning > > in rexec for 2.3 (or maybe be harsher and making > > DeprecationWarning instead)? > > Yes, I certainly would be pleased with some form of deprecation > warning for the upcoming release, with perhaps a note in the docs > about using an appropriate operating system facility to do the > sandboxing instead. > > Of course, if others are truly interested in maintaining or > improving rexec(), even if it were removed from Python itself, > deprecation status might be a bit of a buzzkill. :) The problem is that the security issues are not in the rexec module itself, they are in the Python interpreter C code. (E.g. the new-style classes problem that Samuele mentioned.) So the issue isn't so much maintaining or fixing rexec, it's fixing the security issues in the Python interpreter. I think that the only way to arrive at a truly secure version of Python may be a semi-forked version, not quite in the style of OpenBSD, but at least on a separate branch that is being developed *only* to plug security leaks. In the mean time, I think the best thing to do about rexec.py is to *delete* it from the releases (both 2.2.3 and 2.3). 
Keeping it around, even with a warning, is an invitation for disasters. If people were using it seriously, they *should* be warned. --Guido van Rossum (home page: http://www.python.org/~guido/) From python@rcn.com Sun Jan 5 21:23:21 2003 From: python@rcn.com (Raymond Hettinger) Date: Sun, 5 Jan 2003 16:23:21 -0500 Subject: [Python-Dev] Cross compiling References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <003301c2b500$c54ed2e0$0311a044@oemcomputer> [GvR] > In the mean time, I think the best thing to do about rexec.py is to > *delete* it from the releases (both 2.2.3 and 2.3). Keeping it > around, even with a warning, is an invitation for disasters. If > people were using it seriously, they *should* be warned. When using Python as a embedded scripting language, rexec.py still has some value in blocking off part of the system from non-deliberate access. Raymond Hettinger From guido@python.org Sun Jan 5 21:24:51 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 05 Jan 2003 16:24:51 -0500 Subject: [Python-Dev] Cross compiling In-Reply-To: Your message of "Sat, 04 Jan 2003 19:38:05 PST." <1999844A-205F-11D7-99C2-0003933F3BC2@omnigroup.com> References: <1999844A-205F-11D7-99C2-0003933F3BC2@omnigroup.com> Message-ID: <200301052124.h05LOpV31030@pcp02138704pcs.reston01.va.comcast.net> [Timothy Wood] > >> The notes I read about the current state of sandboxing in Python > >> didn't make me warm and fuzzy -- I probably don't understand it (which > >> is one reason for the not-warm/not-fuzzy feeling). I'd feel warm and > >> fuzzy if the Python library routed through my OS abstraction layer for > >> everything rather than directly linking to the OS. [GvR] > > Sorry, you've lost me there. What is it that you want exactly? 
[TW] > Well, the general idea would be to look at the system calls that > Python uses and evaluate them on a security basis (pretty much by > looking at the undefined symbols in the library and C modules as > reported by nm). Things like fork(), unlink(), being able to open > random files for writing are not things that I need (or want to have to > worry over whether rexec protects me from malicious use of them). Um, Python uses the stdio library, which has all sorts of I/O implications (including the ability to overwrite arbitrary files). You don't seriously suggest that we stop using stdio? > So, one way to worry less would be for me to track down anywhere that > one of these functions is called and either not ship with that module > or instead call into my own OS abstraction layer in my game code. For > example, I could just have a table of function pointers that the > embedder could set -- if a entry was NULL, that would imply that the OS > doesn't support that capability or that the embedding application > doesn't want it used for whatever reason (security in my case). Somehow it sounds like you haven't studied Python's implementation much yet. Do you have any idea of the amount of work you're asking for here? > You could sort of look at this as taking the OS dependent modules and > splitting them in half -- one half would contain a somewhat abstracted > interface to the OS and the other half would contain the stuff to do > the desired Python stuff (file iterators, for example). What do you mean by OS dependent modules? Modules that depend on a specific OS, or modules that do I/O of any kind? Given that many modules interface to 3rd party libraries that do I/O (e.g. gzip, bz2, [g]dbm, and the list continues), it seems this would be impossible, unless you want to reduce the standard library to maybe 10 % of what it was. (rexec.py does this, by the way -- it replaces 'open' and only lets you import a handful of extension modules that are deemed safe.) 
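[A rough sketch of the "table of function pointers" idea from the quoted text, transliterated to Python for illustration. The proposal itself concerns C code inside the interpreter; the names OSLayer and CapabilityError are hypothetical, and a table like this closes none of the other routes to the OS (stdio, third-party libraries) raised in the reply.]

```python
import os


class CapabilityError(Exception):
    """Raised when the embedder has left a capability unset."""


class OSLayer:
    """Hypothetical OS abstraction layer: a table of callables.

    An entry of None plays the role of a NULL function pointer: the OS
    lacks the capability, or the embedding application has disabled it.
    """

    def __init__(self, table):
        self._table = dict(table)

    def call(self, name, *args, **kwargs):
        func = self._table.get(name)
        if func is None:
            raise CapabilityError("capability %r is not available" % name)
        return func(*args, **kwargs)


# The embedder allows getcwd but withholds unlink and fork.
layer = OSLayer({"getcwd": os.getcwd, "unlink": None, "fork": None})

print(layer.call("getcwd"))
try:
    layer.call("unlink", "some-file")
except CapabilityError as exc:
    print("refused:", exc)
```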
> I don't know if this could be of general interest -- probably not, > but I thought I'd mention it in case it is (since in that case I'd > spend more time making sure that my work was suitable for submission > back to Python instead of just hacking and slashing on my own local > copy :) Depends on your approach, which I still don't understand -- I recommend that you spend some time studying Python's implementation before commenting more. > This would be a very rudimentary means of achieving one type of > security, but it may be useful to some and fairly easy to maintain. > This clearly doesn't address potential overflow problems or DoS > attacks, but it make it much harder for malicious game players corrupt > other people's systems. If you don't care about potential overflows, you're not very security-minded in my book. After all a potential overflow can be abused by a crafty programmer to execute arbitrary machine code which most certainly will have access to system calls. --Guido van Rossum (home page: http://www.python.org/~guido/) From neal@metaslash.com Sun Jan 5 21:26:58 2003 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 5 Jan 2003 16:26:58 -0500 Subject: [Python-Dev] new features for 2.3? Message-ID: <20030105212658.GD29873@epoch.metaslash.com> I think a nice addition to the stdlib would be the tarfile module in http://python.org/sf/651082 I've reviewed the code (1927 lines of python). There is documentation and a test. Suggestions on adding this, waiting, or rejecting this module? Are there any other (big) features that are planned for 2.3? Neal From bbum@codefab.com Sun Jan 5 19:34:25 2003 From: bbum@codefab.com (Bill Bumgarner) Date: Sun, 5 Jan 2003 14:34:25 -0500 Subject: [Python-Dev] Re: [Pyobjc-dev] PyBuffer* vs. 
array.array() In-Reply-To: <9F98A166-20C7-11D7-8AB2-0003931CFE24@cistron.nl> Message-ID: On Sunday, Jan 5, 2003, at 11:06 US/Eastern, Ronald Oussoren wrote: > On Sunday, Jan 5, 2003, at 05:56 Europe/Amsterdam, Bill Bumgarner=20 > wrote: >> The real problem is that something created via array.array() comes=20 >> back as an object that behaves differently. Clearly, a bug in my=20 >> bridge code, yes, but raises a question: >> >> What should I be using? >> >> =46rom the python side, it appears to only be possible to create = buffer=20 >> style objects-- big byte bags-- via the array module [yes a =20 >> will work, but a byte-array seems so much more correct]. On the C=20 >> side, it seems that the only way to create a byte buffer like object=20= >> is to use the PyBuffer* API. > You could use the array module on the C side. That is more work than=20= > just using the PyBuffer API but would solve your problem. Looking at this more closely, it seems that the array module requires=20 that it 'owns' the pointer to the memory it contains. That is, it=20 allocates/deallocate the memory and-- the killer-- resizes the hunk of=20= memory at will. So, it appears that the answer is that I need to figure out how to=20 create true buffer objects from the python side. Which is trivial. I hadn't realized that 'buffer' is a primitive type=20= and, therefore, is effectively built in... Not so trivial-- primitive in 2.3, built-in function in 2.2. Need to=20= make sure that whatever I implement works with bost. Deal killer: results of buffer() are read-only on 2.2 (didn't check=20 '' type on 2.3). I would have to expose=20 PyBuffer_FromReadWriteObject() to Python to be able to create a=20 readwrite buffer object from within Python itself. End result: Stick with the current dichotomy. It is annoying, but not=20= killer.=13 > > However, what if I raw strings into the API. If I get array objects=20 > back from the API there still is an inconsistency. > Raw string? As in '1230x045678'? 
Since the underlying code specifically parses for objects that implement the character buffer API, strings work fine-- as does basically anything else that encapsulates a single-segment buffer. > I agree that array objects are more suitable for 'bags of bytes' than > the (builtin) alternatives. Note that some people seem to use Numeric > python for image processing (http://www.pfdubois.com/numpy/). Adding > support for using numeric arrays instead of array.array might be a > useful extension, but probably more as an item on the TODO list than > something to implement right now. Right. I also want the code to build/run against a stock installation of Python 2.2 or greater. I believe that array/buffer is a part of the core? I.e. will always be there? In the future, the answer may be to have a 'buffer factory' type API [on the C side more than the Python side] that, given a pointer and length, produces an instance of the appropriate buffer class, as configured by the developer. If the developer wishes to use the Numeric array classes (which I have not looked at yet), it would simply be a matter of implementing the appropriate 'delegate' method and making the factory aware of it. But, as you say, it is a TODO item as opposed to being on the immediate radar. > All things considered I'd go for using array.array from C, with a > fallback to the PyBuffer API when > you cannot import the array module. I was going to go down that path, but the deeper analysis of array (discussed above) indicates that it isn't the right answer. Instead, I'm going to see what it takes to create a true 'buffer' instance in Python. The answer may be that it can't be done and that the developer is simply going to have to live with the dichotomy between buffer and array. If that is the case, then this does sound like an issue truly pertinent to discussion on python-dev.
Notably, it would be useful to have a buffer/array that behaves the same on both sides of the Python/C wall and can encapsulate an arbitrary, fixed length, buffer of memory regardless of which side created the memory. It should likely also have the notion of 'ownership' of that memory-- i.e. whether or not it should deallocate the memory when the last reference to the object is destroyed. b.bum From guido@python.org Sun Jan 5 21:58:30 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 05 Jan 2003 16:58:30 -0500 Subject: [Python-Dev] PyBuffer* vs. array.array() In-Reply-To: Your message of "Sat, 04 Jan 2003 23:56:53 EST." <1BE17C58-206A-11D7-A03A-000393877AE4@codefab.com> References: <1BE17C58-206A-11D7-A03A-000393877AE4@codefab.com> Message-ID: <200301052158.h05LwUR31086@pcp02138704pcs.reston01.va.comcast.net> > In writing the unit tests, I came across a problematic situation that > could easily arise in code (feel free to comment on the silliness of > this code, if any... and note that I'm using the comprehension style > even after that long rant I posted earlier :-): > > singlePlane = array.array('B') > singlePlane.fromlist([0 for x in range(0, width*height*3)] ) I'm not sure if you were joking, but why not write singlePlane.fromlist([0] * (width*height*3)) ??? > for i in range(0, 256*256): > si = i * 3 > singlePlane[si] = rPlane[i] > singlePlane[si+1] = gPlane[i] > singlePlane[si+2] = bPlane[i] > > i2 = ... create i2 using data in singlePlane ... > .... then ... > > bitmapData = i2.bitmapData() > self.assertEquals(len(bitmapData), len(singlePlane)) > for i in range(0,100): > self.assertEquals(bitmapData[i], singlePlane[i], > "bitmapData and singlePlane differ at byte %d" % i) > > The contents of singlePlane and bitmapData are identical, but the unit > test fails: > > -- type of element 0 of bitmapData > -- type of element 0 of singlePlane > F.
> ====================================================================== > FAIL: testImageData (__main__.TestNSBitmapImageRep) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "Lib/AppKit/test/test_nsbitmapimagerep.py", line 74, in > testImageData > self.assertEquals(bitmapData[i], singlePlane[i], "bitmapData and > singlePlane differ at byte %d" % i) > File > "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/ > unittest.py", line 292, in failUnlessEqual > raise self.failureException, \ > AssertionError: bitmapData and singlePlane differ at byte 0 > > ---------------------------------------------------------------------- > Ran 2 tests in 1.564s > > FAILED (failures=1) > > -- > > The real problem is that something created via array.array() comes back > as an object that behaves differently. Clearly, a bug in my bridge > code, yes, but raises a question: > > What should I be using? > > From the python side, it appears to only be possible to create buffer > style objects-- big byte bags-- via the array module [yes, a will > work, but a byte-array seems so much more correct]. On the C side, it > seems that the only way to create a byte buffer like object is to use > the PyBuffer* API. > > Advice, please? > > BTW: All of this lays the foundation for creating a PILImage object in > Cocoa that is a seamless part of the NSImage family of classes. I.e. > PIL would then be fully integrated into any Cocoa app that can > dynamically load the python interpreter (another problem I'm working > through). I'm not sure I understand the problem. You could use the 'c' code for creating an array instead of 'B'. Or you can use the tostring() method on the array to convert it to a string. Or you could use buffer() on the array. But why don't you just use strings for binary data, like everyone else? 
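For readers following along on a current Python 3, the tostring() and buffer() suggestions can be sketched roughly as below. This is a translation under stated assumptions, not the 2.2/2.3 code from this thread: tostring() is spelled tobytes() there, buffer() has been replaced by memoryview(), and the 'c' typecode no longer exists, so 'B' is used throughout. The width and height values are placeholders.

```python
import array

width, height = 4, 2          # placeholder dimensions, not from the thread
n = width * height * 3

# One sequence repeat builds the zero-filled byte array.
single_plane = array.array('B', [0]) * n
single_plane[0] = 255

as_bytes = single_plane.tobytes()   # immutable copy (tostring() in Python 2)
view = memoryview(single_plane)     # no copy; stands in for Python 2's buffer()

# On Python 3 all three index to integers, so element-wise comparison agrees.
print(len(as_bytes), len(view), as_bytes[0], view[0])
```

On Python 3 the element types no longer differ between these objects, which is the mismatch the failing assertEquals() above ran into on 2.x.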
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Jan 5 22:00:15 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 05 Jan 2003 17:00:15 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: Your message of "Sun, 05 Jan 2003 16:26:58 EST." <20030105212658.GD29873@epoch.metaslash.com> References: <20030105212658.GD29873@epoch.metaslash.com> Message-ID: <200301052200.h05M0FB31135@pcp02138704pcs.reston01.va.comcast.net> > I think a nice addition to the stdlib would be the tarfile module in > http://python.org/sf/651082 > > I've reviewed the code (1927 lines of python). There is documentation > and a test. > > Suggestions on adding this, waiting, or rejecting this module? +1 > Are there any other (big) features that are planned for 2.3? Who knows. Read PEP 283. --Guido van Rossum (home page: http://www.python.org/~guido/) From Jack.Jansen@oratrix.com Sun Jan 5 22:12:12 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Sun, 5 Jan 2003 23:12:12 +0100 Subject: [Python-Dev] Need help with test_unicode failure In-Reply-To: Message-ID: On zaterdag, jan 4, 2003, at 23:22 Europe/Amsterdam, Martin v. Löwis wrote: > Jack Jansen writes: > >> Can anyone give me a hint as to where I should start looking? > > You need to verify a number of things: > 1. PyUnicode_Contains is invoked. > 2. the conversion PyUnicode_FromObject(container) reports an > exception. > 3. that exception is propagated to the caller. > > It is likely that 1 works ok. I can't see who the eventual caller > should be, but it is likely that 3 works as well. If the conversion > fails to produce an exception, a possible explanation would be that > the system encoding is not ASCII. Thanks, the latter is indeed true for MacPython (sys.getdefaultencoding() is your local MacOS default, mac-roman for me). I'll skip the test if sys.getdefaultencoding() != 'ascii', does that sound right?
-- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From xscottg@yahoo.com Sun Jan 5 22:19:09 2003 From: xscottg@yahoo.com (Scott Gilbert) Date: Sun, 5 Jan 2003 14:19:09 -0800 (PST) Subject: [Python-Dev] PyBuffer* vs. array.array() In-Reply-To: <200301052158.h05LwUR31086@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030105221909.61705.qmail@web40106.mail.yahoo.com> --- Guido van Rossum wrote: > > In writing the unit tests, I came across a problematic situation that > > could easily arise in code (feel free to comment on the silliness of > > this code, if any... and note that I'm using the comprehension style > > even after that long rant I posted earlier :-): > > > > singlePlane = array.array('B') > > singlePlane.fromlist([0 for x in range(0, width*height*3)] ) > > I'm not sure if you were joking, but why not write > > singlePlane.fromlist([0] * (width*height*3)) > > ??? > Or cheaper and faster for large width and height: singlePlane = array.array('B', [0])*width*height*3 From esr@thyrsus.com Sun Jan 5 22:15:41 2003 From: esr@thyrsus.com (Eric S. Raymond) Date: Sun, 5 Jan 2003 17:15:41 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: <20030105212658.GD29873@epoch.metaslash.com> References: <20030105212658.GD29873@epoch.metaslash.com> Message-ID: <20030105221541.GA17957@thyrsus.com> Neal Norwitz : > I think a nice addition to the stdlib would be the tarfile module in > http://python.org/sf/651082 > > I've reviewed the code (1927 lines of python). There is documentation > and a test. > > Suggestions on adding this, waiting, or rejecting this module? +1 -- Eric S.
Raymond From Jack.Jansen@oratrix.com Sun Jan 5 22:31:15 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Sun, 5 Jan 2003 23:31:15 +0100 Subject: [Python-Dev] bz2 problem deriving subclass in C In-Reply-To: <20030105201455.GB29873@epoch.metaslash.com> Message-ID: <670E79F1-20FD-11D7-B774-000A27B19B96@oratrix.com> On zondag, jan 5, 2003, at 21:14 Europe/Amsterdam, Neal Norwitz wrote: >> The logic I use for generating the body of xxx_dealloc is now >> def generate_dealloc(....): >> generate("cleanup my own mess") >> if basetype: >> generate("basetype.tp_dealloc(self);") >> elif new-style-object: >> generate("self->ob_type->tp_free(self);") >> else: >> generate("PyObject_Free(self);") >> >> This seems to work. Or, at least, I haven't had a reproducible crash >> yet:-) > > I think I found the correct solution, which seems to be different > from Jack's code above. Perhaps, it's just my interpretation though. > > The correct solution (well, I think it's correct, it solves my problem) > was to change the code in tp_dealloc: > > - ((PyObject*)self)->ob_type->tp_free((PyObject *)self); > + PyFile_Type.tp_dealloc((PyObject *)self); Your solution is the same as mine. Your type has a base type (PyFile_Type) so you should call the tp_dealloc() of the basetype. All the way at the bottom of the hierarchy (PyFile_Type, in this case) tp_free will be called. And note that this is the tp_free of the type at the *top* of the hierarchy: it has been responsible for allocating the storage, so it will also know if deallocating has to be done in a non-standard way. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From guido@python.org Sun Jan 5 22:53:11 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 05 Jan 2003 17:53:11 -0500 Subject: [Python-Dev] PyBuffer* vs. array.array() In-Reply-To: Your message of "Sun, 05 Jan 2003 14:19:09 PST." 
<20030105221909.61705.qmail@web40106.mail.yahoo.com> References: <20030105221909.61705.qmail@web40106.mail.yahoo.com> Message-ID: <200301052253.h05MrBb31241@pcp02138704pcs.reston01.va.comcast.net> > > I'm not sure if you were joking, but why not write > > > > singlePlane.fromlist([0] * (width*height*3)) > > Or cheaper and faster for large width and height: > > singlePlane = array.array('B', [0])*width*height*3 Correct; then even better: singlePlane = array.array('B', [0]) * (width*height*3) i.e. do only one sequence repeat rather than three. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Jan 5 22:54:28 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 05 Jan 2003 17:54:28 -0500 Subject: [Python-Dev] Cross compiling In-Reply-To: Your message of "Sun, 05 Jan 2003 16:23:21 EST." <003301c2b500$c54ed2e0$0311a044@oemcomputer> References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> <003301c2b500$c54ed2e0$0311a044@oemcomputer> Message-ID: <200301052254.h05MsSv31255@pcp02138704pcs.reston01.va.comcast.net> > When using Python as a embedded scripting language, rexec.py > still has some value in blocking off part of the system from > non-deliberate access. I don't understand. Why not simply remove the features you don't want to make available? Can you give an example? --Guido van Rossum (home page: http://www.python.org/~guido/) From Jack.Jansen@oratrix.com Sun Jan 5 23:14:29 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Mon, 6 Jan 2003 00:14:29 +0100 Subject: [Python-Dev] Re: [Pyobjc-dev] PyBuffer* vs. array.array() In-Reply-To: <9F98A166-20C7-11D7-8AB2-0003931CFE24@cistron.nl> Message-ID: <70FF30BC-2103-11D7-B774-000A27B19B96@oratrix.com> Note that there's the buffer *interface* and the buffer *object*. 
You want to use the buffer interface here; then on the Python side there are lots of objects that will work (arrays, NumPy arrays, strings for readonly access, etc). -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From tismer@tismer.com Sun Jan 5 23:26:41 2003 From: tismer@tismer.com (Christian Tismer) Date: Mon, 06 Jan 2003 00:26:41 +0100 Subject: [Python-Dev] PyBuffer* vs. array.array() References: <20030105221909.61705.qmail@web40106.mail.yahoo.com> <200301052253.h05MrBb31241@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E18BF31.2050409@tismer.com> Guido van Rossum wrote: >>>I'm not sure if you were joking, but why not write >>> >>> singlePlane.fromlist([0] * (width*height*3)) >> >> Or cheaper and faster for large width and height: >> >> singlePlane = array.array('B', [0])*width*height*3 > > > Correct; then even better: > > singlePlane = array.array('B', [0]) * (width*height*3) > > i.e. do only one sequence repeat rather than three. For "large" widths and heights, like 1000*1000, this effect is remarkably small: About 3 percent only. The above is true for simple lists. There are also counterexamples, where you are extremely wrong (sorry), most probably due to the implementation, but also due to the effect that medium sized flat objects can be copied more efficiently than very small ones. >>> if 1: ... t = time.clock() ... for i in xrange(100): ... s = ' ' * 1000 * 1000 ... print time.clock()-t ... 0.674784644417 >>> if 1: ... t = time.clock() ... for i in xrange(100): ... s = ' ' * 1000000 ... print time.clock()-t ... 6.28695295072 >>> Did I hear your head knocking on the keyboard? ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break!
Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tismer@tismer.com Sun Jan 5 23:46:04 2003 From: tismer@tismer.com (Christian Tismer) Date: Mon, 06 Jan 2003 00:46:04 +0100 Subject: Slow String Repeat (was: [Python-Dev] PyBuffer* vs. array.array()) References: <20030105221909.61705.qmail@web40106.mail.yahoo.com> <200301052253.h05MrBb31241@pcp02138704pcs.reston01.va.comcast.net> <3E18BF31.2050409@tismer.com> Message-ID: <3E18C3BC.1020108@tismer.com> Christian Tismer wrote: > Guido van Rossum wrote: ... >> Correct; then even better: >> >> singlePlane = array.array('B', [0]) * (width*height*3) >> >> i.e. do only one sequence repeat rather than three. Here is an addition to my former note. Doing some simple analysis of this, I found that it is generally safer *not* to do huge repetitions of very small objects. If you always use intermediate steps, you are creating some slight overhead, but you will never step into traps like these: > >>> if 1: > ... t = time.clock() > ... for i in xrange(100): > ... s = ' ' * 1000 * 1000 > ... print time.clock()-t > ... > 0.674784644417 > >>> if 1: > ... t = time.clock() > ... for i in xrange(100): > ... s = ' ' * 1000000 > ... print time.clock()-t > ... > 6.28695295072 > >>> Analysis: The central copying code in stringobject.c is the following tight loop: for (i = 0; i < size; i += a->ob_size) memcpy(op->ob_sval+i, a->ob_sval, (int) a->ob_size); For my example, this memcpy is started for every single one of the one million bytes. So the overhead of memcpy, be it a function call or a macro, will be executed a million times. On the other hand, doing ' ' * 1000 * 1000 only has to call memcpy 2000 times.
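A rough way to check the one-million versus 2000 figures is to count loop iterations, assuming (as in the loop quoted above) exactly one memcpy per copy of the source string. The helper names below are made up for illustration, and the sketch is written for a current Python rather than the 2.2/2.3 interpreters discussed here; it models the call count only, not the actual timings.

```python
def memcpy_calls(item_len, count):
    """Model the quoted C loop: one memcpy per step of item_len bytes."""
    size = item_len * count
    calls = 0
    i = 0
    while i < size:
        calls += 1          # stands in for one memcpy(op->ob_sval+i, ...)
        i += item_len
    return calls


def chained_repeat_calls(item_len, counts):
    """Total modelled memcpy calls for item * counts[0] * counts[1] * ..."""
    total = 0
    for count in counts:
        total += memcpy_calls(item_len, count)
        item_len *= count   # the result is the source of the next repeat
    return total


one_step = chained_repeat_calls(1, [1000000])      # ' ' * 1000000
two_steps = chained_repeat_calls(1, [1000, 1000])  # ' ' * 1000 * 1000
print(one_step, two_steps)
```

Under this model the single big repeat costs 1000000 calls, while the chained repeat costs 1000 calls for the first step plus 1000 for the second.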
My advice: Do not go from very small to very large in one big step, but go to reasonable chunks. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From guido@python.org Mon Jan 6 00:33:30 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 05 Jan 2003 19:33:30 -0500 Subject: Slow String Repeat (was: [Python-Dev] PyBuffer* vs. array.array()) In-Reply-To: Your message of "Mon, 06 Jan 2003 00:46:04 +0100." <3E18C3BC.1020108@tismer.com> References: <20030105221909.61705.qmail@web40106.mail.yahoo.com> <200301052253.h05MrBb31241@pcp02138704pcs.reston01.va.comcast.net> <3E18BF31.2050409@tismer.com> <3E18C3BC.1020108@tismer.com> Message-ID: <200301060033.h060XZe31484@pcp02138704pcs.reston01.va.comcast.net> > > >>> if 1: > > ... t = time.clock() > > ... for i in xrange(100): > > ... s = ' ' * 1000 * 1000 > > ... print time.clock()-t > > ... > > 0.674784644417 > > >>> if 1: > > ... t = time.clock() > > ... for i in xrange(100): > > ... s = ' ' * 1000000 > > ... print time.clock()-t > > ... > > 6.28695295072 > > >>> > > Analysis: > The central copying code in stringobject.c is the following > tight loop: > > for (i = 0; i < size; i += a->ob_size) > memcpy(op->ob_sval+i, a->ob_sval, (int) a->ob_size); > > For my example, this memcpy is started for every single > of the one million bytes. So the overhead of memcpy, > let is be a function call or a macro, will be executed > a million times. > > On the other hand, doing ' ' * 1000 * 1000 only > has to call memcpy 2000 times. Do I hear a challenge for Raymond H? 
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jan 6 00:36:24 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 05 Jan 2003 19:36:24 -0500 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time In-Reply-To: Your message of "Sat, 04 Jan 2003 01:57:00 EST." Message-ID: <200301060036.h060aOp31522@pcp02138704pcs.reston01.va.comcast.net> If anyone is interested, I have a geometric proof that the astimezone() method is correct. I personally find the geometric proof easier to understand than Tim's analytic proof. It's also shorter -- except it needs drawings. Should I bother writing it down? --Guido van Rossum (home page: http://www.python.org/~guido/) From zack@codesourcery.com Mon Jan 6 00:41:18 2003 From: zack@codesourcery.com (Zack Weinberg) Date: Sun, 05 Jan 2003 16:41:18 -0800 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <15892.48142.993092.418042@montanaro.dyndns.org> (Skip Montanaro's message of "Thu, 2 Jan 2003 16:24:14 -0600") References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <15892.48142.993092.418042@montanaro.dyndns.org> Message-ID: <877kdjgle9.fsf@egil.codesourcery.com> Skip Montanaro writes: > It seems there is little support for apply, it having already been > deprecated. Apply has been deprecated?! What am I supposed to use instead? 
zw From ark@research.att.com Mon Jan 6 00:45:38 2003 From: ark@research.att.com (Andrew Koenig) Date: Sun, 5 Jan 2003 19:45:38 -0500 (EST) Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <877kdjgle9.fsf@egil.codesourcery.com> (message from Zack Weinberg on Sun, 05 Jan 2003 16:41:18 -0800) References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <15892.48142.993092.418042@montanaro.dyndns.org> <877kdjgle9.fsf@egil.codesourcery.com> Message-ID: <200301060045.h060jcf10301@europa.research.att.com> Zack> Apply has been deprecated?! What am I supposed to use instead? You can replace apply(f, args) by f(*args) and apply(f, args, kwds) by f(*args, **kwds). From zack@codesourcery.com Mon Jan 6 00:49:04 2003 From: zack@codesourcery.com (Zack Weinberg) Date: Sun, 05 Jan 2003 16:49:04 -0800 Subject: [Python-Dev] map, filter, reduce, lambda In-Reply-To: <877kdjgle9.fsf@egil.codesourcery.com> (Zack Weinberg's message of "Sun, 05 Jan 2003 16:41:18 -0800") References: <15892.27067.231224.233677@montanaro.dyndns.org> <20030102174201.GA29771@panix.com> <200301021754.h02HsZr19162@odiug.zope.com> <20030102183134.GA23582@meson.dyndns.org> <200301021853.h02IrYk20504@odiug.zope.com> <15892.48142.993092.418042@montanaro.dyndns.org> <877kdjgle9.fsf@egil.codesourcery.com> Message-ID: <874r8ngl1b.fsf@egil.codesourcery.com> Zack Weinberg writes: > Skip Montanaro writes: > >> It seems there is little support for apply, it having already been >> deprecated. > > Apply has been deprecated?! What am I supposed to use instead? ... sorry, ignore me, forgot that function call can be applied to variables ... 
zw From dave@boost-consulting.com Mon Jan 6 01:05:09 2003 From: dave@boost-consulting.com (David Abrahams) Date: Sun, 05 Jan 2003 20:05:09 -0500 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time In-Reply-To: <200301060036.h060aOp31522@pcp02138704pcs.reston01.va.comcast.net> (Guido van Rossum's message of "Sun, 05 Jan 2003 19:36:24 -0500") References: <200301060036.h060aOp31522@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > If anyone is interested, I have a geometric proof that the > astimezone() method is correct. I personally find the geometric proof > easier to understand than Tim's analytic proof. It's also shorter -- > except it needs drawings. Should I bother writing it down? I think it's worthwhile if it's going to go somewhere people will see it later (like in the source). Lots of people, including me, do better with drawings. -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From Jack.Jansen@oratrix.com Mon Jan 6 01:37:24 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Mon, 6 Jan 2003 02:37:24 +0100 Subject: [Python-Dev] PyBuffer* vs. array.array() In-Reply-To: <3E18BF31.2050409@tismer.com> Message-ID: <67DB665C-2117-11D7-BB12-000A27B19B96@oratrix.com> On maandag, jan 6, 2003, at 00:26 Europe/Amsterdam, Christian Tismer wrote: > There are also counterexamples, where you are > extremely wrong [...] > >>> if 1: > ... t = time.clock() > ... for i in xrange(100): > ... s = ' ' * 1000 * 1000 > ... print time.clock()-t > ... > 0.674784644417 > >>> if 1: > ... t = time.clock() > ... for i in xrange(100): > ... s = ' ' * 1000000 > ... print time.clock()-t > ... > 6.28695295072 > >>> That's interesting... This must have something to do with the memory allocator becoming very inefficient for these big blocks. 
I repeated this with a larger repeat count, and the VM size doesn't increase so it's not a leak or swap problem, it looks like it's something in the implementation of the internal administration. There does appear to be an oscillation in the resident set size, whatever that means... (I had just, in private email, (jokingly) called Guido pedantic for his reply. Reality always turns out to be more interesting than it appears:-) -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From pedronis@bluewin.ch Mon Jan 6 00:38:44 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 6 Jan 2003 01:38:44 +0100 Subject: [Python-Dev] Cross compiling References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> <003301c2b500$c54ed2e0$0311a044@oemcomputer> Message-ID: <01b301c2b51b$f86358c0$6d94fea9@newmexico> From: "Raymond Hettinger" > [GvR] > > In the mean time, I think the best thing to do about rexec.py is to > > *delete* it from the releases (both 2.2.3 and 2.3). Keeping it > > around, even with a warning, is an invitation for disasters. If > > people were using it seriously, they *should* be warned. > > When using Python as a embedded scripting language, rexec.py > still has some value in blocking off part of the system from > non-deliberate access. Should we keep Bastion too or should it go? >From Bastion.py __doc__ : "... Bastions have a number of uses, but the most obvious one is to provide code executing in restricted mode with a safe interface to an object implemented in unrestricted mode..." consider this potential setup (inspired by the Bastion test code in Bastion.py): Python 2.3a1 (#38, Dec 31 2002, 17:53:59) [MSC v.1200 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> class C: pass ... 
>>> import Bastion; b=Bastion.Bastion(C()) >>> import rexec; r=rexec.RExec() >>> r.add_module('__main__').b=b >>> what can be done inside r? a bastionized empty instance b of a classic class, a restricted enviroment... From Jack.Jansen@oratrix.com Mon Jan 6 01:45:03 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Mon, 6 Jan 2003 02:45:03 +0100 Subject: Slow String Repeat (was: [Python-Dev] PyBuffer* vs. array.array()) In-Reply-To: <3E18C3BC.1020108@tismer.com> Message-ID: <79CC342C-2118-11D7-BB12-000A27B19B96@oratrix.com> On maandag, jan 6, 2003, at 00:46 Europe/Amsterdam, Christian Tismer wrote: > The central copying code in stringobject.c is the following > tight loop: > > for (i = 0; i < size; i += a->ob_size) > memcpy(op->ob_sval+i, a->ob_sval, (int) a->ob_size); > > For my example, this memcpy is started for every single > of the one million bytes. So the overhead of memcpy, > let is be a function call or a macro, will be executed > a million times. Oops, I replied before seeing this message, this does sound plausible. But that gives an easy way to fix it: for copies larger than a certain factor just copy the source object, then duplicate the source object until you're at size/2, then duplicat the last bit. That is, if it is worth the trouble to optimize this, -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From mhammond@skippinet.com.au Mon Jan 6 02:14:44 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Mon, 6 Jan 2003 13:14:44 +1100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <000601c2b529$62a85200$530f8490@eden> [Tim] > [David Abrahams] > > ... > > but AFAICT there _is_ a Python core issue here: there's no > > way to find out whether you've already got the GIL**, so if you > > _might_ have to acquire it, you must always acquire it. > > > > **If I'm wrong about that, please let me know. 
It isn't > > obvious from the documentation. > It's true -- you can't know whether you have the GIL, unless > you code up another layer of your own machinery to keep track > of who has the GIL. Mark Hammond faces this issue (in all its > multifacted glories) in the Win32 extensions, and built some C++ > classes there to help him out. It's difficult at best, and last I > looked (several years ago) I wasn't convinced. We do have a real problem here, and I keep stumbling across it. So far, this issue has hit me in the win32 extensions, in Mozilla's PyXPCOM, and even in Gordon's "installer". IMO, the reality is that the Python external thread-state API sucks. I can boldly make that assertion as I have heard many other luminaries say it before me. As Tim suggests, time is the issue. I fear the only way to approach this is with a PEP. We need to clearly state our requirements, and clearly show scenarios where interpreter states, thread states, the GIL etc all need to cooperate. Eg, InterpreterState's seem YAGNI, but manage to complicate using ThreadStates, which are certainly YNI. The ability to "unconditionally grab the lock" may be useful, as may a construct meaning "I'm calling out to/in from an external API" discrete from the current singular "release/acquire the GIL" construct available today. I'm willing to help out with this, but not take it on myself. I have a fair bit to gain - if I can avoid toggling locks every time I call out to each and every function there would be some nice perf gains to be had, and horrible code to remove. Once I clear the mail from my break I will try and find the thread-sig conclusions... Mark. From tjreedy@udel.edu Mon Jan 6 02:52:49 2003 From: tjreedy@udel.edu (Terry Reedy) Date: Sun, 5 Jan 2003 21:52:49 -0500 Subject: [Python-Dev] Re: Slow String Repeat (was: PyBuffer* vs. 
array.array()) References: <20030105221909.61705.qmail@web40106.mail.yahoo.com> <200301052253.h05MrBb31241@pcp02138704pcs.reston01.va.comcast.net> <3E18BF31.2050409@tismer.com> <3E18C3BC.1020108@tismer.com> <200301060033.h060XZe31484@pcp02138704pcs.reston01.va.comcast.net> Message-ID: "Guido van Rossum" wrote in message > > Analysis: > > The central copying code in stringobject.c is the following > > tight loop: > > > > for (i = 0; i < size; i += a->ob_size) > > memcpy(op->ob_sval+i, a->ob_sval, (int) a->ob_size); > > > > For my example, this memcpy is started for every single > > of the one million bytes. So the overhead of memcpy, > > let is be a function call or a macro, will be executed > > a million times. > > > > On the other hand, doing ' ' * 1000 * 1000 only > > has to call memcpy 2000 times. > > Do I hear a challenge for Raymond H? :-) If the challenge is to minimize memcpy calls, I believe one should copy from a to op (->ob_sval) just once, copy and double the initialized portion of op until it is at least size//2 + size%2, and then fill in the rest of op with one last call. This will take log2(n) + 1* calls, which is 20 for n==1000000. The two lines above would expand to about ten. There would be a threshold, depending on n and perhaps the template size (and the system), at which the decrease in calls would pay for the extra overhead associated with each. Terry J. Reedy From dave@boost-consulting.com Mon Jan 6 03:18:19 2003 From: dave@boost-consulting.com (David Abrahams) Date: Sun, 05 Jan 2003 22:18:19 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <000601c2b529$62a85200$530f8490@eden> ("Mark Hammond"'s message of "Mon, 6 Jan 2003 13:14:44 +1100") References: <000601c2b529$62a85200$530f8490@eden> Message-ID: "Mark Hammond" writes: > We do have a real problem here, and I keep stumbling across it. So far, > this issue has hit me in the win32 extensions, in Mozilla's PyXPCOM, and > even in Gordon's "installer".
IMO, the reality is that the Python external > thread-state API sucks. I can boldly make that assertion as I have heard > many other luminaries say it before me. As Tim suggests, time is the issue. > > I fear the only way to approach this is with a PEP. We need to clearly > state our requirements, and clearly show scenarios where interpreter states, > thread states, the GIL etc all need to cooperate. Eg, InterpreterState's > seem YAGNI, but manage to complicate using ThreadStates, which are certainly > YNI. The ability to "unconditionally grab the lock" may be useful, as may a > construct meaning "I'm calling out to/in from an external API" discrete from > the current singular "release/acquire the GIL" construct available today. > > I'm willing to help out with this, but not take it on myself. I have a fair > bit to gain - if I can avoid toggling locks every time I call out to each > and every function there would be some nice perf gains to be had, and > horrible code to remove. I'm also willing to lend a hand with a PEP, if it's worth anything. I don't know as much about the problems in this domain as you do; I've only seen this one example that bit me. I'm prepared to spend a few brain cycles on it and help with the writing, though. -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From tdelaney@avaya.com Mon Jan 6 04:58:58 2003 From: tdelaney@avaya.com (Delaney, Timothy) Date: Mon, 6 Jan 2003 15:58:58 +1100 Subject: [Python-Dev] map, filter, reduce, lambda Message-ID: > From: Guido van Rossum [mailto:guido@python.org] > > Moreover, def is short for define, and we don't really define > something here. Maybe callback would work? 
I'd add the parentheses > back to the argument list for more uniformity with def argument lists, > so we'd get: > > lst.sort(callback(a, b): cmp(a.lower(), b.lower())) Probably the thing I dislike most about lambda is the lack of parentheses on the parameters. Uniformity is good. > But I still think that to the casual programmer a two-liner looks > better: > > def callback(a, b): return cmp(a.lower(), b.lower()) > lst.sort(callback) However, for some of us, that's actually a 4-liner ... def callback (a, b): return cmp(a.lower(), b.lower()) lst.sort(callback) I like lambdas, but definitely restrict myself to things that can be fully expressed on a single line - and that's including the surrounding cruft. Anything more than that and the simplicity of the lambda goes away. But then again, I would *really* hate to have map(), filter() and reduce() disappear - I find them so much more readable than list comprehensions most of the time. Tim Delaney From tim.one@comcast.net Mon Jan 6 06:10:01 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 06 Jan 2003 01:10:01 -0500 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time In-Reply-To: <200301060036.h060aOp31522@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > If anyone is interested, I have a geometric proof that the > astimezone() method is correct. I personally find the geometric proof > easier to understand than Tim's analytic proof. I hope so . Diagrams give insight, algebra gives certainty and usually teases out assumptions better. > It's also shorter -- except it needs drawings. Should I bother > writing it down? Sure! Include it with the analytic proof in the source file. Half the words there are explaining the edge cases and what we want to do when we see one, and so are resuable. BTW, the analytic proof is about three times as long as it needs to be, because it explains each tiny step in detail. More typical would be something like Let y = x but in tz's time zone. Let z = y - x.o + y.s. 
Then diff = (x.n - x.o) - (z.n - z.o) = x.n - x.o - y.n + x.o - y.s + z.o = z.o - y.s = z.d We're done iff z.d = 0, and if it is then we have the std time spelling we want in the start-of-DST case. Else let z' = z + z.d. Then diff' = (x.n - x.o) - (z'.n - z'.o) = x.n - x.o - z.n - x.n + x.o + z.n - z.o + z'.o = z'.o - z.o = z'.d - z.d So now we're done iff z'.d = z.d. If not, we must be in the end-of-DST case (there is no UTC equivalent to x in tz's local time), so we want z (in daylight time) instead of z' (in std time for most realistic time zones, but perhaps in a different branch of double-daylight time -- figuring out exactly how it can be that z'.d != z.d at this point is an example of the analytic method forcing assumptions into the open). From python@rcn.com Mon Jan 6 07:49:50 2003 From: python@rcn.com (Raymond Hettinger) Date: Mon, 6 Jan 2003 02:49:50 -0500 Subject: Slow String Repeat (was: [Python-Dev] PyBuffer* vs. array.array()) References: <20030105221909.61705.qmail@web40106.mail.yahoo.com> <200301052253.h05MrBb31241@pcp02138704pcs.reston01.va.comcast.net> <3E18BF31.2050409@tismer.com> <3E18C3BC.1020108@tismer.com> <200301060033.h060XZe31484@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <006501c2b558$31a14fc0$125ffea9@oemcomputer> ----- Original Message ----- From: "Guido van Rossum" To: "Christian Tismer" Cc: Sent: Sunday, January 05, 2003 7:33 PM Subject: Re: Slow String Repeat (was: [Python-Dev] PyBuffer* vs. array.array()) > > > >>> if 1: > > > ... t = time.clock() > > > ... for i in xrange(100): > > > ... s = ' ' * 1000 * 1000 > > > ... print time.clock()-t > > > ... > > > 0.674784644417 > > > >>> if 1: > > > ... t = time.clock() > > > ... for i in xrange(100): > > > ... s = ' ' * 1000000 > > > ... print time.clock()-t > > > ... 
> > > 6.28695295072 > > > >>> > > > > Analysis: > > The central copying code in stringobject.c is the following > > tight loop: > > > > for (i = 0; i < size; i += a->ob_size) > > memcpy(op->ob_sval+i, a->ob_sval, (int) a->ob_size); > > > > For my example, this memcpy is started for every single > > of the one million bytes. So the overhead of memcpy, > > let is be a function call or a macro, will be executed > > a million times. > > > > On the other hand, doing ' ' * 1000 * 1000 only > > has to call memcpy 2000 times. > > Do I hear a challenge for Raymond H? :-) > > --Guido van Rossum (home page: http://www.python.org/~guido/) This ought to do it: i = 0; if (size >= 1 ){ // copy in first one memcpy(op->ob_sval, a->ob_sval, (int) a->ob_size); i = (int) a->ob_size; } for ( ; i + i < size ; i <<= 1) // grow in doubles memcpy(op->ob_sval+i, op->ob_sval, i); if (i < size ) // fill remainder memcpy(op->ob_sval+i, op->ob_sval, size-i); Raymond Hettinger
From mal@lemburg.com Mon Jan 6 10:22:58 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 06 Jan 2003 11:22:58 +0100 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time In-Reply-To: References: Message-ID: <3E195902.5040709@lemburg.com> Tim Peters wrote: > Half the > words there are explaining the edge cases and what we want to do when we see > one, and so are resuable. BTW, the analytic proof is about three times as > long as it needs to be, because it explains each tiny step in detail. More > typical would be something like > > Let y = x but in tz's time zone. > Let z = y - x.o + y.s. Care to explain what .o, .n and .s mean for the casual reader ;-) > Then diff = (x.n - x.o) - (z.n - z.o) = > x.n - x.o - y.n + x.o - y.s + z.o = > z.o - y.s = > z.d > > We're done iff z.d = 0, and if it is then we have the std time spelling we > want in the start-of-DST case. > > Else let z' = z + z.d.
> > Then diff' = (x.n - x.o) - (z'.n - z'.o) = > x.n - x.o - z.n - x.n + x.o + z.n - z.o + z'.o = > z'.o - z.o = > z'.d - z.d > > So now we're done iff z'.d = z.d. If not, we must be in the end-of-DST case > (there is no UTC equivalent to x in tz's local time), so we want z (in > daylight time) instead of z' (in std time for most realistic time zones, but > perhaps in a different branch of double-daylight time -- figuring out > exactly how it can be that z'.d != z.d at this point is an example of the > analytic method forcing assumptions into the open). I think you are missing a point here: time zones don't have DST. Each time zone describes a fixed offset from UTC and whenever the locale applies DST, the time zone for that locale switches to a new time zone, e.g. MET becomes MEST. The disruption you are seeing at the DST switch times comes from switching the time zone and is not caused by the locale's clocks doing a "jump" like in a Feynman diagram. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Mon Jan 6 16:57:08 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 06 Jan 2003 11:57:08 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: Your message of "Sun, 05 Jan 2003 17:00:15 EST." Message-ID: <200301061657.h06Gv8Z14933@odiug.zope.com> I've thought of two things I'd like to land in alpha 2. Both are inspired by Zope3 needs (but this doesn't mean Zope Corp is now taking over making decisions). First of all, reST is going to be used a lot in Zope3. Maybe it could become a standard library module? Next, I really, really, really would like to improve pickling of new-style classes. 
I've finally come to the conclusion that any solution to making pickled new-style class instances (and hence pickled datetime objects) more efficient will require adding new codes to the pickle protocol. We can do that in Python 2.3. Because this is backwards incompatible, I propose that you have to request this protocol explicitly. I propose to "upgrade" the binary flag to a general "protocol version" flag, with values: 0 - original protocol; 1 - binary protocol; 2 - new protocol. The new protocol can contain an explicit pickle code for the new datetime objects. That's about all the thinking I've done so far. We need to decide on the new format, but first we must figure out how to efficiently pickle and unpickle subclass instances of (picklable) built-in types, preferably without having to copy all the data twice, and instances of new-style classes with slots. And we need to implement these twice: in Python for pickle.py and in C for cPickle.c. I'd also like to get rid of __safe_for_unpickling__ and all other pseudo security features. Attempting to unpickle pickles from an untrusted source is insane, and nothing can help us there; I'd rather make the marshal protocol bulletproof (all it needs is a few more checks for inconsistent data and a little better error handling). --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Jan 6 13:33:33 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 06 Jan 2003 14:33:33 +0100 Subject: [Python-Dev] PEP 297: Support for System Upgrades In-Reply-To: <15895.45479.235214.554416@gargle.gargle.HOWL> References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com> <15895.26084.121086.435407@gargle.gargle.HOWL> <200301050049.h050nqa21172@pcp02138704pcs.reston01.va.comcast.net> <15895.45479.235214.554416@gargle.gargle.HOWL> Message-ID: <3E1985AD.8010202@lemburg.com> Barry A. Warsaw wrote: >>>>>>"GvR" == Guido van Rossum writes: > > > GvR> Excellent point.
> > GvR> Or a micro-release should clear out the system-packages > GvR> directory. > > The only reason I'd rather not do that is so that if a package still > needs an update for the new Python micro release, a sysadmin could at > least copy the package over from one version to the next. +1 Ok, then, let's call the dir "site-upgrades-" with being major.minor.patchlevel. It seems that only site.py needs to be changed. Hmm, but what happens if someone invokes Python with -S (don't load site.py) ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tismer@tismer.com Mon Jan 6 14:42:01 2003 From: tismer@tismer.com (Christian Tismer) Date: Mon, 06 Jan 2003 15:42:01 +0100 Subject: Slow String Repeat (was: [Python-Dev] PyBuffer* vs. array.array()) References: <20030105221909.61705.qmail@web40106.mail.yahoo.com> <200301052253.h05MrBb31241@pcp02138704pcs.reston01.va.comcast.net> <3E18BF31.2050409@tismer.com> <3E18C3BC.1020108@tismer.com> <200301060033.h060XZe31484@pcp02138704pcs.reston01.va.comcast.net> <006501c2b558$31a14fc0$125ffea9@oemcomputer> Message-ID: <3E1995B9.2010608@tismer.com> Raymond Hettinger wrote: [ " " * 1000000 ten times slower than " " * 1000 * 1000 ] > This ought to do it: > > > i = 0; > if (size >= 1 ){ // copy in first one > memcpy(op->ob_sval, a->ob_sval, (int) a->ob_size); > i = (int) a->ob_size; > } > for ( ; i + i < size ; i <<= 1) // grow in doubles > memcpy(op->ob_sval+i, op->ob_sval, i); > if (i < size ) // fill remainder > memcpy(op->ob_sval+i, op->ob_sval, size-i); Looks good, not too much code to add. This solves the problem for string repetition. 
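[Editor's sketch, not part of the original thread.] The doubling scheme quoted above can be modelled in Python to check its behaviour and count the copy operations. This is only an illustration under stated assumptions: `repeat_by_doubling` is a hypothetical name, a `bytearray` stands in for `op->ob_sval`, and each slice assignment stands in for one `memcpy` call. It is not the code that went into stringobject.c.

```python
def repeat_by_doubling(unit: bytes, count: int):
    """Return (unit * count, copies), where copies counts memcpy-like steps.

    Hypothetical helper mirroring the three phases of the quoted C code.
    """
    size = len(unit) * count
    out = bytearray(size)          # stands in for the preallocated op->ob_sval
    copies = 0
    i = 0
    if size >= 1:                  # copy in the first unit
        out[0:len(unit)] = unit
        i = len(unit)
        copies += 1
    while i + i < size:            # grow in doubles
        out[i:i + i] = out[0:i]
        i <<= 1
        copies += 1
    if i < size:                   # fill the remainder
        out[i:size] = out[0:size - i]
        copies += 1
    return bytes(out), copies
```

For a one-byte unit repeated 1000000 times this model performs 21 copies (one initial copy, 19 doublings, one final fill), in line with the estimate of roughly log2(n) calls given earlier in the thread.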
Theoretically, there are other objects which might expose a similar problem: If the overhead of starting the copy process is larger than the actual copying process, then it is cheaper to do the repeat operation stepwise. Does it make sense to look for every possible occurrence of this and fix it as above? I could imagine not to change string repeat and others, but the abstract implementation of the repetition of a sequence. An algorithm like the above could be written for general sequences, and do this break-up on the abstract level, once and for all repetitions of arbitrary objects. But I have to admit that this is a bit less efficient, since the target object cannot be allocated in advance, before looking into the actual type. cheers -- chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Johannes-Niemeyer-Weg 9a : *Starship* http://starship.python.net/ 14109 Berlin : PGP key -> http://wwwkeys.pgp.net/ work +49 30 89 09 53 34 home +49 30 802 86 56 pager +49 173 24 18 776 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From python@rcn.com Mon Jan 6 14:50:07 2003 From: python@rcn.com (Raymond Hettinger) Date: Mon, 6 Jan 2003 09:50:07 -0500 Subject: Slow String Repeat (was: [Python-Dev] PyBuffer* vs. array.array()) References: <20030105221909.61705.qmail@web40106.mail.yahoo.com> <200301052253.h05MrBb31241@pcp02138704pcs.reston01.va.comcast.net> <3E18BF31.2050409@tismer.com> <3E18C3BC.1020108@tismer.com> <200301060033.h060XZe31484@pcp02138704pcs.reston01.va.comcast.net> <006501c2b558$31a14fc0$125ffea9@oemcomputer> <3E1995B9.2010608@tismer.com> Message-ID: <003501c2b592$e83c16e0$125ffea9@oemcomputer> From: "Christian Tismer" > I could imagine not to change string repeat and > others, but the abstract implementation of the > repetition of a sequence.
> An algorithm like the above could be written > for general sequences, and do this break-up > on the abstract level, once and for all > repetitions of arbitrary objects. Something similar can be done for arraymodule.c The use cases there may involve longer than normal repetition counts so some attention should be paid to cache invalidation. In other places, there is less of an opportunity since replication involves more than copying (there are Py_INCREFs to worry about). Raymond Hettinger From bbum@codefab.com Mon Jan 6 15:00:23 2003 From: bbum@codefab.com (Bill Bumgarner) Date: Mon, 6 Jan 2003 10:00:23 -0500 Subject: [Python-Dev] PyBuffer* vs. array.array() In-Reply-To: <200301052158.h05LwUR31086@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <954781EC-2187-11D7-A03A-000393877AE4@codefab.com> On Sunday, Jan 5, 2003, at 16:58 US/Eastern, Guido van Rossum wrote: >> singlePlane = array.array('B') >> singlePlane.fromlist([0 for x in range(0, width*height*3)] ) > > I'm not sure if you were joking, but why not write > > singlePlane.fromlist([0] * (width*height*3)) > > ??? Not joking; not thinking and haven't really done large blob manipulation in Python before. That answers another question, though -- if I were to build an image with four channels-- red, green, blue, alpha-- and wanted the alpha channel to be set to 255 throughout, then I would do... singlePlane.fromlist([0, 0, 0, 255] * (width * height)) ... or ... array.array('B', [0, 0, 0, 255]) * width * height >> ........... >> -- > > I'm not sure I understand the problem. I was hoping that there was a single object type that could easily be used from both the C and Python side that could contain a large buffer of binary/byte data. What I really need is a fixed length buffer that supports slicing style assignments / getters. The type of the elements is largely irrelevant save for that each element needs to be accessed as a single byte. 
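[Editor's sketch, not part of the original thread.] The repetition idiom discussed in this message can be written out as follows. It assumes the standard `array` module as it exists in current Python; `make_rgba_plane` is a hypothetical helper name, and the image dimensions are made up for illustration.

```python
import array

def make_rgba_plane(width, height, pixel=(0, 0, 0, 255)):
    """Build a flat unsigned-byte array holding width*height copies of one pixel."""
    return array.array('B', pixel) * (width * height)

plane = make_rgba_plane(4, 3)

# Slice assignment takes another array of the same typecode; the overall
# length stays fixed as long as both sides of the assignment are equally long.
plane[0:4] = array.array('B', [255, 128, 0, 255])
```

An array does not by itself guarantee a fixed address or length (it can be resized), which is the gap the message goes on to describe.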
The fixed length requirement comes from the need to encapsulate buffers of memory as returned by various APIs outside of Python. In this case, I'm providing access to hunks of memory controlled by the APIs provided by the Foundation and the AppKit within Cocoa (or GNUstep). I also need to allocate a hunk of memory-- an array of bytes, a string, a buffer, whatever-- and pass it off through the AppKit/Foundation APIs. Once those APIs have the address and length of the buffer, that address and length must remain constant over time. I would really like to be able to do the allocation from the Python side of the fence-- allocate, initialize with a particular byte pattern, and pass it off to the Foundation/AppKit (while still being able to manipulate the contents in Python). The PyBuffer* C API seems to be ideal in that a buffer object produced via the PyBuffer_New() function is read/write (unlike a buffer produced by buffer() in Python), contains a reference to a fixed length array at a fixed address, and is truly a bag o' bytes. At this point, I'll probably add some kind of an 'allocate' function to the 'objc' module that simply calls PyBuffer_New(). Did that -- works except, of course, the resulting buffer is an array of chars such that slicing assignments have to take strings. Inconvenient, but workable: >>> import objc >>> b = objc.allocateBuffer(100) >>> type(b) >>> b[0:10] = range(0,10) Traceback (most recent call last): File "", line 1, in ? TypeError: bad argument type for built-in operation >>> b[0:10] = [chr(x) for x in range(0,10)] Traceback (most recent call last): File "", line 1, in ? TypeError: bad argument type for built-in operation >>> b[0:10] = "".join([chr(x) for x in range(0,10)]) >>> b >>> b[0:15] '\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\x00\x00\x00\x00\x00' > You could use the 'c' code for creating an array instead of 'B'. Right; as long as it is a byte, it doesn't matter. I chose 'B' because it is an unsigned numeric type. 
Since I'm generating numeric data that is shoved into the bitmap as R,G,B triplets, a numeric type seemed to be the most convenient. > Or you can use the tostring() method on the array to convert it to a > string. > > Or you could use buffer() on the array. > But why don't you just use strings for binary data, like everyone > else? Because strings are variable length, do not support slice style assignments, and require all numeric data to be converted to a string before being used as 'data'. b.bum From akuchlin@mems-exchange.org Mon Jan 6 17:06:37 2003 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Mon, 06 Jan 2003 12:06:37 -0500 Subject: [Python-Dev] Whither rexec? Message-ID: Given the rexec module's current state of undetermined bugginess, I'd like to mention something about it in the "What's New" document, but there was no clear resolution about what's going to be done. Some proposals were 1) deprecate it 2) rip it out and move it to nondist/ or somewhere, 3) leave it alone and just document it as not-to-be-trusted. So, can the BDFL or someone please tell me what I should say about rexec? --amk (www.amk.ca) That jackanapes! All he ever does is cause trouble. -- The Doctor talking about the Master, in "Terror of the Autons" From mal@lemburg.com Mon Jan 6 17:09:17 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 06 Jan 2003 18:09:17 +0100 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time In-Reply-To: References: Message-ID: <3E19B83D.2090906@lemburg.com> Tim Peters wrote: >>>... >>>Let z = y - x.o + y.s. >>>... > > [M.-A. Lemburg] > >>Care to explain what .o, .n and .s mean for the casual reader ;-) > > > Nope. The casual reader isn't interested enough to be worth the bother. > The non-casual reader should read the full proof (at the end of > datetimemodule.c or datetime.py), which defines the notation. """ For a datetimetz x, let x.n = x stripped of its timezone -- its naive time. 
x.o = x.utcoffset(), and assuming that doesn't raise an exception or return None x.d = x.dst(), and assuming that doesn't raise an exception or return None x.s = x's standard offset, x.o - x.d """ >>... >>I think you are missing a point here: time zones don't have DST. > > > I understand. tzinfo subclasses can, though. The USTimeZone tzinfo > subclass posted earlier in this thread, and its 4 instances, was a fully > fleshed-out example. > > >>>>from datetime import * >>>>t = timetz(3, 15, tzinfo=Eastern) >>>>print datetimetz.combine(date(2003, 1, 6), t) > > 2003-01-06 03:15:00-05:00 > >>>>print datetimetz.combine(date(2003, 8 ,1), t) > > 2003-08-01 03:15:00-04:00 > > As the output shows, Eastern models both EST (-05:00) and EDT (-04:00). > astimezone() wants to give sensible (as sensible as possible) results for > converting between such hybrid "time zones" too. It's also possible (easy) > to create tzinfo subclasses that model only EST, or only EDT, but they seem > less useful in real apps. In that case you should follow the standard way of using the name of the locale to define your timetz subclasses, e.g. EasternUS, CentralUS, etc. >>Each time zone describes a fixed offset from UTC and whenever >>the locale applies DST, the time zone for that locale switches >>to a new time zone, e.g. MET becomes MEST. The disruption you are >>seeing at the DST switch times comes from switching the time zone >>and is not caused by the locale's clocks doing a "jump" like in a >>Feynman diagram. > > The disruption exists in real life, and we're allowing for a tzinfo subclass > to model it faithfully. Fair enough. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... 
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From nas@python.ca Mon Jan 6 17:14:55 2003 From: nas@python.ca (Neil Schemenauer) Date: Mon, 6 Jan 2003 09:14:55 -0800 Subject: [Python-Dev] new features for 2.3? In-Reply-To: <200301061657.h06Gv8Z14933@odiug.zope.com> References: <200301061657.h06Gv8Z14933@odiug.zope.com> Message-ID: <20030106171455.GA18213@glacier.arctrix.com> Guido van Rossum wrote: > First of all, reST is going to be used a lot in Zope3. Maybe it could > become a standard library module? Don't care. > Next, I really, really, really would like to improve pickling of > new-style classes. +1. Neil From tim.one@comcast.net Mon Jan 6 17:13:45 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 06 Jan 2003 12:13:45 -0500 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time In-Reply-To: <3E19B83D.2090906@lemburg.com> Message-ID: [M.-A. Lemburg] >> ... >> It's also possible (easy) to create tzinfo subclasses that model >> only EST, or only EDT, but they seem less useful in real apps. [M.-A. Lemburg] > In that case you should follow the standard way of using > the name of the locale to define your timetz subclasses, > e.g. EasternUS, CentralUS, etc. I don't care what people call their classes. Remember that datetime supplies no tzinfo subclasses. If a user wants some, they have to supply their own, and can use any naming convention they like. Eastern was just an example (not part of the 2.3 distribution). From guido@python.org Mon Jan 6 15:11:04 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 06 Jan 2003 10:11:04 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: Your message of "Mon, 06 Jan 2003 13:14:44 +1100." <000601c2b529$62a85200$530f8490@eden> References: <000601c2b529$62a85200$530f8490@eden> Message-ID: <200301061511.h06FB4O01302@odiug.zope.com> > We do have a real problem here, and I keep stumbling across it. 
So far, > this issue has hit me in the win32 extensions, in Mozilla's PyXPCOM, and > even in Gordon's "installer". IMO, the reality is that the Python external > thread-state API sucks. I can boldly make that assertion as I have heard > many other luminaries say it before me. As Tim suggests, time is the issue. > > I fear the only way to approach this is with a PEP. We need to clearly > state our requirements, and clearly show scenarios where interpreter states, > thread states, the GIL etc all need to cooperate. Eg, InterpreterState's > seem YAGNI, but manage to complicate using ThreadStates, which are certainly > YNI. The ability to "unconditionally grab the lock" may be useful, as may a > construct meaning "I'm calling out to/in from an external API" discrete from > the current singular "release/acquire the GIL" construct available today. > > I'm willing to help out with this, but not take it on myself. I have a fair > bit to gain - if I can avoid toggling locks every time I call out to each > and every function there would be some nice perf gains to be had, and > horrible code to remove. I welcome a PEP on this! It's above my own level of expertise, mostly because I'm never in a position to write code that runs into this... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jan 6 15:13:33 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 06 Jan 2003 10:13:33 -0500 Subject: [Python-Dev] PEP 297: Support for System Upgrades In-Reply-To: Your message of "Mon, 06 Jan 2003 14:33:33 +0100." <3E1985AD.8010202@lemburg.com> References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com> <15895.26084.121086.435407@gargle.gargle.HOWL> <200301050049.h050nqa21172@pcp02138704pcs.reston01.va.comcast.net> <15895.45479.235214.554416@gargle.gargle.HOWL> <3E1985AD.8010202@lemburg.com> Message-ID: <200301061513.h06FDXO01338@odiug.zope.com> > > GvR> Or a micro-release should clear out the system-packages > > GvR> directory. 
> > > > The only reason I'd rather not do that is so that if a package still > > needs an update for the new Python micro release, a sysadmin could at > > least copy the package over from one version to the next. > > +1 > > Ok, then, let's call the dir "site-upgrades-" with > being major.minor.patchlevel. +1 > It seems that only site.py needs to be changed. Hmm, but > what happens if someone invokes Python with -S (don't load > site.py) ? They deserve what they get; they'll have to do their own sys.path manipulation. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jan 6 15:15:53 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 06 Jan 2003 10:15:53 -0500 Subject: [Python-Dev] Cross compiling In-Reply-To: Your message of "Mon, 06 Jan 2003 01:38:44 +0100." <01b301c2b51b$f86358c0$6d94fea9@newmexico> References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> <003301c2b500$c54ed2e0$0311a044@oemcomputer> <01b301c2b51b$f86358c0$6d94fea9@newmexico> Message-ID: <200301061515.h06FFr901379@odiug.zope.com> > > [GvR] > > > In the mean time, I think the best thing to do about rexec.py is to > > > *delete* it from the releases (both 2.2.3 and 2.3). Keeping it > > > around, even with a warning, is an invitation for disasters. If > > > people were using it seriously, they *should* be warned. > From: "Raymond Hettinger" > > When using Python as a embedded scripting language, rexec.py > > still has some value in blocking off part of the system from > > non-deliberate access. [Samuele] > Should we keep Bastion too or should it go? Good question. > From Bastion.py __doc__ : > > "... Bastions have a number of uses, but the most > obvious one is to provide code executing in restricted mode with a > safe interface to an object implemented in unrestricted mode..." 
> > > consider this potential setup (inspired by the Bastion test code in > Bastion.py): > > Python 2.3a1 (#38, Dec 31 2002, 17:53:59) [MSC v.1200 32 bit (Intel)] on win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> class C: pass > ... > >>> import Bastion; b=Bastion.Bastion(C()) > >>> import rexec; r=rexec.RExec() > >>> r.add_module('__main__').b=b > >>> > > what can be done inside r? Who knows, at this point. I don't want to have to worry about that, that's why I'm proposing to punt. > a bastionized empty instance b of a classic class, a restricted enviroment... --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Mon Jan 6 15:26:09 2003 From: barry@python.org (Barry A. Warsaw) Date: Mon, 6 Jan 2003 10:26:09 -0500 Subject: [Python-Dev] PEP 297: Support for System Upgrades References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com> <15895.26084.121086.435407@gargle.gargle.HOWL> <200301050049.h050nqa21172@pcp02138704pcs.reston01.va.comcast.net> <15895.45479.235214.554416@gargle.gargle.HOWL> <3E1985AD.8010202@lemburg.com> Message-ID: <15897.40977.587166.213369@gargle.gargle.HOWL> >>>>> "MAL" == M writes: MAL> Ok, then, let's call the dir "site-upgrades-" with MAL> being major.minor.patchlevel. +1 MAL> It seems that only site.py needs to be changed. Hmm, but MAL> what happens if someone invokes Python with -S (don't load MAL> site.py) ? They lose, just like they lose site-packages if they do it. -Barry From tim.one@comcast.net Mon Jan 6 15:24:55 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 06 Jan 2003 10:24:55 -0500 Subject: [Zope3-dev] Re: [Python-Dev] Holes in time In-Reply-To: <3E195902.5040709@lemburg.com> Message-ID: >> ... >> Let z = y - x.o + y.s. >> ... [M.-A. Lemburg] > Care to explain what .o, .n and .s mean for the casual reader ;-) Nope. The casual reader isn't interested enough to be worth the bother. 
The non-casual reader should read the full proof (at the end of datetimemodule.c or datetime.py), which defines the notation. > ... > I think you are missing a point here: time zones don't have DST. I understand. tzinfo subclasses can, though. The USTimeZone tzinfo subclass posted earlier in this thread, and its 4 instances, was a fully fleshed-out example. >>> from datetime import * >>> t = timetz(3, 15, tzinfo=Eastern) >>> print datetimetz.combine(date(2003, 1, 6), t) 2003-01-06 03:15:00-05:00 >>> print datetimetz.combine(date(2003, 8 ,1), t) 2003-08-01 03:15:00-04:00 >>> As the output shows, Eastern models both EST (-05:00) and EDT (-04:00). astimezone() wants to give sensible (as sensible as possible) results for converting between such hybrid "time zones" too. It's also possible (easy) to create tzinfo subclasses that model only EST, or only EDT, but they seem less useful in real apps. > Each time zone describes a fixed offset from UTC and whenever > the locale applies DST, the time zone for that locale switches > to a new time zone, e.g. MET becomes MEST. The disruption you are > seeing at the DST switch times comes from switching the time zone > and is not caused by the locale's clocks doing a "jump" like in a > Feynman diagram. The disruption exists in real life, and we're allowing for a tzinfo subclass to model it faithfully. 
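[Editor's sketch, not part of the original thread.] A hybrid tzinfo subclass of the kind described above might look roughly like this. The assumptions are loud ones: it uses the `datetime`/`tzinfo` API of current Python rather than the prototype `timetz`/`datetimetz` names shown in the session above, `EasternSketch` is a hypothetical class (not the USTimeZone class posted earlier in the thread), and the DST rule is the simplified US rule in force in 2003 (first Sunday in April to last Sunday in October, switching at 2am local time).

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)
HOUR = timedelta(hours=1)

def first_sunday_on_or_after(dt):
    """Return dt moved forward to the next Sunday (or dt itself if a Sunday)."""
    days_to_go = 6 - dt.weekday()      # Monday is 0, Sunday is 6
    return dt + timedelta(days=days_to_go)

class EasternSketch(tzinfo):
    """Models both EST (-05:00) and EDT (-04:00) in one tzinfo subclass."""

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def dst(self, dt):
        if dt is None:
            return ZERO
        naive = dt.replace(tzinfo=None)
        # Simplified 2003-era US rule, assumed for illustration only.
        start = first_sunday_on_or_after(datetime(dt.year, 4, 1, 2))
        end = first_sunday_on_or_after(datetime(dt.year, 10, 25, 2))
        return HOUR if start <= naive < end else ZERO

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

winter = datetime(2003, 1, 6, 3, 15, tzinfo=EasternSketch())
summer = datetime(2003, 8, 1, 3, 15, tzinfo=EasternSketch())
```

Printing `winter` and `summer` gives the -05:00 and -04:00 offsets seen in the session above. The sketch makes no attempt to handle the ambiguous and missing hours at the transitions, which are the edge cases the proof is about.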
From pedronis@bluewin.ch Mon Jan 6 14:26:34 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Mon, 6 Jan 2003 15:26:34 +0100 Subject: [Python-Dev] Bastion too (was: Cross compiling) References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> <003301c2b500$c54ed2e0$0311a044@oemcomputer> <01b301c2b51b$f86358c0$6d94fea9@newmexico> <200301061515.h06FFr901379@odiug.zope.com> Message-ID: <032201c2b58f$9d978960$6d94fea9@newmexico> From: "Guido van Rossum" > > > [GvR] > > > > In the mean time, I think the best thing to do about rexec.py is to > > > > *delete* it from the releases (both 2.2.3 and 2.3). Keeping it > > > > around, even with a warning, is an invitation for disasters. If > > > > people were using it seriously, they *should* be warned. > > > From: "Raymond Hettinger" > > > When using Python as a embedded scripting language, rexec.py > > > still has some value in blocking off part of the system from > > > non-deliberate access. > > [Samuele] > > Should we keep Bastion too or should it go? > > Good question. > > > From Bastion.py __doc__ : > > > > "... Bastions have a number of uses, but the most > > obvious one is to provide code executing in restricted mode with a > > safe interface to an object implemented in unrestricted mode..." > > > > > > consider this potential setup (inspired by the Bastion test code in > > Bastion.py): > > > > Python 2.3a1 (#38, Dec 31 2002, 17:53:59) [MSC v.1200 32 bit (Intel)] on win32 > > Type "help", "copyright", "credits" or "license" for more information. > > >>> class C: pass > > ... > > >>> import Bastion; b=Bastion.Bastion(C()) > > >>> import rexec; r=rexec.RExec() > > >>> r.add_module('__main__').b=b > > >>> > > > > what can be done inside r? > > Who knows, at this point. I don't want to have to worry about that, > that's why I'm proposing to punt. and I wholeheartedly agree. 
Sorry for the melodramatic approach, but people seem not serious enough about the issues.

Anyway the answer is _Everything_:

>>> r.r_exec("t=b._get_")
>>> r.r_exec("b._get_=t.__getattribute__")
>>> r.r_exec("g=b.func_globals")
>>> r.r_exec("""exec "open('c:/evil','w')" in g""")

like opening a file for writing.

regards.

From guido@python.org Mon Jan 6 15:38:20 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 06 Jan 2003 10:38:20 -0500
Subject: [Python-Dev] Bastion too (was: Cross compiling)
In-Reply-To: Your message of "Mon, 06 Jan 2003 15:26:34 +0100." <032201c2b58f$9d978960$6d94fea9@newmexico>
References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> <003301c2b500$c54ed2e0$0311a044@oemcomputer> <01b301c2b51b$f86358c0$6d94fea9@newmexico> <200301061515.h06FFr901379@odiug.zope.com> <032201c2b58f$9d978960$6d94fea9@newmexico>
Message-ID: <200301061538.h06FcKu01899@odiug.zope.com>

OK. I'll sabotage rexec.py and Bastion.py so they don't work without editing them, and post to c.l.py.a that they are withdrawn, and warning against using them in 2.2(.x).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From shane@zope.com Mon Jan 6 15:54:00 2003
From: shane@zope.com (Shane Hathaway)
Date: Mon, 06 Jan 2003 10:54:00 -0500
Subject: [Python-Dev] Re: [Zope3-dev] Proposing to simplify datetime
References:
Message-ID: <3E19A698.10404@zope.com>

Tim Peters wrote:
> A natural suggestion is to change the hierarchy to:
>
> object
>     timedelta
>     tzinfo
>     time
>     date
>         datetime

I think the proposal is so good and well-reasoned that no one had anything to say.
;-)

+1

Shane

From bbum@codefab.com Mon Jan 6 15:56:33 2003
From: bbum@codefab.com (Bill Bumgarner)
Date: Mon, 6 Jan 2003 10:56:33 -0500
Subject: [Python-Dev] Plain HTML DocUtils writer available
Message-ID: <6DC71282-218F-11D7-A03A-000393877AE4@codefab.com>

If anyone is interested, my pure HTML DocUtils writer is now complete to the point where it can parse/process the 'test.txt' document in the DocUtils cvs archive. That is, it should now be able to handle any ReST source and produce legible HTML that does not use CSS.

The writer is designed to produce HTML that is compliant with O'Reilly's article submission guidelines [for their DevCenter, at least].

The output is currently not as pretty as it could be -- suggestions/input/patches would be most welcome -- but it should be 'correct' in structure.

It can be found in the bbum/DocArticle/ sandbox of the DocUtils project and is packaged as a standard 'distutils' based module. Install DocUtils, then install DocArticle. A command line tool -- docarticle.py -- can be used to process a ReST document.

http://docutils.sourceforge.net/

b.bum

From shane@zope.com Mon Jan 6 17:18:07 2003
From: shane@zope.com (Shane Hathaway)
Date: Mon, 06 Jan 2003 12:18:07 -0500
Subject: [Zope3-dev] Re: [Python-Dev] Holes in time
References: <3E195902.5040709@lemburg.com>
Message-ID: <3E19BA4F.2050700@zope.com>

M.-A. Lemburg wrote:
> I think you are missing a point here: time zones don't have DST.

There are two kinds of time zones: those with a fixed offset, which are easy to deal with, and what I would call "geographical" time zones, which map to one of two time zones depending on whether daylight saving is in effect. EDT, EST, MDT, MST, etc. are time zones with fixed offsets. "US/Eastern" and "America/New_York" are geographical time zones. Nearly every desktop computer in the world is set to a geographical rather than fixed time zone.

That's my understanding, anyway.
I wouldn't mind being wrong, since it would make time zone manipulations oh-so-much simpler. :-) Shane From guido@python.org Mon Jan 6 17:26:32 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 06 Jan 2003 12:26:32 -0500 Subject: [Python-Dev] Whither rexec? In-Reply-To: Your message of "Mon, 06 Jan 2003 12:06:37 EST." References: Message-ID: <200301061726.h06HQWe28737@odiug.zope.com> > can the BDFL or someone please tell me what I > should say about rexec? See my recent checkins and what I just sent to python-announce (not sure when the moderator will get to it): | Subject: Deleting rexec.py and Bastion.py | From: Guido van Rossum | To: python-announce@python.org | Date: Mon, 06 Jan 2003 11:17:50 -0500 | | There have been reports of serious security problems with rexec.py and | Bastion.py starting with Python 2.2. We do not have the resources to | fix these problems. Therefore, I will disable these modules in the next | 2.3 alpha release and in the next 2.2 release (2.2.3, no release date | scheduled). If you are using rexec.py or Bastion.py with any version | of Python 2.2 or 2.3 to safeguard anonymously submitted source code, I | strongly recommend that you stop doing so immediately, because it is | *not* safe. | | There are also known security problems with older versions of Python, | but the holes created by Python 2.2 are much bigger (big enough to | drive an airplane carrier through). --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Mon Jan 6 17:26:18 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 6 Jan 2003 11:26:18 -0600 Subject: [Python-Dev] Misc/*-NOTES? In-Reply-To: <20030104190428.GT29873@epoch.metaslash.com> References: <15894.26311.9593.933548@montanaro.dyndns.org> <20030104190428.GT29873@epoch.metaslash.com> Message-ID: <15897.48186.466586.617614@montanaro.dyndns.org> >> The HPUX-NOTES and AIX-NOTES file seem particularly old. I'm >> tempted to simply dump them. Any comments? Neal> HPUX-NOTES should go. 
Python builds cleanly on the snake-farm Neal> HP-UX boxes. Okay, it's gone. Neal> AIX-NOTES still has some useful info. It stays. References in README have been adjusted accordingly. Skip From guido@python.org Mon Jan 6 17:31:04 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 06 Jan 2003 12:31:04 -0500 Subject: [Python-Dev] Bastion too (was: Cross compiling) In-Reply-To: Your message of "Mon, 06 Jan 2003 10:38:20 EST." <200301061538.h06FcKu01899@odiug.zope.com> References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> <003301c2b500$c54ed2e0$0311a044@oemcomputer> <01b301c2b51b$f86358c0$6d94fea9@newmexico> <200301061515.h06FFr901379@odiug.zope.com> <032201c2b58f$9d978960$6d94fea9@newmexico> <200301061538.h06FcKu01899@odiug.zope.com> Message-ID: <200301061731.h06HV4728852@odiug.zope.com> > OK. I'll sabotage rexec.py and Bastion.py so they don't work without > editing them, and post to c.l.py.a that they are withdrawn, and > warning against using them in 2.2(.x). (For now I'm sabotaging them rather than deleting them, because if someone miraculously fixes all the holes, I don't want to lose all the CVS history of what line was changed by what revision.) --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Mon Jan 6 17:40:50 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 6 Jan 2003 11:40:50 -0600 Subject: [Python-Dev] Misc/*-NOTES? In-Reply-To: References: <15894.26311.9593.933548@montanaro.dyndns.org> <200301040649.h046nXQ10784@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15897.49058.885627.66808@montanaro.dyndns.org> Martin> Guido van Rossum writes: >> +0 for AIX-NOTES (some stuff there may still be relevant? some people >> still have very old systems) Martin> What is the oldest AIX release that we need to support? 
It is
Martin> probably a bit late for Python 2.3 to add more systems to PEP
Martin> 11, but I would guess that nobody needs AIX 3 support
Martin> anymore.

Okay, so how about this added to PEP11:

    Name:             AIX 3
    Unsupported in:   Python 2.4
    Code removed in:  Python 2.5

? It's not obvious to a casual observer - me - that there is much AIX3-specific code in the source tree. Pyconfig.h.in has a comment suggesting _ALL_SOURCE should be defined if running on AIX3, but I don't see it used anywhere.

Skip

From martin@v.loewis.de Mon Jan 6 18:26:53 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 06 Jan 2003 19:26:53 +0100
Subject: [Python-Dev] Misc/*-NOTES?
In-Reply-To: <15897.49058.885627.66808@montanaro.dyndns.org>
References: <15894.26311.9593.933548@montanaro.dyndns.org> <200301040649.h046nXQ10784@pcp02138704pcs.reston01.va.comcast.net> <15897.49058.885627.66808@montanaro.dyndns.org>
Message-ID:

Skip Montanaro writes:

> ? It's not obvious to a casual observer - me - that there is much
> AIX3-specific code in the source tree. Pyconfig.h.in has a comment
> suggesting _ALL_SOURCE should be defined if running on AIX3, but I don't
> see it used anywhere.

That is part of the problem: Nobody knows what AIX versions need which of the AIX changes. The _ALL_SOURCE stuff comes from AC_AIX; it is not clear either whether this is relevant only for AIX 3, or whether the comment is wrong (or incomplete).

By removing support for unused systems, and recording the version that triggered a certain change, when the change is made, we can eventually know what fragments are valid for what systems (there won't be an automatic way to find out whether a certain fragment is not needed anymore, so we have to find somebody who tests this for us).

So if we can find somebody with AIX 4.1 who can confirm that it works without AC_AIX, we can remove AC_AIX in Python 2.5. If we can find nobody, we can try to eliminate AIX 4.1 support in 2.6, and so on.
Regards, Martin From esr@thyrsus.com Mon Jan 6 18:39:30 2003 From: esr@thyrsus.com (Eric S. Raymond) Date: Mon, 6 Jan 2003 13:39:30 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: <200301061657.h06Gv8Z14933@odiug.zope.com> References: <200301061657.h06Gv8Z14933@odiug.zope.com> Message-ID: <20030106183930.GA24690@thyrsus.com> Guido van Rossum : > First of all, reST is going to be used a lot in Zope3. Maybe it could > become a standard library module? I'm unfamiliar with this issue. > We can do that in Python 2.3. Because this is backwards incompatible, > I propose that you have to request this protocol explicitly. I > propose to "upgrade' the binary flag to a general "protocol version" > flag, with values: > > 0 - original protocol > 1 - binary protocol > 2 - new protocol +0. That is, I don't care but the change seems reasonable and harmless. > I'd also like to get rid of __safe_for_unpickling__ and all other > pseudo security features. Attempting to unpickle pickles from an > untrusted source is insane, and nothing can help us there; I'd rather > make the marshal protocol bulletproof (all it needs is a few more > checks for inconsistent data and a little better error handling). I do care about *this*, and it's the reason I'm responding. The `safety' feature always struck me as grubby and non-orthogonal, an attempt to patch over a problem that fundamentally cannot be solved at that level, and one that could only backfire by creating a false sense of security in people who weren't really thinking about the underlying difficulty. If we're going to have a sandboxing[1] facility in Python, it should be decoupled from pickling and more general. +1. Scrap that feature, it was wrong to begin with. -- Eric S. Raymond [1] I just realized that `sandbox' in this sense isn't in the Jargon File. I'm off to add it... 
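
The "protocol version" flag proposed in the quoted text can be illustrated with the protocol argument that pickle.dumps() accepts in current Python. This is a sketch under that assumption, not a record of what was checked in for 2.3; the helper name roundtrip and the sample data are hypothetical.

```python
import pickle


def roundtrip(obj, protocol):
    """Pickle obj with the requested protocol, then load it back."""
    blob = pickle.dumps(obj, protocol=protocol)
    return blob, pickle.loads(blob)


data = {"name": "spam", "values": [1, 2, 3]}

# 0 is the original text protocol, 1 the older binary protocol,
# 2 the new protocol that has to be requested explicitly.
for protocol in (0, 1, 2):
    blob, copy = roundtrip(data, protocol)
    # Every protocol must reproduce an equal object.
    assert copy == data
    print(protocol, len(blob))
```

As the thread stresses, none of these protocols makes it safe to load a pickle that comes from an untrusted source.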
From guido@python.org Mon Jan 6 20:15:41 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 06 Jan 2003 15:15:41 -0500 Subject: [Python-Dev] PEP 297: Support for System Upgrades In-Reply-To: Your message of "Mon, 06 Jan 2003 21:09:46 +0100." <3E19E28A.2030304@lemburg.com> References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com> <15895.26084.121086.435407@gargle.gargle.HOWL> <200301050049.h050nqa21172@pcp02138704pcs.reston01.va.comcast.net> <15895.45479.235214.554416@gargle.gargle.HOWL> <3E1985AD.8010202@lemburg.com> <200301061513.h06FDXO01338@odiug.zope.com> <3E19E28A.2030304@lemburg.com> Message-ID: <200301062015.h06KFf403144@odiug.zope.com> > Ok, I've started looking at adding support for this. Here's > a couple of things I found: > > * getpath.c: > Some of the '/' path delimiters are hard coded; shouldn't > these be replaced with SEP ? All the platforms that I'm awware of that don't use '/' have their own getpath.c copy anyway (the one for Windows is PC/getpathp.c). > * There's no easy way to find the first item on sys.path which > starts the default path added by Python at startup time. It seems > that a suffix search for "python23.zip" gives the best hint. > The only other possibility I see is writing the support code > directly into getpath.c. That's where I'd put it, yes. > * site.py contains code which prefixes "site-packages" with both > sys.prefix and sys.exec_prefix. Is this really used anywhere ? > (distutils and the old Makefile.pre.in both install to > sys.prefix per default) I thought they might install extension modules in exec_prefix. But maybe it's a YAGNI. --Guido van Rossum (home page: http://www.python.org/~guido/) From goodger@python.org Tue Jan 7 00:51:33 2003 From: goodger@python.org (David Goodger) Date: Mon, 06 Jan 2003 19:51:33 -0500 Subject: [Python-Dev] new features for 2.3? 
In-Reply-To: <200301061657.h06Gv8Z14933@odiug.zope.com>
Message-ID:

Guido van Rossum wrote:
> First of all, reST is going to be used a lot in Zope3.

Cool! Is there a website or a mailing list thread containing the master plan?

> Maybe it could become a standard library module?

That would be great. But could it be a bit premature?

* There's at least one major parser element that has yet to be implemented: interpreted text.

* Should the entire Docutils codebase be added to the standard library at the same time? Or only the parser core?

  - There are many dependencies: much of the top-level Docutils code supports the reStructuredText parser, such as the nodes, statemachine, roman, urischemes, and utils modules. Of course, the parser-related support code could be isolated.

  - There's a pervasive runtime-settings mechanism that may need an overhaul before long. It uses Optik/optparse and ConfigParser, but in a simplistic way that's showing some rough edges.

  - Docutils is incomplete. I'm making progress on a Python Source Reader component, but only slowly. There's no support for splitting output into multiple files yet. The "To Do" list is long and growing: .

* Zope integration (or other major use) will probably uncover flaws and omissions in the Docutils design. I welcome these as opportunities for improvement.

* I have been designing and implementing Docutils in an evolutionary, XP-inspired way (with some exceptions ;), and I feel that the evolution isn't over. Although stdlib integration would undoubtedly make Docutils/reST much more visible and accessible, I'm concerned that it would fix the API and make rip-out-the-guts changes difficult.

Ideally I'd like to be able to complete and perfect Docutils before submitting it for stdlib integration, but who knows how long that will take. Maybe I'm just being too protective of "my baby". Perhaps it's time for it to face the realities of the big bad world.
-- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From Anthony Baxter Tue Jan 7 03:59:17 2003 From: Anthony Baxter (Anthony Baxter) Date: Tue, 07 Jan 2003 14:59:17 +1100 Subject: [Python-Dev] PEP 297: Support for System Upgrades In-Reply-To: <200301050049.h050nqa21172@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200301070359.h073xHd07438@localhost.localdomain> >>> Guido van Rossum wrote > > My suggestion is that the Python micro-release number be included in > > the path to system-packages. IOW, system-packages must exactly match > > the Python version number, not just the maj.min number. +1 > Excellent point. > Or a micro-release should clear out the system-packages directory. -1. That's just a bit too sudden and nasty. If this is the way to go, make a subdirectory of system-packages/site-upgrades and move the old files there (and warn the installer that they've been moved) Anthony From pyth@devel.trillke.net Mon Jan 6 20:30:36 2003 From: pyth@devel.trillke.net (holger krekel) Date: Mon, 6 Jan 2003 21:30:36 +0100 Subject: [Python-Dev] new features for 2.3? In-Reply-To: <20030106183930.GA24690@thyrsus.com>; from esr@thyrsus.com on Mon, Jan 06, 2003 at 01:39:30PM -0500 References: <200301061657.h06Gv8Z14933@odiug.zope.com> <20030106183930.GA24690@thyrsus.com> Message-ID: <20030106213036.F17919@prim.han.de> Eric S. Raymond wrote: > Guido van Rossum : > > I'd also like to get rid of __safe_for_unpickling__ and all other > > pseudo security features. Attempting to unpickle pickles from an > > untrusted source is insane, and nothing can help us there; I'd rather > > make the marshal protocol bulletproof (all it needs is a few more > > checks for inconsistent data and a little better error handling). > > I do care about *this*, and it's the reason I'm responding. 
The
> `safety' feature always struck me as grubby and non-orthogonal, an
> attempt to patch over a problem that fundamentally cannot be solved at
> that level, and one that could only backfire by creating a false sense
> of security in people who weren't really thinking about the underlying
> difficulty.
>
> If we're going to have a sandboxing[1] facility in Python, it should be
> decoupled from pickling and more general.

I wholeheartedly agree. Maybe a (hypothetical) pyeval.py as a Python version of ceval.c could provide the ground for a simple sandboxing facility? Taking control at the bytecode interpretation level is quite general.

Of course you might want to use PSYCO with it. If I understand Armin Rigo correctly this would also help with his efforts. See

http://psyco.sourceforge.net/plans.html

regards,

holger

From bac@OCF.Berkeley.EDU Mon Jan 6 20:31:52 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 6 Jan 2003 12:31:52 -0800 (PST)
Subject: [Python-Dev] new features for 2.3?
In-Reply-To: <200301061657.h06Gv8Z14933@odiug.zope.com>
References: <200301061657.h06Gv8Z14933@odiug.zope.com>
Message-ID:

[Guido van Rossum]
> First of all, reST is going to be used a lot in Zope3. Maybe it could
> become a standard library module?
>

+1 But I am biased. =) And if this is done, I would like to make the request that the ``tools`` directory still be available somewhere (not necessarily in the distro; perhaps the SF page as a separate download?). ``tools/html.py`` is a puny wrapper, but extremely useful.

> Next, I really, really, really would like to improve pickling of
> new-style classes.
>

+0 for getting done by 2.3, +1 period.

>And we
> need to implement these twice: in Python for pickle.py and in C for
> cPickle.py.
>

I have always wondered, why do both ``cPickle`` (which uses camel-style naming which I thought was a no-no) and ``Pickle`` exist? They do exactly the same thing (in theory).
This question is spontaneous, and so if it is obviously from looking at the code, just tell me to RTFM. =) > I'd also like to get rid of __safe_for_unpickling__ and all other > pseudo security features. Attempting to unpickle pickles from an > untrusted source is insane, and nothing can help us there; I'd rather > make the marshal protocol bulletproof (all it needs is a few more > checks for inconsistent data and a little better error handling). > Is there any other place where security has been built into something? Sounds like we should do a security inaudit (is that a word?) and rip out pretty much all security code. -Brett From altis@semi-retired.com Tue Jan 7 16:01:03 2003 From: altis@semi-retired.com (Kevin Altis) Date: Tue, 7 Jan 2003 08:01:03 -0800 Subject: [Python-Dev] new features for 2.3? - resend In-Reply-To: <20030106183930.GA24690@thyrsus.com> Message-ID: > -----Original Message----- > From: Eric S. Raymond > > Guido van Rossum : > > I'd also like to get rid of __safe_for_unpickling__ and all other > > pseudo security features. Attempting to unpickle pickles from an > > untrusted source is insane, and nothing can help us there; I'd rather > > make the marshal protocol bulletproof (all it needs is a few more > > checks for inconsistent data and a little better error handling). > > I do care about *this*, and it's the reason I'm responding. The > `safety' feature always struck me as grubby and non-orthogonal, an > attempt to patch over a problem that fundamentally cannot be solved at > that level, and one that could only backfire by creating a false sense > of security in people who weren't really thinking about the underlying > difficulty. > > If we're going to have a sandboxing[1] facility in Python, it should be > decoupled from pickling and more general. > > +1. Scrap that feature, it was wrong to begin with. I would appreciate a little more explanation regarding the use of pickles. 
I've brought up the issue off-list a few times: using pickles of built-ins such as strings, integers, lists, and dictionaries (and probably datetime), but no sub-classes of built-ins or custom classes.

I understand that there are security concerns, but does this mean that exchanging a pickle via XML-RPC and/or SOAP or exchanging a pickle the way you might use a vCard (pickle as just data) is simply not doable? How does this impact ZODB (if at all) for the same types of applications? Binary pickles are extremely fast and easy to use, but it appears that using them in a situation where you need to exchange data is just not doable without additional support modules.

Or perhaps there just needs to be a standard safe unpickler that is part of 2.3 that only accepts built-ins of "safe" types? If the pickle contained something unsafe it would simply throw an exception but no harm would be done.

Thanks,

ka
---
Kevin Altis
altis@semi-retired.com
http://radio.weblogs.com/0102677/

From guido@python.org Mon Jan 6 17:15:46 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 06 Jan 2003 12:15:46 -0500
Subject: Slow String Repeat (was: [Python-Dev] PyBuffer* vs. array.array())
In-Reply-To: Your message of "Mon, 06 Jan 2003 15:42:01 +0100." <3E1995B9.2010608@tismer.com>
References: <20030105221909.61705.qmail@web40106.mail.yahoo.com> <200301052253.h05MrBb31241@pcp02138704pcs.reston01.va.comcast.net> <3E18BF31.2050409@tismer.com> <3E18C3BC.1020108@tismer.com> <200301060033.h060XZe31484@pcp02138704pcs.reston01.va.comcast.net> <006501c2b558$31a14fc0$125ffea9@oemcomputer> <3E1995B9.2010608@tismer.com>
Message-ID: <200301061715.h06HFkb28624@odiug.zope.com>

Of course, repeating a 1-length string could be special-cased using memcpy(). I don't think repeating 2-length strings is a common use case.
--Guido van Rossum (home page: http://www.python.org/~guido/) From lists@morpheus.demon.co.uk Mon Jan 6 17:56:29 2003 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Mon, 06 Jan 2003 17:56:29 +0000 Subject: [Python-Dev] Unidiff tool References: Message-ID: Tim Peters writes: > [Brett Cannon] >> Has it ever been considered to make ``difflib`` output diffs that >> could be passed to ``patch``? That way there would be one less >> barrier for people to get over when wanting to help in development. > > If they have CVS, they can make patches properly with cvs diff, and > any developer wannabe who doesn't have CVS is missing a more basic > part of the story than how to make patches . Actually, over my dial-up link, I've never got cvs diff to work. It sits there for *ages* (30-50 mins) talking to the server, but never does much of any use. Same problem with "cvs update". I've taken to just doing cvs checkouts, or getting the full CVS repository tarball from SF (at work, over a fast connection, which has a firewall which won't allow CVS use :-( ) and then making patches by hand. Not that this has any relevance to difflib... Paul. -- This signature intentionally left blank From skip@pobox.com Mon Jan 6 19:50:23 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 6 Jan 2003 13:50:23 -0600 Subject: [Python-Dev] What attempts at security should/can Python implement?
In-Reply-To: <032201c2b58f$9d978960$6d94fea9@newmexico> References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> <003301c2b500$c54ed2e0$0311a044@oemcomputer> <01b301c2b51b$f86358c0$6d94fea9@newmexico> <200301061515.h06FFr901379@odiug.zope.com> <032201c2b58f$9d978960$6d94fea9@newmexico> Message-ID: <15897.56831.775605.990300@montanaro.dyndns.org> Now that Guido has rendered impotent any attempts Python did make at security, does it make sense to try and figure out what (if anything) can be done by the C runtime? Somebody asked about tainting in the past week in a response to a year-old feature request on SF. Does that fall into this category? I've been working my way (slowly) through Kent Beck's "Test-Driven Development by Example" and was thinking that adding tainting to Python strings might be an interesting application of those ideas (for someone wanting to learn by doing), but if tainting won't be of any use I'll find something else. Skip From mal@lemburg.com Mon Jan 6 20:09:46 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 06 Jan 2003 21:09:46 +0100 Subject: [Python-Dev] PEP 297: Support for System Upgrades In-Reply-To: <200301061513.h06FDXO01338@odiug.zope.com> References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com> <15895.26084.121086.435407@gargle.gargle.HOWL> <200301050049.h050nqa21172@pcp02138704pcs.reston01.va.comcast.net> <15895.45479.235214.554416@gargle.gargle.HOWL> <3E1985AD.8010202@lemburg.com> <200301061513.h06FDXO01338@odiug.zope.com> Message-ID: <3E19E28A.2030304@lemburg.com> Guido van Rossum wrote: >>> GvR> Or a micro-release should clear out the system-packages >>> GvR> directory. >>> >>>The only reason I'd rather not do that is so that if a package still >>>needs an update for the new Python micro release, a sysadmin could at >>>least copy the package over from one version to the next. 
>> >>+1 >> >>Ok, then, let's call the dir "site-upgrades-" with >> being major.minor.patchlevel. > > > +1 > > >>It seems that only site.py needs to be changed. Hmm, but >>what happens if someone invokes Python with -S (don't load >>site.py) ? > > > They deserve what they get; they'll have to do their own sys.path > manipulation. Ok, I've started looking at adding support for this. Here's a couple of things I found: * getpath.c: Some of the '/' path delimiters are hard coded; shouldn't these be replaced with SEP ? * There's no easy way to find the first item on sys.path which starts the default path added by Python at startup time. It seems that a suffix search for "python23.zip" gives the best hint. The only other possibility I see is writing the support code directly into getpath.c. * site.py contains code which prefixes "site-packages" with both sys.prefix and sys.exec_prefix. Is this really used anywhere ? (distutils and the old Makefile.pre.in both install to sys.prefix per default) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Mon Jan 6 21:07:21 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 06 Jan 2003 16:07:21 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: Your message of "Mon, 06 Jan 2003 12:31:52 PST." References: <200301061657.h06Gv8Z14933@odiug.zope.com> Message-ID: <200301062107.h06L7L103338@odiug.zope.com> > I have always wondered, why does both ``cPickle`` (which uses camel-style > naming which I thought was a no-no) and ``Pickle``? They do exactly the > same thing (in theory). pickle.py is the specification of the protocol; cPickle.c is a reimplementation that's up to 1000x faster. I always prototype new features in pickle.py. 
> Is there any other place where security has been built into
> something?  Sounds like we should do a security inaudit (is that a
> word?) and rip out pretty much all security code.

There's very little code devoted specifically to security.  However,
there's a feature called "restricted mode", and in restricted mode,
certain introspections are disallowed.  Restricted mode is on when a
particular stack frame's __builtins__ dictionary isn't the default one
(which is __builtin__.__dict__ -- note the difference between
__builtin__, which is a module, and __builtins__, which is a global
with magic meaning).  Read the source for PyFrame_New().

It turns out that in 2.2 and beyond, not enough restrictions were
placed on disallowing new introspections that were enabled by virtue
of the class/type integration, and that's the cause of most rexec
vulnerabilities.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From mal@lemburg.com Mon Jan 6 22:20:41 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 06 Jan 2003 23:20:41 +0100
Subject: [Python-Dev] PEP 297: Support for System Upgrades
In-Reply-To: <200301062015.h06KFf403144@odiug.zope.com>
References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com> <15895.26084.121086.435407@gargle.gargle.HOWL> <200301050049.h050nqa21172@pcp02138704pcs.reston01.va.comcast.net> <15895.45479.235214.554416@gargle.gargle.HOWL> <3E1985AD.8010202@lemburg.com> <200301061513.h06FDXO01338@odiug.zope.com> <3E19E28A.2030304@lemburg.com> <200301062015.h06KFf403144@odiug.zope.com>
Message-ID: <3E1A0139.7070207@lemburg.com>

Guido van Rossum wrote:
>>Ok, I've started looking at adding support for this. Here's
>>a couple of things I found:
>>
>>* getpath.c:
>>  Some of the '/' path delimiters are hard coded; shouldn't
>>  these be replaced with SEP ?
>
> All the platforms that I'm aware of that don't use '/' have their own
> getpath.c copy anyway (the one for Windows is PC/getpathp.c).

But can't hurt to change these in the standard getpath.c, right ?
(reduce() is looking for SEP, so platforms which do use the
standard getpath.c but have a different os.sep could be misled
by the hardcoded slash in some constants)

>>* There's no easy way to find the first item on sys.path which
>>  starts the default path added by Python at startup time. It seems
>>  that a suffix search for "python23.zip" gives the best hint.
>>  The only other possibility I see is writing the support code
>>  directly into getpath.c.
>
> That's where I'd put it, yes.

You mean "put it into getpath.c" or "put it in front of
.../python23.zip" ?

>>* site.py contains code which prefixes "site-packages" with both
>>  sys.prefix and sys.exec_prefix. Is this really used anywhere ?
>>  (distutils and the old Makefile.pre.in both install to
>>  sys.prefix per default)
>
> I thought they might install extension modules in exec_prefix.  But
> maybe it's a YAGNI.

Hmm, I've just built a Python interpreter with different
prefix and exec_prefix settings: using such an interpreter
lets distutils default to the exec_prefix subtree. However,
Python itself does not create a site-packages directory in
that tree (make install creates this directory in
$(prefix)/lib/pythonX.X/ and not $(exec_prefix)/lib/pythonX.X/).

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From guido@python.org Mon Jan 6 22:43:20 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 06 Jan 2003 17:43:20 -0500
Subject: [Python-Dev] PEP 297: Support for System Upgrades
In-Reply-To: Your message of "Mon, 06 Jan 2003 23:20:41 +0100."
 <3E1A0139.7070207@lemburg.com>
References: <3D37DFEA.9070506@lemburg.com> <3E172842.8070501@lemburg.com> <15895.26084.121086.435407@gargle.gargle.HOWL> <200301050049.h050nqa21172@pcp02138704pcs.reston01.va.comcast.net> <15895.45479.235214.554416@gargle.gargle.HOWL> <3E1985AD.8010202@lemburg.com> <200301061513.h06FDXO01338@odiug.zope.com> <3E19E28A.2030304@lemburg.com> <200301062015.h06KFf403144@odiug.zope.com> <3E1A0139.7070207@lemburg.com>
Message-ID: <200301062243.h06MhKp05408@odiug.zope.com>

> >>Ok, I've started looking at adding support for this. Here's
> >>a couple of things I found:
> >>
> >>* getpath.c:
> >>  Some of the '/' path delimiters are hard coded; shouldn't
> >>  these be replaced with SEP ?
> >
> > All the platforms that I'm aware of that don't use '/' have their own
> > getpath.c copy anyway (the one for Windows is PC/getpathp.c).
>
> But can't hurt to change these in the standard getpath.c, right ?
> (reduce() is looking for SEP, so platforms which do use the
> standard getpath.c but have a different os.sep could be misled
> by the hardcoded slash in some constants)

-0; it's too painful to think about all the places that would have to
be fixed and how to fix them.

> >>* There's no easy way to find the first item on sys.path which
> >>  starts the default path added by Python at startup time. It seems
> >>  that a suffix search for "python23.zip" gives the best hint.
> >>  The only other possibility I see is writing the support code
> >>  directly into getpath.c.
> >
> > That's where I'd put it, yes.
>
> You mean "put it into getpath.c" or "put it in front of
> .../python23.zip" ?

Put it in getpath.c.

> >>* site.py contains code which prefixes "site-packages" with both
> >>  sys.prefix and sys.exec_prefix. Is this really used anywhere ?
> >>  (distutils and the old Makefile.pre.in both install to
> >>  sys.prefix per default)
> >
> > I thought they might install extension modules in exec_prefix.  But
> > maybe it's a YAGNI.
> > Hmm, I've just built a Python interpreter with different > prefix and exec_prefix settings: using such an interpreter > lets distutils default to the exec_prefix subtree. However, > Python itself does not create a site-packages directory in > that tree (make install creates this directory in > $(prefix)/lib/pythonX.X/ and not $(exec_prefix)/lib/pythonX.X/). Probably an oversight in the Makefile. Doesn't distutils create a needed directory if it doesn't exist? --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis@bluewin.ch Tue Jan 7 21:54:30 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Tue, 7 Jan 2003 22:54:30 +0100 Subject: [Python-Dev] new features for 2.3? References: <200301061657.h06Gv8Z14933@odiug.zope.com> <200301062107.h06L7L103338@odiug.zope.com> Message-ID: <050401c2b697$5be44a40$6d94fea9@newmexico> From: "Guido van Rossum" > > I have always wondered, why does both ``cPickle`` (which uses camel-style > > naming which I thought was a no-no) and ``Pickle``? They do exactly the > > same thing (in theory). > > pickle.py is the specification of the protocol; cPickle.c is a > reimplementation that's up to 1000x faster. I always prototype new > features in pickle.py. > > > Is there any other place where security has been built into > > something? Sounds like we should do a security inaudit (is that a > > word?) and rip out pretty much all security code. > > There's very little code devoted specifically to security. However, > there's a feature called "restricted mode", and in restricted mode, > certain introspections are disallowed. Restricted mode is on when a > particular stack frame's __builtins__ dictionary isn't the default one > (which is __builtin__.__dict__ -- note the difference between > __builtin__, which is a module, and __builtins__, which is a global > with magic meaning). Read the source for PyFrame_New(). 
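An illustration of the `__builtins__` mechanism quoted above (not part of the original message). This is only a sketch in present-day Python 3, where the module is called `builtins` rather than `__builtin__` and restricted mode itself no longer exists. The names `sandbox_globals` and `some_file` are made up. It shows that code executed with a replacement `__builtins__` dictionary sees only the names placed there; it is not a security boundary, for the reasons discussed in this thread.

```python
import builtins

# The default builtins namespace is the dict of the `builtins` module
# (`__builtin__` in the Python 2 releases discussed here).
default_builtins = builtins.__dict__
print("open" in default_builtins)

# Hypothetical replacement namespace that exposes a single builtin.
sandbox_globals = {"__builtins__": {"len": len}}

# Names present in the replacement dict resolve normally.
exec("n = len('abc')", sandbox_globals)
print(sandbox_globals["n"])

# Names missing from it are simply undefined for the executed code.
try:
    exec("open('some_file')", sandbox_globals)
except NameError as exc:
    blocked = str(exc)
    print("blocked:", blocked)
```

Introspection can still reach the real builtins from objects handed into such a namespace, which is the kind of hole the rexec discussion is about.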
>
> It turns out that in 2.2 and beyond, not enough restrictions were
> placed on disallowing new introspections that were enabled by virtue
> of the class/type integration, and that's the cause of most rexec
> vulnerabilities.

you may want to look at the places where PyEval_GetRestricted() is called,
it is used to check whether restricted execution is in place. There are too
few of those checks... and anyway blocking things in this ad hoc way is a
fragile strategy.

From dave@boost-consulting.com Mon Jan 6 23:52:38 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Mon, 06 Jan 2003 18:52:38 -0500
Subject: [Python-Dev] Re: Trouble with Python 2.3a1
In-Reply-To: <200301062300.h06N0uv368572@boa.lbl.gov> ("Ralf W. Grosse-Kunstleve"'s message of "Mon, 6 Jan 2003 15:00:56 -0800 (PST)")
References: <200301062300.h06N0uv368572@boa.lbl.gov>
Message-ID:

"Ralf W. Grosse-Kunstleve" writes:

> Hi David,
>
> I am back from my trip and I thought it is time to test our stuff with
> Python 2.3a1. Using the boost 1.29.0 release pickling of simple
> extension classes fails. I tracked this down to the following
> difference:
>
> Python 2.2:
>
>
> Python 2.3a1:
>
>
> Have you encountered this already?

Nope, I haven't. I'm cc:'ing the python developers on this reply.
Was this an intentional change, guys?

--
David Abrahams
dave@boost-consulting.com * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution

From tim.one@comcast.net Tue Jan 7 02:11:49 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 06 Jan 2003 21:11:49 -0500
Subject: [Python-Dev] RE: [Zope3-dev] Proposing to simplify datetime
In-Reply-To: <3E19A698.10404@zope.com>
Message-ID:

[Tim]
> A natural suggestion is to change the hierarchy to:
>
>     object
>         timedelta
>         tzinfo
>         time
>         date
>             datetime

[Shane Hathaway]
> I think the proposal is so good and well-reasoned that no one had
> anything to say. ;-)  +1

I think your vote carries the motion, then, Shane!  Thank you.
I know Guido is in favor of it (has been pushing for it, in fact), and that changing the Python implementation only took a few hours was a Good Sign. I've since played with the new implementation as a user, and found it more pleasant to use (e.g., fewer pointless choices to make, and "promoting" what was a datetime to what was a datetimetz has become trivial instead of an irritating puzzle). From guido@python.org Tue Jan 7 02:20:34 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 06 Jan 2003 21:20:34 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: Your message of "Mon, 06 Jan 2003 19:51:33 EST." References: Message-ID: <200301070220.h072KY701927@pcp02138704pcs.reston01.va.comcast.net> > > First of all, reST is going to be used a lot in Zope3. > > Cool! Is there a website or a mailing list thread containing the > master plan? For what? Zope3 has its own Wiki: http://dev.zope.org/Wikis/DevSite/Projects/ComponentArchitecture/FrontPage For reST usage, AFAIK there isn't much except an email I saw today: http://lists.zope.org/pipermail/zope3-dev/2003-January/004737.html > > Maybe it could become a standard library module? > > That would be great. But could it be a bit premature? [...] > Maybe I'm just being too protective of "my baby". Perhaps it's time > for it to face the realities of the big bad world. Maybe. Maybe I haven't been following reST closely enough. I do admit that I was a bit worried when I saw how rough the tools are (e.g. tools/html.py, which is the only thing I've used) and also when I found that at first they didn't work with Python 2.3 at all. OTOH, maybe this will be an encouragement for you and the other reST developers (are there any besides you? :-) to aim for a release beyond the CVS snapshot. Once you've released it may make more sense to add it to Python 2.3. (This is what we're aiming for with IDLEfork -- they just had their first alpha release.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From bac@OCF.Berkeley.EDU Tue Jan 7 02:26:29 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 6 Jan 2003 18:26:29 -0800 (PST) Subject: [Python-Dev] new features for 2.3? In-Reply-To: <200301062107.h06L7L103338@odiug.zope.com> References: <200301061657.h06Gv8Z14933@odiug.zope.com> <200301062107.h06L7L103338@odiug.zope.com> Message-ID: [Guido van Rossum] > > I have always wondered, why does both ``cPickle`` (which uses camel-style > > naming which I thought was a no-no) and ``Pickle``? They do exactly the > > same thing (in theory). > > pickle.py is the specification of the protocol; cPickle.c is a > reimplementation that's up to 1000x faster. I always prototype new > features in pickle.py. > Ah, OK. Makes sense. Thanks for the clarification. > > Is there any other place where security has been built into > > something? Sounds like we should do a security inaudit (is that a > > word?) and rip out pretty much all security code. > > There's very little code devoted specifically to security. However, > there's a feature called "restricted mode", and in restricted mode, > certain introspections are disallowed. Restricted mode is on when a > particular stack frame's __builtins__ dictionary isn't the default one > (which is __builtin__.__dict__ -- note the difference between > __builtin__, which is a module, and __builtins__, which is a global > with magic meaning). Read the source for PyFrame_New(). > And while I am reading that piece of code, anything else I should take a look at? I am tired of not being able to help out more at the C level but I don't know where to start to get a good, overall view of the codebase short of starting at the eval loop and just reading the code that it calls (as of right now I just want a good, deep understanding of how Python does internal object representation and how extension modules actually work; parser can wait for another day =). 
> It turns out that in 2.2 and beyond, not enough restrictions were > placed on disallowing new introspections that were enabled by virtue > of the class/type integration, and that's the cause of most rexec > vulnerabilities. > Is there any desire to bother to fix this? Or would it be better to just rip this stuff out and hope some TrustedPython project pops up to take over rexec, Bastion, and such and do the work of making secure Python code? -Brett From guido@python.org Tue Jan 7 02:46:09 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 06 Jan 2003 21:46:09 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: Your message of "Mon, 06 Jan 2003 18:26:29 PST." References: <200301061657.h06Gv8Z14933@odiug.zope.com> <200301062107.h06L7L103338@odiug.zope.com> Message-ID: <200301070246.h072k9l02047@pcp02138704pcs.reston01.va.comcast.net> > > There's very little code devoted specifically to security. > > However, there's a feature called "restricted mode", and in > > restricted mode, certain introspections are disallowed. > > Restricted mode is on when a particular stack frame's __builtins__ > > dictionary isn't the default one (which is __builtin__.__dict__ -- > > note the difference between __builtin__, which is a module, and > > __builtins__, which is a global with magic meaning). Read the > > source for PyFrame_New(). > > And while I am reading that piece of code, anything else I should > take a look at? I am tired of not being able to help out more at > the C level but I don't know where to start to get a good, overall > view of the codebase short of starting at the eval loop and just > reading the code that it calls (as of right now I just want a good, > deep understanding of how Python does internal object representation > and how extension modules actually work; parser can wait for another > day =). 
For learning how things work, I recommend studying extension module code rather than the implementation first; then you can follow leads from the extension. Or use gdb to step through the C code of an extension doing something fairly simple. > > It turns out that in 2.2 and beyond, not enough restrictions were > > placed on disallowing new introspections that were enabled by > > virtue of the class/type integration, and that's the cause of most > > rexec vulnerabilities. > > Is there any desire to bother to fix this? Or would it be better to > just rip this stuff out and hope some TrustedPython project pops up > to take over rexec, Bastion, and such and do the work of making > secure Python code? I'd like the restricted mode even if it's not perfect, and I hope one day it will work again. It's mostly a matter of lack of brain cycles. --Guido van Rossum (home page: http://www.python.org/~guido/) From goodger@python.org Tue Jan 7 03:30:37 2003 From: goodger@python.org (David Goodger) Date: Mon, 06 Jan 2003 22:30:37 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: <200301070220.h072KY701927@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum wrote: >>> First of all, reST is going to be used a lot in Zope3. >> >> Cool! Is there a website or a mailing list thread containing the >> master plan? > > For what? For reStructuredText integration into Zope. I got the impression that it was an official thing. > For reST usage, AFAIK there isn't much except an email I saw today: > > http://lists.zope.org/pipermail/zope3-dev/2003-January/004737.html Thanks. >>> Maybe it could become a standard library module? >> >> That would be great. But could it be a bit premature? > [...] >> Maybe I'm just being too protective of "my baby". Perhaps it's time >> for it to face the realities of the big bad world. > > Maybe. Maybe I haven't been following reST closely enough. I do > admit that I was a bit worried when I saw how rough the tools are > (e.g. 
tools/html.py, which is the only thing I've used) In what way are the tools rough? Other than: > and also when I found that at first they didn't work with > Python 2.3 at all. Minor issues, all fixed now. > OTOH, maybe this will be an encouragement for you and the other reST > developers (are there any besides you? :-) But of course! There are many, contributing a little bit here & there. A few have contributed significant amounts. But it's mostly me, true. > to aim for a release beyond the CVS snapshot. Once you've released > it may make more sense to add it to Python 2.3. I don't think that Docutils itself is ready. Autodocumenting Python source, what I consider to be its main use case, isn't addressed yet. I've been aiming at another release when that's ready. I think the reStructuredText parser (with the one exception of interpreted text) is mature and ready though. I'd be happy to help integrate that. -- David Goodger Open-source projects: - Python Docutils: http://docutils.sourceforge.net/ (includes reStructuredText: http://docutils.sf.net/rst.html) - The Go Tools Project: http://gotools.sourceforge.net/ From dave@boost-consulting.com Tue Jan 7 22:35:47 2003 From: dave@boost-consulting.com (David Abrahams) Date: Tue, 07 Jan 2003 17:35:47 -0500 Subject: [Python-Dev] Trouble with Python 2.3a1 Message-ID: --=-=-= Ralf describes below a change in the name of classes defined in the __main__ module for 2.3a1, which causes some of our tests to fail. Was it intentional? --=-=-= Content-Type: message/rfc822 Content-Disposition: inline X-From-Line: rwgk@cci.lbl.gov Mon Jan 6 15:00:56 2003 Return-Path: Received: from boa.lbl.gov ([128.3.132.98] verified) by stlport.com (CommuniGate Pro SMTP 3.5.9) with ESMTP id 162837 for dave@boost-consulting.com; Mon, 06 Jan 2003 15:00:57 -0800 Received: (from rwgk@localhost) by boa.lbl.gov (8.11.6/8.11.6) id h06N0uv368572; Mon, 6 Jan 2003 15:00:56 -0800 (PST) Date: Mon, 6 Jan 2003 15:00:56 -0800 (PST) From: "Ralf W. 
Grosse-Kunstleve" Message-Id: <200301062300.h06N0uv368572@boa.lbl.gov> To: dave@boost-consulting.com Subject: Trouble with Python 2.3a1 Cc: rwgk@boa.lbl.gov X-Content-Length: 399 Lines: 17 Xref: NEFARIOUS mail.misc:17942 MIME-Version: 1.0 Hi David, I am back from my trip and I thought it is time to test our stuff with Python 2.3a1. Using the boost 1.29.0 release pickling of simple extension classes fails. I tracked this down to the following difference: Python 2.2: Python 2.3a1: Have you encountered this already? Thanks, Ralf --=-=-= -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution --=-=-=-- From mal@lemburg.com Tue Jan 7 22:56:07 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 07 Jan 2003 23:56:07 +0100 Subject: [Python-Dev] Misc. warnings Message-ID: <3E1B5B07.7000003@lemburg.com> Some warnings I get when compiling Python CVS on SuSE Linux: /home/lemburg/projects/Python/Dev-Python/Modules/readline.c: In function `flex_complete': /home/lemburg/projects/Python/Dev-Python/Modules/readline.c:542: warning: implicit declaration of function `completion_matches' /home/lemburg/projects/Python/Dev-Python/Modules/readline.c:542: warning: return makes pointer from integer without a cast test_bsddb3 skipped -- Use of the `bsddb' resource not enabled test_ossaudiodev skipped -- No module named ossaudiodev 2 skips unexpected on linux2: test_ossaudiodev test_bsddb3 Why would ossaudiodev be expected to work on Linux ? Dito for BSD DB 3 ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... 
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From aahz@pythoncraft.com Tue Jan 7 13:08:37 2003 From: aahz@pythoncraft.com (Aahz) Date: Tue, 7 Jan 2003 08:08:37 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: <200301061657.h06Gv8Z14933@odiug.zope.com> References: <200301061657.h06Gv8Z14933@odiug.zope.com> Message-ID: <20030107130837.GA4956@panix.com> On Mon, Jan 06, 2003, Guido van Rossum wrote: > > First of all, reST is going to be used a lot in Zope3. Maybe it could > become a standard library module? -1 Speaking as someone who's using it for "production" purposes, I don't think reST is ready for prime time. Including it in sandbox/ or Demo/ would be fine, but I suspect reST is going to go through at least one more major API revision before it's released (i.e. within the next six months). Python 2.4 seems more like the appropriate timescale. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "There are three kinds of lies: Lies, Damn Lies, and Statistics." --Disraeli From guido@python.org Tue Jan 7 13:48:48 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 07 Jan 2003 08:48:48 -0500 Subject: [Python-Dev] test_logging failures Message-ID: <200301071348.h07DmmP05594@pcp02138704pcs.reston01.va.comcast.net> I suddenly see spectacular failures in the logging module on Linux. Does anybody know what's up with that? 
$ ./python ../Lib/test/regrtest.py test_logging.py test_logging Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) test test_logging produced unexpected output: ********************************************************************** *** lines 490-515 of expected output missing: - ERR -> CRITICAL: Message 0 (via logrecv.tcp.ERR) - ERR -> ERROR: Message 1 (via logrecv.tcp.ERR) - INF -> CRITICAL: Message 2 (via logrecv.tcp.INF) - INF -> ERROR: Message 3 (via logrecv.tcp.INF) - INF -> WARN: Message 4 (via logrecv.tcp.INF) - INF -> INFO: Message 5 (via logrecv.tcp.INF) - INF.UNDEF -> CRITICAL: Message 6 (via logrecv.tcp.INF.UNDEF) - INF.UNDEF -> ERROR: Message 7 (via logrecv.tcp.INF.UNDEF) - INF.UNDEF -> WARN: Message 8 (via logrecv.tcp.INF.UNDEF) - INF.UNDEF -> INFO: Message 9 (via logrecv.tcp.INF.UNDEF) - INF.ERR -> CRITICAL: Message 10 (via logrecv.tcp.INF.ERR) - INF.ERR -> ERROR: Message 11 (via logrecv.tcp.INF.ERR) - INF.ERR.UNDEF -> CRITICAL: Message 12 (via logrecv.tcp.INF.ERR.UNDEF) - INF.ERR.UNDEF -> ERROR: Message 13 (via logrecv.tcp.INF.ERR.UNDEF) - DEB -> CRITICAL: Message 14 (via logrecv.tcp.DEB) - DEB -> ERROR: Message 15 (via logrecv.tcp.DEB) - DEB -> WARN: Message 16 (via logrecv.tcp.DEB) - DEB -> INFO: Message 17 (via logrecv.tcp.DEB) - DEB -> DEBUG: Message 18 (via logrecv.tcp.DEB) - UNDEF -> CRITICAL: Message 19 (via logrecv.tcp.UNDEF) - UNDEF -> ERROR: Message 20 (via logrecv.tcp.UNDEF) - UNDEF -> WARN: Message 21 (via logrecv.tcp.UNDEF) - UNDEF -> INFO: Message 22 (via logrecv.tcp.UNDEF) - 
INF.BADPARENT.UNDEF -> CRITICAL: Message 23 (via logrecv.tcp.INF.BADPARENT.UNDEF) - INF.BADPARENT -> CRITICAL: Message 24 (via logrecv.tcp.INF.BADPARENT) - INF -> INFO: Messages should bear numbers 0 through 24. (via logrecv.tcp.INF) ********************************************************************** 1 test failed: test_logging ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) 
ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in
emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file Traceback (most recent call last): File "/home/guido/projects/trunk/Lib/logging/__init__.py", line 645, in emit self.stream.write("%s\n" % msg) ValueError: I/O operation on closed file $ --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jan 7 13:12:54 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 07 Jan 2003 08:12:54 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: Your message of "Mon, 06 Jan 2003 22:30:37 EST." References: Message-ID: <200301071312.h07DCsD03400@pcp02138704pcs.reston01.va.comcast.net> > >>> Maybe it could become a standard library module? > >> > >> That would be great. But could it be a bit premature? > > [...] > >> Maybe I'm just being too protective of "my baby". Perhaps it's time > >> for it to face the realities of the big bad world. > > > > Maybe. Maybe I haven't been following reST closely enough. I do > > admit that I was a bit worried when I saw how rough the tools are > > (e.g. tools/html.py, which is the only thing I've used) > > In what way are the tools rough? Well, they aren't even installed. > Other than: > > > and also when I found that at first they didn't work with > > Python 2.3 at all. > > Minor issues, all fixed now. Yeah, everything's a minor issue of programming. :-) > > OTOH, maybe this will be an encouragement for you and the other reST > > developers (are there any besides you? :-) > > But of course! There are many, contributing a little bit here & there. > A few have contributed significant amounts. 
But it's mostly me, true. For a project of this size, that's often a good idea, as long as the main developer listens to the users (which you do). So congratulations! > > to aim for a release beyond the CVS snapshot. Once you've released > > it may make more sense to add it to Python 2.3. > > I don't think that Docutils itself is ready. Autodocumenting Python source, > what I consider to be its main use case, isn't addressed yet. I've been > aiming at another release when that's ready. > > I think the reStructuredText parser (with the one exception of interpreted > text) is mature and ready though. I'd be happy to help integrate that. Hm, I'd rather do a package deal than piecemeal. Let's forget this until you feel you're truly ready. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jan 7 23:12:58 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 07 Jan 2003 18:12:58 -0500 Subject: [Python-Dev] Trouble with Python 2.3a1 In-Reply-To: Your message of "Tue, 07 Jan 2003 17:35:47 EST." References: Message-ID: <200301072312.h07NCxk29254@odiug.zope.com> > Ralf describes below a change in the name of classes defined in the > __main__ module for 2.3a1, which causes some of our tests to fail. > Was it intentional? I need more information. __main__ gets prepended in certain situations when no module name is given. I don't recall exactly what changed. --Guido van Rossum (home page: http://www.python.org/~guido/) From walter@livinglogic.de Tue Jan 7 16:38:31 2003 From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue, 07 Jan 2003 17:38:31 +0100 Subject: [Python-Dev] no expected test output for test_sort? 
In-Reply-To: <15892.56781.473869.187757@montanaro.dyndns.org> References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <15892.56781.473869.187757@montanaro.dyndns.org> Message-ID: <3E1B0287.6080904@livinglogic.de> Skip Montanaro wrote: > Brett> Also, is there any desire to try to move all of the regression > Brett> tests over to PyUnit? Or is the general consensus that it just > Brett> isn't worth the time to move them over? > > I think this is something you probably don't want to do all at once, but as > you add new tests. It's probably also best left for the earliest part of > the development cycle, e.g., right after 2.3final is released. Screwing up > your test suite any more than necessary during alpha/beta testing seems sort > of like aiming a shotgun at your foot to me. I've opened a patch for tests ported to PyUnit: http://www.python.org/sf/662807. The first four tests ported are: test_pow, test_charmapcodec, test_userdict and test_b1. Bye, Walter Dörwald From barry@python.org Tue Jan 7 23:30:39 2003 From: barry@python.org (Barry A. Warsaw) Date: Tue, 7 Jan 2003 18:30:39 -0500 Subject: [Python-Dev] Misc. warnings References: <3E1B5B07.7000003@lemburg.com> Message-ID: <15899.25375.120408.421971@gargle.gargle.HOWL> >>>>> "MAL" == M writes: | Dito for BSD DB 3 ? Python 2.3's bsddb module should work with BerkeleyDB back to at least 3.3.11, which my Redhat 7.3 box comes with as the db3-devel-3.3.11-6 package. -Barry From bac@OCF.Berkeley.EDU Tue Jan 7 23:34:30 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Tue, 7 Jan 2003 15:34:30 -0800 (PST) Subject: [Python-Dev] Misc. warnings In-Reply-To: <3E1B5B07.7000003@lemburg.com> References: <3E1B5B07.7000003@lemburg.com> Message-ID: [M.-A.
Lemburg] > test_bsddb3 skipped -- Use of the `bsddb' resource not enabled > test_ossaudiodev skipped -- No module named ossaudiodev > 2 skips unexpected on linux2: > test_ossaudiodev test_bsddb3 > > Why would ossaudiodev be expected to work on Linux ? > Dito for BSD DB 3 ? > Because they *can* work on Linux. The only way around this would be to come up with some way to either figure out from introspection that a module is dependent on a third-party module or have it explicitly listed in regrtest.py; neither sound foolproof. If you don't want to be warned about it, then you could apply my patch (#658316) that implements the skips.txt functionality and just list the tests to be expected to be skipped on your computer . -Brett From gtalvola@nameconnector.com Tue Jan 7 18:55:26 2003 From: gtalvola@nameconnector.com (Geoffrey Talvola) Date: Tue, 7 Jan 2003 13:55:26 -0500 Subject: [Python-Dev] new features for 2.3? Message-ID: <61957B071FF421419E567A28A45C7FE540106D@mailbox.nameconnector.com> Guido van Rossum [mailto:guido@python.org] wrote: > I'd also like to get rid of __safe_for_unpickling__ and all other > pseudo security features. Attempting to unpickle pickles from an > untrusted source is insane, and nothing can help us there; I'd rather > make the marshal protocol bulletproof (all it needs is a few more > checks for inconsistent data and a little better error handling). 2 questions: 1) Are you going to retain the current ability to create a cPickle.Unpickler, set its find_global attribute to a function that only allows certain trusted classes to be unpickled (or perhaps none at all), and use that unpickler object to "safely" unpickle strings? I'm asking because Webware for Python contains a PickleRPC protocol which uses cPickle in this way, and it would be nice to be able to continue using it with 2.3. 
2) Do you think this is indeed safe, or should we scrap it and switch to a new MarshalRPC instead (as indicated by your "attempting to unpickle pickles from an untrusted source is insane" remark)? We originally used pickles because then we can allow certain types and classes (such as mxDateTime objects) and from my understanding, that wouldn't be possible with marshal. - Geoff From jeremy@alum.mit.edu Tue Jan 7 19:19:15 2003 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Tue, 7 Jan 2003 14:19:15 -0500 Subject: [Python-Dev] tutor(function|module) In-Reply-To: <20030105013539.GB7675@homer.gst.com> References: <6761BBA0-2030-11D7-BF12-000A27B19B96@oratrix.com> <20030105013539.GB7675@homer.gst.com> Message-ID: <15899.10291.374824.500244@slothrop.zope.com> >>>>> "CB" == Christopher Blunck writes: CB> I also recognize that this "feature" is not a core language CB> functionality. But due to its special nature (I put this into CB> the same area as help) I think that it does warrant thought CB> about if it deserves to be included as a core feature *simply CB> because* it might really help newbies out. CB> A web page full of examples would be great, but again - would CB> Joe New User know about the web page if they didn't even know CB> about the lang (because it came pre-installed)? I don't know... I expect that a newbie is better able to use Google than he or she is able to poke around a Python installation. Creating a web page is a good way to serve them. An added benefit is that people can vote with their clicks. If you write a great tutorial, people will find it via a search engine. If someone writes a greater tutorial, people can use it instead. Jeremy From guido@python.org Tue Jan 7 19:48:58 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 07 Jan 2003 14:48:58 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: Your message of "Tue, 07 Jan 2003 13:55:26 EST."
<61957B071FF421419E567A28A45C7FE540106D@mailbox.nameconnector.com> References: <61957B071FF421419E567A28A45C7FE540106D@mailbox.nameconnector.com> Message-ID: <200301071948.h07JmwE13543@odiug.zope.com> > > I'd also like to get rid of __safe_for_unpickling__ and all other > > pseudo security features. Attempting to unpickle pickles from an > > untrusted source is insane, and nothing can help us there; I'd rather > > make the marshal protocol bulletproof (all it needs is a few more > > checks for inconsistent data and a little better error handling). > > 2 questions: > > 1) Are you going to retain the current ability to create a > cPickle.Unpickler, set its find_global attribute to a function that only > allows certain trusted classes to be unpickled (or perhaps none at all), and > use that unpickler object to "safely" unpickle strings? I have no intention to remove existing APIs, so yes. > I'm asking because Webware for Python contains a PickleRPC protocol which > uses cPickle in this way, and it would be nice to be able to continue using > it with 2.3. > > 2) Do you think this is indeed safe, or should we scrap it and switch to a > new MarshalRPC instead (as indicated by your "attempting to unpickle pickles > from an > untrusted source is insane" remark)? We originally used pickles because > then we can allow certain types and classes (such as mxDateTime objects) and > from my understanding, that wouldn't be possible with marshal. I have not done a security audit of pickling, and until I have, I don't think it's safe. I didn't write pickling with safety in mind, and although Jim Fulton was thinking about safety when he wrote cPickle, I don't think that anyone has ever really seriously looked for security holes in unpickling. There may be accidental holes (code not doing as good a job as it could of checking for bad data), but there may also be conceptual holes (things that cannot be made safe without redesigning the pickling architecture). Who knows. 
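[A minimal sketch of the allow-list technique asked about in question 1, not code from this thread. It assumes a modern Python 3 interpreter, where the hook is written as an overridden Unpickler.find_class() method rather than the find_global attribute of the 2.x cPickle.Unpickler. ALLOWED_GLOBALS, RestrictedUnpickler and restricted_loads are hypothetical names chosen for illustration. As the reply above stresses, unpickling has not been audited, so an allow-list like this narrows what can be loaded but should not be read as proof of safety.]

```python
import datetime
import io
import pickle

# Hypothetical allow-list of (module, name) pairs the unpickler may resolve.
# Everything else, including builtins such as eval or print, is refused.
ALLOWED_GLOBALS = {("datetime", "date")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not on the allow-list."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            "global %s.%s is not allowed" % (module, name))


def restricted_loads(data):
    """Counterpart of pickle.loads() that goes through the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers of ints, strings and floats need no globals at all.
plain = restricted_loads(pickle.dumps([1, "two", {"three": 3.0}]))

# An allow-listed class round-trips; anything else raises UnpicklingError.
day = restricted_loads(pickle.dumps(datetime.date(2003, 1, 7)))
```

[An empty allow-list corresponds to the "or perhaps none at all" case in the question: only values that need no global lookup can then be loaded.]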
--Guido van Rossum (home page: http://www.python.org/~guido/) From Jack.Jansen@oratrix.com Mon Jan 6 01:45:03 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Mon, 6 Jan 2003 02:45:03 +0100 Subject: Slow String Repeat (was: [Python-Dev] PyBuffer* vs. array.array()) In-Reply-To: <3E18C3BC.1020108@tismer.com> Message-ID: <79CC342C-2118-11D7-BB12-000A27B19B96@oratrix.com> On Monday, Jan 6, 2003, at 00:46 Europe/Amsterdam, Christian Tismer wrote: > The central copying code in stringobject.c is the following > tight loop: > > for (i = 0; i < size; i += a->ob_size) > memcpy(op->ob_sval+i, a->ob_sval, (int) a->ob_size); > > For my example, this memcpy is started for every single one > of the one million bytes. So the overhead of memcpy, > let it be a function call or a macro, will be executed > a million times. Oops, I replied before seeing this message, this does sound plausible. But that gives an easy way to fix it: for copies larger than a certain factor just copy the source object, then duplicate the source object until you're at size/2, then duplicate the last bit. That is, if it is worth the trouble to optimize this. -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From theller@python.net Tue Jan 7 21:11:45 2003 From: theller@python.net (Thomas Heller) Date: 07 Jan 2003 22:11:45 +0100 Subject: [Python-Dev] sys.path[0] Message-ID: <8yxwd5ri.fsf@python.net> [Kevin Altis brought this to my attention, so I'm cc-ing him] The python docs state on sys.path: As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. This is at least misleading.
It appears that for certain ways Python is started, the first item on sys.path is a relative path name, or even empty, even if a script was specified, and the path would have been available. This leads to problems if the script changes the working directory. Not always because the programmer explicitly called os.chdir(), but also 'behind the scenes' when a GUI is used. Shouldn't Python convert sys.path to absolute path names, to avoid these problems? Thomas From Raymond Hettinger" Is there an interest in further development of Oren's idea: speed-up access to non-locals by giving dictionaries a fast special case lookup for failed searches with interned strings (the usual case for access to builtins and globals) The proof of concept code looked very promising to me. It can be implemented in a way that is easily disconnected if a better approach is found. This looks like low hanging fruit. Raymond Hettinger From martin@v.loewis.de Tue Jan 7 23:32:43 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Wed, 08 Jan 2003 00:32:43 +0100 Subject: [Python-Dev] Misc. warnings In-Reply-To: <3E1B5B07.7000003@lemburg.com> References: <3E1B5B07.7000003@lemburg.com> Message-ID: <3E1B639B.5060306@v.loewis.de> M.-A. Lemburg wrote: > test_bsddb3 skipped -- Use of the `bsddb' resource not enabled > test_ossaudiodev skipped -- No module named ossaudiodev > 2 skips unexpected on linux2: > test_ossaudiodev test_bsddb3 > > Why would ossaudiodev be expected to work on Linux ? > Dito for BSD DB 3 ? Passive voice is misleading here: Who would or would not expect things to work on Linux? The message "2 skips unexpected" does not mean that anybody is not expecting something. It merely means that those tests are skipped and not listed in the dictionary "_expectations".
I would personally expect that the bsddb3 test passes if Sleepycat BSD DB is installed on the system (which should be the case on most Linux systems), and the bsddb resource is activated when running regrtest. I would further expect that ossaudiodev passes on every Linux system that has a soundcard installed. Regards, Martin From altis@semi-retired.com Wed Jan 8 01:00:42 2003 From: altis@semi-retired.com (Kevin Altis) Date: Tue, 7 Jan 2003 17:00:42 -0800 Subject: [Python-Dev] patch CGIHTTPServer.py for IE POST bug Message-ID: Steve Holden submitted a patch for the IE POST bug in December, which I thought had made it into 2.3a1, but apparently not. I've been using this select code to throw away extra bytes from IE POSTs on my Win2K box for about four months and haven't noticed any problems using MoinMoin on my local box as well as some test POST scripts. However, Steve says he is busy with PyCon and this patch, though small, probably needs review before a commit. The bug: http://sourceforge.net/tracker/?func=detail&aid=430160&group_id=5470&atid=10 5470 and http://sourceforge.net/tracker/?func=detail&aid=427345&group_id=5470&atid=10 5470 The patch: http://sourceforge.net/tracker/?func=detail&aid=654910&group_id=5470&atid=30 5470 ka From guido@python.org Wed Jan 8 00:57:30 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 07 Jan 2003 19:57:30 -0500 Subject: [Python-Dev] Misc. warnings In-Reply-To: Your message of "Tue, 07 Jan 2003 23:56:07 +0100." <3E1B5B07.7000003@lemburg.com> References: <3E1B5B07.7000003@lemburg.com> Message-ID: <200301080057.h080vUE06904@pcp02138704pcs.reston01.va.comcast.net> > Why would ossaudiodev be expected to work on Linux ? It should work, except setup.py hasn't been fixed to build it. 
Here's a patch: *** setup.py 4 Jan 2003 04:06:56 -0000 1.133 --- setup.py 8 Jan 2003 00:55:37 -0000 *************** *** 722,727 **** --- 722,728 ---- if platform == 'linux2': # Linux-specific modules exts.append( Extension('linuxaudiodev', ['linuxaudiodev.c']) ) + exts.append( Extension('ossaudiodev', ['ossaudiodev.c']) ) if platform == 'sunos5': # SunOS specific modules However, then the test suite fails: $ ./python ../Lib/test/regrtest.py test_ossaudiodev.py test_ossaudiodev test test_ossaudiodev produced unexpected output: ********************************************************************** *** lines 2-7 of actual output doesn't appear in expected output after line 1: + expected rate >= 0, not -1 + expected sample size >= 0, not -2 + nchannels must be 1 or 2, not 3 + unknown audio encoding: 177 + for linear unsigned 16-bit little-endian audio, expected sample size 16, not 8 + for linear unsigned 8-bit audio, expected sample size 8, not 16 ********************************************************************** 1 test failed: test_ossaudiodev $ Also, I note that the compilation of ossaudiodev should really be dependent on the existence of either (Linux) or (BSD). What's up with that, Greg? I've seen tons of checkins to ossaudiodev -- how close is it to completion? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jan 8 01:15:19 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 07 Jan 2003 20:15:19 -0500 Subject: [Python-Dev] sys.path[0] In-Reply-To: Your message of "07 Jan 2003 22:11:45 +0100." <8yxwd5ri.fsf@python.net> References: <8yxwd5ri.fsf@python.net> Message-ID: <200301080115.h081FJW07152@pcp02138704pcs.reston01.va.comcast.net> > [Kevin Altis brought this to my attention, so I'm cc-ing him] > > The python docs state on sys.path: > > As initialized upon program startup, the first item of this list, > path[0], is the directory containing the script that was used to > invoke the Python interpreter. 
If the script directory is not > available (e.g. if the interpreter is invoked interactively or if > the script is read from standard input), path[0] is the empty > string, which directs Python to search modules in the current > directory first. > > This is at least misleading. > > It appears that for certain ways Python is started, the first item > on sys.path is a relative path name, or even empty, even if a script > was specified, and the path would have been available. What's wrong with a relative pathname? If you invoke the script using a relative pathname, why shouldn't that be what you get? The docs don't say that it's the absolute pathname, so I don't think you can claim that the docs are misleading here. > This leads to problems if the script changes the working directory. > Not always because the programmer explicitely called os.chdir(), > also 'behind the scenes' when a GUI is used. Why would a GUI change the current directory? That seems pretty broken. > Shouldn't Python convert sys.path to absolute path names, to avoid > these problems? site.py converts sys.path entries to absolute pathnames, *except* for the path entry for to the script directory, because that is added to sys.path *after* site.py is run. I'm disinclined to do anything about this, except perhaps warn that the script directory may be given as a relative path. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jan 8 01:18:42 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 07 Jan 2003 20:18:42 -0500 Subject: [Python-Dev] SF patch 606098: fast dictionary lookup by name In-Reply-To: Your message of "Tue, 07 Jan 2003 15:59:43 EST." 
<001d01c2b68f$d4f50080$125ffea9@oemcomputer> References: <001d01c2b68f$d4f50080$125ffea9@oemcomputer> Message-ID: <200301080118.h081IgB07192@pcp02138704pcs.reston01.va.comcast.net> > Is there an interest in further development of Oren's idea: > > speed-up access to non-locals by giving dictionaries a fast special > case lookup for failed searches with interned strings (the usual > case for access to builtins and globals) > > The proof of concept code looked very promising to me. > It can be implemented in a way that is easily disconnected > if a better approach is found. This looks like low hanging > fruit. +1. I like Oren's ideas here a lot. He promised a new version, but it appears he has no time for that now, so I'd appreciate it if you could develop this further without waiting for him (while acknowledging him, of course). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jan 8 01:22:15 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 07 Jan 2003 20:22:15 -0500 Subject: [Python-Dev] Misc. warnings In-Reply-To: Your message of "Tue, 07 Jan 2003 19:57:30 EST."
<200301080057.h080vUE06904@pcp02138704pcs.reston01.va.comcast.net> References: <3E1B5B07.7000003@lemburg.com> <200301080057.h080vUE06904@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200301080122.h081MFL07239@pcp02138704pcs.reston01.va.comcast.net> > However, then the test suite fails: > > $ ./python ../Lib/test/regrtest.py test_ossaudiodev.py > test_ossaudiodev > test test_ossaudiodev produced unexpected output: > ********************************************************************** > *** lines 2-7 of actual output doesn't appear in expected output after line 1: > + expected rate >= 0, not -1 > + expected sample size >= 0, not -2 > + nchannels must be 1 or 2, not 3 > + unknown audio encoding: 177 > + for linear unsigned 16-bit little-endian audio, expected sample size 16, not 8 > + for linear unsigned 8-bit audio, expected sample size 8, not 16 > ********************************************************************** > 1 test failed: > test_ossaudiodev > $ Come to think of it, this is shallow; the file Lib/test/output/test_ossaudiodev is missing. I'll fix this and the setup.py; someone else can worry about enabling it for BSD and other boxes. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Wed Jan 8 01:24:29 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 07 Jan 2003 20:24:29 -0500 Subject: [Python-Dev] sys.path[0] In-Reply-To: <200301080115.h081FJW07152@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] > ... > Why would a GUI change the current directory? That seems pretty > broken. Sure is. Sean True and Mark Hammond and I all stumbled over this in the spambayes project while working on the Outlook client. Turned out that, on Win9x but not 2K (or perhaps it had to do with which version and service pack level of Outlook too ...), calling some subset of MS MAPI functions had the side effect of changing the current directory to a system directory in which some of MS's MAPI code lives. 
That really screwed up subsequent imports. I don't claim it's Python's job to try to hide that, though. From gward@python.net Wed Jan 8 01:32:46 2003 From: gward@python.net (Greg Ward) Date: Tue, 7 Jan 2003 20:32:46 -0500 Subject: [Python-Dev] Misc. warnings In-Reply-To: <200301080057.h080vUE06904@pcp02138704pcs.reston01.va.comcast.net> References: <3E1B5B07.7000003@lemburg.com> <200301080057.h080vUE06904@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030108013246.GA3040@cthulhu.gerg.ca> On 07 January 2003, Guido van Rossum said: > It should work, except setup.py hasn't been fixed to build it. Here's > a patch: Err, yeah, I have a similar patch on my local copy, but I haven't checked it in yet because of this: > Also, I note that the compilation of ossaudiodev should really be > dependent on the existence of either (Linux) or > (BSD). and, come to think of it, this: > However, then the test suite fails: > > $ ./python ../Lib/test/regrtest.py test_ossaudiodev.py > test_ossaudiodev > test test_ossaudiodev produced unexpected output: > ********************************************************************** > *** lines 2-7 of actual output doesn't appear in expected output after line 1: is another good reason not to check that setup.py change in. ;-) That one's easy: I never copied output/test_linuxaudiodev to output/test_ossaudiodev. Done now, trying it out, will check in soon. I'll let someone else worry about activating ossaudiodev on FreeBSD. > What's up with that, Greg? I've seen tons of checkins to ossaudiodev > -- how close is it to completion? I'm fairly happy with the code now. (I think -- haven't looked at it since last week.) Have some preliminary docs supplied by the guy who wrote the mixer interface (Nick FitzRoy-Dale) that I have to look at and beef up. I also have crazy ideas about writing an extensive, and noisy, test script that would go in Tools or sandbox or something. But the module itself is pretty good now. 
Greg -- Greg Ward http://www.gerg.ca/ I joined scientology at a garage sale!! From guido@python.org Wed Jan 8 01:34:28 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 07 Jan 2003 20:34:28 -0500 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: Your message of "Tue, 07 Jan 2003 17:38:31 +0100." <3E1B0287.6080904@livinglogic.de> References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <15892.56781.473869.187757@montanaro.dyndns.org> <3E1B0287.6080904@livinglogic.de> Message-ID: <200301080134.h081YSK07358@pcp02138704pcs.reston01.va.comcast.net> > I've opened a patch for tests ported to PyUnit: > http://www.python.org/sf/662807. > > The first four test ported are: test_pow, test_charmapcodec, > test_userdict and test_b1. I don't know how you picked these, but note that test_b1 and test_b2 really belong together, and should really be combined into test_builtin. (Somehow, long ago, I thought that there was too much there to fit in one file. Silly me. ;-) --Guido van Rossum (home page: http://www.python.org/~guido/) From neal@metaslash.com Wed Jan 8 02:08:45 2003 From: neal@metaslash.com (Neal Norwitz) Date: Tue, 7 Jan 2003 21:08:45 -0500 Subject: [Python-Dev] Raising string exceptions Message-ID: <20030108020845.GL29873@epoch.metaslash.com> Should the error message when raising an invalid exception: "exceptions must be strings, classes or instances" in ceval.c:2743 be changed to remove strings, since this is deprecated? Also, should we add a PendingDeprecationWarning when raising a string exception? I suspect a DeprecationWarning is too much for 2.3. Neal From mhammond@skippinet.com.au Wed Jan 8 02:11:29 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 8 Jan 2003 13:11:29 +1100 Subject: [Python-Dev] sys.path[0] In-Reply-To: Message-ID: <025801c2b6bb$4252afd0$530f8490@eden> [Tim] > [Guido] > > ... > > Why would a GUI change the current directory? 
That seems pretty > > broken. > > Sure is. Sean True and Mark Hammond and I all stumbled over > this in the > spambayes project while working on the Outlook client. > Turned out that, on > Win9x but not 2K (or perhaps it had to do with which version > and service > pack level of Outlook too ...), calling some subset of MS > MAPI functions had Worse, displaying a standard "open file" dialog will change the current directory of your app to the current directory of the dialog when closed. Mark. From tim.one@comcast.net Wed Jan 8 02:19:02 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 07 Jan 2003 21:19:02 -0500 Subject: [Python-Dev] Unidiff tool In-Reply-To: Message-ID: [Paul Moore] > Actually, over my dial-up link, I've never got cvs diff to work. It > sits there for *ages* (30-50 mins) talking to the server, but never > does much of any use. Same problem with "cvs update". You may wish to upgrade to Win98 . In the first year of PythonLabs existence, I did all Python development on a laptop with a 28K phone modem, from a DOS box with barebones cmdline CVS. Worked fine. I still use CVS over dialup about twice a month, with both Python and Zope trees. From tim.one@comcast.net Wed Jan 8 02:21:52 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 07 Jan 2003 21:21:52 -0500 Subject: [Python-Dev] sys.path[0] In-Reply-To: <025801c2b6bb$4252afd0$530f8490@eden> Message-ID: [Mark Hammond] > Worse, displaying a standard "open file" dialog will change the > current directory of your app to the current directory of the dialog > when closed. Life is easier here if you choose to view *that* one as a feature . From skip@pobox.com Wed Jan 8 03:01:36 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 7 Jan 2003 21:01:36 -0600 Subject: [Python-Dev] Misc. 
warnings In-Reply-To: <3E1B5B07.7000003@lemburg.com> References: <3E1B5B07.7000003@lemburg.com> Message-ID: <15899.38032.74568.421252@montanaro.dyndns.org> mal> test_bsddb3 skipped -- Use of the `bsddb' resource not enabled Where are the libdb include files and libraries installed on SuSE? If you've got 'em it's easy to update setup.py. mal> test_ossaudiodev skipped -- No module named ossaudiodev I noticed this the other day on my Mandrake system. I guess some underlying library is missing or installed in an unexpected place. Skip From gward@python.net Wed Jan 8 03:09:39 2003 From: gward@python.net (Greg Ward) Date: Tue, 7 Jan 2003 22:09:39 -0500 Subject: [Python-Dev] Directories w/ spaces - pita In-Reply-To: <15893.49508.827550.246089@montanaro.dyndns.org> References: <15893.49508.827550.246089@montanaro.dyndns.org> Message-ID: <20030108030939.GA14013@cthulhu.gerg.ca> On 03 January 2003, Skip Montanaro said: > For example, on my Mac OS X system, building creates these two > directories: > > ./build/lib.darwin-6.3-Power Macintosh-2.3 > ./build/temp.darwin-6.3-Power Macintosh-2.3 > > I assume this is a configure thing, accepting the output of uname -m without > further processing. No, it's a distutils thing -- the code to change is get_platform() in Lib/distutils/util.py. Should probably munge *all* the strings returned by os.uname() to remove characters that don't belong on filenames. Greg -- Greg Ward http://www.gerg.ca/ I am NOT a nut.... From neal@metaslash.com Wed Jan 8 03:44:59 2003 From: neal@metaslash.com (Neal Norwitz) Date: Tue, 7 Jan 2003 22:44:59 -0500 Subject: [Python-Dev] test_logging failures In-Reply-To: <200301071348.h07DmmP05594@pcp02138704pcs.reston01.va.comcast.net> References: <200301071348.h07DmmP05594@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030108034458.GO29873@epoch.metaslash.com> On Tue, Jan 07, 2003 at 08:48:48AM -0500, Guido van Rossum wrote: > I suddenly see spectacular failures in the logging module on Linux. 
> Does anybody know what's up with that? No. It works for me on Linux. It also hasn't broken anything on the snake farm. The test uses threads. Neal From barry@python.org Wed Jan 8 03:49:48 2003 From: barry@python.org (Barry A. Warsaw) Date: Tue, 7 Jan 2003 22:49:48 -0500 Subject: [Python-Dev] Misc. warnings References: <3E1B5B07.7000003@lemburg.com> <3E1B639B.5060306@v.loewis.de> Message-ID: <15899.40924.635734.421206@gargle.gargle.HOWL> >>>>> "MvL" == Martin v Löwis writes: MvL> I would personally expect that the bsddb3 test passes if MvL> Sleepycat BSD DB is installed on the system (which should be MvL> the case on most Linux systems), and the bsddb resource is MvL> activated when running regrtest. Actually, there are two bsddb tests. The first one, test_bsddb, is exceedingly simple and always runs. The second, misnamed test_bsddb3 is the full test suite from PyBSDDB and only runs when the bsddb resource is enabled. -Barry From guido@python.org Tue Jan 7 21:40:22 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 07 Jan 2003 16:40:22 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: Your message of "Mon, 06 Jan 2003 11:14:56 PST." References: Message-ID: <200301072140.h07LeNW26018@odiug.zope.com> > I would appreciate a little more explanation regarding the use of > pickles. Since I've brought up the issue off-list a few times about > using pickles of built-ins such as strings, integers, lists, and > dictionaries (and probably datetime), but no sub-classes of > built-ins or custom classes. That sentence not parse. > I understand that there are security concerns, but does this mean > that exchanging a pickle via XML-RPC and/or SOAP or exchanging a > pickle the way you might use a vCard (pickle as just data) is simply > not doable? I wouldn't touch a pickle that came in someone's email signature with a 10-foot pole.
It might seem safe now, and 2 years from now, when everybody's doing it, a bored teenager in China finds a way to use it to transport a feature. No, thank you. > How does this impact ZODB (if at all) for the same types of > applications? Binary pickles are extremely fast and easy to use, but > it appears that using them in a situation where you need to exchange > data is just not doable without additional support modules. Pickles are fine as long as you trust the data source. For untrusted situations, you should design a custom format that OBVIOUSLY cannot be used to hack into your system. XML sounds pretty good. HTML is bad (given JavaScript etc.). > Or perhaps there just needs to be a standard safe unpickler that is > part of 2.3 that only excepts built-ins of "safe" types? If the > pickle contained something unsafe it would simply throw an exception > but no harm would be done. No, for the same reasons as above. I don't think you can prove it's safe, so I don't think you should trust it. Making marshal safe would be much easier, as long as you don't use eval, exec or new.function() on the result. (Marshal currently can be caused to SegFault by giving it bad data, but that's a localized problem. The problem with pickle is that you have to validate the entire Python interpreter and understand all the hidden introspective hooks that are available.) --Guido van Rossum (home page: http://www.python.org/~guido/) From whisper@oz.net Wed Jan 8 05:25:04 2003 From: whisper@oz.net (David LeBlanc) Date: Tue, 7 Jan 2003 21:25:04 -0800 Subject: [Python-Dev] sys.path[0] In-Reply-To: <025801c2b6bb$4252afd0$530f8490@eden> Message-ID: This has been a "feature" of MS Windows since Windows 2.11 16 bit version. Apps I was developing in the mid 80's had to capture and save the app's startup directory in order to access resources local to the app because many file ops and file dialogs would change the cwd. 
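[A minimal sketch of the workaround described above, not code from this thread: record the application's directory once at startup and build resource paths from it, so that a later change of the working directory (by a file dialog, a MAPI call, or os.chdir()) cannot break them. It assumes a modern Python; startup_dir, APP_DIR and resource_path are hypothetical names, not part of any library.]

```python
import os
import sys


def startup_dir(argv0, cwd):
    """Absolute directory holding the script, resolved against cwd."""
    # os.path.join() keeps argv0 unchanged when it is already absolute.
    return os.path.normpath(os.path.dirname(os.path.join(cwd, argv0)))


# Capture once, at import time, before anything can change the cwd.
APP_DIR = startup_dir(sys.argv[0], os.getcwd())


def resource_path(name):
    """Path of a file shipped next to the script, independent of the cwd."""
    return os.path.join(APP_DIR, name)
```

[The same captured value could also replace a relative sys.path[0], which is the case raised earlier in the thread, though whether that is appropriate depends on the application.]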
David LeBlanc Seattle, WA USA > -----Original Message----- > From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On > Behalf Of Mark Hammond > Sent: Tuesday, January 07, 2003 18:11 > To: python-dev@python.org > Subject: RE: [Python-Dev] sys.path[0] > > > [Tim] > > [Guido] > > > ... > > > Why would a GUI change the current directory? That seems pretty > > > broken. > > > > Sure is. Sean True and Mark Hammond and I all stumbled over > > this in the > > spambayes project while working on the Outlook client. > > Turned out that, on > > Win9x but not 2K (or perhaps it had to do with which version > > and service > > pack level of Outlook too ...), calling some subset of MS > > MAPI functions had > > Worse, displaying a standard "open file" dialog will change the current > directory of your app to the current directory of the dialog when closed. > > Mark. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From nas@python.ca Wed Jan 8 06:13:00 2003 From: nas@python.ca (Neil Schemenauer) Date: Tue, 7 Jan 2003 22:13:00 -0800 Subject: [Python-Dev] new features for 2.3? In-Reply-To: <200301072140.h07LeNW26018@odiug.zope.com> References: <200301072140.h07LeNW26018@odiug.zope.com> Message-ID: <20030108061300.GA24472@glacier.arctrix.com> Guido van Rossum wrote: > For untrusted situations, you should design a custom format that > OBVIOUSLY cannot be used to hack into your system. XML sounds pretty > good. Ugh. XML is way too verbose and is slow to parse, IMHO. A limited subset of the pickle or marshal format would be pretty good. > No, for the same reasons as above. I don't think you can prove > [pickle is] safe, so I don't think you should trust it. What about a subset that only included int, float, string, unicode, dict, and tuple? > Making marshal safe would be much easier, as long as you don't use > eval, exec or new.function() on the result.
The documentation for marshal says "details of the format are undocumented on purpose; it may change between Python versions". Maybe we need something like marshal that works on a limited set of types and has a stable format. Neil From nas@python.ca Wed Jan 8 06:24:05 2003 From: nas@python.ca (Neil Schemenauer) Date: Tue, 7 Jan 2003 22:24:05 -0800 Subject: [Python-Dev] What attempts at security should/can Python implement? In-Reply-To: <15897.56831.775605.990300@montanaro.dyndns.org> References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> <003301c2b500$c54ed2e0$0311a044@oemcomputer> <01b301c2b51b$f86358c0$6d94fea9@newmexico> <200301061515.h06FFr901379@odiug.zope.com> <032201c2b58f$9d978960$6d94fea9@newmexico> <15897.56831.775605.990300@montanaro.dyndns.org> Message-ID: <20030108062405.GB24472@glacier.arctrix.com> Skip Montanaro wrote: > Now that Guido has rendered impotent any attempts Python did make at > security, does it make sense to try and figure out what (if anything) can be > done by the C runtime? Personally, I think it would be best to direct effort at fixing bugs. All kinds of bugs, not just things like buffer and integer overflows (hi Tim :-). It often happens that a seemingly innocent bug turns into a security problem. As I believe Guido said earlier, building a security model into the language is really hard. We don't have the resources to do it right. I'm not sure Sun does either. 
:-) Neil From theller@python.net Wed Jan 8 09:03:15 2003 From: theller@python.net (Thomas Heller) Date: 08 Jan 2003 10:03:15 +0100 Subject: [Python-Dev] sys.path[0] In-Reply-To: <200301080115.h081FJW07152@pcp02138704pcs.reston01.va.comcast.net> References: <8yxwd5ri.fsf@python.net> <200301080115.h081FJW07152@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3co4c8to.fsf@python.net> Guido van Rossum writes: > > [Kevin Altis brought this to my attention, so I'm cc-ing him] > > > > The python docs state on sys.path: > > > > As initialized upon program startup, the first item of this list, > > path[0], is the directory containing the script that was used to > > invoke the Python interpreter. If the script directory is not > > available (e.g. if the interpreter is invoked interactively or if > > the script is read from standard input), path[0] is the empty > > string, which directs Python to search modules in the current > > directory first. > > > > This is at least misleading. > > > > It appears that for certain ways Python is started, the first item > > on sys.path is a relative path name, or even empty, even if a script > > was specified, and the path would have been available. > > What's wrong with a relative pathname? If you invoke the script using > a relative pathname, why shouldn't that be what you get? The docs > don't say that it's the absolute pathname, so I don't think you can > claim that the docs are misleading here. Actually I don't care whether sys.path[0] contains an absolute or relative pathname, but modules imported via a relative path entry get a mod.__file__ attribute which is also a relative pathname. Changing the working directory then leads to strange results because mod.__file__ is no longer a valid pathname: think of reload(), tracebacks, and maybe more. So maybe the __file__ attribute should be converted to an absolute path during the module import? > > Shouldn't Python convert sys.path to absolute path names, to avoid > > these problems? 
> > site.py converts sys.path entries to absolute pathnames, *except* for > the path entry for to the script directory, because that is added to > sys.path *after* site.py is run. > Hehe. Does this prove that it's an implementation glitch? > I'm disinclined to do anything about this, except perhaps warn that > the script directory may be given as a relative path. Well, doing "sys.path[0] = os.path.abspath(sys.path[0])" or "sys.path = [os.path.abspath(p) for p in sys.path]" early enough seems to fix it, although I would argue that should be Python's job. Thomas From ping@zesty.ca Wed Jan 8 09:01:49 2003 From: ping@zesty.ca (Ka-Ping Yee) Date: Wed, 8 Jan 2003 03:01:49 -0600 (CST) Subject: [Python-Dev] sys.path[0] In-Reply-To: <3co4c8to.fsf@python.net> Message-ID: On 8 Jan 2003, Thomas Heller wrote: > Actually I don't care whether sys.path[0] contains an absolute or relative > pathname, but modules imported via a relative path entry get a mod.__file__ > attribute which is also a relative pathname. > > Changing the working directory then leads to strange results because > mod.__file__ is no longer a valid pathname: think of reload(), tracebacks, > and maybe more. Exactly for this reason, changing the working directory confuses inspect and pydoc and presumably anything else that tries to find source code. There's no way to work around this because the true path information is simply not available, unless we fix the __file__ attribute. I'd be in favour of setting all __file__ attributes to absolute paths. > > I'm disinclined to do anything about this, except perhaps warn that > > the script directory may be given as a relative path. The current working directory is a piece of hidden global state. In general, hidden global state is bad, and this particular piece of state is especially important because it affects what Python modules get loaded. 
I'd prefer for the interpreter to just set up sys.path correctly to begin with -- what's the point in doing it ambiguously only to fix it up later anyway? -- ?!ng From martin@v.loewis.de Wed Jan 8 09:17:05 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 08 Jan 2003 10:17:05 +0100 Subject: [Python-Dev] Misc. warnings In-Reply-To: <15899.38032.74568.421252@montanaro.dyndns.org> References: <3E1B5B07.7000003@lemburg.com> <15899.38032.74568.421252@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > mal> test_bsddb3 skipped -- Use of the `bsddb' resource not enabled > > Where are the libdb include files and libraries installed on SuSE? If > you've got 'em it's easy to update setup.py. That is not the problem. I'm pretty sure the _bsddb module was built on his system. However, test_bsddb3 just isn't run normally, because it takes too much time to complete. > mal> test_ossaudiodev skipped -- No module named ossaudiodev > > I noticed this the other day on my Mandrake system. I guess some underlying > library is missing or installed in an unexpected place. No. It's just that there is no machinery to actually build the module - not even an entry in Setup.dist. Regards, Martin From martin@v.loewis.de Wed Jan 8 09:26:13 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 08 Jan 2003 10:26:13 +0100 Subject: [Python-Dev] Raising string exceptions In-Reply-To: <20030108020845.GL29873@epoch.metaslash.com> References: <20030108020845.GL29873@epoch.metaslash.com> Message-ID: Neal Norwitz writes: > Should the error message when raising an invalid exception: > > "exceptions must be strings, classes or instances" > > in ceval.c:2743 be changed to remove strings, since this is > deprecated? +1. > Also, should we add a PendingDeprecationWarning when raising a > string exception? -0. I think making it a nearly undocumented feature is enough for the coming years. 
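For anyone still raising string exceptions, the class-based replacement is short. The following is a minimal sketch written for current Python syntax; `ConfigError` and `load` are hypothetical names, not anything from ceval.c or the standard library.

```python
class ConfigError(Exception):
    """Hypothetical application error, used instead of a string exception."""


def load(value):
    # Raise a class-based exception rather than a bare string.
    if value is None:
        raise ConfigError("no value given")
    return value


try:
    load(None)
except ConfigError as exc:
    # The message travels with the exception instance.
    message = str(exc)
```

A class gives callers something to catch by type (and by base class), which a string exception cannot offer.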
Regards, Martin From martin@v.loewis.de Wed Jan 8 09:37:01 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 08 Jan 2003 10:37:01 +0100 Subject: [Python-Dev] What attempts at security should/can Python implement? In-Reply-To: <15897.56831.775605.990300@montanaro.dyndns.org> References: <200301050203.SAA01788@mail.arc.nasa.gov> <200301050312.TAA03857@mail.arc.nasa.gov> <200301052116.h05LGDS31011@pcp02138704pcs.reston01.va.comcast.net> <003301c2b500$c54ed2e0$0311a044@oemcomputer> <01b301c2b51b$f86358c0$6d94fea9@newmexico> <200301061515.h06FFr901379@odiug.zope.com> <032201c2b58f$9d978960$6d94fea9@newmexico> <15897.56831.775605.990300@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > Now that Guido has rendered impotent any attempts Python did make at > security, does it make sense to try and figure out what (if > anything) can be done by the C runtime? I think Guido's rationale for removing all these features will be widely misunderstood. Me channeling him: it is not that he believes that the architectures developed were inherently incapable of providing security. Instead, he feels that no "expert" for these matters has reviewed these architecture for flaws, and that the continuing maintenance of these things isn't going to happen. If this understanding is correct, then any new approaches will likely suffer from the same fate. Unless somebody steps forward and says: "I am a security expert, and I guarantee that this and that feature is secure (in some documented sense)", then I think he will dislike any changes that mean to provide security. So this not a matter of engineering but of authority. Somebody must take the blame, and Guido doesn't want to be that someone. > Somebody asked about tainting in the past week in a response to a > year-old feature request on SF. Does that fall into this category? 
> I've been working my way (slowly) through Kent Beck's "Test-Driven > Development by Example" and was thinking that adding tainting to > Python strings might be an interesting application of those ideas > (for someone wanting to learn by doing), but if tainting won't be of > any use I'll find something else. I'm pretty sure that tainting would have to be maintained as a separately-distributed patch for quite some time (e.g. a tpython branch). Only if users accept it as secure, and only if the author is known in the community, and willing to continue maintaining it, and willing to take all the blame, could it be merged into Python. Regards, Martin From mal@lemburg.com Wed Jan 8 09:48:02 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 08 Jan 2003 10:48:02 +0100 Subject: [Python-Dev] Misc. warnings In-Reply-To: <3E1B639B.5060306@v.loewis.de> References: <3E1B5B07.7000003@lemburg.com> <3E1B639B.5060306@v.loewis.de> Message-ID: <3E1BF3D2.6070505@lemburg.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >> test_bsddb3 skipped -- Use of the `bsddb' resource not enabled >> test_ossaudiodev skipped -- No module named ossaudiodev >> 2 skips unexpected on linux2: >> test_ossaudiodev test_bsddb3 >> >> Why would ossaudiodev be expected to work on Linux ? >> Ditto for BSD DB 3 ? > > > Passive voice is misleading here: Who would or would not expect things > to work on Linux? > > The message "2 skips unexpected" does not mean that anybody is not > expecting something. It merely means that those tests are skipped and > not listed in the dictionary "_expectations". So why not list them in _expectations ? > I would personally expect that the bsddb3 test passes if Sleepycat BSD > DB is installed on the system (which should be the case on most Linux > systems), and the bsddb resource is activated when running regrtest. > > I would further expect that ossaudiodev passes on every Linux system > that has a soundcard installed.
True, but what about those systems which don't come with a Linux supported soundcard or a supported BSD DB installation ? I'd place the two modules into the _expectations dict and enhance the error messages to: test_bsddb3 skipped -- No usable Sleepycat BSD DB 3 installation found test_ossaudiodev skipped -- No OSS-compatible soundcard support found BTW, after inspection I found that SuSE Linux ships with BSD DB 4.0.x. The module name would suggest that this version is not supported. Is that so ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jan 8 09:49:13 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 08 Jan 2003 10:49:13 +0100 Subject: [Python-Dev] Misc. warnings In-Reply-To: <15899.25375.120408.421971@gargle.gargle.HOWL> References: <3E1B5B07.7000003@lemburg.com> <15899.25375.120408.421971@gargle.gargle.HOWL> Message-ID: <3E1BF419.7070406@lemburg.com> Barry A. Warsaw wrote: >>>>>>"MAL" == M writes: > > > | Ditto for BSD DB 3 ? > > Python 2.3's bsddb module should work with BerkeleyDB back to at least > 3.3.11, which my Redhat 7.3 box comes with as the db3-devel-3.3.11-6 > package. SuSE Linux comes with BSD DB 4.0. Perhaps that's why the installation doesn't configure for BSD DB ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jan 8 10:00:48 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 08 Jan 2003 11:00:48 +0100 Subject: [Python-Dev] Misc.
warnings In-Reply-To: <15899.38032.74568.421252@montanaro.dyndns.org> References: <3E1B5B07.7000003@lemburg.com> <15899.38032.74568.421252@montanaro.dyndns.org> Message-ID: <3E1BF6D0.6030106@lemburg.com> Skip Montanaro wrote: > mal> test_bsddb3 skipped -- Use of the `bsddb' resource not enabled > > Where are the libdb include files and libraries installed on SuSE? If > you've got 'em it's easy to update setup.py. Hmm, I do have them, but they point to DB 4.0.14. Now the _bsddb.c source file indicates that it should support this version. I'll have to do some more investigation. > mal> test_ossaudiodev skipped -- No module named ossaudiodev > > I noticed this the other day on my Mandrake system. I guess some underlying > library is missing or installed in an unexpected place. Guido and Greg cleared this bit up: the module is being built, so the failure is expected ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Wed Jan 8 10:05:33 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 08 Jan 2003 11:05:33 +0100 Subject: [Python-Dev] Misc. warnings In-Reply-To: <3E1BF3D2.6070505@lemburg.com> References: <3E1B5B07.7000003@lemburg.com> <3E1B639B.5060306@v.loewis.de> <3E1BF3D2.6070505@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > > The message "2 skips unexpected" does not mean that anybody is not > > expecting something. It merely means that those tests are skipped > > and not listed in the dictionary "_expectations". > > So why not list them in _expectations ? I don't know. I cannot understand the purpose of _expectations. If the purpose is that Python never prints "skips unexpected", it would be better to remove that print statement. 
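To make the "skips unexpected" bookkeeping concrete, here is a deliberately simplified model of it. This is not the regrtest.py source: the `expectations` table, its contents, and `unexpected_skips` are all illustrative assumptions; the only point is that "unexpected" is a set difference against a per-platform list.

```python
# Hypothetical, simplified stand-in for regrtest's per-platform table of
# tests that are expected to be skipped. The entries are illustrative only.
expectations = {
    "linux2": {"test_al", "test_cd"},
}


def unexpected_skips(platform, skipped):
    """Return the skipped tests that are not listed for this platform."""
    expected = expectations.get(platform, set())
    return sorted(set(skipped) - expected)


report = unexpected_skips(
    "linux2", ["test_al", "test_ossaudiodev", "test_bsddb3"]
)
```

Under this model, adding a test to the table silences the report for that test but says nothing about why it was skipped, which is the trade-off being argued about in this thread.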
> True, but what about those systems which don't come with a Linux > supported soundcard or a supported BSD DB installation ? They will skip that test. Again, I don't understand the purpose of the skip message, but I know of at least one case where a user reviewed the skip messages to install a number of missing libraries. That user would not learn about missing libraries if the tests were added to _expectations. > I'd place the two modules into the _expectations dict and > enhance the error messages to: > > test_bsddb3 skipped -- No usable Sleepycat BSD DB 3 installation found This is not the likely cause. I'm pretty sure you have such an installation on your system. The likely reason the test was skipped is that the bsddb3 resource was not given to regrtest. > test_ossaudiodev skipped -- No OSS-compatible soundcard support found Again, not the likely cause. The likely cause is that ossaudiodev is not built - you have discovered a genuine bug. ossaudiodev is never built, on any system. > BTW, after inspection I found that SuSE Linux ships with > BSD DB 4.0.x. The module name would suggest that this version > is not supported. Is that so ? No. The _bsddb.c works fine with BSD DB 4.0, and I'm pretty sure that your Python build has found and used this library. Otherwise, test_bsddb would also be skipped. Regards, Martin From mal@lemburg.com Wed Jan 8 10:11:13 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 08 Jan 2003 11:11:13 +0100 Subject: [Python-Dev] Misc. warnings In-Reply-To: <15899.40924.635734.421206@gargle.gargle.HOWL> References: <3E1B5B07.7000003@lemburg.com> <3E1B639B.5060306@v.loewis.de> <15899.40924.635734.421206@gargle.gargle.HOWL> Message-ID: <3E1BF941.5030801@lemburg.com> Barry A.
Warsaw wrote:
>>>>>>"MvL" == Martin v Löwis writes:
>
>
>     MvL> I would personally expect that the bsddb3 test passes if
>     MvL> Sleepycat BSD DB is installed on the system (which should be
>     MvL> the case on most Linux systems), and the bsddb resource is
>     MvL> activated when running regrtest.
>
> Actually, there are two bsddb tests. The first one, test_bsddb, is
> exceedingly simple and always runs. The second, misnamed test_bsddb3
> is the full test suite from PyBSDDB and only runs when the bsddb
> resource is enabled.

Ah, now I get it: you have to run regrtest.py -u bsddb in order for the test to run. In that case, I'd opt for marking the test_bsddb3 test as expected skip.

Hmm, if I run the test manually, I get:

projects/Python> pythondev Dev-Python/Lib/test/test_bsddb3.py
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Sleepycat Software: Berkeley DB 4.0.14: (November 18, 2001)
bsddb.db.version(): (4, 0, 14)
bsddb.db.__version__: 4.1.1
bsddb.db.cvsid: $Id: _bsddb.c,v 1.3 2002/12/30 20:53:52 bwarsaw Exp $
python version: 2.3a1 (#1, Jan 6 2003, 02:03:16) [GCC 2.95.3 20010315 (SuSE)]
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Traceback (most recent call last):
  File "Dev-Python/Lib/test/test_bsddb3.py", line 67, in ?
    unittest.main(defaultTest='suite')
  File "/home/lemburg/projects/Python/Installation/lib/python2.3/unittest.py", line 710, in __init__
    self.parseArgs(argv)
  File "/home/lemburg/projects/Python/Installation/lib/python2.3/unittest.py", line 737, in parseArgs
    self.createTests()
  File "/home/lemburg/projects/Python/Installation/lib/python2.3/unittest.py", line 743, in createTests
    self.module)
  File "/home/lemburg/projects/Python/Installation/lib/python2.3/unittest.py", line 509, in loadTestsFromNames
    suites.append(self.loadTestsFromName(name, module))
  File "/home/lemburg/projects/Python/Installation/lib/python2.3/unittest.py", line 494, in loadTestsFromName
    test = obj()
  File "Dev-Python/Lib/test/test_bsddb3.py", line 44, in suite
    module = __import__("bsddb.test."+name, globals(), locals(), name)
ImportError: No module named test.test_associate

Looking at the installed bsddb package, there is no subpackage 'test' in there. It is available in the source tree, though.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From martin@v.loewis.de Wed Jan 8 10:46:41 2003
From: martin@v.loewis.de ("Martin v. Löwis")
Date: Wed, 08 Jan 2003 11:46:41 +0100
Subject: [Python-Dev] Misc. warnings
In-Reply-To: <3E1BF941.5030801@lemburg.com>
References: <3E1B5B07.7000003@lemburg.com> <3E1B639B.5060306@v.loewis.de> <15899.40924.635734.421206@gargle.gargle.HOWL> <3E1BF941.5030801@lemburg.com>
Message-ID: <3E1C0191.8090005@v.loewis.de>

M.-A. Lemburg wrote:
> Looking at the installed bsddb package, there is no subpackage
> 'test' in there. It is available in the source tree, though.

Can you try to fix this? If not, file a bug report.
Regards, Martin From walter@livinglogic.de Wed Jan 8 10:48:26 2003 From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=) Date: Wed, 08 Jan 2003 11:48:26 +0100 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: <200301080134.h081YSK07358@pcp02138704pcs.reston01.va.comcast.net> References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <15892.56781.473869.187757@montanaro.dyndns.org> <3E1B0287.6080904@livinglogic.de> <200301080134.h081YSK07358@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E1C01FA.9020602@livinglogic.de> Guido van Rossum wrote: >>I've opened a patch for tests ported to PyUnit: >>http://www.python.org/sf/662807. >> >>The first four test ported are: test_pow, test_charmapcodec, >>test_userdict and test_b1. > > > I don't know how you picked these, I just picked a few easy ones at random. > but note that test_b1 and test_b2 > really belong together, and should really be combined into > test_builtin. (Somehow, long ago, I thought that there was too much > there to fit in one file. Silly me. ;-) OK, so I'll combine test_b1.py and test_b2.py into test_builtins.py. So should I go on with this? Do we want to change all tests before 2.3 is finished, or start changing them after 2.3 is released (or something in between)? Bye, Walter Dörwald From guido@python.org Wed Jan 8 12:21:17 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 08 Jan 2003 07:21:17 -0500 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: Your message of "Wed, 08 Jan 2003 11:48:26 +0100." 
<3E1C01FA.9020602@livinglogic.de> References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <15892.56781.473869.187757@montanaro.dyndns.org> <3E1B0287.6080904@livinglogic.de> <200301080134.h081YSK07358@pcp02138704pcs.reston01.va.comcast.net> <3E1C01FA.9020602@livinglogic.de> Message-ID: <200301081221.h08CLHG08953@pcp02138704pcs.reston01.va.comcast.net> > So should I go on with this? Do we want to change all tests before 2.3 > is finished, or start changing them after 2.3 is released > (or something in between)? I'd say something in between. It's probably good if someone (not me) reviews your patches. It's never good to rush these things, and I've said before that there's no reason to insist on all tests being PyUnit tests. (Definitely don't touch anything that's using doctest.) In particular, I *don't* want you to literally translate existing tests to PyUnit idiom. Try to improve on the tests -- think about end cases, etc. There are coverage tools (ask Skip) -- when testing Python modules, see if the test covers all code in the module! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jan 8 12:27:16 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 08 Jan 2003 07:27:16 -0500 Subject: [Python-Dev] sys.path[0] In-Reply-To: Your message of "Wed, 08 Jan 2003 03:01:49 CST." References: Message-ID: <200301081227.h08CRGY08969@pcp02138704pcs.reston01.va.comcast.net> > Exactly for this reason, changing the working directory confuses > inspect and pydoc and presumably anything else that tries to find > source code. There's no way to work around this because the true > path information is simply not available, unless we fix the > __file__ attribute. > > I'd be in favour of setting all __file__ attributes to absolute paths. Note that code objects have their own filename attribute, which is not directly related to __file__, and that's the one that causes the most problems. 
I truly wish we could change marshal so that when it loads a code object, it replaces the filename attribute with the filename from which the object is loaded, but that's far from easy. :-( > > > I'm disinclined to do anything about this, except perhaps warn that > > > the script directory may be given as a relative path. > > The current working directory is a piece of hidden global state. > In general, hidden global state is bad, and this particular piece > of state is especially important because it affects what Python > modules get loaded. I'd prefer for the interpreter to just set > up sys.path correctly to begin with -- what's the point in doing > it ambiguously only to fix it up later anyway? Maybe things have changed, but in the past I've been bitten by absolute path conversions. E.g. I remember from my time at CWI that automount caused really ugly long absolute paths that everybody hated. Also, there are conditions under which getcwd() can fail (when an ancestor directory doesn't have enough permissions) so the code doing so must be complex. That said, I'd be +0 if someone gave me a patch that fixed the path of the script (the only path that's not already absolutized by site.py). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jan 8 12:45:02 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 08 Jan 2003 07:45:02 -0500 Subject: [Python-Dev] test_logging failures In-Reply-To: Your message of "Tue, 07 Jan 2003 22:44:59 EST." <20030108034458.GO29873@epoch.metaslash.com> References: <200301071348.h07DmmP05594@pcp02138704pcs.reston01.va.comcast.net> <20030108034458.GO29873@epoch.metaslash.com> Message-ID: <200301081245.h08Cj2x09051@pcp02138704pcs.reston01.va.comcast.net> > On Tue, Jan 07, 2003 at 08:48:48AM -0500, Guido van Rossum wrote: > > I suddenly see spectacular failures in the logging module on Linux. > > Does anybody know what's up with that? > > No. It works for me on Linux.
It also hasn't broken anything on the > snake farm. The test uses threads. > > Neal I can't reproduce the problem either. I don't know what was wrong with my system at the time, but I'll drop this. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jan 8 12:47:09 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 08 Jan 2003 07:47:09 -0500 Subject: [Python-Dev] Misc. warnings In-Reply-To: Your message of "Tue, 07 Jan 2003 20:32:46 EST." <20030108013246.GA3040@cthulhu.gerg.ca> References: <3E1B5B07.7000003@lemburg.com> <200301080057.h080vUE06904@pcp02138704pcs.reston01.va.comcast.net> <20030108013246.GA3040@cthulhu.gerg.ca> Message-ID: <200301081247.h08Cl9a09070@pcp02138704pcs.reston01.va.comcast.net> > I also have crazy ideas about writing an extensive, and noisy, > test script that would go in Tools or sandbox or something. Have it run only if "-u noisy" is passed to regrtest.py. --Guido van Rossum (home page: http://www.python.org/~guido/) From mhammond@skippinet.com.au Wed Jan 8 12:50:38 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Wed, 8 Jan 2003 23:50:38 +1100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <02c201c2b714$8bd69500$530f8490@eden> David: > MarkH: > > I fear the only way to approach this is with a PEP. We > > need to clearly state our requirements, and clearly show > I'm also willing to lend a hand with a PEP, if it's worth anything. I > don't know as much about the problems in this domain as you do; I've > only seen this one example that bit me. I'm prepared to spend a few > brain cycles on it and help with the writing, though. Cool :) And thanks to Anthony too. I will keep python-dev CC'd for just a little longer though, just to see what is controversial. Something tells me it will be the "goal" , so let's see how we go there. 
My goal: For a multi-threaded application (generally this will be a larger app embedding Python, but that is irrelevant), make it reasonably easy to accomplish 2 things: 1) Allow "arbitrary" threads (that is, threads never before seen by Python) to acquire the resources necessary to call the Python C API. 2) Allow Python extensions to be written which support (1) above. Currently (2) is covered by Py_BEGIN_ALLOW_THREADS, except that it is kinda like only having a hammer in your toolbox. I assert that 2) could actually be split into discrete goals: 2.1) Extension functions that expect to take a lot of time, but generally have no thread-state considerations. This includes sleep(), all IO functions, and many others. This is exactly what Py_BEGIN_ALLOW_THREADS was designed for. 2.2) Extensions that *may* take a little time, but more to the point, may directly and synchronously trigger callbacks. That is, it is not expected that much time will be spent outside of Python, but rather that Python will be re-entered. I can concede that functions that may trigger asynch callbacks need no special handling here, as the normal Python thread switch mechanism will ensure their correct dispatch. Currently 2.1 and 2.2 are handled the same way, but this need not be the case. Currently 2.2 is only supported by *always* giving up the lock, and at each entry point *always* re-acquiring it. This is obviously wasteful if indeed the same thread immediately re-enters - hence we are here with a request for "how do I tell if I have the lock?". Combine this with the easily stated but tricky to implement (1) and no one understands it at all. I also propose that we restrict this to applications that intend to use a single "PyInterpreterState" - if you truly want multiple threads running in multiple interpreters (and good luck to you - I'm not aware anyone has ever actually done it) then you are on your own. Are these goals a reasonable starting point?
This describes all my venturing into this area. If-only-it-got-easier-each-time-ly, Mark. From barry@python.org Wed Jan 8 13:04:49 2003 From: barry@python.org (Barry A. Warsaw) Date: Wed, 8 Jan 2003 08:04:49 -0500 Subject: [Python-Dev] Misc. warnings References: <3E1B5B07.7000003@lemburg.com> <15899.25375.120408.421971@gargle.gargle.HOWL> <3E1BF419.7070406@lemburg.com> Message-ID: <15900.8689.231027.418876@gargle.gargle.HOWL> >>>>> "MAL" == M writes: >> with BerkeleyDB back to at least 3.3.11, which my Redhat 7.3 >> box comes with as the db3-devel-3.3.11-6 package. MAL> SuSE Linux comes with BSD DB 4.0. Perhaps that's why the MAL> installation doesn't configure for BSD DB ?! The _bsddb module that's in CVS today should work with BerkeleyDB up to the latest Sleepycat release 4.1.25. -Barry From martin@v.loewis.de Wed Jan 8 13:26:13 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Wed, 08 Jan 2003 14:26:13 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <02c201c2b714$8bd69500$530f8490@eden> References: <02c201c2b714$8bd69500$530f8490@eden> Message-ID: <3E1C26F5.4040101@v.loewis.de> Mark Hammond wrote: > 1) Allow "arbitrary" threads (that is, threads never before seen by Python) > to acquire the resources necessary to call the Python C API. This is possible today; all you need is a pointer to an interpreter state. If you have that, you can use PyThreadState_New and PyEval_AcquireThread, after which you have the resources necessary to call the Python API. In many cases, extensions can safely assume that there is exactly one interpreter state all the time, so they can save the interpreter pointer in their init function.
Regards, Martin From theller@python.net Wed Jan 8 14:36:36 2003 From: theller@python.net (Thomas Heller) Date: 08 Jan 2003 15:36:36 +0100 Subject: [Python-Dev] sys.path[0] In-Reply-To: <200301081227.h08CRGY08969@pcp02138704pcs.reston01.va.comcast.net> References: <200301081227.h08CRGY08969@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: [ping] > > Exactly for this reason, changing the working directory confuses > > inspect and pydoc and presumably anything else that tries to find > > source code. There's no way to work around this because the true > > path information is simply not available, unless we fix the > > __file__ attribute. > > > > I'd be in favour of setting all __file__ attributes to absolute paths. That's what site.py does: for m in sys.modules.values(): if hasattr(m, "__file__") and m.__file__: m.__file__ = os.path.abspath(m.__file__) del m > > Note that code objects have their own filename attribute, which is not > directly related to __file__, and that's the one that causes the most > problems. I truly wish we could change marshal so that when it loads > a code object, it replaces the filename attribute with the filename > from which the object is loaded, but that's far from easy. :-( > > > > > I'm disinclined to do anything about this, except perhaps warn that > > > > the script directory may be given as a relative path. > > > > The current working directory is a piece of hidden global state. > > In general, hidden global state is bad, and this particular piece > > of state is especially important because it affects what Python > > modules get loaded. I'd prefer for the interpreter to just set > > up sys.path correctly to begin with -- what's the point in doing > > it ambiguously only to fix it up later anyway? > > Maybe things have changed, but in the past I've been bitten by > absolute path conversions. E.g. I rememeber from my time at CWI that > automount caused really ugly long absulute paths that everybody > hated. 
Also, there are conditions under which getcwd() can fail (when > an ancestor directory doesn't have enough permissions) so the code > doing so must be complex. > > That said, I'd be +0 if someone gave me a patch that fixed the path of > the script (the only path that's not already absolutized by site.py). I've submitted a patch #664376 which fixes the problem on Windows, I cannot do it for other systems. This patch only converts sys.path[0], it doesn't touch sys.argv[0]. Thomas From mhammond@skippinet.com.au Wed Jan 8 13:49:36 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Thu, 9 Jan 2003 00:49:36 +1100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E1C26F5.4040101@v.loewis.de> Message-ID: <02d401c2b71c$c8806460$530f8490@eden> > Mark Hammond wrote: > > 1) Allow "arbitrary" threads (that is, threads never before > seen by Python) > > to acquire the resources necessary to call the Python C API. > > This is possible today, all you need is a pointer to an interpreter > state. If you have that, you can use PyThreadState_New, But what if in some cases, this callback is as a result of Python code on the same thread - ie, there already exists a Python thread-state higher up the stack? Mark. From amk@amk.ca Wed Jan 8 13:52:59 2003 From: amk@amk.ca (A.M. Kuchling) Date: Wed, 08 Jan 2003 08:52:59 -0500 Subject: [Python-Dev] Re: Whither rexec? In-Reply-To: <200301061726.h06HQWe28737@odiug.zope.com> References: <200301061726.h06HQWe28737@odiug.zope.com> Message-ID: Guido van Rossum wrote: > See my recent checkins and what I just sent to python-announce (not > sure when the moderator will get to it): Back in December I reduced the "Restricted Execution" HOWTO to a warning not to use rexec. This morning, perhaps because of Guido's announcement, I've gotten two e-mails from users of the module asking for more details, both sounding a bit desperate for alternatives. Doubtless more rexec users will come out of the woodwork as a result. 
I'd like to add some suggested alternatives; any suggestions? People could run untrusted code inside a chroot()'ed jail; are there any packages that help with this? If the application uses Bastion to let untrusted code access various Python objects, things get really tough; the only option might be to redesign the whole application to expose some socket-based interface to those objects, and then run jailed code that can talk to only that socket. (Completely redesigning applications that rely on running untrusted code is probably a good idea in any event.) --amk From barry@python.org Wed Jan 8 14:01:24 2003 From: barry@python.org (Barry A. Warsaw) Date: Wed, 8 Jan 2003 09:01:24 -0500 Subject: [Python-Dev] Misc. warnings References: <3E1B5B07.7000003@lemburg.com> <200301080057.h080vUE06904@pcp02138704pcs.reston01.va.comcast.net> <20030108013246.GA3040@cthulhu.gerg.ca> <200301081247.h08Cl9a09070@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15900.12084.324540.882209@gargle.gargle.HOWL> BTW Greg, there's a problem with test_ossaudiodev -- it hangs quite nicely when running "make test". When I kill it I get the following traceback. No time for me to debug it right now... -Barry -------------------- snip snip -------------------- test_ossaudiodev Traceback (most recent call last): File "./Lib/test/regrtest.py", line 961, in ? main() File "./Lib/test/regrtest.py", line 259, in main ok = runtest(test, generate, verbose, quiet, testdir) File "./Lib/test/regrtest.py", line 389, in runtest the_package = __import__(abstest, globals(), locals(), []) File "/home/barry/projects/python/Lib/test/test_ossaudiodev.py", line 93, in ? 
test() File "/home/barry/projects/python/Lib/test/test_ossaudiodev.py", line 90, in test play_sound_file(data, rate, ssize, nchannels) File "/home/barry/projects/python/Lib/test/test_ossaudiodev.py", line 53, in play_sound_file a.write(data) KeyboardInterrupt [85616 refs] make: [test] Error 1 (ignored) From martin@v.loewis.de Wed Jan 8 14:00:42 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Jan 2003 15:00:42 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <02d401c2b71c$c8806460$530f8490@eden> References: <02d401c2b71c$c8806460$530f8490@eden> Message-ID: <3E1C2F0A.2000607@v.loewis.de> Mark Hammond wrote: > But what if in some cases, this callback is as a result of Python code on > the same thread - ie, there already exists a Python thread-state higher up > the stack? Then you get a deadlock. However, it was not your (stated) goal to support this case. You mentioned threads that Python had never seen before - there can't be a thread state higher up in such a thread. Regards, Martin From markh@skippinet.com.au Wed Jan 8 14:15:37 2003 From: markh@skippinet.com.au (Mark Hammond) Date: Thu, 9 Jan 2003 01:15:37 +1100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E1C2F0A.2000607@v.loewis.de> Message-ID: <000201c2b720$6b18f770$530f8490@eden> > Then you get a deadlock. However, it was not your (stated) goal to > support this case. You mentioned threads that Python had never seen > before - there can't be a thread state higher up in such a thread. My mistake - I used "i.e." in place of "e.g.". However, "arbitrary" is fairly clear. Mark. From pyth@devel.trillke.net Wed Jan 8 14:21:04 2003 From: pyth@devel.trillke.net (holger krekel) Date: Wed, 8 Jan 2003 15:21:04 +0100 Subject: [Python-Dev] Misc. 
warnings In-Reply-To: <15900.12084.324540.882209@gargle.gargle.HOWL>; from barry@python.org on Wed, Jan 08, 2003 at 09:01:24AM -0500 References: <3E1B5B07.7000003@lemburg.com> <200301080057.h080vUE06904@pcp02138704pcs.reston01.va.comcast.net> <20030108013246.GA3040@cthulhu.gerg.ca> <200301081247.h08Cl9a09070@pcp02138704pcs.reston01.va.comcast.net> <15900.12084.324540.882209@gargle.gargle.HOWL> Message-ID: <20030108152104.B349@prim.han.de> Barry A. Warsaw wrote: > > BTW Greg, there's a problem with test_ossaudiodev -- it hangs quite > nicely when running "make test". When I kill it I get the following > traceback. No time for me to debug it right now... had a similar problem earlier. you aren't listening to music while running the test, are you? Maybe the test should timeout, anyway? holger > > -Barry > > -------------------- snip snip -------------------- > test_ossaudiodev > Traceback (most recent call last): > File "./Lib/test/regrtest.py", line 961, in ? > main() > File "./Lib/test/regrtest.py", line 259, in main > ok = runtest(test, generate, verbose, quiet, testdir) > File "./Lib/test/regrtest.py", line 389, in runtest > the_package = __import__(abstest, globals(), locals(), []) > File "/home/barry/projects/python/Lib/test/test_ossaudiodev.py", line 93, in ? 
> test() > File "/home/barry/projects/python/Lib/test/test_ossaudiodev.py", line 90, in test > play_sound_file(data, rate, ssize, nchannels) > File "/home/barry/projects/python/Lib/test/test_ossaudiodev.py", line 53, in play_sound_file > a.write(data) > KeyboardInterrupt > [85616 refs] > make: [test] Error 1 (ignored) > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From martin@v.loewis.de Wed Jan 8 14:25:45 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Jan 2003 15:25:45 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <000201c2b720$6b18f770$530f8490@eden> References: <000201c2b720$6b18f770$530f8490@eden> Message-ID: <3E1C34E9.5040902@v.loewis.de> Mark Hammond wrote: > My mistake - I used "i.e." in place of "e.g.". However, "arbitrary" is > fairly clear. I feel this is still underspecified. I have successfully used multiple threads, and callbacks from arbitrary threads. For this to work, I have to allow threads in all calls to the library if the library can call back before returning. Regards, Martin From jim@interet.com Wed Jan 8 14:33:03 2003 From: jim@interet.com (James C. Ahlstrom) Date: Wed, 08 Jan 2003 09:33:03 -0500 Subject: [Python-Dev] sys.path[0] References: <8yxwd5ri.fsf@python.net> Message-ID: <3E1C369F.6070703@interet.com> Thomas Heller wrote: > Shouldn't Python convert sys.path to absolute path names, to avoid > these problems? IMHO, yes. If the directory of the script file is important enough to be sys.path[0], then it should be an absolute path. I believe this was discussed before at the time of PEP 273. Anyway, my implementation of PEP 273 adds the absolute path, and also moves the insertion of sys.path[0] prior to the import of site.py to eliminate yet another source of confusion. These features can be extracted from PEP 273, and they work on Unix and Windows.
Just apply all patches except for import.c. JimA From markh@skippinet.com.au Wed Jan 8 14:34:30 2003 From: markh@skippinet.com.au (Mark Hammond) Date: Thu, 9 Jan 2003 01:34:30 +1100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E1C34E9.5040902@v.loewis.de> Message-ID: <000c01c2b723$0e6b0bf0$530f8490@eden> [Martin] > I feel this is still underspecified. I have successfully used > multiple threads, and callbacks from arbitrary threads. For > this to work, I have to allow threads in all calls to the > library if the library can call back before returning. It can be done, yes. I am not looking for a change in semantics, just a simple way to do it (and maybe even a fast way to do it, but that is secondary). If such a way already exists, please enlighten us. If not, but it is sufficiently simple to describe, then please describe it. Otherwise, I do not understand your point. Mark. From jacobs@penguin.theopalgroup.com Wed Jan 8 14:38:17 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 8 Jan 2003 09:38:17 -0500 (EST) Subject: [Python-Dev] Re: Whither rexec? In-Reply-To: Message-ID: On Wed, 8 Jan 2003, A.M. Kuchling wrote: > Guido van Rossum wrote: > > See my recent checkins and what I just sent to python-announce (not > > sure when the moderator will get to it): > > Back in December I reduced the "Restricted Execution" HOWTO > to a warning not to use rexec. This morning, perhaps because of Guido's > announcement, I've gotten two e-mails from users of the module asking > for more details, both sounding a bit desperate for alternatives. > Doubtless more rexec users will come out of the woodwork as a result. This also deeply affects Pl/Python, the embedded Python interpreter in PostgreSQL. It runs in a "trusted mode" via a restricted execution environment. I'll drop a note to the other developers about this, so we can figure out what to do. 
The simple solution is to simply make Pl/Python an untrusted language, though I'm sure that won't be popular. As for fixing the problems in the Python core -- I'm willing to tentatively volunteer in the effort. I am certainly not committing to doing it all myself! However, I am happy to coordinate, code, manage design docs and validation suites, and generally keep things going. Anything more than that depends on how much help, support, real code, and testing I get from other volunteers. My first challenge to python-dev. Answer this: It has been said that the previous rexec functionality was ad hoc and brittle, and many better solutions are possible. What better alternatives exist in terms of features offered, overall runtime performance, ease of maintenance, and validation? More complete answers should address many, if not all, of the following subjects: Proxy objects -- making unsafe objects safe(r) Restricted environments -- limiting access to system resources Restricted introspection -- limiting the amount of information obtainable from exposed objects and environment Tainting -- tracking trusted status of objects Security policy management -- Configuration of how security mechanisms are applied Regards, -Kevin Jacobs -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From martin@v.loewis.de Wed Jan 8 14:49:45 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Jan 2003 15:49:45 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <000c01c2b723$0e6b0bf0$530f8490@eden> References: <000c01c2b723$0e6b0bf0$530f8490@eden> Message-ID: <3E1C3A89.4060504@v.loewis.de> Mark Hammond wrote: > It can be done, yes. I am not looking for a change in semantics, just a > simple way to do it (and maybe even a fast way to do it, but that is > secondary). 
If such a way already exists, please enlighten us. If not, but > it is sufficiently simple to describe, then please describe it. Otherwise, > I do not understand your point. There is a very simple strategy to support multiple threads in an extension module. 1. In all callbacks, create a thread state and acquire the current thread (this requires a singleton interpreter state). 2. In all API calls that may invoke callbacks, use Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS around the API call. If this strategy is followed, every code always has all Python resources, and no deadlocks result. Regards, Martin From martin@v.loewis.de Wed Jan 8 15:06:12 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Jan 2003 16:06:12 +0100 Subject: [Python-Dev] Re: Whither rexec? In-Reply-To: References: Message-ID: <3E1C3E64.3020609@v.loewis.de> Kevin Jacobs wrote: > It has been said that the previous rexec functionality was ad hoc and > brittle, and many better solutions are possible. What better alternatives > exist in terms of features offered, overall runtime performance, ease of > maintenance, and validation? I disagree with that statement. The approach taken by rexec is very straight forward, and, in principle, allows to impose arbitrary functional restrictions on untrusted code. Non-functional restrictions (code has to observe limitations in resource usage, such as computing time, memory consumption, or number of open files) are not easily implemented with rexec. I believe that any extension to provide such restrictions also would be orthogonal to rexec. So anybody working on this should see how rexec could be enhanced. IMO, of course. > Proxy objects -- making unsafe objects safe(r) I think the really tricky part today is usage of the type() builtin (i.e. access to an object's __class__ attribute). The new problem is that types are now callable. 
On one hand, you cannot hand out the true type objects to untrusted code, since they could use them to overcome the limitations. On the other hand, if the untrusted code merely uses the type objects for testing type membership, such code should continue to work. So you either need a way to disable calls to type objects, or you need proxy type objects around the builtin types. For proxy types, some types could be considered safe (e.g. int, str, unicode); others would need proxies (the file type in particular). > Restricted introspection -- limiting the amount of information > obtainable from exposed objects and > environment Can somebody please explain how this is different from restricted environments? I.e. why would one restrict introspection in general, as long as you can't obtain information about the system? > Tainting -- tracking trusted status of objects This is clearly out of scope of rexec, and, IMO, not relevant for untrusted code. Tainting is about processing untrusted data by trusted code. > Security policy management -- Configuration of how security mechanisms are > applied I think rexec is quite flexible here, giving or denying access on a per-function basis (and allowing to provide wrappers for functions which must be restricted depending on the set of arguments). Regards, Martin From guido@python.org Wed Jan 8 15:17:26 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 08 Jan 2003 10:17:26 -0500 Subject: [Python-Dev] patch CGIHTTPServer.py for IE POST bug In-Reply-To: Your message of "Tue, 07 Jan 2003 17:00:42 PST." Message-ID: <200301081517.h08FHQ509776@pcp02138704pcs.reston01.va.comcast.net> > Steve Holden submitted a patch for the IE POST bug in December, which I > thought had made it into 2.3a1, but apparently not. I've been using this > select code to throw away extra bytes from IE POSTs on my Win2K box for > about four months and haven't noticed any problems using MoinMoin on my > local box as well as some test POST scripts. 
However, Steve says he is busy > with PyCon and this patch, though small, probably needs review before a > commit. > > The bug: > http://sourceforge.net/tracker/?func=detail&aid=430160&group_id=5470&atid=105470 > > and > > http://sourceforge.net/tracker/?func=detail&aid=427345&group_id=5470&atid=105470 > > The patch: > http://sourceforge.net/tracker/?func=detail&aid=654910&group_id=5470&atid=305470 Looks good to me; it was already assigned to MvL and I've simply marked it as Accepted. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Jan 8 15:20:39 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 08 Jan 2003 10:20:39 -0500 Subject: [Python-Dev] Raising string exceptions In-Reply-To: Your message of "08 Jan 2003 10:26:13 +0100." References: <20030108020845.GL29873@epoch.metaslash.com> Message-ID: <200301081520.h08FKeR09799@pcp02138704pcs.reston01.va.comcast.net> > Neal Norwitz writes: > > > Should the error message when raising an invalid exception: > > > > "exceptions must be strings, classes or instances" > > > > in ceval.c:2743 be changed to remove strings, since this is > > deprecated? [MvL] > +1. Maybe change to "string (deprecated)" or change "must" to "should". > > Also, should we add a PendingDeprecationWarning when raising a > > string exception? > > -0. I think making it a nearly undocumented feature is enough for the > coming years. Actually, PendingDeprecationWarning sounds good to me -- that way you can turn on heavy warnings to review if a piece of code needs work. In general, everything that's deprecated should at least have a PendingDeprecationWarning warning attached to it, so there is *some* way to discover use of deprecated features. 
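The idea of attaching a PendingDeprecationWarning to a deprecated feature can be sketched in pure Python. This is only an illustration under stated assumptions: `legacy_raise` and `collect_warnings` are hypothetical helpers invented for this example, not code from ceval.c or from any patch discussed in the thread, and the string case is converted to a RuntimeError so that the sketch runs on interpreters that no longer accept string exceptions.

```python
import warnings


def legacy_raise(exc):
    """Hypothetical stand-in for a deprecated feature (raising a string)."""
    if isinstance(exc, str):
        # Quiet by default, but discoverable once the user turns warnings on.
        warnings.warn(
            "string exceptions are deprecated; raise a class or instance",
            PendingDeprecationWarning,
            stacklevel=2,
        )
        exc = RuntimeError(exc)
    raise exc


def collect_warnings(func, *args):
    """Call func with 'heavy' warnings enabled and return the categories seen."""
    with warnings.catch_warnings(record=True) as caught:
        # Equivalent in spirit to running with -W always::PendingDeprecationWarning
        warnings.simplefilter("always", PendingDeprecationWarning)
        try:
            func(*args)
        except Exception:
            pass  # only the warnings are of interest here
    return [w.category for w in caught]
```

Calling `collect_warnings(legacy_raise, "boom")` records one PendingDeprecationWarning, while passing an exception instance records none, which is roughly the "some way to discover use of deprecated features" described above.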
From dave@boost-consulting.com Wed Jan 8 15:27:11 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 08 Jan 2003 10:27:11 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <02d401c2b71c$c8806460$530f8490@eden> ("Mark Hammond"'s message of "Thu, 9 Jan 2003 00:49:36 +1100") References: <02d401c2b71c$c8806460$530f8490@eden> Message-ID: "Mark Hammond" writes: >> Mark Hammond wrote: >> > 1) Allow "arbitrary" threads (that is, threads never before >> seen by Python) >> > to acquire the resources necessary to call the Python C API. >> >> This is possible today, all you need is a pointer to an interpreter >> state. If you have that, you can use PyThreadState_New, > > But what if in some cases, this callback is as a result of Python code on > the same thread - ie, there already exists a Python thread-state higher up > the stack? I believe that's the case which bit me. -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From martin@v.loewis.de Wed Jan 8 15:33:26 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Jan 2003 16:33:26 +0100 Subject: [Python-Dev] patch CGIHTTPServer.py for IE POST bug In-Reply-To: <200301081517.h08FHQ509776@pcp02138704pcs.reston01.va.comcast.net> References: <200301081517.h08FHQ509776@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E1C44C6.7030704@v.loewis.de> > Looks good to me; it was already assigned to MvL and I've simply > marked it as Accepted. I had assigned it for review only, as well; Steve should probably best commit it himself (reassigned). 
Regards, Martin From dave@boost-consulting.com Wed Jan 8 15:40:33 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 08 Jan 2003 10:40:33 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <02c201c2b714$8bd69500$530f8490@eden> ("Mark Hammond"'s message of "Wed, 8 Jan 2003 23:50:38 +1100") References: <02c201c2b714$8bd69500$530f8490@eden> Message-ID: "Mark Hammond" writes: > My goal: > > For a multi-threaded application (generally this will be a larger > app embedding Python, but that is irrelevant), make it reasonably > easy to accomplish 2 things: > > 1) Allow "arbitrary" threads (that is, threads never before seen by > Python) to acquire the resources necessary to call the Python C API. > > 2) Allow Python extensions to be written which support (1) above. 3) Allow "arbitrary" threads to acquire the resources necessary to call the Python C API, even if they already have those resources, and to later release them if they did not have those resources. > Currently (2) is covered by Py_BEGIN_ALLOW_THREADS, except that it > is kinda like only having a hammer in your toolbox . I assert > that 2) could actually be split into discrete goals: I'm going to ask some questions just to make sure your terminology is clear to me: > 2.1) Extension functions that expect to take a lot of time, but > generally have no thread-state considerations. This includes > sleep(), all IO functions, and many others. This is exactly what > Py_BEGIN_ALLOW_THREADS was designed for. In other words, functions which will not call back into the Python API? > 2.2) Extensions that *may* take a little time, but more to the > point, may directly and synchronously trigger callbacks. By "callbacks", do you mean "functions which (may) use the Python C API?" > That is, it is not expected that much time will be spent outside of > Python, but rather that Python will be re-entered. 
I can concede > that functions that may trigger asynch callbacks need no special > handling here, as the normal Python thread switch mechanism will > ensure correct their dispatch. By "trigger asynch callbacks" do you mean, "cause a callback to occur on a different thread?" > Currently 2.1 and 2.2 are handled the same way, but this need not be > the case. Currently 2.2 is only supported by *always* giving up the > lock, and at each entry point *always* re-acquiring it. This is > obviously wasteful if indeed the same thread immediately re-enters - > hence we are here with a request for "how do I tell if I have the > lock?". Yep, that pinpoints my problem. > Combine this with the easily stated but tricky to implement (1) and > no one understands it at all > > I also propose that we restrict this to applications that intend to > use a single "PyInterpreterState" - if you truly want multiple > threads running in multiple interpreters (and good luck to you - I'm > not aware anyone has ever actually done it ) then you are on > your own. Fine with me ;-). I think eventually we'll need to come up with a more precise definition of exactly when "you're on your own", but for now that'll do. > Are these goals a reasonable starting point? This describes all my > venturing into this area. Sounds about right to me. -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From jacobs@penguin.theopalgroup.com Wed Jan 8 15:50:50 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 8 Jan 2003 10:50:50 -0500 (EST) Subject: [Python-Dev] Re: Whither rexec? In-Reply-To: <3E1C3E64.3020609@v.loewis.de> Message-ID: On Wed, 8 Jan 2003, "Martin v. Löwis" wrote: > Kevin Jacobs wrote: > > It has been said that the previous rexec functionality was ad hoc and > > brittle, and many better solutions are possible.
What better alternatives > > exist in terms of features offered, overall runtime performance, ease of > > maintenance, and validation? > > I disagree with that statement. The approach taken by rexec is very > straight forward, and, in principle, allows to impose arbitrary > functional restrictions on untrusted code. Good. I only partly agree with it myself. However, rexec _is_ brittle, as demonstrated by the many incremental problems that keep popping up, even pre-Python 2.2. > So anybody working on this should see how rexec could be enhanced. IMO, > of course. I agree, though seeing how it can be fixed is not the same as deciding that it is the optimal solution. I'm starting out with a very open mind and am purposely soliciting for as much input as possible. > > Proxy objects -- making unsafe objects safe(r) [...] > > Restricted introspection -- limiting the amount of information > > obtainable from exposed objects and > > environment > > Can somebody please explain how this is different from restricted > environments? I.e. why would one restrict introspection in general, as > long as you can't obtain information about the system? The closure of all objects reachable (via introspection) from a given starting set can be _very_ large and non-trivial to compute. Limiting introspection is a simple way to close many of possible holes through which references to untrusted objects can be obtained. > > Tainting -- tracking trusted status of objects > > This is clearly out of scope of rexec, and, IMO, not relevant for > untrusted code. Tainting is about processing untrusted data by trusted code. I don't think it is so clearly out of the scope of the space of all possible restricted execution environments. Tainting (used in a fairly liberal sense) is one way to propagate the security status of objects without having to proxy them.
> > Security policy management -- Configuration of how security mechanisms are > > applied > > I think rexec is quite flexible here, giving or denying access on a > per-function basis (and allowing to provide wrappers for functions which > must be restricted depending on the set of arguments). I agree. Thanks for your input, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From martin@v.loewis.de Wed Jan 8 16:06:02 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Jan 2003 17:06:02 +0100 Subject: [Python-Dev] Re: Whither rexec? In-Reply-To: References: Message-ID: <3E1C4C6A.1010700@v.loewis.de> Kevin Jacobs wrote: > Good. I only partly agree with it myself. However, rexec _is_ brittle, as > demonstrated by the many incremental problems that keep popping up, even > pre-Python 2.2. I only have now looked in my dictionary to find the translation for "brittle" :-) (I think "brüchig" is the proper translation in this context) I agree it is brittle. It should be possible to macerate it, though. > I agree, though seeing how it can be fixed is not the same as deciding that > it is the optimal solution. I'm starting out with a very open mind and am > purposely soliciting for as much input as possible. I think any maintainer of such a feature would need to take the existing code base into account. Current users would certainly be served best if rexec would work. > The closure of all objects reachable (via introspection) from > a given starting set can be _very_ large and non-trivial to compute. > Limiting introspection is a simple way to close many of possible holes > through which references to untrusted objects can be obtained. I guess you have to define "introspection", then.
To navigate to an object, I don't need introspection: I can just access the attributes, without investigating first which objects are there. IOW, if Tkinter.open was the builtin open function, I would not need to use introspection to find out it was there - I could just *use* Tkinter.open("/etc/passwd", "a"). In Python, anything that is reachable with introspection is also reachable without introspection. Regards, Martin From barry@python.org Wed Jan 8 16:36:53 2003 From: barry@python.org (Barry A. Warsaw) Date: Wed, 8 Jan 2003 11:36:53 -0500 Subject: [Python-Dev] Misc. warnings References: <3E1B5B07.7000003@lemburg.com> <200301080057.h080vUE06904@pcp02138704pcs.reston01.va.comcast.net> <20030108013246.GA3040@cthulhu.gerg.ca> <200301081247.h08Cl9a09070@pcp02138704pcs.reston01.va.comcast.net> <15900.12084.324540.882209@gargle.gargle.HOWL> <20030108152104.B349@prim.han.de> Message-ID: <15900.21413.758446.734522@gargle.gargle.HOWL> >>>>> "hk" == holger krekel writes: >> BTW Greg, there's a problem with test_ossaudiodev -- it hangs >> quite nicely when running "make test". When I kill it I get >> the following traceback. No time for me to debug it right >> now... hk> had a similar problem earlier. you aren't listening to music hk> while running the test, are you? Nope. -Barry From dave@boost-consulting.com Wed Jan 8 17:11:29 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 08 Jan 2003 12:11:29 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E1C3A89.4060504@v.loewis.de> ("Martin v. =?iso-8859-1?q?L=F6wis"'s?= message of "Wed, 08 Jan 2003 15:49:45 +0100") References: <000c01c2b723$0e6b0bf0$530f8490@eden> <3E1C3A89.4060504@v.loewis.de> Message-ID: "Martin v. Löwis" writes: > Mark Hammond wrote: >> It can be done, yes. I am not looking for a change in semantics, just a >> simple way to do it (and maybe even a fast way to do it, but that is >> secondary). If such a way already exists, please enlighten us.
If not, but >> it is sufficiently simple to describe, then please describe it. Otherwise, >> I do not understand your point. > > There is a very simple strategy to support multiple threads in an > extension module. > > 1. In all callbacks, create a thread state and acquire the current > thread (this requires a singleton interpreter state). > > 2. In all API calls that may invoke callbacks, use > Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS around the API call. > > If this strategy is followed, every code always has all Python > resources, and no deadlocks result. IIUC, that strategy doesn't get Mark what he wants in this case: 2.2) Extensions that *may* take a little time, but more to the point, may directly and synchronously trigger callbacks. That is, it is not expected that much time will be spent outside of Python, but rather that Python will be re-entered. Which is to be able to avoid releasing the GIL in the case where the extension isn't going to do much other than invoke the callback function which re-acquires it. -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From skip@pobox.com Wed Jan 8 17:14:44 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 8 Jan 2003 11:14:44 -0600 Subject: [Python-Dev] tainting In-Reply-To: References: <3E1C3E64.3020609@v.loewis.de> Message-ID: <15900.23684.730182.550578@montanaro.dyndns.org> >> > Tainting -- tracking trusted status of objects >> >> This is clearly out of scope of rexec, and, IMO, not relevant for >> untrusted code. Tainting is about processing untrusted data by >> trusted code. Kevin> I don't think it is so clearly out of the scope of the space of Kevin> all possible restricted execution environments. Tainting (used Kevin> in a fairly liberal sense) is one way to propagate the security Kevin> status of objects without having to proxy them.
Can tainting be restricted to just strings and unicode objects or is it a facility which needs to be extended to all objects whose state could be affected by them? For example, if I execute: s = raw_input("Enter a string here: ") Clearly s would be tainted. Suppose I then executed: t = int(s) x.foo = s[4:] Would t need to be tainted? I assume the object associated with x.foo would have to be since it is a string (actually, that would be a side effect of the slicing operation). Would the object associated with x itself have to be tainted? How best to untaint an object? Perl untaints when the programmer extracts bits from a tainted string via regular expressions. That seems rather unPythonic. Should objects which can be tainted just have a writable 'taint' attribute? Skip From martin@v.loewis.de Wed Jan 8 17:22:37 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Jan 2003 18:22:37 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: References: <000c01c2b723$0e6b0bf0$530f8490@eden> <3E1C3A89.4060504@v.loewis.de> Message-ID: <3E1C5E5D.3070106@v.loewis.de> David Abrahams wrote: > IIUC, that strategy doesn't get Mark what he wants in this case: > > 2.2) Extensions that *may* take a little time, but more to the > point, may directly and synchronously trigger callbacks. That is, > it is not expected that much time will be spent outside of Python, > but rather that Python will be re-entered. > > Which is to be able to avoid releasing the GIL in the case where the > extension isn't going to do much other than invoke the callback > function which re-acquires it. I think you are incorrectly interpreting Mark's priorities: I am not looking for a change in semantics, just a simple way to do it (and maybe even a fast way to do it, but that is secondary). So performance is not his primary goal.
The goal is that it is easy to use, and I think my strategy is fairly easy to follow: If in doubt, release the lock. Regards, Martin From theller@python.net Wed Jan 8 17:59:16 2003 From: theller@python.net (Thomas Heller) Date: 08 Jan 2003 18:59:16 +0100 Subject: [Python-Dev] sys.path[0] In-Reply-To: References: <200301081227.h08CRGY08969@pcp02138704pcs.reston01.va.comcast.net> Message-ID: > I've submitted a patch #664376 which fixes the problem on Windows, > I cannot do it for other systems. > > This patch only converts sys.path[0], it doesn't touch sys.argv[0]. Guido has approved this patch, so it is checked in. Volunteers needed to extend it to linux ;-) Thomas From jacobs@penguin.theopalgroup.com Wed Jan 8 17:24:39 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 8 Jan 2003 12:24:39 -0500 (EST) Subject: [Python-Dev] tainting In-Reply-To: <15900.23684.730182.550578@montanaro.dyndns.org> Message-ID: On Wed, 8 Jan 2003, Skip Montanaro wrote: > >> > Tainting -- tracking trusted status of objects > >> > >> This is clearly out of scope of rexec, and, IMO, not relevant for > >> untrusted code. Tainting is about processing untrusted data by > >> trusted code. > > Kevin> I don't think it is so clearly out of the scope of the space of > Kevin> all possible restricted execution enviornments. Tainting (used > Kevin> in a fairly liberal sense) is one way to propogate the security > Kevin> status of objects without having to proxy them. > > Can tainting be restricted to just strings and unicode objects or is it a > facility which needs to be extended to all objects whose state could be > affected by them? Tainting a la Perl is all about strings and the operations that will taint and untaint, mainly to keep neophytes from writing bad CGI script. For my purposes, I want tainting to represent the 'trustiness' of any object in order to tell the interpreter what operations may be performed on/with it in a given context. 
Maybe it would be clearer to talk about 'security monikers' instead of tainting.

> For example, if I execute:
>
>     s = raw_input("Enter a string here: ")
>
> Clearly s would be tainted. Suppose I then executed:
>
>     t = int(s)
>     x.foo = s[4:]
>
> Would t need to be tainted? I assume the object associated with x.foo would
> have to be since it is a string (actually, that would be a side effect of
> the slicing operation). Would the object associated with x itself have to
> be tainted?
>
> How best to untaint an object? Perl untaints when the programmer extracts
> bits from a tainted string via regular expressions. That seems rather
> unPythonic. Should objects which can be tainted just have a writable
> 'taint' attribute?

I'll defer to Lewis Carroll:

    Alice asks: "Would you tell me, please, which way I ought to go from here?"
    "That depends a good deal on where you want to get to," said the Cheshire Cat.

-Kevin ;)

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com

From martin@v.loewis.de Wed Jan 8 17:33:35 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 08 Jan 2003 18:33:35 +0100
Subject: [Python-Dev] tainting
In-Reply-To: <15900.23684.730182.550578@montanaro.dyndns.org>
References: <3E1C3E64.3020609@v.loewis.de> <15900.23684.730182.550578@montanaro.dyndns.org>
Message-ID: <3E1C60EF.7040109@v.loewis.de>

Skip Montanaro wrote:
> Can tainting be restricted to just strings and unicode objects or is it a
> facility which needs to be extended to all objects whose state could be
> affected by them?

If I understand things correctly, in general, if the result depends on a tainted argument, it becomes itself tainted. I'm unsure whether exceptions should be made for container objects, as those may consist of tainted and untainted components.

> Clearly s would be tainted.
Suppose I then executed: > > t = int(s) The question here is whether execution of int(s) would be allowed. There would need to be some machinery to determine whether the "normal" outcome of an operation is also produced with a tainted argument. If the operation has its normal outcome, then clearly that is tainted as well. The question is whether an exceptional outcome would have to be tainted. > x.foo = s[4:] > > Would t need to be tainted? I assume the object associated with x.foo would > have to be since it is a string (actually, that would be a side effect of > the slicing operation). Would the object associated with x itself have to > be tainted? I would normally think only x.__dict__['foo'] needs to be tainted, since everything else does not depend on untrusted input. One may argue that len(x.__dict__) may change as a result of this operation (so the entire dictionary is affected). However, that happens independent of whether s is tainted or not. So if you have d[s] = 1 then tainting s might be necessary, since now len(s) depends on the value of s. Regards, Martin From dave@boost-consulting.com Wed Jan 8 17:53:32 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 08 Jan 2003 12:53:32 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E1C5E5D.3070106@v.loewis.de> ("Martin v. =?iso-8859-1?q?L=F6wis"'s?= message of "Wed, 08 Jan 2003 18:22:37 +0100") References: <000c01c2b723$0e6b0bf0$530f8490@eden> <3E1C3A89.4060504@v.loewis.de> <3E1C5E5D.3070106@v.loewis.de> Message-ID: "Martin v. L=F6wis" writes: > David Abrahams wrote: >> IIUC, that strategy doesn't get Mark what he wants in this case: >> 2.2) Extensions that *may* take a little time, but more to the >> point, may directly and synchronously trigger callbacks. That is, >> it is not expected that much time will be spent outside of Python, >> but rather that Python will be re-entered. 
>> Which is to be able to avoid releasing the GIL in the case where the
>> extension isn't going to do much other than invoke the callback
>> function which re-acquires it.
>
> I think you are incorrectly interpreting Mark's priorities:
>
>    I am not looking for a change in semantics, just a
>    simple way to do it (and maybe even a fast way to do it,
>    but that is secondary).
>
> So performance is not his primary goal. The goal is that it is
> easy to use, and I think my strategy is fairly easy to follow: If in
> doubt, release the lock.

OK. I guess there's one more point worth mentioning: APIs are not always scrupulously documented. In particular, documentation might give you no reason to think any callbacks will be invoked for a given call, when in fact they will be. Furthermore, problems with not releasing the GIL won't show up with arbitrary callbacks in the API, only when someone finally installs one which uses Python's API. The Windows API is a prime example of this, but I'm sure there are many others.

If we could make "creating a thread state and acquiring the current thread" immune to the condition where the current thread is already acquired, we'd be making it much easier to write bulletproof extensions.

--
David Abrahams
dave@boost-consulting.com * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution

From dave@boost-consulting.com Wed Jan 8 18:04:46 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 08 Jan 2003 13:04:46 -0500
Subject: [Python-Dev] Trouble with Python 2.3a1
In-Reply-To: <200301072312.h07NCxk29254@odiug.zope.com> (Guido van Rossum's message of "Tue, 07 Jan 2003 18:12:58 -0500")
References: <200301072312.h07NCxk29254@odiug.zope.com>
Message-ID:

Guido van Rossum writes:

>> Ralf describes below a change in the name of classes defined in the
>> __main__ module for 2.3a1, which causes some of our tests to fail.
>> Was it intentional?
>
> I need more information.
__main__ gets prepended in certain > situations when no module name is given. I don't recall exactly what > changed. OK, here's some more information. I'd just give you a test case, but making a 'C' testcase that isn't tied to Boost.Python will take a lot of work in this instance. My extension module, 'simple' is creating a class derived from 'object' by calling my own metaclass, which is derived from PyTypeType and basically adds nothing to it. The new class is called 'empty'. In order to ensure that empty gets the right __module__ attribute, I followed your suggestion of prepending the module name and a dot to the class name before passing it on to the metaclass I'm invoking. The output below shows that in 2.3a1, that module prefix string is discarded in favor of the name of the module which imports the extension module. Python 2.2.1 (#2, Jun 17 2002, 12:06:51) [GCC 2.96 20000731 (Red Hat Linux 7.3 2.96-110)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import simple module_prefix result: simple. >>> simple.empty.__module__ 'simple' >>> Python 2.3a1 (#1, Jan 6 2003, 14:17:56) [GCC 2.96 20000731 (Red Hat Linux 7.3 2.96-110)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import simple module_prefix result: simple. 
>>> simple.empty.__module__ '__main__' >>> -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From skip@pobox.com Wed Jan 8 18:04:12 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 8 Jan 2003 12:04:12 -0600 Subject: [Python-Dev] tainting In-Reply-To: <3E1C60EF.7040109@v.loewis.de> References: <3E1C3E64.3020609@v.loewis.de> <15900.23684.730182.550578@montanaro.dyndns.org> <3E1C60EF.7040109@v.loewis.de> Message-ID: <15900.26652.536450.674059@montanaro.dyndns.org> Martin> So if you have Martin> d[s] = 1 Martin> then tainting s might be necessary, since now len(s) depends on Martin> the value of s. You meant "... tainting d might be necessary, ...", right? S From walter@livinglogic.de Wed Jan 8 18:25:24 2003 From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=) Date: Wed, 08 Jan 2003 19:25:24 +0100 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: <200301081221.h08CLHG08953@pcp02138704pcs.reston01.va.comcast.net> References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <15892.56781.473869.187757@montanaro.dyndns.org> <3E1B0287.6080904@livinglogic.de> <200301080134.h081YSK07358@pcp02138704pcs.reston01.va.comcast.net> <3E1C01FA.9020602@livinglogic.de> <200301081221.h08CLHG08953@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E1C6D14.7080908@livinglogic.de> Guido van Rossum wrote: >>So should I go on with this? Do we want to change all tests before 2.3 >>is finished, or start changing them after 2.3 is released >>(or something in between)? > > > I'd say something in between. It's probably good if someone (not me) > reviews your patches. Definitely. Any volunteers? > It's never good to rush these things, and I've > said before that there's no reason to insist on all tests being PyUnit > tests. (Definitely don't touch anything that's using doctest.) I won't. 
I'll start with the easy ones. > In particular, I *don't* want you to literally translate existing > tests to PyUnit idiom. Try to improve on the tests -- think about end > cases, etc. I would find it easier to do this in two steps. Enhancing the tests takes time, so the test will change in CVS and I'd have to keep track of the changes. I'd prefer to check in the 1:1 port as soon as possible and enhance the tests afterwards. > There are coverage tools (ask Skip) -- when testing > Python modules, see if the test covers all code in the module! Bye, Walter Dörwald From martin@v.loewis.de Wed Jan 8 18:30:46 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Jan 2003 19:30:46 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: References: <000c01c2b723$0e6b0bf0$530f8490@eden> <3E1C3A89.4060504@v.loewis.de> <3E1C5E5D.3070106@v.loewis.de> Message-ID: <3E1C6E56.6030205@v.loewis.de> David Abrahams wrote: > OK. I guess there's one more point worth mentioning: APIs are not > always scrupulously documented. In particular, documentation might > give you no reason to think any callbacks will be invoked for a given > call, when in fact it will be. [...] > The Windows API is a prime example of this Are you sure about this? I would expect that the documentation of the Win32 API is very clear about when and how user code is invoked. More precisely, no API function except DispatchEvent will ever invoke user code. Maybe you meant "Windows API" in a more general sense? If you include COM, then yes, any invocation of a COM object may do many things, so you should always release the GIL when invoking a COM method. Regards, Martin From dave@boost-consulting.com Wed Jan 8 19:00:24 2003 From: dave@boost-consulting.com (David Abrahams) Date: Wed, 08 Jan 2003 14:00:24 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E1C6E56.6030205@v.loewis.de> ("Martin v. 
=?iso-8859-1?q?L=F6wis"'s?= message of "Wed, 08 Jan 2003 19:30:46 +0100")
References: <000c01c2b723$0e6b0bf0$530f8490@eden> <3E1C3A89.4060504@v.loewis.de> <3E1C5E5D.3070106@v.loewis.de> <3E1C6E56.6030205@v.loewis.de>
Message-ID:

"Martin v. Löwis" writes:

> David Abrahams wrote:
>> OK. I guess there's one more point worth mentioning: APIs are not
>> always scrupulously documented. In particular, documentation might
>> give you no reason to think any callbacks will be invoked for a given
>> call, when in fact it will be.
> [...]
>> The Windows API is a prime example of this
>
> Are you sure about this? I would expect that the documentation of the
> Win32 API is very clear about when and how user code is invoked. More
> precisely, no API function except DispatchEvent will ever invoke user
> code.
>
> Maybe you meant "Windows API" in a more general sense? If you include
> COM, then yes, any invocation of a COM object may do many things, so
> you should always release the GIL when invoking a COM method.

No, in fact there are several places where the API docs are less-than-scrupulous about letting you know that your own event dispatching hook may be re-entered during the call. It's been a long time since I've had the pleasure, but IIRC one of them happens in the APIs for printing.

--
David Abrahams
dave@boost-consulting.com * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution

From tim.one@comcast.net Wed Jan 8 19:14:22 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 08 Jan 2003 14:14:22 -0500
Subject: [Python-Dev] Misc. warnings
In-Reply-To:
Message-ID:

[Martin v. Löwis]
> I don't know. I cannot understand the purpose of _expectations.
If a Python expert for platform sys.platform X would be at all surprised if test T got skipped when running the tests on X, then T should not be listed in _expectations[X], else it should be, with disagreements favoring leaving it out (the disagreeing experts can add a comment to regrtest.py about why they disagree, or make regrtest smarter about recognizing when a test skip is expected). I added a couple of the latter to _ExpectedSkips.__init__:

            if not os.path.supports_unicode_filenames:
                self.expected.add('test_pep277')
            if test_normalization.skip_expected:
                self.expected.add('test_normalization')
            if test_socket_ssl.skip_expected:
                self.expected.add('test_socket_ssl')

but so far nobody else has cared enough to add more of this nature. If a module can't itself tell whether it expects to be skipped, then perhaps we need another "mystery test" category (where mystery == we have no idea whether the test should pass -- such tests will always be a source of confusion).

From mal@lemburg.com Wed Jan 8 19:34:36 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 08 Jan 2003 20:34:36 +0100
Subject: [Python-Dev] Re: Whither rexec?
In-Reply-To:
References: <200301061726.h06HQWe28737@odiug.zope.com>
Message-ID: <3E1C7D4C.5080502@lemburg.com>

A.M. Kuchling wrote:
> Guido van Rossum wrote:
>
>> See my recent checkins and what I just sent to python-announce (not
>> sure when the moderator will get to it):
>
>
> Back in December I reduced the "Restricted Execution" HOWTO
> to a warning not to use rexec. This morning, perhaps because of Guido's
> announcement, I've gotten two e-mails from users of the module asking
> for more details, both sounding a bit desperate for alternatives.
> Doubtless more rexec users will come out of the woodwork as a result.
>
> I'd like to add some suggested alternatives; any suggestions? People
> could run untrusted code inside a chroot()'ed jail; are there any
> packages that help with this?
> > If the application uses Bastion to let untrusted code access various > Python objects, things get really tough; the only option might be to > redesign the whole application to expose some socket-based interface to > those objects, and then run jailed code that can talk to only that > socket. (Completely redesigning applications that rely on running > untrusted code is probably a good idea in any event.) If you only want to secure a few objects, then mxProxy can help you with this: it allows access management at C level on a per-method basis and also via callbacks... http://www.egenix.com/files/python/mxProxy.html -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jan 8 19:32:00 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 08 Jan 2003 20:32:00 +0100 Subject: [Python-Dev] Misc. warnings In-Reply-To: <20030108152104.B349@prim.han.de> References: <3E1B5B07.7000003@lemburg.com> <200301080057.h080vUE06904@pcp02138704pcs.reston01.va.comcast.net> <20030108013246.GA3040@cthulhu.gerg.ca> <200301081247.h08Cl9a09070@pcp02138704pcs.reston01.va.comcast.net> <15900.12084.324540.882209@gargle.gargle.HOWL> <20030108152104.B349@prim.han.de> Message-ID: <3E1C7CB0.3010808@lemburg.com> FYI, I opened two bugs reports for the findings: http://sourceforge.net/tracker/index.php?func=detail&aid=664584&group_id=5470&atid=105470 assigned to Greg Ward http://sourceforge.net/tracker/index.php?func=detail&aid=664581&group_id=5470&atid=105470 assigned to Barry Warsaw Feel free to comment there. I don't have time to look into fixing these. 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From cnetzer@mail.arc.nasa.gov Wed Jan 8 20:14:58 2003 From: cnetzer@mail.arc.nasa.gov (Chad Netzer) Date: Wed, 8 Jan 2003 12:14:58 -0800 Subject: [Python-Dev] Raising string exceptions In-Reply-To: <200301081520.h08FKeR09799@pcp02138704pcs.reston01.va.comcast.net> References: <20030108020845.GL29873@epoch.metaslash.com> <200301081520.h08FKeR09799@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200301082014.MAA03225@mail.arc.nasa.gov> On Wednesday 08 January 2003 07:20, Guido van Rossum wrote: > > > in ceval.c:2743 be changed to remove strings, since this is > > > deprecated? > Maybe change to "string (deprecated)" or change "must" to "should". And move it after 'class' and 'instance' in the printout, to subtly make it the last choice, not the first. ie. "exceptions must be classes or instances.\n(or strings, which are deprecated)." or "exceptions must be classes or instances. (strings are deprecated)." or "exceptions must be classes, instances, or (deprecated) strings." -- Bay Area Python Interest Group - http://www.baypiggies.net/ Chad Netzer cnetzer@mail.arc.nasa.gov From martin@v.loewis.de Wed Jan 8 21:10:59 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Jan 2003 22:10:59 +0100 Subject: [Python-Dev] Misc. warnings In-Reply-To: References: Message-ID: <3E1C93E3.5000607@v.loewis.de> Tim Peters wrote: >>I don't know. I cannot understand the purpose of _expectations. 
> > If a Python expert for platform sys.platform X would be at all surprised if > test T got skipped when running the tests on X, then T should not be > listed in _expectations[X] You mean: If there is no way in the world that this test could ever be skipped, on a computer running this system? > else it should be, with disagreements favoring leaving > it out (the disagreeing experts can add a comment to regrtest.py about why > they disagree, or make regrtest smarter about recognizing when a test skip > is expected). I feel that reality will turn out differently: it is less work to add it to _expectations, so over the long run, anything that was ever skipped on some installation will end up in _expectations. Regards, Martin From martin@v.loewis.de Wed Jan 8 21:19:01 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 08 Jan 2003 22:19:01 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: References: <000c01c2b723$0e6b0bf0$530f8490@eden> <3E1C3A89.4060504@v.loewis.de> <3E1C5E5D.3070106@v.loewis.de> <3E1C6E56.6030205@v.loewis.de> Message-ID: <3E1C95C5.7010003@v.loewis.de> David Abrahams wrote: > No, in fact there are several places where the API docs are > less-than-scrupulous about letting you know that your own event > dispatching hook may be re-entered during the call. It's been a long > time since I've had the pleasure, but IIRC one of them happens in the > APIs for printing. It's unclear what you are talking about here. If you mean PrintDlgEx, then it very well documents that PRINTDLGEX.lpCallback can be invoked. In any case, it would be a bug in the wrapper to not release the GIL around calling PrintDlgEx. Bugs happen and they can be fixed. Regards, Martin From tim.one@comcast.net Wed Jan 8 21:18:56 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 08 Jan 2003 16:18:56 -0500 Subject: [Python-Dev] Misc. 
warnings
In-Reply-To: <3E1C93E3.5000607@v.loewis.de>
Message-ID:

[Tim]
>> If a Python expert for platform sys.platform X would be at all
>> surprised if test T got skipped when running the tests on X, then T
>> should not be listed in _expectations[X]

[Martin v. Löwis]
> You mean: If there is no way in the world that this test could ever
> be skipped, on a computer running this system?

I meant what I said. What a platform expert finds surprising is a matter for their best judgment.

> ...
> I feel that reality will turn out differently: it is less work to add
> it to _expectations, so over the long run, anything that was ever
> skipped on some installation will end up in _expectations.

If the experts for platform X are happy with that outcome, fine by me.

From jack@performancedrivers.com Wed Jan 8 21:26:10 2003
From: jack@performancedrivers.com (Jack Diederich)
Date: Wed, 8 Jan 2003 16:26:10 -0500
Subject: [Python-Dev] playback debugging
Message-ID: <20030108162610.C4744@localhost.localdomain>

Would it be possible to save every function return/stack in a running python program and then playback the results in a debugger? The article linked from /.
http://web.media.mit.edu/~lieber/Lieberary/Softviz/CACM-Debugging/CACM-Debugging-Intro.html#Intro
mentions that a lisp interpreter (Lisp stepper ZStep 95) did something like this that then let you replay a program from the end back to the beginning.

The idea would be that every function return replays the original return even if the outcome is different. Core operations that call out to C wouldn't actually make the call, so random() would always return the same value that it originally did at that spot. Ditto for file.read(). The pure python code would almost by definition do exactly the same thing, b/c the inputs would always be identical. If this isn't always possible a warning could be issued and the original value be returned regardless.
This would require work on the interpreter and the debugger, but does anyone see a reason why this is technically impossible? Async IO programs would be SOL. And large or long running programs would be infeasible. I run everything under mod_python which essentially means a series of very short one-off programs. This would be a huge win IMO. I currently save the input parameters and can re-run them outside of apache. This is non-ideal as some server side state is lost, but works most of the time. A large number of python programs would reap a huge reward from this, but please correct me if I'm missing the obvious.

-jackdied

There is a fine line between being "The Man" and being "That Guy"

From dave@boost-consulting.com Wed Jan 8 21:48:56 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 08 Jan 2003 16:48:56 -0500
Subject: [Python-Dev] Extension modules, Threading, and the GIL
In-Reply-To: <3E1C95C5.7010003@v.loewis.de> ("Martin v. =?iso-8859-1?q?L=F6wis"'s?= message of "Wed, 08 Jan 2003 22:19:01 +0100")
References: <000c01c2b723$0e6b0bf0$530f8490@eden> <3E1C3A89.4060504@v.loewis.de> <3E1C5E5D.3070106@v.loewis.de> <3E1C6E56.6030205@v.loewis.de> <3E1C95C5.7010003@v.loewis.de>
Message-ID:

"Martin v. Löwis" writes:

> David Abrahams wrote:
>> No, in fact there are several places where the API docs are
>> less-than-scrupulous about letting you know that your own event
>> dispatching hook may be re-entered during the call. It's been a long
>> time since I've had the pleasure, but IIRC one of them happens in the
>> APIs for printing.
>
> It's unclear what you are talking about here. If you mean PrintDlgEx,
> then it very well documents that PRINTDLGEX.lpCallback can be
> invoked.

Well, as I said, it's been a long time, so I don't remember the details. However, let's assume it was PrintDlgEx for the time being.
If the caller of PrintDlgEx controls the contents of the PRINTDLGEX structure, he can determine whether its lpCallback points to a function that calls back into Python. If it doesn't call back into Python, he might reasonably presume that there's no need to release the GIL.

He would be wrong. Lots of events can be dispatched to the application before PrintDlgEx returns, so he needs to release the GIL if anything in the application event handler can invoke Python. AFAICT, this is typical for any Windows API function which the Windows engineers thought might take a long time to return, and it's typically not documented.

> In any case, it would be a bug in the wrapper to not release the GIL
> around calling PrintDlgEx.

You say tomato, I say documentation bug.

> Bugs happen and they can be fixed.

Yes. This is an example of a kind of bug which is not uncommon, and very hard to detect under some reasonable usage/development scenarios. It might make sense to make Python immune to this kind of bug.

I think I'm done arguing about this. If Mark isn't discouraged by now, I'm still ready to help with the PEP.

--
David Abrahams
dave@boost-consulting.com * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution

From tdelaney@avaya.com Wed Jan 8 23:00:40 2003
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Thu, 9 Jan 2003 10:00:40 +1100
Subject: [Python-Dev] PEP 297: Support for System Upgrades
Message-ID:

> From: M.-A. Lemburg [mailto:mal@lemburg.com]
>
> Ok, then, let's call the dir "site-upgrades-" with
> being major.minor.patchlevel.

-1. This may lead to directory (namespace) pollution.
I would prefer the directory to be site-upgrades/ Tim Delaney From markh@skippinet.com.au Wed Jan 8 23:10:46 2003 From: markh@skippinet.com.au (Mark Hammond) Date: Thu, 9 Jan 2003 10:10:46 +1100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <006201c2b76b$2e04ded0$530f8490@eden> [David] > I think I'm done arguing about this. If Mark isn't discouraged by > now, I'm still ready to help with the PEP. No, I saw this coming a mile off :) A little like clockwork really. Martin seems to be trying to make 3 points: 1) There is no problem. All Windows API function that could indirectly send a Windows message are clearly documented that they generate messages, and if they arent then it is all MS' fault anyway. Substitute "Windows" and "MS" for the particular problem you are having, and we have a nice answer to every potential problem :) 2) Even if such a problem did exist, then creating a brand new thread-state for each and every invocation is acceptable. 3) Mark stated performance was secondary to correctness. Therefore, as soon as we have correctness we can ignore performance as it is not a primary requirement. (1) is clearly bogus. As with David, I am not interested in discussing this issue. David, Anthony and I all have this problem today. Tim Peters can see the problem and can see it exists (even if he believes my current implementation is incorrect). All due respect Martin, but stick to where your expertise lies. As for (2): My understanding (due to the name of the object) is that a single thread should use a single thread-state. You are suggesting that the same thread could have any number of different thread-states, depending on how often the Python interpreter was recursively entered. While I concede that this is likely to work in the general case, I am not sure it is "correct". 
If no threading semantics will be broken by having one thread use multiple thread-states, then I must ask what purpose thread-states (as opposed to the GIL) have. Mark. From tdelaney@avaya.com Wed Jan 8 23:16:58 2003 From: tdelaney@avaya.com (Delaney, Timothy) Date: Thu, 9 Jan 2003 10:16:58 +1100 Subject: [Python-Dev] new features for 2.3? Message-ID: > From: Guido van Rossum [mailto:guido@python.org] > > We can do that in Python 2.3. Because this is backwards incompatible, > I propose that you have to request this protocol explicitly. I > propose to "upgrade' the binary flag to a general "protocol version" > flag, with values: > > 0 - original protocol > 1 - binary protocol > 2 - new protocol If you're going to do this, why not go all the way, and encode the python version in the protocol? The value to use could be encoded as constants in pickle.py (and cPickle), with a CURRENT_PROTOCOL or some such to make things easy. It might also be an idea to allow passing `sys.version_info`. The thing I'm trying to avoid is a proliferation of magic numbers when pickling ... what happens when we get another backwards-incompatible feature ... do we bump the number up to 3 and call it 'newer protocol'? I think using pickle constants (be they version numbers or otherwise) is the way to go here. Tim Delaney From jacobs@penguin.theopalgroup.com Wed Jan 8 23:28:22 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 8 Jan 2003 18:28:22 -0500 (EST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS Message-ID: Hi all, I've just noticed a problem in one of our financial systems that was caused by the new pure-Python strptime module by Brett Cannon. This was a test system, so I suppose it was doing its job. SourceForge won't currently let me login (big surprise there!) so I'm posting on python-dev for now. 
Test script:

    import time
    n = time.strptime('7/1/2002', '%m/%d/%Y')
    print 'Time tuple:',n
    m = time.mktime(n)
    print 'Seconds from epoch:',m
    print 'Formatted local time:',time.strftime('%m/%d/%Y', time.localtime(m))

Python 2.2.2:

    Time tuple: (2002, 7, 1, 0, 0, 0, 0, 182, 0)
    Seconds from epoch: 1025499600.0
    Formatted local time: 07/01/2002

Python 2.3-cvs (with #undef HAVE_STRPTIME hard-coded):

    Time tuple: (2002, 7, 1, -1, -1, -1, 0, 182, -1)
    Seconds from epoch: 1025492339.0
    Formatted local time: 06/30/2002

Our financial system does _really_ strange things when dates don't round-trip properly. It is running on a large IA32 Redhat Linux 8.0 system with the latest Python CVS.

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com

From bac@OCF.Berkeley.EDU Wed Jan 8 23:58:10 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Wed, 8 Jan 2003 15:58:10 -0800 (PST)
Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS
In-Reply-To:
References:
Message-ID:

[Kevin Jacobs]

> Hi all,
>
> I've just noticed a problem in one of our financial systems that was caused
> by the new pure-Python strptime module by Brett Cannon.

Lucky me. =)

> Python 2.3-cvs (with #undef HAVE_STRPTIME hard-coded):

Just for future reference, you can just import ``_strptime`` and use ``_strptime.strptime()`` instead of having to fiddle with ``time``.

OK, so I will take a look, but if anyone beats me to a solution I will have no qualms about it.
=) -Brett From martin@v.loewis.de Wed Jan 8 23:59:41 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Thu, 09 Jan 2003 00:59:41 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <006201c2b76b$2e04ded0$530f8490@eden> References: <006201c2b76b$2e04ded0$530f8490@eden> Message-ID: <3E1CBB6D.6030002@v.loewis.de> Mark Hammond wrote: > While I concede that this is likely to work in the general case, I am not > sure it is "correct". If no threading semantics will be broken by having > one thread use multiple thread-states, then I must ask what purpose > thread-states (as opposed to the GIL) have. That is easy to answer (even though it is out of my area of expertise): it carries the Python stack, in particular for exceptions. Now, if you have multiple thread states in a single thread, the question is how a Python exception should propagate through the C stack. With multiple thread states, the exception "drops off" in the callback, which usually has no meaningful way to deal with it except to print it (in my application, the callback was always CORBA-initiated, so it was straightforward to propagate it across the wire to the remote caller). The only meaningful alternative would be to assume that there is a single thread state. In that case, the exception would be stored in the thread state, and come out in the original caller. Now, it is very questionable that you could unwind the C stack between the entrance to the library and the callback: If, as David says, you don't even know that the API may invoke a callback, there is surely no way to indicate that an exception came out of it. As a result, when returning to the bottom of the C stack, the extension suddenly finds an exception in its thread state. The extension probably doesn't expect that exception, so it is simply lost (when the next exception is set). Potentially, strange things happen as somebody might invoke PyErr_Occurred().
I question whether this is better than printing the exception, in the case of multiple thread states. Regards, Martin From jacobs@penguin.theopalgroup.com Thu Jan 9 00:05:14 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 8 Jan 2003 19:05:14 -0500 (EST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Message-ID: On Wed, 8 Jan 2003, Brett Cannon wrote: > [Kevin Jacobs] > > > Hi all, > > > > I've just noticed a problem in one of our financial systems that was caused > > by the new pure-Python strptime module by Brett Cannon. > > Lucky me. =) > > > > Python 2.3-cvs (with #undef HAVE_STRPTIME hard-coded): > > Just for future reference, you can just import ``_strptime`` and use > ``_strptime.strptime()`` instead of having to fiddle with ``time``. I just removed the #undef HAVE_STRPTIME in the meantime. Thanks for the tip, though. ;) > OK, so I will take a look, but if anyone beats me to a solution I will > have no qualms about it. =) I suspect that I know the cause, though I have not looked at the source for the specific code. Basically, your strptime implementation leaves unspecified information as -1. This is nice, except that it violates the SVID 3, POSIX, BSD 4.3, ISO 9899 standards. i.e., here is an excerpt from the mktime man page on my Linux system on how the values in a time-tuple are interpreted: The mktime() function converts a broken-down time structure, expressed as local time, to calendar time representation. The function ignores the specified contents of the structure members tm_wday and tm_yday and recomputes them from the other information in the broken-down time structure. If structure members are outside their legal interval, they will be normalized (so that, e.g., 40 October is changed into 9 November). Thus, mktime correctly returns a time-from-epoch that is 1 hour, 1 minute, and 1 second (3661 seconds) behind where it should be (unless it is DST).
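The 3661-second figure follows from normalizing the three -1 fields. A minimal sketch of that arithmetic in present-day Python, ignoring the DST adjustment; datetime and timedelta are used here only as an illustration and are not how mktime or _strptime work internally:

```python
from datetime import datetime, timedelta

# What a caller expects for '7/1/2002': midnight, all time fields zero.
expected = datetime(2002, 7, 1, 0, 0, 0)

# Hour, minute and second left at -1 are normalized, so each one
# counts as minus one unit from zero.
normalized = expected + timedelta(hours=-1, minutes=-1, seconds=-1)

# Size of the resulting shift in seconds: 3600 + 60 + 1.
shift = int((expected - normalized).total_seconds())

print(normalized)  # 2002-06-30 22:58:59
print(shift)       # 3661
```

The normalized value lands on the previous day, which matches the 06/30/2002 that the test script printed.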
Thanks for the quick reply, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From bac@OCF.Berkeley.EDU Thu Jan 9 00:14:46 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Wed, 8 Jan 2003 16:14:46 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: References: Message-ID: [Kevin Jacobs] > On Wed, 8 Jan 2003, Brett Cannon wrote: > > [Kevin Jacobs] > > I suspect that I know the cause, though I have not looked at the source for > the specific code. Basically, your strptime implementation leaves > unspecified information as -1. This is nice, except that it violates the > SVID 3, POSIX, BSD 4.3, ISO 9899 standards. Ah, nuts. You are right, ``_strptime`` does set unknown values to -1; Python docs say that unknown values might be set to who-knows-what, so I figured -1 was obvious since none of the values can be -1. Seems some people don't like that idea. Bah! i.e., here is an exerpt from > the mktime man page on my linux system on how the values in a time-tuple are > interpreted: > > The mktime() function converts a broken-down time structure, expressed > as local time, to calendar time representation. The function ignores > the specified contents of the structure members tm_wday and tm_yday and > recomputes them from the other information in the broken-down time > structure. If structure members are outside their legal interval, they > will be normalized (so that, e.g., 40 October is changed into 9 Novem- > ber). > > Thus, mktime correctly returns a time-from-epoch that is 1 hour, 1 minute, > and 1 second (3661 seconds) behind where it should be (unless it is DST). > How lovely. OK, so I am up for suggestions. 
I mean I could return default values that are within the acceptable range (one-line change since I just initialize the list I use to store values with default values of -1 as it is), but I don't want to mislead users into thinking that the values were extracted from the data string. Does obviousness come before or after following a spec? Would setting default values within range but to their minimum value (so if the month is not known, set it to 1 for Jan instead of -1) solve your problem, Kevin? -Brett From tim.one@comcast.net Thu Jan 9 00:33:42 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 08 Jan 2003 19:33:42 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <006201c2b76b$2e04ded0$530f8490@eden> Message-ID: [Mark Hammond] > ... > David, Anthony and I all have this problem today. Tim Peters can > see the problem and can see it exists (even if he believes my current > implementation is incorrect). I haven't looked at this in at least 2 years. Back when I did, I thought there *may* be rare races in how the Win32 classes initialized themselves. That may not have been the case in reality. I'd like to intensify the problem, though: you're in a thread and you want to call a Python API function safely. Period. You don't know anything else. You don't even know whether Python has been initialized yet, let alone whether there's already a thread state, and/or an interpreter state, sitting around available for you to use. You don't even know whether you're a thread created by Python or via some other means. I believe that, in order to end this pain forever , even this case must be made tractable. It doesn't preclude that a thread knowing more than nothing may be able to do something cheaper and simpler than a thread that knows nothing at all. I'd also like to postulate that proposed solutions can rely on a new Python C API supplying a portable spelling of thread-local storage. 
We can implement that easily on pthreads and Windows boxes, it seems to me to cut to the heart of several problems, and I'm willing to say that Python threading doesn't work anymore on other boxes until platform wizards volunteer code to implement this API there too. Since the start, Python threading has been constrained by the near-empty intersection of what even the feeblest platform thread implementations supply, and that creates problems without real payback. Let 'em eat Stackless . From bac@OCF.Berkeley.EDU Thu Jan 9 00:44:34 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Wed, 8 Jan 2003 16:44:34 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: References: Message-ID: [Brett Cannon] > [Kevin Jacobs] > > > > [Kevin Jacobs] > > Does obviousness come before or > after following a spec? Would setting default values within range but to > their minimum value (so if the month is not known, set it to 1 for Jan > instead of -1) solve your problem, Kevin? > Well, I answered my own questions: "spec comes first", and "setting the default to 0 fixes it". If you need a quick fix, Kevin, I can personally send you a diff (contextual or unified, your choice =) with the change until this gets into CVS. I will put a patch up on SF ASAP and email back here with the patch #. -Brett From jacobs@penguin.theopalgroup.com Thu Jan 9 00:44:52 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 8 Jan 2003 19:44:52 -0500 (EST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Message-ID: On Wed, 8 Jan 2003, Brett Cannon wrote: > i.e., here is an exerpt from > > the mktime man page on my linux system on how the values in a time-tuple are > > interpreted: > > > > The mktime() function converts a broken-down time structure, expressed > > as local time, to calendar time representation. 
The function ignores > > the specified contents of the structure members tm_wday and tm_yday and > > recomputes them from the other information in the broken-down time > > structure. If structure members are outside their legal interval, they > > will be normalized (so that, e.g., 40 October is changed into 9 November). > > > > Thus, mktime correctly returns a time-from-epoch that is 1 hour, 1 minute, > > and 1 second (3661 seconds) behind where it should be (unless it is DST). > > How lovely. OK, so I am up for suggestions. I mean I could return > default values that are within the acceptable range (one-line change since > I just initialize the list I use to store values with default values of -1 > as it is), but I don't want to mislead users into thinking that the values > were extracted from the data string. Does obviousness come before or > after following a spec? Would setting default values within range but to > their minimum value (so if the month is not known, set it to 1 for Jan > instead of -1) solve your problem, Kevin? No, -1 is the appropriate missing value for months, days, and years. e.g.: print time.strptime('12:22:23', '%H:%M:%S') # libc's strptime > (-1, -1, -1, 12, 22, 23, -1, -1, -1) All of your questions about what should be returned will be answered (in great detail) by the various standards that define strptime. Most UNIX system man pages provide fairly good definitions for how strptime should work. It would also be wise to build a test suite that you can use to validate your strptime implementation against the libc strptime implementations (like I did for the one case that I posted). That way you can eventually use your strptime to detect problems in platform strptime implementations, though that will have to wait until your version has been validated to conform very strictly. I am very grateful for the effort you put into your strptime implementation and hope that it won't be too inconvenient to go this last mile.
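A round-trip check along those lines might look like the sketch below. It is written for current Python rather than 2.3; the helper name round_trips and the sample list are hypothetical, and it only exercises whichever strptime the running interpreter provides, so comparing two implementations would need the same inputs fed to each.

```python
import time

def round_trips(text, fmt):
    """Return True if text survives strptime -> mktime -> localtime -> strftime."""
    parsed = time.strptime(text, fmt)    # string -> time tuple
    seconds = time.mktime(parsed)        # time tuple -> seconds from epoch
    back = time.strftime(fmt, time.localtime(seconds))
    return back == text

# Hypothetical sample inputs, zero-padded so the formatted result can be
# compared as a string; the first is the date from the original report.
samples = [("07/01/2002", "%m/%d/%Y"), ("12/31/1999", "%m/%d/%Y")]

# Any pair listed here failed to round-trip on this platform.
failures = [s for s in samples if not round_trips(*s)]
print(failures)
```

Keeping the inputs in a plain list makes it cheap to add the cases that a platform strptime gets wrong as they turn up.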
Thanks, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From tim.one@comcast.net Thu Jan 9 00:48:36 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 08 Jan 2003 19:48:36 -0500 Subject: [Python-Dev] test_codeccallbacks failing on Windows Message-ID: This failure started very recently: C:\Code\python\PCbuild>python ../lib/test/test_codeccallbacks.py ... test_xmlcharrefreplace (__main__.CodecCallbackTest) ... ok test_xmlcharrefvalues (__main__.CodecCallbackTest) ... ERROR ====================================================================== ERROR: test_xmlcharrefvalues (__main__.CodecCallbackTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "../lib/test/test_codeccallbacks.py", line 511, in test_xmlcharrefvalues s = u"".join([unichr(x) for x in v]) ValueError: unichr() arg not in range(0x10000) (narrow Python build) ---------------------------------------------------------------------- Ran 25 tests in 0.380s FAILED (errors=1) Traceback (most recent call last): File "../lib/test/test_codeccallbacks.py", line 613, in ? 
test_main() File "../lib/test/test_codeccallbacks.py", line 610, in test_main test.test_support.run_suite(suite) File "C:\CODE\PYTHON\lib\test\test_support.py", line 218, in run_suite raise TestFailed(err) test.test_support.TestFailed: Traceback (most recent call last): File "../lib/test/test_codeccallbacks.py", line 511, in test_xmlcharrefvalues s = u"".join([unichr(x) for x in v]) ValueError: unichr() arg not in range(0x10000) (narrow Python build) C:\Code\python\PCbuild>tcap/u From jacobs@penguin.theopalgroup.com Thu Jan 9 00:50:01 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Wed, 8 Jan 2003 19:50:01 -0500 (EST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Message-ID: On Wed, 8 Jan 2003, Brett Cannon wrote: > [Brett Cannon] > > [Kevin Jacobs] > > > > [Kevin Jacobs] > > > Does obviousness come before or > > after following a spec? Would setting default values within range but to > > their minimum value (so if the month is not known, set it to 1 for Jan > > instead of -1) solve your problem, Kevin? > > Well, I answered my own questions: "spec comes first", and "setting the > default to 0 fixes it". If you need a quick fix, Kevin, I can personally > send you a diff (contextual or unified, your choice =) with the change > until this gets into CVS. > > I will put a patch up on SF ASAP and email back here with the patch #. I appreciate the effort, but please take your time. There is no rush since I have several viable work-arounds. I'll also be happy to test any patches when they're ready. 
Thanks again, -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From bac@OCF.Berkeley.EDU Thu Jan 9 00:54:31 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Wed, 8 Jan 2003 16:54:31 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: References: Message-ID: [Kevin Jacobs] > No, -1 is the appropriate missing value for months, days, and years. > > e.g.: > print time.strptime('12:22:23', '%H:%M:%S') # libc's strptime > > (-1, -1, -1, 12, 22, 23, -1, -1, -1) > But in your example, which strptime is only given the year, month, and day, you got (2002, 7, 1, 0, 0, 0, 0, 182, 0). Now why is the day of the week and timezone 0 in this one but -1 in the one above? > All of your questions about what should be returned will be answered (in > great detail) by the various standard that define strptime. Most UNIX > system man pages provide fairly good definitions for how strptime should > work. I did my implementation based on OpenBSD's strptime man page (first hit on Google), so I have tried that already. >It would also be wise to built a test suite that you can use to > validate your strptime implementation against the libc strptime > implementations (like I did for the one case that I posted). That way you > can eventually use your strptime to detect problems in platform strptime > implementations, though that will have to wait until your version has been > validated to conform very strictly. > Could. Would be a matter of testing what time.strptime is and write a testing function that does what is needed. > I am very greatful for the effort you put in to your strptime implementation > and hope that it won't be too inconvenient to go this last mile. > =) I am not stopping until this thing is perfect. 
-Brett From tdelaney@avaya.com Thu Jan 9 00:56:06 2003 From: tdelaney@avaya.com (Delaney, Timothy) Date: Thu, 9 Jan 2003 11:56:06 +1100 Subject: [Python-Dev] no expected test output for test_sort? Message-ID: > From: Guido van Rossum [mailto:guido@python.org] > > cases, etc. There are coverage tools (ask Skip) -- when testing > Python modules, see if the test covers all code in the module! Speaking of which, I am currently seeking permission to release my (unfinished) coverage tool. One advantage it has is that it can run PyUnit tests from within the coverage tool without any extra work - as far as PyUnit is concerned, the script being covered is the __main__ script. Tim Delaney From bac@OCF.Berkeley.EDU Thu Jan 9 00:57:14 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Wed, 8 Jan 2003 16:57:14 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: References: Message-ID: [Kevin Jacobs] > > I will put a patch up on SF ASAP and email back here with the patch #. > > I appreciate the effort, but please take your time. There is no rush since > I have several viable work-arounds. I'll also be happy to test any patches > when they're ready. > No rush; perk of not working. =) But I am going to hold off until it is agreed upon that making the default value 0 is the right solution. And if anyone out there has some strptime docs that they feel I should take a look at (sans OpenBSD's man page since that is what this implementation is based on), let me know. -Brett From tim.one@comcast.net Thu Jan 9 01:31:46 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 08 Jan 2003 20:31:46 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Message-ID: [Brett Cannon] > ... > And if anyone out there has some strptime docs that they feel I > should take a look at (sans OpenBSD's man page since that is what > this implementation is based on), let me know. I believe strptime is a x-platform mess. 
The POSIX docs aren't clear, but seem to imply that the struct tm* passed to strptime() is both input and output, and that strptime() is supposed to leave all the fields alone except for those explicitly given new values by the format string, and even then doesn't define exactly which fields those are beyond saying "the appropriate tm structure members": http://www.opengroup.org/onlinepubs/007908799/xsh/strptime.html It doesn't say anything about how out-of-range values are to be treated. See the Python nondist/sandbox/datetime/datetime.py's class tmxxx for one plausible way to normalize out-of-range fields, if that's necessary. How that's supposed to work in all cases is also clear as mud, and POSIX is useless: http://www.opengroup.org/onlinepubs/007908799/xsh/mktime.html It echoes the C std so far as it goes, but omits the C standard's crucial additional text: If the call is successful, a second call to the mktime function with the resulting struct tm value shall always leave it unchanged and return the same value as the first call. Furthermore, if the normalized time is exactly representable as a time_t value, then the normalized broken-down time and the broken-down time generated by converting the result of the mktime function by a call to localtime shall be identical. This doesn't specify an algorithm for normalization, but constrains possible algorithms via cross-function invariants that must be maintained. From guido@python.org Thu Jan 9 02:06:54 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 08 Jan 2003 21:06:54 -0500 Subject: [Python-Dev] new features for 2.3? In-Reply-To: Your message of "Thu, 09 Jan 2003 10:16:58 +1100." References: Message-ID: <200301090206.h0926sc14535@pcp02138704pcs.reston01.va.comcast.net> > > We can do that in Python 2.3. Because this is backwards incompatible, > > I propose that you have to request this protocol explicitly. 
I > > propose to "upgrade" the binary flag to a general "protocol version" > > flag, with values: > > > > 0 - original protocol > > 1 - binary protocol > > 2 - new protocol > > If you're going to do this, why not go all the way, and encode the > python version in the protocol? The value to use could be encoded as > constants in pickle.py (and cPickle), with a CURRENT_PROTOCOL or > some such to make things easy. > > It might also be an idea to allow passing `sys.version_info`. > > The thing I'm trying to avoid is a proliferation of magic numbers > when pickling ... what happens when we get another > backwards-incompatible feature ... do we bump the number up to 3 and > call it 'newer protocol'? > > I think using pickle constants (be they version numbers or > otherwise) is the way to go here. No way. If any kind of version number is encoded in a pickle, it should be the pickle protocol version. Unlike marshalled data, pickles are guaranteed future-proof, and also backwards compatible, unless the pickled data itself depends on Python features (for example, an instance of a list subclass pickled in Python 2.2 can't be unpickled in older Python versions). Each subsequent pickle protocol is a superset of the previous protocol. There are use cases that have many pickles containing very small amounts of data, where adding a version header to each pickle would amount to a serious increase of data size. Maybe we can afford adding one byte to indicate the new version number (so that unpicklers that don't speak the new protocol will die cleanly), but I think that's about it. --Guido van Rossum (home page: http://www.python.org/~guido/) From markh@skippinet.com.au Thu Jan 9 02:07:56 2003 From: markh@skippinet.com.au (Mark Hammond) Date: Thu, 9 Jan 2003 13:07:56 +1100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <008e01c2b783$ed86d2f0$530f8490@eden> [Tim] > I'd like to intensify the problem, though: Good!
I was just taking small steps to get to the same endpoint. > you're in a thread and you want > to call a Python API function safely. Period. You don't > know anything else. You don't even know whether Python has > been initialized yet, let alone whether there's already a > thread state, and/or an interpreter state, sitting around > available for you to use. Agreed 100%. In some ways, I believe this is just the conclusion from my 2 points taken together if we can ignore the current world order. I split them to try and differentiate the requirements from the current API, but if we progress to Tim's description, my points become: 1) Becomes exactly as Tim stated. 2.1) Stays the same - release the GIL. 2.2) Goes away - if (1) requires no knowledge of Python's state, there is no need for extensions to take special action just to enable this. > I'd also like to postulate that proposed solutions can rely > on a new Python > C API supplying a portable spelling of thread-local storage. We can > implement that easily on pthreads and Windows boxes, it seems > to me to cut > to the heart of several problems, and I'm willing to say that Python > threading doesn't work anymore on other boxes until platform wizards > volunteer code to implement this API there too. This sounds good to me. After you have done the Win98 version, I volunteer to port it to Win2k . I believe we have a reasonable starting point. Our PEP could have: * All the usual PEP fluff * Define the goal, basically as stated by Tim. * Define a new C API specifically for this purpose, probably as an "optional extension" to the existing thread state APIs. * Define a TLS interface that all ports must implement *iff* this new API is to be available. This sounds reasonable to me unless we can see a number of other uses for TLS - in which case the TLS interface would probably get its own PEP, with this PEP relying on it. 
However, I don't see too much need for TLS - once we have our hands on a Python thread-state, we have a thread-specific dictionary available today, and a TLS dictionary from inside your Python code is trivial. From what I can see, we just need platform TLS to get hold of our thread-state, from which point we can (and do) manage our own thread specific data. How does this sound? Mark. From guido@python.org Thu Jan 9 02:12:26 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 08 Jan 2003 21:12:26 -0500 Subject: [Python-Dev] playback debugging In-Reply-To: Your message of "Wed, 08 Jan 2003 16:26:10 EST." <20030108162610.C4744@localhost.localdomain> References: <20030108162610.C4744@localhost.localdomain> Message-ID: <200301090212.h092CQs14570@pcp02138704pcs.reston01.va.comcast.net> > Would it be possible to save every function return/stack in a running > python program and then playback the results in a debugger? > The article linked from /. > http://web.media.mit.edu/~lieber/Lieberary/Softviz/CACM-Debugging/CACM-Debugging-Intro.html#Intro > > mentions that a lisp interpreter (Lisp stepper ZStep 95) did something > like this that then let you replay a program from the end back to the > beginning. > > The idea would be that every function return replays the orignal return > even if the outcome is different. core operations that call out to > C wouldn't actually make the call, so random() would always return the > same value that it orignally did at that spot. Ditto for file.read(). > The pure python code would almost by definition do excatly the same thing, > b/c the inputs would always be identical. If this isn't always possible a > warning could be issued and the orignal value be returned regardless. > > This would require work on the interpreter and the debugger, > but does anyone see a reason why this is technically impossible? > > Async IO programs would be SOL. And large or long running programs > would be infeasable. 
I run everything under mod_python which essentially > means a series of very short one-off programs. This would be a huge > win IMO. I currently save the input parameters and can re-run them > outside of apache. This is non-ideal as some server side state is lost, > but works most of the time. A large number of python programs would > reap a huge reward from this, but please correct me if I'm missing > the obvious. This sounds like a really cool idea, and I think it would probably be possible for small example programs. I don't know how much work it would be, but I support you in trying to get this to work -- as long as it doesn't mean extra work for me (I simply have no time). Ideally, you would define a *very* small set of changes to the existing Python VM that would let you write the rest of the logic of this code in Python (even if it had to be ugly Python). You might even try to see if the existing debugger hooks will let you do this -- you can trap Python function entries and exits. But I don't think you can trap calls to functions implemented in C, so I expect you'll need more hooks. The smaller the hooks are, the more likely they are to be accepted in the codebase. Oh, and they have to be safe -- it should not be possible to cause segfaults or other nastiness by abusing the hooks. --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney@avaya.com Thu Jan 9 02:41:44 2003 From: tdelaney@avaya.com (Delaney, Timothy) Date: Thu, 9 Jan 2003 13:41:44 +1100 Subject: [Python-Dev] new features for 2.3? Message-ID: > From: Guido van Rossum [mailto:guido@python.org] > > No way. If any kind of version number is encoded in a pickle, it > should be the pickle protocol version. Unlike marshalled data, > pickles are guraranteed future-proof, and also backwards compatible, > unless the pickled data itself depends on Python features (for > example, an instance of a list subclass pickled in Python 2.2 can't be > unpickled in older Python versions). 
Each subsequent pickle protocol > is a superset of the previous protocol. > > There are use cases that have many pickles containing very small > amounts of data, where adding a version header to each pickle would > amount to a serious increase of data size. Maybe we can afford adding > one byte to indicate the new version number (so that unpicklers that > don't speak the new protocol will die cleanly), but I think that's > about it. Sorry - that's more or less what I meant ... that there would be a mapping between the version and a magic number (the pickle protocol version). This could also be achieved by passing in the `sys.version_info` - it would be mapped to the appropriate number. Multiple versions may well use the same pickle protocol, but should the protocol numbers be made visible to the user? The "version number" (pickle protocol) would be an indicator to earlier versions that they don't know how do deal with this protocol ... essentially "you must be this version or higher to use this". Tim Delaney From markh@skippinet.com.au Thu Jan 9 02:42:43 2003 From: markh@skippinet.com.au (Mark Hammond) Date: Thu, 9 Jan 2003 13:42:43 +1100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E1CBB6D.6030002@v.loewis.de> Message-ID: <009701c2b788$c9d1f290$530f8490@eden> [Martin] > Mark Hammond wrote: > > While I concede that this is likely to work in the general > > case, I am not sure it is "correct". If no threading semantics > > will be broken by having one thread use multiple thread-states, > > then I must ask what purpose thread-states (as opposed to the GIL) have. > > That is easy to answer (even though it is out of my area of > expertise): > it carries the Python stack, in particular for exceptions. It also carries the profiler and debugger hooks, a general purpose "thread state" dictionary and other misc. details such as the tick count, recursion depth protection etc. 
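As a rough picture of what "thread-specific" data means at the Python level, here is a small sketch. It assumes threading.local from current Python, which is not the C-level thread state discussed in this thread, and the names per_thread, seen and worker are made up for the example.

```python
import threading

# threading.local() gives every thread its own attribute namespace.
per_thread = threading.local()
seen = {}

def worker(name):
    # Each thread stores and reads back only its own value.
    per_thread.value = name
    seen[name] = per_thread.value

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never set per_thread.value, so it has no such attribute.
main_has_value = hasattr(per_thread, "value")
print(seen, main_has_value)
```

Each worker sees only what it stored itself, and nothing leaks into the thread that created the object.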
> Now, if you have multiple thread states in a single thread, > the question > is how a Python exception should propagate through the C stack. Actually, I think the question is still "why would a single thread have multiple thread-states?". (Or maybe "should a thread-state be renamed to, say, an "invocation frame"?) > With multiple thread states, the exception "drops off" in the > callback, which usually "usually" is the key word here. Python isn't designed only to handle what programs "usually" do. A strategy I have seen recently here, which is to argue that any other requirements are self-evidently broken, is not helpful. We could possibly argue that exceptions are OK to handle this way. Similar amounts of text could also possibly convince that the profiler, debugger and thread-switch items also will not be too badly broken by having multiple thread states per thread, or that such breakage is "desirable" (ie, can be justified). You will have more trouble convincing me that future items stored in a Python thread state will not be broken, but I am past arguing about it. Please correct me if I am wrong, but it seems your solution to this is: * Every function which *may* trigger such callbacks *must* switch out the current thread state (thereby dropping the GIL etc) * Every entry-point which needs to call Python must *always* allocate and switch to a new thread-state. * Anything broken by having multiple thread-states per thread be either (a) fixed, or (b) justified in terms of a specific CORBA binding implementation. * Anyone wanting anything more remains out on their own, just as now. If so, I am afraid I was hoping for just a little more . Mark. From guido@python.org Thu Jan 9 04:44:03 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 08 Jan 2003 23:44:03 -0500 Subject: [Python-Dev] Re: Whither rexec? In-Reply-To: Your message of "Wed, 08 Jan 2003 20:34:36 +0100." 
<3E1C7D4C.5080502@lemburg.com> References: <200301061726.h06HQWe28737@odiug.zope.com> <3E1C7D4C.5080502@lemburg.com> Message-ID: <200301090444.h094i4G15018@pcp02138704pcs.reston01.va.comcast.net> > If you only want to secure a few objects, then mxProxy can > help you with this: it allows access management at C level > on a per-method basis and also via callbacks... > > http://www.egenix.com/files/python/mxProxy.html Zope3 has a similar proxy feature. But I think that the safety of proxies still relies on there not being backdoors, and the new-style class code has added too many of those. More on this thread later, when I have more bandwidth to deal with it. --Guido van Rossum (home page: http://www.python.org/~guido/) From altis@semi-retired.com Thu Jan 9 05:48:05 2003 From: altis@semi-retired.com (Kevin Altis) Date: Wed, 8 Jan 2003 21:48:05 -0800 Subject: [Python-Dev] PEP 290 revisited Message-ID: I'm in the process of cleaning up string module references in wxPython. I decided to take a look at the 2.3a1 release and it appears that Python and the standard libs need a similar cleanup. Of course, there are other cleanups to make per PEP 290 and PEP 8. I searched for threads on this topic after the initial discussions in June 2002 that led to PEP 290, but didn't find anything regarding cleaning up Python itself. http://mail.python.org/pipermail/python-dev/2002-June/024950.html This seems like the kind of work that even a puddenhead like me could do, not wasting the valuable time of the real developers. :) So, the question is whether patches for cleaning up string module references... would be welcome and if so, now, or not until after 2.3 final is released? Obviously, there is the possibility of typos or some other editing mistake, though of course all files would be checked for syntax errors with tabnanny and the patch can be easily scanned to see that the replacement lines are equivalent. 
I would be working from an anon-cvs checkout, so I just need to know which branch... I should checkout. My strategy with wxPython cleanup has been to deal with a directory of files per submitted patch file. For large dirs this can be split further into 5 - 10 files at a time so reading over the patches before a commit is pretty easy. I'm also dealing with just one cleanup issue at a time to reduce the possibility of editing errors. ka From skip@pobox.com Thu Jan 9 06:11:18 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 9 Jan 2003 00:11:18 -0600 Subject: [Python-Dev] PEP 290 revisited In-Reply-To: References: Message-ID: <15901.4742.504315.205329@montanaro.dyndns.org> Kevin> Obviously, there is the possibility of typos or some other Kevin> editing mistake, though of course all files would be checked for Kevin> syntax errors with tabnanny and the patch can be easily scanned Kevin> to see that the replacement lines are equivalent. I would be Kevin> working from an anon-cvs checkout, so I just need to know which Kevin> branch... I should checkout. I'd check out the main trunk. Also, running the regression tests after a batch of changes should help convince you your changes are correct. Skip From guido@python.org Thu Jan 9 06:20:17 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 09 Jan 2003 01:20:17 -0500 Subject: [Python-Dev] PEP 290 revisited In-Reply-To: Your message of "Wed, 08 Jan 2003 21:48:05 PST." References: Message-ID: <200301090620.h096KHE15872@pcp02138704pcs.reston01.va.comcast.net> > I'm in the process of cleaning up string module references in wxPython. > > I decided to take a look at the 2.3a1 release and it appears that > Python and the standard libs need a similar cleanup. I thought we already eradicated most uses of the string module; whatever's left is probably due to PEP 291. But in any case, I'm against these kinds of peephole changes. Historically they have often introduced subtle bugs due to the sheer volume of changes. 
Also, these changes complicate code audits using cvs annotate when you need to understand who wrote a particular line of code, and why. If you want to help, pick one module at a time (it must be a module that you know and use) and do a thorough style review on *all* aspects of the module. E.g. add docstrings, make sure docstrings conform to PEP 8, use the latest builtins where it makes sense, use nested scopes if it would clarify things, etc. Also do a thorough review of the module's test suite, making sure that all end cases are tested and that all code in the module is actually tested (use coverage tools!), adding tests when necessary. I know, that's much less fun and no quick satisfaction, but it leads to code *improvement* rather than bitrot. --Guido van Rossum (home page: http://www.python.org/~guido/) From altis@semi-retired.com Thu Jan 9 07:22:26 2003 From: altis@semi-retired.com (Kevin Altis) Date: Wed, 8 Jan 2003 23:22:26 -0800 Subject: [Python-Dev] PEP 290 revisited In-Reply-To: <200301090620.h096KHE15872@pcp02138704pcs.reston01.va.comcast.net> Message-ID: > -----Original Message----- > From: Guido van Rossum > > > I'm in the process of cleaning up string module references in wxPython. > > > > I decided to take a look at the 2.3a1 release and it appears that > > Python and the standard libs need a similar cleanup. > > I thought we already eradicated most uses of the string module; > whatever's left is probably due to PEP 291. > > But in any case, I'm against these kinds of peephole changes. > Historically they have often introduced subtle bugs due to the sheer > volume of changes. Also, these changes complicate code audits using > cvs annotate when you need to understand who wrote a particular line > of code, and why. That makes sense. Fair enough. > If you want to help, pick one module at a time (it must be a module > that you know and use) and do a thorough style review on *all* aspects > of the module. E.g. 
add docstrings, make sure docstrings conform to > PEP 8, use the latest builtins where it makes sense, use nested scopes > if it would clarify things, etc. Also do a thorough review of the > module's test suite, making sure that all end cases are tested and > that all code in the module is actually tested (use coverage tools!), > adding tests when necessary. > > I know, that's much less fun and no quick satisfaction, but it leads > to code *improvement* rather than bitrot. Yes, but it also means the folks doing the real work in a module are going to have to deal with this kind of stuff that probably seems trivial to them and not worth doing when they could be writing real code. It just means there is more on their plate and that Python itself, may not meet its own guidelines; these kinds of changes tend to not get done because there is never enough time. I am certainly not up for the level of involvement you are suggesting for a given module within the standard libs, nor do I think I have the level of knowledge and skills required, so I'll have to decline on that and just stick to the projects already on my plate. The downside is that after a certain point, a Python programmer starts looking at the standard libs and common packages for inspiration and code to reuse in their own code, at least I know I did. That's where I picked up my use of the string module, types, == None and != None, etc. A year and half later I'm getting around to cleaning this stuff up in my own code (and wxPython) when I see it, plus now I know better, but I would have preferred to see the "correct" way first. And yes I didn't discover the PEPs until long after I was already investigating and copying examples from the Python sources; that order of discovery may not be typical. Anyway, at least the level of involvement requirement is clear so those that want to step up to the plate, can. 
ka From adamrich@skunkworksone.com Thu Jan 9 09:00:29 2003 From: adamrich@skunkworksone.com (Adam Richardson) Date: Thu, 9 Jan 2003 09:00:29 +0000 Subject: [Python-Dev] Any Lasso'ers on the list ? Message-ID: Hi All, List newbie here, so apologies if this is the wrong list to post this to. We're currently assessing Python for middleware needs for various projects, chiefly with mySQL or Oracle db's. We currently use Lasso with a combination of the Corral Method / FrameWork Pro, so I'm wondering if there are any developers on the list who know both Lasso and Python who might be able to answer a few questions I have about porting Lasso solutions to Python. Thanks in advance, Adam -- -------------------------- Adam Richardson, CEO Skunkworks One Database, Web Development and Data Security Consultants 30 Chandos Street, St. Leonards, Sydney, NSW 2065 Australia Ph: +61 2 9439 8600 Fax: +61 2 9437 4584 http://www.SkunkworksOne.com SkunkWorks One specialises in developing databases, e-commerce and business systems, with our number one priority being to secure those systems with the data security that today's business environment demands. "The U.S. General Accounting Office (GAO) reports that about 250,000 break-ins into Federal computer systems were attempted in one year and 64 percent were successful. The number of attacks is doubling every year and the GAO estimates that only one to four percent of these attacks will be detected and only about one percent will be reported." The contents of this e-mail and any attachment(s) are strictly confidential and are solely for the person(s) at the e-mail address(es) above. Skunkworks One is a division of Waenick Pty. Ltd. From martin@v.loewis.de Thu Jan 9 10:30:47 2003 From: martin@v.loewis.de (Martin v. 
=?iso-8859-15?q?L=F6wis?=) Date: 09 Jan 2003 11:30:47 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: References: Message-ID: Tim Peters writes: > I'd like to intensify the problem, though: you're in a thread and you want > to call a Python API function safely. Period. Are there semantic requirements to the Python API in this context, with respect to the state of global things? E.g. when I run the simple string "import sys;print sys.modules", would I need to get the same output that I get elsewhere? If yes, is it possible to characterize "elsewhere" any better? Regards, Martin From martin@v.loewis.de Thu Jan 9 10:34:38 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 09 Jan 2003 11:34:38 +0100 Subject: [Python-Dev] Any Lasso'ers on the list ? In-Reply-To: References: Message-ID: Adam Richardson writes: > List newbie here, so apologies if this is the wrong list to post > this to. Adam, This is indeed the wrong list: It deals with the development *of* Python, not with the development *with* Python. Regards, Martin From mal@lemburg.com Thu Jan 9 10:42:46 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 09 Jan 2003 11:42:46 +0100 Subject: [Python-Dev] Re: Whither rexec? In-Reply-To: <200301090444.h094i4G15018@pcp02138704pcs.reston01.va.comcast.net> References: <200301061726.h06HQWe28737@odiug.zope.com> <3E1C7D4C.5080502@lemburg.com> <200301090444.h094i4G15018@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E1D5226.4090906@lemburg.com> Guido van Rossum wrote: >>If you only want to secure a few objects, then mxProxy can >>help you with this: it allows access management at C level >>on a per-method basis and also via callbacks... >> >> http://www.egenix.com/files/python/mxProxy.html > > > Zope3 has a similar proxy feature. But I think that the safety of > proxies still relies on there not being backdoors, and the new-style > class code has added too many of those. 
mxProxy stores a reference to the object in a C Proxy object and then manages access to this object through the Proxy methods and attributes. Provided that no other reference to the wrapped Python object exist in the interpreter, the only way to get at the object is via hacking the C code, ie. by using a special extension which knows how to extract the C pointer to the object from the Proxy object. Now, the Proxy object knows that e.g. bound methods of the object contain a reference to the object itself and rewraps the method in a way which hides the pointer to self. I don't know whether the new class code has added more backdoors of this kind. If so, I'd appreciate some details or references, so that I can add support for these to mxProxy as well. > More on this thread later, when I have more bandwidth to deal with it. Ok. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Thu Jan 9 10:58:29 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 09 Jan 2003 11:58:29 +0100 Subject: [Python-Dev] Any Lasso'ers on the list ? In-Reply-To: References: Message-ID: <3E1D55D5.5060506@lemburg.com> Adam Richardson wrote: > Hi All, > > List newbie here, so apologies if this is the wrong list to post this to. > > We're currently assessing Python for middleware needs for various > projects, chiefly with mySQL or Oracle db's. > > We currently use Lasso with a combination of the Corral Method / > FrameWork Pro, so I'm wondering if there are any developers on the list > who know both Lasso and Python who might be able to answer a few > questions I have about porting Lasso solutions to Python. You should probably ask this question on the Python Database SIG list (db-sig@python.org) or comp.lang.python. 
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From ben@algroup.co.uk Thu Jan 9 11:12:51 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Thu, 09 Jan 2003 11:12:51 +0000
Subject: [Python-Dev] Re: Whither rexec?
References: <200301061726.h06HQWe28737@odiug.zope.com>
Message-ID: <3E1D5933.4060309@algroup.co.uk>

A.M. Kuchling wrote:
> Guido van Rossum wrote:
>
>> See my recent checkins and what I just sent to python-announce (not
>> sure when the moderator will get to it):
>
>
> Back in December I reduced the "Restricted Execution" HOWTO
> to a warning not to use rexec. This morning, perhaps because of Guido's
> announcement, I've gotten two e-mails from users of the module asking
> for more details, both sounding a bit desperate for alternatives.
> Doubtless more rexec users will come out of the woodwork as a result.
>
> I'd like to add some suggested alternatives; any suggestions?

Although working code does not yet exist[1], I have a suggestion: capabilities. I'd say more, but I just got off a plane and would probably not be coherent; I promise I _will_ say more after sleeping.

Cheers,

Ben.

[1] I should say that I actually do have somewhat working code, but have recently completely changed my mind about how these should work, so it's in a state of flux.

--
http://www.apache-ssl.org/ben.html http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit."
- Robert Woodruff

From walter@livinglogic.de Thu Jan 9 11:39:31 2003
From: walter@livinglogic.de (Walter Dörwald)
Date: Thu, 09 Jan 2003 12:39:31 +0100
Subject: [Python-Dev] test_codeccallbacks failing on Windows
In-Reply-To:
References:
Message-ID: <3E1D5F73.8060203@livinglogic.de>

Tim Peters wrote:
> This failure started very recently:
>
> C:\Code\python\PCbuild>python ../lib/test/test_codeccallbacks.py
> ...
> test_xmlcharrefreplace (__main__.CodecCallbackTest) ... ok
> test_xmlcharrefvalues (__main__.CodecCallbackTest) ... ERROR

Fixed.

Bye,
Walter Dörwald

From pyth@devel.trillke.net Thu Jan 9 12:00:08 2003
From: pyth@devel.trillke.net (holger krekel)
Date: Thu, 9 Jan 2003 13:00:08 +0100
Subject: [Python-Dev] Extension modules, Threading, and the GIL
In-Reply-To: ; from tim.one@comcast.net on Wed, Jan 08, 2003 at 07:33:42PM -0500
References: <006201c2b76b$2e04ded0$530f8490@eden>
Message-ID: <20030109130008.H349@prim.han.de>

Tim Peters wrote:
> [...]
> I'd also like to postulate that proposed solutions can rely on a new Python
> C API supplying a portable spelling of thread-local storage. We can
> implement that easily on pthreads and Windows boxes, it seems to me to cut
> to the heart of several problems, and I'm willing to say that Python
> threading doesn't work anymore on other boxes until platform wizards
> volunteer code to implement this API there too.

FWIW, I am pretty confident that this can be done (read: copied) as Douglas Schmidt has implemented it (on more platforms than python supports) in the Adaptive Communication Environment (ACE):

    http://doc.ece.uci.edu/Doxygen/Beta/html/ace/classACE__TSS.html

As usual, Douglas Schmidt also presents a detailed platform analysis and research about his implementation:

    http://www.cs.wustl.edu/~schmidt/PDF/TSS-pattern.pdf

"This paper describes the Thread-Specific Storage pattern, which alleviates several problems with multi-threading performance and programming complexity.
The Thread-Specific Storage pattern improves performance and simplifies multithreaded applications by allowing multiple threads to use one logically global access point to retrieve thread-specific data without incurring locking overhead for each access." regards, holger From markh@skippinet.com.au Thu Jan 9 12:07:40 2003 From: markh@skippinet.com.au (Mark Hammond) Date: Thu, 9 Jan 2003 23:07:40 +1100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <00da01c2b7d7$b5abebf0$530f8490@eden> [Martin] > Tim Peters writes: > > > I'd like to intensify the problem, though: you're in a > > thread and you want to call a Python API function safely. > > Period. > > Are there semantic requirements to the Python API in this context, > with respect to the state of global things? E.g. when I run the simple > string "import sys;print sys.modules", would I need to get the same > output that I get elsewhere? If yes, is it possible to characterize > "elsewhere" any better? Yes, good catch. A PyInterpreterState must be known, and as you stated previously, it is trivial to get one of these and stash it away globally. The PyThreadState is the problem child. Mark. From jacobs@penguin.theopalgroup.com Thu Jan 9 12:07:41 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Thu, 9 Jan 2003 07:07:41 -0500 (EST) Subject: [Python-Dev] Assignment to __class__ Message-ID: Hi all, I'm testing large chunks of our code base with Python 2.3a1 and have run into another minor snag. Five months ago Guido committed a patch to prevent assigning __class__ for non-heap-types, which was backported to 2.2-maint two weeks ago in response to SF #658106. This is a great idea for preventing nonsensical assignments to None.__class__, or 2.__class__, but it is too heavy handed in preventing assignments to [1,2,3].__class__, (1,2,3).__class__ or {1:2,3:4}.__class__. My specific use-case involves dictionary and list objects. 
I define classes that inherit from list or dict and add specialized algebraic, vector and tensor functions over the range and domain of the data in the list or dictionary. I _could_ just copy the data into my new objects, but it is wasteful since these structures can be very large and deeply nested.

I suspect that it is possible to come up with better criteria for allowing safe assignment to __class__ that will still allow the useful technique I describe above.

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19   E-mail: jacobs@theopalgroup.com
Fax: (216) 986-0714   WWW: http://www.theopalgroup.com

From mhammond@skippinet.com.au Thu Jan 9 13:51:17 2003
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Fri, 10 Jan 2003 00:51:17 +1100
Subject: [Python-Dev] Extension modules, Threading, and the GIL
In-Reply-To: <3E1D7208.4040103@v.loewis.de>
Message-ID: <000801c2b7e6$2ea3e810$530f8490@eden>

> Mark Hammond wrote:
> > Yes, good catch. A PyInterpreterState must be known, and
> > as you stated previously, it is trivial to get one of these
> > and stash it away globally.
> > The PyThreadState is the problem child.
>
> Then of course you know more than Tim would grant you: you do have an
> interpreter state, and hence you can infer that Python has been
> initialized. So I infer that your requirements are different
> from Tim's.

Sheesh - lucky this is mildly entertaining. You are free to infer what you like, but I believe it is clear and would prefer to see a single other person with a problem rather than continue pointless semantic games.

Tiredly,

Mark.
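
[Editorial sketch, not part of the original messages.] Kevin Jacobs's technique of retagging an existing container as a richer subclass without copying its data can be illustrated roughly as below. This assumes a modern Python 3 interpreter rather than the 2.3a1 under discussion, and the names `Plain`, `Vector`, `Table` and `norm` are hypothetical, chosen only for illustration. It shows the three cases that matter: two heap-allocated subclasses of list (allowed), a list subclass versus a dict subclass (rejected because the C-level layouts differ), and an instance of the built-in list itself (rejected, which is the restriction Kevin is objecting to).

```python
import math


class Plain(list):
    """A heap-allocated list subclass with no extra behaviour."""


class Vector(list):
    """Same C-level layout as Plain, plus one hypothetical vector method."""

    def norm(self):
        return math.sqrt(sum(x * x for x in self))


class Table(dict):
    """A dict subclass: a different C struct, so not interchangeable."""


data = Plain([3.0, 4.0])
data.__class__ = Vector          # allowed: both are heap types sharing list's layout
print(data.norm())

try:
    data.__class__ = Table       # rejected: list and dict layouts differ
except TypeError as exc:
    print("layout mismatch:", exc)

try:
    [1, 2, 3].__class__ = Vector  # rejected: a plain list is not a heap-type instance
except TypeError as exc:
    print("built-in instance:", exc)
```

In this sketch the data is never copied: `data` is the same object before and after the assignment, and only its type pointer changes. A failed assignment leaves the object's class untouched.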
From martin@v.loewis.de Thu Jan 9 14:00:00 2003
From: martin@v.loewis.de ("Martin v. Löwis")
Date: Thu, 09 Jan 2003 15:00:00 +0100
Subject: [Python-Dev] Extension modules, Threading, and the GIL
In-Reply-To: <000801c2b7e6$2ea3e810$530f8490@eden>
References: <000801c2b7e6$2ea3e810$530f8490@eden>
Message-ID: <3E1D8060.5030507@v.loewis.de>

Mark Hammond wrote:
> Sheesh - lucky this is mildly entertaining. You are free to infer
> what you like, but I believe it is clear and would prefer to see a single
> other person with a problem rather than continue pointless semantic games.

Feel free to ignore me if you think you have the requirements specified, and proceed right away to presenting the solution.

Regards,
Martin

From jack@performancedrivers.com Thu Jan 9 18:57:00 2003
From: jack@performancedrivers.com (Jack Diederich)
Date: Thu, 9 Jan 2003 13:57:00 -0500
Subject: [Python-Dev] playback debugging
In-Reply-To: <200301090212.h092CQs14570@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Wed, Jan 08, 2003 at 09:12:26PM -0500
References: <20030108162610.C4744@localhost.localdomain> <200301090212.h092CQs14570@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030109135700.B7074@localhost.localdomain>

On Wed, Jan 08, 2003 at 09:12:26PM -0500, Guido van Rossum wrote:
> > Would it be possible to save every function return/stack in a running
> > python program and then playback the results in a debugger?
> >
> > The idea would be that every function return replays the original return
> > even if the outcome is different. core operations that call out to
> > C wouldn't actually make the call, so random() would always return the
> > same value that it originally did at that spot. Ditto for file.read().
> > The pure python code would almost by definition do exactly the same thing,
> > b/c the inputs would always be identical. If this isn't always possible a
> > warning could be issued and the original value be returned regardless.
> > This sounds like a really cool idea, and I think it would probably be > possible for small example programs. I don't know how much work it > would be, but I support you in trying to get this to work -- as long > as it doesn't mean extra work for me (I simply have no time). > Ideally, you would define a *very* small set of changes to the > existing Python VM that would let you write the rest of the logic of > this code in Python (even if it had to be ugly Python). You might > even try to see if the existing debugger hooks will let you do this -- > you can trap Python function entries and exits. But I don't think you > can trap calls to functions implemented in C, so I expect you'll need > more hooks. The smaller the hooks are, the more likely they are to be > accepted in the codebase. Oh, and they have to be safe -- it should > not be possible to cause segfaults or other nastiness by abusing the > hooks. I was actually just going to mention it, but I guess I'll take a peek at implementing it too. Which files define the VM and whither are the debug hooks? I grepped around for debug hooks, but everything 'grep -i debug *.c' just seemed to turn up instances of Py_DEBUG or things that only run when Py_DEBUG is defined. For C functions it may be possible to wrap them in python by doing a little name munging on import. Although that could be tricky and introduce small problems when people screw with module __dict__'s -jackdied From xscottg@yahoo.com Thu Jan 9 16:32:43 2003 From: xscottg@yahoo.com (Scott Gilbert) Date: Thu, 9 Jan 2003 08:32:43 -0800 (PST) Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <000801c2b7e6$2ea3e810$530f8490@eden> Message-ID: <20030109163243.31402.qmail@web40103.mail.yahoo.com> --- Mark Hammond wrote: > > [...] I believe it is clear and would prefer to see a single > other person with a problem [...] > I've been reading this thread with interest since we've recently fought (and lost) this battle at my company. 
Here is our use case: We use Python in primarily two ways, the first and obvious use is as a scripting language with a small group of us creating extensions to talk to existing libraries. There are no relevant problems here. The second use is as a data structures library used from C++. We created a very easy to use C++ class that has a bazillion operator overloads and handles all the reference counting and what not for the user. It used to handle threading too, but that proved to be very difficult. Think of this C++ class as something similar to what boost::python::{object, dict, list, tuple, long, numeric} provides, but intended for users who don't really like or want to know C++. Most of our users write small C++ processes that communicate amongst themselves via an assortment of IPC mechanisms. Occasionally these C++ processes are threaded, and we wanted to handle that. Our model was that C++ code would never hold the GIL, and that before we entered the Python API we would use pthread_getspecific (thread local storage) to see if there was a valid PyThreadState to use. If there wasn't a thread state, we would create one. Since C++ code never held the GIL, we'd always acquire it. This strategy allows all Python threads to take turns running, and allows any C++ threads to enter into Python when needed. Performance lagged a little this way, but not so much that we cared. The problem came when our users started to write generic libraries to be used from C++ and also wanted these libraries as Python extensions. In one case, their library would be used up in a standalone C++ process (where the GIL was not held), and in another they would use boost to try and export their library as an extension to Python (where the GIL was held). The same C++ library couldn't know in advance if the GIL was held. 
The way boost templatizes on your functions and classes, it is not at all clear when you can safely release the GIL for the benefit of the C++ library being wrapped up that expects the GIL is not held.

Since being able to support writing generic libraries easily is more important to us than supporting multithreaded C++ processes (using Python as a data structure library), we changed our strategy and made it so that in C++ the GIL was held by default. Since for these types of processes "most" of our time is spent in C++, no Python threads ever get a chance to run without additional work from the C++ author. It also requires additional work to have multiple C++ threads use Python. This was pretty unsatisfying to those of us who like to work with threads.

It's too late to make this long story short, but what would have made our situation much easier would be something like:

    void *what_happened = Py_AcquireTheGilIfIDontAlreadyHaveIt();
    // Can safely call Python API functions here, no matter what the
    // context is...
    Py_ReleaseTheGilIfImSupposedTo(what_happened);

I hope seeing another side of this is of some use.

Cheers,
-Scott

__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com

From guido@python.org Thu Jan 9 15:31:29 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 09 Jan 2003 10:31:29 -0500
Subject: [Python-Dev] Assignment to __class__
In-Reply-To: Your message of "Thu, 09 Jan 2003 07:07:41 EST."
References:
Message-ID: <200301091531.h09FVTC30823@odiug.zope.com>

> I'm testing large chunks of our code base with Python 2.3a1 and have run
> into another minor snag. Five months ago Guido committed a patch to prevent
> assigning __class__ for non-heap-types, which was backported to 2.2-maint
> two weeks ago in response to SF #658106.
This is a great idea for > preventing nonsensical assignments to None.__class__, or 2.__class__, but it > is too heavy handed in preventing assignments to [1,2,3].__class__, > (1,2,3).__class__ or {1:2,3:4}.__class__. > > My specific use-case involves dictionary and list objects. I define a > classes that inherits from list or dict and add specialized algebraic, > vector and tensor functions over the range and domain of the data in the > list or dictionary. I _could_ just copy the data into my new objects, but it > is wasteful since these structures can be very large and deeply nested. > > I suspect that it is possible to come up with better criteria for allowing > safe assignment to __class__ that will still allow the useful technique I > describe above. You can only set __class__ when the old and new class instance have the same instance layout at the C level. Changing this is impossible given the way objects are implemented in C. This means you can never change a list into a dict or vice versa, because the C structs are different. Or do I misunderstand you? Can you give an example of something you think should be allowed but currently isn't? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 9 15:47:40 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 09 Jan 2003 10:47:40 -0500 Subject: [Python-Dev] PEP 290 revisited In-Reply-To: Your message of "Wed, 08 Jan 2003 23:22:26 PST." References: Message-ID: <200301091547.h09Flf731716@odiug.zope.com> > > If you want to help, pick one module at a time (it must be a module > > that you know and use) and do a thorough style review on *all* aspects > > of the module. E.g. add docstrings, make sure docstrings conform to > > PEP 8, use the latest builtins where it makes sense, use nested scopes > > if it would clarify things, etc. 
Also do a thorough review of the > > module's test suite, making sure that all end cases are tested and > > that all code in the module is actually tested (use coverage tools!), > > adding tests when necessary. > > > > I know, that's much less fun and no quick satisfaction, but it leads > > to code *improvement* rather than bitrot. > > Yes, but it also means the folks doing the real work in a module are > going to have to deal with this kind of stuff that probably seems > trivial to them and not worth doing when they could be writing real > code. It just means there is more on their plate and that Python > itself, may not meet its own guidelines; these kinds of changes tend > to not get done because there is never enough time. I've never considered this a problem. If the code isn't changed for trivial reasons that means I still recognize it when I have to fix a bug 3 years later. > I am certainly not up for the level of involvement you are > suggesting for a given module within the standard libs, nor do I > think I have the level of knowledge and skills required, so I'll > have to decline on that and just stick to the projects already on my > plate. That's okay. You're doing good work on your own projects! > The downside is that after a certain point, a Python programmer > starts looking at the standard libs and common packages for > inspiration and code to reuse in their own code, at least I know I > did. That's where I picked up my use of the string module, types, == > None and != None, etc. A year and half later I'm getting around to > cleaning this stuff up in my own code (and wxPython) when I see it, > plus now I know better, but I would have preferred to see the > "correct" way first. And yes I didn't discover the PEPs until long > after I was already investigating and copying examples from the > Python sources; that order of discovery may not be typical. People who don't read documentation have no excuse. 
:-) > Anyway, at least the level of involvement requirement is clear so > those that want to step up to the plate, can. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 9 15:36:40 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 09 Jan 2003 10:36:40 -0500 Subject: [Python-Dev] Re: Whither rexec? In-Reply-To: Your message of "Thu, 09 Jan 2003 11:12:51 GMT." <3E1D5933.4060309@algroup.co.uk> References: <200301061726.h06HQWe28737@odiug.zope.com> <3E1D5933.4060309@algroup.co.uk> Message-ID: <200301091536.h09Faeq31062@odiug.zope.com> > [1] I should say that I actually do have somewhat working code, but > have recently completely changed my mind about how these should > work, so its in a state of flux. Is this the modern version of "this margin isn't wide enough to hold the proof"? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 9 15:42:29 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 09 Jan 2003 10:42:29 -0500 Subject: [Python-Dev] Re: Whither rexec? In-Reply-To: Your message of "Thu, 09 Jan 2003 11:42:46 +0100." <3E1D5226.4090906@lemburg.com> References: <200301061726.h06HQWe28737@odiug.zope.com> <3E1C7D4C.5080502@lemburg.com> <200301090444.h094i4G15018@pcp02138704pcs.reston01.va.comcast.net> <3E1D5226.4090906@lemburg.com> Message-ID: <200301091542.h09FgTK31538@odiug.zope.com> [MAL] > >>If you only want to secure a few objects, then mxProxy can > >>help you with this: it allows access management at C level > >>on a per-method basis and also via callbacks... > >> > >> http://www.egenix.com/files/python/mxProxy.html > Guido van Rossum wrote: > > Zope3 has a similar proxy feature. But I think that the safety of > > proxies still relies on there not being backdoors, and the new-style > > class code has added too many of those. 
[MAL] > mxProxy stores a reference to the object in a C Proxy object > and then manages access to this object through the Proxy methods > and attributes. Provided that no other reference to the wrapped > Python object exists in the interpreter, the only way to get at > the object is via hacking the C code, i.e. by using a special > extension which knows how to extract the C pointer to the object > from the Proxy object. Yes, this is exactly what Zope3 does (apart from details). The *provided* clause is the scheme's weakness, of course -- it's not always possible to have no unproxied references to an object. At least, in Zope it's not possible, because proxies can't be pickled, and we use this for persistent objects, so the unproxied objects are also held somewhere (if only temporarily). > Now, the Proxy object knows that e.g. bound methods of the > object contain a reference to the object itself and rewraps the > method in a way which hides the pointer to self. Zope3 proxies do this too. > I don't know whether the new class code has added more backdoors of > this kind. If so, I'd appreciate some details or references, so that > I can add support for these to mxProxy as well. The point of this thread is that at this point nobody knows about all the backdoors that might exist. --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy@udel.edu Thu Jan 9 16:53:58 2003 From: tjreedy@udel.edu (Terry Reedy) Date: Thu, 9 Jan 2003 11:53:58 -0500 Subject: [Python-Dev] Re: PEP 290 revisited References: <200301090620.h096KHE15872@pcp02138704pcs.reston01.va.comcast.net> Message-ID: "Kevin Altis" wrote in message news:KJEOLDOPMIDKCMJDCNDPIEGGCLAA.altis@semi-retired.com... > The downside is that after a certain point, a Python programmer starts > looking at the standard libs and common packages for inspiration and code to > reuse in their own code, at least I know I did. That's where I picked up my > use of the string module, types, == None and != None, etc.
I think the issue of the library setting bad examples should not be dismissed. I have seen at least two modules that use 'list' as a list variable name (i.e., with 'list=[]' followed by 'list.append(val)' in a for loop). Meanwhile, experienced Pythoneers are constantly recommending "Don't do that!", "Bad habit, change your code", etc (and similarly for str, dict, and other builtins) on c.l.p. With PythonWin (for instance), it would be fairly easy to grep and edit the entire library for one potential-bad-usage word at a time. It would be somewhat tedious, but fairly easy and quite safe if done with reasonable care. I would pick a uniform replacement, like 'listx' for 'list', that is unlikely to conflict with other var names. Of course, each function would have to be scanned to make sure of no conflict. Terry J. Reedy From dave@boost-consulting.com Thu Jan 9 19:45:38 2003 From: dave@boost-consulting.com (David Abrahams) Date: Thu, 09 Jan 2003 14:45:38 -0500 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL References: <006201c2b76b$2e04ded0$530f8490@eden> <20030109130008.H349@prim.han.de> Message-ID: holger krekel writes: > Tim Peters wrote: >> [...] >> I'd also like to postulate that proposed solutions can rely on a new Python >> C API supplying a portable spelling of thread-local storage. We can >> implement that easily on pthreads and Windows boxes, it seems to me to cut >> to the heart of several problems, and I'm willing to say that Python >> threading doesn't work anymore on other boxes until platform wizards >> volunteer code to implement this API there too. > > FWIW, I am pretty confident that this can be done (read: copied) as > Douglas Schmidt has implemented it (on more platforms than python > supports) in the Adaptive Communication Framework (ACE): > > http://doc.ece.uci.edu/Doxygen/Beta/html/ace/classACE__TSS.html We also have a TSS implementation in the Boost.Threads library.
I haven't looked at the ACE code myself, but I've heard that every component depends on many others, so it might be easier to extract useful information from the Boost implementation. -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From dave@boost-consulting.com Thu Jan 9 19:47:47 2003 From: dave@boost-consulting.com (David Abrahams) Date: Thu, 09 Jan 2003 14:47:47 -0500 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL References: <3E1D7208.4040103@v.loewis.de> <000801c2b7e6$2ea3e810$530f8490@eden> Message-ID: "Mark Hammond" writes: >> Mark Hammond wrote: >> > Yes, good catch. A PyInterpreterState must be known, and >> > as you stated previously, it is trivial to get one of these >> > and stash it away globally. >> > The PyThreadState is the problem child. >> >> Then of course you know more than Tim would grant you: you do have an >> interpreter state, and hence you can infer that Python has been >> initialized. So I infer that your requirements are different >> from Tim's. > > Sheesh - lucky this is mildly entertaining . You are free to > infer what you like, but I believe it is clear and would prefer to > see a single other person with a problem rather than continue > pointless semantic games. In this instance, it looks to me like Martin makes a good point. If I'm missing something, I'd appreciate an explanation. 
Thanks, -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From martin@v.loewis.de Thu Jan 9 17:23:07 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Thu, 09 Jan 2003 18:23:07 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: References: Message-ID: <3E1DAFFB.5090202@v.loewis.de> Tim Peters wrote: > No more than self-consistency, I expect, same as for a proper call now run > from a thread. That covers the case that proper calls are possible from this thread (although there are still ambiguities: a single thread may have multiple thread states, which are associated with multiple interpreters, which may have different contents of sys.modules). However, that answer does not work for the case of a thread that has never seen a Python call: proper calls could not run on this thread. > Then I expect the strongest that can be said is that the output you get > corresponds to the actual state of sys.modules at some time T The issue is that there may be multiple sys.modules in the process, at T. Then the question is whether to use one of the existing ones (if so: which one), or create a new one. > I don't expect we can say anything stronger than that today either. Currently, the extension author must somehow come up with an interpreter state. I'm uncertain whether you are proposing to leave it at that, or whether you require that any solution to "the problem" also provides a way to obtain the "right" interpreter state somehow. Regards, Martin From python@rcn.com Thu Jan 9 15:40:00 2003 From: python@rcn.com (Raymond Hettinger) Date: Thu, 9 Jan 2003 10:40:00 -0500 Subject: [Python-Dev] no expected test output for test_sort?
References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <15892.56781.473869.187757@montanaro.dyndns.org> <3E1B0287.6080904@livinglogic.de> <200301080134.h081YSK07358@pcp02138704pcs.reston01.va.comcast.net> <3E1C01FA.9020602@livinglogic.de> <200301081221.h08CLHG08953@pcp02138704pcs.reston01.va.comcast.net> <3E1C6D14.7080908@livinglogic.de> Message-ID: <007001c2b7f5$5f8ad600$125ffea9@oemcomputer> [Walter] > >>So should I go on with this? Do we want to change all tests before 2.3 > >>is finished, or start changing them after 2.3 is released > >>(or something in between)? > > [GvR] > > I'd say something in between. It's probably good if someone (not me) > > reviews your patches. [Walter] > Definitely. Any volunteers? I'll review a few of these but cannot sign-up for the whole ball game because of time constraints, inability to run certain modules, and trying to review only things I thoroughly understand. My own hope for your project is that new tests are made, coverage increased, interfaces are checked to the document specification, and bugs are found. I'm much less enthusiastic about having existing tests converted to unittest format. Raymond Hettinger From simonb@webone.com.au Thu Jan 9 23:00:27 2003 From: simonb@webone.com.au (Simon Burton) Date: Fri, 10 Jan 2003 10:00:27 +1100 Subject: [Python-Dev] playback debugging In-Reply-To: <20030109135700.B7074@localhost.localdomain> References: <20030108162610.C4744@localhost.localdomain> <200301090212.h092CQs14570@pcp02138704pcs.reston01.va.comcast.net> <20030109135700.B7074@localhost.localdomain> Message-ID: <20030110100027.2650189b.simonb@webone.com.au> > > I was actually just going to mention it, but I guess I'll take a peek > at implementing it too. > > Which files define the VM and whither are the debug hooks? 
> I grepped around for debug hooks, but everything 'grep -i debug *.c' > just seemed to turn up instances of Py_DEBUG or things that only run > when Py_DEBUG is defined. i think the debug hook is the sys.settrace function used by the pdb module, have you looked at this? Should be enough functionality to implement playbacks of native python code. (I'll do it if you don't.) Simon Burton. From tim.one@comcast.net Thu Jan 9 17:07:28 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 09 Jan 2003 12:07:28 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: Message-ID: [Tim] > I'd like to intensify the problem, though: you're in a thread > and you want to call a Python API function safely. Period. [martin@v.loewis.de] > Are there semantic requirements to the Python API in this context, > with respect to the state of global things? No more than self-consistency, I expect, same as for a proper call now run from a thread. > E.g. when I run the simple string "import sys;print sys.modules", would I > need to get the same output that I get elsewhere? If yes, is it possible to > characterize "elsewhere" any better? I don't know what "elsewhere" means at all. Let's make some assumptions first: Python either has already been initialized successfully, or does initialize successfully as a result of whatever prologue dance is required before you're allowed to "run the simple string". "sys" resolves to the builtin sys. You "run the simple string" at time T1, and it returns at time T2. Nobody finalizes Python "by surprise", or kills it, during this either. Then I expect the strongest that can be said is that the output you get corresponds to the actual state of sys.modules at some time T, with T1 <= T <= T2 (and because you're executing Python code, there's nothing to stop the interpreter from letting some other thread(s) run before and after each of the interpreted string statements, and they can do anything to sys.modules). 
I don't expect we can say anything stronger than that today either. From martin@v.loewis.de Thu Jan 9 12:58:48 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 09 Jan 2003 13:58:48 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <00da01c2b7d7$b5abebf0$530f8490@eden> References: <00da01c2b7d7$b5abebf0$530f8490@eden> Message-ID: <3E1D7208.4040103@v.loewis.de> Mark Hammond wrote: > Yes, good catch. A PyInterpreterState must be known, and as you stated > previously, it is trivial to get one of these and stash it away globally. > The PyThreadState is the problem child. Then of course you know more than Tim would grant you: you do have an interpreter state, and hence you can infer that Python has been initialized. So I infer that your requirements are different from Tim's. Regards, Martin From tim.one@comcast.net Thu Jan 9 17:12:27 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 09 Jan 2003 12:12:27 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <000801c2b7e6$2ea3e810$530f8490@eden> Message-ID: [Martin v. Lowis] > Then of course you know more than Tim would grant you: you do have an > interpreter state, and hence you can infer that Python has been > initialized. So I infer that your requirements are different > from Tim's. If so, I doubt they'll stay that way . I don't want Mark to *have* to know whether there's an interpreter state available, so the all-purpose prologue code will need to have a way to know that without Mark's help. I do want Mark to be *able* to use a leaner prologue dance if he happens to know that an interpreter state is available. I'd also like for that leaner prologue dance to be able to assert that an interpreter state is indeed available. "The leaner prologue dance" may be identical to the "all-purpose prologue code"; whether or not it can be is an implementation detail, which should become clear later. 
From jacobs@penguin.theopalgroup.com Thu Jan 9 16:33:57 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Thu, 9 Jan 2003 11:33:57 -0500 (EST) Subject: [Python-Dev] Assignment to __class__ In-Reply-To: <200301091531.h09FVTC30823@odiug.zope.com> Message-ID: On Thu, 9 Jan 2003, Guido van Rossum wrote: > > I suspect that it is possible to come up with better criteria for allowing > > safe assignment to __class__ that will still allow the useful technique I > > describe above. > > You can only set __class__ when the old and new class instance have > the same instance layout at the C level. Changing this is impossible > given the way objects are implemented in C. This means you can never > change a list into a dict or vice versa, because the C structs are > different. > > Or do I misunderstand you? Can you give an example of something you > think should be allowed but currently isn't? Sorry, I was not as clear as I should have been. Here is what used to work, and I hope can be made to work again: class AlgebraicDict(dict): def doReallyComplexThings(self, foo): ... def __add__(self, other): ... def __mul__(self, other): ... unsuspecting_dict = {1:[1,2],2:3} unsuspecting_dict.__class__ = AlgebraicDict > TypeError: __class__ assignment: only for heap types Analogously, we want to transform native lists into AlgebraicLists, which of course has list as a base class. 
Hope this clears things up, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From mhammond@skippinet.com.au Fri Jan 10 00:53:03 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Fri, 10 Jan 2003 11:53:03 +1100 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <002001c2b842$a26cb220$530f8490@eden> > >> Then of course you know more than Tim would grant you: you > >> do have an > >> interpreter state, and hence you can infer that Python has been > >> initialized. So I infer that your requirements are different > >> from Tim's. > > Sheesh - lucky this is mildly entertaining . You are free to > > infer what you like, but I believe it is clear and would prefer to > > see a single other person with a problem rather than continue > > pointless semantic games. > In this instance, it looks to me like Martin makes a good point. If > I'm missing something, I'd appreciate an explanation. There was no requirement that identical code be used in all cases. Checking if Python is initialized is currently trivial, and requires no special inference skills. It is clear that some consideration will need to be given to the PyInterpreterState used for all this, but that is certainly tractable - every single person who has spoken up with this requirement to date has indicated that their application does not need multiple interpreter states - so explicitly ignoring that case seems fine. Mark. From tim.one@comcast.net Thu Jan 9 19:23:43 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 09 Jan 2003 14:23:43 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E1DAFFB.5090202@v.loewis.de> Message-ID: [Martin v. 
Lowis] > That covers the case that proper calls are possible from this thread > (although there are still ambiguities: a single thread may have multiple > thread states, which are associated with multiple interpreters, which may > have different contents of sys.modules) > > However, that answer does not work for the case of a thread that has > never seen a Python call: proper calls could not run on this thread. > >> Then I expect the strongest that can be said is that the output you get >> corresponds to the actual state of sys.modules at some time T > > The issue is that there may be multiple sys.modules in the process, at > T. Then the question is whether to use one of the existing ones (if so: > which one), or create a new one. All right, you're worried about multiple interpreter states. I'm not -- I've never used them, the Python distribution never uses them, there are no tests of that feature, and they don't seem particularly well-defined in end cases regardless. I'm happy to leave them as an afterthought. If someone wants to champion them, they'd better ensure their interests are met. As far as I'm concerned, if a user does the all-purpose prologue dance (the "I don't know anything, but I want to use the Python API anyway" one), then the interpreter state in effect isn't defined. It may use an existing interpreter state, or an interpreter state created solely for use by this call, or take one out of a pool of interpreter states reused for such cases, or whatever. Regardless, it's a little tail that I don't want wagging the dog here. >> I don't expect we can say anything stronger than that today either. > Currently, the extension author must somehow come up with an interpreter > state. I'm uncertain whether you are proposing to leave it at that, or > whether you require that any solution to "the problem" also provides a > way to obtain the "right" interpreter state somehow. I define "right" as "undefined" in this case.
Someone who cares about multiple interpreter states should feel free to define and propose a stronger requirement. However, the people pushing for change here have explicitly disavowed interest in multiple interpreter states, and I'm happy to press on leaving them for afterthoughts. From amit@ontrackindia.com Fri Jan 10 07:21:23 2003 From: amit@ontrackindia.com (Amit Khan) Date: Fri, 10 Jan 2003 12:51:23 +0530 Subject: [Python-Dev] os.popen is not working Message-ID: <012101c2b878$e17c20f0$1b00a8c0@ontrackindia.com> Hi, In one of my windows box os.popen("ipconfig /all") is not working. It is showing an error message like Traceback (most recent call last): File "", line 1, in ? WindowsError: [Errno 123] The filename, directory name, or volume label syntax i s incorrect: 'C:\\WINNT\\system32\\cmd.exe\\ /c dir' I guess "\\" after "C:\\WINNT\\system32\\cmd.exe" is creating problem. What can be reason behind it? Warm Regards, Amit Khan From martin@v.loewis.de Fri Jan 10 09:39:06 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 10 Jan 2003 10:39:06 +0100 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: References: <006201c2b76b$2e04ded0$530f8490@eden> <20030109130008.H349@prim.han.de> Message-ID: David Abrahams writes: > We also have a TSS implementation in the Boost.Threads library. I > haven't looked at the ACE code myself, but I've heard that every > component depends on many others, so it might be easier to extract > useful information from the Boost implementation. Without looking at either Boost or ACE, I would guess that neither will help much: We would be looking for TLS support for AtheOS, BeOS, cthreads, lwp, OS/2, GNU pth, Solaris threads, SGI threads, and Win CE. I somewhat doubt that either Boost or ACE aim for such a wide coverage. Regards, Martin From martin@v.loewis.de Fri Jan 10 09:46:57 2003 From: martin@v.loewis.de (Martin v. 
Löwis) Date: 10 Jan 2003 10:46:57 +0100 Subject: [Python-Dev] os.popen is not working In-Reply-To: <012101c2b878$e17c20f0$1b00a8c0@ontrackindia.com> References: <012101c2b878$e17c20f0$1b00a8c0@ontrackindia.com> Message-ID: "Amit Khan" writes: > What can be reason behind it? Please don't use python-dev for help in developing a Python application; use python-list if you need help, use sf.net/projects/python if you want to report a bug. On python-dev, it would be appropriate to post a complete analysis of the problem, suggest a change, and ask whether anybody could see problems with such a change. Regards, Martin From Raymond Hettinger" During a code review, Neal Norwitz noticed that ceval.c defines DUP_TOPX for x in (1,2,3,4,5) but that compile.c never generates that opcode with a parameter greater than three. The question of the day is whether anyone knows of a reason that we can't or shouldn't remove the code for the 4 and 5 cases. Is there anything else (past or present) that can generate this opcode? Taking it out is only a microscopic win, a few saved brain cycles and a smaller byte size for the main eval loop (making it slightly more likely to stay in cache). Also, we wanted to know if anyone still had a use for the LLTRACE facility built into ceval.c. It's been there since '92 and may possibly no longer be of value. Raymond Hettinger From jacobs@penguin.theopalgroup.com Fri Jan 10 11:40:52 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 10 Jan 2003 06:40:52 -0500 (EST) Subject: [Python-Dev] Assignment to __class__ In-Reply-To: <200301091531.h09FVTC30823@odiug.zope.com> Message-ID: On Thu, 9 Jan 2003, Guido van Rossum wrote: > You can only set __class__ when the old and new class instance have > the same instance layout at the C level. Changing this is impossible > given the way objects are implemented in C. This means you can never > change a list into a dict or vice versa, because the C structs are > different.
> > Or do I misunderstand you? Can you give an example of something you > think should be allowed but currently isn't? Sorry, I was not as clear as I should have been. Here is what used to work, and I hope can be made to work again: class AlgebraicDict(dict): def doReallyComplexThings(self, foo): ... def __add__(self, other): ... def __mul__(self, other): ... unsuspecting_dict = {1:[1,2],2:3} unsuspecting_dict.__class__ = AlgebraicDict > TypeError: __class__ assignment: only for heap types Analogously, we want to transform native lists into AlgebraicLists, which of course has list as a base class. Hope this clears things up, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From mhammond@skippinet.com.au Fri Jan 10 12:56:18 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Fri, 10 Jan 2003 23:56:18 +1100 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <009001c2b8a7$ab8b2f70$530f8490@eden> [Martin] > David Abrahams writes: > > > We also have a TSS implementation in the Boost.Threads library. I > > haven't looked at the ACE code myself, but I've heard that every > > component depends on many others, so it might be easier to extract > > useful information from the Boost implementation. > > Without looking at either Boost or ACE, I would guess that neither > will help much: We would be looking for TLS support for AtheOS, BeOS, > cthreads, lwp, OS/2, GNU pth, Solaris threads, SGI threads, and Win > CE. I somewhat doubt that either Boost or ACE aim for such a wide > coverage. We could simply have a "pluggable TLS" design. It seems that everyone who has this requirement is interfacing to a complex library (com, xpcom, Boost, ACE), and in general, these libraries also require TLS. So consider an API such as: PyTSM_HANDLE Py_InitThreadStateManager( void (*funcTLSAlloc)(...), ... 
PyInterpreterState = NULL); void Py_EnsureReadyToRock(PyTSM_HANDLE); void Py_DoneRocking(PyTSM_HANDLE); ... Obviously the spelling is drastically different, but the point is that we can lean on the extension module itself, rather than the platform, to provide the TLS. In the case of Windows and a number of other OS's, you could fallback to a platform implementation if necessary, but in the case of xpcom, for example, you know that xpcom also defines its own TLS API, so anywhere we need the extension module, TLS comes for "free", even if no one has ported the platform TLS API to the Python TLS API. Our TLS requirements are very simple, and could be "spelt" in a small number of function pointers. Such a design also handles any PyInterpreterState issues - we simply assert if the passed pointer is non-NULL, and leave it to someone who cares to fix . Mark. From guido@python.org Fri Jan 10 13:00:36 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 10 Jan 2003 08:00:36 -0500 Subject: [Python-Dev] Assignment to __class__ In-Reply-To: Your message of "Fri, 10 Jan 2003 06:40:52 EST." References: Message-ID: <200301101300.h0AD0as20966@pcp02138704pcs.reston01.va.comcast.net> > Sorry, I was not as clear as I should have been. Here is what used to work, > and I hope can be made to work again: > > class AlgebraicDict(dict): > def doReallyComplexThings(self, foo): ... > def __add__(self, other): ... > def __mul__(self, other): ... > > unsuspecting_dict = {1:[1,2],2:3} > > unsuspecting_dict.__class__ = AlgebraicDict > > TypeError: __class__ assignment: only for heap types But in this case, the instance layout of dict and AlgebraicDict is different anyway (AlgebraicDict has room for the __dict__ to contain instance variables) so you can't do that. The layouts could be the same if you add __slots__ = [] to AlgebraicDict. 
But the problem is that even then, AlgebraicDict may be allocated using a different free list than dict, and changing its __class__ to dict would free it using the wrong free list. To work around, create a neutral dict subclass that has the same layout as AlgebraicDict. --Guido van Rossum (home page: http://www.python.org/~guido/) From jacobs@penguin.theopalgroup.com Fri Jan 10 13:04:28 2003 From: jacobs@penguin.theopalgroup.com (Kevin Jacobs) Date: Fri, 10 Jan 2003 08:04:28 -0500 (EST) Subject: [Python-Dev] Assignment to __class__ In-Reply-To: <200301101300.h0AD0as20966@pcp02138704pcs.reston01.va.comcast.net> Message-ID: On Fri, 10 Jan 2003, Guido van Rossum wrote: > > Sorry, I was not as clear as I should have been. Here is what used to work, > > and I hope can be made to work again: > > > > class AlgebraicDict(dict): > > def doReallyComplexThings(self, foo): ... > > def __add__(self, other): ... > > def __mul__(self, other): ... > > > > unsuspecting_dict = {1:[1,2],2:3} > > > > unsuspecting_dict.__class__ = AlgebraicDict > > > TypeError: __class__ assignment: only for heap types > > But in this case, the instance layout of dict and AlgebraicDict is > different anyway (AlgebraicDict has room for the __dict__ to contain > instance variables) so you can't do that. > > The layouts could be the same if you add __slots__ = [] to > AlgebraicDict. Again, I was not 100% clear. AlgebraicDict and AlgebraicList do define __slots__ to be an empty sequence. > But the problem is that even then, AlgebraicDict may be allocated > using a different free list than dict, and changing its __class__ to > dict would free it using the wrong free list. > > To work around, create a neutral dict subclass that has the same > layout as AlgebraicDict. The problem is that I do not control where the dict is being allocated. I suppose I can live with having to make copies, but it seems like a major step backwards to me.
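A minimal sketch of the two workarounds mentioned above: copying into the subclass, and going through a neutral dict subclass with the same layout. NeutralDict and the body of __add__ are illustrative assumptions, not code from this thread, and the exact TypeError message differs between Python versions.

```python
class NeutralDict(dict):
    # Hypothetical "neutral" subclass: adds no per-instance state.
    __slots__ = ()


class AlgebraicDict(dict):
    __slots__ = ()

    def __add__(self, other):
        # Illustrative only: merge two mappings into a new AlgebraicDict.
        merged = AlgebraicDict(self)
        merged.update(other)
        return merged


unsuspecting_dict = {1: [1, 2], 2: 3}

# Retyping a built-in dict in place raises TypeError.
try:
    unsuspecting_dict.__class__ = AlgebraicDict
    retyped_in_place = True
except TypeError:
    retyped_in_place = False

# Workaround 1: copy the data into the subclass.
copied = AlgebraicDict(unsuspecting_dict)

# Workaround 2: allocate as a neutral subclass, then switch classes.
neutral = NeutralDict(unsuspecting_dict)
neutral.__class__ = AlgebraicDict
```

The second route only helps when the code that allocates the dict is under your control, which is exactly the limitation raised above.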
Thanks, -Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From walter@livinglogic.de Fri Jan 10 13:16:16 2003 From: walter@livinglogic.de (Walter Dörwald) Date: Fri, 10 Jan 2003 14:16:16 +0100 Subject: [Python-Dev] no expected test output for test_sort? In-Reply-To: <007001c2b7f5$5f8ad600$125ffea9@oemcomputer> References: <15892.37519.804602.556008@montanaro.dyndns.org> <15892.40582.284715.290847@montanaro.dyndns.org> <15892.56781.473869.187757@montanaro.dyndns.org> <3E1B0287.6080904@livinglogic.de> <200301080134.h081YSK07358@pcp02138704pcs.reston01.va.comcast.net> <3E1C01FA.9020602@livinglogic.de> <200301081221.h08CLHG08953@pcp02138704pcs.reston01.va.comcast.net> <3E1C6D14.7080908@livinglogic.de> <007001c2b7f5$5f8ad600$125ffea9@oemcomputer> Message-ID: <3E1EC7A0.1050306@livinglogic.de> Raymond Hettinger wrote: > [Walter] > >>>>So should I go on with this? Do we want to change all tests before 2.3 >>>>is finished, or start changing them after 2.3 is released >>>>(or something in between)? >>> > [GvR] > >>>I'd say something in between. It's probably good if someone (not me) >>>reviews your patches. > > > [Walter] > >>Definitely. Any volunteers? > > > I'll review a few of these but cannot sign-up for the whole ball game > because of time constraints, inability to run certain modules, and > trying to review only things I thoroughly understand. And I won't change any tests that I don't understand, so we probably won't change the whole test suite. > My own hope for your project is that new tests are made, coverage > increased, interfaces are checked to the document specification, and > bugs are found. I'm much less enthusiastic about having existing > tests converted to unittest format. The old version of test_pow.py had the following code print 'The number in both columns should match.'
print `pow(3,3) % 8`, `pow(3,3,8)` Something like this is nearly impossible to spot in the output of regrtest.py. Porting all tests to PyUnit makes the output more uniform and it's easier to spot problems. Furthermore, porting the old tests to PyUnit gives hints about what features are still missing from PyUnit: For example I've missed a method assertFuzzyEqual(), that does the equivalent of test.test_support.fcmp(), and an assertIs() would also be helpful. Bye, Walter Dörwald From gmcm@hypernet.com Fri Jan 10 13:23:06 2003 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 10 Jan 2003 08:23:06 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: References: <3E1DAFFB.5090202@v.loewis.de> Message-ID: <3E1E82EA.14254.52684CF4@localhost> On 9 Jan 2003 at 14:23, Tim Peters wrote: > Someone who cares about multiple interpreter states > should feel free to define and propose a stronger > requirement. However, the people pushing for change > here have explicitly disavowed interest in multiple > interpreter states, and I'm happy to press on > leaving them for afterthoughts. I have used multiple interpreter states, but not because I wanted to. Consider in-process COM servers implemented in Python. When an application asks for the COM server, the COM support code will do the *loading* on a thread spawned by the COM support code. When the application *uses* the COM server, it may do so on its own thread (it will certainly do so on a different thread than was used to load the server). Installer freezes in-process COM servers. It does so by using a generic shim dll which gets renamed for each component. Basically, this dll will forward most of the calls on to Mark's PythoncomXX.dll. But it wants to install import hooks. If it is the only Python in the process, everything is fine. But if the application is Python, or the user wants to load more than one Python based COM server, then there already is an interpreter state.
Unfortunately, the shim dll can't get to it, so can't install its import hooks into the right one. At least, that's my recollection of the problems I was having before I gave up. My understanding of COM is relatively superficial. I understand the Python part better, but certainly not completely (there were things I thought *should* work that I couldn't get working). There may even be a reason to want multiple interpreter states. My understanding of COM and threading is not deep enough for me to make a coherent statement of requirements. But pretty clearly Python's thread API doesn't let me get anywhere close to handling multiple Pythons in one process. -- Gordon http://www.mcmillan-inc.com/ From gmcm@hypernet.com Fri Jan 10 13:23:06 2003 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 10 Jan 2003 08:23:06 -0500 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: References: Message-ID: <3E1E82EA.24874.52684CB8@localhost> On 10 Jan 2003 at 10:39, Martin v. Löwis wrote: > Without looking at either Boost or ACE, I would > guess that neither will help much: We would be > looking for TLS support for AtheOS, BeOS, cthreads, > lwp, OS/2, GNU pth, Solaris threads, SGI threads, > and Win CE. I somewhat doubt that either Boost or > ACE aim for such a wide coverage. I've heard the claim that ACE runs on more platforms than Java. -- Gordon http://www.mcmillan-inc.com/ From martin@v.loewis.de Fri Jan 10 13:47:58 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Fri, 10 Jan 2003 14:47:58 +0100 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: <3E1E82EA.24874.52684CB8@localhost> References: <3E1E82EA.24874.52684CB8@localhost> Message-ID: <3E1ECF0E.6040804@v.loewis.de> Gordon McMillan wrote: > I've heard the claim that ACE runs on more platforms > than Java.
See http://www.cs.wustl.edu/~schmidt/ACE-overview.html That claim may come from the support for RTOSs (such as pSOS), and they may also have counted older versions of systems which Java hasn't been ported to (such HPUX 9). However, the minority thread libraries that Python supports (AtheOS, OS/2), appear to be unsupported in ACE. I agree with Tim's statement that there is no real problem in breaking support for these systems - if somebody cares about them, somebody will fix it, else we can rip it out. I was just responding to the claim that looking elsewhere may help. Regards, Martin From jeremy@alum.mit.edu Fri Jan 10 13:58:32 2003 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 10 Jan 2003 08:58:32 -0500 Subject: [Python-Dev] DUP_TOPX In-Reply-To: <003201c2b88f$70e3b300$d20ca044@oemcomputer> References: <003201c2b88f$70e3b300$d20ca044@oemcomputer> Message-ID: <15902.53640.150713.742913@slothrop.zope.com> >>>>> "RH" == Raymond Hettinger writes: RH> Also, we wanted to know if anyone still had a use for the RH> LLTRACE facility built into ceval.c. It's been there since '92 RH> and may possibly not longer be of value. I have used it in the last six months. Very handy for debugging. It would be a shame to lose it. Jeremy From theller@python.net Fri Jan 10 14:00:57 2003 From: theller@python.net (Thomas Heller) Date: 10 Jan 2003 15:00:57 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E1E82EA.14254.52684CF4@localhost> References: <3E1DAFFB.5090202@v.loewis.de> <3E1E82EA.14254.52684CF4@localhost> Message-ID: "Gordon McMillan" writes: > On 9 Jan 2003 at 14:23, Tim Peters wrote: > > > Someone who cares about multiple interpreter states > > should feel free to define and propose a stronger > > requirement. However, the people pushing for change > > here have explicitly disavowed interest in multiple > > interpreter states, and I'm happy to press on > > leaving them for afterthoughts. 
> > I have used multiple interpreter states, but not > because I wanted to. > > Consider in-process COM servers implemented > in Python. When an application asks for the COM > server, the COM support code will do the *loading* > on a thread spawned by the COM support code. > When the application *uses* the COM server, it may > do so on it's own thread (it will certainly do so on > a different thread than was used to load the server). > > Installer freezes in-process COM servers. It does > so by using a generic shim dll which gets renamed > for each component. Basically, this dll will forward > most of the calls on to Mark's PythoncomXX.dll. > But it wants to install import hooks. > > If it is the only Python in the process, everything > is fine. But if the application is Python, or the user > wants to load more than one Python based COM > server, then there already is an interpreter state. > Unfortunately, the shim dll can't get to it, ... I cannot really believe this. Isn't it the same as for normal, unfrozen inprocess COM servers? The shim dll could do the same as pythoncom22.dll does, or even rely on it to do the right thing. Unfrozen inproc COM works whether the main process is Python or not. > ... so can't > install it's import hooks into the right one. IMO, it's the frozen DLL rendering the Python environment unusable for everything else (the main process, for example). I hope using the frozen module mechanism instead of import hooks will make this more tolerant. All this may of course be off-topic for this thread. Thomas From gmcm@hypernet.com Fri Jan 10 14:35:03 2003 From: gmcm@hypernet.com (Gordon McMillan) Date: Fri, 10 Jan 2003 09:35:03 -0500 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: References: <3E1E82EA.14254.52684CF4@localhost> Message-ID: <3E1E93C7.25009.52AA2B49@localhost> On 10 Jan 2003 at 15:00, Thomas Heller wrote: > "Gordon McMillan" writes: [...] > > Installer freezes in-process COM servers. 
It does so > > by using a generic shim dll which gets renamed for > > each component. Basically, this dll will forward most > > of the calls on to Mark's PythoncomXX.dll. But it > > wants to install import hooks. [...] > I cannot really believe this. Isn't it the same as > for normal, unfrozen inprocess COM servers? No. COM always loads pythoncom22 in that case. Note that a Python22 app can load a frozen Python21 COM server just fine. > The > shim dll could do the same as pythoncom22.dll does, > or even rely on it to do the right thing. That's what it tries to do. It loads pythoncomXX.dll and forwards all the calls it can. > Unfrozen inproc COM works whether the main process is > Python or not. Yes, pythoncom doesn't install import hooks. > > ... so can't > > install it's import hooks into the right one. > > IMO, it's the frozen DLL rendering the Python > environment unusable for everything else (the main > process, for example). I don't understand that statement at all. Working with a (same version) Python app is actually a secondary worry. I'm more bothered that, for example, Excel can't load 2 frozen servers which use the same Python. > I hope using the frozen module mechanism instead of > import hooks will make this more tolerant. But where are those modules frozen? How do they get installed in the already running Python? What if mulitple sets of frozen modules (with dups) want to install themselves? > All this may of course be off-topic for this thread. It ties into Martin's earlier comments about threading models. It may be that the solution lies in using COM's apartment threading, instead of free threading. That way, the COM server could have it's own interpreter state, and the calls would end up in the right interpreter. Maybe. But I don't understand the COM part well enough, and Mark's stuff supports free threading, not apartment threading. 
I really brought all this up to try to widen the scope from extension modules which can easily grab an interpreter state and hold onto it. -- Gordon http://www.mcmillan-inc.com/ From pyth@trillke.net Fri Jan 10 14:41:21 2003 From: pyth@trillke.net (holger krekel) Date: Fri, 10 Jan 2003 15:41:21 +0100 Subject: [Python-Dev] [ann] Minimal Python project Message-ID: <20030110154121.F1568@prim.han.de> Minimal Python Discussion, Coding and Sprint -------------------------------------------- We announce a mailinglist dedicated to developing a "Minimal Python" version. Minimal means that we want to have a very small C-core and as much as possible (re)implemented in python itself. This includes (parts of) the VM-Code. Building on the expected gains in flexibility we also hope to use distribution techniques such as PEP302 to ship only a minimal set of modules and load needed modules on demand. As Armin Rigo of PSYCO fame takes part in the effort, we are confident that MinimalPython will eventually run faster than today's CPython. And because Christian Tismer takes part, we are confident that we will find a radical enough approach which also fits Stackless :-) We are very interested in learning about and integrating prior art. And in hearing any doubtful or reinforcing opinions. Expertise is welcomed in all areas. So if you have an interest or even code regarding 'minimal python' please join the list: http://codespeak.net/mailman/listinfo/pypy-dev discussions previously took place on python-de and in private mails. We will repost some core ideas (in english) for further discussion. In a few months we'd like to do a one-week Sprint in Germany focusing on releasing first bits of Minimal Python. best wishes, Armin Rigo, Christian Tismer, Holger Krekel From guido@python.org Fri Jan 10 14:56:04 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 10 Jan 2003 09:56:04 -0500 Subject: [Python-Dev] DUP_TOPX In-Reply-To: Your message of "Fri, 10 Jan 2003 05:02:52 EST." 
<003201c2b88f$70e3b300$d20ca044@oemcomputer> References: <003201c2b88f$70e3b300$d20ca044@oemcomputer> Message-ID: <200301101456.h0AEu4h11883@odiug.zope.com> > During a code review, Neal Norwitz noticed that ceval.c > defines DUP_TOPX for x in (1,2,3,4,5) but that compile.c > never generates that op code with a parameter greater than > three. > > The question of the day is whether anyone knows of a > reason that we can't or shouldn't remove the code for > the 4 and 5 cases. Is there anything else (past or present) > that can generate this opcode? Not that I know of. Now's the time to find out, so let's drop these. > Taking it out is only a microscopic win, a few saved brain > cycles and a smaller byte size for the main eval loop > (making it slightly more likely to stay in cache). > > Also, we wanted to know if anyone still had a use for the > LLTRACE facility built into ceval.c. It's been there since > '92 and may possibly not longer be of value. Haven't used in 5 years I think, so if you want to lose it, be my guest. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Fri Jan 10 15:08:54 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 10 Jan 2003 10:08:54 -0500 Subject: [Python-Dev] DUP_TOPX In-Reply-To: <15902.53640.150713.742913@slothrop.zope.com> Message-ID: Note that someone (not me -- never tried it) thought enough of LLTRACE to document it, in Misc/SpecialBuilds.txt. When I created that file, I warned that any special build not documented there was fair game for removal, so *someone* must want it, and badly enough to write a paragraph . From guido@python.org Fri Jan 10 16:09:35 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 10 Jan 2003 11:09:35 -0500 Subject: [Python-Dev] DUP_TOPX In-Reply-To: Your message of "Fri, 10 Jan 2003 08:58:32 EST." 
<15902.53640.150713.742913@slothrop.zope.com> References: <003201c2b88f$70e3b300$d20ca044@oemcomputer> <15902.53640.150713.742913@slothrop.zope.com> Message-ID: <200301101609.h0AG9Zx12573@odiug.zope.com> > RH> Also, we wanted to know if anyone still had a use for the > RH> LLTRACE facility built into ceval.c. It's been there since '92 > RH> and may possibly not longer be of value. > > I have used it in the last six months. Very handy for debugging. It > would be a shame to lose it. OK, then let's keep it. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 10 16:11:24 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 10 Jan 2003 11:11:24 -0500 Subject: [Python-Dev] [ann] Minimal Python project In-Reply-To: Your message of "Fri, 10 Jan 2003 15:41:21 +0100." <20030110154121.F1568@prim.han.de> References: <20030110154121.F1568@prim.han.de> Message-ID: <200301101611.h0AGBOd12601@odiug.zope.com> Way cool. Would anybody be available to present some of these ideas and early results at PyCon? --Guido van Rossum (home page: http://www.python.org/~guido/) From dave@boost-consulting.com Fri Jan 10 16:46:34 2003 From: dave@boost-consulting.com (David Abrahams) Date: Fri, 10 Jan 2003 11:46:34 -0500 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL References: <002001c2b842$a26cb220$530f8490@eden> Message-ID: "Mark Hammond" writes: >> >> Then of course you know more than Tim would grant you: you >> >> do have an >> >> interpreter state, and hence you can infer that Python has been >> >> initialized. So I infer that your requirements are different >> >> from Tim's. > >> > Sheesh - lucky this is mildly entertaining . You are free to >> > infer what you like, but I believe it is clear and would prefer to >> > see a single other person with a problem rather than continue >> > pointless semantic games. > >> In this instance, it looks to me like Martin makes a good point. 
If >> I'm missing something, I'd appreciate an explanation. > > There was no requirement that identical code be used in all cases. Checking > if Python is initialized is currently trivial, and requires no special > inference skills. It is clear that some consideration will need to be given > to the PyInterpreterState used for all this, but that is certainly > tractable - every single person who has spoken up with this requirement to > date has indicated that their application does not need multiple interpreter > states - so explicitly ignoring that case seems fine. I understand now, thanks. -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From dave@boost-consulting.com Fri Jan 10 17:30:35 2003 From: dave@boost-consulting.com (David Abrahams) Date: Fri, 10 Jan 2003 12:30:35 -0500 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: (martin@v.loewis.de's message of "10 Jan 2003 10:39:06 +0100") References: <006201c2b76b$2e04ded0$530f8490@eden> <20030109130008.H349@prim.han.de> Message-ID: martin@v.loewis.de (Martin v. Löwis) writes: > David Abrahams writes: > >> We also have a TSS implementation in the Boost.Threads library. I >> haven't looked at the ACE code myself, but I've heard that every >> component depends on many others, so it might be easier to extract >> useful information from the Boost implementation. > > Without looking at either Boost or ACE, I would guess that neither > will help much: We would be looking for TLS support for AtheOS, BeOS, > cthreads, lwp, OS/2, GNU pth, Solaris threads, SGI threads, and Win > CE. I somewhat doubt that either Boost or ACE aim for such a wide > coverage. Boost covers only pthreads and Win32 at the moment. I thought I understood Tim to be saying that all of the other ones should be considered broken in Python anyway until proven otherwise, which is why I bothered to mention it.
-- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From dave@boost-consulting.com Fri Jan 10 17:36:59 2003 From: dave@boost-consulting.com (David Abrahams) Date: Fri, 10 Jan 2003 12:36:59 -0500 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: <009001c2b8a7$ab8b2f70$530f8490@eden> ("Mark Hammond"'s message of "Fri, 10 Jan 2003 23:56:18 +1100") References: <009001c2b8a7$ab8b2f70$530f8490@eden> Message-ID: "Mark Hammond" writes: > [Martin] >> David Abrahams writes: >> >> > We also have a TSS implementation in the Boost.Threads library. I >> > haven't looked at the ACE code myself, but I've heard that every >> > component depends on many others, so it might be easier to extract >> > useful information from the Boost implementation. >> >> Without looking at either Boost or ACE, I would guess that neither >> will help much: We would be looking for TLS support for AtheOS, BeOS, >> cthreads, lwp, OS/2, GNU pth, Solaris threads, SGI threads, and Win >> CE. I somewhat doubt that either Boost or ACE aim for such a wide >> coverage. > > We could simply have a "pluggable TLS" design. > > It seems that everyone who has this requirement is interfacing to a > complex library (com, xpcom, Boost, ACE), and in general, these > libraries also require TLS. Boost isn't in that category. Boost provides a threading library to establish a platform-independent C++ interface for threading, but to date none of the other Boost libraries depend on the use of Boost.Threads. In other words, Boost doesn't require TLS, but it can provide TLS ;-) > So consider an API such as: > > PyTSM_HANDLE Py_InitThreadStateManager( > void (*funcTLSAlloc)(...), > ... > PyInterpreterState = NULL); > > void Py_EnsureReadyToRock(PyTSM_HANDLE); > void Py_DoneRocking(PyTSM_HANDLE); > ...
> > Obviously the spelling is drastically different, but the point is > that we can lean on the extension module itself, rather than the > platform, to provide the TLS. In the case of Windows and a number > of other OS's, you could fallback to a platform implementation if > necessary, but in the case of xpcom, for example, you know that > xpcom also defines its own TLS API, so anywhere we need the > extension module, TLS comes for "free", even if no one has ported > the platform TLS API to the Python TLS API. Our TLS requirements > are very simple, and could be "spelt" in a small number of function > pointers. I take it you are planning to provide a way to get the necessary TLS from Python's API (in case it isn't lying about elsewhere), but not necessarily port it to every platform? If so, that sounds like a fine approach. -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From paul@pfdubois.com Fri Jan 10 17:43:19 2003 From: paul@pfdubois.com (Paul F Dubois) Date: Fri, 10 Jan 2003 09:43:19 -0800 Subject: [Python-Dev] Parallel pyc construction Message-ID: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> On a 384 processor job we have once again encountered that old question of corrupted .pyc files, sometimes resulting in an error, sometimes in a silent wrong behavior later. I know this was allegedly fixed previously but it looks like it doesn't really work. We lost a couple of weeks work this time. Didn't we talk about an option to not make pyc files? I can't seem to find it. (We totally don't care about the cost of imports. The documentation mentions "ihooks" but not the module itself. I know that importing has been an area of great turmoil so I don't really know where to look.) I couldn't even find the list of command-line options for Python itself except a mention of -c in the tutorial. Any pointers would be appreciated.
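
A possible stopgap for the report above, offered only as a sketch: if the corruption really does come from many processes trying to write the same .pyc at the same moment (an assumption; the message does not establish the cause), one process could byte-compile the whole tree before the parallel job starts, so the workers find up-to-date bytecode and have nothing to write. The helper name precompile_tree and the demo module worker_mod.py are made up for illustration; compileall.compile_dir is the standard library call. The code is written for a current Python 3 interpreter, not the 2.2 of this thread.

```python
import compileall
import os
import tempfile


def precompile_tree(source_dir):
    """Byte-compile every .py file under source_dir in this one process.

    Meant to run once, before any parallel workers start importing,
    so that no worker needs to write a .pyc itself.
    Returns True if every file compiled without error.
    """
    # quiet=1 suppresses the per-file listing but still reports errors.
    return bool(compileall.compile_dir(source_dir, quiet=1))


# Small demonstration on a throwaway directory (hypothetical module name).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "worker_mod.py"), "w") as f:
    f.write("ANSWER = 6 * 7\n")

ok = precompile_tree(demo_dir)

# Collect whatever bytecode files the compile step produced.
compiled = [
    name
    for _root, _dirs, files in os.walk(demo_dir)
    for name in files
    if name.endswith(".pyc")
]
```

This does not remove the need for a real "don't write bytecode" switch; it only narrows the window in which several processes would want to write the same file.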
From altis@semi-retired.com Fri Jan 10 17:54:45 2003 From: altis@semi-retired.com (Kevin Altis) Date: Fri, 10 Jan 2003 09:54:45 -0800 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> Message-ID: > -----Original Message----- > From: Paul F Dubois > > On a 384 processor job we have once again encountered that old question of > corrupted .pyc files, sometimes resulting in an error, sometimes > in a silent > wrong behavior later. I know this was allegedly fixed previously but it > looks like it doesn't really work. We lost a couple of weeks work > this time. > > Didn't we talk about an option to not make pyc files? I can't seem to find > it. (We totally don't care about the cost of imports. The documentation > mentions "ihooks" but not the module itself. I know that > importing has been > an area of create turmoil so I don't really know where to look.) > I couldn't > even find the list of command-line options for Python itself except a > mention of -c in the tutorial. Any pointers would be appreciated. I hadn't considered the option of not making .pyc files, though I've complained about .pyo files in the past with distutils, but now compilation is optional there. The .pyc and .pyo files certainly clutter a directory. If there is no significant performance improvement for loading and using .pyc files or the difference is only significant for large files or certain code constructs, maybe they shouldn't be automatically created. I guess this is another area for test cases. ka From tim.one@comcast.net Fri Jan 10 17:51:48 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 10 Jan 2003 12:51:48 -0500 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> Message-ID: [Paul F Dubois] > ... > I couldn't even find the list of command-line options for Python itself > except a mention of -c in the tutorial. Any pointers would be appreciated. 
Do python -h On Unixish boxes I believe there's also a man page (Misc/python.man). From altis@semi-retired.com Fri Jan 10 18:10:27 2003 From: altis@semi-retired.com (Kevin Altis) Date: Fri, 10 Jan 2003 10:10:27 -0800 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: Message-ID: I forgot to say that the file clutter wouldn't be a problem if _pyc and _pyo sub-directories (or .pyc and .pyo to hide them in Unix) were automatically created and the files stuck in there, but I'm sure that would end up screwing up something else that relies on the .pyc and .pyo files being in the same directory as the .py or .pyw files. ka > -----Original Message----- > From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On > Behalf Of Kevin Altis > Sent: Friday, January 10, 2003 9:55 AM > To: python-dev@python.org > Subject: RE: [Python-Dev] Parallel pyc construction > > > > -----Original Message----- > > From: Paul F Dubois > > > > On a 384 processor job we have once again encountered that old > question of > > corrupted .pyc files, sometimes resulting in an error, sometimes > > in a silent > > wrong behavior later. I know this was allegedly fixed previously but it > > looks like it doesn't really work. We lost a couple of weeks work > > this time. > > > > Didn't we talk about an option to not make pyc files? I can't > seem to find > > it. (We totally don't care about the cost of imports. The documentation > > mentions "ihooks" but not the module itself. I know that > > importing has been > > an area of create turmoil so I don't really know where to look.) > > I couldn't > > even find the list of command-line options for Python itself except a > > mention of -c in the tutorial. Any pointers would be appreciated. > > I hadn't considered the option of not making .pyc files, though I've > complained about .pyo files in the past with distutils, but now > compilation > is optional there. The .pyc and .pyo files certainly clutter a > directory. 
If > there is no significant performance improvement for loading and using .pyc > files or the difference is only significant for large files or > certain code > constructs, maybe they shouldn't be automatically created. I guess this is > another area for test cases. > > ka From aahz@pythoncraft.com Fri Jan 10 18:11:11 2003 From: aahz@pythoncraft.com (Aahz) Date: Fri, 10 Jan 2003 13:11:11 -0500 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> References: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> Message-ID: <20030110181111.GA16655@panix.com> On Fri, Jan 10, 2003, Paul F Dubois wrote: > > Didn't we talk about an option to not make pyc files? I can't seem > to find it. (We totally don't care about the cost of imports. The > documentation mentions "ihooks" but not the module itself. I know that > importing has been an area of create turmoil so I don't really know > where to look.) I couldn't even find the list of command-line options > for Python itself except a mention of -c in the tutorial. Any pointers > would be appreciated. Why not make the .py directory read-only? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "I used to have a .sig but I found it impossible to please everyone..." --SFJ From neal@metaslash.com Fri Jan 10 18:11:47 2003 From: neal@metaslash.com (Neal Norwitz) Date: Fri, 10 Jan 2003 13:11:47 -0500 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> References: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> Message-ID: <20030110181147.GB29873@epoch.metaslash.com> On Fri, Jan 10, 2003 at 09:43:19AM -0800, Paul F Dubois wrote: > On a 384 processor job we have once again encountered that old question of > corrupted .pyc files, sometimes resulting in an error, sometimes in a silent > wrong behavior later. I know this was allegedly fixed previously but it > looks like it doesn't really work. We lost a couple of weeks work this time. 
> > Didn't we talk about an option to not make pyc files? I can't seem to find > it. (We totally don't care about the cost of imports. The documentation > mentions "ihooks" but not the module itself. I know that importing has been > an area of create turmoil so I don't really know where to look.) I couldn't > even find the list of command-line options for Python itself except a > mention of -c in the tutorial. Any pointers would be appreciated. A while ago I fixed a problem when there were more than 64k items used to create a list. The fix went into 2.2.2 I believe. For 2.3 some sizes were increased from 2 to 4 bytes so the problem shouldn't occur. Here's the bug: http://python.org/sf/561858 There is a bug (aka feature request) assigned to me: http://python.org/sf/602345 option for not writing .py[co] files I haven't done anything with it yet. Feel free to submit a patch. What version of python had this problem? Can you make a test case? Neal From guido@python.org Fri Jan 10 18:29:41 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 10 Jan 2003 13:29:41 -0500 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: Your message of "Fri, 10 Jan 2003 09:43:19 PST." <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> References: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> Message-ID: <200301101829.h0AITfZ13115@odiug.zope.com> > On a 384 processor job we have once again encountered that old question of > corrupted .pyc files, sometimes resulting in an error, sometimes in a silent > wrong behavior later. I know this was allegedly fixed previously but it > looks like it doesn't really work. We lost a couple of weeks work this time. > > Didn't we talk about an option to not make pyc files? I can't seem to find > it. (We totally don't care about the cost of imports. The documentation > mentions "ihooks" but not the module itself. I know that importing has been > an area of create turmoil so I don't really know where to look.) 
I couldn't > even find the list of command-line options for Python itself except a > mention of -c in the tutorial. Any pointers would be appreciated. I don't think we have such an option, but it's a good idea. If you submit a patch, we'll add it. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 10 18:30:57 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 10 Jan 2003 13:30:57 -0500 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: Your message of "Fri, 10 Jan 2003 09:54:45 PST." References: Message-ID: <200301101830.h0AIUvQ13135@odiug.zope.com> > I hadn't considered the option of not making .pyc files, though I've > complained about .pyo files in the past with distutils, but now > compilation is optional there. The .pyc and .pyo files certainly > clutter a directory. If there is no significant performance > improvement for loading and using .pyc files or the difference is > only significant for large files or certain code constructs, maybe > they shouldn't be automatically created. I guess this is another > area for test cases. Oh, in most cases .pyc/.pyo files *do* give significant speedup; the parser + bytecode compiler are really slow. It's just that Paul's machine is so fast and his program runs so long that he doesn't care. But many people do. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Jan 10 18:37:14 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 10 Jan 2003 13:37:14 -0500 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: Your message of "Fri, 10 Jan 2003 13:11:11 EST." <20030110181111.GA16655@panix.com> References: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> <20030110181111.GA16655@panix.com> Message-ID: <200301101837.h0AIbES13212@odiug.zope.com> > Why not make the .py directory read-only? Excellent suggestion for a work-around. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From just@letterror.com Fri Jan 10 18:04:17 2003 From: just@letterror.com (Just van Rossum) Date: Fri, 10 Jan 2003 19:04:17 +0100 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: Message-ID: Kevin Altis wrote: > I hadn't considered the option of not making .pyc files, though I've > complained about .pyo files in the past with distutils, but now > compilation is optional there. The .pyc and .pyo files certainly > clutter a directory. If there is no significant performance > improvement for loading and using .pyc files or the difference is > only significant for large files or certain code constructs, maybe > they shouldn't be automatically created. I guess this is another area > for test cases. There *is* a significant performance improvement (otherwise I doubt we'd have .pyc files in the first place ;-), but it only improves startup time. So it can make a big difference for short running processes, yet can be totally irrelevant for long running processes. Just From martin@v.loewis.de Fri Jan 10 19:25:38 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Fri, 10 Jan 2003 20:25:38 +0100 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: References: <006201c2b76b$2e04ded0$530f8490@eden> <20030109130008.H349@prim.han.de> Message-ID: <3E1F1E32.50800@v.loewis.de> David Abrahams wrote: > Boost covers only pthreads and Win32 at the moment. I thought I > understood Tim to be saying that all of the other ones should be > considered broken in Python anyway until proven otherwise, which is > why I bothered to mention it. Right. My response was really addressing Holger's suggestion that support for those platforms can be copied from ACE, and your suggestion that this support is better copied from Boost. Neither will help for the platforms for which Tim is willing to say that they become broken.
Regards, Martin From python-kbutler@sabaydi.com Fri Jan 10 19:41:04 2003 From: python-kbutler@sabaydi.com (Kevin J. Butler) Date: Fri, 10 Jan 2003 12:41:04 -0700 Subject: [Python-Dev] Re: [ann] Minimal Python project Message-ID: <3E1F21D0.5060904@sabaydi.com> > > >From: holger krekel > >We announce a mailinglist dedicated to developing >a "Minimal Python" version. Minimal means that >we want to have a very small C-core and as much >as possible (re)implemented in python itself. This >includes (parts of) the VM-Code. > >From: Guido van Rossum >Way cool. > > +1 I've been thinking of proposing a very similar thing - though I was thinking "Python in Python" which suggest all sorts of interesting logo ideas. :-) >We are very interested in learning about and >integrating prior art. And in hearing any >doubtful or reinforcing opinions. Expertise >is welcomed in all areas. > The Squeak Smalltalk implementation is interesting & relevant: http://www.squeak.org/features/vm.html The Squeak VM is written in a subset of Smalltalk ("Slang", different from "S-lang") that can be translated directly to C. This core provides the interpreter for the rest of the language, allowing the entire system to be very portable, and it facilitates development in many ways - you get to work on the core while working in your favorite language, you get to use all your favorite tools, etc. Plus I expect the translatable subset provides a solid, simple basis for integrating external code. I think a similar approach would be very useful in Minimal Python (... in Python), probably adopting ideas from Psyco http://psyco.sourceforge.net/ and/or Pyrex http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/ as the foundation for the "compilable subset". This would also provide a nice basis for a Jython implementation... 
Can-I-play-with-it-yet?-ly y'rs, kb From theller@python.net Fri Jan 10 20:09:26 2003 From: theller@python.net (Thomas Heller) Date: 10 Jan 2003 21:09:26 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E1E93C7.25009.52AA2B49@localhost> References: <3E1E82EA.14254.52684CF4@localhost> <3E1E93C7.25009.52AA2B49@localhost> Message-ID: <3co0g421.fsf@python.net> "Gordon McMillan" writes: > > Working with a (same version) Python app is actually > a secondary worry. I'm more bothered that, for > example, Excel can't load 2 frozen servers which > use the same Python. A COM component is useless IMO if it restricts which other components you can use, or which client you use, and that's why I didn't allow inproc COM servers in py2exe up to now. But, since this problem doesn't occur with nonfrozen servers, it seems the import hooks are the problem. > > > I hope using the frozen module mechanism instead of > > import hooks will make this more tolerant. > > But where are those modules frozen? How do they > get installed in the already running Python? What > if mulitple sets of frozen modules (with dups) want > to install themselves? I hope one could extend the FrozenModule table in an already running Python by adding more stuff to it. Isn't there already code in cvs which allows this? > It ties into Martin's earlier comments about > threading models. It may be that the solution > lies in using COM's apartment threading, > instead of free threading. That way, the COM > server could have it's own interpreter state, and > the calls would end up in the right interpreter. > Maybe. > > But I don't understand the COM part well enough, > and Mark's stuff supports free threading, not > apartment threading. Last I checked, win32all registers the components as ThreadingModel = both. As I understand this, both STA and MTA is supported (STA = Single Threaded Apartment, MTA = MultiThreaded Appartment). So marking them as STA should be safe if it is needed. 
Maybe Mark can clear the confusion? > > I really brought all this up to try to widen the > scope from extension modules which can > easily grab an interpreter state and hold > onto it. > Thomas From just@letterror.com Fri Jan 10 20:26:45 2003 From: just@letterror.com (Just van Rossum) Date: Fri, 10 Jan 2003 21:26:45 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3co0g421.fsf@python.net> Message-ID: Thomas Heller wrote: > I hope one could extend the FrozenModule table in an already running > Python by adding more stuff to it. Isn't there already code in cvs > which allows this? There was code from me that would let you do that from Python, but I ripped it out as PEP 302 makes it unnecessary. However, it's possible from C (and always has been, it's the trick that Anthony Tuininga used in his freeze-like tool). Normally, PyImport_FrozenModules points to a static array, but there's nothing against setting it to a heap array. The fact that it's not a Python object and that the array elements aren't Python objects makes it a little messy, though (another reason why I backed out the Python interface to it). Just From theller@python.net Fri Jan 10 20:53:02 2003 From: theller@python.net (Thomas Heller) Date: 10 Jan 2003 21:53:02 +0100 Subject: [Python-Dev] sys.path[0] In-Reply-To: <200301081227.h08CRGY08969@pcp02138704pcs.reston01.va.comcast.net> References: <200301081227.h08CRGY08969@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [ping] > > Exactly for this reason, changing the working directory confuses > > inspect and pydoc and presumably anything else that tries to find > > source code. There's no way to work around this because the true > > path information is simply not available, unless we fix the > > __file__ attribute. > > > > I'd be in favour of setting all __file__ attributes to absolute paths. 
> [guido] > Note that code objects have their own filename attribute, which is not > directly related to __file__, and that's the one that causes the most > problems. I truly wish we could change marshal so that when it loads > a code object, it replaces the filename attribute with the filename > from which the object is loaded, but that's far from easy. :-( > > > > > I'm disinclined to do anything about this, except perhaps warn that > > > > the script directory may be given as a relative path. > > > > The current working directory is a piece of hidden global state. > > In general, hidden global state is bad, and this particular piece > > of state is especially important because it affects what Python > > modules get loaded. I'd prefer for the interpreter to just set > > up sys.path correctly to begin with -- what's the point in doing > > it ambiguously only to fix it up later anyway? > > Maybe things have changed, but in the past I've been bitten by > absolute path conversions. E.g. I rememeber from my time at CWI that > automount caused really ugly long absulute paths that everybody > hated. Also, there are conditions under which getcwd() can fail (when > an ancestor directory doesn't have enough permissions) so the code > doing so must be complex. > > That said, I'd be +0 if someone gave me a patch that fixed the path of > the script (the only path that's not already absolutized by site.py). I've reopened http://www.python.org/sf/664376 and uploaded an implementation for linux and maybe other systems where the realpath function is available. I'd appreciate some help because the patch is not complete. IMO it makes no sense to fix this on windows and not on other systems. 
Thomas From bac@OCF.Berkeley.EDU Fri Jan 10 21:01:08 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Fri, 10 Jan 2003 13:01:08 -0800 (PST) Subject: [Python-Dev] [ann] Minimal Python project In-Reply-To: <20030110154121.F1568@prim.han.de> References: <20030110154121.F1568@prim.han.de> Message-ID: [holger krekel] > Minimal Python Discussion, Coding and Sprint > -------------------------------------------- > > We announce a mailinglist dedicated to developing > a "Minimal Python" version. Minimal means that > we want to have a very small C-core and as much > as possible (re)implemented in python itself. This > includes (parts of) the VM-Code. > I can see this possibly being a good learning experience for people wanting to get into Python core programming. Obviously it won't be the same as CPython, but since it will need to stay compatible it could easily be a good way to understand the concepts in how the language is designed before jumping into the full-fledged C core; this is especially true if most of it is written in Python. -Brett From bac@OCF.Berkeley.EDU Fri Jan 10 21:08:37 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Fri, 10 Jan 2003 13:08:37 -0800 (PST) Subject: [Python-Dev] Parallel pyc construction In-Reply-To: <200301101829.h0AITfZ13115@odiug.zope.com> References: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> <200301101829.h0AITfZ13115@odiug.zope.com> Message-ID: [Guido van Rossum] > > On a 384 processor job we have once again encountered that old question of > > corrupted .pyc files, sometimes resulting in an error, sometimes in a silent > > wrong behavior later. I know this was allegedly fixed previously but it > > looks like it doesn't really work. We lost a couple of weeks work this time. > > > > Didn't we talk about an option to not make pyc files? I can't seem to find > > it. (We totally don't care about the cost of imports. The documentation > > mentions "ihooks" but not the module itself. 
I know that importing has been > > an area of great turmoil so I don't really know where to look.) I couldn't > > even find the list of command-line options for Python itself except a > > mention of -c in the tutorial. Any pointers would be appreciated. > > I don't think we have such an option, but it's a good idea. If you > submit a patch, we'll add it. > What about PEP 302 and an import hook? Couldn't a custom import hook be written up that didn't output a .pyc file? I would think it could be as simple as finding the file, opening it, and then compiling it as a module and inserting it directly into ``sys.modules``. Wouldn't that circumvent any .py(c|o) writing? Of course this assumes Paul is using 2.3, but even if he isn't, couldn't a solution like that be used, avoiding the need to write a patch for Python (unless you want this in the 2.2 branch)? -Brett From Nicolas.Chauvat@logilab.fr Fri Jan 10 21:19:09 2003 From: Nicolas.Chauvat@logilab.fr (Nicolas Chauvat) Date: Fri, 10 Jan 2003 22:19:09 +0100 Subject: [Python-Dev] Re: [pypy-dev] Re: [ann] Minimal Python project In-Reply-To: <3E1F21D0.5060904@sabaydi.com> References: <3E1F21D0.5060904@sabaydi.com> Message-ID: <20030110211909.GA17151@logilab.fr> On Fri, Jan 10, 2003 at 12:41:04PM -0700, Kevin J. Butler wrote: > >From: holger krekel > > > >We announce a mailinglist dedicated to developing > >a "Minimal Python" version. Minimal means that > >we want to have a very small C-core and as much > >as possible (re)implemented in python itself. This > >includes (parts of) the VM-Code. > > > >From: Guido van Rossum > >Way cool. > +1 +1 > The Squeak Smalltalk implementation is interesting & relevant: > http://www.squeak.org/features/vm.html > > The Squeak VM is written in a subset of Smalltalk ("Slang", different IIRC, this is also the case for Mozart, an implementation of the Oz language.
Cf http://www.mozart-oz.org/ > Can-I-play-with-it-yet?-ly y'rs, I want this and some time to play around with metaprogramming in Python!! -- Nicolas Chauvat http://www.logilab.com - "Mais où est donc Ornicar ?" - LOGILAB, Paris (France) From guido@python.org Fri Jan 10 21:34:45 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 10 Jan 2003 16:34:45 -0500 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: Your message of "Fri, 10 Jan 2003 13:08:37 PST." References: <000201c2b8cf$c60fcae0$6601a8c0@NICKLEBY> <200301101829.h0AITfZ13115@odiug.zope.com> Message-ID: <200301102134.h0ALYjB27131@odiug.zope.com> > What about PEP 302 and an import hook? Couldn't a custom import > hook be written up that didn't output a .pyc file? I would think it > could be as simple as finding the file, opening it, and then > compiling it as a module and inserting it directly into > ``sys.modules``. Wouldn't that circumvent any .py(c|o) writing? I think using an import hook to prevent writing .pyc files is way too much work. You can't use the built-in code because that *does* write the .pyc files, so you have to redo all the work that the standard hook does. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Fri Jan 10 21:32:50 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 10 Jan 2003 15:32:50 -0600 Subject: [Python-Dev] sys.path[0] In-Reply-To: References: <200301081227.h08CRGY08969@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15903.15362.963839.421571@montanaro.dyndns.org> Thomas> I've reopened http://www.python.org/sf/664376 and uploaded an Thomas> implementation for linux and maybe other systems where the Thomas> realpath function is available. I'd appreciate some help Thomas> because the patch is not complete. I just uploaded a somewhat different patch. I wouldn't count on the validity of the changes to sysmodule.c, but it does contain the necessary bits for verifying that realpath() exists.
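For anyone following along without reading the C, the intended effect on sys.path[0] can be sketched in Python. This is only an illustration of the idea, not either of the patches at sf/664376: the helper name absolutize_script_dir is made up, and leaving the entry untouched when the current directory cannot be determined is an assumed policy.

```python
import os


def absolutize_script_dir(path_list):
    """Return a copy of path_list whose first entry is an absolute path.

    Hypothetical helper: it mirrors the idea under discussion (resolve
    the script directory once, at startup), not the actual C patch.
    """
    result = list(path_list)
    if not result:
        return result
    try:
        # realpath() resolves symlinks and makes a relative entry absolute.
        # An empty first entry stands for the current directory.
        result[0] = os.path.realpath(result[0] or os.curdir)
    except OSError:
        # Looking up the current directory can fail (e.g. an unreadable
        # ancestor directory); assumed policy: keep the entry as it was.
        pass
    return result
```

Something like sys.path[:] = absolutize_script_dir(sys.path) early in a script would give a similar result from pure Python, at the cost of running after the interpreter has already started importing.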
Skip From just@letterror.com Fri Jan 10 21:42:53 2003 From: just@letterror.com (Just van Rossum) Date: Fri, 10 Jan 2003 22:42:53 +0100 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: <200301102134.h0ALYjB27131@odiug.zope.com> Message-ID: Guido van Rossum wrote: > > What about PEP 302 and an import hook? Couldn't a custom import > > hook be written up that didn't output a .pyc file? I would think it > > could be as simple as finding the file, opening it, and then > > compiling it as a module and inserting it directly into > > ``sys.modules``. Wouldn't that circumvent any .py(c|o) writing? > > I think using an import hook to prevent writing .pyc files is way too > much work. You can't use the built-in code because that *does* write > the .pyc files, so you have to redo all the work that the standard > hook does. But it's another fine use case for a to-be-written PEP-302-compliant ihooks.py replacement. I'll make a mental note. Just From whisper@oz.net Fri Jan 10 22:12:30 2003 From: whisper@oz.net (David LeBlanc) Date: Fri, 10 Jan 2003 14:12:30 -0800 Subject: [Python-Dev] [ann] Minimal Python project In-Reply-To: Message-ID: I was thinking about something along these lines but with a different slant: an "essential" python that's stripped way down and that could be ROM-able. In the same way that Java is used in the Sharp Zaurus, I think it would be way cool to have an os/python cored PDA. I wonder if it could be done in 16mb or less, 8mb being the next logical number...
David LeBlanc Seattle, WA USA > -----Original Message----- > From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On > Behalf Of Brett Cannon > Sent: Friday, January 10, 2003 13:01 > To: holger krekel > Cc: python-list@python.org; python-dev@python.org > Subject: Re: [Python-Dev] [ann] Minimal Python project > > > [holger krekel] > > > Minimal Python Discussion, Coding and Sprint > > -------------------------------------------- > > > > We announce a mailinglist dedicated to developing > > a "Minimal Python" version. Minimal means that > > we want to have a very small C-core and as much > > as possible (re)implemented in python itself. This > > includes (parts of) the VM-Code. > > > > I can see this possibly being a good learning experience for people > wanting to get into Python core programming. Obviously it won't be the > same as CPython, but since it will need to stay compatible it could easily > be a good way to understand the concepts in how the language is designed > before jumping into the full-fledged C core; this is especially true if > most of it is written in Python. > > -Brett > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From blunck@gst.com Fri Jan 10 22:32:26 2003 From: blunck@gst.com (Christopher Blunck) Date: Fri, 10 Jan 2003 17:32:26 -0500 Subject: [Python-Dev] [ann] Minimal Python project In-Reply-To: <20030110154121.F1568@prim.han.de> References: <20030110154121.F1568@prim.han.de> Message-ID: <20030110223226.GE31668@homer.gst.com> Boss. On one of our projects, we are only allocated a 32Mb footprint in which to fit kernel, root filesystem, essential libs (c and curses for example), and our actual product. A component of our product is a configuration gui that we wrote in py. 
We put a lot of effort into shrinking down py and getting it as small as we could in order to fit on the filesystem (recall that running a linux system writes stuff (utmp for example) that must have space above your installation waterline). We had a lot of success. I'm very encouraged and excited about this project. Definitely keep the world posted, as this could really open some doors for us. -c -- 5:30pm up 81 days, 8:46, 1 user, load average: 0.14, 0.30, 0.61 From pedronis@bluewin.ch Fri Jan 10 22:22:15 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Fri, 10 Jan 2003 23:22:15 +0100 Subject: [Python-Dev] [ann] Minimal Python project References: Message-ID: <09c201c2b8f6$bb293220$6d94fea9@newmexico> I don't know exactly what they are aiming for. But in general minimal C runtime != minimal footprint; in particular, if high performance is a goal, the opposite may well be the case. regards. From mhammond@skippinet.com.au Sat Jan 11 00:22:28 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Sat, 11 Jan 2003 11:22:28 +1100 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <01b101c2b907$8683f620$530f8490@eden> [David] > "Mark Hammond" writes: > > > We could simply have a "pluggable TLS" design. > > > > It seems that everyone who has this requirement is interfacing to a > > complex library (com, xpcom, Boost, ACE), and in general, these > > libraries also require TLS. > > Boost isn't in that category. Boost provides a threading library to > establish a platform-independent C++ interface for threading, but to > date none of the other Boost libraries depend on the use of > Boost.Threads. In other words, Boost doesn't require TLS, but it can > provide TLS ;-) Yes, this is exactly what I meant. Mozilla is another good example. Mozilla does not require TLS, but indeed builds its own API for it - ie, xpcom does not require it, but does provide it.
While Mozilla therefore has TLS ports for many many platforms, this doesn't help us directly, as we can't just lift their code (MPL, etc). But I believe we could simply lean on them for their implementations at runtime. > I take it you are planning to provide a way to get the necessary TLS > from Python's API (in case it isn't lying about elsewhere), but not > necessarily port it to every platform? I am not sure what you mean by "get the necessary TLS from Python's API". I don't see a need for Python to expose any TLS functionality. If TLS is required *only* for this thread-state magic, then Python just consumes TLS, never exposes it. It obviously does expose an API which internally uses TLS, but it will not expose TLS itself. I foresee a "bootstrap prelude dance" which an extension library must perform, setting up these pointers exactly once. The obvious question from this approach is how to deal with *multiple* libraries in one app. For example, what happens when a single Python application wishes to use Boost *and* xpcom, and both attempt their bootstrap prelude, each providing a TLS implementation? Off the top of my head, a "first in wins" strategy may be fine - we don't care *who* provides TLS, so long as we have it. We don't really have a way to unload an extension module, so lifetime issues may not get in our way. Mark. From martin@v.loewis.de Sat Jan 11 07:53:57 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 11 Jan 2003 08:53:57 +0100 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: <01b101c2b907$8683f620$530f8490@eden> References: <01b101c2b907$8683f620$530f8490@eden> Message-ID: "Mark Hammond" writes: > The obvious question from this approach is how to deal with *multiple* > libraries in one app. For example, what happens when a single Python > application wishes to use Boost *and* xpcom, and both attempt their > bootstrap prelude, each providing a TLS implementation?
I would advise to follow Tim's strategy: Make TLS part of the thread_* files, accept that on some threading configuration, there won't be TLS until somebody implements it, and make TLS usage part of the core instead of part of the extension module. I doubt any of the potential TLS providers supports more than Win32 or pthreads. Regards, Martin From bsder@allcaps.org Sat Jan 11 08:25:44 2003 From: bsder@allcaps.org (Andrew P. Lentvorski, Jr.) Date: Sat, 11 Jan 2003 00:25:44 -0800 (PST) Subject: [Python-Dev] [ann] Minimal Python project In-Reply-To: <20030110154121.F1568@prim.han.de> Message-ID: On Fri, 10 Jan 2003, holger krekel wrote: > We announce a mailinglist dedicated to developing > a "Minimal Python" version. Minimal means that This will be really great for OS distributions and system administration. Being able to compile Python with a simple make/gcc combo and not requiring a full development environment would be wonderful. I look forward to seeing this. -a From dave@boost-consulting.com Sat Jan 11 13:48:24 2003 From: dave@boost-consulting.com (David Abrahams) Date: Sat, 11 Jan 2003 08:48:24 -0500 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: <01b101c2b907$8683f620$530f8490@eden> ("Mark Hammond"'s message of "Sat, 11 Jan 2003 11:22:28 +1100") References: <01b101c2b907$8683f620$530f8490@eden> Message-ID: "Mark Hammond" writes: > [David] > >> "Mark Hammond" writes: >> >> > We could simply have a "pluggable TLS" design. >> > >> > It seems that everyone who has this requirement is interfacing to a >> > complex library (com, xpcom, Boost, ACE), and in general, these >> > libraries also require TLS. >> >> Boost isn't in that category. Boost provides a threading library to >> establish a platform-independent C++ interface for threading, but to >> date none of the other Boost libraries depend on the use of >> Boost.Threads. 
In other words, Boost doesn't require TLS, but it can >> provide TLS ;-) > > Yes, this is exactly what I meant. Mozilla is another good example. > Mozilla does not require TLS, but indeed builds its own API for it - ie, > xpcom does not require it, but does provide it. > > While Mozilla therefore has TLS ports for many many platforms, this doesn't > help us directly, as we can't just lift their code (MPL, etc). But I > believe we could simply lean on them for their implementations at runtime. Ah, so that void (*funcTLSAlloc)(...) was supposed to be something supplied by the extension writer? Hmm, the Boost interface doesn't work that way, and AFAICT wouldn't be easily adapted to it. It basically works like this: the user declares a special C++ TSS object which internally holds a pointer. That pointer has a different value in each thread, and if you want more storage, you can allocate it and stick it in the pointer. The user can declare any number of these TSS objects, up to some implementation-specified limit. -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From bac@OCF.Berkeley.EDU Sat Jan 11 23:31:08 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Sat, 11 Jan 2003 15:31:08 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: References: Message-ID: OK, so I read the Open Group's specification and it had squat for default value info (unless what they provide online is not as detailed as what members have access to). So I logged into a Solaris machine with 2.1.1 and ran ``time.strptime('January', '%B')`` and got (1900, 1, 0, 0, 0, 0, 6, 1, 0) (which is strange because the day of the week for 1900-01-01 is Monday, not Sunday; must be rolling back a day since the day is 0? But then the Julian day value is wrong). But otherwise I am fine with it defaulting to 1900-01-00 as Kevin seems to be suggesting. 
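The Solaris tuple above can be cross-checked with a few lines of Python. This is a sketch only, not the strptime code being discussed: fill_in_date_fields is a hypothetical helper, it leans on the datetime module for the calendar arithmetic, and using day 1 rather than day 0 as the default is an assumption made for the sake of the example.

```python
import datetime


def fill_in_date_fields(year=1900, month=1, day=1):
    """Build a 9-field time tuple with a consistent weekday and day of year.

    Hypothetical helper for illustration; the time fields stay at zero.
    """
    d = datetime.date(year, month, day)
    # tm_wday convention: Monday is 0, Sunday is 6.
    wday = d.weekday()
    # tm_yday convention ("Julian day" in this thread): 1 January is day 1.
    yday = d.toordinal() - datetime.date(year, 1, 1).toordinal() + 1
    return (year, month, day, 0, 0, 0, wday, yday, 0)


print(fill_in_date_fields())  # (1900, 1, 1, 0, 0, 0, 0, 1, 0)
```

With day 1 the weekday field comes out as 0 (Monday), where the Solaris result has 6 (Sunday) paired with day 0.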
But a perk of my strptime implementation is that I researched equations that can calculate the Julian day based on the Gregorian date values, Gregorian values based on the year and Julian day, and day of the week from the year, month, and day. This means that if I set day to 1 if it was not given by the user, then the Gregorian calculation will figure out that it should be 1900-01-01; I would like to use that calculation. Now I could special-case all of this so that this quirky business of the day of the month being set to 0 is implemented. But I would much rather return valid values if I am going to have to have default values in the first place. So does anyone have any objections if I default to the date 1900-01-01 00:00:00 with a timezone of 0 and then calculate the Julian day and day of the week? That way the default values will be valid *and* not mess up ``time.mktime()``? -Brett From mhammond@skippinet.com.au Sun Jan 12 12:47:57 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Sun, 12 Jan 2003 23:47:57 +1100 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <05dc01c2ba38$d664a450$530f8490@eden> > I would advise to follow Tim's strategy: Make TLS part of the thread_* > files, accept that on some threading configuration, there won't be TLS > until somebody implements it, and make TLS usage part of the core > instead of part of the extension module. > > I doubt any of the potential TLS providers supports more than Win32 or > pthreads. Yeah, I'm not crazy on the idea myself - but I think it has merit. I'm thinking mainly of xpcom, which has pretty reasonable support beyond pthreads and win32 - but I am more than happy to stick it in the YAGNI basket. Mark. 
From mhammond@skippinet.com.au Sun Jan 12 12:52:26 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Sun, 12 Jan 2003 23:52:26 +1100 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <05df01c2ba39$75d5e710$530f8490@eden> > etc). But I > > believe we could simply lean on them for their > implementations at runtime. > > Ah, so that void (*funcTLSAlloc)(...) was supposed to be something > supplied by the extension writer? > > Hmm, the Boost interface doesn't work that way, and AFAICT wouldn't be > easily adapted to it. Windows and Mozilla work as you describe too, but I don't see the problem. For both of these, we would just provide a 3 line stub function, which uses the platform TLS API to return a "void *" we previously stashed. This local function is passed in. But yeah, as I said before, happy to YAGNI it. Mark. From skip@manatee.mojam.com Sun Jan 12 13:00:19 2003 From: skip@manatee.mojam.com (Skip Montanaro) Date: Sun, 12 Jan 2003 07:00:19 -0600 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200301121300.h0CD0J8H008462@manatee.mojam.com> Bug/Patch Summary ----------------- 338 open / 3204 total bugs (+2) 107 open / 1894 total patches (+2) New Bugs -------- sys.version[:3] gives incorrect version (2003-01-05) http://python.org/sf/662701 test_signal hang on some Solaris boxes (2003-01-05) http://python.org/sf/662787 configure script fails with wchat_t size. (2003-01-05) http://python.org/sf/662840 codec registry and Python embedding problem (2003-01-06) http://python.org/sf/663074 sets module review (2003-01-07) http://python.org/sf/663701 test_socket test_unicode_file fail on 2.3a1 on winNT (2003-01-07) http://python.org/sf/663782 Lib Man 2.2.6.2 word change (2003-01-07) http://python.org/sf/664044 test_bsddb3 fails when run directly (2003-01-08) http://python.org/sf/664581 test_ossaudiodev fails to run (2003-01-08) http://python.org/sf/664584 Demo/dbm.py - Rewrite using anydbm? 
(2003-01-08) http://python.org/sf/664715 files with long lines and an encoding crash (2003-01-09) http://python.org/sf/665014 datetime-RFC2822 roundtripping (2003-01-09) http://python.org/sf/665194 win32 os.path.normpath not correct for leading slash cases (2003-01-09) http://python.org/sf/665336 Crash in binascii_a2b_uu (2003-01-09) http://python.org/sf/665460 curses causes interpreter crash (2003-01-10) http://python.org/sf/665570 missing important curses calls (2003-01-10) http://python.org/sf/665572 reduce() masks exception (2003-01-10) http://python.org/sf/665761 filter() treatment of str and tuple inconsistent (2003-01-10) http://python.org/sf/665835 AssertionErrors in httplib (2003-01-11) http://python.org/sf/666219 'help' makes linefeed only under Win32 (2003-01-11) http://python.org/sf/666444 New Patches ----------- Port tests to unittest (2003-01-05) http://python.org/sf/662807 Implement FSSpec.SetDates() (2003-01-05) http://python.org/sf/662836 (email) Escape backslashes in specialsre and escapesre (2003-01-06) http://python.org/sf/663369 658254: accept None for time.ctime() and friends (2003-01-06) http://python.org/sf/663482 telnetlib option subnegotiation fix (2003-01-07) http://python.org/sf/664020 distutils config exe_extension on Mac OS X, Linux (2003-01-07) http://python.org/sf/664131 664044: 2.2.6.2 String formatting operations (2003-01-07) http://python.org/sf/664183 661913: inconsistent error messages between string an unicod (2003-01-07) http://python.org/sf/664192 sys.path[0] should contain absolute pathname (2003-01-08) http://python.org/sf/664376 Crash in binascii_a2b_uu on corrupt data (2003-01-09) http://python.org/sf/665458 Japanese Unicode Codecs (2003-01-11) http://python.org/sf/666484 Closed Bugs ----------- CGIHTTPServer fix for Windows (2001-05-25) http://python.org/sf/427345 CGIHTTPServer.py POST bug using IE (2001-06-04) http://python.org/sf/430160 unsafe call to PyThreadState_Swap (2002-02-04) http://python.org/sf/513033 os.chmod 
is underdocumented :-) (2002-08-08) http://python.org/sf/592859 HTTPConnection memory leak (2002-08-22) http://python.org/sf/598797 3rd parameter for Tkinter.scan_dragto (2002-08-30) http://python.org/sf/602259 __rdiv__ vs new-style classes (2002-10-15) http://python.org/sf/623669 __all__ as determiner of a module's api (2002-10-30) http://python.org/sf/631055 optparse module undocumented (2002-11-14) http://python.org/sf/638703 64-bit bug on AIX (2002-11-18) http://python.org/sf/639945 metaclass causes __dict__ to be dict (2002-11-22) http://python.org/sf/642358 bad documentation for the "type" builtin (2002-12-12) http://python.org/sf/652888 'realpath' function missing from os.path (2002-12-27) http://python.org/sf/659228 readline and threads crashes (2002-12-31) http://python.org/sf/660476 ossaudiodev issues (2003-01-01) http://python.org/sf/660697 test_pep263 fails in MacPython-OS9 (2003-01-02) http://python.org/sf/661330 Add clarification of __all__ to refman? (2003-01-03) http://python.org/sf/661848 Closed Patches -------------- optparse LaTeX docs (bug #638703) (2002-11-22) http://python.org/sf/642236 tarfile module implementation (2002-12-09) http://python.org/sf/651082 New import hooks + Import from Zip files (2002-12-12) http://python.org/sf/652586 Fix bug in IE/CGI [bug 427345] (2002-12-16) http://python.org/sf/654910 Add sysexits.h EX_* symbols to posix (2003-01-02) http://python.org/sf/661368 Remove old code from lib\os.py (2003-01-03) http://python.org/sf/661583 Cygwin auto-import module patch (2003-01-03) http://python.org/sf/661760 gcc 3.2 /usr/local/include patch (2003-01-03) http://python.org/sf/661869 Add array_contains() to arraymodule (2003-01-04) http://python.org/sf/662433 From dave@boost-consulting.com Sun Jan 12 15:50:06 2003 From: dave@boost-consulting.com (David Abrahams) Date: Sun, 12 Jan 2003 10:50:06 -0500 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: <05df01c2ba39$75d5e710$530f8490@eden> ("Mark 
Hammond"'s message of "Sun, 12 Jan 2003 23:52:26 +1100") References: <05df01c2ba39$75d5e710$530f8490@eden> Message-ID: "Mark Hammond" writes: >> etc). But I >> > believe we could simply lean on them for their >> implementations at runtime. >> >> Ah, so that void (*funcTLSAlloc)(...) was supposed to be something >> supplied by the extension writer? >> >> Hmm, the Boost interface doesn't work that way, and AFAICT wouldn't be >> easily adapted to it. > > Windows and Mozilla work as you describe too, but I don't see the problem. > For both of these, we would just provide a 3 line stub function, > which uses the platform TLS API to return a "void *" we previously stashed. > This local function is passed in. I can't really imagine what you're suggesting here. Code samples help. > But yeah, as I said before, happy to YAGNI it. Not sure what "it" is supposed to be here, either. -- David Abrahams dave@boost-consulting.com * http://www.boost-consulting.com Boost support, enhancements, training, and commercial distribution From paul@pfdubois.com Sun Jan 12 18:10:25 2003 From: paul@pfdubois.com (Paul F Dubois) Date: Sun, 12 Jan 2003 10:10:25 -0800 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: <200301101837.h0AIbES13212@odiug.zope.com> Message-ID: <000201c2ba65$e12c3830$6601a8c0@NICKLEBY> Thanks to everyone who has tried to help me with this problem. I will try to make a command line option for this. The .py files in question belong to the user and I don't have any control over where they are; and I don't know about them ahead of time so I cannot precompile them. The user wrote the files so all I know is that someone is trying to import something. Each of the hundreds or thousands of Pythons reads the same Python program.
However, since we are using a parallel processor and the problems will run for minutes if not months, the cost of any imports does not matter. It is interesting that the other set of people who care about this are doing little embedded stuff, sort of the exact opposite end of the computing spectrum. Paul > -----Original Message----- > From: guido@odiug.zope.com [mailto:guido@odiug.zope.com] On > Behalf Of Guido van Rossum > Sent: Friday, January 10, 2003 10:37 AM > To: Aahz > Cc: Paul F Dubois; python-dev@python.org > Subject: Re: [Python-Dev] Parallel pyc construction > > > > Why not make the .py directory read-only? > > Excellent suggestion for a work-around. > > --Guido van Rossum (home page: http://www.python.org/~guido/) > From neal@metaslash.com Sun Jan 12 19:04:04 2003 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 12 Jan 2003 14:04:04 -0500 Subject: [Python-Dev] Parallel pyc construction In-Reply-To: <000201c2ba65$e12c3830$6601a8c0@NICKLEBY> References: <200301101837.h0AIbES13212@odiug.zope.com> <000201c2ba65$e12c3830$6601a8c0@NICKLEBY> Message-ID: <20030112190404.GE29873@epoch.metaslash.com> On Sun, Jan 12, 2003 at 10:10:25AM -0800, Paul F Dubois wrote: > Thanks to everyone who has tried to help me with this problem. I will try to > make a command line option for this. I have already uploaded a patch on the SF bug/feature request. Feel free to use it. Anybody want to review it? http://python.org/sf/602345 There is a patch for 2.3 as well as 2.2. Neal From mhammond@skippinet.com.au Sun Jan 12 23:59:40 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Mon, 13 Jan 2003 10:59:40 +1100 Subject: [Python-Dev] Re: Extension modules, Threading, and the GIL In-Reply-To: Message-ID: <074901c2ba96$ad698670$530f8490@eden> [David Abrahams] > I can't really imagine what you're suggesting here. Code samples > help. OK :) Here is some code demonstrating my "pluggable" idea, but working backwards.
(Please don't pick on the names, or where I will need to pass a param, or where I forgot to cast, etc.)

First, let's assume we come up with a high-level API similar to:

/*** Python "auto-thread-state" API ***/

typedef void *PyATS_HANDLE;

/* Gets a "cookie" for use in all subsequent auto-thread-state
   calls. Generally called once per application/extension.
   Not strictly necessary, but a vehicle to abstract
   PyInterpreterState should people care enough.
*/
PyATS_HANDLE PyAutoThreadState_Init(void);

/* Ensure we have Python ready to rock. This is the "slow" version
   that assumes nothing about Python's state, other than the handle
   is valid.
*/
int PyAutoThreadState_Ensure(PyATS_HANDLE);

/* Notify the auto-thread-state mechanism that we are done - there
   should be one Release() per Ensure(). Again, maybe not necessary
   if we are super clever, but for the sake of argument ...
*/
void PyAutoThreadState_Release(PyATS_HANDLE);

/* end of definitions */

This is almost the "holy grail" for me. Your module/app init code does:

PyATS_HANDLE myhandle = PyAutoThreadState_Init();

And your C function entry points do a PyAutoThreadState_Ensure()/Release() pair. That is it! Your Python extension functions generally need take no special action, including releasing the lock, as PyAutoThreadState_Ensure() is capable of coping with the fact the lock is already held by this thread.

So, to my mind, that sketches out the high-level API we are discussing. Underneath the covers, Python will need TLS to implement this. We have 2 choices for the TLS:

* Implement it inside Python as part of the platform threading API. This works fine in most scenarios, but may potentially let down e.g. some Mozilla xpcom users - users where Python is ported, but this TLS API is not. Those platforms could not use this new AutoThreadState API, even though the application has a functional TLS implementation provided by xpcom.

* Allow the extension author to provide "pluggable" TLS.
  This would expand the API like so:

/* Back in the "auto thread state" header */
typedef struct {
    /* Save a PyThreadState pointer in TLS */
    int (*pfnSaveThreadState)(PyThreadState *p);
    /* Release the pointer for the thread (as the thread dies) */
    void (*pfnReleaseThreadState)();
    /* Get the saved pointer for this thread */
    PyThreadState *(*pfnGetThreadState)();
} Py_AutoThreadStateFunctions;

/* For the Win32 extensions, I would provide the following code in my
   extension */
DWORD dwTlsIndex = 0; /* assumed to be set from TlsAlloc() during init */

// The TLS functions we "export" back to Python.
int MyTLS_SaveThreadState(PyThreadState *ts)
{
    // allocate space for the pointer in the platform TLS
    PyThreadState **p = (PyThreadState **)malloc(sizeof(PyThreadState *));
    if (!p) return -1;
    *p = ts;
    TlsSetValue(dwTlsIndex, p);
    return 0;
}

void MyTLS_DropThreadState()
{
    PyThreadState **p = (PyThreadState **)TlsGetValue(dwTlsIndex);
    if (!p) return;
    TlsSetValue(dwTlsIndex, NULL);
    free(p);
}

PyThreadState *MyTLS_FetchThreadState()
{
    // the TLS slot holds a pointer to the saved pointer
    PyThreadState **p = (PyThreadState **)TlsGetValue(dwTlsIndex);
    return p ? *p : NULL;
}

// A structure of function pointers defined by Python.
Py_AutoThreadStateFunctions myfuncs = {
    MyTLS_SaveThreadState,
    MyTLS_DropThreadState,
    MyTLS_FetchThreadState
};

/* End of Win32 code */

The XPCOM code would look almost identical, except spelt
PR_GetThreadPrivate, PR_SetThreadPrivate etc. I assume pthreads can also
fit into this scheme.

> > But yeah, as I said before, happy to YAGNI it.
>
> Not sure what "it" is supposed to be here, either.

I'm happy to YAGNI the pluggable TLS idea. I see that the number of users
who would actually benefit is tiny. Keeping the TLS api completely inside
Python is fine with me.

Mark.
From tim.one@comcast.net Mon Jan 13 04:30:14 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 12 Jan 2003 23:30:14 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Message-ID: [Brett Cannon] > OK, so I read the Open Group's specification and it had squat for default > value info Read it again: the notion of "default value" makes no sense for the C strptime(), because a struct tm* is both input to and output from the C version of this function. The caller is responsible for setting up the defaults they want in the struct tm passed *to* strptime(). The Python strptime doesn't work that way (no timetuple is passed on), which makes the "default values" issue more intense in the Python version. > (unless what they provide online is not as detailed as what > members have access to). Nope, that's all there is. > So I logged into a Solaris machine with 2.1.1 and ran > ``time.strptime('January', '%B')`` and got > (1900, 1, 0, 0, 0, 0, 6, 1, 0) > (which is strange because the day of the week for 1900-01-01 The timetuple you got back on Solaris is for the senseless "day" 1900-01-00: tm_mday is 1-based, not 0-based. > is Monday, not Sunday; But if 1900-01-00 *were* a real day , it would be a Sunday. > must be rolling back a day since the day is 0? But then the Julian day > value is wrong). Right, it's a bug nest no matter how you look at it, and "the standards" appear of no use in sorting it out. > But otherwise I am fine with it defaulting to 1900-01-00 as Kevin seems > to be suggesting. It's hard to know what Kevin was suggesting -- he was mostly appealing to emperors that turn out to have no clothes . > But a perk of my strptime implementation is that I researched equations > that can calculate the Julian day based on the Gregorian date values, > Gregorian values based on the year and Julian day, and day of the week > from the year, month, and day. 
> This means that if I set day to 1 if it
> was not given by the user, then the Gregorian calculation will figure out
> that it should be 1900-01-01; I would like to use that calculation.

You don't need to do any of that so long as you're using Python 2.3: the
new datetime module can do it for you, and at C speed:

>>> import datetime
>>> datetime.date(1900, 1, 1).weekday()
0
>>> datetime.date(1900, 1, 1).timetuple()
(1900, 1, 1, 0, 0, 0, 0, 1, -1)
>>>

etc.

> Now I could special-case all of this so that this quirky business of the
> day of the month being set to 0 is implemented. But I would much rather
> return valid values if I am going to have to have default values in the
> first place. So does anyone have any objections if I default to the date
> 1900-01-01 00:00:00 with a timezone of 0

The last bit there is the tm_isdst flag, which only records DST info.
datetime returns -1 there which is explicitly defined as "don't know" --
without a real time zone attached to a date, datetime can't guess whether
DST is in effect, and uses -1 to communicate that back to the caller.

> and then calculate the Julian day and day of the week? That way the
> default values will be valid *and* not mess up ``time.mktime()``?

Kevin is the only one with a real use case here so far, and he needs to
define what he wants with more rigor. It may or may not turn out to be
feasible to implement that, whatever it is.

From bac@OCF.Berkeley.EDU Mon Jan 13 06:03:37 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Sun, 12 Jan 2003 22:03:37 -0800 (PST)
Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS
In-Reply-To:
References:
Message-ID:

[Tim Peters]
> [Brett Cannon]
> > OK, so I read the Open Group's specification and it had squat for default
> > value info
>
> Read it again: the notion of "default value" makes no sense for the C
> strptime(), because a struct tm* is both input to and output from the C
> version of this function.
> The caller is responsible for setting up the
> defaults they want in the struct tm passed *to* strptime(). The Python
> strptime doesn't work that way (no timetuple is passed on), which makes the
> "default values" issue more intense in the Python version.
>

And the reason I set values to -1 in the first place is because the Python
docs say that there is no guarantee as to what will be returned. Why
couldn't they have added this function to ISO C?

> > (unless what they provide online is not as detailed as what
> > members have access to).
>
> Nope, that's all there is.
>

Figures.

[stuff about how lacking the "standards" are for the C implementation]

> It's hard to know what Kevin was suggesting -- he was mostly appealing to
> emperors that turn out to have no clothes.
>

=) It would seem that Kevin would be happy as long as he can pass a
struct_time straight from ``strptime()`` to ``time.mktime()`` and have the
result be what he expects.

> > But a perk of my strptime implementation is that I researched equations
> > that can calculate the Julian day based on the Gregorian date values,
> > Gregorian values based on the year and Julian day, and day of the week
> > from the year, month, and day. This means that if I set day to 1 if it
> > was not given by the user, then the Gregorian calculation will figure out
> > that it should be 1900-01-01; I would like to use that calculation.
>
> You don't need to do any of that so long as you're using Python 2.3: the
> new datetime module can do it for you, and at C speed:
>
> >>> import datetime
> >>> datetime.date(1900, 1, 1).weekday()
> 0
> >>> datetime.date(1900, 1, 1).timetuple()
> (1900, 1, 1, 0, 0, 0, 0, 1, -1)
> >>>
>
> etc.
>

Well, since this code was added in 2.3a0 and I have no API to worry about I
can easily make use of the code. Is the datetime API all settled after the
last shakedown of the class hierarchy? Or should I hold off for a little
while?
Might as well cut back on the code duplication as much as possible And I take it there is no desire to integrate strptime into datetime at all (I remember someone saying that this would be going in the wrong direction although you do have strftime). > > Now I could special-case all of this so that this quirky business of the > > day of the month being set to 0 is implemented. But I would much rather > > return valid values if I am going to have to have default values in the > > first place. So does anyone have any objections if I default to the date > > 1900-01-01 00:00:00 with a timezone of 0 > > The last bit there is the tm_isdst flag, which only records DST info. > datetime returns -1 there which is explicitly defined as "don't know" -- > without a real time zone attached to a date, datetime can't guess whether > DST is in effect, and uses -1 to communicate that back to the caller. > OK, I will leave it as -1 then as the default. > > and then calculate the Julian day and day of the week? That way the > > default values will be valid *and* not mess up ``time.mktime()``? > > Kevin is the only one with a real use case here so far, and he needs to > define what he wants with more rigor. It may or may not turn out to be > feasible to implement that, whatever it is. > OK, then I won't bother with a patch until Kevin explicitly says exactly what he wants. And if there is a specific requirement that he wants I would suspect that there will need to be some new C code added to timemodule.c since strptime has no guarantees in terms of standards and just having _strptime conform won't be enough (especially while it is only a backup for the C version). Yay, less work for me. =) Thanks, Tim. 
-Brett

From jacobs@penguin.theopalgroup.com Mon Jan 13 11:38:57 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 13 Jan 2003 06:38:57 -0500 (EST)
Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS
In-Reply-To:
Message-ID:

On Sun, 12 Jan 2003, Tim Peters wrote:
> > But otherwise I am fine with it defaulting to 1900-01-00 as Kevin seems
> > to be suggesting.
>
> It's hard to know what Kevin was suggesting -- he was mostly appealing to
> emperors that turn out to have no clothes.

;) I was appealing to the Linux, Tru64Unix, IRIX, and Solaris emperors, all
naked, then.

> > and then calculate the Julian day and day of the week? That way the
> > default values will be valid *and* not mess up ``time.mktime()``?
>
> Kevin is the only one with a real use case here so far, and he needs to
> define what he wants with more rigor. It may or may not turn out to be
> feasible to implement that, whatever it is.

My first e-mail did just that. I want

  from time import strftime, localtime, mktime, strptime

  def round_trip(n):
      return strftime('%m/%d/%Y', localtime(mktime(strptime(n, '%m/%d/%Y'))))

  assert n == round_trip(n)

when n is a valid date value. This is my major use-case, since the lack of
being able to round-trip date values breaks all of our financial tracking
applications in _very_ ugly ways.

i.e., with libc strptime:

  round_trip('07/01/2002') == '07/01/2002'

with Brett's strptime:

  round_trip('07/01/2002') == '06/30/2002'

Along the way, I pointed out several other places where Brett's strptime
departed from what I was used to, with the hope that those issues could be
examined as well.
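
For comparison, the same round trip can be written without passing through mktime() and localtime() at all. This is a minimal sketch, assuming a modern Python in which datetime.datetime.strptime() exists (it was not available when this thread was written); the name round_trip_naive is made up for the example.

```python
from datetime import datetime

FMT = '%m/%d/%Y'


def round_trip_naive(n):
    # Parse and format a naive date directly. No conversion through
    # local time happens, so no DST or timezone adjustment is applied
    # between parsing and formatting.
    return datetime.strptime(n, FMT).strftime(FMT)


result = round_trip_naive('07/01/2002')
```

This sidesteps the local-time conversion rather than fixing it, so it only helps callers who never needed an epoch timestamp in the first place.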
-Kevin -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (216) 986-0710 x 19 E-mail: jacobs@theopalgroup.com Fax: (216) 986-0714 WWW: http://www.theopalgroup.com From mwh@python.net Mon Jan 13 13:59:27 2003 From: mwh@python.net (Michael Hudson) Date: 13 Jan 2003 13:59:27 +0000 Subject: [Python-Dev] Trouble with Python 2.3a1 In-Reply-To: David Abrahams's message of "Wed, 08 Jan 2003 13:04:46 -0500" References: <200301072312.h07NCxk29254@odiug.zope.com> Message-ID: <2mu1gdjglc.fsf@starship.python.net> David Abrahams writes: > The output below shows that in 2.3a1, that module prefix string is > discarded in favor of the name of the module which imports the > extension module. Look at revision 2.194 of Objects/typeobject.c. I changed some stuff in this area, but am far too fuzzy headed today to work out if it's likely to have affected you in this case... Cheers, M. -- I have no disaster recovery plan for black holes, I'm afraid. Also please be aware that if it one looks imminent I will be out rioting and setting fire to McDonalds (always wanted to do that) and probably not reading email anyway. -- Dan Barlow From mal@lemburg.com Mon Jan 13 15:02:10 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 13 Jan 2003 16:02:10 +0100 Subject: [Python-Dev] Interop between datetime and mxDateTime Message-ID: <3E22D4F2.1030308@lemburg.com> I'd like to make mxDateTime and datetime in Python 2.3 cooperate. Looking at the datetime.h file, it seems that the C API isn't all that fleshed out yet. Will this happen before 2.3b1 ? Related to this: I wonder why datetime is not a normal Python object which lives in Objects/ ?! Finally, the datetime objects don't seem to provide any means of letting binary operations with other types succeed. Coercion or mixed type operations are not implemented. When will this happen ? 
Thanks, -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Mon Jan 13 15:31:46 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 10:31:46 -0500 Subject: [Python-Dev] Failures compiling ossaudiodev.c Message-ID: <200301131531.h0DFVkh16211@odiug.zope.com> I get errors compiling ossaudiodev.c on a Mandrake 8.1 system: gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -I. -I/home/guido/projects/python/dist/src/./Include -I/usr/local/include -I/home/guido/projects/python/dist/src/Include -I/home/guido/projects/python/dist/src/linux -c /home/guido/projects/python/dist/src/Modules/ossaudiodev.c -o build/temp.linux-i686-2.3/ossaudiodev.o /home/guido/projects/python/dist/src/Modules/ossaudiodev.c: In function `initossaudiodev': /home/guido/projects/python/dist/src/Modules/ossaudiodev.c:965: `AFMT_AC3' undeclared (first use in this function) /home/guido/projects/python/dist/src/Modules/ossaudiodev.c:965: (Each undeclared identifier is reported only once /home/guido/projects/python/dist/src/Modules/ossaudiodev.c:965: for each function it appears in.) /home/guido/projects/python/dist/src/Modules/ossaudiodev.c:1008: `SNDCTL_DSP_BIND_CHANNEL' undeclared (first use in this function) /home/guido/projects/python/dist/src/Modules/ossaudiodev.c:1012: `SNDCTL_DSP_GETCHANNELMASK' undeclared (first use in this function) --Guido van Rossum (home page: http://www.python.org/~guido/) From neal@metaslash.com Mon Jan 13 15:41:33 2003 From: neal@metaslash.com (Neal Norwitz) Date: Mon, 13 Jan 2003 10:41:33 -0500 Subject: [Python-Dev] _Py_QnewFlag Message-ID: <20030113154133.GF29873@epoch.metaslash.com> This comment is in Include/pydebug.h: /* _XXX Py_QnewFlag should go away in 2.3. 
It's true iff -Qnew is passed, on the command line, and is used in 2.2 by ceval.c to make all "/" divisions true divisions (which they will be in 2.3). */ Is this comment correct? Neal From guido@python.org Mon Jan 13 15:49:51 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 10:49:51 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: Your message of "Mon, 13 Jan 2003 16:02:10 +0100." <3E22D4F2.1030308@lemburg.com> References: <3E22D4F2.1030308@lemburg.com> Message-ID: <200301131549.h0DFnp323711@odiug.zope.com> > I'd like to make mxDateTime and datetime in Python 2.3 > cooperate. Me too. There was a proposal long ago that all datetime-like objects should implement a timetuple() method. Perhaps we should revive that? datetime does implement this. > Looking at the datetime.h file, it seems that > the C API isn't all that fleshed out yet. Will this happen > before 2.3b1 ? I think that datetime.h is in fact entirely private to the module. > Related to this: I wonder why datetime is not a normal > Python object which lives in Objects/ ?! No single reason. Some reasons I can think of: - There's a pure Python implementation that can be used with Python 2.2 (probably even earlier versions) - It's not that essential to most code, so it seems appropriate to have to import it - I didn't want this to be seen as "growing the language" - It's still very young > Finally, the datetime objects don't seem to provide any > means of letting binary operations with other types > succeed. Coercion or mixed type operations are not > implemented. When will this happen ? If it's up to me, never. If you want to add a number of days, seconds, or fortnights to a datetime, use a timedelta(). The datetime type also supports a method to extract a posix timestamp. 
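
A minimal sketch of the timedelta approach described above, using only the documented datetime and timedelta types. The sample date and the variable names are arbitrary choices for the illustration.

```python
from datetime import datetime, timedelta

start = datetime(2003, 1, 13, 10, 49, 51)

# Arithmetic goes through timedelta, which names its units explicitly.
later = start + timedelta(days=14, seconds=30)

# Subtracting two datetimes gives a timedelta back.
elapsed = later - start

# A bare number is rejected, because its unit would be ambiguous.
try:
    start + 12
    error = None
except TypeError as exc:
    error = str(exc)
```

Here `later` is two weeks and thirty seconds after `start`, and `error` holds the "unsupported operand" message raised for the bare integer.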
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jan 13 15:55:17 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 10:55:17 -0500 Subject: [Python-Dev] _Py_QnewFlag In-Reply-To: Your message of "Mon, 13 Jan 2003 10:41:33 EST." <20030113154133.GF29873@epoch.metaslash.com> References: <20030113154133.GF29873@epoch.metaslash.com> Message-ID: <200301131555.h0DFtH223784@odiug.zope.com> > This comment is in Include/pydebug.h: > > /* _XXX Py_QnewFlag should go away in 2.3. It's true iff -Qnew is passed, > on the command line, and is used in 2.2 by ceval.c to make all "/" divisions > true divisions (which they will be in 2.3). */ > > Is this comment correct? It would be if "2.3" were replaced with "3.0". --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Mon Jan 13 16:16:23 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 13 Jan 2003 11:16:23 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <3E22D4F2.1030308@lemburg.com> Message-ID: [M.-A. Lemburg] > I'd like to make mxDateTime and datetime in Python 2.3 > cooperate. Cool! Good luck . > Looking at the datetime.h file, it seems that > the C API isn't all that fleshed out yet. Will this happen > before 2.3b1 ? There are no plans to expand the C API. Macros are provided for efficient field extraction; all objects here are immutable so there are no macros for changing fields; anything else needed has to go thru PyObject_CallMethod on an existing datetime object, or (for construction) calling the type object. > Related to this: I wonder why datetime is not a normal > Python object which lives in Objects/ ?! Didn't fit there. Like, e.g., arraymodule.c and mmapmodule.c, it implements a module the user needs to import explicitly (as well as supplying new object types). Nothing in Objects is like that (they don't implement modules). 
> Finally, the datetime objects don't seem to provide any
> means of letting binary operations with other types
> succeed. Coercion or mixed type operations are not
> implemented. When will this happen ?

Coercion is out of favor. Binary datetime methods generally return
NotImplemented when they don't know how to handle an operation themselves.
In that case, Python then asks "the other" object to try the operation. If
that also returns NotImplemented, *then* Python complains.

>>> date.today() + 12
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: unsupported operand type(s) for +: 'datetime.date' and 'int'
>>>

That outcome means that date and int both got a crack at it, and both
returned NotImplemented.

Comparison is an exception to this: in order to stop comparison from
falling back to the default compare-by-object-address, datetime comparison
operators explicitly raise TypeError when they don't know what to do.
Offhand that seems hard (perhaps impossible) to worm around.

From nas@python.ca Mon Jan 13 17:34:20 2003
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 13 Jan 2003 09:34:20 -0800
Subject: [Python-Dev] properties on modules?
Message-ID: <20030113173420.GA21406@glacier.arctrix.com>

It would be really cool if this worked:

    import time
    now = property(lambda m: time.time())

Obviously a silly example but I hope the idea is clear. Is there a
reason this couldn't be made to work?

  Neil

From fredrik@pythonware.com Mon Jan 13 17:41:39 2003
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 13 Jan 2003 18:41:39 +0100
Subject: [Python-Dev] Interop between datetime and mxDateTime
References: <3E22D4F2.1030308@lemburg.com> <200301131549.h0DFnp323711@odiug.zope.com>
Message-ID: <003301c2bb2b$0a9bb680$ced241d5@hagrid>

guido wrote:

> There was a proposal long ago that all datetime-like objects should
> implement a timetuple() method. Perhaps we should revive that?
> datetime does implement this.
the last revision of the "time type" proposal seems to suggest that all time types should implement the following interface: tm = timeobject.timetuple() cmp(timeobject, timeobject) hash(timeobject) and optionally deltaobject = timeobject - timeobject floatobject = float(deltaobject) # fractional seconds timeobject = timeobject + integerobject timeobject = timeobject + floatobject timeobject = timeobject + deltaobject more here: http://effbot.org/zone/idea-time-type.htm From mwh@python.net Mon Jan 13 18:08:16 2003 From: mwh@python.net (Michael Hudson) Date: 13 Jan 2003 18:08:16 +0000 Subject: [Python-Dev] PEP 290 revisited In-Reply-To: "Kevin Altis"'s message of "Wed, 8 Jan 2003 23:22:26 -0800" References: Message-ID: <2mn0m4kjn3.fsf@starship.python.net> "Kevin Altis" writes: > > I know, that's much less fun and no quick satisfaction, but it leads > > to code *improvement* rather than bitrot. > > Yes, but it also means the folks doing the real work in a module are going > to have to deal with this kind of stuff that probably seems trivial to them > and not worth doing when they could be writing real code. It just means > there is more on their plate and that Python itself, may not meet its own > guidelines; these kinds of changes tend to not get done because there is > never enough time. I think this is a bogus argument, sorry. If you're doing something non trivial to a module, the time required to use string methods rather than the string module is in the noise. CHeers, M. -- Need to Know is usually an interesting UK digest of things that happened last week or might happen next week. [...] This week, nothing happened, and we don't care. 
-- NTK Know, 2000-12-29, http://www.ntk.net/ From mwh@python.net Mon Jan 13 18:12:14 2003 From: mwh@python.net (Michael Hudson) Date: 13 Jan 2003 18:12:14 +0000 Subject: [Python-Dev] DUP_TOPX In-Reply-To: Tim Peters's message of "Fri, 10 Jan 2003 10:08:54 -0500" References: Message-ID: <2mk7h8kjgh.fsf@starship.python.net> Tim Peters writes: > Note that someone (not me -- never tried it) thought enough of LLTRACE to > document it, in Misc/SpecialBuilds.txt. That would have been me. > When I created that file, I warned that any special build not > documented there was fair game for removal, so *someone* must want > it, and badly enough to write a paragraph . It's handly for ultra low level debugging (hence the name -- maybe it should be ULLTRACE ). It's the sort of stuff that wouldn't be hard to add while you were debugging, but it's there now so why kill it? Cheers, M. -- Roll on a game of competetive offence-taking. -- Dan Sheppard, ucam.chat From mwh@python.net Mon Jan 13 18:15:45 2003 From: mwh@python.net (Michael Hudson) Date: 13 Jan 2003 18:15:45 +0000 Subject: [Python-Dev] properties on modules? In-Reply-To: Neil Schemenauer's message of "Mon, 13 Jan 2003 09:34:20 -0800" References: <20030113173420.GA21406@glacier.arctrix.com> Message-ID: <2mhecckjam.fsf@starship.python.net> Neil Schemenauer writes: > It would be really cool if this worked: > > import time > now = property(lambda m: time.time()) > > Obviously a silly example but I hope the idea is clear. Is there a > reason this couldn't be made to work? This would make referring to a variable inside and outside a module do different things, no? Urg. Cheers, M. -- Just point your web browser at http://www.python.org/search/ and look for "program", "doesn't", "work", or "my". Whenever you find someone else whose program didn't work, don't do what they did. Repeat as needed. 
    -- Tim Peters, on python-help, 16 Jun 1998

From pedronis@bluewin.ch Mon Jan 13 18:17:34 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 13 Jan 2003 19:17:34 +0100
Subject: [Python-Dev] properties on modules?
References: <20030113173420.GA21406@glacier.arctrix.com>
Message-ID: <041101c2bb30$0b7d98c0$6d94fea9@newmexico>

From: "Neil Schemenauer"
> It would be really cool if this worked:
>
> import time
> now = property(lambda m: time.time())
>
> Obviously a silly example but I hope the idea is clear. Is there a
> reason this couldn't be made to work?
>

it does not work any less than:

>>> import time
>>> class C(object): pass
...
>>> c=C()
>>> c.new=property(lambda o: time.time())
>>> c.new

modules are instances of a type, not types. So the issue is about
instance-level properties, apart from the module-specific details.

From guido@python.org Mon Jan 13 18:31:58 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 13 Jan 2003 13:31:58 -0500
Subject: [Python-Dev] Interop between datetime and mxDateTime
In-Reply-To: Your message of "Mon, 13 Jan 2003 18:41:39 +0100." <003301c2bb2b$0a9bb680$ced241d5@hagrid>
References: <3E22D4F2.1030308@lemburg.com> <200301131549.h0DFnp323711@odiug.zope.com> <003301c2bb2b$0a9bb680$ced241d5@hagrid>
Message-ID: <200301131831.h0DIVwr24888@odiug.zope.com>

> guido wrote:
>
> > There was a proposal long ago that all datetime-like objects should
> > implement a timetuple() method. Perhaps we should revive that?
> > datetime does implement this.
>
> the last revision of the "time type" proposal seems to suggest that all
> time types should implement the following interface:
>
> tm = timeobject.timetuple()

This is specified in the proposal as returning local time if the
timeobject knows about timezones. Unfortunately, the datetime module
has timezone support built in, but in such a way that no knowledge of
actual timezones is built into it. In particular, the datetime module
is agnostic of the local timezone rules and regulations.
We currently support timetuple() but it always returns the "naive time"
represented by a datetime object (stripping the timezone info rather
than converting to local time).

Is this a problem? If we have to convert to local time using the
facilities of the time module, timetuple() will only work for dates
supported by the time module (roughly 1970-2038 on most systems), and
raise an error otherwise. That's not in line with datetime's
philosophy (especially since there's nothing in the time tuple itself
that needs such a range restriction, and other datetime types might
not like this restriction either).

Suggestions?

> cmp(timeobject, timeobject)

See Tim's post for problems with this; however if the other object
derives from basetime we could return NotImplemented.

> hash(timeobject)

We would need to agree on how the hash should be computed in such a
way that different datetime objects that compare equal hash the same.
I fear this would be really hard unless we punted and made hash()
return a constant.

> and optionally
>
> deltaobject = timeobject - timeobject
> floatobject = float(deltaobject) # fractional seconds
> timeobject = timeobject + integerobject
> timeobject = timeobject + floatobject
> timeobject = timeobject + deltaobject

The first and last one are already implemented for heterogeneous
operations, and can be implemented by a third party datetime type; I
would be against conversions between ints or floats and
datetime.timedelta, because these don't roundtrip (float doesn't have
enough precision to represent the full range with microseconds).

> more here:
>
> http://effbot.org/zone/idea-time-type.htm

The proposal also recommends an abstract base type, "basetime", for all
time types. Without this, cmp() is hard to do (see Tim's post for
explanation; we don't want datetime objects to be comparable to
objects with arbitrary other types, because the default comparison is
meaningless).

This could be a pure "marker" type, like "basestring".
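
To make the marker-type idea concrete, here is a minimal sketch of how two unrelated time types might cooperate in comparisons. Everything in it is an assumption for illustration: basetime is not an existing Python type, and Stamp and OtherStamp are toy classes invented for the example.

```python
class basetime(object):
    """Hypothetical marker base class: no behaviour, only identity."""


class Stamp(basetime):
    """Toy time type holding seconds since an arbitrary epoch."""

    def __init__(self, seconds):
        self.seconds = seconds

    def __lt__(self, other):
        if isinstance(other, Stamp):
            return self.seconds < other.seconds
        if isinstance(other, basetime):
            # Another time type: step aside and let it try.
            return NotImplemented
        # Anything else: refuse, rather than fall back to a
        # meaningless default ordering.
        raise TypeError("cannot compare Stamp with %s"
                        % type(other).__name__)


class OtherStamp(basetime):
    """Second toy time type that knows how to compare with Stamp."""

    def __init__(self, seconds):
        self.seconds = seconds

    def __gt__(self, other):
        if isinstance(other, (Stamp, OtherStamp)):
            return self.seconds > other.seconds
        return NotImplemented
```

With these definitions, `Stamp(1) < OtherStamp(2)` is answered by OtherStamp's reflected method, while `Stamp(1) < 5` raises TypeError.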
Marc-Andre, if we export basetime from the core, can mxDateTime subclass from that? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jan 13 18:36:02 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 13:36:02 -0500 Subject: [Python-Dev] properties on modules? In-Reply-To: Your message of "Mon, 13 Jan 2003 09:34:20 PST." <20030113173420.GA21406@glacier.arctrix.com> References: <20030113173420.GA21406@glacier.arctrix.com> Message-ID: <200301131836.h0DIa2Z24938@odiug.zope.com> > It would be really cool if this worked: > > import time > now = property(lambda m: time.time()) > > Obviously a silly example but I hope the idea is clear. Is there a > reason this couldn't be made to work? The idea is not clear to me at all. Why can't you say now = lambda: time.time() ??? --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Mon Jan 13 18:34:56 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 13 Jan 2003 12:34:56 -0600 Subject: [Python-Dev] PEP 290 revisited In-Reply-To: <2mn0m4kjn3.fsf@starship.python.net> References: <2mn0m4kjn3.fsf@starship.python.net> Message-ID: <15907.1744.966609.791798@montanaro.dyndns.org> Michael> If you're doing something non trivial to a module, the time Michael> required to use string methods rather than the string module is Michael> in the noise. Provided of course that your non-trivial changes and the string methods changes are in separate checkins. Otherwise, your non-trivial changes can easily drown in a sea of trivial changes. Skip From brian@sweetapp.com Mon Jan 13 18:49:12 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Mon, 13 Jan 2003 10:49:12 -0800 Subject: [Python-Dev] properties on modules? 
In-Reply-To: <200301131836.h0DIa2Z24938@odiug.zope.com>
Message-ID: <009201c2bb34$76f5a9e0$21795418@dell1700>

> > It would be really cool if this worked:
> >
> > import time
> > now = property(lambda m: time.time())
> >
> > Obviously a silly example but I hope the idea is clear. Is there a
> > reason this couldn't be made to work?
>
> The idea is not clear to me at all. Why can't you say
>
> now = lambda: time.time()

Presumably, he would prefer this syntax:

start = time.now

to:

start = time.now()

The .NET framework implements "now" as a property rather than a function
and I find it distasteful for some reason.

Cheers,
Brian

From tim.one@comcast.net Mon Jan 13 18:46:05 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 13 Jan 2003 13:46:05 -0500
Subject: [Python-Dev] Interop between datetime and mxDateTime
In-Reply-To: <003301c2bb2b$0a9bb680$ced241d5@hagrid>
Message-ID:

[Fredrik Lundh]
> the last revision of the "time type" proposal seems to suggest that all
> time types should implement the following interface:

Calling T datetime.time, and D datetime.datetime:

> tm = timeobject.timetuple()

D yes, T no.

> cmp(timeobject, timeobject)

Both yes, but not a mix of D and T.

> hash(timeobject)

Both yes.

It's curious that the minimal API doesn't have a way to create a
timeobject ab initio (the only operations here with a timeobject output
need a timeobject as input first).

> and optionally
>
> deltaobject = timeobject - timeobject

D yes, T no.

> floatobject = float(deltaobject) # fractional seconds

datetime.timedelta doesn't have anything like that, but it could be useful.
Converting a timedelta to minutes (or seconds, or whatever) is painful now.
I'd rather see explicit .toseconds(), .tominutes() (etc) methods. A caution
that an IEEE double doesn't necessarily have enough bits of precision so
that roundtrip equality can be guaranteed.
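
The precision caveat can be seen with a small sketch. The helper to_seconds is a made-up name for this illustration, not a timedelta method; it combines the three stored fields by hand, and the result is fed back through the documented timedelta(seconds=...) constructor.

```python
from datetime import timedelta


def to_seconds(delta):
    # Combine the three stored fields into fractional seconds.
    return delta.days * 86400 + delta.seconds + delta.microseconds / 1e6


small = timedelta(hours=1, microseconds=250000)
huge = timedelta(days=999999999, microseconds=1)

# Convert to a float number of seconds and back again.
small_back = timedelta(seconds=to_seconds(small))
huge_back = timedelta(seconds=to_seconds(huge))
```

The small value survives the round trip, but for the huge one the single microsecond is below the resolution of a double at that magnitude, so `huge_back` comes back without it.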
I expect you'd also need an API to create a delta object from a number of seconds (in datetime that's spelled datetime.timedelta(seconds=a_float_int_or_long) ). > timeobject = timeobject + integerobject Neither, and unclear what it means (is the integer seconds? milliseconds? days? etc). > timeobject = timeobject + floatobject Neither, likewise. > timeobject = timeobject + deltaobject D yes, T no. From guido@python.org Mon Jan 13 18:49:22 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 13:49:22 -0500 Subject: [Python-Dev] properties on modules? In-Reply-To: Your message of "Mon, 13 Jan 2003 10:49:12 PST." <009201c2bb34$76f5a9e0$21795418@dell1700> References: <009201c2bb34$76f5a9e0$21795418@dell1700> Message-ID: <200301131849.h0DInMq25104@odiug.zope.com> [NeilS] > > > It would be really cool if this worked: > > > > > > import time > > > now = property(lambda m: time.time()) > > > > > > Obviously a silly example but I hope the idea is clear. Is there a > > > reason this couldn't be made to work? [Me] > > The idea is not clear to me at all. Why can't you say > > > > now = lambda: time.time() [Brian Q] > Presumably, he would prefer this syntax: > > start = time.now > > to: > > start = time.now() Aha. > The .NET framework implements "now" as a property rather than a function > and I find it distasteful for some reason. I have to agree with you -- I am -1 on such a feature. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Mon Jan 13 18:56:01 2003 From: barry@python.org (Barry A. Warsaw) Date: Mon, 13 Jan 2003 13:56:01 -0500 Subject: [Python-Dev] properties on modules? References: <20030113173420.GA21406@glacier.arctrix.com> <200301131836.h0DIa2Z24938@odiug.zope.com> Message-ID: <15907.3009.89291.801079@gargle.gargle.HOWL> >>>>> "GvR" == Guido van Rossum writes: GvR> The idea is not clear to me at all. 
Why can't you say GvR> now = lambda: time.time() I'm not sure, but I'd guess that Neil wants to do something like: print now instead of print now() to get the current time. -Barry From tim.one@comcast.net Mon Jan 13 18:55:13 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 13 Jan 2003 13:55:13 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <200301131831.h0DIVwr24888@odiug.zope.com> Message-ID: > tm = timeobject.timetuple() [Guido] > This is specified in the proposal as returning local time if the > timeobject knows about timezones. Unfortunately, the datetime module > has timezone support built in, but in such a way that no knowledge of > actual timezones is built into it. In particular, the datetime module > is agnostic of the local timezone rules and regulations. We currently > support timetuple() but it always returns the "naive time" represented > by a datetime object (stripping the timezone info rather converting to > local time). FYI, the time zone info object is consulted by timetuple(), but only to set the result's tm_isdst flag. From skip@pobox.com Mon Jan 13 19:04:13 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 13 Jan 2003 13:04:13 -0600 Subject: [Python-Dev] properties on modules? In-Reply-To: <200301131849.h0DInMq25104@odiug.zope.com> References: <009201c2bb34$76f5a9e0$21795418@dell1700> <200301131849.h0DInMq25104@odiug.zope.com> Message-ID: <15907.3501.591773.869274@montanaro.dyndns.org> Guido> [Brian Q] >> Presumably, he would prefer this syntax: >> >> start = time.now >> >> to: >> >> start = time.now() Guido> Aha. >> The .NET framework implements "now" as a property rather than a function >> and I find it distasteful for some reason. Guido> I have to agree with you -- I am -1 on such a feature. "now" as "time.time()" is a specific example which Neil admitted was a bit contrived. 
I think the question is still open whether or not modules should
be able to support properties, though I do think the ball is back in his
court to come up with a less contrived example.

Skip

From ben@algroup.co.uk Mon Jan 13 19:13:49 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Mon, 13 Jan 2003 19:13:49 +0000
Subject: [Python-Dev] properties on modules?
References: <009201c2bb34$76f5a9e0$21795418@dell1700>
Message-ID: <3E230FED.2090707@algroup.co.uk>

Brian Quinlan wrote:
>>>It would be really cool if this worked:
>>>
>>>import time
>>>now = property(lambda m: time.time())
>>>
>>>Obviously a silly example but I hope the idea is clear. Is there a
>>>reason this couldn't be made to work?
>>
>>The idea is not clear to me at all. Why can't you say
>>
>>now = lambda: time.time()
>
>
> Presumably, he would prefer this syntax:
>
> start = time.now
>
> to:
>
> start = time.now()
>
> The .NET framework implements "now" as a property rather than a function
> and I find it distasteful for some reason.

Presumably because intuition says properties should hold still, not
wiggle about of their own accord.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff

From pedronis@bluewin.ch Mon Jan 13 19:12:30 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 13 Jan 2003 20:12:30 +0100
Subject: [Python-Dev] properties on modules?
References: <009201c2bb34$76f5a9e0$21795418@dell1700> <200301131849.h0DInMq25104@odiug.zope.com> <15907.3501.591773.869274@montanaro.dyndns.org>
Message-ID: <04c401c2bb37$b89a23a0$6d94fea9@newmexico>

From: "Skip Montanaro"
> I think the question is still open whether or not modules should
> be able to support properties, though I do think the ball is back in his
> court to come up with a less contrived example.
if module.prop would be a property then from module import prop would probably not do what one may expect... I think this kill it. From altis@semi-retired.com Mon Jan 13 19:23:10 2003 From: altis@semi-retired.com (Kevin Altis) Date: Mon, 13 Jan 2003 11:23:10 -0800 Subject: [Python-Dev] PEP 290 revisited In-Reply-To: <2mn0m4kjn3.fsf@starship.python.net> Message-ID: > -----Original Message----- > From: Michael Hudson > > "Kevin Altis" writes: > > > > I know, that's much less fun and no quick satisfaction, but it leads > > > to code *improvement* rather than bitrot. > > > > Yes, but it also means the folks doing the real work in a > module are going > > to have to deal with this kind of stuff that probably seems > trivial to them > > and not worth doing when they could be writing real code. It just means > > there is more on their plate and that Python itself, may not > meet its own > > guidelines; these kinds of changes tend to not get done because there is > > never enough time. > > I think this is a bogus argument, sorry. If you're doing something > non trivial to a module, the time required to use string methods > rather than the string module is in the noise. The point was that if you were just going to change to string methods or implement any of the suggested changes in PEP 290, that each of these changes would be considered trivial by themselves and not worth doing on their own. I think they are worth doing, thus my original post, but other developers probably feel that if it ain't broke don't fix it, despite how the changes should make some code more readable and a bit faster. If someone was involved in doing something non-trivial to the module, then I agree with you, the time to implement just one change of PEP 290 isn't bad either. Maybe the Python dev guidelines should encourage developers to make PEP 8 and PEP 290 cleanups the next time they work on their particular modules. Implementing all the changes of PEP 290 does take time and focus. 
I would set aside at least an hour per file for a thorough set of changes, meticulously double-checking the patch diffs, etc. For PythonCard I found it much simpler to batch change and so was able to do all the changes comfortably over two days for 150+ files; I didn't change that many files, but I had to do repeated greps, etc. of them. I was focused on one particular aspect of PEP 290 per change to all the files in the framework and samples. Each change such as switching to is None instead of == None, isinstance(), or startswith() and endswith() in all the files was a separate check-in. I think this makes it very easy to see the "upgrades" without any confusion with new functionality changes (Skip's point). But Guido's reply about someone other than the maintainer of the module making changes leading to "bitrot" has merit, so I withdrew my original offer to make the upgrades myself. It will be up to you and the other developers to decide how much effort you want to put into conforming to PEP 290. ka From guido@python.org Mon Jan 13 19:28:43 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 14:28:43 -0500 Subject: [Python-Dev] properties on modules? In-Reply-To: Your message of "Mon, 13 Jan 2003 20:12:30 +0100." <04c401c2bb37$b89a23a0$6d94fea9@newmexico> References: <009201c2bb34$76f5a9e0$21795418@dell1700> <200301131849.h0DInMq25104@odiug.zope.com> <15907.3501.591773.869274@montanaro.dyndns.org> <04c401c2bb37$b89a23a0$6d94fea9@newmexico> Message-ID: <200301131928.h0DJShC25454@odiug.zope.com> > if module.prop would be a property then > > from module import prop > > would probably not do what one may expect... > > I think this kill it. You mean I don't get to kill it by BDFL pronouncement? 
:-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From python@rcn.com Mon Jan 13 19:36:24 2003
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 13 Jan 2003 14:36:24 -0500
Subject: [Python-Dev] PEP 290 revisited
References:
Message-ID: <007701c2bb3b$0f4c04e0$9d0fa044@oemcomputer>

From: "Kevin Altis"
> Implementing all the changes of PEP 290 does take time and focus. I would
> set aside at least an hour per file for a thorough set of changes,
> meticulously double-checking the patch diffs, etc.

Yes. It takes a lot of care to do it right.

> It will be up to you and the other
> developers to decide how much effort you want to put into conforming to PEP
> 290.

I wouldn't use the word conform. The spirit of the PEP was to provide
some clues to those looking to take advantage of a recent language
feature. The clues include how to find a potential update, what change
to make, what the benefit would be, and more importantly what cues
indicate that the change might be inappropriate.

Raymond Hettinger

From whisper@oz.net Mon Jan 13 19:46:47 2003
From: whisper@oz.net (David LeBlanc)
Date: Mon, 13 Jan 2003 11:46:47 -0800
Subject: [Python-Dev] properties on modules?
In-Reply-To: <3E230FED.2090707@algroup.co.uk>
Message-ID:

Well, if you buy into the whole space time continuum thing, "now" doesn't
wiggle about of its own accord. Of course "now" isn't the "now that was
then"... I can see how one might think of "now" as an attribute.
It's specific and unique for normal space and time, such as exists outside of computers ;) David LeBlanc Seattle, WA USA > -----Original Message----- > From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On > Behalf Of Ben Laurie > Sent: Monday, January 13, 2003 11:14 > To: Brian Quinlan > Cc: python-dev@python.org > Subject: Re: [Python-Dev] properties on modules? > > > Brian Quinlan wrote: > >>>It would be really cool if this worked: > >>> > >>>import time > >>>now = property(lambda m: time.time()) > >>> > >>>Obviously a silly example but I hope the idea is clear. Is there a > >>>reason this couldn't be made to work? > >> > >>The idea is not clear to me at all. Why can't you say > >> > >>now = lambda: time.time() > > > > > > Presumably, he would prefer this syntax: > > > > start = time.now > > > > to: > > > > start = time.now() > > > > The .NET framework implements "now" as a property rather than a function > > and I find it distasteful for some reason. > > Presumably because inutuition says properties should hold still, not > wiggle about of their own accord. > > Cheers, > > Ben. > > -- > http://www.apache-ssl.org/ben.html http://www.thebunker.net/ > > "There is no limit to what a man can do or how far he can go if he > doesn't mind who gets the credit." - Robert Woodruff > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From just@letterror.com Mon Jan 13 19:59:41 2003 From: just@letterror.com (Just van Rossum) Date: Mon, 13 Jan 2003 20:59:41 +0100 Subject: [Python-Dev] import vs. data files (Fwd: ResourcePackage 1.0.0a2 available...) Message-ID: FWIW, I find the project below an interesting approach to the data file problem -- there's no import magic at all and should be compatible with any packaging scheme. Just > From: "Mike C. Fletcher" > Subject: ResourcePackage 1.0.0a2 available... 
> Date: Mon, 13 Jan 2003 11:45:24 -0500 > Newsgroups: comp.lang.python > Message-ID: > > ResourcePackage is a mechanism for automatically managing resources > (i.e. non-Python files: small images, documentation files, binary > data) embedded in Python modules (as Python source code), > particularly for those wishing to create re-usable Python packages > which require their own resource-sets. > > ResourcePackage allows you to set up resource-specific sub-packages > within your package. It creates a Python module for each resource > placed in the resource package's directory during development. You > can set up these packages with a simple file-copy and then use the > resources saved/updated in the package directory like so: > > from mypackage.resources import open_icon > result = myStringLoadingFunction( open_icon.data ) > > ResourcePackage scans the package-directory on import to refresh > module contents, so simply saving an updated version of the file will > make it available the next time your application is run. > > When you are ready to distribute your package, you need only replace > the copied file with a dummy __init__.py to disable the scanning > support and eliminate all dependencies on resourcepackage (that is, > your users do not need to have resourcepackage installed once this is > done). Users of your packages do not need to do anything special > when creating their applications to give you access to your > resources, as they are simply Python packages/modules included in > your package's hierarchy. Your package's code (other than the > mentioned __init__.py) doesn't change. > > Note: there is code in resource package to allow you to manually > refresh your package directories w/out copying in __init__.py. I > haven't yet wrapped that as a script, but intend to for the 1.0 > release. ResourcePackage is currently in 1.0 alpha status, it appears > to work properly, but it has only had minimal testing. 
You can get > the distribution from the project page at: > > http://sourceforge.net/projects/resourcepackage/ > > I'm interested in any bug reports, enhancement requests or comments. > > Enjoy all, > Mike Fletcher > > _______________________________________ > Mike C. Fletcher > Designer, VR Plumber, Coder > http://members.rogers.com/mcfletch/ From tim.one@comcast.net Mon Jan 13 20:16:05 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 13 Jan 2003 15:16:05 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Message-ID: [Kevin Jacobs] > I was appealing to the Linux, Tru64Unix, IRIX, and Solaris emperors, all > naked, then. That's not a spec, it's just a collection of OS names, and, for the most part, their man pages are even vaguer than the POSIX partial spec. The IRIX collection is especially funny : o Bug #469938 - strptime() does not parse %y formats correctly. o Bug #469941 - strptime() does not give correct return value. o Bug #515837 - strptime() %y does not handle years 2000 and greater. The Solaris docs are the only ones that show any awareness that there *are* unclear behaviors here, and they don't promise anything across releases despite that: In addition to the behavior described above by various standards, the Solaris implementation of strptime() provides the following extensions. These may change at any time in the future. Portable applications should not depend on these extended features: If _STRPTIME_DONTZERO is not defined, the tm struct is zeroed on entry and strptime() updates the fields of the tm struct associated with the specifiers in the format string. [Note from Tim: I believe this violates the POSIX spec, although the latter is too vague to say for sure.] If _STRPTIME_DONTZERO is defined, strptime() does not zero the tm struct on entry. Additionally, for some specifiers, strptime() will use some values in the input tm struct to recalculate the date and re-assign the appropriate members of the tm struct. 
    [yadda yadda yadda]

So which version of which Solaris variation do you depend on?

> My first e-mail did just that. I want
>
>     def round_trip(n):
>         return strftime('%m/%d/%Y', localtime(mktime(strptime(n, '%m/%d/%Y'))))
>
>     assert n == round_trip(n)
>
> when n is a valid date value. This is my major use-case,

If you have other use cases, do spell them out. I expect this specific use
case will prove to be tractable.

> since the lack of being able to round-trip date values breaks all of our
> financial tracking applications in _very_ ugly ways.
>
> i.e., with libc strptime:
>
>     round_trip('07/01/2002') == '07/01/2002'
>
> with Brett's strptime:
>
>     round_trip('07/01/2002') == '06/30/2002'
>
> Along the way, I pointed out several other places where Brett's strptime
> departed from what I was used to, with the hope that those issues could be
> examined as well.

They've been examined, but the specs don't say enough to resolve anything.
Poke and hope appears all that remains. Specific use cases may help, but
more OS names won't.

From nas@python.ca Mon Jan 13 20:30:47 2003
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 13 Jan 2003 12:30:47 -0800
Subject: [Python-Dev] properties on modules?
In-Reply-To: <15907.3501.591773.869274@montanaro.dyndns.org>
References: <009201c2bb34$76f5a9e0$21795418@dell1700> <200301131849.h0DInMq25104@odiug.zope.com> <15907.3501.591773.869274@montanaro.dyndns.org>
Message-ID: <20030113203047.GA21836@glacier.arctrix.com>

Skip Montanaro wrote:
> I think the question is still open whether or not modules should
> be able to support properties, though I do think the ball is back in his
> court to come up with a less contrived example.

The Quixote web system publishes Python functions by traversing name
spaces. The publishing code is very roughly:

    def get_object(container, name):
        if hasattr(container, name):
            return getattr(container, name)
        elif isinstance(container, ModuleType):
            mname = container.__name__ + '.' + name
            __import__(mname)
            return sys.modules[mname]
        else:
            raise TraversalError

    def publish(path):
        o = root_namespace
        for component in path[1:].split('/'):
            o = get_object(o, component)
        return o()

If you use instances for name spaces then you can use __getattr__ or
properties to lazily create attributes. It's annoying that there is
nothing like __getattr__/__setattr__ or properties for modules.

  Neil

From jacobs@penguin.theopalgroup.com Mon Jan 13 20:36:34 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 13 Jan 2003 15:36:34 -0500 (EST)
Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS
In-Reply-To:
Message-ID:

On Mon, 13 Jan 2003, Tim Peters wrote:
> [Kevin Jacobs]
> So which version of which Solaris variation do you depend on?

I depend on the default version that is exposed via the time module when
using Python 2.2 under Solaris 9.

> > My first e-mail did just that. I want
> >
> >     def round_trip(n):
> >         return strftime('%m/%d/%Y', localtime(mktime(strptime(n, '%m/%d/%Y'))))
> >
> >     assert n == round_trip(n)
> >
> > when n is a valid date value. This is my major use-case,
>
> If you have other use cases, do spell them out. I expect this specific use
> case will prove to be tractable.

This is the only one that has emerged from a real-life running application.
In fact, this kind of code is our only use of strptime, though after all of
this confusion it may be replaced RSN with something more deterministic.

> They've been examined, but the specs don't say enough to resolve anything.
> Poke and hope appears all that remains. Specific use cases may help, but
> more OS names won't.

I'm not sure what else you expect of me. I have an application, some
anecdotal evidence that something is amiss, reported specific test-code that
demonstrated the problem, and encouraged Brett to do a bit of digging through
the standards to see if he can improve things.

If you feel that I am being unreasonably vague about what Brett should do,
then you are quite right!
I'm buried under a dozen other projects, and in
spite of that, felt it was important to make time to test Python 2.3a1. In
doing so I found one concrete thing wrong with how we used strptime and a
bucket full of things that looked fishy. However, digging through the nooks
and crannies of strptime is not at all important to me, especially when I
have a viable work-around on all platforms that I care about (i.e.,
re-enabling the libc version).

Happy to help, but not expecting the Spanish Inquisition,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com

From skip@pobox.com Mon Jan 13 20:36:28 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 13 Jan 2003 14:36:28 -0600
Subject: [Python-Dev] properties on modules?
In-Reply-To: <04c401c2bb37$b89a23a0$6d94fea9@newmexico>
References: <009201c2bb34$76f5a9e0$21795418@dell1700> <200301131849.h0DInMq25104@odiug.zope.com> <15907.3501.591773.869274@montanaro.dyndns.org> <04c401c2bb37$b89a23a0$6d94fea9@newmexico>
Message-ID: <15907.9036.125706.22122@montanaro.dyndns.org>

    Samuele> From: "Skip Montanaro"
    >> I think the question is still open whether or not modules should
    >> be able to support properties, though I do think the ball is back in his
    >> court to come up with a less contrived example.

    Samuele> if module.prop would be a property
    Samuele> then
    Samuele> from module import prop
    Samuele> would probably not do what one may expect...
    Samuele> I think this kill it.

On the other hand, it might be a rather elegant way to prevent people from
accessing volatile module-level objects using "from module import prop".

<0.5 wink>

Skip

From guido@python.org Mon Jan 13 20:48:13 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 13 Jan 2003 15:48:13 -0500
Subject: [Python-Dev] properties on modules?
In-Reply-To: Your message of "Mon, 13 Jan 2003 12:30:47 PST."
             <20030113203047.GA21836@glacier.arctrix.com>
References: <009201c2bb34$76f5a9e0$21795418@dell1700> <200301131849.h0DInMq25104@odiug.zope.com> <15907.3501.591773.869274@montanaro.dyndns.org> <20030113203047.GA21836@glacier.arctrix.com>
Message-ID: <200301132048.h0DKmDd25963@odiug.zope.com>

> The Quixote web system publishes Python functions by traversing name
> spaces. The publishing code is very roughly:
>
>     def get_object(container, name):
>         if hasattr(container, name):
>             return getattr(container, name)
>         elif isinstance(container, ModuleType):
>             mname = container.__name__ + '.' + name
>             __import__(mname)
>             return sys.modules[mname]
>         else:
>             raise TraversalError
>
>     def publish(path):
>         o = root_namespace
>         for component in path[1:].split('/'):
>             o = get_object(o, component)
>         return o()
>
> If you use instances for name spaces then you can use __getattr__ or
> properties to lazily create attributes. It's annoying that there is
> nothing like __getattr__/__setattr__ or properties for modules.

I still don't see the use case for late binding of module attributes.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.one@comcast.net Mon Jan 13 20:51:51 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 13 Jan 2003 15:51:51 -0500
Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS
In-Reply-To:
Message-ID:

[Kevin Jacobs]
> I depend on the default version that is exposed via the time module when
> using Python 2.2 under Solaris 9.

Let's hope that means something to Brett.

> This is the only one that has emerged from a real-life running
> application. In fact, this kind of code is our only use of strptime,
> though after all of this confusion it may be replaced RSN with
> something more deterministic.

Possibly wise. I'm not sure why ANSI C refused to adopt strptime(), but
suspect it's because it was such a legacy mess.

> I'm not sure what else you expect of me.

To make it your full-time job to specify behavior in all cases, of course.
> I have an application, some anecdotal evidence that something is amiss,
> reported specific test-code that demonstrated the problem, and encouraged
> Brett to do a bit of digging through the standards to see if he can improve
> things.

That's helpful itself, and appreciated. It turns out the standards aren't
helpful, though.

> If you feel that I am being unreasonably vague about what Brett should do,
> then you are quite right! I'm buried under a dozen other projects, and in
> spite of that, felt it was important to make time to test Python 2.3a1.

That's good. Thank you.

> In doing so I found one concrete thing wrong with how we used strptime and
> a bucket full of things that looked fishy. However, digging through the
> nooks and crannies of strptime is not at all important to me, especially
> when I have a viable work-around on all platforms that I care about (i.e.,
> re-enabling the libc version).
>
> Happy to help, but not expecting the Spanish Inquisition,

I was trying to provoke you into being specific, on the chance that you did
know exactly what you wanted but felt it was "too obvious" to bother
spelling it out. My apologies if that came across as offensive or too
strong. Getting concrete use cases remains the most helpful thing that can
be done, whether or not a standard backs them. Python inherited a lot of
platform accidents from platform strptime() implementations (and, indeed,
the Python docs warned about that all along), and we simply can't know which
accidents were important to people if they don't complain now. We can do
something "reasonable" if we know what they are, but nobody knows what all
the platform accidents were, so "backward compatible in all cases" is an
unachievable wish.
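
For illustration only (not part of the original thread): a minimal sketch of
the round-trip use case discussed above, written so it can be checked
directly. It assumes that mktime() accepts whatever strptime() puts into the
fields the format string does not specify, which is exactly the behaviour in
dispute here. The helper name survives_round_trip is made up for this sketch.

```python
from time import localtime, mktime, strftime, strptime

DATE_FORMAT = '%m/%d/%Y'


def round_trip(text, fmt=DATE_FORMAT):
    """Parse text, convert to a timestamp and back, then re-format it."""
    parsed = strptime(text, fmt)      # struct_time; unspecified fields get defaults
    stamp = mktime(parsed)            # seconds since the epoch, local time
    return strftime(fmt, localtime(stamp))


def survives_round_trip(text, fmt=DATE_FORMAT):
    """Hypothetical helper: True if the date comes back unchanged."""
    return round_trip(text, fmt) == text


if __name__ == '__main__':
    for sample in ('07/01/2002', '01/13/2003'):
        print(sample, survives_round_trip(sample))
```

Running a helper like this over a list of known-good dates is one cheap way
to turn "something looks fishy" into a concrete, reportable use case.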
From bac@OCF.Berkeley.EDU Mon Jan 13 20:56:41 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Mon, 13 Jan 2003 12:56:41 -0800 (PST)
Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS
In-Reply-To:
References:
Message-ID:

[Kevin Jacobs]
> On Mon, 13 Jan 2003, Tim Peters wrote:
> > [Kevin Jacobs]
>
> I'm not sure what else you expect of me. I have an application, some
> anecdotal evidence that something is amiss, reported specific test-code that
> demonstrated the problem, and encouraged Brett to do a bit of digging through
> the standards to see if he can improve things.
>

I think what Tim is looking for is an explicit spelling of what the standards
you initially mentioned were. It appears you were not thinking of any
specifically (in terms of ``strptime()``), but you did mention
``time.mktime()``'s.

I see a few solutions to all of this:

1) Change nothing -- Keeps my life simple, but Kevin doesn't get his use
case fixed.

2) Define our own standard on return values -- This would require
editing timemodule.c to make sure any C version that gets used conforms to
what we specify. The other option is to just remove the C version and use
the Python version exclusively; speed and backwards-compatibility be
damned, although the docs leave such a big hole open for what ``strptime``
will do that it would still conform to the docs. =)

3) Change ``time.mktime()`` code -- We can make the function ignore all
-1 values if we cared to since our docs say nothing about supporting
negative values.

4) Change ``time.mktime()`` docs -- Warn that it can, on occasion,
accept negative values as valid (need to check ISO C docs to see if it
always accepts negative values or if this is a "perk" of some other spec)

5) Create ``time.normalize()`` -- Would take a struct_time and return a
new one with values that are within the valid ranges.
This doesn't lead to us coming up with standards for ``strptime`` and allows
people like Kevin to make sure any struct_time objects they use won't have
unsuspected values.

We should at least do 4) (and I will write the doc patch if people agree
this should be done). I have no issue doing 2) since I would think we
would just force all unreasonable values to the smallest valid value
possible (sans things that can be calculated such as day of the week and
Julian day to make sure they are valid). I have no clue how often
negative values are used in ``time.mktime()``, so I can't comment on 3).
5) is just an idea off the top of my head. And I obviously have no great
objection to 1). =)

-Brett

From just@letterror.com Mon Jan 13 21:03:19 2003
From: just@letterror.com (Just van Rossum)
Date: Mon, 13 Jan 2003 22:03:19 +0100
Subject: [Python-Dev] properties on modules?
In-Reply-To: <200301132048.h0DKmDd25963@odiug.zope.com>
Message-ID:

Guido van Rossum wrote:

> I still don't see the use case for late binding of module attributes.

The PyObjC project (the Python <-> Objective-C bridge) has/had a problem
which might have been solved with lazy module attributes: one module can
export a *lot* of ObjC classes, and building the Python wrappers takes
quite a bit of time (at runtime), so much that it was quite noticeable in
the startup time, which especially hurt small apps. At some point I even
suggested to insert an object with a __getattr__ method into
sys.modules(*), but in the end some other optimizations were done to make
importing somewhat quicker.

*) I was pleasantly surprised how seamless that actually works: as long
   as the object either has an __all__ or a __dict__ attribute
   everything seems to work just fine. So if you *really* want lazy
   attributes, you can get them...
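
For illustration only (not part of the original message, and not PyObjC's
code): a minimal sketch of the sys.modules trick described above. The module
name "lazydemo", the class LazyNamespace, and the make_wrapper() stand-in
for an expensive wrapper builder are all made up for this sketch.

```python
import sys

build_log = []   # records which attributes were actually built


def make_wrapper(name):
    """Stand-in for building an expensive wrapper object for `name`."""
    build_log.append(name)
    return "wrapper for " + name


class LazyNamespace(object):
    """Module stand-in that creates its attributes on first access."""

    def __init__(self, names):
        self.__all__ = list(names)

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails.
        if name not in self.__dict__.get("__all__", ()):
            raise AttributeError(name)
        value = make_wrapper(name)
        setattr(self, name, value)   # cache: later lookups bypass __getattr__
        return value


# Put the instance where the import machinery looks for modules.
sys.modules["lazydemo"] = LazyNamespace(["NSObject", "NSString"])

import lazydemo
from lazydemo import NSString

print(lazydemo.NSString)
print(build_log)
```

Only the attribute that is actually requested gets built; NSObject is never
created unless something asks for it. Note that "from lazydemo import
NSString" binds the value that exists at import time, which is the same
snapshot behaviour raised elsewhere in this thread as an objection to
module-level properties.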
Just From lists@morpheus.demon.co.uk Mon Jan 13 22:13:07 2003 From: lists@morpheus.demon.co.uk (Paul Moore) Date: Mon, 13 Jan 2003 22:13:07 +0000 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS References: Message-ID: Brett Cannon writes: > 2) Define our own standard on return values -- This would require > editing timemodule.c to make sure any C version that gets used conforms to > what we specify. The other option is to just remove the C version and use > the Python version exclusively; speed and backwards-compatibility be > damned, although the docs leave such a big hole open for what ``strptime`` > will do that it would still conform to the docs. =) I've not been following this much, and I come from a platform (Windows) with no C version of strptime, so I'm both grateful for the new version, and unconcerned about compatibility :-) However, I think there's a slightly less aggressive compromise. Would it be possible (or useful) to specify and document the behaviour of the Python version (essentially, "define our own standard"), but rather than jump through hoops to force C versions into conformance, just punt and say that C versions, when used, may differ in their handling of unspecified input? This probably only makes sense as an option if: a) The Python user can be expected to *know* whether they have the C version or the Python version, and b) People writing code for portability have a way of defending against the differences (or they write to the lowest common denominator of making no assumptions...) I don't know if these two assumptions are reasonable, though... Paul. 
-- This signature intentionally left blank From bac@OCF.Berkeley.EDU Mon Jan 13 22:59:33 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 13 Jan 2003 14:59:33 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: References: Message-ID: [Paul Moore] > Brett Cannon writes: > > > 2) Define our own standard on return values -- This would require > > editing timemodule.c to make sure any C version that gets used conforms to > > what we specify. The other option is to just remove the C version and use > > the Python version exclusively; speed and backwards-compatibility be > > damned, although the docs leave such a big hole open for what ``strptime`` > > will do that it would still conform to the docs. =) > > I've not been following this much, and I come from a platform > (Windows) with no C version of strptime, so I'm both grateful for the > new version, and unconcerned about compatibility :-) However, I think > there's a slightly less aggressive compromise. Would it be possible > (or useful) to specify and document the behaviour of the Python > version (essentially, "define our own standard"), but rather than jump > through hoops to force C versions into conformance, just punt and say > that C versions, when used, may differ in their handling of > unspecified input? > I don't know how much help it would be stating what the Python version does explicitly and then saying we have no clue what the C version might do *and* you don't get to control which version ``time`` uses. You might as well stick with what the docs say as-is and say we don't guarantee what you will get. > This probably only makes sense as an option if: > > a) The Python user can be expected to *know* whether they have the C > version or the Python version, and Would ``time.strptime == _strptime.strptime`` actually work? Otherwise there is no way to know without setting some module flag. 
> b) People writing code for portability have a way of defending against
> the differences (or they write to the lowest common denominator of
> making no assumptions...)
>

Well, we could tell people that they could import ``_strptime`` directly
if they want to use the Python version. I mean, this whole issue is
solved if people pass their time tuples through a brain-dead simple
function that unpacks the tuple, tests that the values are within the
valid range, and, if they aren't, gives them values that make sense.
But then the question becomes: do we want to do this for the user in the
library, or should we let them do it themselves?

-Brett

From jepler@unpythonic.net Tue Jan 14 00:30:40 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Mon, 13 Jan 2003 18:30:40 -0600
Subject: [Python-Dev] properties on modules?
In-Reply-To: <20030113173420.GA21406@glacier.arctrix.com>
References: <20030113173420.GA21406@glacier.arctrix.com>
Message-ID: <20030113183025.A10229@unpythonic.net>

On Mon, Jan 13, 2003 at 09:34:20AM -0800, Neil Schemenauer wrote:
> It would be really cool if this worked:
>
>     import time
>     now = property(lambda m: time.time())
>
> Obviously a silly example but I hope the idea is clear. Is there a
> reason this couldn't be made to work?

I don't know, but if you create time.now, then will *both* of these work?

    print time.now

    from time import now
    print now

If not, it would be quite counterintuitive.

Jeff

From aahz@pythoncraft.com Tue Jan 14 00:40:08 2003
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 13 Jan 2003 19:40:08 -0500
Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS
In-Reply-To:
References:
Message-ID: <20030114004008.GA2148@panix.com>

On Mon, Jan 13, 2003, Brett Cannon wrote:
>
> Well, we could tell people that they could import ``_strptime``
> directly if they want to use the Python version.
I mean this > whole issue is solved if people passed their time tuples through I > brain-dead simple function that unpacked the tuple and tested that > values within the valid range if they weren't gave them values that > made sense. But then the question becomes do we want to do this for > the user in the library or should we let them do it themselves. ...or provide a function that allows them to do it themselves. (Thereby making it easy but not defaulted.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "I used to have a .sig but I found it impossible to please everyone..." --SFJ From mhammond@skippinet.com.au Tue Jan 14 00:49:35 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Tue, 14 Jan 2003 11:49:35 +1100 Subject: [Python-Dev] properties on modules? In-Reply-To: <009201c2bb34$76f5a9e0$21795418@dell1700> Message-ID: <03b301c2bb66$cf8cba80$530f8490@eden> [Brian Quinlan] > Presumably, he would prefer this syntax: > > start = time.now > > to: > > start = time.now() > > The .NET framework implements "now" as a property rather than > a function > and I find it distasteful for some reason. Interestingly, a new, wonderful book called "Programming in the .NET Environment", co-authored by me (http://www.aw.com/productpage?ISBN=0201770180 or http://www.amazon.com/exec/obidos/tg/detail/-/0201770180/qid=1042505176) has some information on this ;) One of the other authors is the Program Manager for the .NET framework class library. He has written the MS design guidelines for the .NET framework, so we asked him to paraphrase them in our book. One part of this (page 241) deals specifically with properties and methods. One of the guidelines states that a property should *not* be used "if calling the member twice in succession produces different results". So clearly, this violates the .NET design guidelines - unless, of course, you call it twice *very quickly* . Mark. 
From bac@OCF.Berkeley.EDU Tue Jan 14 00:53:35 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 13 Jan 2003 16:53:35 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: <20030114004008.GA2148@panix.com> References: <20030114004008.GA2148@panix.com> Message-ID: [Aahz] > On Mon, Jan 13, 2003, Brett Cannon wrote: > > > > Well, we could tell people that they could import ``_strptime`` > > directly if they want to use the Python version. I mean this > > whole issue is solved if people passed their time tuples through I > > brain-dead simple function that unpacked the tuple and tested that > > values within the valid range if they weren't gave them values that > > made sense. But then the question becomes do we want to do this for > > the user in the library or should we let them do it themselves. > > ...or provide a function that allows them to do it themselves. (Thereby > making it easy but not defaulted.) > -- That is definitely an option. Heck, I will even volunteer to write it if this is what people think is the best solution (and I will do it in C, no less! =). -Brett From mhammond@skippinet.com.au Tue Jan 14 00:57:01 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Tue, 14 Jan 2003 11:57:01 +1100 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <200301131831.h0DIVwr24888@odiug.zope.com> Message-ID: <03b901c2bb67$d9c4bc90$530f8490@eden> [Guido] > The proposal also recomments an abstract base type, "basetime", for > all time types. Without this, cmp() is hard to do (see Tim's post for > explanation; we don't want datetime objects to be comparable to > objects with arbitrary other types, because the default comparison iss > meaningless). > > This could be a pure "marker" type, like "basestring". Marc-Andre, > if we export basetime from the core, can mxDateTime subclass from > that? I have the exact same issue for my COM time objects, and like MAL, just recently started thinking about it. 
Certainly, such a subclass would help me enormously, and would probably
make it quite simple to allow *any* datetime object to be passed to
COM/Windows functions. If we can do the same for MAL (ie, a Python
datetime magically works anywhere an mxDateTime did before), then this
would be a real bonus :)

Mark.

From tim.one@comcast.net Tue Jan 14 01:21:05 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 13 Jan 2003 20:21:05 -0500
Subject: [Python-Dev] Interop between datetime and mxDateTime
In-Reply-To: <03b901c2bb67$d9c4bc90$530f8490@eden>
Message-ID:

[Guido]
> The proposal also recommends an abstract base type, "basetime", for
> all time types. Without this, cmp() is hard to do (see Tim's post for
> explanation; we don't want datetime objects to be comparable to
> objects with arbitrary other types, because the default comparison is
> meaningless).
>
> This could be a pure "marker" type, like "basestring". Marc-Andre,
> if we export basetime from the core, can mxDateTime subclass from
> that?

Let me ask a question: when I tried to make datetime.tzinfo a pure
"marker" type, I eventually had to give up, because I absolutely could
not make it work with pickling (and recalling that pickles produced by
datetime.py had to be readable by datetimemodule.c, and vice versa).
Instead I had to make it instantiable (give it an __init__ method that
didn't complain), and require that tzinfo subclasses also have an
__init__ method callable with no arguments. Are we able to get away with
basestring because pickle already has deep knowledge about Python's
string types?

[Mark Hammond]
> I have the exact same issue for my COM time objects, and like MAL, just
> recently started thinking about it.
>
> Certainly, such a subclass would help me enormously, and would
> probably make it quite simple for this to allow *any* datetime
> object to be passed to COM/Windows functions.

Well, I don't see that a basetime marker class alone would allow for
that. What else are you assuming?
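
[Editorial sketch, not part of the thread.] To make the "marker type" idea concrete, here is a minimal sketch in modern Python 3. It assumes a hypothetical `basetime` class (named as in the proposal; nothing here confirms it was ever added to Python) and a toy `SimpleTime` type invented purely for illustration. It shows how a marker lets a type refuse comparison with arbitrary objects, and also why the marker alone is not enough: the comparison still has to assume some shared convention, here a `seconds` attribute.

```python
class basetime:
    """Hypothetical marker base type: no methods, no state."""


class SimpleTime(basetime):
    """Toy time type, invented for this sketch only."""

    def __init__(self, seconds):
        self.seconds = seconds

    def __lt__(self, other):
        # Refuse to order against anything not marked as a time type.
        if not isinstance(other, basetime):
            return NotImplemented
        # The marker says nothing about *how* to compare; this sketch
        # assumes both sides expose a `seconds` attribute.
        return self.seconds < other.seconds


def is_time_like(obj):
    """Return True if obj derives from the (hypothetical) marker type."""
    return isinstance(obj, basetime)
```

With this in place, `SimpleTime(1) < SimpleTime(2)` is true, while `SimpleTime(1) < 5` raises TypeError in Python 3 instead of falling back to a meaningless default ordering.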
From mhammond@skippinet.com.au Tue Jan 14 01:36:27 2003 From: mhammond@skippinet.com.au (Mark Hammond) Date: Tue, 14 Jan 2003 12:36:27 +1100 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: Message-ID: <03e001c2bb6d$5b9f9690$530f8490@eden> [Tim] > [Mark Hammond] > > I have the exact same issue for my COM time objects, and > like MAL, just > > recently started thinking about it. > > > > Certainly, such a subclass would help me enormously, and would > > probably make it quite simple for this to allow *any* datetime > > object to be passed to COM/Windows functions. > > Well, I don't see that a basetime marker class alone would > allow for that. > What else are you assuming? Well, at the very least I expect that I could do *something* . If I get an object of this type passed to my extension, I at least know it is *some* kind of time object, so poke-and-hope can kick in. There is an excellent chance, for example, that a "timetuple()" method exists. Certainly being able to deal explicitly only with mxDateTime and Python's datetime, without needing to link to either, would still be a huge win. So yeah, having more of a datetime *interface* would be nicer than a marker type, but given a base type and a clear convention for datetime objects, we are pretty close. On the other hand, I guess *just* the existance of a "timetuple()" method is as good an indicator as any. What-goes-around-comes-around ly, Mark. From tim.one@comcast.net Tue Jan 14 02:04:27 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 13 Jan 2003 21:04:27 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <03e001c2bb6d$5b9f9690$530f8490@eden> Message-ID: > Well, I don't see that a basetime marker class alone would > allow for that. What else are you assuming? [Mark Hammond] > Well, at the very least I expect that I could do *something* . 
If > I get an object of this type passed to my extension, I at least know it > is *some* kind of time object, so poke-and-hope can kick in. There is > an excellent chance, for example, that a "timetuple()" method exists. That's a requirement, if the basetime marker is taken as meaning the thing implements /F's proposal. But the meaning of timetuple()'s result isn't so clear, as it contains no time zone info. The spec says "local time", but then you lose all info about what time zone the original object believes it belongs to, and you have to account for your local time's quirks too. The spec could say it's a UTC time instead, but then you also lose time zone info. Or it could do what datetime actually does, giving you a time tuple in the local time *of* the datetime object, not the local time on the box you happen to be running on. You lose time zone info also that way (there's a pattern here : struct tm simply contains no time zone info). At least for a datetime.datetime object, you can call utcoffset() to get the original object's offset (in minutes east of UTC). > Certainly being able to deal explicitly only with mxDateTime and Python's > datetime, without needing to link to either, would still be a huge win. If that's all you really want, a small chain of isinstance() checks would suffice. > So yeah, having more of a datetime *interface* would be nicer > than a marker type, but given a base type and a clear convention for > datetime objects, we are pretty close. > > On the other hand, I guess *just* the existance of a > "timetuple()" method is as good an indicator as any. Provided what timetuple() returns is enough info for your app to live with. Because of the lack of time zone information, I doubt that it will be, and for many apps. 
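
[Editorial sketch, not part of the thread.] The point about timetuple() dropping time zone information can be seen with a few lines of modern Python 3. This assumes today's `datetime` module rather than the 2.3 alpha under discussion: `datetime.timezone` did not exist in 2003, and in current Python `utcoffset()` returns a `timedelta` rather than a count of minutes. The helper name `describe` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# A fixed UTC-5 offset stands in for a real tzinfo subclass.
eastern = timezone(timedelta(hours=-5))

aware = datetime(2003, 1, 13, 21, 4, 27, tzinfo=eastern)
naive = aware.replace(tzinfo=None)


def describe(d):
    """Return (date/time fields from timetuple(), utcoffset()) for d."""
    return d.timetuple()[:6], d.utcoffset()


# The time tuple is expressed in the object's own local time and carries
# no hint of the zone; only utcoffset() tells the two objects apart.
aware_fields, aware_offset = describe(aware)
naive_fields, naive_offset = describe(naive)
```

Here `aware_fields` and `naive_fields` are identical, while `aware_offset` is a five-hour negative `timedelta` and `naive_offset` is `None`.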
From guido@python.org Tue Jan 14 02:06:03 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 21:06:03 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: "Your message of Tue, 14 Jan 2003 11:57:01 +1100." <03b901c2bb67$d9c4bc90$530f8490@eden> References: <03b901c2bb67$d9c4bc90$530f8490@eden> Message-ID: <200301140206.h0E264a11590@pcp02138704pcs.reston01.va.comcast.net> > Certainly, such a subclass would help me enormously, and would > probably make it quite simple for this to allow *any* datetime > object to be passed to COM/Windows functions. If we can do the same > for MAL (ie, a Python datetime magically works anywhere an > mxDateTime did before), then this would be a real bonus :) OK, we can do this. (If someone can cook up a patch that would be great.) Mark, what APIs would you use on a time object? And don't you need an API to *create* time objects? Still waiting for an answer from Marc-Andre... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jan 14 02:09:00 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 21:09:00 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: "Your message of Mon, 13 Jan 2003 20:21:05 EST." References: Message-ID: <200301140209.h0E290A11629@pcp02138704pcs.reston01.va.comcast.net> > [Guido] > > The proposal also recomments an abstract base type, "basetime", for > > all time types. Without this, cmp() is hard to do (see Tim's post for > > explanation; we don't want datetime objects to be comparable to > > objects with arbitrary other types, because the default comparison iss > > meaningless). > > > > This could be a pure "marker" type, like "basestring". Marc-Andre, > > if we export basetime from the core, can mxDateTime subclass from > > that? 
[Tim] > Let me ask a question: when I tried to make datatime.tzinfo a pure > "marker" type, I eventually had to give up, because I absolutely > could not make it work with pickling (and recalling that pickles > produced by datetime.py had to be readable by datetimemodule.c, and > vice versa). Instead I had to make it instantiable (give it an init > method that didn't complain), and require that tzinfo subclasses > also have an __init__method callable with no arguments. Are we able > to get away with basestring because pickle already has deep > knowledge about Python's string types? Detail to be worked out. If the only way to make pickling work is to make basestring instantiable, so be it. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jan 14 02:13:00 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 21:13:00 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: "Your message of Tue, 14 Jan 2003 12:36:27 +1100." <03e001c2bb6d$5b9f9690$530f8490@eden> References: <03e001c2bb6d$5b9f9690$530f8490@eden> Message-ID: <200301140213.h0E2D0F13837@pcp02138704pcs.reston01.va.comcast.net> > On the other hand, I guess *just* the existance of a "timetuple()" > method is as good an indicator as any. Good point. But timetuple() is also the proposal's weakness, because you don't know in which timezone it is expressed. I think utctimetuple() would be more useful -- mxDateTime can support that, and so can datetime, when a tzinfo object is given. You could fall back on timetuple() if utctimetuple() doesn't exist. (Hm, I just found that utctimetuple() returns the same as timetuple() when the tzinfo is None -- that doesn't seem right.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Tue Jan 14 02:20:08 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 13 Jan 2003 21:20:08 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Message-ID: [Brett Cannon] > ... > Well, since this code was added in 2.3a0 and I have no API to worry > about I can easily make use of the code. Is the datetime API all > settled after the last shakedown of the class hierarchy? I doubt it, but toordinal() and weekday() won't go away or change. Even if timetuple() changes, the "Julian day" would still be easily gotten via: (d - d.replace(month=1, day=1)).days + 1 The potential problem with timetuple() now is that /F's spec says it returns (machine) local time, and if we adopt that then the day may change from what's returned now (time zone adjustment can move you one day in either direction). > Or should I hold off for a little while? Might as well cut back on the > code duplication as much as possible. Code dup was my main concern, and these algorithms are irritating enough that it would be good to fix bugs in only one place . > And I take it there is no desire to integrate strptime into datetime at > all (I remember someone saying that this would be going in the wrong > direction although you do have strftime). There's no desire on my or Guido's parts, although I'm not entirely sure why not. A large part of it for me is just avoiding another time sink. As to the rest, I bet we'll be fine if you just plug in the most reasonable values you can, and that aren't likely to tickle platform bugs. That means 1900 for an unspecified year, no months outside 1-12, no days below 1, etc. No leap seconds either, and no -1 except for tm_isdst. That should give Kevin the roundtrip date behavior he needs, and shouldn't screw anyone. 
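
[Editorial sketch, not part of the thread.] The "Julian day" expression quoted above can be checked directly. This is a minimal sketch assuming a modern Python 3 `datetime.date`; the function name `day_of_year` is invented for illustration.

```python
from datetime import date


def day_of_year(d):
    """Return the 1-based day of the year, using the expression from the message."""
    # Subtracting January 1 of the same year gives a timedelta; its .days
    # is zero-based, hence the + 1.
    return (d - d.replace(month=1, day=1)).days + 1


# Cross-check against the tm_yday field of the time tuple.
sample = date(2003, 1, 14)
matches_timetuple = day_of_year(sample) == sample.timetuple().tm_yday
```

Because the expression only uses subtraction and replace(), it would keep working even if the meaning of timetuple() changed, which is the point made in the message.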
From skip@pobox.com Tue Jan 14 02:22:38 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 13 Jan 2003 20:22:38 -0600
Subject: [Python-Dev] Interop between datetime and mxDateTime
In-Reply-To:
References: <03b901c2bb67$d9c4bc90$530f8490@eden>
Message-ID: <15907.29806.635259.42888@montanaro.dyndns.org>

    Tim> Let me ask a question: when I tried to make datetime.tzinfo a pure
    Tim> "marker" type, I eventually had to give up, because I absolutely
    Tim> could not make it work with pickling (and recalling that pickles
    Tim> produced by datetime.py had to be readable by datetimemodule.c, and
    Tim> vice versa).

I haven't been following any of the datetime machinations closely, but
this issue of reading pickles caught my eye. Why would the two versions
have to be able to instantiate each other's pickles? That seems like an
unreasonable constraint to place on things. Is there something
platform-dependent in datetimemodule.c that would mean it couldn't be
built everywhere? I thought datetime.py was simply a place to try out
new ideas before moving them to C.

Skip

From tim.one@comcast.net Tue Jan 14 02:31:28 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 13 Jan 2003 21:31:28 -0500
Subject: [Python-Dev] Interop between datetime and mxDateTime
In-Reply-To: <200301140213.h0E2D0F13837@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

[Guido]
> (Hm, I just found that utctimetuple() returns the same as timetuple()
> when the tzinfo is None

Yes. Also if d.tzinfo is not None but d.tzinfo.utcoffset(d) returns 0 or
None.

> -- that doesn't seem right.)

Because ...?

I'd rather get rid of utctimetuple(), supply a datetime.utc tzinfo
subclass out of the box, and change the spelling of

    d.utctimetuple()

to

    d.timetuple(tzinfo=utc)

Similarly for datetime.now() vs datetime.utcnow(). The proliferation of
ABCutcXYZ methods is confusing, especially since they return naive
objects.

None of that helps make for a lowest common denominator, though.
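
[Editorial sketch, not part of the thread.] The `timetuple(tzinfo=utc)` spelling is a proposal made in the message above, and nothing in this archive confirms it shipped. As a sketch under that caveat, a similar effect can be had in modern Python 3 by converting first and then asking for the time tuple. The helper name `utc_timetuple` is invented here, and raising for naive objects is only one possible choice, made for this sketch.

```python
from datetime import datetime, timedelta, timezone


def utc_timetuple(d):
    """Return d's time tuple expressed in UTC.

    Raises ValueError when the UTC offset is unknown (naive datetime, or
    a tzinfo whose utcoffset() returns None) instead of guessing.
    """
    if d.utcoffset() is None:
        raise ValueError("UTC offset unknown; cannot convert to UTC")
    # astimezone() and timezone.utc are modern-Python names, assumed here.
    return d.astimezone(timezone.utc).timetuple()


aware = datetime(2003, 1, 13, 21, 31, 28,
                 tzinfo=timezone(timedelta(hours=-5)))
utc_fields = utc_timetuple(aware)[:6]
```

A single conversion step plus the ordinary timetuple() covers what the separate utc-prefixed method does for aware objects, which is the simplification the message argues for.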
From tim.one@comcast.net Tue Jan 14 02:34:11 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 13 Jan 2003 21:34:11 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <15907.29806.635259.42888@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > I haven't been following any of the datetime machinations > closely, but this issue of reading pickles caught my eye. Why > would the two versions have to be able to instantiate each others' > pickles? Zope Corp funded this work for Zope3, Zope3 has to run under Python 2.2.2 at first, datetimemodule.c relies on new-in-2.3 C features (like METH_CLASS) so can't be used in Zope3 at first, and pickles written by the Python datetime.py (which Zope3 is using at first) need to be readable later when Zope moves to Python 2.3. Vice versa also for Zope3 reasons. From bac@OCF.Berkeley.EDU Tue Jan 14 02:42:25 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Mon, 13 Jan 2003 18:42:25 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: References: Message-ID: [Tim Peters] > [Brett Cannon] > > ... > > Well, since this code was added in 2.3a0 and I have no API to worry > > about I can easily make use of the code. Is the datetime API all > > settled after the last shakedown of the class hierarchy? > > I doubt it, but toordinal() and weekday() won't go away or change. Even if > timetuple() changes, the "Julian day" would still be easily gotten via: > > (d - d.replace(month=1, day=1)).days + 1 > > The potential problem with timetuple() now is that /F's spec says it returns > (machine) local time, and if we adopt that then the day may change from > what's returned now (time zone adjustment can move you one day in either > direction). > I will just hold off for now but plan on making the switch before 2.3 goes final. > > Or should I hold off for a little while? Might as well cut back on the > > code duplication as much as possible. 
> > Code dup was my main concern, and these algorithms are irritating enough > that it would be good to fix bugs in only one place . > =) You can say that again. > > And I take it there is no desire to integrate strptime into datetime at > > all (I remember someone saying that this would be going in the wrong > > direction although you do have strftime). > > There's no desire on my or Guido's parts, although I'm not entirely sure why > not. A large part of it for me is just avoiding another time sink. > Well, I would assume the responsibility of maintaining it would fall on my shoulders. But leaving it out is fine with me. > As to the rest, I bet we'll be fine if you just plug in the most reasonable > values you can, and that aren't likely to tickle platform bugs. That means > 1900 for an unspecified year, no months outside 1-12, no days below 1, etc. > No leap seconds either, and no -1 except for tm_isdst. That should give > Kevin the roundtrip date behavior he needs, and shouldn't screw anyone. > OK, I will then plan on writing up a patch to give reasonable default values to ``_strptime``. I will wait a week, though, to make sure there are no latent comments on any of this. Also, is there any desire for the C wrapper to do any checking of the values so we can change the docs and guarantee that any value returned by ``time.strptime()`` will have valid values? -Brett From guido@python.org Tue Jan 14 02:43:34 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 21:43:34 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: "Your message of Mon, 13 Jan 2003 21:31:28 EST." References: Message-ID: <200301140243.h0E2hYu18479@pcp02138704pcs.reston01.va.comcast.net> > [Guido] > > (Hm, I just found that utctimetuple() returns the same as timetuple() > > when the tzinfo is None [Tim] > Yes. Also if d.tzinfo is not None but d.tzinfo.utcoffset(d) returns 0 or > None. > > > -- that doesn't seem right.) > > Because ...? 
Because if utcoffset() returns None it's misleading to pretend to be able to return a UTC tuple. > I'd rather get rid of utctimetuple(), supply a datetime.utc > tzinfo subclass out of the box, and change the spelling of > > d.utctimetuple() > > to > > d.timetuple(tzinfo=utc) > > Similarly for datetime.now() vs datetime.utcnow(). The proliferation of > ABCutcXYZ methods is confusing, especially since they return naive objects. Tell you what. I sincerely doubt that we'll be able to agree on useful semantics for the common base API. /F's proposal doesn't map well on datetime, and that pretty much kills the idea. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Tue Jan 14 03:29:30 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 13 Jan 2003 22:29:30 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <200301140243.h0E2hYu18479@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido] >>> (Hm, I just found that utctimetuple() returns the same as timetuple() >>> when the tzinfo is None [Tim] >> Yes. Also if d.tzinfo is not None but d.tzinfo.utcoffset(d) >> returns 0 or None. >>> -- that doesn't seem right.) >> Because ...? [Guido] > Because if utcoffset() returns None it's misleading to pretend to be > able to return a UTC tuple. So you would like to see instead ...? An exception? You didn't mention the utcoffset() is None case at the start, just the tzinfo is None case. Do you view those as being the same thing in the end? >> I'd rather get rid of utctimetuple(), supply a datetime.utc >> tzinfo subclass out of the box, and change the spelling of >> >> d.utctimetuple() >> >> to >> >> d.timetuple(tzinfo=utc) >> >> Similarly for datetime.now() vs datetime.utcnow(). The proliferation >> of ABCutcXYZ methods is confusing, especially since they return >> naive objects. > Tell you what. I sincerely doubt that we'll be able to agree on > useful semantics for the common base API. 
/F's proposal doesn't map > well on datetime, and that pretty much kills the idea. I tend to agree, but the comments about timetuple() vs utctimetuple() now() vs utcnow() fromtimestamp() vs utcfromtimestamp() are about the current datetime API, not really related to /F's proposal. Since datetime and datetimetz objects are no longer distinct, all spellings of timetuple(), now() and fromtimestamp() have an optional tzinfo argument now. This seems to me to make the 3 additional utc spellings of these methods at least extravagant. The semantics of these things are all wrong anyway (as in the SF bug report about this). I'd like to simplify this. From tim.one@comcast.net Tue Jan 14 03:40:46 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 13 Jan 2003 22:40:46 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Message-ID: [Brett Cannon, on folding strptime into datetime] > Well, I would assume the responsibility of maintaining it would fall on > my shoulders. But leaving it out is fine with me. I also maintain a pure-Python version of datetime.py (for Zope's use). date/time modules are a bottomless pit. > ... > Also, is there any desire for the C wrapper The C wrapper for what? > to do any checking of the values so we can change the docs and > guarantee that any value returned by ``time.strptime()`` will have > valid values? I think we should lose the C version of strptime and use _strptime.py everywhere now -- allowing x-platform accidents to sneak thru is un-Pythonic (unless they're *valuable* x-platform accidents -- strptime accidents are random crap). From guido@python.org Tue Jan 14 04:02:55 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 23:02:55 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: "Your message of Mon, 13 Jan 2003 22:29:30 EST." 
References: Message-ID: <200301140402.h0E42th18726@pcp02138704pcs.reston01.va.comcast.net> > [Guido] > >>> (Hm, I just found that utctimetuple() returns the same as timetuple() > >>> when the tzinfo is None > > [Tim] > >> Yes. Also if d.tzinfo is not None but d.tzinfo.utcoffset(d) > >> returns 0 or None. > > >>> -- that doesn't seem right.) > > >> Because ...? > > [Guido] > > Because if utcoffset() returns None it's misleading to pretend to be > > able to return a UTC tuple. [Tim] > So you would like to see instead ...? An exception? Probably -- None isn't very helpful IMO, but either one is probably okay. > You didn't mention the utcoffset() is None case at the start, just the > tzinfo is None case. Do you view those as being the same thing in the end? Insofar as that the UTC is unknown in either case, yes. > >> I'd rather get rid of utctimetuple(), supply a datetime.utc > >> tzinfo subclass out of the box, and change the spelling of > >> > >> d.utctimetuple() > >> > >> to > >> > >> d.timetuple(tzinfo=utc) > >> > >> Similarly for datetime.now() vs datetime.utcnow(). The proliferation > >> of ABCutcXYZ methods is confusing, especially since they return > >> naive objects. > > > Tell you what. I sincerely doubt that we'll be able to agree on > > useful semantics for the common base API. /F's proposal doesn't map > > well on datetime, and that pretty much kills the idea. > > I tend to agree, but the comments about > > timetuple() vs utctimetuple() > now() vs utcnow() > fromtimestamp() vs utcfromtimestamp() > > are about the current datetime API, not really related to /F's > proposal. Since datetime and datetimetz objects are no longer > distinct, all spellings of timetuple(), now() and fromtimestamp() > have an optional tzinfo argument now. This seems to me to make the > 3 additional utc spellings of these methods at least extravagant. > The semantics of these things are all wrong anyway (as in the SF bug > report about this). I'd like to simplify this. 
Please do, and don't feel constrained by /F's proposal. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Jan 14 04:06:41 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 13 Jan 2003 23:06:41 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: "Your message of Mon, 13 Jan 2003 22:40:46 EST." References: Message-ID: <200301140406.h0E46fk18814@pcp02138704pcs.reston01.va.comcast.net> > I think we should lose the C version of strptime and use > _strptime.py everywhere now -- allowing x-platform accidents to > sneak thru is un-Pythonic (unless they're *valuable* x-platform > accidents -- strptime accidents are random crap). Guess what. Through clever use of the time-machine, we *are* using _strptime.py everwhere now. There's an #undef HAVE_STRPTIME in timemodule.c. --Guido van Rossum (home page: http://www.python.org/~guido/) From bac@OCF.Berkeley.EDU Tue Jan 14 10:05:19 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Tue, 14 Jan 2003 02:05:19 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: <200301140406.h0E46fk18814@pcp02138704pcs.reston01.va.comcast.net> References: <200301140406.h0E46fk18814@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Guido van Rossum] > > I think we should lose the C version of strptime and use > > _strptime.py everywhere now -- allowing x-platform accidents to > > sneak thru is un-Pythonic (unless they're *valuable* x-platform > > accidents -- strptime accidents are random crap). > > Guess what. Through clever use of the time-machine, we *are* using > _strptime.py everwhere now. There's an #undef HAVE_STRPTIME in > timemodule.c. > I need one of those new-fangled time-machines. Sure seem handy. 
=) -Brett From tim.one@comcast.net Tue Jan 14 13:47:58 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 14 Jan 2003 08:47:58 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: <200301140406.h0E46fk18814@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Tim] >> I think we should lose the C version of strptime and use >> _strptime.py everywhere now -- allowing x-platform accidents to >> sneak thru is un-Pythonic (unless they're *valuable* x-platform >> accidents -- strptime accidents are random crap). [Guido] > Guess what. Through clever use of the time-machine, we *are* using > _strptime.py everwhere now. There's an #undef HAVE_STRPTIME in > timemodule.c. I know. I want to delete the C wrapper too and get rid of HAVE_STRPTIME: full steam ahead, no looking back. This is pushing back against the growing notion that the way to deal with legacy platform strptime quirks is to keep an option open to recompile Python, to avoid using the portable code. Since that will be the easiest way out (one person here has already taken it), we'll never get Python's own strptime story straight so long as it's an option. From guido@python.org Tue Jan 14 14:18:34 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 14 Jan 2003 09:18:34 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Your message of "Tue, 14 Jan 2003 02:05:19 PST." References: <200301140406.h0E46fk18814@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200301141418.h0EEIYA28186@odiug.zope.com> > > > I think we should lose the C version of strptime and use > > > _strptime.py everywhere now -- allowing x-platform accidents to > > > sneak thru is un-Pythonic (unless they're *valuable* x-platform > > > accidents -- strptime accidents are random crap). > > > > Guess what. Through clever use of the time-machine, we *are* using > > _strptime.py everwhere now. There's an #undef HAVE_STRPTIME in > > timemodule.c. 
> > I need one of those new-fangled time-machines. Sure seem handy. =) We still need to make a final decision about this. Instead of #undef HAVE_STRPTIME, the code in timemodule.c that uses _strptime.py should simply be included unconditionally. Arguments for always using _strptime.py: - consistency across platforms - avoids buggy or non-conforming platform strptime() implementations Arguments for using the platform strptime() if it exists: - speed - may contain platform-specific extensions - consistency with other apps on the same platform --Guido van Rossum (home page: http://www.python.org/~guido/) From python@rcn.com Tue Jan 14 14:35:26 2003 From: python@rcn.com (Raymond Hettinger) Date: Tue, 14 Jan 2003 09:35:26 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS References: <200301140406.h0E46fk18814@pcp02138704pcs.reston01.va.comcast.net> <200301141418.h0EEIYA28186@odiug.zope.com> Message-ID: <013801c2bbda$2e5b5f80$cb11a044@oemcomputer> > We still need to make a final decision about this. Instead of #undef > HAVE_STRPTIME, the code in timemodule.c that uses _strptime.py should > simply be included unconditionally. > > Arguments for always using _strptime.py: > > - consistency across platforms > > - avoids buggy or non-conforming platform strptime() implementations > > Arguments for using the platform strptime() if it exists: > > - speed > > - may contain platform-specific extensions > > - consistency with other apps on the same platform My vote is for always using _strptime.py. * It is the cleanest solution. * Someone building new apps would be well advised to avoid the quirky, bug-ridden, and non-portable platform strptime() implementations. Upon their next upgrade, their code is at risk of changing behavior. * Speed is a non-issue. If it becomes critical, then the appropriate part of _strptime.py could be coded in C to match the pure Python version instead of the system version.
* The only significant con is consistency with other apps on the same platform. However, cross-platform consistency is at least as important and Python users should reasonably expect to have it. my-two-cents-ly yours, Raymond Hettinger From guido@python.org Tue Jan 14 15:03:30 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 14 Jan 2003 10:03:30 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Your message of "Tue, 14 Jan 2003 08:47:58 EST." References: Message-ID: <200301141503.h0EF3US28503@odiug.zope.com> > Subject: RE: [Python-Dev] Broken strptime in Python 2.3a1 & CVS > From: Tim Peters > To: guido@python.org > Cc: Brett Cannon , python-dev@python.org > Date: Tue, 14 Jan 2003 08:47:58 -0500 > X-warning: 24.153.64.2 in blacklist at unconfirmed.dsbl.org > (http://dsbl.org/listing.php?24.153.64.2) > X-Spam-Level: > X-MailScanner: Found to be clean > > [Tim] > >> I think we should lose the C version of strptime and use > >> _strptime.py everywhere now -- allowing x-platform accidents to > >> sneak thru is un-Pythonic (unless they're *valuable* x-platform > >> accidents -- strptime accidents are random crap).
> > [Guido] > > Guess what. Through clever use of the time-machine, we *are* using > > _strptime.py everwhere now. There's an #undef HAVE_STRPTIME in > > timemodule.c. [Tim] > I know. I want to delete the C wrapper too and get rid of > HAVE_STRPTIME: full steam ahead, no looking back. This is pushing > back against the growing notion that the way to deal with legacy > platform strptime quirks is to keep an option open to recompile > Python, to avoid using the portable code. Since that will be the > easiest way out (one person here has already taken it), we'll never > get Python's own strptime story straight so long as it's an option. OK. Let's get rid of the C wrapper around the C library's strptime(). The C wrapper around _strptime.strptime() stays, of course. It currently has a bit of an inefficiency (what happens when it tries to import _strptime is a lot more than I'd like to see happen for each call) but that's a somewhat tricky issue that I'd like to put off for a little while; I've added a SF bug report as a reminder. (667770) --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Jan 14 16:30:37 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 14 Jan 2003 17:30:37 +0100 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <200301140206.h0E264a11590@pcp02138704pcs.reston01.va.comcast.net> References: <03b901c2bb67$d9c4bc90$530f8490@eden> <200301140206.h0E264a11590@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E243B2D.8070709@lemburg.com> Guido van Rossum wrote: >>Certainly, such a subclass would help me enormously, and would >>probably make it quite simple for this to allow *any* datetime >>object to be passed to COM/Windows functions. If we can do the same >>for MAL (ie, a Python datetime magically works anywhere an >>mxDateTime did before), then this would be a real bonus :) > > > OK, we can do this. (If someone can cook up a patch that would be > great.) 
> > Mark, what APIs would you use on a time object? And don't you need an > API to *create* time objects? > > Still waiting for an answer from Marc-Andre... An abstract baseclass would only help all the way if I can make mxDateTime objects new style classes. That's not going to happen for a few months because I don't have any requirement for it. Now for interop, I'm basically interested in adding support for most of the binary operations ... mxD = mx.DateTime.DateTime mxDD = mx.DateTime.DateTimeDelta dt = datetime t = timedelta d = date * mxD - d (assuming 0:00 as time part), mxD - dt * mxD -/+ t * mxDD + d (assuming 0:00 as time part), mxDD + dt * mxDD -/+ t * mxD < d * mxDD < t (and reverse order) ... and constructors * DateTimeFrom(dt), DateTimeFrom(d) * DateTimeDeltaFrom(t) etc. In order to get the binary ops to work, I'll have to enable new style number support in mxDateTime and drop the 1.5.2 support :-( Now the problem I see is when an API expects a datetime object and gets an mxDateTime object instead. For mxDateTime I have solved this by simply letting float(mxD) return a Unix ticks value and float(mxDD) return the value in seconds since midnight -- this makes mxDateTime objects compatible with all APIs which have previously only accepted Unix ticks. mxDateTime also does mixed type operations using Unix ticks if it doesn't know the other type. So perhaps we need something like this: * a basedate class which is accessible at C level * compatibility to Unix ticks floats (nb_float) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From oren-py-d@hishome.net Tue Jan 14 17:20:50 2003 From: oren-py-d@hishome.net (Oren Tirosh) Date: Tue, 14 Jan 2003 12:20:50 -0500 Subject: [Python-Dev] Fwd: Re: PEP 267 improvement idea Message-ID: <20030114172050.GA80173@hishome.net> I guess this really belongs on python-dev, not python-list. ----- Forwarded message from Oren Tirosh ----- X-Envelope-To: oren-py-l@hishome.net From: Oren Tirosh To: Skip Montanaro Cc: python-list@python.org Subject: Re: PEP 267 improvement idea (fwd) Date: Tue, 14 Jan 2003 12:14:33 -0500 On Tue, Jan 14, 2003 at 09:54:48AM -0600, Skip Montanaro wrote: > > Oren> I know Guido likes the idea but the implementation is still far > Oren> from complete. > > Can you elaborate on its incompleteness? Ummmm... I've experimented with several different strategies and tricks. I'm not so sure which ones are actually implemented in the latest version of the patch on SourceForge. One of the big improvements was speeding up failed lookups by adding negative entries to "positively" identify keys not in the dictionary (me_value == NULL, me_key != NULL, != dummy). This considerably speeds up the global/builtin lookup chain. Management of these entries is somewhere between limited and nonexistent. New entries are simply not added if there is not enough free space in the table, and resizing a dictionary with negative entries is probably buggy. Negative entries should also help the instance/class/superclass lookup chain but it segfaults if used for anything other than LOAD_NAME and LOAD_GLOBAL so there must be other bugs lurking there. Inlining works really fast - but only if the entry is found in the first hash probe. I experimented with dynamically shuffling the entries to ensure that the entry accessed most frequently would be first >90% of the time instead of ~75% of the time. This is not in the patch and I don't have code that is anywhere near usable.
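The negative-entry idea can be illustrated with a toy model in present-day Python. This is only a sketch of the concept, not the patch's C implementation: the class NegativeCachingNamespace, its get/store methods, the full_searches counter and the lookup helper are all hypothetical names, and a separate set stands in for the special dictionary entries (me_value == NULL) the patch uses.

```python
class NegativeCachingNamespace:
    """Toy namespace that remembers failed lookups as negative entries."""

    def __init__(self, mapping=None):
        self._data = dict(mapping or {})
        self._negative = set()    # names already known to be absent
        self.full_searches = 0    # lookups that had to search _data

    def get(self, name):
        # A negative entry answers "not here" without a full search.
        if name in self._negative:
            raise KeyError(name)
        self.full_searches += 1
        try:
            return self._data[name]
        except KeyError:
            self._negative.add(name)
            raise

    def store(self, name, value):
        # Storing a value must invalidate any negative entry for the name.
        self._negative.discard(name)
        self._data[name] = value


def lookup(name, global_ns, builtin_ns):
    """Mimic the global -> builtin lookup chain."""
    try:
        return global_ns.get(name)
    except KeyError:
        return builtin_ns.get(name)


global_ns = NegativeCachingNamespace({"foo": 7})
builtin_ns = NegativeCachingNamespace({"len": len})

# Repeated lookups of a builtin miss in the globals every time, but only
# the first miss costs a full search of the global namespace.
for _ in range(3):
    lookup("len", global_ns, builtin_ns)

print("full searches in globals:", global_ns.full_searches)
print("full searches in builtins:", builtin_ns.full_searches)
```

The store() method shows the bookkeeping cost mentioned in the message: every negative entry has to be dropped as soon as the name is actually bound.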
Interning - now that interned strings are not immortal this patch could be much more aggressive about interning everything. Dictionaries used as namespaces could be guaranteed to have only interned strings as entry keys and avoid the need for a second search pass on first failed lookup (before the negative entry is added). This is important because many small temporary objects have attributes that are accessed only once. Currently this first access is made slower by the patch instead of faster. The last strategy I was going to try before my free time shrank considerably (a job found me) was to create a new species of dictionary in addition to "lookdict" and "lookdict_string". Dictionaries will start their life as either "lookdict_string" or "lookdict_namespace", depending on their expected usage. This is just an optimization hint - the semantics are otherwise identical. A "lookdict_namespace" dictionary interns all keys and adds negative entries for failed lookups. This overhead pays off when it's used as a namespace. Both "lookdict_namespace" and "lookdict_string" will fall back to a "lookdict" dictionary when a non-string key is inserted. Optimizing or inlining setitem for the common case where the entry already exists and is only set to a new value would also be nice. Oren -- http://mail.python.org/mailman/listinfo/python-list ----- End forwarded message ----- From guido@python.org Tue Jan 14 17:29:34 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 14 Jan 2003 12:29:34 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: Your message of "Tue, 14 Jan 2003 17:30:37 +0100." <3E243B2D.8070709@lemburg.com> References: <03b901c2bb67$d9c4bc90$530f8490@eden> <200301140206.h0E264a11590@pcp02138704pcs.reston01.va.comcast.net> <3E243B2D.8070709@lemburg.com> Message-ID: <200301141729.h0EHTYh30356@odiug.zope.com> > An abstract baseclass would only help all the way if I can make > mxDateTime objects new style classes.
That's not going to happen > for a few months because I don't have any requirement for it. OK, but see below. > Now for interop, I'm basically interested in adding support > for most of the binary operations ... > > mxD = mx.DateTime.DateTime > mxDD = mx.DateTime.DateTimeDelta > dt = datetime > t = timedelta > d = date > > * mxD - d (assuming 0:00 as time part), mxD - dt > * mxD -/+ t > * mxDD + d (assuming 0:00 as time part), mxDD + dt > * mxDD -/+ t > * mxD < d > * mxDD < t These you can all do. > (and reverse order) That is also doable, *except* for the comparisons. That's why I was proposing that you inherit from basetime. The datetime module's comparison currently always raises TypeError when the other argument isn't a datetime instance; my proposal would be to return NotImplemented if the other isn't a datetime instance but it inherits from basetime. I guess an alternative would be to check whether the other argument "smells like" a time object, e.g. by testing for a "timetuple" attribute (or whatever we agree on). > ... and contructors > > * DateTimeFrom(dt), DateTimeFrom(d) > * DateTimeDeltaFrom(t) > > etc. You should be able to do that. You should get dt.timetuple(), which gives time in dt's local time (not your local time), and then you can convert it to UTC by subtracting dt.utcoffset(). > In order to get the binary ops to work, I'll have to enable > new style number support in mxDateTime and drop the 1.5.2 > support :-( Time to bite that bullet. :-) > Now the problem I see is when an API expects a datetime > object and gets an mxDateTime object instead. Which APIs are those? > For mxDateTime I have solved this by simply letting float(mxD) > return a Unix ticks value and float(mxDD) return the value in > seconds since midnight -- this makes mxDateTime object compatible > to all APIs which have previously only accepted Unix ticks. You mean time.ctime(x), time.localtime(x), and time.gmtime(x)? 
Those operations are available as methods on datetime objects, though with different names and (in the case of localtime) with somewhat different semantics when timezones are involved. > mxDateTime also does mixed type operations using Unix > ticks if it doesn't know the other type. > > So perhaps we need something like this: > * a basedate class which is accessible at C level Um, I thought you just said you couldn't do a base class yet? > * compatibility to Unix ticks floats (nb_float) If you really want that, we could add a __float__ method to datetime, but I see several problems: - What to do if the datetime object's utcoffset() method returns None? Then you can't convert to ticks. I propose an error. - The range of datetime (1 <= year <= 9999) is much larger than the range of ticks (1970 <= year < 2038). I suppose it could raise an exception when the time is not representable as ticks. - A C double doesn't have enough precision for roundtrip guarantees. - Does it really need to be automatic? I.e., does it really need to be __float__()? I'd be less against this if it was an explicit method, e.g. dt.asposixtime(). --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy@udel.edu Tue Jan 14 18:44:37 2003 From: tjreedy@udel.edu (Terry Reedy) Date: Tue, 14 Jan 2003 13:44:37 -0500 Subject: [Python-Dev] Re: Re: PEP 267 improvement idea References: <20030114172050.GA80173@hishome.net> Message-ID: > overhead pays when it's used as a namespace. Both "lookdict_namespace" > and "lookdict_string" will fall back to a "lookdict" dictionary when a > non-string key is inserted. Until reading this sentence, I had never thought of doing something so bizarre as adding a non-string key to a namespace (via the 'backdoor' .__dict__ reference). But it currently works.
>>> import __main__ >>> __main__.__dict__[1]='x' >>> dir() [1, '__builtins__', '__doc__', '__main__', '__name__', 'x'] However, since the implementation of namespaces is an implementation detail, and since I can (as yet) see no need for the above (except to astound and amaze), I think you (and Guido) should feel free to disallow this, especially if doing so facilitates speed improvements. Terry J. Reedy From pedronis@bluewin.ch Tue Jan 14 19:14:12 2003 From: pedronis@bluewin.ch (Samuele Pedroni) Date: Tue, 14 Jan 2003 20:14:12 +0100 Subject: [Python-Dev] Re: Re: PEP 267 improvement idea References: <20030114172050.GA80173@hishome.net> Message-ID: <001b01c2bc01$1f9bf960$6d94fea9@newmexico> From: "Terry Reedy" > > overhead pays when it's used as a namespace. Both > "lookdict_namespace" > > and "lookdict_string" will fall back to a "lookdict" dictionary when > a > > non-string key is inserted. > > Until reading this sentence, I had never thought of doing something so > bizarre as adding a non-string key to a namespace (via the 'backdoor' > .__dict__ reference). But it currently works. > > >>> import __main__ > >>> __main__.__dict__[1]='x' > >>> dir() > [1, '__builtins__', '__doc__', '__main__', '__name__', 'x'] > > However, since the implementation of namespaces is an implementation > detail, and since I can (as yet) see no need for the above (except to > astound and amaze), I think you (and Guido) should feel free to > disallow this, especially if doing so facilitates speed improvements. > I may add that Jython already does that, and gets away with it without much ado: Jython 2.1 on java1.3.0 (JIT: null) Type "copyright", "credits" or "license" for more information. >>> import __main__ >>> __main__.__dict__[1]=2 Traceback (innermost last): File "", line 1, in ? TypeError: keys in namespace must be strings >>> The same goes for class and instance dicts.
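As a rough illustration of the restriction discussed in the two messages above, here is a minimal sketch of a string-keys-only mapping in present-day Python. It is not code from CPython, Jython, or any poster in this thread: the class name StringKeyNamespace is hypothetical, and only the error message is borrowed from the Jython session quoted above.

```python
class StringKeyNamespace(dict):
    """Hypothetical namespace mapping that rejects non-string keys."""

    def __init__(self, *args, **kwargs):
        # Route initial contents through update() so they are checked too.
        super().__init__()
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys in namespace must be strings")
        super().__setitem__(key, value)

    def update(self, *args, **kwargs):
        # dict.update() bypasses __setitem__, so re-implement it on top of it.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def setdefault(self, key, default=None):
        # dict.setdefault() also bypasses __setitem__.
        if key not in self:
            self[key] = default
        return self[key]


ns = StringKeyNamespace()
ns["foo"] = 7              # ordinary name binding: accepted

try:
    ns[1] = "x"            # the 'backdoor' trick from the thread: rejected
except TypeError as exc:
    rejected = str(exc)
```

Because every key is known to be a string, a mapping like this would never need the fallback to a general-purpose lookup routine that the PEP 267 discussion mentions.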
From jack@performancedrivers.com Tue Jan 14 19:31:21 2003 From: jack@performancedrivers.com (Jack Diederich) Date: Tue, 14 Jan 2003 14:31:21 -0500 Subject: [Python-Dev] Re: Re: PEP 267 improvement idea In-Reply-To: ; from tjreedy@udel.edu on Tue, Jan 14, 2003 at 01:44:37PM -0500 References: <20030114172050.GA80173@hishome.net> Message-ID: <20030114143121.C13035@localhost.localdomain> On Tue, Jan 14, 2003 at 01:44:37PM -0500, Terry Reedy wrote: > > overhead pays when it's used as a namespace. Both > "lookdict_namespace" > > and "lookdict_string" will fall back to a "lookdict" dictionary when > a > > non-string key is inserted. > > Until reading this sentence, I had never thought of doing something so > bizarre as adding a non-string key to a namespace (via the 'backdoor' > .__dict__ reference). But it currently works. > > >>> import __main__ > >>> __main__.__dict__[1]='x' > >>> dir() > [1, '__builtins__', '__doc__', '__main__', '__name__', 'x'] > > However, since the implementation of namespaces is an implementation > detail, and since I can (as yet) see no need for the above (except to > astound and amaze), I think you (and Guido) should feel free to > disallow this, especially if doing so facilitates speed improvements. I think namespaces should be their own type, even if in name only, so all those PEPs will have a drop-in place to experiment. The fact that namespaces are implemented with regular dicts is incidental. I have a couple of experiments I'd love to do, but currently fooling with namespaces is very touchy. Granted if you fscked up the namespace type it would be hard to debug, but at least you would know it was the namespace type and not a side effect of breaking dict as well.
The initial definition could be a string-key-only dict; down the road, the following code: import mymodule foo = 7 could generate the following events namespace = __builtins__ # init namespace['mymodule.'] = mymodule.__dict__ # import 'mymodule' namespace['foo'] = 7 # create and assign local variable 'foo' -jackdied From tim.one@comcast.net Tue Jan 14 19:59:52 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 14 Jan 2003 14:59:52 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <200301141729.h0EHTYh30356@odiug.zope.com> Message-ID: [Guido] > ... > - The range of datetime (1 <= year <= 9999) is much larger than the > range of ticks (1970 <= year < 2038). I suppose it could raise an > exception when the time is not representable as ticks. The range of ticks is platform-dependent, if you want to use C library functions that accept ticks. The small range you're thinking of is due to platforms using 32-bit ints to represent ticks. A float has ("almost always has") 53 bits of precision, so as long as the timestamp isn't fed into platform C routines there's no problem expressing the whole range of dates datetime supports (to 1-second resolution, that only requires about 38 bits; so we wouldn't have a problem with the whole range even to millisecond resolution). > - A C double doesn't have enough precision for roundtrip guarantees. That part is so -- it would require a little more than 58 bits to roundtrip microseconds correctly too. There's another glitch: MAL plays tricks trying to accommodate boxes set up to support leap seconds. datetime does not, and on such a box there exist timestamps t1 and t2 such that t1 - t2 == 1 but where datetime.fromtimestamp(t1) == datetime.fromtimestamp(t2).
The reason is that fromtimestamp() uses the platform localtime() or gmtime() to convert the timestamp to a struct tm, and then ruthlessly clamps tm_sec to be no larger than 59 (on boxes supporting leap seconds, the current standards allow for tm_sec to be 60 too, and older standards also allowed for tm_sec to be 61; if datetime sees one of those, it knocks it down to 59). datetime's fromtimestamp() should probably be reworked not to use the platform localtime()/gmtime(), implementing a "pure" POSIX timestamp regardless of platform delusions. OTOH, timestamps have no value in datetime now except as a legacy gimmick. > - Does it really need to be automatic? I.e., does it really need to > be __float__()? I'd be less against this if it was an explicit > method, e.g. dt.asposixtime(). Me too. totimestamp() would make most sense for a name (given existing methods like toordinal(), fromordinal(), and fromtimestamp()). From guido@python.org Tue Jan 14 20:30:45 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 14 Jan 2003 15:30:45 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: Your message of "Tue, 14 Jan 2003 14:59:52 EST." References: Message-ID: <200301142030.h0EKUjg31629@odiug.zope.com> > [Guido] > > ... > > - The range of datetime (1 <= year <= 9999) is much larger than the > > range of ticks (1970 <= year < 2038). I suppose it could raise an > > exception when the time is not representable as ticks. [Tim] > The range of ticks is platform-dependent, if you want to use C library > functions that accept ticks. So is the epoch -- only POSIX requires it to be 1-1-1970. I think the C standard doesn't constrain this at all (it doesn't even have to be a numeric type). In fact, non-POSIX systems are allowed to incorporate leap seconds into their ticks, which makes it hard to understand how ticks could be computed except by converting to local time first (a problem in itself) and then using mktime().
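The precision figures mentioned in this exchange can be checked with a few lines of present-day Python. This is a minimal sketch, assuming the modern datetime module's min/max bounds (year 1 through year 9999); the variable names are illustrative and nothing here comes from the 2003 code base.

```python
from datetime import datetime, timedelta

# Full span supported by datetime (year 1 through year 9999).
span = datetime.max - datetime.min

# Exact integer counts of each unit in that span.
seconds = span // timedelta(seconds=1)
millis = span // timedelta(milliseconds=1)
micros = span // timedelta(microseconds=1)

# A C double (a Python float) carries 53 bits of precision.
DOUBLE_BITS = 53

for label, count in (("seconds", seconds),
                     ("milliseconds", millis),
                     ("microseconds", micros)):
    fits = count.bit_length() <= DOUBLE_BITS
    print(f"{label}: {count.bit_length()} bits, fits in a double: {fits}")

# Pushing the largest microsecond count through a float loses information.
roundtrip_exact = int(float(micros)) == micros
print("microsecond roundtrip exact:", roundtrip_exact)
```

bit_length() rounds up to whole bits, so it reports one more than the "about 38" and "a little more than 58" quoted in the thread; the conclusion is the same: second and millisecond resolution fit in a double, microsecond resolution does not.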
> > - Does it really need to be automatic? I.e., does it really need to > > be __float__()? I'd be less against this if it was an explicit > > method, e.g. dt.asposixtime(). > > Me too. totimestamp() would make most sense for a name (given existing > methods like toordinal(), fromordinal(), and fromtimestamp()). OK. --Guido van Rossum (home page: http://www.python.org/~guido/) From bac@OCF.Berkeley.EDU Tue Jan 14 21:14:14 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Tue, 14 Jan 2003 13:14:14 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: <200301141503.h0EF3US28503@odiug.zope.com> References: <200301141503.h0EF3US28503@odiug.zope.com> Message-ID: [Guido van Rossum] > > [Tim] > > I know. I want to delete the C wrapper too and get rid of > > HAVE_STRPTIME: full steam ahead, no looking back. This is pushing > > back against the growing notion that the way to deal with legacy > > platform strptime quirks is to keep an option open to recompile > > Python, to avoid using the portable code. Since that will be the > > easiest way out (one person here has already taken it), we'll never > > get Python's own strptime story straight so long as it's an option. > > OK. Let's get rid of the C wrapper around the C library's strptime(). > > The C wrapper around _strptime.strptime() stays, of course. It > currently has a bit of an inefficiency (what happens when it tries to > import _strptime is a lot more than I'd like to see happen for each > call) but that's a somewhat tricky issue that I'd like to put off for > a little while; I've added a SF bug report as a reminder. (667770) > Anything I can do to help with that? If it is just a matter of re-coding it in a certain way just point me in the direction of docs and an example and I will take care of it. And to comment on the speed drawback: there is already a partial solution to this. 
``_strptime`` has the ability to return the regex it creates to parse the data string and then subsequently have the user pass that in instead of a format string:: strptime_regex = _strptime.strptime('%c', False) #False triggers it for line in log_file: time_tuple = _strptime.strptime(strptime_regex, line) That at least eliminates the overhead of having to rediscover the locale information everytime. I will add a doc patch with the patch that I am going to do that adds the default values explaining this feature if no one has objections (can only think this is an issue if it is decided it would be better to write the whole thing in C and implementing this feature would become useless or too much of a pain). -Brett From guido@python.org Tue Jan 14 21:30:29 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 14 Jan 2003 16:30:29 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Your message of "Tue, 14 Jan 2003 13:14:14 PST." References: <200301141503.h0EF3US28503@odiug.zope.com> Message-ID: <200301142130.h0ELUTg32080@odiug.zope.com> > > The C wrapper around _strptime.strptime() stays, of course. It > > currently has a bit of an inefficiency (what happens when it tries to > > import _strptime is a lot more than I'd like to see happen for each > > call) but that's a somewhat tricky issue that I'd like to put off for > > a little while; I've added a SF bug report as a reminder. (667770) > > > > Anything I can do to help with that? If it is just a matter of re-coding > it in a certain way just point me in the direction of docs and an example > and I will take care of it. The issues are really subtle. E.g. you can't just store the python strptime function in a global, because of multiple independent interpreters and reload(). You can't peek in sys.modules because of rexec.py. If you still want to look into this, be my guest. > And to comment on the speed drawback: there is already a partial solution > to this. 
``_strptime`` has the ability to return the regex it creates to > parse the data string and then subsequently have the user pass that in > instead of a format string:: > > strptime_regex = _strptime.strptime('%c', False) #False triggers it Why False and not None? > for line in log_file: > time_tuple = _strptime.strptime(strptime_regex, line) > > That at least eliminates the overhead of having to rediscover the locale > information everytime. I will add a doc patch with the patch that I am > going to do that adds the default values explaining this feature if no one > has objections (can only think this is an issue if it is decided it would > be better to write the whole thing in C and implementing this feature > would become useless or too much of a pain). Yeah, but this means people have to change their code. OK, I think for speed hacks that's acceptable. --Guido van Rossum (home page: http://www.python.org/~guido/) From bac@OCF.Berkeley.EDU Tue Jan 14 21:41:45 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Tue, 14 Jan 2003 13:41:45 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: <200301142130.h0ELUTg32080@odiug.zope.com> References: <200301141503.h0EF3US28503@odiug.zope.com> <200301142130.h0ELUTg32080@odiug.zope.com> Message-ID: [Guido van Rossum] > > > The C wrapper around _strptime.strptime() stays, of course. It > > > currently has a bit of an inefficiency (what happens when it tries to > > > import _strptime is a lot more than I'd like to see happen for each > > > call) but that's a somewhat tricky issue that I'd like to put off for > > > a little while; I've added a SF bug report as a reminder. (667770) > > > > > > > Anything I can do to help with that? If it is just a matter of re-coding > > it in a certain way just point me in the direction of docs and an example > > and I will take care of it. > > The issues are really subtle. E.g. 
you can't just store the python > strptime function in a global, because of multiple independent > interpreters and reload(). You can't peek in sys.modules because of > rexec.py. > Now I *really* wish we were ripping ``rexec`` out instead of crippling it. =) > If you still want to look into this, be my guest. > I will see what I can do, but it sounds like this is beyond my experience. > > And to comment on the speed drawback: there is already a partial solution > > to this. ``_strptime`` has the ability to return the regex it creates to > > parse the data string and then subsequently have the user pass that in > > instead of a format string:: > > > > strptime_regex = _strptime.strptime('%c', False) #False triggers it > > Why False and not None? > Just playing with booleans at the time. =) I also thought that it made sense: False as in it is false that you are going to get any info out of this. Although, None also makes sense. I can change it easily enough. > > for line in log_file: > > time_tuple = _strptime.strptime(strptime_regex, line) > > > > That at least eliminates the overhead of having to rediscover the locale > > information everytime. I will add a doc patch with the patch that I am > > going to do that adds the default values explaining this feature if no one > > has objections (can only think this is an issue if it is decided it would > > be better to write the whole thing in C and implementing this feature > > would become useless or too much of a pain). > > Yeah, but this means people have to change their code. OK, I think > for speed hacks that's acceptable. > So then I can document it, right? Or should we just leave this as a surprise for the more adventurous who read the source? -Brett From guido@python.org Tue Jan 14 21:46:17 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 14 Jan 2003 16:46:17 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Your message of "Tue, 14 Jan 2003 13:41:45 PST." 
References: <200301141503.h0EF3US28503@odiug.zope.com> <200301142130.h0ELUTg32080@odiug.zope.com> Message-ID: <200301142146.h0ELkHO32158@odiug.zope.com> > [Guido van Rossum] > > > > > The C wrapper around _strptime.strptime() stays, of course. It > > > > currently has a bit of an inefficiency (what happens when it tries to > > > > import _strptime is a lot more than I'd like to see happen for each > > > > call) but that's a somewhat tricky issue that I'd like to put off for > > > > a little while; I've added a SF bug report as a reminder. (667770) > > > > > > > > > > Anything I can do to help with that? If it is just a matter of re-coding > > > it in a certain way just point me in the direction of docs and an example > > > and I will take care of it. > > > > The issues are really subtle. E.g. you can't just store the python > > strptime function in a global, because of multiple independent > > interpreters and reload(). You can't peek in sys.modules because of > > rexec.py. > > > > Now I *really* wish we were ripping ``rexec`` out instead of > crippling it. =) Um, the issues aren't really rexec.py itself, but the general security framework; I think there's still something to say for that in the long run (even though right now it's not secure). > > If you still want to look into this, be my guest. > > I will see what I can do, but it sounds like this is beyond my experience. > > > > And to comment on the speed drawback: there is already a partial solution > > > to this. ``_strptime`` has the ability to return the regex it creates to > > > parse the data string and then subsequently have the user pass that in > > > instead of a format string:: > > > > > > strptime_regex = _strptime.strptime('%c', False) #False triggers it > > > > Why False and not None? > > Just playing with booleans at the time. =) I also thought that it made > sense: False as in it is false that you are going to get any info out of > this. Although, None also makes sense. 
I can change it easily enough. Please fix. > > > for line in log_file: > > > time_tuple = _strptime.strptime(strptime_regex, line) > > > > > > That at least eliminates the overhead of having to rediscover the locale > > > information everytime. I will add a doc patch with the patch that I am > > > going to do that adds the default values explaining this feature if no one > > > has objections (can only think this is an issue if it is decided it would > > > be better to write the whole thing in C and implementing this feature > > > would become useless or too much of a pain). > > > > Yeah, but this means people have to change their code. OK, I think > > for speed hacks that's acceptable. > > So then I can document it, right? Or should we just leave this as a > surprise for the more adventurous who read the source? No, it would be better if you ripped out any other undocumented "surprises" that might still be lurkig in _strptime.py. Or at least owe up to them now so we can decide what to do with them. --Guido van Rossum (home page: http://www.python.org/~guido/) From bac@OCF.Berkeley.EDU Tue Jan 14 21:59:21 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Tue, 14 Jan 2003 13:59:21 -0800 (PST) Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: <200301142146.h0ELkHO32158@odiug.zope.com> References: <200301141503.h0EF3US28503@odiug.zope.com> <200301142130.h0ELUTg32080@odiug.zope.com> <200301142146.h0ELkHO32158@odiug.zope.com> Message-ID: [Guido van Rossum] > > [Guido van Rossum] > > > > Now I *really* wish we were ripping ``rexec`` out instead of > > crippling it. =) > > Um, the issues aren't really rexec.py itself, but the general security > framework; I think there's still something to say for that in the long > run (even though right now it's not secure). > OK, my mistake. > > > Why False and not None? > > > > Just playing with booleans at the time. 
=) I also thought that it made > > sense: False as in it is false that you are going to get any info out of > > this. Although, None also makes sense. I can change it easily enough. > > Please fix. > Sure thing. > > > > for line in log_file: > > > > time_tuple = _strptime.strptime(strptime_regex, line) > > > > > > > > That at least eliminates the overhead of having to rediscover the locale > > > > information everytime. I will add a doc patch with the patch that I am > > > > going to do that adds the default values explaining this feature if no one > > > > has objections (can only think this is an issue if it is decided it would > > > > be better to write the whole thing in C and implementing this feature > > > > would become useless or too much of a pain). > > > > > > Yeah, but this means people have to change their code. OK, I think > > > for speed hacks that's acceptable. > > > > So then I can document it, right? Or should we just leave this as a > > surprise for the more adventurous who read the source? > > No, it would be better if you ripped out any other undocumented > "surprises" that might still be lurkig in _strptime.py. Or at least > owe up to them now so we can decide what to do with them. > Nope, that is the only thing with ``strptime``. There is other stuff in the module that might be helpful (like a class that discovers all locale info for dates), but they in no way affect how ``strptime`` works. I don't want to keep bothering you with this, but I couldn't deduce from your response clearly whether you want me to document this feature or rip it out or leave it in undocumented. -Brett From tim.one@comcast.net Wed Jan 15 00:06:27 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 14 Jan 2003 19:06:27 -0500 Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS In-Reply-To: Message-ID: [Brett Cannon] > ... > And to comment on the speed drawback: there is already a partial solution > to this. 
``_strptime`` has the ability to return the regex it creates to
> parse the data string and then subsequently have the user pass that in
> instead of a format string::

You're carrying restructured text too far ::

I expect it would be better for strptime to maintain its own internal cache
mapping format strings to compiled regexps (as a dict, indexed by format
strings). Dict lookup is cheap. In most programs, this dict will remain
empty. In most of the rest, it will have one entry. *Some* joker will feed
it an unbounded number of distinct format strings, though, so blow the cache
away if it gets "too big":

    regexp = cache.get(fmtstring)
    if regexp is None:
        regexp = compile_the_regexp(fmtstring)
        if len(cache) > 30: # whatever
            cache.clear()
        cache[fmtstring] = regexp

Then you're robust against all comers (it's also thread-safe).

From guido@python.org Wed Jan 15 00:56:06 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 14 Jan 2003 19:56:06 -0500
Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CVS
In-Reply-To: "Your message of Tue, 14 Jan 2003 19:06:27 EST."
References:
Message-ID: <200301150056.h0F0u6w29207@pcp02138704pcs.reston01.va.comcast.net>

> I expect it would be better for strptime to maintain its own internal cache
> mapping format strings to compiled regexps (as a dict, indexed by format
> strings). Dict lookup is cheap. In most programs, this dict will remain
> empty. In most of the rest, it will have one entry. *Some* joker will feed
> it an unbounded number of distinct format strings, though, so blow the cache
> away if it gets "too big":
>
>     regexp = cache.get(fmtstring)
>     if regexp is None:
>         regexp = compile_the_regexp(fmtstring)
>         if len(cache) > 30: # whatever
>             cache.clear()
>         cache[fmtstring] = regexp
>
> Then you're robust against all comers (it's also thread-safe).

Yes.
I think that Brett mentioned that the compilation is locale-aware, so
it should at least fetch the relevant locale settings and blow away
the cache if the locale has changed.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bac@OCF.Berkeley.EDU Wed Jan 15 01:25:53 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Tue, 14 Jan 2003 17:25:53 -0800 (PST)
Subject: [Python-Dev] Broken strptime in Python 2.3a1 & CV
In-Reply-To:
References:
Message-ID:

[Tim Peters]

> [Brett Cannon]
> > ...
> > And to comment on the speed drawback: there is already a partial solution
> > to this. ``_strptime`` has the ability to return the regex it creates to
> > parse the data string and then subsequently have the user pass that in
> > instead of a format string::
>
> You're carrying restructured text too far ::
>

=) Need the practice; giving a lightning tutorial on it at PyCon. But I
will cut back on the literal markup.

> I expect it would be better for strptime to maintain its own internal cache
> mapping format strings to compiled regexps (as a dict, indexed by format
> strings). Dict lookup is cheap. In most programs, this dict will remain
> empty. In most of the rest, it will have one entry. *Some* joker will feed
> it an unbounded number of distinct format strings, though, so blow the cache
> away if it gets "too big":
>
>     regexp = cache.get(fmtstring)
>     if regexp is None:
>         regexp = compile_the_regexp(fmtstring)
>         if len(cache) > 30: # whatever
>             cache.clear()
>         cache[fmtstring] = regexp
>
> Then you're robust against all comers (it's also thread-safe).
>

Hmm. Could do that. Could also cache the locale information that I
discover (only one copy should be enough; don't think people swap between
locales that often). Caching the object that stores locale info, called
TimeRE (see, no `` `` markup; fast learner I am =), would speed up value
calculations (have to compare against it to figure out what month it is, etc.)
along with creating multiple regexes (since the locale info won't have to be recalculated). And then the cache that you are suggesting, Tim, would completely replace the need to be able to return regex objects. Spiffy. =) OK, so, with the above-mentioned improvements I can rip out the returning of regex objects functionality. I am going to assume no one has any issue with this design idea, so I will do another patch for this (now I have one on SF dealing with a MacOS 9 issue, going to have one doing default values and making the %y directive work the way most people expect it to along with doc changes specifying that you *can* expect reliable behavior, and now a speed-up patch which will also remove my one use of the string module; fun =). Now all I need is Alex to step in here and fiddle with Tim's code and then Christian and Raymond to come in and speed up the underlying C code for Tim's code that Alex touched and we will be in business. =) sometimes-I-think-I-read-too-much-python-dev-mail-ly y'rs, Brett From tim.one@comcast.net Wed Jan 15 04:40:21 2003 From: tim.one@comcast.net (Tim Peters) Date: Tue, 14 Jan 2003 23:40:21 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <200301142030.h0EKUjg31629@odiug.zope.com> Message-ID: [Guido] > So is the epoch -- only POSIX requires it to be 1-1-1970. I think the > C standard doesn't constrain this at all (it doesn't even have to be a > numeric type). That's almost all so -- time_t and clock_t are constrained to be of "arithmetic types" by C99, meaning they must be integers (of some size) or floats (of some size). This is good for Python, because timemodule.c blithely casts time_t to double, and so does datetimemodule.c. The range and precision are left implementation-defined. 
> In fact, non-posix systems are allowed to incorporate leap seconds
> into their ticks, which makes it hard to understand how ticks could be
> computed except by converting to local time first (a problem in
> itself) and then using mktime().

It depends on what "ticks" means. If we take ticks to mean what POSIX
defines "seconds since the epoch" to mean, then I can easily generate a
POSIX timestamp without using platform C functions at all. The formula for
"seconds since the epoch" is given explicitly in the POSIX docs:

    tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 +
        (tm_year-70)*31536000 + ((tm_year-69)/4)*86400 -
        ((tm_year-1)/100)*86400 + ((tm_year+299)/400)*86400

While POSIX explicitly refuses to define the relationship for years before
1970, or for negative "seconds since the epoch" values, extending the
formula to cover those things is trivial (it's enough to convert the
timedelta

    dt - datetime(1970, 1, 1)

to seconds).

POSIX also explicitly warns that

    The relationship between the actual time of day and the current
    value for seconds since the Epoch is unspecified.

which is partly a consequence of that the POSIX formula doesn't allow for
leap seconds, but that leap seconds are part of the definition of UTC.

The other meaning for "ticks" is whatever the platform means by it.
mxDateTime does cater to boxes that account for leap seconds -- Marc-Andre
checks whether 1986-12-31 23:59:59 UTC maps to 536457599 (POSIX) or
536457612 (leap seconds) ticks, and fiddles accordingly, falling back to
platform C functions (IIRC) on non-POSIX boxes. datetime squashes leap
seconds out of existence *when it can detect them*, though.

If there's any hope for a common base API here, I expect it has to follow
the POSIX definition, platform quirks be damned.

From mal@lemburg.com Wed Jan 15 10:13:53 2003
From: mal@lemburg.com (M.-A.
Lemburg) Date: Wed, 15 Jan 2003 11:13:53 +0100 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <200301141729.h0EHTYh30356@odiug.zope.com> References: <03b901c2bb67$d9c4bc90$530f8490@eden> <200301140206.h0E264a11590@pcp02138704pcs.reston01.va.comcast.net> <3E243B2D.8070709@lemburg.com> <200301141729.h0EHTYh30356@odiug.zope.com> Message-ID: <3E253461.7030200@lemburg.com> Guido van Rossum wrote: >>An abstract baseclass would only help all the way if I can make >>mxDateTime objects new style classes. That's not going to happen >>for a few months because I don't have any requirement for it. > > OK, but see below. > >>Now for interop, I'm basically interested in adding support >>for most of the binary operations ... >> >>mxD = mx.DateTime.DateTime >>mxDD = mx.DateTime.DateTimeDelta >>dt = datetime >>t = timedelta >>d = date >> >>* mxD - d (assuming 0:00 as time part), mxD - dt >>* mxD -/+ t >>* mxDD + d (assuming 0:00 as time part), mxDD + dt >>* mxDD -/+ t >>* mxD < d >>* mxDD < t > > These you can all do. Right (with the new style numbers). >>(and reverse order) > > That is also doable, *except* for the comparisons. That's why I was > proposing that you inherit from basetime. The datetime module's > comparison currently always raises TypeError when the other argument > isn't a datetime instance; my proposal would be to return > NotImplemented if the other isn't a datetime instance but it inherits > from basetime. > > I guess an alternative would be to check whether the other argument > "smells like" a time object, e.g. by testing for a "timetuple" > attribute (or whatever we agree on). Doesn't compare use the same coercion scheme as all the other operators ? Ie. if datetimeobject.cmp(datetimeobject, otherobject) returns NotImplemented, wouldn't otherobject.cmp(datetimeobject, otherobject) be called ? (I don't really remember and the rich compares scheme has me pretty confused.) >>... 
and contructors >> >>* DateTimeFrom(dt), DateTimeFrom(d) >>* DateTimeDeltaFrom(t) >> >>etc. > > You should be able to do that. You should get dt.timetuple(), which > gives time in dt's local time (not your local time), and then you can > convert it to UTC by subtracting dt.utcoffset(). Right; provided I can easily test for the datetime types at C level. That doesn't seem to be easily possible, though, since it requires going through Python to get at the type objects. >>In order to get the binary ops to work, I'll have to enable >>new style number support in mxDateTime and drop the 1.5.2 >>support :-( > > > Time to bite that bullet. :-) Oh well. >>Now the problem I see is when an API expects a datetime >>object and gets an mxDateTime object instead. > > > Which APIs are those? None yet, but these are likely to emerge sooner or later :-) >>For mxDateTime I have solved this by simply letting float(mxD) >>return a Unix ticks value and float(mxDD) return the value in >>seconds since midnight -- this makes mxDateTime object compatible >>to all APIs which have previously only accepted Unix ticks. > > You mean time.ctime(x), time.localtime(x), and time.gmtime(x)? Those > operations are available as methods on datetime objects, though with > different names and (in the case of localtime) with somewhat different > semantics when timezones are involved. Not only these. Many third party modules like e.g database modules also work just fine with Unix ticks floats. >>mxDateTime also does mixed type operations using Unix >>ticks if it doesn't know the other type. >> >>So perhaps we need something like this: >>* a basedate class which is accessible at C level > > Um, I thought you just said you couldn't do a base class yet? Right, but in the long run, this is the right solution. 
>>* compatibility to Unix ticks floats (nb_float) > > If you really want that, we could add a __float__ method to datetime, > but I see several problems: > > - What to do if the datetime object's utcoffset() method returns None? > Then you can't convert to ticks. I propose an error. +1 > - The range of datetime (1 <= year <= 9999) is much larger than the > range of ticks (1970 <= year < 1938). I suppose it could raise an > exception when the time is not representable as ticks. ticks represented as floats have a much larger range and things are moving into that direction as it seems. On Windows they are already using this kind of approach (twisted in the usual MS way, of course). > - A C double doesn't have enough precision for roundtrip guarantees. True, but then roundtripping isn't guaranteed for many datetime operations anyway. Pretty much the same as with floating point arithmetic in general. The only roundtripping that needs to be reliable is that of storing broken down values in the type and then retrieving the exact same values. mxDateTime has special provisions for this and also makes sure that COMDates make the roundtrip (as per request from MS COM users). > - Does it really need to be automatic? I.e., does it really need to > be __float__()? I'd be less against this if it was an explicit > method, e.g. dt.asposixtime(). Why not both ? (mxDateTime DateTime objects have a .ticks() method for this) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Wed Jan 15 12:22:28 2003 From: martin@v.loewis.de (Martin v. 
Löwis)
Date: Wed, 15 Jan 2003 13:22:28 +0100
Subject: [Python-Dev] import a.b as c
Message-ID: <200301151222.h0FCMSQH011056@mira.informatik.hu-berlin.de>

The docs currently say that import-as doesn't work if the imported
thing is a submodule. Patch #662454 points out that this documentation
is factually incorrect, and suggests to remove it.

Can anybody remember what the rationale was for documenting such a
restriction? If not, I'll apply this patch.

Regards,
Martin

From guido@python.org Wed Jan 15 12:46:47 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 15 Jan 2003 07:46:47 -0500
Subject: [Python-Dev] import a.b as c
In-Reply-To: "Your message of Wed, 15 Jan 2003 13:22:28 +0100." <200301151222.h0FCMSQH011056@mira.informatik.hu-berlin.de>
References: <200301151222.h0FCMSQH011056@mira.informatik.hu-berlin.de>
Message-ID: <200301151246.h0FCkl730799@pcp02138704pcs.reston01.va.comcast.net>

> The docs currently say that import-as doesn't work if the imported
> thing is a submodule.
Patch #662454 points out that this documentation > is factually incorrect, and suggests to remove it. > > Can anybody remember what the rationale was for documenting such a > restriction? Not me. > If not, I'll apply this patch. +1 --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Wed Jan 15 17:59:20 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 15 Jan 2003 12:59:20 -0500 Subject: [Python-Dev] Interop between datetime and mxDateTime In-Reply-To: <3E253461.7030200@lemburg.com> Message-ID: [M.-A. Lemburg] > ... > Doesn't compare use the same coercion scheme as all the other > operators ? Ie. if datetimeobject.cmp(datetimeobject, otherobject) > returns NotImplemented, wouldn't otherobject.cmp(datetimeobject, > otherobject) be called ? *If* it returned NotImplemented, yes. But it doesn't return NotImplemented, it raises TypeError. This is so that, e.g., date.today() < 12 doesn't *end up* falling back on the default compare-object-addresses scheme. Comparison is different than, e.g., + in that way: there's a default non-exception-raising implementation of comparison when both objects return NotImplemented, but the default implementation of addition when both objects return NotImplemented raises TypeError. > ... > Right; provided I can easily test for the datetime types > at C level. That doesn't seem to be easily possible, though, > since it requires going through Python to get at the > type objects. Patches accepted . > ... > True, but then roundtripping isn't guaranteed for many > datetime operations anyway. Pretty much the same as with floating > point arithmetic in general. Not so for this datetime implementation: all datetime operations are exact, except for those that raise OverflowError. 
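
The difference Tim describes between returning NotImplemented and raising
TypeError can be sketched as follows. This is a minimal illustration
written for present-day Python 3 (an assumption; the thread is about 2.3),
and the `Ticks` and `Stamp` classes are hypothetical stand-ins, not part of
datetime or mxDateTime. When the left operand declines by returning
NotImplemented, the right operand's reflected method gets a chance; a
TypeError ends the comparison on the spot.

```python
from datetime import date


class Ticks:
    """Hypothetical time-like type that cooperates with foreign types."""

    def __init__(self, seconds):
        self.seconds = seconds

    def __lt__(self, other):
        if isinstance(other, Ticks):
            return self.seconds < other.seconds
        # Decline instead of raising: Python then tries other.__gt__(self).
        return NotImplemented


class Stamp:
    """Hypothetical foreign type that knows how to compare against Ticks."""

    def __init__(self, seconds):
        self.seconds = seconds

    def __gt__(self, other):
        if isinstance(other, (Ticks, Stamp)):
            return self.seconds > other.seconds
        return NotImplemented


# Ticks.__lt__ declines, so the reflected Stamp.__gt__ decides the result.
mixed_result = Ticks(1) < Stamp(2)

# A date compared with an int has no sensible answer, so the comparison
# raises TypeError instead of producing an arbitrary ordering.
try:
    date.today() < 12
    date_vs_int = "compared"
except TypeError:
    date_vs_int = "TypeError"
```

Under this scheme a third-party date type only needs the built-in type to
decline politely; it can then supply the mixed comparison itself.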
From blunck@gst.com Wed Jan 15 18:45:00 2003 From: blunck@gst.com (Christopher Blunck) Date: Wed, 15 Jan 2003 13:45:00 -0500 Subject: [Python-Dev] import a.b as c In-Reply-To: <200301151246.h0FCkl730799@pcp02138704pcs.reston01.va.comcast.net> References: <200301151222.h0FCMSQH011056@mira.informatik.hu-berlin.de> <200301151246.h0FCkl730799@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <20030115184500.GA17487@homer.gst.com> On Wed, Jan 15, 2003 at 07:46:47AM -0500, Guido van Rossum wrote: > > The docs currently say that import-as doesn't work if the imported > > thing is a submodule. Patch #662454 points out that this documentation > > is factually incorrect, and suggests to remove it. > > > > Can anybody remember what the rationale was for documenting such a > > restriction? No idea. I originally speculated that v1.x of py wouldn't allow import-as with a submodule and that the documentation was a relic. Turns out that v1.x doesn't allow import-as at all. That kind of invalidates my hypothesis. > > If not, I'll apply this patch. +1 -c -- 10:25am up 86 days, 1:41, 1 user, load average: 3.47, 3.49, 3.58 From mgarcia@cole-switches.com Wed Jan 15 19:58:13 2003 From: mgarcia@cole-switches.com (Manuel Garcia, VP MIS) Date: Wed, 15 Jan 2003 11:58:13 -0800 Subject: [Python-Dev] functions exposed by datetime Message-ID: The more I read about "datetime", I am beginning to realize that like "mxDateTime", it is a needlessly awkward to use for people who want to cheaply turn a calendar date into a 32 bit integer, and back again. (an integer suitable for doing date arithmetic, of course) for example: 2/28/2004 38045 2/29/2004 38046 3/1/2004 38047 3/2/2004 38048 No matter how complete "datetime" turns out to be, you will have thousands of programmers that will have to code their own company's fiscal calendar anyway. So it makes sense to have cheap access to date arithmetic logic. 
(I work for a manufacturing company, our fiscal months and years always begin on a Sunday, fiscal years are exactly 52 or 53 weeks long, and the length of a given year is determined as much by the tax code as by the positions of the earth and sun.) And what is day zero? Who cares? Already I have to keep track of Excel's day zero, PICK (legacy database) day zero, Unix's "day zero", fiscal calendar "day zero", so keeping track of one more is no big deal, since obviously they only differ by a constant. And most of the time, day zero is irrelevant. Besides date arithmetic, there are other reasons to make conversion between dates and integers very cheap. 32 bit integers obviously are easy to store, hash perfectly, can be quickly put into bins with use of "bisect", and they make calculations that loop over every day in a month or year a simple loop over a range(x,y). The hashing is the biggest concern. If I understand correctly, Guido said hash for datetime objects was not straightforward, because the same day can have more than one representation. I am constantly using a "date" for part of a dictionary key. Sourceforge is gagging right now, so I cannot confirm what is in "datetime", but I never heard any mention of cheap conversion of dates into integers. My current fiscal calendar code uses mktime, localtime, int(round(x + (y - z) / 86400.0)), and prayer. Currently the program is swamped by I/O, so this is good enough, and I can't justify installing "mxDateTime" on all the client machines. But I wouldn't mind using a simple, cheap built-in. 
Manuel Garcia
Email mgarcia@cole-switches.com

From tim@zope.com Wed Jan 15 20:42:13 2003
From: tim@zope.com (Tim Peters)
Date: Wed, 15 Jan 2003 15:42:13 -0500
Subject: [Python-Dev] functions exposed by datetime
In-Reply-To:
Message-ID:

[Manuel Garcia, VP MIS]
> The more I read about "datetime", I am beginning to realize that like
> "mxDateTime", it is a needlessly awkward to use for people who want to
> cheaply turn a calendar date into a 32 bit integer, and back again. (an
> integer suitable for doing date arithmetic, of course)

I think you should try reading the modules' documentation next.

> for example:
> 2/28/2004 38045
> 2/29/2004 38046
> 3/1/2004 38047
> 3/2/2004 38048

>>> from datetime import date
>>> date(2004, 2, 29).toordinal()
731640
>>> date(2004, 3, 1).toordinal()
731641
>>> date.fromordinal(731640)
datetime.date(2004, 2, 29)
>>> date.fromordinal(731641)
datetime.date(2004, 3, 1)
>>>

> No matter how complete "datetime" turns out to be, you will have thousands
> of programmers that will have to code their own company's fiscal calendar
> anyway. So it makes sense to have cheap access to date arithmetic logic.

You do.

> (I work for a manufacturing company, our fiscal months and years always
> begin on a Sunday, fiscal years are exactly 52 or 53 weeks long, and the
> length of a given year is determined as much by the tax code as by the
> positions of the earth and sun.)

Sure.

> And what is day zero? Who cares?

In datetime, day 1 is 1/1/1, in the proleptic Gregorian calendar. The
module docs say more about that.

> ...
> Besides date arithmetic, there are other reasons to make
> conversion between dates and integers very cheap. 32 bit integers
> obviously are easy to store, hash perfectly,

So do date objects.

> can be quickly put into bins with use of "bisect",

So can date objects directly.

> and they make calculations that loop over every day in a month or year a
> simple loop over a range(x,y).

There are many ways to do this, and I expect you're making life too
difficult if you keep converting to and from integers by hand. Like

    x = some starting date in 2003
    aweek = datetime.timedelta(weeks=1)
    while x.year == 2003:
        do something with x
        x += aweek

This is cheap.

> The hashing is the biggest concern. If I understand correctly, Guido said
> hash for datetime objects was not straightforward, because the same day can
> have more than one representation. I am constantly using a "date" for part
> of a dictionary key.

A datetime.date object is basically a 4-byte string, and is no more
difficult or expensive to hash than the string "date". A datetime object is
more complicated, *if* its tzinfo member isn't None. Then hashing has to
take the time zone information into account, as different datetime objects
with non-None tzinfo members can represent the same time in UTC, and so
compare equal, and so must have the same hash codes.

> Sourceforge is gagging right now, so I cannot confirm what is in
> "datetime", but I never heard any mention of cheap conversion of dates
> into integers.

You can browse the module docs online at python.org too, via the
"development version" link on the doc page.

> My current fiscal calendar code uses mktime, localtime, int(round(x + (y -
> z) / 86400.0)), and prayer. Currently the program is swamped by I/O, so
> this is good enough, and I can't justify installing "mxDateTime"
> on all the client machines. But I wouldn't mind using a simple, cheap
> built-in.

Date ordinals are cheaper under mxDateTime, because it stores datetimes
internally as a pair

    (day ordinal as an integer, # of seconds into the day as a double)

datetime objects store year, month, day, hour, minute, second and
microsecond as distinct internal fields, for efficient field extraction and
exact (no floating point rounding surprises) datetime arithmetic.
Conversion to and from day ordinals requires non-trivial runtime conversion
code in datetime, but it runs at C speed and I expect you'll never notice
it.

From guido@python.org Wed Jan 15 20:51:34 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 15 Jan 2003 15:51:34 -0500
Subject: [Python-Dev] functions exposed by datetime
In-Reply-To: Your message of "Wed, 15 Jan 2003 11:58:13 PST."
References:
Message-ID: <200301152051.h0FKpYw03625@odiug.zope.com>

> The more I read about "datetime", I am beginning to realize that like
> "mxDateTime", it is a needlessly awkward to use for people who want to
> cheaply turn a calendar date into a 32 bit integer, and back again.

Is the following really too awkward?

>>> from datetime import *
>>> a = date.today()
>>> b = date(1956, 1, 31)
>>> a-b
datetime.timedelta(17151)
>>> (a-b).days
17151
>>>

--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin@v.loewis.de Wed Jan 15 21:40:00 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 15 Jan 2003 22:40:00 +0100
Subject: [Python-Dev] import a.b as c
In-Reply-To: <20030115184500.GA17487@homer.gst.com>
References: <200301151222.h0FCMSQH011056@mira.informatik.hu-berlin.de> <200301151246.h0FCkl730799@pcp02138704pcs.reston01.va.comcast.net> <20030115184500.GA17487@homer.gst.com>
Message-ID:

Christopher Blunck writes:

> No idea. I originally speculated that v1.x of py wouldn't allow
> import-as with a submodule and that the documentation was a relic.
> Turns out that v1.x doesn't allow import-as at all. That kind of
> invalidates my hypothesis.

Indeed, that's why I asked on python-dev :-)

I'll apply the patch tomorrow.

Regards,
Martin

From mhammond@skippinet.com.au Thu Jan 16 00:05:29 2003
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Thu, 16 Jan 2003 11:05:29 +1100
Subject: [Python-Dev] test_logging failing on Windows 2000
Message-ID: <0a1801c2bcf2$fb522f50$530f8490@eden>

For some reason, test_logging.py is failing on my machine - but only when
run via "regrtest.py" - running stand-alone works fine.

The output I see is:

test_logging
Traceback (most recent call last):
  File "E:\src\python-cvs\lib\logging\__init__.py", line 645, in emit
    self.stream.write("%s\n" % msg)
ValueError: I/O operation on closed file
Traceback (most recent call last):
  File "E:\src\python-cvs\lib\logging\__init__.py", line 645, in emit
    self.stream.write("%s\n" % msg)
Vtest test_logging produced unexpected output:

[Sometimes the "ValueError" will be repeated quite a few times. Often these
exceptions are intermingled with the next test output - ie, the logging test
continues to run even once the following test has started.]

I am guessing that some threads are spawned, but for some reason we aren't
waiting for them to complete before closing the output file.

I will have a look at this once I actually finish what I was trying to
start - but if someone has a clue, let me know!

Thanks,

Mark.

From guido@python.org Thu Jan 16 00:59:07 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 15 Jan 2003 19:59:07 -0500
Subject: [Python-Dev] test_logging failing on Windows 2000
In-Reply-To: "Your message of Thu, 16 Jan 2003 11:05:29 +1100." <0a1801c2bcf2$fb522f50$530f8490@eden>
References: <0a1801c2bcf2$fb522f50$530f8490@eden>
Message-ID: <200301160059.h0G0x7931717@pcp02138704pcs.reston01.va.comcast.net>

> For some reason, test_logging.py is failing on my machine - but only when
> run via "regrtest.py" - running stand-alone works fine.
> > The output I see is: > > test_logging > Traceback (most recent call last): > File "E:\src\python-cvs\lib\logging\__init__.py", line 645, in emit > self.stream.write("%s\n" % msg) > ValueError: I/O operation on closed file > Traceback (most recent call last): > File "E:\src\python-cvs\lib\logging\__init__.py", line 645, in emit > self.stream.write("%s\n" % msg) > Vtest test_logging produced unexpected output: > > [Sometimes the "ValueError" will be repeated quite a few times. Often these > exceptions are intermingled with the next test output - ie, the logging test > continues to run even once the following test has started.] > > I am guessing that some threads are spawned, but for some reason we aren't > waiting for them to complete before closing the output file. > > I will have a look at this once I actually finish what I was trying to > start - but if someone has a clue, let me know! I had this exact same failure mode too, on Linux -- and then the next day I couldn't reproduce it! Glad it's not just me, and not just Linux either. :-) I guess the test is using threads and there's a race condition. No time to actually look at any code, but it might be obvious. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@comcast.net Thu Jan 16 02:34:17 2003 From: tim.one@comcast.net (Tim Peters) Date: Wed, 15 Jan 2003 21:34:17 -0500 Subject: [Python-Dev] test_logging failing on Windows 2000 In-Reply-To: <0a1801c2bcf2$fb522f50$530f8490@eden> Message-ID: [Mark Hammond] > For some reason, test_logging.py is failing on my machine - but only when > run via "regrtest.py" - running stand-alone works fine. 
> > The output I see is: > > test_logging > Traceback (most recent call last): > File "E:\src\python-cvs\lib\logging\__init__.py", line 645, in emit > self.stream.write("%s\n" % msg) > ValueError: I/O operation on closed file > Traceback (most recent call last): > File "E:\src\python-cvs\lib\logging\__init__.py", line 645, in emit > self.stream.write("%s\n" % msg) > Vtest test_logging produced unexpected output: > > [Sometimes the "ValueError" will be repeated quite a few times. > Often these exceptions are intermingled with the next test output - ie, > the logging test continues to run even once the following test has > started. I haven't seen this (yet), on Win2K or Win98. > I am guessing that some threads are spawned, but for some reason we > aren't waiting for them to complete before closing the output file. > > I will have a look at this once I actually finish what I was trying to > start - but if someone has a clue, let me know! I didn't see anything obvious. The logging module itself doesn't spawn any threads, but the test driver does. You'd *think* that would narrow it down . From guido@python.org Thu Jan 16 03:02:23 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 15 Jan 2003 22:02:23 -0500 Subject: [Python-Dev] test_logging failing on Windows 2000 In-Reply-To: "Your message of Wed, 15 Jan 2003 21:34:17 EST." References: Message-ID: <200301160302.h0G32Ne01290@pcp02138704pcs.reston01.va.comcast.net> > [Mark Hammond] > > For some reason, test_logging.py is failing on my machine - but only when > > run via "regrtest.py" - running stand-alone works fine. 
> > > > The output I see is: > > > > test_logging > > Traceback (most recent call last): > > File "E:\src\python-cvs\lib\logging\__init__.py", line 645, in emit > > self.stream.write("%s\n" % msg) > > ValueError: I/O operation on closed file > > Traceback (most recent call last): > > File "E:\src\python-cvs\lib\logging\__init__.py", line 645, in emit > > self.stream.write("%s\n" % msg) > > Vtest test_logging produced unexpected output: > > > > [Sometimes the "ValueError" will be repeated quite a few times. > > Often these exceptions are intermingled with the next test output - ie, > > the logging test continues to run even once the following test has > > started. [Tim] > I haven't seen this (yet), on Win2K or Win98. > > > I am guessing that some threads are spawned, but for some reason we > > aren't waiting for them to complete before closing the output file. > > > > I will have a look at this once I actually finish what I was trying to > > start - but if someone has a clue, let me know! > > I didn't see anything obvious. The logging module itself doesn't spawn any > threads, but the test driver does. You'd *think* that would narrow it down > . Here's what the test driver does: - It creates one thread which runs a subclass of SocketServer.ThreadingTCPServer, and starts the thread. - It runs a bunch of tests that all log to that server. - ThreadingTCPServer creates a new thread for each incoming connection, and makes this a daemon thread (meaning it won't be waited for at the end of the process). - Two lines above "finally:", sockOut is closed. I believe this is crucial: - The threads handling the requests are still running, and there's nothing to guarantee that they have processed all requests. The test driver needs to somehow wait until all the threads handling connections (how many? maybe there's only one?) are finished before it closes sockOut. Vinay, can you suggest a patch? 
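A minimal sketch of the waiting step described above, not the actual test_logging code. It assumes Python 3 syntax, and `handle_request`, `run_driver`, and the in-memory `stream` are hypothetical stand-ins for the connection handlers, the test driver, and sockOut. The only point is that every handler thread is recorded and joined before the stream is closed.

```python
import io
import threading
import time


def handle_request(stream, lock, msg):
    # Hypothetical stand-in for a connection handler that writes a log record.
    time.sleep(0.01)  # simulate the handler lagging behind the test driver
    with lock:
        stream.write("%s\n" % msg)


def run_driver(messages):
    stream = io.StringIO()   # stand-in for sockOut
    lock = threading.Lock()
    threads = []
    for msg in messages:
        t = threading.Thread(target=handle_request, args=(stream, lock, msg))
        t.daemon = True      # daemon threads are not waited for at process exit
        t.start()
        threads.append(t)    # remember the thread so it can be waited for
    for t in threads:
        t.join()             # wait for every handler before closing the stream
    output = stream.getvalue()
    stream.close()           # safe now: no handler can still be writing
    return output
```

Without the join loop, `stream.close()` could run while a handler is still sleeping, and the later write would raise the same "I/O operation on closed file" ValueError shown in the traceback.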
--Guido van Rossum (home page: http://www.python.org/~guido/) From python@rcn.com Thu Jan 16 03:12:20 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 15 Jan 2003 22:12:20 -0500 Subject: [Python-Dev] test_logging failing on Windows 2000 References: <200301160302.h0G32Ne01290@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <003301c2bd0d$15e13360$4b11a044@oemcomputer> Here's one suggestion. Create a semaphore. > - ThreadingTCPServer creates a new thread for each incoming > connection, and makes this a daemon thread (meaning it won't be > waited for at the end of the process). Bump the semaphore count up by one before running each new thread. > - The threads handling the requests are still running, and there's > nothing to guarantee that they have processed all requests. Have the daemons decrement the semaphore when they're done handling a request. > > The test driver needs to somehow wait until all the threads handling > connections (how many? maybe there's only one?) are finished before it > closes sockOut. Have the last step in the main thread be a blocking call to the semaphore so that it doesn't bail out until all requests are handled. 
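A rough sketch of the semaphore idea, under stated assumptions: Python 3 syntax, and `run_requests` and `handler` are hypothetical names, not part of the test script. Note that `threading.Semaphore.acquire()` blocks when the count is zero rather than until it reaches zero, so this sketch inverts the counting: each handler releases once when it is done, and the main thread acquires once per request it started.

```python
import threading


def run_requests(requests):
    done = threading.Semaphore(0)    # counts finished requests
    results = []
    results_lock = threading.Lock()

    def handler(req):
        try:
            with results_lock:
                results.append(req.upper())  # stand-in for handling one record
        finally:
            done.release()           # signal completion, even on error

    started = 0
    for req in requests:
        t = threading.Thread(target=handler, args=(req,), daemon=True)
        t.start()
        started += 1
    for _ in range(started):
        done.acquire()               # blocks until one more request has finished
    return sorted(results)
```

Counting per request rather than per thread keeps the releases and acquires matched even if one thread ends up serving several requests.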
Raymond Hettinger From vinay_sajip@red-dove.com Thu Jan 16 09:03:59 2003 From: vinay_sajip@red-dove.com (Vinay Sajip) Date: Thu, 16 Jan 2003 09:03:59 -0000 Subject: [Python-Dev] test_logging failing on Windows 2000 References: <200301160302.h0G32Ne01290@pcp02138704pcs.reston01.va.comcast.net> <003301c2bd0d$15e13360$4b11a044@oemcomputer> Message-ID: <003101c2bd3e$38e74da0$652b6992@alpha> [Raymond] > Create a semaphore. [snip] > Bump the semaphore count up by one before running each new > thread. [snip] > Have the daemons decrement the semaphore when they're done > handling a request. [snip] > Have the last step in the main thread be a blocking call to the semaphore > so that it doesn't bail out until all requests are handled. A semaphore seems the right thing, but I think it would need to be incremented after each *request* rather than for each thread (to correctly match with the decrement after each request). This needs to be done in the test script before each logging call - the alternative, to have a SocketHandler-derived class used in the test script, may be too intrusive. There seems to be some problem with Sourceforge CVS - I can't log in, nor can I browse via ViewCVS. I don't have the latest version of the script (as checked in by Neal). I'll try again in a while. Regards, Vinay From mal@lemburg.com Thu Jan 16 09:32:29 2003 From: mal@lemburg.com (M.-A.
Lemburg) Date: Thu, 16 Jan 2003 10:32:29 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro Message-ID: <3E267C2D.3090907@lemburg.com> Hisao SUZUKI has just recently uploaded a patch to SF which includes codecs for the Japanese encodings EUC-JP, Shift_JIS and ISO-2022-JP and wants to contribute the code to the PSF. The advantage of his codecs over the ones written by Tamito KAJIYAMA (http://www.asahi-net.or.jp/~rd6t-kjym/python/) lies in the fact that Hisao's codecs are small (88kB) and written in pure Python. This makes it much easier to adapt the codecs to special needs or to correct errors. Provided Hisao volunteers to maintain these codecs, I'd like to suggest adding them to Python's encodings package and making them the default implementations for the above encodings. Ideal would be if we could get Hisao and Tamito to team up to support these codecs (I put him on CC). Adding the codecs to the distribution would give Python a very good argument in the Japanese world and also help people working with XML or HTML targeting these locales. Thoughts ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Thu Jan 16 10:05:55 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 16 Jan 2003 11:05:55 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E267C2D.3090907@lemburg.com> References: <3E267C2D.3090907@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > Thoughts ? I'm in favour of adding support for Japanese codecs, but I wonder whether we shouldn't incorporate the C version of the Japanese codecs package instead, despite its size.
I would also suggest that it might be more worthwhile to expose platform codecs, which would give us all CJK codecs on a number of major platforms, with a minimum increase in the size of the Python distribution, and with very good performance. *If* Suzuki's code is incorporated, I'd like to get independent confirmation that it is actually correct. I know Tamito has taken many iterations until it was correct, where "correct" is a somewhat fuzzy term, since there are some really tricky issues for which there is no single one correct solution (like whether \x5c is a backslash or a Yen sign, in these encodings). I notice (with surprise) that the actual mapping tables are extracted from Java, through Jython. I also dislike absence of the cp932 encoding in Suzuki's codecs. The suggestion to equate this to "mbcs" on Windows is not convincing, as a) "mbcs" does not mean cp932 on all Windows installations, and b) cp932 needs to be processed on other systems, too. I *think* cp932 could be implemented as a delta to shift-jis, as shown in http://hp.vector.co.jp/authors/VA003720/lpproj/test/cp932sj.htm (although I wonder why they don't list the backslash issue as a difference between shift-jis and cp932) Regards, Martin From ishimoto@axissoft.co.jp Thu Jan 16 11:08:21 2003 From: ishimoto@axissoft.co.jp (Atsuo Ishimoto) Date: Thu, 16 Jan 2003 20:08:21 +0900 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: References: <3E267C2D.3090907@lemburg.com> Message-ID: <20030116195029.37E1.ISHIMOTO@axissoft.co.jp> Hello from Japan, On 16 Jan 2003 11:05:55 +0100 martin@v.loewis.de (Martin v. Lvwis) wrote: > "M.-A. Lemburg" writes: > > > Thoughts ? > > I'm in favour of adding support for Japanese codecs, but I wonder > whether we shouldn't incorporate the C version of the Japanese codecs > package instead, despite its size. I also vote for JapaneseCodec. 
Speaking of its size, the JapaneseCodec package is much larger because it contains both the C version and the pure Python version. The C part of JapaneseCodec is about 160 kB (compiled on Windows), and I don't think that makes a difference. > *If* Suzuki's code is incorporated, I'd like to get independent > confirmation that it is actually correct. I know Tamito has taken many > iterations until it was correct, where "correct" is a somewhat fuzzy > term, since there are some really tricky issues for which there is no > single one correct solution (like whether \x5c is a backslash or a Yen > sign, in these encodings). Yes, Tamito's JapaneseCodec has been used for years by many Japanese users, while I've never heard of Suzuki's. > mapping tables are extracted from Java, through Jython. > > I also dislike absence of the cp932 encoding in Suzuki's codecs. The > suggestion to equate this to "mbcs" on Windows is not convincing, as > a) "mbcs" does not mean cp932 on all Windows installations, and b) > cp932 needs to be processed on other systems, too. Agreed. > I *think* cp932 > could be implemented as a delta to shift-jis, as shown in > > http://hp.vector.co.jp/authors/VA003720/lpproj/test/cp932sj.htm > > (although I wonder why they don't list the backslash issue as a > difference between shift-jis and cp932) > http://www.ingrid.org/java/i18n/unicode-utf8.html may be a better reference. This page is written in English, encoded in UTF-8. -------------------------- Atsuo Ishimoto ishimoto@gembook.org Homepage:http://www.gembook.jp From mal@lemburg.com Thu Jan 16 11:22:58 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 16 Jan 2003 12:22:58 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: References: <3E267C2D.3090907@lemburg.com> Message-ID: <3E269612.5060305@lemburg.com> Martin v. Löwis wrote: > "M.-A. Lemburg" writes: > >>Thoughts ?
> > I'm in favour of adding support for Japanese codecs, but I wonder > whether we shouldn't incorporate the C version of the Japanese codecs > package instead, despite its size. I was suggesting to make Suzuki's codecs the default. That doesn't prevent Tamito's codecs from working, since these are inside a package. If someone wants the C codecs, we should provide them as a separate download right alongside the standard distro (as discussed several times before). Note that the C codecs are not as easy to modify to special needs as the Python ones. While this may seem unnecessary, I've heard from a few people that especially companies tend to extend the mappings with their own set of company-specific code points. > I would also suggest that it might be more worthwhile to expose > platform codecs, which would give us all CJK codecs on a number of > major platforms, with a minimum increase in the size of the Python > distribution, and with very good performance. +1 We already have this on Windows (via the mbcs codec). If you could contribute your iconv codecs under the PSF license we'd go a long way in that direction on Unix as well. > *If* Suzuki's code is incorporated, I'd like to get independent > confirmation that it is actually correct. Since he built the codecs on the mappings in Java, this looks like enough third party confirmation already. > I know Tamito has taken many > iterations until it was correct, where "correct" is a somewhat fuzzy > term, since there are some really tricky issues for which there is no > single one correct solution (like whether \x5c is a backslash or a Yen > sign, in these encodings). I notice (with surprise) that the actual > mapping tables are extracted from Java, through Jython. Indeed. I think that this kind of approach is a good one in the light of the "correctness" problems you mention above. It also helps with the compatibility side. > I also dislike absence of the cp932 encoding in Suzuki's codecs.
The > suggestion to equate this to "mbcs" on Windows is not convincing, as > a) "mbcs" does not mean cp932 on all Windows installations, and b) > cp932 needs to be processed on other systems, too. I *think* cp932 > could be implemented as a delta to shift-jis, as shown in > > http://hp.vector.co.jp/authors/VA003720/lpproj/test/cp932sj.htm > > (although I wonder why they don't list the backslash issue as a > difference between shift-jis and cp932) As always: contributions are welcome :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH From perky@fallin.lv Thu Jan 16 11:38:55 2003 From: perky@fallin.lv (Hye-Shik Chang) Date: Thu, 16 Jan 2003 20:38:55 +0900 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: References: <3E267C2D.3090907@lemburg.com> Message-ID: <20030116113855.GA49646@fallin.lv> On Thu, Jan 16, 2003 at 11:05:55AM +0100, Martin v. Löwis wrote: > "M.-A. Lemburg" writes: > > > Thoughts ? > > I'm in favour of adding support for Japanese codecs, but I wonder > whether we shouldn't incorporate the C version of the Japanese codecs > package instead, despite its size. And the most important merit that the C version has but the pure Python version lacks is sharing library text between processes. Most modern OSes can share it, and the C version is even smaller than the Python version in the case of KoreanCodecs 2.1.x (in CVS). Here's the process status on a FreeBSD 5.0/i386 system with Python 2.3a1 (of 2003-01-15).
USER    PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
perky 56713  0.0  1.2  3740  3056  p3  S+    8:11PM   0:00.08 python
: python without any codecs
perky 56739  6.3  5.7 15376 14728  p3  S+    8:17PM   0:04.02 python
: python with python.cp949 codec
perky 56749  0.0  1.2  3884  3196  p3  S+    8:20PM   0:00.06 python
: python with c.cp949 codec

alice(perky):/usr/pkg/lib/python2.3/site-packages/korean% size _koco.so
  text    data     bss     dec     hex filename
122861    1844      32  124737   1e741 _koco.so

With the C codec, processes share 122861 bytes of text system-wide and consume only 1844 bytes of data each, whereas the pure Python codec consumes about 12 megabytes per process. This should be taken very seriously for the launch time of scripts that have "# encoding: euc-jp" or some other CJK encoding. > I would also suggest that it might be more worthwhile to expose > platform codecs, which would give us all CJK codecs on a number of > major platforms, with a minimum increase in the size of the Python > distribution, and with very good performance. KoreanCodecs is tested on {Free,Net,Open}BSD, Linux, Solaris, HP-UX, Windows{95,98,NT,2000,XP}, Cygwin without any platform #ifdef's. I'm sure that any CJK codec can be ported to any platform that Python has been ported to. Regards, Hye-Shik =) From martin@v.loewis.de Thu Jan 16 12:02:04 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 16 Jan 2003 13:02:04 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E269612.5060305@lemburg.com> References: <3E267C2D.3090907@lemburg.com> <3E269612.5060305@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > I was suggesting to make Suzuki's codecs the default. That > doesn't prevent Tamito's codecs from working, since these > are inside a package. I wonder who will be helped by adding these codecs, if anybody who needs to process Japanese data on a regular basis will have to install that other package, anyway.
> If someone wants the C codecs, we should provide them as > separate download right alongside of the standard distro (as > discussed several times before). I still fail to see the rationale for that (or, rather, the rationale seems to vanish more and more). AFAIR, "size" was brought up as an argument against the code. However, the code base already contains huge amounts of code that not everybody needs, and the size increase on a binary distribution is rather minimal. > Note that the C codecs are not as easy to modify to special > needs as the Python ones. While this may seem unnecessary > I've heard from a few people that especially companies tend > to extend the mappings with their own set of company specific > code points. The Python codecs are not easy to modify, either: there is a large generated table, and you actually have to understand the generation algorithm, augment it, run it through Jython. After that, you get a new mapping table, which you need to carry around *instead* of the one shipped with Python. So any user who wants to extend the mapping needs the generator more than the generated output. If you want to augment the codec as-is, i.e. by wrapping it, you best install a PEP 293 error handler. This works nicely both with C codecs and pure Python codecs (out of the box, it probably works with neither of the candidate packages, but that would have to be fixed). Or, if you don't go the PEP 293, you can still use a plain wrapper around both codecs. > We already have this on Windows (via the mbcs codec). That is insufficient, though, since it gives access to a single platform codec only. I have some code sitting around that exposes the codecs from inet.dll (or some such); this is the codec library that IE6 uses. > If you could contribute your iconv codecs under the PSF license we'd > go a long way in that direction on Unix as well. Ok, will do. There are still some issues with the code itself that need to be fixed, then I'll contribute it. 
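The PEP 293 route mentioned above can be sketched roughly as follows. This is an illustration only, assuming Python 3 syntax and the `shift_jis` codec that ships with current Python; the `EXTRA` table, the handler name `company_handler`, and the error-handler name `"company"` are all hypothetical. It uses only `codecs.register_error` and the standard `UnicodeEncodeError` attributes.

```python
import codecs

# Hypothetical company-specific code points, mapped to replacement text.
EXTRA = {"\ue000": "[LOGO]"}


def company_handler(exc):
    # Called by the codec whenever it hits something it cannot encode.
    if isinstance(exc, UnicodeEncodeError):
        bad = exc.object[exc.start:exc.end]
        replacement = "".join(EXTRA.get(ch, "?") for ch in bad)
        return (replacement, exc.end)  # resume encoding after the bad span
    raise exc


codecs.register_error("company", company_handler)


def encode_with_extras(text, encoding="shift_jis"):
    # The underlying codec is untouched; only its error path is extended.
    return text.encode(encoding, errors="company")
```

Because the handler is registered by name, the same extension applies to a C codec and to a pure Python codec without regenerating any mapping table.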
> > *If* Suzuki's code is incorporated, I'd like to get independent > > confirmation that it is actually correct. > > Since he built the codecs on the mappings in Java, this > looks like enough third party confirmation already. Not really. I *think* Sun has, when confronted with a popularity-or-correctness issue, taken the popularity side, leaving correctness alone. Furthermore, the code doesn't use the Java tables throughout, but short-cuts them. E.g. in shift_jis.py, we find

    if i < 0x80: # C0, ASCII
        buf.append(chr(i))

where i is a Unicode codepoint. I believe this is incorrect: In shift-jis, 0x5c is YEN SIGN, and indeed, the codec goes on with

    elif i == 0xA5: # Yen
        buf.append('\\')

So it maps both REVERSE SOLIDUS and YEN SIGN to 0x5c; this is an error (if it was a CP932 codec, it might (*) have been correct). See http://rf.net/~james/Japanese_Encodings.txt Regards, Martin (*) I'm not sure here, it also might be that Microsoft maps YEN SIGN to the full-width yen sign, in CP 932. From guido@python.org Thu Jan 16 14:38:21 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 16 Jan 2003 09:38:21 -0500 Subject: [Python-Dev] test_logging failing on Windows 2000 In-Reply-To: Your message of "Thu, 16 Jan 2003 09:03:59 GMT." <003101c2bd3e$38e74da0$652b6992@alpha> References: <200301160302.h0G32Ne01290@pcp02138704pcs.reston01.va.comcast.net> <003301c2bd0d$15e13360$4b11a044@oemcomputer> <003101c2bd3e$38e74da0$652b6992@alpha> Message-ID: <200301161438.h0GEcLu11489@odiug.zope.com> > [Raymond] > > Create a semaphore. > [snip] > > Bump the semaphore count up by one before running each new > > thread. > [snip] > > Have the daemons decrement the semaphore when they're done > > handling a request. > [snip] > > Have the last step in the main thread be a blocking call to the semaphore > > so that it doesn't bail out until all requests are handled.
> > A semaphore seems the right thing, but I think it would need to be > incremented after each *request* rather than for each thread (to correctly > match with the decrement after each request). This needs to be done in the > test script before each logging call - the alternative, to have a > SocketHandler-derived class used in the test script, may be too intrusive. I'd prefer a different approach: register the threads started by ThreadingTCPServer in the 'threads' variable. Then they are waited for in the 'finally:' clause. You probably should move the flushing and closing of sockOut also. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 16 14:41:32 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 16 Jan 2003 09:41:32 -0500 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: Your message of "Thu, 16 Jan 2003 10:32:29 +0100." <3E267C2D.3090907@lemburg.com> References: <3E267C2D.3090907@lemburg.com> Message-ID: <200301161441.h0GEfWE11519@odiug.zope.com> > Hisao SUZUKI has just recently uploaded a patch to SF which > includes codecs for the Japanese encodings EUC-JP, Shift_JIS and > ISO-2022-JP and wants to contribute the code to the PSF. > > The advantage of his codecs over the ones written by Tamito > KAJIYAMA (http://www.asahi-net.or.jp/~rd6t-kjym/python/) > lies in the fact that Hisao's codecs are small (88kB) and > written in pure Python. This makes it much easier to adapt > the codecs to special needs or to correct errors. > > Provided Hisao volunteers to maintain these codecs, I'd like > to suggest adding them to Python's encodings package and making > them the default implementations for the above encodings. > > Ideal would be if we could get Hisao and Tamito to team up > to support these codecs (I put him on CC). 
> > Adding the codecs to the distribution would give Python a very > good argument in the Japanese world and also help people working > with XML or HTML targetting these locales. > > Thoughts ? Assuming the code is good, this seems the right thing from a technical perspective, but I'm worried what Tamito will think about it. Also, are there (apart from implementation technology) differences in features between the two? Do they always produce the same results? Would this kill Tamito's codecs, or are those still preferred for people doing a lot of Japanese? --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Thu Jan 16 14:47:40 2003 From: barry@python.org (Barry A. Warsaw) Date: Thu, 16 Jan 2003 09:47:40 -0500 Subject: [Python-Dev] Adding Japanese Codecs to the distro References: <3E267C2D.3090907@lemburg.com> Message-ID: <15910.50700.726791.488944@gargle.gargle.HOWL> >>>>> "MAL" == M writes: MAL> Adding the codecs to the distribution would give Python a MAL> very good argument in the Japanese world and also help people MAL> working with XML or HTML targetting these locales. +1 MAL> Ideal would be if we could get Hisao and Tamito to team up MAL> to support these codecs (I put him on CC). Yes, please. I've been using Tamito's codecs in Mailman and the Japanese users on my list have never complained, so I take that as that they're doing their job well. Let's please try to get consensus before we choose one or the other, but I agree, I'd love to see them in Python. What about other Asian codecs? -Barry From martin@v.loewis.de Thu Jan 16 15:02:48 2003 From: martin@v.loewis.de (Martin v. 
=?iso-8859-15?q?L=F6wis?=) Date: 16 Jan 2003 16:02:48 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <200301161441.h0GEfWE11519@odiug.zope.com> References: <3E267C2D.3090907@lemburg.com> <200301161441.h0GEfWE11519@odiug.zope.com> Message-ID: Guido van Rossum writes: > Also, are there (apart from implementation technology) differences in > features between the two? Do they always produce the same results? The JapaneseCodecs package comes with both Python and C versions of the codecs. It includes more encodings, in particular the cp932 codec, which is used on Windows (cp932 used to be understood as a synonym for shift-jis, but that understanding is incorrect, so these are considered as two different encodings these days). I believe they produce different output, but haven't tested. Hisao complains that Tamito's codecs don't include the full source for the generated files, but I believe (without testing) that you just need a few files from the Unicode consortium to generate all source code. > Would this kill Tamito's codecs, or are those still preferred for > people doing a lot of Japanese? As long as Python doesn't provide cp932, people will still install the JapaneseCodecs. Regards, Martin From mal@lemburg.com Thu Jan 16 16:05:21 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 16 Jan 2003 17:05:21 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <15910.50700.726791.488944@gargle.gargle.HOWL> References: <3E267C2D.3090907@lemburg.com> <15910.50700.726791.488944@gargle.gargle.HOWL> Message-ID: <3E26D841.2020805@lemburg.com> Barry A. Warsaw wrote: >>>>>>"MAL" == M writes: > > > MAL> Adding the codecs to the distribution would give Python a > MAL> very good argument in the Japanese world and also help people > MAL> working with XML or HTML targetting these locales. > > +1 > > MAL> Ideal would be if we could get Hisao and Tamito to team up > MAL> to support these codecs (I put him on CC). > > Yes, please. 
I've been using Tamito's codecs in Mailman and the > Japanese users on my list have never complained, so I take that as > that they're doing their job well. I'm not biased in any direction here. Again, I'd love to see the two sets be merged into one, e.g. take the Python ones from Hisao and use the C ones from Tamito if they are installed instead. > Let's please try to get consensus before we choose one or the other, > but I agree, I'd love to see them in Python. Sure. > What about other Asian codecs? The other codecs in the SF Python Codecs project have license and maintenance problems. Most of these stem from Tamito's original codec which was under GPL. There are plenty other encodings we'd need to cover most of the Asian scripts. However, in order for them to be usable we'll have to find people willing to maintain them or at least make sure that they fit the need and are correct in their operation (where "correct" means usable in real life). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From perky@fallin.lv Thu Jan 16 18:43:31 2003 From: perky@fallin.lv (Hye-Shik Chang) Date: Fri, 17 Jan 2003 03:43:31 +0900 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E26D841.2020805@lemburg.com> References: <3E267C2D.3090907@lemburg.com> <15910.50700.726791.488944@gargle.gargle.HOWL> <3E26D841.2020805@lemburg.com> Message-ID: <20030116184331.GA66593@fallin.lv> On Thu, Jan 16, 2003 at 05:05:21PM +0100, M.-A. Lemburg wrote: > >What about other Asian codecs? > > The other codecs in the SF Python Codecs project have license > and maintenance problems. Most of these stem from Tamito's > original codec which was under GPL. > > There are plenty other encodings we'd need to cover most > of the Asian scripts. 
However, in order for them to be usable > we'll have to find people willing to maintain them or at least > make sure that they fit the need and are correct in their > operation (where "correct" means usable in real life). > KoreanCodecs, in the SF Korean Python Codecs project (http://sf.net/projects/koco), a.k.a. KoCo, was changed from the PSF License to the LGPL at Barry's request in early 2002. Because it has many fancy codecs that aren't used in the Korean real world, I'd like to distill the essentials of KoreanCodecs into a PSF-licensed version if Python needs it. The KoCo implementation is the only widely used codec set for Korean encodings, and I can maintain it if needed. Regards, Hye-Shik =) From skip@pobox.com Thu Jan 16 22:43:27 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 16 Jan 2003 16:43:27 -0600 Subject: [Python-Dev] Semi-OT SF CVS offline until... (fwd) Message-ID: <15911.13711.160128.479572@montanaro.dyndns.org> --rbvdoocTT3 Content-Type: text/plain; charset=us-ascii Content-Description: message body text Content-Transfer-Encoding: 7bit For those of you who don't cross the boundary between python-dev and python-list very often, note the appended message about SF CVS.
Skip --rbvdoocTT3 Content-Type: message/rfc822 Content-Description: forwarded message Content-Transfer-Encoding: 7bit MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Return-Path: Received: from localhost [127.0.0.1] by localhost with POP3 (fetchmail-6.1.0) for skip@localhost (single-drop); Thu, 16 Jan 2003 16:10:56 -0600 (CST) Received: from icicle.pobox.com (icicle-mx.pobox.com [199.26.64.83]) by manatee.mojam.com (8.12.1/8.12.1) with ESMTP id h0GM8leb021175 for ; Thu, 16 Jan 2003 16:08:48 -0600 Received: from icicle.pobox.com (localhost [127.0.0.1]) by icicle.pobox.com (Postfix) with ESMTP id 4798E23E8E for ; Thu, 16 Jan 2003 17:08:47 -0500 (EST) Delivered-To: skip@pobox.com Received: from mail.python.org (mail.python.org [12.155.117.29]) by icicle.pobox.com (Postfix) with ESMTP id BEF2523DF8 for ; Thu, 16 Jan 2003 17:08:46 -0500 (EST) Received: from localhost.localdomain ([127.0.0.1] helo=mail.python.org) by mail.python.org with esmtp (Exim 4.05) id 18ZIBf-0008CO-00; Thu, 16 Jan 2003 17:08:35 -0500 Received: from smtp4.sea.theriver.com ([216.39.128.19]) by mail.python.org with smtp (Exim 4.05) id 18ZIB3-0007wd-00 for python-list@python.org; Thu, 16 Jan 2003 17:07:57 -0500 Received: (qmail 8733 invoked from network); 16 Jan 2003 22:44:39 -0000 Received: from sense-sea-megasub-1-1008.oz.net (HELO lion) (216.39.170.247) by smtp4.sea.theriver.com with SMTP; 16 Jan 2003 22:44:39 -0000 Message-ID: X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0) X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 Importance: Normal X-Spam-Status: No, hits=-1.7 required=5.0 tests=BODY_PYTHON_ZOPE,SPAM_PHRASE_00_01,USER_AGENT_OUTLOOK X-Spam-Level: Errors-To: python-list-admin@python.org X-BeenThere: python-list@python.org X-Mailman-Version: 2.0.13 (101270) Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: General discussion list for the Python programming language List-Unsubscribe: , 
From: "David LeBlanc"
Sender: python-list-admin@python.org
To: "Python-List@Python. Org"
Subject: Semi-OT SF CVS offline until...
Date: Thu, 16 Jan 2003 14:08:11 -0800

What they say:

(2003-01-14 14:04:19 - Project CVS Services) As of 2003-01-14, pserver-based CVS repository access and ViewCVS (web-based) CVS repository access have been taken offline as to stabilize CVS server performance for developers. These services will be re-enabled as soon as the underlying scalability issues have been analyzed and resolved (as soon as 2003-01-15, if possible). Additional updates will be posted to the Site Status page as they become available. Your patience is appreciated.

Not up as of 5 minutes ago. This applies to ALL SF projects, not just Python!
David LeBlanc
Seattle, WA USA

--
http://mail.python.org/mailman/listinfo/python-list

--rbvdoocTT3--

From brett@python.org Thu Jan 16 22:51:14 2003
From: brett@python.org (Brett Cannon)
Date: Thu, 16 Jan 2003 14:51:14 -0800 (PST)
Subject: [Python-Dev] python-dev Summary for 2003-01-01 through 2003-01-15
Message-ID:

Here is the rough draft. Have a look and let me know where I botched things.

+++++++++++++++++++++++++++++++++++++++++++++++++++++
python-dev Summary for 2003-01-01 through 2003-01-15
+++++++++++++++++++++++++++++++++++++++++++++++++++++

======================
Summary Announcements
======================

This summary has been written in the first-person. Aahz had suggested I write in third-person, but I just can't stand it. =)

There is a new project named `Minimal Python`_ whose goal is to write a Python interpreter using a minimal amount of C and maximizing the amount of code written in Python. As of now it is just a mailing list but some interesting points have already been made and it has some major Python programmers involved with it so it will more than likely succeed.

I am leaving out `Broken strptime in Python 2.3a1 & CVS`_ for two reasons. One, the only lesson in that thread is that when C does not give you a good standard, rewrite the code in Python and come up with one. Two, since the whole thing revolves around my code I would be more biased than I normally am. =)

Also, some of the threads have notes with them mentioning how I could not find them in the Mailman archive.

.. _Minimal Python: http://codespeak.net/mailman/listinfo/pypy-dev
.. 
_Broken strptime in Python 2.3a1 & CVS: http://mail.python.org/pipermail/python-dev/2003-January/031983.html =================================================== `PEP 303: Extend divmod() for Multiple Divisors`__ =================================================== __ http://mail.python.org/pipermail/python-dev/2002-December/031511.html Splinter threads: - map, filter, reduce, lambda (can't find thread in Mailman archive) As mentioned in the `last summary`__, this thread sparked a huge response to the built-ins in Python (if you don't remember what all the built-ins are, just execute ``dir(__builtins__)`` in the interpreter). It started with Christian Tismer calling for the deprecation of ``divmod()`` since he "found that divmod compared to / and % is at least 50 percent *slower* to compute", primarily because ``divmod()`` is a method and the overhead that comes from being that. Guido responded and agree that, in hindsight, "Maybe it wasn't such a good idea". But since it isn't broken he is not for replacing it since "At this point, any change causes waste". Besides, Tim Peters came up with the great compromise of removing " divmod() when [Tim's] dead. Before then, it stays." Guido would rather have the energy spent on adding the ability for the language to recognize that it "is a built-in so it can generate more efficient code". This led to Christian suggesting changing Python so that built-ins like ``divmod()`` that can easily be written in Python be done so and thus remove the C version without any backwards-compatibility issues. Guido's response was to "first make some steps towards the better compiler [Guido] alluded to. Then we can start cutting." Tim in an email said that "If we have to drop a builtin, I never liked reduce ". This then led to Barry Warsaw saying he could stand to lose ``apply()``. Then Raymond Hettinger nominated ``buffer()`` and ``intern()`` as things that could stand to go away. Michael Hudson suggested ``map()`` and ``filter()``. 
Guido said ``apply()`` could stand getting a PendingDeprecationWarning since the new calling conventions (``fxn(*args, **kwargs)``) replace ``apply()``'s use. He almost mentioned how he wanted something like ``foo(1, 2, 3, *args, 5, 6, 7, **kw)`` since the ``5, 6, 7`` is currently not supported. But the call for the removal of ``lambda`` is what really set things off. People in support of the removal said that nested scoping removed a huge reason for having ``lambda``. But functional programmers said that they wouldn't let ``lambda`` be taken away. It was agreed it had its uses, but the name was not relevant to its use and its syntax is non-standard and should be changed in Py3K. Someone suggested renaming it ``def``, but Guido said no since "Python's parser is intentionally simple-minded and doesn't like having to look ahead more than one token" and using ``def`` for ``lambda`` would break that design. Barry suggested ``anon`` which garnered some support. It was also suggested to have ``lambda`` use parentheses to hold its argument list like a good little function. And then all of this led to a discussion over features in Python that are not necessarily easy to read but are powerful. Some said that features such as list comprehensions, the ``*/**`` calling syntax, etc., were not good since they are not syntactically obvious. This was argued against by saying they are quite useful and are not needed for basic programming. .. _last summary: http://www.python.org/dev/summary/2002-12-16_2002-12-31.html ================== `Holes in time`__ ================== __ http://mail.python.org/pipermail/python-dev/2003-January/031836.html Tim Peters brought up an inherent problem with the new datetime_ type and dealing with timezones and daylight savings. Imagine when DST starts; the clock jumps from 1:59 to 3:00. Now what happens if someone enters a time that does not use DST? 
The implementation has to handle the possibility of something inputting 2:00 and instantly pushing forward an hour. But what is worse is when DST ends. You go from 1:59 back to 1:00! That means you pass over everything from 1:00 to 1:59 twice during that day. Since it is flat-out impossible to tell, what should be done in this case?

Originally ``datetime`` raised ValueError. It seems to be staying that way (after a *very* lengthy discussion clarifying the whole situation).

If you want to read a paper on the history of time go to http://www.naggum.no/lugm-time.html as recommended by Neil Schemenauer.

.. _datetime: http://www.python.org/dev/doc/devel/lib/module-datetime.html

===========================================
`no expected test output for test_sort?`__
===========================================
__ http://mail.python.org/pipermail/python-dev/2003-January/031889.html

This thread was originally started by Skip Montanaro to try to figure out why ``test_sort`` was failing to catch an error when run through ``regrtest.py``. But Skip quickly realized ``regrtest.py`` was using the wrong directories.

What makes this thread worth mentioning is a question I posed about whether it was worth converting tests over to PyUnit. Guido said not for the sake of converting, but if you were going to be improving upon the tests, then moving a testing suite over to PyUnit was a good idea. It was also said that all new testing suites should use PyUnit (although doctest is still acceptable, just not preferred).

Walter Dörwald started `patch #662807`_ to use for converting tests to PyUnit. So give a hand if you care to.

.. _patch #662807: http://www.python.org/sf/662807

============
binutils
============
(Can't find in Mailman archive)

Andrew Koenig told python-dev of a problem with binutils_ 2.13 way back in the `2002-08-16 through 2002-09-01 summary`_. Well, it appears that 2.13.2 fixes the issues.

.. _2002-08-16 through 2002-09-01 summary: http://www.python.org/dev/summary/2002-08-16-2002-09-01.html

===========================
The meaning of __all__
===========================
(Can't find in Mailman archive)

Jack Jansen asked what the proper use of ``__all__`` is; a discussion on the PyObjC_ mailing list sparked this question. The answer is that ``__all__`` is used for the public API of a module.

.. _PyObjC: http://pyobjc.sourceforge.net/

======================================
PEP 301 implementation checked in
======================================
(Can't find in Mailman archive)

AM Kuchling announced that the `PEP 301`_ implementation by Richard Jones has been checked in. The web server handling everything is at `amk.ca`_ off of a DSL line so don't hit it too hard. An example of how to change a Distutils ``setup.py`` file is given in the e-mail.

.. _PEP 301: http://www.python.org/peps/pep-0301.html
.. _amk.ca: http://www.amk.ca/cgi-bin/pypi.cgi

=======================================
bz2 problem deriving subclass in C
=======================================
(Can't find in Mailman archive)

This thread was originally about a patch by Neal Norwitz and how to properly deal with deallocating in C code. Apparently this has nipped several people in the bum and has changed since the introduction of new-style classes. It appears that if you are deallocating a base type, you call its ``tp_dealloc()`` function; if it is a new-style class, you call ``self->ob_type->tp_free()``.

====================
`Cross compiling`__
====================
__ http://mail.python.org/pipermail/python-dev/2003-January/031848.html
(Link only part of thread; rest missing)

Splinter threads:

- `Bastion too `__
- `What attempts at security should/can Python implement? `__
- `Whither rexec? `__
- `tainting `__

Originally a thread about cross-compilation on OS X to Windows by Timothy Wood, this ended up changing into a thread about security in Python.
As has come up multiple times before, rexec_'s lack of actual security for new-style classes came up. This then led to a question of whether Bastion_ was secure as well. The answer was no, and both modules have now been sabotaged on purpose by Guido; they now require you to manually edit the code to allow them to work.

A discussion on how to make Python secure came up. Various points were made such as how to deal with introspection and how people might try to get around being locked out of features. Tainting was also mentioned for strings.

If you need some security in your app, take a look at mxProxy_. Its design is similar to how Zope3_ handles security.

.. _rexec: http://www.python.org/doc/current/lib/module-rexec.html
.. _Bastion: http://www.python.org/doc/current/lib/module-Bastion.html
.. _mxProxy: http://www.egenix.com/files/python/mxProxy.html
.. _Zope3: http://dev.zope.org/Wikis/DevSite/Projects/ComponentArchitecture/FrontPage

===========================
`tutor(function|module)`__
===========================
__ http://mail.python.org/pipermail/python-dev/2003-January/031893.html
(Only part of thread; rest missing from archive)

Christopher Blunk proposed the idea of having a built-in ``examples()`` function or something similar that would spit out example code on how to use something. There was a discussion as to whether it was reasonable for Python to try to keep up a group of examples in the distribution.

It was agreed, though, that more code examples would be good to have *somewhere*. One place is in the Demos_ directory of the source distribution. The problem with that is it is only available in the source distribution and it is not vigorously maintained. Both of these are solvable, but it didn't look like anyone was taking the reins to cause this to happen.

Another was to push people to give to an online repository (such as the `Python Cookbook`_) to keep it centralized and accessible by anyone.
Problem with that is it creates a dependency on something outside of PythonLab's control. The last option was adding it to the module documentation. This is the safest since it means it becomes permanent and everyone will have access to it. But it has the same drawback as all of these other suggestions; you can't access it in the interpreter as easily as you can ``help()``. The thread ended with no conclusion beyond cleaning up the ``Demos`` directory and adding more examples to the official documentation would be nice. .. _Python Cookbook: http://aspn.activestate.com/ASPN/Python/Cookbook/ ========================================= `PEP 297: Support for System Upgrades`__ ========================================= __ http://mail.python.org/pipermail/python-dev/2003-January/031838.html MA Lemburg asked if Guido would be willing to let an implementation of `PEP 297`_ into Python 2.3. He said yes and would also be willing to add it to 2.2.3 and even 2.1.4 if that version ever comes about. A discussion of how to handle the implementation began. The agreed upon directory name became ``site-package-`` with ```` being the full version: major.minor.micro. ``site.py`` will be modified so that if the current version of Python is different from the version of the directory that it won't add it to ``sys.path``. There was the brief idea of deleting the directory as soon as the version did not match, but that was thought of being harsh and unneeded. .. _PEP 297: http://www.python.org/peps/pep-0297.html ================================ `PyBuffer* vs. array.array()`__ ================================ __ http://mail.python.org/pipermail/python-dev/2003-January/031841.html (Only part of thread; rest missing from archive) Splinter threads: - `Slow String Repeat `__ Bill Bumgarner asked whether ``PyBuffer*`` or ``array.array()`` was the proper thing to use because one is like a huge string and the other is unsigned ints. 
Guido basically said to just use strings like everyone else for byte storage. Bill said he would just live with the dichotomy. This thread also brought up a question of performance for the code ``array.array('B').fromlist([0 for x in range(0, width*height*3)])``. Was cool to watch various people take stabs as speeding it up. Written in Python, the fast way was to change it to ``array.array('B', [0])*width*height*3``; doing several decent-sized calls minimized ``memcpy()`` calls in the underlying C code. Raymond Hettinger came up with a fast way to do it in C. ========================== `new features for 2.3?`__ ========================== __ http://mail.python.org/pipermail/python-dev/2003-January/031837.html Neal Norwitz wanted to get the tarfile_ module into Python 2.3. This led to Guido saying he wanted to get Docutils_ and better pickling for new-style classes in as well. The Docutils idea was shot down because it was stated it currently is not ready for inclusion. 2.4 is probably the version being shot for. The pickle idea raised security ideas (this is becoming a trend for some reason on the list lately). In case you didn't know, you shouldn't run pickled code you don't trust. .. _tarfile: http://python.org/sf/651082 ================ `sys.path[0]`__ ================ __ http://mail.python.org/pipermail/python-dev/2003-January/031896.html Thomas Heller had a question about the documented handling of ``sys.path[0]``. This quickly led to the point that the value, when specified and not the empty string, is a relative path; not necessarily the desired result. And so `patch #664376`_ was opened to make it be absolute. .. _patch #664376: http://www.python.org/sf/664376 =================== `Misc. warnings`__ =================== __ http://mail.python.org/pipermail/python-dev/2003-January/031884.html MA Lemburg noticed that ``test_ossaudiodev`` and ``test_bsddb3`` were not expected skips on his Linux box. 
He wondered why they were not expected to be skipped on his platform. This led to a discussion over the expected skip set and how it is decided what is going to be skipped (experts on the various OSs make the call).

==============================
`Raising string exceptions`__
==============================
__ http://mail.python.org/pipermail/python-dev/2003-January/031907.html

It looks like string exceptions are going to raise PendingDeprecationWarning; you have been warned.

======================
`PEP 290 revisited`__
======================
__ http://mail.python.org/pipermail/python-dev/2003-January/032003.html

Kevin Altis was cleaning up wxPython_ and all the references to the ``string`` module and pointed out that the stdlib probably could stand to have a similar cleaning as well. Guido said, though, that it is better to choose a module you are familiar with and do a complete style clean-up instead of doing a library-wide clean-up of a single thing.

.. _wxPython: http://www.wxpython.org/

============================
`Assignment to __class__`__
============================
__ http://mail.python.org/pipermail/python-dev/2003-January/032016.html

Kevin Jacobs got bitten by the change to non-heap objects that prevents the assigning to ``__class__`` (it was recently backported to the 2.2 branch). He asked if there was any way to bring it back; Guido said no because of various layout issues in the C struct and such.

=================================
`Parallel pyc construction`__
=================================
__ http://mail.python.org/pipermail/python-dev/2003-January/032060.html

Paul Dubois ran into a problem on a 384 processor system with .pyc file corruption. He wanted a way to just skip the creation of .pyc files entirely, since his programs run for so long that initial start-up times are inconsequential, and reducing start-up time is the point of .pyc files. Neal Norwitz is working on `patch #602345`_ to implement a command-line option.

.. 
_patch #602345: http://www.python.org/sf/602345 ============================================== `Extension modules, Threading, and the GIL`__ ============================================== __ http://mail.python.org/pipermail/python-dev/2003-January/032017.html This thread was briefly mentioned in the `last summary`_. But it had changed into a discussion of Python external thread API. Mark Hammond commented how he had bumped up against its limitations in multiple contexts. Mark thought it would be useful to define an API where one wanted the ability to say "I'm calling out to/in from an external API". This would be a large project, though, and so Mark was not up to doing it on his own. David Abrahams, though, stepped up and said he was willing to help. So did Anthony Baxter. Tim Peters had his own twist by wanting to be able to call from a thread into a Python thread without knowing a single thing about that state of Python at that point, e.g. don't even know that that interpreter is done initializing. Mark saw the way to go about solving all of this by writing a PEP, write a new C API to solve this issue that is optional and used by extension modules wanting to use the new features, and then define a new TLS (Thread Local Storage; thank you acronymfinder.com). Mark thought the TLS would only be needed for thread-state. Someone suggested ACE_ as a TSS (Thread State Storage) implementation which can cover all the platforms that Python needs covered. But Mark suggested coming up with a "pluggable TLS" design. .. _ACE: http://doc.ece.uci.edu/Doxygen/Beta/html/ace/classACE__TSS.html ============================================ `Interop between datetime and mxDateTime`__ ============================================ __ http://mail.python.org/pipermail/python-dev/2003-January/032100.html MA Lemburg expressed interest in trying to get mxDateTime_ to play nice with datetime_. The idea of having a base type like the new basestring in Python 2.3 was suggested. 
But there were pickling issues with datetime, MA wanting to keep mxDateTime compatible with Python 1.5.2, and the problem of getting everyone to work in a consistent way through the base type. Another instance where interfaces would be handy.

.. _mxDateTime: http://www.lemburg.com/files/python/mxDateTime.html

===========================
`properties on modules?`__
===========================
__ http://mail.python.org/pipermail/python-dev/2003-January/032106.html

Neil Schemenauer wanted to be able to make module variables be properties. Samuele Pedroni stepped up and said the issue for this to work was "about instance-level property apart the module thing details". He also pointed out that it would mean you could directly import a property which would cause unexpected results, thus killing the idea. Guido was disappointed he didn't get to wield his BDFL pronouncement stick and smack the idea down.

======================================
`Fwd: Re: PEP 267 improvement idea`__
======================================
__ http://mail.python.org/pipermail/python-dev/2003-January/032165.html

Oren Tirosh forwarded an e-mail from python-list@python.org to python-dev about `PEP 267`_ and various strategies. One was using negative keys for signaling guaranteed key lookup failure. Oren also played with inlining. He also said that since interned strings are now out of Python as of 2.3 (thanks to Oren) that the namespace could "have only interned strings as entry keys and avoid the need for a second search pass on first failed lookup". He even considered creating a special dict type for namespaces.

.. 
_PEP 267: http://www.python.org/peps/pep-0267.html From aahz@pythoncraft.com Fri Jan 17 00:02:50 2003 From: aahz@pythoncraft.com (Aahz) Date: Thu, 16 Jan 2003 19:02:50 -0500 Subject: [Python-Dev] python-dev Summary for 2003-01-01 through 2003-01-15 In-Reply-To: References: Message-ID: <20030117000249.GA2917@panix.com> On Thu, Jan 16, 2003, Brett Cannon wrote: > > ================== > `Holes in time`__ > ================== > __ http://mail.python.org/pipermail/python-dev/2003-January/031836.html > > Tim Peters brought up an inherent problem with the new datetime_ type and > dealing with timezones and daylight savings. Imagine when DST starts; the > clock jumps from 1:59 to 3:00. Now what happens if someone enters a time > that does not use DST? The implementation has to handle the possibility > of something inputting 2:00 and instantly pushing forward an hour. But > what is worse is when DST ends. You go from 1:59 back to 1:00! That > means you pass over everything from 1:00 to 1:59 twice during that day. > Since it is flat-out impossible to tell, what should be done in this case? > > Originally ``datetime`` raised ValueError. It seems to be staying that > way (after a *very* lengthy discussion clarifying the whole situation). Unless Tim/Guido changed their minds yet again, I believe that we agreed that the conversion should in fact go through 1:00am to 1:59am twice. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "I used to have a .sig but I found it impossible to please everyone..." --SFJ From tim.one@comcast.net Fri Jan 17 00:12:39 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 16 Jan 2003 19:12:39 -0500 Subject: [Python-Dev] python-dev Summary for 2003-01-01 through 2003-01-15 In-Reply-To: <20030117000249.GA2917@panix.com> Message-ID: [Aahz] > ... > Unless Tim/Guido changed their minds yet again, I believe that we agreed > that the conversion should in fact go through 1:00am to 1:59am twice. 
Guido never changed his mind (indeed, I have a hard time imagining such a possibility ). I changed the implementation to do as you say. It would be giving it too much credit to say that it had a mind, though. It's content to have a killer bod. From Jack.Jansen@cwi.nl Fri Jan 17 13:49:03 2003 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 17 Jan 2003 14:49:03 +0100 Subject: [Python-Dev] make install failing with current cvs Message-ID: <708CA78A-2A22-11D7-ABDC-0030655234CE@cwi.nl> Make install has started failing on me with the current CVS tree. The problem is that I have various third-party packages installed in site-python that have inconsistent tab usage (PyOpenGL and Numeric are two of the more popular ones). The compileall step gets TabError exceptions for these files, and this causes it to finally exiting with a non-zero exit status. I think either compileall should skip site-packages, or it should at least not abort the install if there are compile errors there. I've looked around and I think patch 661719 has something to do with this, but I'm not sure. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From skip@pobox.com Fri Jan 17 14:32:26 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 17 Jan 2003 08:32:26 -0600 Subject: [Python-Dev] python-dev Summary for 2003-01-01 through 2003-01-15 In-Reply-To: References: Message-ID: <15912.5114.261289.49046@montanaro.dyndns.org> I think the first item on the list should be the security problems in rexec and Bastion which led Guido to cripple them. This is likely to be the most significant incompatibility in 2.3 (and 2.2.3 if it is ever released). Brett> =================================================== Brett> `PEP 303: Extend divmod() for Multiple Divisors`__ Brett> =================================================== ... Brett> to compute", primarily because ``divmod()`` is a method and the overhead function --------^^^^^^ ... 
Brett> should be changed in Py3K. Someone suggested renaming it ``def``, but Maybe define "Py3K" so the more reactionary readers on c.l.py don't think the language is going to break with the release of Python 2.3. (You might be surprised at how some of them react and what they react to.) Skip From guido@python.org Fri Jan 17 14:55:50 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 17 Jan 2003 09:55:50 -0500 Subject: [Python-Dev] make install failing with current cvs In-Reply-To: Your message of "Fri, 17 Jan 2003 14:49:03 +0100." <708CA78A-2A22-11D7-ABDC-0030655234CE@cwi.nl> References: <708CA78A-2A22-11D7-ABDC-0030655234CE@cwi.nl> Message-ID: <200301171455.h0HEtod15046@odiug.zope.com> > Make install has started failing on me with the current CVS tree. The > problem is that I have various third-party packages installed in > site-python that have inconsistent tab usage (PyOpenGL and Numeric are > two of the more popular ones). The compileall step gets TabError > exceptions for these files, and this causes it to finally exiting with > a non-zero exit status. > > I think either compileall should skip site-packages, or it should at > least not abort the install if there are compile errors there. Maybe for compiling site-packages a separate Python invocation could be used that doesn't use -tt. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@alum.mit.edu Fri Jan 17 16:15:46 2003 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 17 Jan 2003 11:15:46 -0500 Subject: [Python-Dev] PyConDC sprints Message-ID: <15912.11314.856979.394880@slothrop.zope.com> We have reserved the two days before the DC Python conference for sprints. Tres Seaver from Zope came up with the term sprint to describe a multi-day, focused development session that uses ideas from Extreme Programming. Another way to think about it is an opportunity to sit face-to-face with your fellow Python developers to work on Python. 
I'd like to get an idea for how many people on this list are planning to attend the sprints and what topics they want to work on. The key to making the sprint work is to find sprint coaches who can round up developers and make realistic plans for a two-day development project.

If you want to attend, want to coach, or have an idea about a topic, please send me an email. I'll summarize to this list later.

I expect there will be a nominal cost for the sprints, maybe $50 to cover room and infrastructure. We're not planning to provide food, but there is a cafeteria in the building.

Jeremy

From altis@semi-retired.com Fri Jan 17 18:51:52 2003
From: altis@semi-retired.com (Kevin Altis)
Date: Fri, 17 Jan 2003 10:51:52 -0800
Subject: [Python-Dev] sys.exit and PYTHONINSPECT
Message-ID:

While trying some unittests I was surprised to find that the -i command-line option would not keep the interpreter running. Bill Bumgarner pointed out to me that "-i doesn't deal with sys.exit. The last line of the runTests() method in the unittest module invokes sys.exit()."

Bill also suggested that if I wanted to keep the interpreter up in the case of the unittest I could do so with:

    if __name__ == '__main__':
        try:
            unittest.main()
        except SystemExit:
            pass

But the core issue to me is that if you invoke the Python interpreter with -i, then even sys.exit shouldn't kill the interpreter, especially since sys.exit generates an exception which can be caught. I can't think of any other case where -i fails to keep the interpreter alive after a script exits, whether because of a syntax or runtime error or normal termination. The interpreter help states:

    -i : inspect interactively after running script, (also PYTHONINSPECT=x)
         and force prompts, even if stdin does not appear to be a terminal

So should the sys.exit behavior with the -i command-line option be considered a bug or a feature? ;-) I'm happy to enter a report to get the behavior changed if possible.
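
[Editorial sketch, not part of the original message.] The point that sys.exit only raises an exception can be illustrated in isolation. The names `run_and_survive` and `script_body` are hypothetical, introduced purely for illustration, and the code uses current Python syntax (`except ... as`) rather than the spelling of the time. It says nothing about how -i is implemented inside the interpreter.

```python
import sys


def run_and_survive(func):
    """Call func(); if it calls sys.exit(), return the requested status.

    Hypothetical helper: sys.exit() raises SystemExit, which is an
    ordinary exception that a caller can catch.
    """
    try:
        func()
    except SystemExit as exc:
        return exc.code
    return None


def script_body():
    # Stand-in for a script (or unittest.main()) that ends with sys.exit().
    sys.exit(3)


status = run_and_survive(script_body)
print("script asked to exit with status", status)
```

Because the exit request arrives as an exception object, the caller can decide whether to honor it, which is the behavior being asked for when -i is given.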
ka --- Kevin Altis altis@semi-retired.com http://radio.weblogs.com/0102677/ http://www.pythoncard.org/ From guido@python.org Fri Jan 17 18:56:52 2003 From: guido@python.org (Guido van Rossum) Date: Fri, 17 Jan 2003 13:56:52 -0500 Subject: [Python-Dev] sys.exit and PYTHONINSPECT In-Reply-To: Your message of "Fri, 17 Jan 2003 10:51:52 PST." References: Message-ID: <200301171856.h0HIurr20514@odiug.zope.com> > So should the sys.exit behavior with the -i command-line option be > considered a bug or a feature? ;-) I'm happy to enter a report to get the > behavior changed if possible. I'd like a patch. --Guido van Rossum (home page: http://www.python.org/~guido/) From altis@semi-retired.com Fri Jan 17 19:35:36 2003 From: altis@semi-retired.com (Kevin Altis) Date: Fri, 17 Jan 2003 11:35:36 -0800 Subject: [Python-Dev] sys.exit and PYTHONINSPECT In-Reply-To: <200301171856.h0HIurr20514@odiug.zope.com> Message-ID: > From: Guido van Rossum > > > So should the sys.exit behavior with the -i command-line option be > > considered a bug or a feature? ;-) I'm happy to enter a report > to get the > > behavior changed if possible. > > I'd like a patch. I've been grepping the .py sources and haven't found where an unhandled SystemExit exception is dealt with that relates to this issue. I'm guessing that is the place that needs the change, but if the change is in C source I'm not your man. Plus I know nothing of the Python internals. I'll enter a bug report if nobody else steps up to the plate. ka From guido@python.org Sat Jan 18 17:24:23 2003 From: guido@python.org (Guido van Rossum) Date: Sat, 18 Jan 2003 12:24:23 -0500 Subject: [Python-Dev] logging package -- rename warn to warning? Message-ID: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> I think I'd like to change 'warn' to 'warning' (and WARN to WARNING). While warn() sounds nice as a verb, none of the other functions/methods are verbs, so I think it's better to name the levels after conventional level names. 
'Warning' is definitely a better name than 'warn' if you think of a logging level. Thoughts? I can implement this for Python 2.3a2. --Guido van Rossum (home page: http://www.python.org/~guido/) From vinay_sajip@red-dove.com Sat Jan 18 19:48:11 2003 From: vinay_sajip@red-dove.com (Vinay Sajip) Date: Sat, 18 Jan 2003 19:48:11 -0000 Subject: [Python-Dev] test_logging failing on Windows 2000 References: <200301160302.h0G32Ne01290@pcp02138704pcs.reston01.va.comcast.net> <003301c2bd0d$15e13360$4b11a044@oemcomputer> <003101c2bd3e$38e74da0$652b6992@alpha> <200301161438.h0GEcLu11489@odiug.zope.com> Message-ID: <006401c2bf2a$8987b860$652b6992@alpha> Guido van Rossum wrote: > I'd prefer a different approach: register the threads started by > ThreadingTCPServer in the 'threads' variable. Then they are waited > for in the 'finally:' clause. You probably should move the flushing > and closing of sockOut also. I've posted a patch on SourceForge: http://sourceforge.net/tracker/?func=detail&aid=670390&group_id=5470&atid=305470 I've temporarily lost my Windows 2000 and Linux machines, so I could only test under Windows XP Home with ActivePython 2.1.1 build 212. The CVS version of the script runs OK from the command line, but if invoked via the >>> prompt and "import test_logging; test_logging.test_main()" it caused a hang on my system. I think this is an obscure problem caused by some pathological interaction between logging, threading.py and SocketServer.py: to solve the problem I copied process_request from SocketServer.ThreadingMixIn to my ThreadingTCPServer-derived server. It was still failing, but when I commented out the redundant line "import threading" from process_request (the test script already imports threading), the test passed! I couldn't figure out why commenting out the line caused success and leaving it in caused a hang - is there something obvious I've missed?
Regards Vinay From vinay_sajip@red-dove.com Sat Jan 18 19:51:01 2003 From: vinay_sajip@red-dove.com (Vinay Sajip) Date: Sat, 18 Jan 2003 19:51:01 -0000 Subject: [Python-Dev] Re: logging package -- rename warn to warning? References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <006a01c2bf2a$ee6dcc60$652b6992@alpha> Guido van Rossum wrote: > I think I'd like to change 'warn' to 'warning' (and WARN to WARNING). > Thoughts? I can implement this for Python 2.3a2. +1 on the name change, but it will break a fair amount of code out there (from the emails I've been receiving). Could we leave warn and WARN in as synonyms for now, but only mention warning and WARNING in the documentation? Or is this against policy? Vinay From fdrake@acm.org Sat Jan 18 20:10:12 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Sat, 18 Jan 2003 15:10:12 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: <006a01c2bf2a$ee6dcc60$652b6992@alpha> References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> Message-ID: <15913.46244.345533.819617@grendel.zope.com> Vinay Sajip writes: > +1 on the name change, but it will break a fair amount of code out there > (from the emails I've been receiving). Could we leave warn and WARN in as > synonyms for now, but only mention warning and WARNING in the documentation? I think this would be very reasonable; we don't want to break things for people that have been using this package already. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From pyth@devel.trillke.net Sat Jan 18 20:23:39 2003 From: pyth@devel.trillke.net (holger krekel) Date: Sat, 18 Jan 2003 21:23:39 +0100 Subject: [Python-Dev] Re: logging package -- rename warn to warning? 
In-Reply-To: <15913.46244.345533.819617@grendel.zope.com>; from fdrake@acm.org on Sat, Jan 18, 2003 at 03:10:12PM -0500 References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> <15913.46244.345533.819617@grendel.zope.com> Message-ID: <20030118212339.B9768@prim.han.de> Fred L. Drake, Jr. wrote: > > Vinay Sajip writes: > > +1 on the name change, but it will break a fair amount of code out there > > (from the emails I've been receiving). Could we leave warn and WARN in as > > synonyms for now, but only mention warning and WARNING in the documentation? > > I think this would be very reasonable; we don't want to break things > for people that have been using this package already. But isn't this a case where the number of people who *will* use it in the future exceeds the current number of users by a factor of 10 or 100? Is it really worth the possible confusion? Introducing a new module with already "deprecated" method names seems strange to me. holger From fdrake@acm.org Sat Jan 18 20:37:34 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Sat, 18 Jan 2003 15:37:34 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: <20030118212339.B9768@prim.han.de> References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> <15913.46244.345533.819617@grendel.zope.com> <20030118212339.B9768@prim.han.de> Message-ID: <15913.47886.987147.73598@grendel.zope.com> holger krekel writes: > But isn't this a case where the number of people who *will* use > it in the future exceeds the current number of users by a factor > of 10 or 100? Hopefully! > Is it really worth the possible confusion? The suggestion wasn't that we document the old names as deprecated, but to simply not document them. Only people with existing code using the package will be using them. We certainly could add a DeprecationWarning for the warn() method. 
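One way the DeprecationWarning idea could look, as a sketch only. SketchLogger is a hypothetical stand-in and not the real logging.Logger; it assumes the old warn() spelling is kept as a thin alias that forwards to warning():

```python
import warnings


class SketchLogger:
    """Hypothetical logger used only to illustrate a deprecated alias."""

    def __init__(self):
        # Collected (level name, message) pairs, in call order.
        self.records = []

    def warning(self, msg):
        """The documented spelling."""
        self.records.append(("WARNING", msg))

    def warn(self, msg):
        """Undocumented old spelling: still works, but complains."""
        warnings.warn(
            "warn() is deprecated; use warning() instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        self.warning(msg)


logger = SketchLogger()
logger.warning("documented spelling")
```

Existing callers of warn() keep working under this scheme, and the warning points at their own call site rather than at the alias.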
> Introducing a new module with already "deprecated" method names > seems strange to me. The catch, of course, is that we're not introducing it, we're simply adding it to the standard library. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry@python.org Sun Jan 19 01:26:52 2003 From: barry@python.org (Barry A. Warsaw) Date: Sat, 18 Jan 2003 20:26:52 -0500 Subject: [Python-Dev] Adding Japanese Codecs to the distro References: <3E267C2D.3090907@lemburg.com> <15910.50700.726791.488944@gargle.gargle.HOWL> <3E26D841.2020805@lemburg.com> <20030116184331.GA66593@fallin.lv> Message-ID: <15913.65244.572844.487037@gargle.gargle.HOWL> >>>>> "HC" == Hye-Shik Chang writes: HC> KoreanCodecs in the SF Korean Python Codecs HC> (http://sf.net/projects/koco) a.k.a KoCo is changed from PSF HC> License to LGPL on Barry's request in early 2002. Which I really appreciated because it made it easier to include the codecs in Mailman. But I won't have to worry about that if the codecs are part of Python, and a PSF license probably makes most sense for that purpose. You own the copyrights, correct? If so, there should be no problem re-licensing it under the PSF license for use in Python. Thanks! -Barry From guido@python.org Sun Jan 19 02:14:59 2003 From: guido@python.org (Guido van Rossum) Date: Sat, 18 Jan 2003 21:14:59 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: Your message of "Sat, 18 Jan 2003 19:51:01 GMT." <006a01c2bf2a$ee6dcc60$652b6992@alpha> References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> Message-ID: <200301190215.h0J2F0T11876@pcp02138704pcs.reston01.va.comcast.net> > Guido van Rossum wrote: > > I think I'd like to change 'warn' to 'warning' (and WARN to WARNING). > > Thoughts? I can implement this for Python 2.3a2. > > +1 on the name change, but it will break a fair amount of code out there > (from the emails I've been receiving). 
Could we leave warn and WARN in as > synonyms for now, but only mention warning and WARNING in the documentation? > Or is this against policy? I'd rather only have one name -- otherwise we'll end up supporting both names forever. In your own distro (if you keep updating it) feel free to support both; in Python, I think I'd like to use only warning/WARNING. --Guido van Rossum (home page: http://www.python.org/~guido/) From kent@lysator.liu.se Sun Jan 19 11:20:03 2003 From: kent@lysator.liu.se (kent@lysator.liu.se) Date: 19 Jan 2003 12:20:03 +0100 Subject: [Python-Dev] String method: longest common prefix of two strings Message-ID: I'd like to get some feedback on a new string method which might be useful: getting the longest common prefix of two strings. Questions:

1. First of all: YAGNI or not? I have an application where I could use this. I will be importing emails from a trouble-ticket system into another system. The body of a trouble ticket contains a log of actions. To find the new log entries, I'd like to compare the bodies of the last and next-to-last email.
2. Is there a need for a fast built-in operation coded in C? Should I use difflib instead?
3. Is s1.commonprefix(s2) OK? Or should the name be different? s1.longestcommonprefixwith(s2) is a bit awkward to write...
4. Should the method return the length of the common prefix instead of the common prefix itself?
5. Should there be a suffix version too?

I would be willing to provide the necessary patches to stringobject.c, test_string.py and libstdtypes.tex. Regards, / Kent Engström, kent@lysator.liu.se From vinay_sajip@red-dove.com Sun Jan 19 11:48:41 2003 From: vinay_sajip@red-dove.com (Vinay Sajip) Date: Sun, 19 Jan 2003 11:48:41 -0000 Subject: [Python-Dev] Re: logging package -- rename warn to warning?
References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> <200301190215.h0J2F0T11876@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <000d01c2bfb0$b7e6c9e0$652b6992@alpha> [Guido] > I'd rather only have one name -- otherwise we'll end up supporting > both names forever. In your own distro (if you keep updating it) feel > free to support both; in Python, I think I'd like to use only > warning/WARNING. OK, that's fine - warning/WARNING it is. For consistency, though, we should also lose fatal/FATAL which are currently synonyms for critical/CRITICAL. Regards, Vinay From skip@manatee.mojam.com Sun Jan 19 13:00:19 2003 From: skip@manatee.mojam.com (Skip Montanaro) Date: Sun, 19 Jan 2003 07:00:19 -0600 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200301191300.h0JD0Jrl016923@manatee.mojam.com> Bug/Patch Summary ----------------- 336 open / 3224 total bugs (-3) 107 open / 1910 total patches (no change) New Bugs -------- os.popen+() can take string list and bypass shell. 
(2003-01-12) http://python.org/sf/666700 repr.repr not always safe (2003-01-12) http://python.org/sf/666958 import C API mess (2003-01-14) http://python.org/sf/667770 BoundaryError: multipart message with no defined boundary (2003-01-14) http://python.org/sf/667931 Abort with "negative ref count" (2003-01-15) http://python.org/sf/668433 Compiling C sources with absolute path bug (2003-01-15) http://python.org/sf/668662 Py_NewInterpreter() doesn't work (2003-01-15) http://python.org/sf/668708 classmethod does not check its arguments (2003-01-16) http://python.org/sf/668980 zipimport doesn't support prepended junk (2003-01-16) http://python.org/sf/669036 pdb user_call breakage (2003-01-17) http://python.org/sf/669692 Seg fault in gcmodule.c (2003-01-17) http://python.org/sf/669838 socket.inet_aton() succeeds on invalid input (2003-01-17) http://python.org/sf/669859 sys.exit and PYTHONINSPECT (2003-01-18) http://python.org/sf/670311 New Patches ----------- Add missing constants for IRIX al module (2003-01-13) http://python.org/sf/667548 More DictMixin (2003-01-14) http://python.org/sf/667730 doctest handles comments incorrectly (2003-01-15) http://python.org/sf/668500 Add cflags to RC compile (2003-01-16) http://python.org/sf/669198 fix memory (ref) leaks (2003-01-16) http://python.org/sf/669553 HTMLParser -- allow "," in attributes (2003-01-17) http://python.org/sf/669683 Patched test harness for logging (2003-01-18) http://python.org/sf/670390 Closed Bugs ----------- packing double yields garbage (2002-04-02) http://python.org/sf/538361 test_import crashes/hangs for MacPython (2002-06-20) http://python.org/sf/571845 Jaguar "install" does not overwrite (2002-08-30) http://python.org/sf/602398 gethostbyname("LOCALHOST") fails (2002-09-12) http://python.org/sf/608584 Mac IDE Browser / ListManager issue (2002-09-16) http://python.org/sf/610149 file URLs mis-handled by webbrowser (2002-11-26) http://python.org/sf/644246 refman: importing x.y.z as m is possible, docs say 
otherwise (2003-01-01) http://python.org/sf/660811 inspect.getsource bug (2003-01-02) http://python.org/sf/661184 test_strptime fails on the Mac (2003-01-02) http://python.org/sf/661354 macpath.py missing ismount splitunc (2003-01-03) http://python.org/sf/661762 str.index() exception message not consistent (2003-01-03) http://python.org/sf/661913 sys.version[:3] gives incorrect version (2003-01-05) http://python.org/sf/662701 Lib Man 2.2.6.2 word change (2003-01-07) http://python.org/sf/664044 files with long lines and an encoding crash (2003-01-09) http://python.org/sf/665014 curses causes interpreter crash (2003-01-10) http://python.org/sf/665570 Closed Patches -------------- SimpleXMLRPCServer - fixes and CGI (2001-10-22) http://python.org/sf/473586 allow py_compile to re-raise exceptions (2003-01-03) http://python.org/sf/661719 bug 661354 fix; _strptime handle OS9's lack of timezone info (2003-01-04) http://python.org/sf/662053 (Bug 660811: importing x.y.z as m is possible, docs say othe (2003-01-04) http://python.org/sf/662454 Implement FSSpec.SetDates() (2003-01-05) http://python.org/sf/662836 664044: 2.2.6.2 String formatting operations (2003-01-07) http://python.org/sf/664183 661913: inconsistent error messages between string an unicod (2003-01-07) http://python.org/sf/664192 From guido@python.org Sun Jan 19 14:42:06 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 19 Jan 2003 09:42:06 -0500 Subject: [Python-Dev] String method: longest common prefix of two strings In-Reply-To: Your message of "19 Jan 2003 12:20:03 +0100." References: Message-ID: <200301191442.h0JEg6621184@pcp02138704pcs.reston01.va.comcast.net> > I'd like to get some feedback on a new string method which might > be useful: getting the longest common prefix of two strings. If you really need this implemented as fast as possible, I'd suggest writing a C extension that does just that. 
I don't want every useful string operation that anybody ever comes up with to be added as a string method -- there are just too many of those. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Jan 19 14:44:23 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 19 Jan 2003 09:44:23 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: Your message of "Sun, 19 Jan 2003 11:48:41 GMT." <000d01c2bfb0$b7e6c9e0$652b6992@alpha> References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> <200301190215.h0J2F0T11876@pcp02138704pcs.reston01.va.comcast.net> <000d01c2bfb0$b7e6c9e0$652b6992@alpha> Message-ID: <200301191444.h0JEiNq21202@pcp02138704pcs.reston01.va.comcast.net> > OK, that's fine - warning/WARNING it is. For consistency, though, we should > also lose fatal/FATAL which are currently synonyms for critical/CRITICAL. Yes, except I find fatal/FATAL so much easier to write. Do we really need the "political correctness" of critical/CRITICAL? --Guido van Rossum (home page: http://www.python.org/~guido/) From neal@metaslash.com Sun Jan 19 15:02:23 2003 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 19 Jan 2003 10:02:23 -0500 Subject: [Python-Dev] String method: longest common prefix of two strings In-Reply-To: References: Message-ID: <20030119150223.GI28870@epoch.metaslash.com> On Sun, Jan 19, 2003 at 12:20:03PM +0100, kent@lysator.liu.se wrote: > I'd like to get some feedback on a new string method which might > be useful: getting the longest common prefix of two strings. os.path.commonprefix(sequence_of_strings) does this.
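A short sketch of how os.path.commonprefix could serve the trouble-ticket use case described earlier in the thread. The two message bodies are made-up sample data, and the assumption is that the newer body simply appends entries to the older one:

```python
import os.path

# Hypothetical bodies of the next-to-last and last ticket emails.
old_body = "opened ticket\nassigned to kent\n"
new_body = "opened ticket\nassigned to kent\nclosed by kent\n"

# commonprefix compares character by character, so it works on any
# strings, not only on file system paths.
prefix = os.path.commonprefix([old_body, new_body])

# Whatever follows the shared prefix is the newly added log text.
new_entries = new_body[len(prefix):]
```

Because the comparison is per character, the prefix can end in the middle of a log line when the two bodies diverge there; trimming back to the last newline would be a reasonable refinement.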
Neal From neal@metaslash.com Sun Jan 19 16:14:16 2003 From: neal@metaslash.com (Neal Norwitz) Date: Sun, 19 Jan 2003 11:14:16 -0500 Subject: [Python-Dev] very slow compare of recursive objects Message-ID: <20030119161416.GK28870@epoch.metaslash.com> Could someone take a look at the patch attached to this bug report? http://python.org/sf/625698 It fixes a problem comparing recursive objects by adding a check in PyObject_RichCompare. Seems to work fine for the specific problem, however, it still doesn't help this case:

    a = []
    b = []
    for i in range(2): a.append(a)
    for i in range(2): b.append(b)
    print a == b

Thanks, Neal From martin@v.loewis.de Sun Jan 19 18:14:56 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 19 Jan 2003 19:14:56 +0100 Subject: [Python-Dev] make install failing with current cvs In-Reply-To: <708CA78A-2A22-11D7-ABDC-0030655234CE@cwi.nl> References: <708CA78A-2A22-11D7-ABDC-0030655234CE@cwi.nl> Message-ID: Jack Jansen writes: > The compileall step gets TabError > exceptions for these files, and this causes it to finally exit with > a non-zero exit status. [...] > I've looked around and I think patch 661719 has something to do with > this, but I'm not sure. It does: Any kind of SyntaxError used to be ignored in compileall; now it causes a non-zero exit status (although compileall continues after the error). If the old behavior is desired, I can restore it to let SyntaxErrors pass silently again. I think it was unintentional that compileall would succeed even if there were syntax errors; compile_dir would check whether the result of py_compile.compile was 0, however, the only possible result of that function was None. Regards, Martin From DavidA@ActiveState.com Sun Jan 19 18:57:47 2003 From: DavidA@ActiveState.com (David Ascher) Date: Sun, 19 Jan 2003 10:57:47 -0800 Subject: [Python-Dev] Re: logging package -- rename warn to warning?
References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> <200301190215.h0J2F0T11876@pcp02138704pcs.reston01.va.comcast.net> <000d01c2bfb0$b7e6c9e0$652b6992@alpha> <200301191444.h0JEiNq21202@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E2AF52B.8090804@ActiveState.com> Guido van Rossum wrote: >>OK, that's fine - warning/WARNING it is. For consistency, though, we should >>also lose fatal/FATAL which are currently synonyms for critical/CRITICAL. > > > Yes, except I find fatal/FATAL so much easier to write. Do we really > need the "political correctness" of critical/CRITICAL? Are you seriously arguing for saving a few keystrokes? After adding three to warn? =) It's not political correctness, it's technical correctness. I say do it right for all the calls or leave it as is (or revert to Java compatibility). --david From goodger@python.org Sun Jan 19 19:37:40 2003 From: goodger@python.org (David Goodger) Date: Sun, 19 Jan 2003 14:37:40 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: <200301191444.h0JEiNq21202@pcp02138704pcs.reston01.va.comcast.net> Message-ID: [Vinay Sajip ] >> OK, that's fine - warning/WARNING it is. For consistency, though, we should >> also lose fatal/FATAL which are currently synonyms for critical/CRITICAL. [Guido van Rossum] > Yes, except I find fatal/FATAL so much easier to write. Do we really > need the "political correctness" of critical/CRITICAL? I would characterize "fatal" as misleading, and "critical" as accurate. "Fatal" implies "death", as in process termination, or sys.exit(), or at least an exception being raised. It's an important distinction, well worth the extra 3 letters. 
-- David Goodger http://starship.python.net/~goodger Programmer/sysadmin for hire: http://starship.python.net/~goodger/cv From Jack.Jansen@oratrix.com Sun Jan 19 22:02:38 2003 From: Jack.Jansen@oratrix.com (Jack Jansen) Date: Sun, 19 Jan 2003 23:02:38 +0100 Subject: [Python-Dev] make install failing with current cvs In-Reply-To: Message-ID: On Sunday, Jan 19, 2003, at 19:14 Europe/Amsterdam, Martin v. Löwis wrote: > Jack Jansen writes: > >> The compileall step gets TabError >> exceptions for these files, and this causes it to finally exit with >> a non-zero exit status. > [...] >> I've looked around and I think patch 661719 has something to do with >> this, but I'm not sure. > > It does: Any kind of SyntaxError would be ignored in compileall, now > it causes a non-zero exit status (although compileall continues after > the error) > > If this is desired, I can restore it to let SyntaxErrors pass silently > again. No, I think it's a good idea if syntax errors in the standard library cause "make install" to fail. But site-python is a bit of an exception. And especially in the case of Holier-Than-Thou-Tab-Usage checking. I like Guido's suggestion of using '-tt' only in Lib proper and compiling site-python without it (possibly with '-t', but maybe not even that?). I haven't looked at whether this is doable with compileall.py, though... -- - Jack Jansen http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman - From martin@v.loewis.de Sun Jan 19 22:14:14 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Sun, 19 Jan 2003 23:14:14 +0100 Subject: [Python-Dev] make install failing with current cvs In-Reply-To: References: Message-ID: <3E2B2336.8020001@v.loewis.de> Jack Jansen wrote: > I like Guido's suggestion of using '-tt' only in Lib proper and > compiling site-python without it (possibly with '-t', but maybe not even > that?).
I haven't looked at whether this is doable with compileall.py, > though... It is very difficult. site-packages is compiled as a side effect of compiling the entire lib directory, by means of recursive traversal. I don't think it is possible to turn off -tt at run time. It might be possible to exclude it from compilation altogether. Regards, Martin From brett@python.org Sun Jan 19 23:57:26 2003 From: brett@python.org (Brett Cannon) Date: Sun, 19 Jan 2003 15:57:26 -0800 (PST) Subject: [Python-Dev] test_email always supposed to be skipped? Message-ID: I just updated my copy of CVS, compiled, and installed, and once again the test_email testing suite was skipped because regrtest.py couldn't find email.test when running it from my newly installed copy. Since the Makefile does not list email/test as a subdirectory to install I assume it is not supposed to be installed and all of this is on purpose. So why even have test_email.py installed if it is always going to fail? If the only way to run it is to use the executable compiled in your CVS directory I don't see the point of having ``make install`` copy test_email over. Would it be reasonable to add a note to test_email.py that email.test is not installed and so it being skipped is expected outside of the CVS directory? Or add some code that will just pass the test if email.test can't be imported? Or at least have it raise TestSkipped on its own and have it output that this is expected if you are not running from a copy of CVS? -Brett From tim.one@comcast.net Mon Jan 20 00:38:16 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 19 Jan 2003 19:38:16 -0500 Subject: [Python-Dev] very slow compare of recursive objects In-Reply-To: <20030119161416.GK28870@epoch.metaslash.com> Message-ID: This is a multi-part message in MIME format.
--Boundary_(ID_LbvhV0B4RROIr+t2rcTs3Q) Content-type: text/plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT [Neal Norwitz] > Could someone take a look at the patch attached to this bug report? > > http://python.org/sf/625698 -1: it doesn't really solve the problem, and it's not necessarily the case anymore that x is y implies x == y. > It fixes a problem comparing recursive objects by adding a check in > PyObject_RichCompare. The bug report is confusing: the original report claimed crashes, but nobody was able to reproduce that, and jaw-dropping slowness isn't a bug. > Seems to work fine for the specific problem, > however, it still doesn't help this case: > > a = [] > b = [] > for i in range(2): a.append(a) > for i in range(2): b.append(b) > print a == b That runs in an eyeblink with the attached patch, but this is delicate stuff and the attached is pure hackery. I think there are two "deep" problems with the recursive compare gimmicks: 1. Entries in the inprogress dict are cleared too early, requiring exponential time to rebuild them over & over & ... & over again. 2. By the time check_recursion() is called, compare_nesting is already larger than NESTING_LIMIT. As a result, the tuple rich comparisons *implied by* the PyDict_GetItem and PyDict_SetItem calls within check_recursion also trigger the "recursive compare" gimmicks: the very machinery used to check whether recursion is in progress ends up exacerbating the problem by doing "hidden" comparisons on the (address#1, address#2, comparison_op) tuples it uses to index the dict. So, the addresses of those tuples also end up in the inprogress dict in new tuples of their own, and so on -- it's something of a miracle that it ever stops <0.9 wink>. The attached worms around both of those without any grace, so I'm -1 on the attached too. But seeing one way that seems to get close to solving the problem "for real" may inspire someone to do it gracefully.
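The "in progress" idea described above can be sketched in pure Python. This is not the C machinery in object.c: it assumes equality only, handles lists only, is written for a current Python, and the name recursive_eq is hypothetical. Entries are kept for the whole top-level comparison, which is the opposite of clearing them early:

```python
def recursive_eq(a, b, _inprogress=None):
    """Compare possibly self-referential lists for equality.

    Hypothetical sketch: a pair of lists that is already being compared
    is assumed equal until shown otherwise.
    """
    if _inprogress is None:
        _inprogress = set()
    if not (isinstance(a, list) and isinstance(b, list)):
        return a == b
    # Key on object identity, ordered so (a, b) and (b, a) share a token.
    token = (min(id(a), id(b)), max(id(a), id(b)))
    if token in _inprogress:
        return True
    if len(a) != len(b):
        return False
    # The token is never removed during this top-level call, so the same
    # pair is not re-expanded over and over.
    _inprogress.add(token)
    return all(recursive_eq(x, y, _inprogress) for x, y in zip(a, b))


a = []
b = []
for i in range(2):
    a.append(a)
for i in range(2):
    b.append(b)
result = recursive_eq(a, b)
```

Because the tokens are plain tuples of integers held in a set, looking them up never re-enters the list comparison, which sidesteps the second problem in this simplified setting.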
--Boundary_(ID_LbvhV0B4RROIr+t2rcTs3Q) Content-type: text/plain; name=patch.txt Content-transfer-encoding: 7BIT Content-disposition: attachment; filename=patch.txt Index: object.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Objects/object.c,v retrieving revision 2.195 diff -u -r2.195 object.c --- object.c 13 Jan 2003 20:13:04 -0000 2.195 +++ object.c 20 Jan 2003 00:32:21 -0000 @@ -708,6 +708,7 @@ #define NESTING_LIMIT 20 static int compare_nesting = 0; +static int use_dict = 0; static PyObject* get_inprogress_dict(void) @@ -746,18 +747,21 @@ check_recursion(PyObject *v, PyObject *w, int op) { PyObject *inprogress; - PyObject *token; + PyObject *token = NULL; Py_uintptr_t iv = (Py_uintptr_t)v; Py_uintptr_t iw = (Py_uintptr_t)w; PyObject *x, *y, *z; + int save_use_dict = use_dict; + + use_dict = 0; inprogress = get_inprogress_dict(); if (inprogress == NULL) - return NULL; + goto Done; token = PyTuple_New(3); if (token == NULL) - return NULL; + goto Done; if (iv <= iw) { PyTuple_SET_ITEM(token, 0, x = PyLong_FromVoidPtr((void *)v)); @@ -771,19 +775,24 @@ PyTuple_SET_ITEM(token, 2, z = PyInt_FromLong((long)op)); if (x == NULL || y == NULL || z == NULL) { Py_DECREF(token); - return NULL; + token = NULL; + goto Done;; } if (PyDict_GetItem(inprogress, token) != NULL) { Py_DECREF(token); - return Py_None; /* Without INCREF! */ + token = Py_None; /* Without INCREF! */ + goto Done; } if (PyDict_SetItem(inprogress, token, token) < 0) { Py_DECREF(token); - return NULL; + token = NULL; + goto Done; } +Done: + use_dict = save_use_dict; return token; } @@ -802,6 +811,18 @@ Py_DECREF(token); } +static void +clear_inprogress_dict() +{ + PyObject *inprogress; + + inprogress = get_inprogress_dict(); + if (inprogress == NULL) + PyErr_Clear(); + else + PyDict_Clear(inprogress); +} + /* Compare v to w. Return -1 if v < w or exception (PyErr_Occurred() true in latter case). 0 if v == w. 
@@ -926,18 +947,21 @@ assert(Py_LT <= op && op <= Py_GE); compare_nesting++; - if (compare_nesting > NESTING_LIMIT && + if ((use_dict || compare_nesting > NESTING_LIMIT) && (v->ob_type->tp_as_mapping || (v->ob_type->tp_as_sequence && !PyString_Check(v) && !PyTuple_Check(v)))) { /* try to detect circular data structures */ - PyObject *token = check_recursion(v, w, op); - if (token == NULL) { + int save_compare_nesting = compare_nesting; + PyObject *token; + + use_dict = 1; + compare_nesting = NESTING_LIMIT - 2; + token = check_recursion(v, w, op); + if (token == NULL) res = NULL; - goto Done; - } else if (token == Py_None) { /* already comparing these objects with this operator. assume they're equal until shown otherwise */ @@ -952,10 +976,9 @@ } Py_XINCREF(res); } - else { + else res = do_richcmp(v, w, op); - delete_token(token); - } + compare_nesting = save_compare_nesting; goto Done; } @@ -993,6 +1016,10 @@ res = do_richcmp(v, w, op); Done: compare_nesting--; + if (use_dict && compare_nesting <= 0) { + clear_inprogress_dict(); + use_dict = 0; + } return res; } --Boundary_(ID_LbvhV0B4RROIr+t2rcTs3Q)-- From tim.one@comcast.net Mon Jan 20 00:48:36 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 19 Jan 2003 19:48:36 -0500 Subject: [Python-Dev] test_email always supposed to be skipped? In-Reply-To: Message-ID: test_email is supposed to run, whether from the build directory or from an installation. It does both on Windows today, so yours must be a platform-specific bug. > -----Original Message----- > From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On > Behalf Of Brett Cannon > Sent: Sunday, January 19, 2003 6:57 PM > To: python-dev@python.org > Subject: [Python-Dev] test_email always supposed to be skipped? > > > I just updated my copy of CVS, compiled, and installed and once again the > test_email testing suite was skipped because regrtest.py couldn't find > email.test when running it from my newly installed copy. 
Since the > Makefile does not list email/test as a subdirectory to install I assume > it is not supposed to be installed and all of this is on purpose. So why > even have test_email.py installed if it is always going to fail? If the > only way to run it is to use the executable compiled in your CVS > directory I don't see the point of having ``make install`` copy > test_email over. > Would it be reasonable to add a note to test_email.py that email.test is > not installed and so it being skipped is expected outside of the CVS > directory? Or add some code that will just pass the test if email.test > can't be imported? Or at least have it raise TestSkipped on its own and > have it output that this is expected if you are not running from a > copy of CVS? > > -Brett > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From bac@OCF.Berkeley.EDU Mon Jan 20 01:00:36 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Sun, 19 Jan 2003 17:00:36 -0800 (PST) Subject: [Python-Dev] test_email always supposed to be skipped? In-Reply-To: References: Message-ID: [Tim Peters] > test_email is supposed to run, whether from the build directory or from an > installation. It does both on Windows today, so yours must be a > platform-specific bug. > OK, thanks, Tim. I will see if I can get a patch for this done. -Brett From barry@python.org Mon Jan 20 01:08:28 2003 From: barry@python.org (Barry A. Warsaw) Date: Sun, 19 Jan 2003 20:08:28 -0500 Subject: [Python-Dev] test_email always supposed to be skipped? References: Message-ID: <15915.19468.533860.117395@gargle.gargle.HOWL> >>>>> "BC" == Brett Cannon writes: BC> Since the Makefile does not list email/test as a subdirectory BC> to install I assume it is not supposed to be installed and all BC> of this is on purpose. I wouldn't make that assumption. 
I think email/test (and email/test/data and bsddb/test) should be installed in a from-source build. I'll make the fix to Makefile.pre. -Barry From tim.one@comcast.net Mon Jan 20 01:09:14 2003 From: tim.one@comcast.net (Tim Peters) Date: Sun, 19 Jan 2003 20:09:14 -0500 Subject: [Python-Dev] very slow compare of recursive objects In-Reply-To: Message-ID: Speaking of graceful approaches, I expect that this code: (v->ob_type->tp_as_mapping || (v->ob_type->tp_as_sequence && !PyString_Check(v) && !PyTuple_Check(v)))) { no longer does what it intended to do for tuples. Tuples can't be recursive, so it intended to exempt tuples from the recursive-compare machinery. But Michael Hudson added a non-NULL tp_as_mapping slot to tuples in rev 2.65 of tupleobject.c, so tuples are no longer exempted by the recursive-compare gimmick. That makes the problem in 2.3 worse than it used to be (although it's never been zippy, due to clearing inprogress dict entries while they're still potentially useful). So all the hacks I added (in my ugly patch) to keep check_recursion's tuple operations out of this code would be better done by restoring the "don't look at tuples at all" intent of the code above. From guido@python.org Mon Jan 20 01:15:09 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 19 Jan 2003 20:15:09 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: Your message of "Sun, 19 Jan 2003 10:57:47 PST." <3E2AF52B.8090804@ActiveState.com> References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> <200301190215.h0J2F0T11876@pcp02138704pcs.reston01.va.comcast.net> <000d01c2bfb0$b7e6c9e0$652b6992@alpha> <200301191444.h0JEiNq21202@pcp02138704pcs.reston01.va.comcast.net> <3E2AF52B.8090804@ActiveState.com> Message-ID: <200301200115.h0K1F9s23575@pcp02138704pcs.reston01.va.comcast.net> > >>OK, that's fine - warning/WARNING it is. 
For consistency, though, we should > >>also lose fatal/FATAL which are currently synonyms for critical/CRITICAL. > > > > Yes, except I find fatal/FATAL so much easier to write. Do we really > > need the "political correctness" of critical/CRITICAL? > > Are you seriously arguing for saving a few keystrokes? After adding > three to warn? =) Not at all. I think I've seen many loggers with a series of levels going something like warning, error, fatal. I've never seen 'critical' before in this context. Thus I find that it takes more mental work to remember or interpret 'critical' than 'fatal'. > It's not political correctness, it's technical correctness. I say do it > right for all the calls or leave it as is (or revert to Java compatibility). I know that 'critical' is technically more correct, but somehow I don't care. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jan 20 01:17:05 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 19 Jan 2003 20:17:05 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: Your message of "Sun, 19 Jan 2003 14:37:40 EST." References: Message-ID: <200301200117.h0K1H5V23597@pcp02138704pcs.reston01.va.comcast.net> > I would characterize "fatal" as misleading, and "critical" as > accurate. "Fatal" implies "death", as in process termination, or > sys.exit(), or at least an exception being raised. It's an > important distinction, well worth the extra 3 letters. I expected this response, hence my (joking) use of "political correctness". I don't think the subtlety is worth using a less common word. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Jan 20 01:31:35 2003 From: guido@python.org (Guido van Rossum) Date: Sun, 19 Jan 2003 20:31:35 -0500 Subject: [Python-Dev] make install failing with current cvs In-Reply-To: Your message of "Sun, 19 Jan 2003 23:14:14 +0100." 
<3E2B2336.8020001@v.loewis.de> References: <3E2B2336.8020001@v.loewis.de> Message-ID: <200301200131.h0K1VZw23710@pcp02138704pcs.reston01.va.comcast.net> > It is very difficult. site-packages is compiled as a side effect of > compiling the entire lib directory, by means of recursive traversal. > I don't think it is possible to turn of -tt at run time. It might be > possible to exclude it from compilation altogether. Would this work? *** compileall.py 16 Jan 2003 11:02:43 -0000 1.14 --- compileall.py 20 Jan 2003 01:31:13 -0000 *************** *** 87,93 **** Arguments (all optional): ! skip_curdir: if true, skip current directory (default true) maxlevels: max recursion level (default 0) force: as for compile_dir() (default 0) quiet: as for compile_dir() (default 0) --- 87,93 ---- Arguments (all optional): ! skip_curdir: if true, skip current directory and site-package (default true) maxlevels: max recursion level (default 0) force: as for compile_dir() (default 0) quiet: as for compile_dir() (default 0) *************** *** 97,102 **** --- 97,104 ---- for dir in sys.path: if (not dir or dir == os.curdir) and skip_curdir: print 'Skipping current directory' + elif os.path.basename(dir).lower() == 'site-packages' and skip_curdir: + print 'Skipping site-packages' else: success = success and compile_dir(dir, maxlevels, None, force, quiet=quiet) --Guido van Rossum (home page: http://www.python.org/~guido/) From bac@OCF.Berkeley.EDU Mon Jan 20 01:37:52 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Sun, 19 Jan 2003 17:37:52 -0800 (PST) Subject: [Python-Dev] test_email always supposed to be skipped? In-Reply-To: <15915.19468.533860.117395@gargle.gargle.HOWL> References: <15915.19468.533860.117395@gargle.gargle.HOWL> Message-ID: [Barry A. Warsaw] > > >>>>> "BC" == Brett Cannon writes: > > BC> Since the Makefile does not list email/test as a subdirectory > BC> to install I assume it is not supposed to be installed and all > BC> of this is on purpose. 
>
> I wouldn't make that assumption. I think email/test (and
> email/test/data and bsddb/test) should be installed in a from-source
> build. I'll make the fix to Makefile.pre.
>

I was checking the CVS web interface to see if I could peek at the
2.2-maint branch version and noticed you had already fixed it. =) Had a
patch ready to go, too. Must be slow today. =) Should this be backported
to 2.2-maint?

-Brett

From barry@python.org Mon Jan 20 01:50:20 2003
From: barry@python.org (Barry A. Warsaw)
Date: Sun, 19 Jan 2003 20:50:20 -0500
Subject: [Python-Dev] test_email always supposed to be skipped?
References: <15915.19468.533860.117395@gargle.gargle.HOWL>
Message-ID: <15915.21980.867296.135842@gargle.gargle.HOWL>

>>>>> "BC" == Brett Cannon writes:

    BC> I was checking the CVS web interface to see if I could peek at
    BC> the 2.2-maint branch version and noticed you had already fixed
    BC> it. =) Had a patch ready to go, too. Must be slow today. =)
    BC> Should this be backported to 2.2-maint?

The email/test and email/test/data fix should, but not bsddb/test.
I'll do that next.

-Barry

From fdrake@acm.org Mon Jan 20 01:54:38 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Sun, 19 Jan 2003 20:54:38 -0500
Subject: [Python-Dev] make install failing with current cvs
In-Reply-To: <3E2B2336.8020001@v.loewis.de>
References: <3E2B2336.8020001@v.loewis.de>
Message-ID: <15915.22238.238654.610860@grendel.zope.com>

"Martin v. Löwis" writes:
 > It is very difficult. site-packages is compiled as a side effect of
 > compiling the entire lib directory, by means of recursive traversal.
 > I don't think it is possible to turn of -tt at run time. It might be
 > possible to exclude it from compilation altogether.

Just not doing anything with site-packages makes more sense to me
anyway, so this seems preferable.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From martin@v.loewis.de Mon Jan 20 08:39:16 2003
From: martin@v.loewis.de (Martin v.
Löwis)
Date: 20 Jan 2003 09:39:16 +0100
Subject: [Python-Dev] make install failing with current cvs
In-Reply-To: <200301200131.h0K1VZw23710@pcp02138704pcs.reston01.va.comcast.net>
References: <3E2B2336.8020001@v.loewis.de> <200301200131.h0K1VZw23710@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

Guido van Rossum writes:

> Would this work?

I think not: During installation, compileall is invoked with
$(LIBDEST), so compile_path isn't used. What might work is to write

    -x "badsyntax|site-packages"

in Makefile.pre.in to skip compilation.

Regards,
Martin

From Jack.Jansen@cwi.nl Mon Jan 20 10:46:53 2003
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Mon, 20 Jan 2003 11:46:53 +0100
Subject: [Python-Dev] make install failing with current cvs
In-Reply-To:
Message-ID: <7D00E102-2C64-11D7-95BB-0030655234CE@cwi.nl>

On Monday, Jan 20, 2003, at 09:39 Europe/Amsterdam, Martin v. Löwis
wrote:

> Guido van Rossum writes:
>
>> Would this work?
>
> I think not: During installation, compileall is invoked with
> $(LIBDEST), so compile_path isn't used. What might work is to write
>
> -x "badsyntax|site-packages"
>
> in Makefile.pre.in to skip compilation.

this works fine. The only question remaining is: do we want to
compileall site-python with just -t, or do we want to skip it
altogether, as Fred suggested?

--
Jack Jansen, , http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman

From guido@python.org Mon Jan 20 13:03:50 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 20 Jan 2003 08:03:50 -0500
Subject: [Python-Dev] very slow compare of recursive objects
In-Reply-To: Your message of "Sun, 19 Jan 2003 20:09:14 EST."
References:
Message-ID: <200301201303.h0KD3pn25577@pcp02138704pcs.reston01.va.comcast.net>

> Speaking of graceful approaches, I expect that this code:
>
>     (v->ob_type->tp_as_mapping
>      || (v->ob_type->tp_as_sequence
>          && !PyString_Check(v)
>          && !PyTuple_Check(v)))) {
>
> no longer does what it intended to do for tuples. Tuples can't be
> recursive,

Oh yes they can be:

>>> L = []
>>> t = (L, L)
>>> L.append(L)
>>>

> so it intended to exempt tuples from the recursive-compare
> machinery. But Michael Hudson added a non-NULL tp_as_mapping slot to tuples
> in rev 2.65 of tupleobject.c, so tuples are no longer exempted by the
> recursive-compare gimmick. That makes the problem in 2.3 worse than it used
> to be (although it's never been zippy, due to clearing inprogress dict
> entries while they're still potentially useful). So all the hacks I added
> (in my ugly patch) to keep check_recursion's tuple operations out of this
> code would be better done by restoring the "don't look at tuples at all"
> intent of the code above.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.one@comcast.net Mon Jan 20 15:32:50 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 20 Jan 2003 10:32:50 -0500
Subject: [Python-Dev] very slow compare of recursive objects
In-Reply-To: <200301201303.h0KD3pn25577@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

[Tim]
>Tuples can't be recursive,

[Guido]
> Oh yes they can be:
>
> >>> L = []
> >>> t = (L, L)
> >>> L.append(L)
> >>>

That's an example of a tuple *containing* a recursive structure; there's
still no way, starting at t, to get back to t. I think mutation is required
for that. Like

    L = [None]
    t = (L, L)
    L[0] = t

Then

    t[0][0] is t

and that's what I mean by "recursive tuple" (a tuple that can be reached
from itself).

It's not clear that this matters to the recursion-checker, though. If it
indeed requires mutation to create a recursive tuple in the sense I mean it,
then nesting of tuples alone can't create one.
Stick a mutable container type into the mix, and the recursion-checker
will eventually find that. I expect we want to continue exempting deeply
nested tuples, since they come up in real life (e.g., Python "parse
trees" routinely exceed NESTING_LIMIT).

From barry@python.org Mon Jan 20 15:44:23 2003
From: barry@python.org (Barry A. Warsaw)
Date: Mon, 20 Jan 2003 10:44:23 -0500
Subject: [Python-Dev] very slow compare of recursive objects
References: <200301201303.h0KD3pn25577@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15916.6487.774280.701231@gargle.gargle.HOWL>

>>>>> "TP" == Tim Peters writes:

    TP> That's an example of a tuple *containing* a recursive
    TP> structure; there's still no way, starting at t, to get back to
    TP> t.

Except, of course, at the C level. Do we care about that?

-Barry

From neal@metaslash.com Mon Jan 20 15:50:17 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 20 Jan 2003 10:50:17 -0500
Subject: [Python-Dev] very slow compare of recursive objects
In-Reply-To:
References: <200301201303.h0KD3pn25577@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030120155017.GN28870@epoch.metaslash.com>

On Mon, Jan 20, 2003 at 10:32:50AM -0500, Tim Peters wrote:
> [Tim]
> >Tuples can't be recursive,
>
> [Guido]
> > Oh yes they can be:
> >
> > >>> L = []
> > >>> t = (L, L)
> > >>> L.append(L)
> > >>>
>
> That's an example of a tuple *containing* a recursive structure; there's
> still no way, starting at t, to get back to t. I think mutation is required
> for that. Like
>
> L = [None]
> t = (L, L)
> L[0] = t
>
> Then
>
> t[0][0] is t
>
> and that's what I mean by "recursive tuple" (a tuple that can be reached
> from itself).

What about:

>>> t = ([],)
>>> t[0].append(t)
>>> print t
([([...],)],)
>>> print t[0][0] is t
True

Not sure if that's a tuple containing a recursive structure or a
recursive tuple.
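
The distinction being argued over in this thread (a tuple that merely contains a cycle versus a tuple that can be reached from itself) can be checked mechanically. The following is only a minimal sketch written for a current Python, not code from the interpreter: the helper name `reaches` and the variable names `M` and `u` are made up for illustration, and it only follows lists and tuples.

```python
# Tim's construction: the tuple is created first, then the list it holds
# is mutated so that it points back at the tuple.
L = [None]
t = (L, L)
L[0] = t

# Guido's construction: the list contains itself, but nothing inside it
# points back at the tuple.
M = []
u = (M, M)
M.append(M)


def reaches(start, target, seen=None):
    """Return True if target is reachable by following items of start.

    Hypothetical helper for illustration; it tracks visited containers by
    id() so that it terminates on cyclic structures.
    """
    if seen is None:
        seen = set()
    if id(start) in seen:
        return False
    seen.add(id(start))
    if isinstance(start, (list, tuple)):
        for item in start:
            if item is target or reaches(item, target, seen):
                return True
    return False


print(reaches(t, t))   # True: t gets back to itself through the list
print(reaches(u, u))   # False: u contains a cycle but is not part of it
```

In both cases the cycle runs through a list; the sketch does not try to build one out of tuples alone.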
Neal

From jepler@unpythonic.net Mon Jan 20 15:53:15 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Mon, 20 Jan 2003 09:53:15 -0600
Subject: [Python-Dev] very slow compare of recursive objects
In-Reply-To:
References: <200301201303.h0KD3pn25577@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030120155314.GC7736@unpythonic.net>

On Mon, Jan 20, 2003 at 10:32:50AM -0500, Tim Peters wrote:
> [Tim]
> >Tuples can't be recursive,
>
> [Guido]
> > Oh yes they can be:
> >
> > >>> L = []
> > >>> t = (L, L)
> > >>> L.append(L)
> > >>>
>
> That's an example of a tuple *containing* a recursive structure; there's
> still no way, starting at t, to get back to t. I think mutation is required
> for that. Like
>
> L = [None]
> t = (L, L)
> L[0] = t
>
> Then
>
> t[0][0] is t
>
> and that's what I mean by "recursive tuple" (a tuple that can be reached
> from itself).

Is this an example of the elusive "recursive tuple"?

class X(tuple):
    def __getitem__(self, idx):
        if idx == 0: return self
        return tuple.__getitem__(self, idx)

    def __len__(self):
        return min(1, tuple.__len__(self))

>>> x = X([None])
>>> print len(x), x[0] is x, x==x
1 True True
>>> print x
(None,)

I'm also a bit confused by the last line. I guess the builtin
tuple.__repr__ uses the C-level API to access tuple items, not the real
__getitem__ slot? pprint shows the same thing, though.

Jeff

From tim.one@comcast.net Mon Jan 20 15:57:06 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 20 Jan 2003 10:57:06 -0500
Subject: [Python-Dev] very slow compare of recursive objects
In-Reply-To: <15916.6487.774280.701231@gargle.gargle.HOWL>
Message-ID:

[Tim]
> That's an example of a tuple *containing* a recursive
> structure; there's still no way, starting at t, to get back to t.

[Barry]
> Except, of course, at the C level.

I'm unclear on your meaning. In the specific example, you couldn't get
from t to t period.

> Do we care about that?

It may depend on what you really meant .
If you're talking about creating recursive tuples via mutation in C, where tuples are the only container type involved, then 2.2.2 and 2.1.3 are broken now (they may blow the stack while comparing such beasts). From tim.one@comcast.net Mon Jan 20 15:59:30 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 20 Jan 2003 10:59:30 -0500 Subject: [Python-Dev] very slow compare of recursive objects In-Reply-To: <20030120155017.GN28870@epoch.metaslash.com> Message-ID: [Neal Norwitz] > What about: > > >>> t = ([],) > >>> t[0].append(t) > >>> print t > ([([...],)],) > >>> print t[0][0] is t > True > > Not sure if that's a tuple containing a recursive structure or a > recursive tuple. It's both, as was my example too. Note that it required mutation of a mutable container object to create it. From mwh@python.net Mon Jan 20 16:11:09 2003 From: mwh@python.net (Michael Hudson) Date: 20 Jan 2003 16:11:09 +0000 Subject: [Python-Dev] very slow compare of recursive objects In-Reply-To: Tim Peters's message of "Mon, 20 Jan 2003 10:57:06 -0500" References: Message-ID: <2my95fsswy.fsf@starship.python.net> Tim Peters writes: > It may depend on what you really meant . If you're talking about > creating recursive tuples via mutation in C, where tuples are the only > container type involved, then 2.2.2 and 2.1.3 are broken now (they may blow > the stack while comparing such beasts). That's what I thought Barry meant, and if so I really don't think we care. It's not like we can prevent all misbehaving third party C code from segfaulting... Cheers, M. -- It could be argued that since Suitespot is infinitely terrible, that anything else, by very definition of being anything else, is infinitely superior. 
-- ".", alt.sysadmin.recovery From tim.one@comcast.net Mon Jan 20 16:17:23 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 20 Jan 2003 11:17:23 -0500 Subject: [Python-Dev] very slow compare of recursive objects In-Reply-To: <20030120155314.GC7736@unpythonic.net> Message-ID: [Jeff Epler] > Is this an example of the elusive "recursive tuple"? It's an example of a tuple subclass; we already have examples of recursive tuples (without subclassing). > class X(tuple): > def __getitem__(self, idx): > if idx == 0: return self > return tuple.__getitem__(self, idx) > > def __len__(self): > return min(1, tuple.__len__(self)) > > >>> x = X([None]) > >>> print len(x), x[0] is x, x==x > 1 True True > >>> print x > (None,) > > I'm also a bit confused by the last line. I guess the builtin > tuple.__repr__ uses the C-level API to access tuple items, not the real > __getitem__ slot? pprint shows the same thing, though. It's off-topic for me. You have the source code . The key to both is likely what happens when doing for y in x: print y From barry@python.org Mon Jan 20 16:26:10 2003 From: barry@python.org (Barry A. Warsaw) Date: Mon, 20 Jan 2003 11:26:10 -0500 Subject: [Python-Dev] very slow compare of recursive objects References: <15916.6487.774280.701231@gargle.gargle.HOWL> Message-ID: <15916.8994.197885.427421@gargle.gargle.HOWL> >>>>> "TP" == Tim Peters writes: TP> If you're talking about creating recursive tuples via mutation TP> in C, where tuples are the only container type involved, then TP> 2.2.2 and 2.1.3 are broken now (they may blow the stack while TP> comparing such beasts). Yep, that's all I meant. I guess if we didn't care for 2.2.2 and 2.1.3, we probably still don't care. -Barry From martin@v.loewis.de Mon Jan 20 15:53:51 2003 From: martin@v.loewis.de (Martin v. 
Löwis)
Date: 20 Jan 2003 16:53:51 +0100
Subject: [Python-Dev] very slow compare of recursive objects
In-Reply-To:
References:
Message-ID:

Tim Peters writes:

> > Oh yes they can be:
> >
> > >>> L = []
> > >>> t = (L, L)
> > >>> L.append(L)
> > >>>
>
> That's an example of a tuple *containing* a recursive structure; there's
> still no way, starting at t, to get back to t. I think mutation is required
> for that. Like
>
> L = [None]
> t = (L, L)
> L[0] = t

Or, using Guido's example:

>>> L=[]
>>> t=(L,)
>>> L.append(t)
>>> t[0][0] is t
True

> It's not clear that this matters to the recursion-checker, though.

It doesn't: You can't create a tuple that contains itself, except on
the C level (which I would declare a bug in the C code that does so).
So any cycle involving a tuple must involve a non-tuple object as
well.

Regards,
Martin

From jepler@unpythonic.net Mon Jan 20 16:49:37 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Mon, 20 Jan 2003 10:49:37 -0600
Subject: [Python-Dev] very slow compare of recursive objects
In-Reply-To:
References:
Message-ID: <20030120164936.GE7736@unpythonic.net>

On Mon, Jan 20, 2003 at 04:53:51PM +0100, Martin v. Löwis wrote:
> It doesn't: You can't create a tuple that contains itself, except on
> the C level (which I would declare a bug in the C code that does so).
> So any cycle involving a tuple must involve a non-tuple object as
> well.

But a tuple subclass can. Does the code use PyTuple_Check or
PyTuple_CheckExact?

Jeff

From tim.one@comcast.net Mon Jan 20 17:03:26 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 20 Jan 2003 12:03:26 -0500
Subject: [Python-Dev] very slow compare of recursive objects
In-Reply-To: <20030120164936.GE7736@unpythonic.net>
Message-ID:

[Martin v. Löwis]
> It doesn't: You can't create a tuple that contains itself, except on
> the C level (which I would declare a bug in the C code that does so).
> So any cycle involving a tuple must involve a non-tuple object as
> well.

[Jeff Epler]
> But a tuple subclass can. Does the code use PyTuple_Check or
> PyTuple_CheckExact?

It did use PyTuple_Check, and still does in 2.2. Likewise PyString_Check.
I already changed 2.3 to use the XYZ_CheckExact() spellings instead.
Leaving repair of 2.2 to someone who cares more than I do .

WRT speed, this isn't going to get faster unless something akin to my
original patch (posted here) is used to keep on using the inprogress dict
even when compare_nesting falls below NESTING_LIMIT again. The patch I
posted here wasn't safe, because it can't know that the saved-away addresses
still refer to the same objects. That could be repaired by changing the
inprogress dict to map "tokens" to comparand pairs (instead of to
themselves). The references to the comparand objects in the dict values
would ensure the objects stayed alive until the dict entry was cleared.
I'm out of time for this, so if someone cares enough, that's what to do.

From vinay_sajip@red-dove.com Mon Jan 20 20:07:29 2003
From: vinay_sajip@red-dove.com (Vinay Sajip)
Date: Mon, 20 Jan 2003 20:07:29 -0000
Subject: [Python-Dev] Re: logging package -- rename warn to warning?
References: <200301200117.h0K1H5V23597@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <005c01c2c0bf$90b40a80$652b6992@alpha>

> > I would characterize "fatal" as misleading, and "critical" as
> > accurate. "Fatal" implies "death", as in process termination, or
> > sys.exit(), or at least an exception being raised. It's an
> > important distinction, well worth the extra 3 letters.
>
> I expected this response, hence my (joking) use of "political
> correctness". I don't think the subtlety is worth using a less common
> word.

How about severe/SEVERE - a fairly common word, and only one letter more
to type than fatal?

Vinay

From goodger@python.org Mon Jan 20 20:45:50 2003
From: goodger@python.org (David Goodger)
Date: Mon, 20 Jan 2003 15:45:50 -0500
Subject: [Python-Dev] Re: logging package -- rename warn to warning?
In-Reply-To: <005c01c2c0bf$90b40a80$652b6992@alpha> Message-ID: Vinay Sajip wrote: > How about severe/SEVERE - a fairly common word, and only one letter > more to type than fatal? +1 Great idea! But I'm biased: it was my suggestion long ago, and what we currently use in Docutils. -- David Goodger http://starship.python.net/~goodger Programmer/sysadmin for hire: http://starship.python.net/~goodger/cv From tim.one@comcast.net Mon Jan 20 21:50:14 2003 From: tim.one@comcast.net (Tim Peters) Date: Mon, 20 Jan 2003 16:50:14 -0500 Subject: [Python-Dev] Re: very slow compare of recursive objects Message-ID: The OP added a cute example to http://python.org/sf/625698 It's a recursive object that should compare equal to itself if and only if it does not compare equal to itself -- the Russell's Paradox of Python comparisons. As things stand, whether a==a (for this object) returns True or False depends on the parity (odd or even) of object.c's NESTING_LIMIT. I vote "give up". The graph isomorphism business is cute but has no practical application I've ever seen. Even without the paradoxes, the 2-element list example takes about 4 seconds to return True now, and I estimate a 3-element list would take 3-4 hours, and a 4-element list about 48 days. Python isn't checking for KeyboardInterrupt during this, so killing the process is the only way to stop it. I'd rather raise a "can't compare recursive objects" exception as soon as recursion is detected. From neal@metaslash.com Mon Jan 20 22:32:04 2003 From: neal@metaslash.com (Neal Norwitz) Date: Mon, 20 Jan 2003 17:32:04 -0500 Subject: [Python-Dev] disable writing .py[co] Message-ID: <20030120223204.GP28870@epoch.metaslash.com> In http://python.org/sf/602345 (option for not writing .py[co] files) Martin suggested that in addition to a flag (-R, as in read-only), perhaps there should be an env't variable and/or a sys attribute which controls whether .py[co] files are written or not. Opinions? 
If you like the idea of an env't variable or sys attribute, do you have suggestions for names? PYTHONREADONLY and sys.readonly_import? Neal From guido@python.org Mon Jan 20 22:42:29 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 20 Jan 2003 17:42:29 -0500 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: Your message of "Mon, 20 Jan 2003 17:32:04 EST." <20030120223204.GP28870@epoch.metaslash.com> References: <20030120223204.GP28870@epoch.metaslash.com> Message-ID: <200301202242.h0KMgT014476@odiug.zope.com> > In http://python.org/sf/602345 (option for not writing .py[co] files) > Martin suggested that in addition to a flag (-R, as in read-only), > perhaps there should be an env't variable and/or a sys attribute > which controls whether .py[co] files are written or not. > > Opinions? > > If you like the idea of an env't variable or sys attribute, do you > have suggestions for names? PYTHONREADONLY and sys.readonly_import? Traditionally, this has been the area of environment variables (tested at the time when main() parses its arguments) and command line options. Do you really need to be able to control this dynamically during program execution? --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Mon Jan 20 22:40:24 2003 From: barry@python.org (Barry A. Warsaw) Date: Mon, 20 Jan 2003 17:40:24 -0500 Subject: [Python-Dev] disable writing .py[co] References: <20030120223204.GP28870@epoch.metaslash.com> Message-ID: <15916.31448.56084.872729@gargle.gargle.HOWL> >>>>> "NN" == Neal Norwitz writes: NN> In http://python.org/sf/602345 (option for not writing .py[co] NN> files) Martin suggested that in addition to a flag (-R, as in NN> read-only), NN> perhaps there should be an env't variable -1. Do we really need more of these? NN> and/or a sys attribute +0 No opinions on the sys attr name. 
-Barry

From martin@v.loewis.de Mon Jan 20 22:57:47 2003
From: martin@v.loewis.de ("Martin v. Löwis")
Date: Mon, 20 Jan 2003 23:57:47 +0100
Subject: [Python-Dev] disable writing .py[co]
In-Reply-To: <200301202242.h0KMgT014476@odiug.zope.com>
References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com>
Message-ID: <3E2C7EEB.3020209@v.loewis.de>

Guido van Rossum wrote:

> Traditionally, this has been the area of environment variables (tested
> at the time when main() parses its arguments) and command line
> options. Do you really need to be able to control this dynamically
> during program execution?

No need is real. The specific requirement comes from

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=96111

where a user complains that mailman writes into /usr (writing actually
failed, but a Secure Linux kernel detected the problem, and the user
wants to silence the warning, implementing some policy).

It turns out that this is Python trying to write .pyc files. It would
be desirable to turn pyc generation completely off for mailman. This
could be done best through actually modifying the mailman source code.
Setting an environment variable is less convenient, as you then have to
find all places where mailman scripts are invoked, or have to wrap all
mailman scripts.

Of course, if people think YAGNI, this could be left out in 2.3 and
only be added in 2.4 if there is high demand.
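
The in-application switch described above can be sketched as follows. This is not something the 2.3 interpreter under discussion provides: the sketch assumes a later Python (2.6 or newer), where the switch is spelled `sys.dont_write_bytecode` rather than the `sys.readonly_import` name floated in this thread, and it would have to run before the application's own modules are imported.

```python
import sys

# Assumption: Python 2.6 or later. The attribute does not exist in the
# 2.2/2.3 interpreters discussed in this thread.
sys.dont_write_bytecode = True

# Modules imported from source after this point are still compiled in
# memory, but no .pyc/.pyo file is written next to them.
```

On those later versions the same effect is available from outside the program through the `-B` command-line option or the `PYTHONDONTWRITEBYTECODE` environment variable, which corresponds to the flag and environment-variable options weighed in this thread.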

Regards,
Martin

From neal@metaslash.com Mon Jan 20 22:57:33 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 20 Jan 2003 17:57:33 -0500
Subject: [Python-Dev] disable writing .py[co]
In-Reply-To: <200301202242.h0KMgT014476@odiug.zope.com>
References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com>
Message-ID: <20030120225733.GR28870@epoch.metaslash.com>

On Mon, Jan 20, 2003 at 05:42:29PM -0500, Guido van Rossum wrote:
> > In http://python.org/sf/602345 (option for not writing .py[co] files)
> > Martin suggested that in addition to a flag (-R, as in read-only),
> > perhaps there should be an env't variable and/or a sys attribute
> > which controls whether .py[co] files are written or not.
> >
> > Opinions?
> >
> > If you like the idea of an env't variable or sys attribute, do you
> > have suggestions for names? PYTHONREADONLY and sys.readonly_import?
>
> Traditionally, this has been the area of environment variables (tested
> at the time when main() parses its arguments) and command line
> options. Do you really need to be able to control this dynamically
> during program execution?

I don't need the feature at all. :-) It was not requested in the
original bug report. Paul Dubois, who also wanted this feature, didn't
indicate it was desirable. So unless someone speaks up, it's YAGNI.

I've updated the man page, and will add an entry to NEWS. Is there any
other doc that should be added?

Neal

From barry@python.org Mon Jan 20 23:09:25 2003
From: barry@python.org (Barry A. Warsaw)
Date: Mon, 20 Jan 2003 18:09:25 -0500
Subject: [Python-Dev] disable writing .py[co]
References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com> <3E2C7EEB.3020209@v.loewis.de>
Message-ID: <15916.33189.872024.702720@gargle.gargle.HOWL>

>>>>> "MvL" == Martin v Löwis writes:

    MvL> No need is real.
The specific requirement comes from

    MvL> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=96111

    MvL> where a user complains that mailman writes into /usr (writing
    MvL> actually failed, but a Secure Linux kernel detected the
    MvL> problem, and the user wants to silence the warning,
    MvL> implementing some policy).

    MvL> It turns out that this is Python trying to write .pyc
    MvL> files. It would be desirable to turn pyc generation
    MvL> completely off for mailman. This could be done best through
    MvL> actually modifying the mailman source code. Setting an
    MvL> environment variable is less convenient, as you then have to
    MvL> find all places where mailman scripts are invoked, or have to
    MvL> wrap all mailman scripts.

That particular bug is because the debian package isn't/wasn't
complete. In a from-source installation of Mailman, we do indeed run
compileall.py on all the Python files. From skimming the debian bug
report, it looks like they fixed their post-install script to do the
same.

-Barry

From skip@pobox.com Tue Jan 21 12:56:56 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 21 Jan 2003 06:56:56 -0600
Subject: [Python-Dev] disable writing .py[co]
Message-ID: <15917.17304.915965.539931@montanaro.dyndns.org>

Here's a slightly different alternative. (It woke me up this morning, so I
know it's a good idea. ;-)

Instead of an environment variable which functions simply as an on/off
switch, add an environment variable named PYCROOT which can be used to
control writing of .pyc files as follows:

    * if not present, status quo

    * if present and refers to an existing directory, prepend PYCROOT to any
      attempts to read or write .pyc files.

    * if present and refers to a non-existent or non-directory file, disable
      writing .pyc files altogether.

All that happens is that when you go to read or write a .pyc file is that
you prepend PYCROOT to the full path to the .py source file in addition to
adding a 'c' to the end.
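
The three PYCROOT rules above can be written down as a small path-mapping function. This is only a sketch of the proposal as described here, not code from the import machinery: `pyc_path` and `demo_root` are hypothetical names, and dropping the drive letter is just one arbitrary choice for multi-root systems such as Windows.

```python
import os
import tempfile


def pyc_path(source_path, pycroot=None):
    """Return where the .pyc for source_path would live, or None if disabled.

    Hypothetical helper; PYCROOT is only a proposal in this thread.
    """
    default = source_path + 'c'
    if pycroot is None:
        # Variable not set: status quo, the .pyc sits next to the .py file.
        return default
    if not os.path.isdir(pycroot):
        # Set, but not an existing directory: no .pyc files at all.
        return None
    # Set to an existing directory: prepend it to the full source path.
    # The drive letter (if any) is dropped, which is an assumption.
    drive, tail = os.path.splitdrive(os.path.abspath(default))
    return os.path.join(pycroot, tail.lstrip(os.sep))


# Use a fresh temporary directory as a stand-in for PYCROOT.
demo_root = tempfile.mkdtemp()
print(pyc_path('/usr/lib/python2.3/socket.py'))
print(pyc_path('/usr/lib/python2.3/socket.py', demo_root))
print(pyc_path('/usr/lib/python2.3/socket.py', os.path.join(demo_root, 'missing')))
```

A real implementation would also have to create the intermediate directories under the root before writing, which the sketch leaves out.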

Pros:

    * it allows people to run with .py files on read-only filesystems but
      still gain the benefits of using .pyc files

    * on systems with ram-based file systems (such as /tmp on Solaris), you
      can gain a performance boost.

    * it's easy to suppress writing .pyc files altogether.

Cons:

    * it's not obvious (to me) what the semantics should be on multi-root
      systems like Windows (I can see a couple alternatives).

    * slightly more complexity added to import.

Skip

From martin@v.loewis.de Tue Jan 21 13:08:15 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 21 Jan 2003 14:08:15 +0100
Subject: [Python-Dev] disable writing .py[co]
In-Reply-To: <15917.17304.915965.539931@montanaro.dyndns.org>
References: <15917.17304.915965.539931@montanaro.dyndns.org>
Message-ID:

Skip Montanaro writes:

> Cons:
>
>     * it's not obvious (to me) what the semantics should be on multi-root
>       systems like Windows (I can see a couple alternatives).
>
>     * slightly more complexity added to import.

      * Doesn't solve the original problem: many processors writing to
        the same file system (unless you manage to set an environment
        variable differently on each node).

Regards,
Martin

From mwh@python.net Tue Jan 21 13:12:27 2003
From: mwh@python.net (Michael Hudson)
Date: 21 Jan 2003 13:12:27 +0000
Subject: [Python-Dev] disable writing .py[co]
In-Reply-To: Skip Montanaro's message of "Tue, 21 Jan 2003 06:56:56 -0600"
References: <15917.17304.915965.539931@montanaro.dyndns.org>
Message-ID: <2mof6asl38.fsf@starship.python.net>

Skip Montanaro writes:

> Here's a slightly different alternative. (It woke me up this morning, so I
> know it's a good idea. ;-)
>
> Instead of an environment variable which functions simply as an on/off
> switch, add an environment variable named PYCROOT which can be used to
> control writing of .pyc files as follows:
>
> * if not present, status quo
>
> * if present and refers to an existing directory, prepend PYCROOT to any
> attempts to read or write .pyc files.
The idea of writing .pycs to a world writable area (say /tmp) on a multi-user system sounds like a Bad Thing. Cheers, M. -- The rapid establishment of social ties, even of a fleeting nature, advance not only that goal but its standing in the uberconscious mesh of communal psychic, subjective, and algorithmic interbeing. But I fear I'm restating the obvious. -- Will Ware, comp.lang.python From skip@pobox.com Tue Jan 21 13:47:14 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 21 Jan 2003 07:47:14 -0600 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: <2mof6asl38.fsf@starship.python.net> References: <15917.17304.915965.539931@montanaro.dyndns.org> <2mof6asl38.fsf@starship.python.net> Message-ID: <15917.20322.692401.123150@montanaro.dyndns.org> Michael> The idea of writing .pycs to a world writable area (say /tmp) Michael> on a multi-user system sounds like a Bad Thing. As I mentioned in my original note, you'd prepend PYCROOT to the .py file and append 'c' to create a filename for the .pyc file. If socket.py was found in /usr/lib/python2.3/socket.py and PYCROOT was set to /tmp, you'd try to read from/write to /tmp/usr/lib/python2.3/socket.pyc. The only requirement on PYCROOT would be that /tmp would have to exist. The user wouldn't be responsible for creating the full directory tree underneath /tmp. Skip From pyth@devel.trillke.net Tue Jan 21 13:47:41 2003 From: pyth@devel.trillke.net (holger krekel) Date: Tue, 21 Jan 2003 14:47:41 +0100 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: <2mof6asl38.fsf@starship.python.net>; from mwh@python.net on Tue, Jan 21, 2003 at 01:12:27PM +0000 References: <15917.17304.915965.539931@montanaro.dyndns.org> <2mof6asl38.fsf@starship.python.net> Message-ID: <20030121144741.F2724@prim.han.de> Michael Hudson wrote: > Skip Montanaro writes: > > > Here's a slightly different alternative. (It woke me up this morning, so I > > know it's a good idea. 
;-) > > > > Instead of an environment variable which functions simply as an on/off > > switch, add an environment variable named PYCROOT which can be used to > > control writing of .pyc files as follows: > > > > * if not present, status quo > > > > * if present and refers to an existing directory, prepend PYCROOT to any > > attempts to read or write .pyc files. > > The idea of writing .pycs to a world writable area (say /tmp) on a > multi-user system sounds like a Bad Thing. Then don't do it :-) It's very well possible to set an environment variable for each user and point it to a per-user area (with proper permissions and all). If adding an environment variable at all then i think Skip's idea has some virtues. Actually i'd prefer to have a runtime call to control writing and/or usage of .pyc files. holger From mwh@python.net Tue Jan 21 13:54:07 2003 From: mwh@python.net (Michael Hudson) Date: 21 Jan 2003 13:54:07 +0000 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: Skip Montanaro's message of "Tue, 21 Jan 2003 07:47:14 -0600" References: <15917.17304.915965.539931@montanaro.dyndns.org> <2mof6asl38.fsf@starship.python.net> <15917.20322.692401.123150@montanaro.dyndns.org> Message-ID: <2mlm1esj5s.fsf@starship.python.net> Skip Montanaro writes: > Michael> The idea of writing .pycs to a world writable area (say /tmp) > Michael> on a multi-user system sounds like a Bad Thing. > > As I mentioned in my original note, you'd prepend PYCROOT to the .py file > and append 'c' to create a filename for the .pyc file. If socket.py was > found in /usr/lib/python2.3/socket.py and PYCROOT was set to /tmp, you'd try > to read from/write to /tmp/usr/lib/python2.3/socket.pyc. The only > requirement on PYCROOT would be that /tmp would have to exist. The user > wouldn't be responsible for creating the full directory tree underneath > /tmp. Nooo, it was the security implications that bothered me. 
I'm still somewhat confused by the need for this change beyond coping with dubious installations that shouldn't be our problem. Cheers, M. -- surely, somewhere, somehow, in the history of computing, at least one manual has been written that you could at least remotely attempt to consider possibly glancing at. -- Adam Rixey From barry@python.org Tue Jan 21 14:18:40 2003 From: barry@python.org (Barry A. Warsaw) Date: Tue, 21 Jan 2003 09:18:40 -0500 Subject: [Python-Dev] disable writing .py[co] References: <15917.17304.915965.539931@montanaro.dyndns.org> Message-ID: <15917.22208.679609.313196@gargle.gargle.HOWL> >>>>> "SM" == Skip Montanaro writes: SM> Instead of an environment variable which functions simply as SM> an on/off switch, add an environment variable named PYCROOT SM> which can be used to control writing of .pyc files as follows: I'm unsure about the environment variable, but given some agreed upon way(s) to control this, and leaving it up to the applications/users to decide the right place in the fs to write the .pyc's, there is merit in this idea. E.g. Mailman can already be configured to install its read-only parts in one place and its variable data in another. IWBNI Mailman could arrange to write its pycs in the var directory. -Barry From skip@pobox.com Tue Jan 21 14:22:41 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 21 Jan 2003 08:22:41 -0600 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: <20030121144741.F2724@prim.han.de> References: <15917.17304.915965.539931@montanaro.dyndns.org> <2mof6asl38.fsf@starship.python.net> <20030121144741.F2724@prim.han.de> Message-ID: <15917.22449.558990.200127@montanaro.dyndns.org> holger> Actually i'd prefer to have a runtime call to control writing holger> and/or usage of .pyc files. Initialize sys.pycroot to os.environ["PYCROOT"]? 
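A minimal sketch of the path mapping discussed in this thread, assuming the semantics quoted so far (PYCROOT absent: status quo; PYCROOT naming an existing directory: prepend it to the full source path and append 'c'). The helper name pyc_path is hypothetical and is not part of any import machinery, POSIX-style paths are assumed, and returning None when PYCROOT is set but is not a directory is an assumption, since the quoted text does not cover that case:

```python
import os
import posixpath


def pyc_path(source_path, pycroot=None):
    """Map a .py path to its .pyc path under the proposed PYCROOT scheme.

    Hypothetical helper for illustration only; POSIX-style paths assumed.
    """
    if pycroot is None:
        # PYCROOT not present: status quo, the .pyc sits next to the .py.
        return source_path + "c"
    if not os.path.isdir(pycroot):
        # Assumption: this case is not spelled out in the thread.
        return None
    # Prepend PYCROOT to the full source path and append 'c'.
    return posixpath.join(pycroot, source_path.lstrip("/")) + "c"


# Runtime initialisation along the lines of the sys.pycroot suggestion;
# .get() avoids a KeyError when the variable is unset.
pycroot = os.environ.get("PYCROOT")
print(pyc_path("/usr/lib/python2.3/socket.py", pycroot))
```

With PYCROOT set to /tmp (and /tmp existing) this yields /tmp/usr/lib/python2.3/socket.pyc, the same result as the socket.py example given earlier in the thread; with PYCROOT unset it yields the usual /usr/lib/python2.3/socket.pyc.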
S From skip@pobox.com Tue Jan 21 14:25:05 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 21 Jan 2003 08:25:05 -0600 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src Makefile.pre.in,1.111,1.112 In-Reply-To: References: Message-ID: <15917.22593.268349.922944@montanaro.dyndns.org> jack> Modified Files: jack> Makefile.pre.in jack> Log Message: jack> Compile site-packages with -t, not -tt. ... I notice that a compileall.py run is still made with the -O flag. Is this really necessary with the demise of SET_LINENO? If a second compileall.py run is going to be made, it seems like it ought to be with -OO (delete docstrings) to get the full effect. Skip From theller@python.net Tue Jan 21 14:36:41 2003 From: theller@python.net (Thomas Heller) Date: 21 Jan 2003 15:36:41 +0100 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: <15917.17304.915965.539931@montanaro.dyndns.org> References: <15917.17304.915965.539931@montanaro.dyndns.org> Message-ID: <3cnmlgcm.fsf@python.net> Skip Montanaro writes: > All that happens is that when you go to read or write a .pyc file is that > you prepend PYCROOT to the full path to the .py source file in addition to > adding a 'c' to the end. Wouldn't that place all the (incompatible) pyc files in the same directory? Most of the time I have several Python versions installed... > Cons: > > * it's not obvious (to me) what the semantics should be on multi-root > systems like Windows (I can see a couple alternatives). I cannot understand this sentence. What do you mean? 
Thomas From pyth@devel.trillke.net Tue Jan 21 14:47:02 2003 From: pyth@devel.trillke.net (holger krekel) Date: Tue, 21 Jan 2003 15:47:02 +0100 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: <15917.22449.558990.200127@montanaro.dyndns.org>; from skip@pobox.com on Tue, Jan 21, 2003 at 08:22:41AM -0600 References: <15917.17304.915965.539931@montanaro.dyndns.org> <2mof6asl38.fsf@starship.python.net> <20030121144741.F2724@prim.han.de> <15917.22449.558990.200127@montanaro.dyndns.org> Message-ID: <20030121154702.G2724@prim.han.de> Skip Montanaro wrote: > > holger> Actually i'd prefer to have a runtime call to control writing > holger> and/or usage of .pyc files. > > Initialize sys.pycroot to os.environ["PYCROOT"]? I'd like to hear e.g. Just's opinion on it. Doesn't this interfere with the (new) import mechanisms? I would think that a runtime interface should be a bit more high-level than setting a string with implicit meaning. holger From guido@python.org Tue Jan 21 14:48:57 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 21 Jan 2003 09:48:57 -0500 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: Your message of "21 Jan 2003 13:12:27 GMT." <2mof6asl38.fsf@starship.python.net> References: <15917.17304.915965.539931@montanaro.dyndns.org> <2mof6asl38.fsf@starship.python.net> Message-ID: <200301211448.h0LEmvh15446@odiug.zope.com> > Skip Montanaro writes: > > > Here's a slightly different alternative. (It woke me up this morning, so I > > know it's a good idea. ;-) > > > > Instead of an environment variable which functions simply as an on/off > > switch, add an environment variable named PYCROOT which can be used to > > control writing of .pyc files as follows: > > > > * if not present, status quo > > > > * if present and refers to an existing directory, prepend PYCROOT to any > > attempts to read or write .pyc files. > > The idea of writing .pycs to a world writable area (say /tmp) on a > multi-user system sounds like a Bad Thing.
Years ago, Bill Janssen at Xerox had a use case for this: they apparently have a setup where the normal way of accessing files is a shared replicated read-only virtual filesystem, but you can prefix something to a path that accesses a writable server (if you have permissions of course). --Guido van Rossum (home page: http://www.python.org/~guido/) From skip@pobox.com Tue Jan 21 14:53:10 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 21 Jan 2003 08:53:10 -0600 Subject: [Python-Dev] disable writing .py[co] Message-ID: <15917.24278.664609.769232@montanaro.dyndns.org> Martin> * Doesn't solve the original problem: many processors writing to Martin> the same file system (unless you manage to set an environment Martin> variable differently on each node). Sure: export PYCROOT=/tmp/`hostname --fqdn` Skip From skip@pobox.com Tue Jan 21 15:01:25 2003 From: skip@pobox.com (Skip Montanaro) Date: Tue, 21 Jan 2003 09:01:25 -0600 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: <3cnmlgcm.fsf@python.net> References: <15917.17304.915965.539931@montanaro.dyndns.org> <3cnmlgcm.fsf@python.net> Message-ID: <15917.24773.469256.806565@montanaro.dyndns.org> >> All that happens is that when you go to read or write a .pyc file is >> that you prepend PYCROOT to the full path to the .py source file in >> addition to adding a 'c' to the end. Thomas> Wouldn't that place all the (incompatible) pyc files in the same Thomas> directory? Nope. If PYCROOT was set to /tmp and found the socket module in /usr/lib/python2.3/socket.py, the corresponding .pyc file would be /tmp/usr/lib/python2.3/socket.pyc. >> * it's not obvious (to me) what the semantics should be on multi-root >> systems like Windows (I can see a couple alternatives). Thomas> I cannot understand this sentence. What do you mean? On Windows, the current working directory exists on each drive. If I set PYCROOT to C:\TEMP and locate socket.py in D:\PYTHON23\socket.py, what should the full path to the .pyc file be? 
What if I set it to simply \TEMP (omitting a drive letter)? I won't elaborate all the possibilities, because I will probably forget some reasonable options, however, maybe the most straightforward approach would be to do like Cygwin does. Force PYCROOT to refer to a single directory (tie down the drive letter, even if omitted) and treat the drive letter in module file paths as a directory component. Given PYCROOT of C:\TEMP and socket.py on D: as above, the .pyc file might reasonably be C:\TEMP\D\PYTHON23\socket.pyc. Skip From ishimoto@gembook.org Tue Jan 21 16:41:34 2003 From: ishimoto@gembook.org (Atsuo Ishimoto) Date: Wed, 22 Jan 2003 01:41:34 +0900 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E26D841.2020805@lemburg.com> References: <15910.50700.726791.488944@gargle.gargle.HOWL> <3E26D841.2020805@lemburg.com> Message-ID: <20030122012900.BC17.ISHIMOTO@gembook.org> On Thu, 16 Jan 2003 17:05:21 +0100 "M.-A. Lemburg" wrote: > > I'm not biased in any direction here. Again, I'd love to see the > two sets be merged into one, e.g. take the Python ones from Hisao > and use the C ones from Tamito if they are installed instead. > Here again, I entreat to add Tamito's C version as standard Japanese codec. It's fast, is proven quite stable and correct. If people need to customize mapping table(this may happen in some cases, but not common task. I believe almost 100% of Japanese programmers never had wrote such a special mapping table), and if they think Tamito's codec is too difficult to customize(while I don't think so), and only if they are satisfied with performance of codec written in Python, they will download and install the Python version of codec. -------------------------- Atsuo Ishimoto ishimoto@gembook.org Homepage:http://www.gembook.jp From guido@python.org Tue Jan 21 19:21:43 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 21 Jan 2003 14:21:43 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? 
In-Reply-To: Your message of "Mon, 20 Jan 2003 20:07:29 GMT." <005c01c2c0bf$90b40a80$652b6992@alpha> References: <200301200117.h0K1H5V23597@pcp02138704pcs.reston01.va.comcast.net> <005c01c2c0bf$90b40a80$652b6992@alpha> Message-ID: <200301211921.h0LJLhA17800@odiug.zope.com> > How about severe/SEVERE - a fairly common word, and only one letter > more to type than fatal? Doesn't tickle my fancy like fatal/FATAL does. I think that critical/CRITICAL is okay as long as we can't agree on a replacement, so let's stick with that. warn/WARN still will become warning/WARNING, with no backwards compatibility. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Jan 21 20:07:10 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 21 Jan 2003 21:07:10 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <20030122012900.BC17.ISHIMOTO@gembook.org> References: <15910.50700.726791.488944@gargle.gargle.HOWL> <3E26D841.2020805@lemburg.com> <20030122012900.BC17.ISHIMOTO@gembook.org> Message-ID: <3E2DA86E.6030601@lemburg.com> Atsuo Ishimoto wrote: > On Thu, 16 Jan 2003 17:05:21 +0100 > "M.-A. Lemburg" wrote: > > >>I'm not biased in any direction here. Again, I'd love to see the >>two sets be merged into one, e.g. take the Python ones from Hisao >>and use the C ones from Tamito if they are installed instead. > > Here again, I entreat to add Tamito's C version as standard Japanese > codec. It's fast, is proven quite stable and correct. > > If people need to customize mapping table(this may happen in some cases, > but not common task. I believe almost 100% of Japanese programmers never > had wrote such a special mapping table), and if they think Tamito's > codec is too difficult to customize(while I don't think so), and only if > they are satisfied with performance of codec written in Python, they > will download and install the Python version of codec. 
Wouldn't it be better to use Hisao's codec per default and revert to Tamito's in case that's installed in the system ? We also need active maintainers for the codecs. I think ideal would be to get Hisao share this load -- Hisao for the Python version and Tamito for the C one. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Tue Jan 21 21:15:31 2003 From: martin@v.loewis.de (Martin v. Löwis) Date: 21 Jan 2003 22:15:31 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E2DA86E.6030601@lemburg.com> References: <15910.50700.726791.488944@gargle.gargle.HOWL> <3E26D841.2020805@lemburg.com> <20030122012900.BC17.ISHIMOTO@gembook.org> <3E2DA86E.6030601@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > Wouldn't it be better to use Hisao's codec per default and revert > to Tamito's in case that's installed in the system ? I still don't see the rationale of incorporating code that has less functions and less performance instead of incorporating code that has more functions and more performance. > We also need active maintainers for the codecs. I think ideal > would be to get Hisao share this load -- Hisao for the Python > version and Tamito for the C one. That is a valid point. Has Hisao volunteered to maintain his code? Regards, Martin From barry@python.org Tue Jan 21 21:40:19 2003 From: barry@python.org (Barry A.
Warsaw) Date: Tue, 21 Jan 2003 16:40:19 -0500 Subject: [Python-Dev] Adding Japanese Codecs to the distro References: <15910.50700.726791.488944@gargle.gargle.HOWL> <3E26D841.2020805@lemburg.com> <20030122012900.BC17.ISHIMOTO@gembook.org> <3E2DA86E.6030601@lemburg.com> Message-ID: <15917.48707.303134.671017@gargle.gargle.HOWL> >>>>> "MvL" == Martin v Löwis writes: >> We also need active maintainers for the codecs. I think ideal >> would be to get Hisao share this load -- Hisao for the Python >> version and Tamito for the C one. MvL> That is a valid point. Has Hisao volunteered to maintain his MvL> code? AFAICT, Tamito is pretty good about maintaining his codec. I see new versions announced fairly regularly. -Barry From zen@shangri-la.dropbear.id.au Wed Jan 22 00:13:43 2003 From: zen@shangri-la.dropbear.id.au (Stuart Bishop) Date: Wed, 22 Jan 2003 11:13:43 +1100 Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: <200301212122.h0LLMU828302@odiug.zope.com> Message-ID: <5DD1B572-2D9E-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> On Wednesday, January 22, 2003, at 08:22 AM, Guido van Rossum wrote: >>>> My personal belief would be to include Gadfly in Python: >>>> - Provides a reason for the DB API docs to be merged into the >>>> Python library reference >>>> - Gives Python relational DB stuff out of the box ala Java, >>>> but with a working RDBMS as well ala nothing else I'm aware >>>> of. >>>> - Makes including GadflyDA in Zope 3 a trivial decision, since >>>> its size would be negligable and the DA code itself is >>>> already ZPL. >>> >>> Would you be willing to find out (from c.l.py) how much interest >>> there is in this? >> >> A fairly positive response from the DB SIG. The trick will be to fix >> the outstanding bugs or disable those features (losing the 'group >> by' and 'unique' SQL clauses), and to confirm and fix any departures >> from the DB-API 2.0 standard, as this would become a reference >> implementation of sorts.
>> >> There is no permanent maintainer, as Richard Jones is in more of a >> caretaker role with the code. I'll volunteer to try and get the code >> into a Python release though. >> >> If fixes, documentation and tests can be organized by the end of >> January for alpha2, will this go out with Python 2.3 (assuming a >> signoff on quality by python-dev and the DB-SIG)? If not, Jim is >> back to deciding if he should include Gadfly with Zope3. > > Sorry for not responding before. I'm open for doing this, but you > should probably probe python-dev next before you start a big coding > project. How much C code is involved in Gadfly? If it's a lot, I'm a > lot more reluctant, because C code usually requires much more > maintenance (rare is the C source file that doesn't have some hidden > platform dependency). Gadfly comes with kjbuckets, which is written in C. The rest is Python. Gadfly uses the included kjbuckets for storage if it is available, but happily runs without it with a performance hit. So Jython gets a RDBMS implementation too. -- Stuart Bishop http://shangri-la.dropbear.id.au/ From ishimoto@gembook.org Wed Jan 22 00:22:20 2003 From: ishimoto@gembook.org (Atsuo Ishimoto) Date: Wed, 22 Jan 2003 09:22:20 +0900 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <15917.48707.303134.671017@gargle.gargle.HOWL> References: <15917.48707.303134.671017@gargle.gargle.HOWL> Message-ID: <20030122085353.AB36.ISHIMOTO@gembook.org> On Tue, 21 Jan 2003 16:40:19 -0500 barry@python.org (Barry A. Warsaw) wrote: > > >>>>> "MvL" == Martin v L$Bvw(Bis writes: > > >> We also need active maintainers for the codecs. I think ideal > >> would be to get Hisao share this load -- Hisao for the Python > >> version and Tamito for the C one. > > MvL> That is a valid point. Has Hisao volunteered to maintain his > MvL> code? > > AFAICT, Tamito is pretty good about maintaining his codec. I see new > versions announced fairly regularly. > (cc'ing another Tamito's mail addr. 
Tamito, are you awake?) I believe he will continue to maintain it. Of course, I and people in the Japanese Python community will help him. I don't expect that kind of community effort for Hisao's codec. Active users in Japan will continue to use Tamito's one, and won't care whether the Python version is broken or not. -------------------------- Atsuo Ishimoto ishimoto@gembook.org Homepage:http://www.gembook.jp From bac@OCF.Berkeley.EDU Wed Jan 22 00:21:46 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Tue, 21 Jan 2003 16:21:46 -0800 (PST) Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: <5DD1B572-2D9E-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> References: <5DD1B572-2D9E-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> Message-ID: [Stuart Bishop] > > Sorry for not responding before. I'm open for doing this, but you > > should probably probe python-dev next before you start a big coding > > project. How much C code is involved in Gadfly? If it's a lot, I'm a > > lot more reluctant, because C code usually requires much more > > maintenance (rare is the C source file that doesn't have some hidden > > platform dependency). > > Gadfly comes with kjbuckets, which is written in C. The rest is Python. > Gadfly uses the included kjbuckets for storage if it is available, but > happily runs without it with a performance hit. So Jython gets a > RDBMS implementation too. > So my first question is what is the license on Gadfly? I assume it is compatible with going into Python, but I thought I would ask. Next, how much of a performance hit is there without kjbuckets? I am with Guido in wanting to minimize the amount of C code put into the libraries where there is no requirement for it. And if there is a decent hit what would it take to code up something in Python to replace it? We could leave it as an option to use kjbuckets if we want. And if taking out kjbuckets is unreasonable, what license is it under?
I personally would love to have an actual DB in the stdlib so if these questions get positive answers I am +1. -Brett From doko@cs.tu-berlin.de Wed Jan 22 00:21:53 2003 From: doko@cs.tu-berlin.de (Matthias Klose) Date: Wed, 22 Jan 2003 01:21:53 +0100 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: <15916.33189.872024.702720@gargle.gargle.HOWL> References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com> <3E2C7EEB.3020209@v.loewis.de> <15916.33189.872024.702720@gargle.gargle.HOWL> Message-ID: <15917.58401.603261.82762@gargle.gargle.HOWL> Barry A. Warsaw writes: > > >>>>> "MvL" == Martin v Löwis writes: > > MvL> No need is real. The specific requirement comes from > > MvL> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=96111 > > MvL> where a user complains that mailman writes into /usr (writing > MvL> actually failed, but a Secure Linux kernel detected the > MvL> problem, and the user wants to silence the warning, > MvL> implementing some policy). > > MvL> It turns out that this is Python trying to write .pyc > MvL> files. It would be desirable to turn pyc generation > MvL> completely off for mailman. This could be done best through > MvL> actually modifying the mailman source code. Setting an > MvL> environment variable is less convenient, as you then have to > MvL> find all places where mailman scripts are invoked, or have to > MvL> wrap all mailman scripts. > > That particular bug is because the debian package isn't/wasn't > complete. In a from-source installation of Mailman, we do indeed run > compileall.py on all the Python files. From skimming the debian bug > report, it looks like they fixed their post-install script to do the > same. yes, but one file mm_cfg.py is a configuration file, which should not be compiled. This is not specific to mailman. A core configuration file is site.py, which shouldn't be compiled as well, so here it would make sense to use a sys..
From rjones@ekit-inc.com Wed Jan 22 00:39:03 2003 From: rjones@ekit-inc.com (Richard Jones) Date: Wed, 22 Jan 2003 11:39:03 +1100 Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: <5DD1B572-2D9E-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> References: <5DD1B572-2D9E-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> Message-ID: <200301221139.03934.rjones@ekit-inc.com> On Wed, 22 Jan 2003 11:13 am, Stuart Bishop wrote: > Gadfly comes with kjbuckets, which is written in C. The rest is Python. > Gadfly uses the included kjbuckets for storage if it is available, but > happily runs without it with a performance hit. So Jython gets a > RDBMS implementation too. Anthony Baxter is looking into replacing kjSets (done) and kjbuckets (in progress) in gadfly with the new sets implementation in python 2.3. Richard From guido@python.org Wed Jan 22 00:54:46 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 21 Jan 2003 19:54:46 -0500 Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: Your message of "Wed, 22 Jan 2003 11:13:43 +1100." <5DD1B572-2D9E-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> References: <5DD1B572-2D9E-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> Message-ID: <200301220054.h0M0slj10424@pcp02138704pcs.reston01.va.comcast.net> > Gadfly comes with kjbuckets, which is written in C. The rest is > Python. Gadfly uses the included kjbuckets for storage if it is > available, but happily runs without it with a performance hit. So > Jython gets a RDBMS implementation too. Hm. I've not reviewed kjbuckets myself, but I've heard it's some of the hairiest C code ever written. The problem with putting that in the Python distribution is that we end up having to maintain it, whether we want to or not. So I'm -1 on adding kjbuckets. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From zen@shangri-la.dropbear.id.au Wed Jan 22 00:57:17 2003 From: zen@shangri-la.dropbear.id.au (Stuart Bishop) Date: Wed, 22 Jan 2003 11:57:17 +1100 Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: Message-ID: <73F0132C-2DA4-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> On Wednesday, January 22, 2003, at 11:21 AM, Brett Cannon wrote: > [Stuart Bishop] > >>> Sorry for not responding before. I'm open for doing this, but you >>> should probably probe python-dev next before you start a big coding >>> project. How much C code is involved in Gadfly? If it's a lot, I'm >>> a >>> lot more reluctant, because C code usually requires much more >>> maintenance (rare is the C source file that doesn't have some hidden >>> platform dependency). >> >> Gadfly comes with kjbuckets, which is written in C. The rest is >> Python. >> Gadfly uses the included kjbuckets for storage if it is available, but >> happily runs without it with a performance hit. So Jython gets a >> RDBMS implementation too. >> > > So my first question is what is the license on Gadfly? I assume it is > compatible with going into Python, but I thought I would ask. Use granted for any purpose without fee, provided the Copyright and permission notices appear in all copies and supporting documentation. > Next, how much of a performance hit is there without kjbuckets? I am > with > Guido with wanting to minimize the amount of C code put into the > libraries > where there is no requirement for it. And if there is a decent hit > what > would it take to code up something in Python to replace it? We could > leave it as an option to use kjbuckets if we want. The fallback already is a version of kjbuckets written in pure Python. So the build process simply needs to keep going if kjbucketsmodule.c doesn't build. 
The regression tests during alpha and beta releases should tell us if we need to switch off kjbuckets on certain platforms, although it already has had a decent work out since Gadfly has been part of Zope since at least 2.0 (3+ years?) > And if taking out kjbuckets is unreasonble, what license is it under? Same licence. > I personally would love to have an actual DB in the stdlib so if these > questions get positive answers I am +1. -- Stuart Bishop http://shangri-la.dropbear.id.au/ From bac@OCF.Berkeley.EDU Wed Jan 22 00:58:03 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Tue, 21 Jan 2003 16:58:03 -0800 (PST) Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: <200301221139.03934.rjones@ekit-inc.com> References: <5DD1B572-2D9E-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> <200301221139.03934.rjones@ekit-inc.com> Message-ID: [Richard Jones] > On Wed, 22 Jan 2003 11:13 am, Stuart Bishop wrote: > > Gadfly comes with kjbuckets, which is written in C. The rest is Python. > > Gadfly uses the included kjbuckets for storage if it is available, but > > happily runs without it with a performance hit. So Jython gets a > > RDBMS implementation too. > > Anthony Baxter is looking into replacing kjSets (done) and kjbuckets (in > progress) in gadfly with the new sets implementation in python 2.3. > Oh good. I then suggest we hold off on considering adding Gadfly until this work is done. -Brett From guido@python.org Wed Jan 22 01:01:00 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 21 Jan 2003 20:01:00 -0500 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: Your message of "Wed, 22 Jan 2003 01:21:53 +0100." 
<15917.58401.603261.82762@gargle.gargle.HOWL> References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com> <3E2C7EEB.3020209@v.loewis.de> <15916.33189.872024.702720@gargle.gargle.HOWL> <15917.58401.603261.82762@gargle.gargle.HOWL> Message-ID: <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net> > > That particular bug is because the debian package isn't/wasn't > > complete. In a from-source installation of Mailman, we do indeed run > > compileall.py on all the Python files. From skimming the debian bug > > report, it looks like they fixed their post-install script to do the > > same. > > yes, but one file mm_cfg.py is a configuration file, which should not > be compiled. This is not specific to mailman. A core configuration > file is site.py, which shouldn't be compiled as well, so here it would > make sense to use a sys.. What's the problem with compiling mm_cfg.py or site.py? As long as you don't delete the .py file, the .pyc file acts only as a cache. I see no need to avoid compilation. (Unlike Emacs, Python only uses the .pyc file if the timestamp in its header *matches exactly* the mtime of the corresponding .py file, *or* if there is no .py file.) --Guido van Rossum (home page: http://www.python.org/~guido/) From altis@semi-retired.com Wed Jan 22 01:03:50 2003 From: altis@semi-retired.com (Kevin Altis) Date: Tue, 21 Jan 2003 17:03:50 -0800 Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: <5DD1B572-2D9E-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> Message-ID: > From: Stuart Bishop > > On Wednesday, January 22, 2003, at 08:22 AM, Guido van Rossum wrote: > > >>>> My personal belief would be to include Gadfly in Python: > >>>> - Provides a reason for the DB API docs to be merged into the > >>>> Python library reference > >>>> - Gives Python relational DB stuff out of the box ala Java, > >>>> but with a working RDBMS as well ala nothing else I'm aware > >>>> of. 
> >>>> - Makes including GadflyDA in Zope 3 a trivial decision, since > >>>> its size would be negligable and the DA code itself is > >>>> already ZPL. > >>> > >>> Would you be willing to find out (from c.l.py) how much interest > >>> there is in this? > >> > >> A fairly positive response from the DB SIG. The trick will be to fix > >> the outstanding bugs or disable those features (losing the 'group > >> by' and 'unique' SQL clauses), and to confirm and fix any departures > >> from the DB-API 2.0 standard, as this would become a reference > >> implementation of sorts. > >> > >> There is no permanent maintainer, as Richard Jones is in more of a > >> caretaker role with the code. I'll volunteer to try and get the code > >> into a Python release though. > >> > >> If fixes, documentation and tests can be organized by the end of > >> January for alpha2, will this go out with Python 2.3 (assuming a > >> signoff on quality by python-dev and the DB-SIG)? If not, Jim is > >> back to deciding if he should include Gadfly with Zope3. > > > > Sorry for not responding before. I'm open for doing this, but you > > should probably probe python-dev next before you start a big coding > > project. How much C code is involved in Gadfly? If it's a lot, I'm a > > lot more reluctant, because C code usually requires much more > > maintenance (rare is the C source file that doesn't have some hidden > > platform dependency). > > Gadfly comes with kjbuckets, which is written in C. The rest is Python. > Gadfly uses the included kjbuckets for storage if it is available, but > happily runs without it with a performance hit. So Jython gets a > RDBMS implementation too. Interesting. I'm in the process of trying out Gadfly, PySQLite, and MetaKit as embedded databases. 
For reference, the links are: Gadfly http://gadfly.sourceforge.net/ SQLite and PySQLite http://www.hwaci.com/sw/sqlite/ http://pysqlite.sourceforge.net/ MetaKit, Mk4py, MkSQL http://www.equi4.com/metakit/ http://www.equi4.com/metakit/python.html http://www.mcmillan-inc.com/mksqlintro.html All are embeddable databases, but they each have their pros and cons. I can see how Gadfly would have a lot of appeal since it can be used as a pure Python solution. The licensing for MetaKit probably makes it inappropriate for the Python standard libs, but I'm sure that could be brought up with the author. PySQLite seems to be the most mature (MetaKit users may disagree), certainly SQLite is better documented, has a richer feature set, and as a bonus the source code is in the public domain! PySQLite appears to be quite fast. http://www.hwaci.com/sw/sqlite/speed.html Since it doesn't use a memory map like MetaKit, it should work equally well with small and large data sets. Anyway, I'm probably a month away from being able to present an adequate comparison of using each for different relational datasets. One data set I'm looking at is roughly 800MB of data, the other is only about 256KB and I'm looking at the smaller one first since it also has a simpler table structure. I would be interested in seeing both Gadfly and PySQLite supported in the standard libs. I'm guessing that Gadfly needs a lot of testing and probably bug fixes to justify including it in the 2.3 standard libs. ka From guido@python.org Wed Jan 22 01:08:15 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 21 Jan 2003 20:08:15 -0500 Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: Your message of "Wed, 22 Jan 2003 11:57:17 +1100." 
<73F0132C-2DA4-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> References: <73F0132C-2DA4-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> Message-ID: <200301220108.h0M18FE10664@pcp02138704pcs.reston01.va.comcast.net> > Use granted for any purpose without fee, provided the Copyright and > permission notices appear in all copies and supporting > documentation. This is actually a pretty serious burden -- the list of licenses we have to keep around in all copies and docs keeps growing. :-( Maybe Aaron will assign the code to the PSF? --Guido van Rossum (home page: http://www.python.org/~guido/) From rjones@ekit-inc.com Wed Jan 22 01:16:13 2003 From: rjones@ekit-inc.com (Richard Jones) Date: Wed, 22 Jan 2003 12:16:13 +1100 Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: References: Message-ID: <200301221216.13283.rjones@ekit-inc.com> On Wed, 22 Jan 2003 12:03 pm, Kevin Altis wrote: > Interesting. I'm in the process of trying out Gadfly, PySQLite, and MetaKit > as embedded databases. For reference, the links are: > > Gadfly > http://gadfly.sourceforge.net/ > > SQLite and PySQLite > http://www.hwaci.com/sw/sqlite/ > http://pysqlite.sourceforge.net/ > > MetaKit, Mk4py, MkSQL > http://www.equi4.com/metakit/ > http://www.equi4.com/metakit/python.html > http://www.mcmillan-inc.com/mksqlintro.html > > All are embeddable databases, but they each have their pros and cons. Gadfly has the advantage that any marshallable Python object may be stored with no mess, no fuss. Sqlite is restricted to only storing strings. Metakit supports a variety of data types, but no explicit NULL. Actually, the three support wildly different types of "unset" values: gadfly: python's None sqlite: sql NULL (and all its quirks ;) metakit: no support Gadfly has the additional benefit that any Python object may support its View interface, and thus participate in SQL queries. Pretty powerful stuff. 
> Since it doesn't use a memory map like MetaKit, it should work equally well > with small and large data sets. I'm not sure this is a reasonable statement to make. > I would be interested in seeing both Gadfly and PySQLite supported in the > standard libs. I'm guessing that Gadfly needs a lot of testing and probably > bug fixes to justify including it in the 2.3 standard libs. Gadfly has outstanding bugs (see the sourceforge bug tracker). It has a suite of unit tests, but these are far from complete. It needs volunteers :) It'd also be nice for gadfly to support SQL "LIKE" expressions, but that also requires work under the hood by some generous volunteer :) Richard From rjones@ekit-inc.com Wed Jan 22 01:20:06 2003 From: rjones@ekit-inc.com (Richard Jones) Date: Wed, 22 Jan 2003 12:20:06 +1100 Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: References: Message-ID: <200301221220.06615.rjones@ekit-inc.com> On Wed, 22 Jan 2003 12:03 pm, Kevin Altis wrote: > Anyway, I'm probably a month away from being able to present an adequate > comparison of using each for different relational datasets. One data set > I'm looking at is roughly 800MB of data, the other is only about 256KB and > I'm looking at the smaller one first since it also has a simpler table > structure. Oh, and one other thing: from way back at the start of this discussion, it was decided that performance was not going to be a major deciding factor. Sure, we can make sure the perfomance doesn't suck, but if you want a large database, use a real database engine :) Richard From zen@shangri-la.dropbear.id.au Wed Jan 22 01:26:06 2003 From: zen@shangri-la.dropbear.id.au (Stuart Bishop) Date: Wed, 22 Jan 2003 12:26:06 +1100 Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? 
In-Reply-To: Message-ID: <7ACBEEFA-2DA8-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> On Wednesday, January 22, 2003, at 12:03 PM, Kevin Altis wrote: > All are embeddable databases, but they each have their pros and cons. > I can > see how Gadfly would have a lot of appeal since it can be used as a > pure > Python solution. The licensing for MetaKit probably makes it > inappropriate > for the Python standard libs, but I'm sure that could be brought up > with the > author. PySQLite seems to be the most mature (MetaKit users may > disagree), > certainly SQLite is better documented, has a richer feature set, and > as a > bonus the source code is in the public domain! PySQLite appears to be > quite > fast. MetaKit and PySQLite were brought up when discussing this on the DB-SIG mailing list. However, the major problem is keeping releases of these third party tools in sync with Python releases. The advantage of Gadfly is that it has been in maintenance only mode for a few years now, and can happily be uprooted and replanted in the Python CVS repository. -- Stuart Bishop http://shangri-la.dropbear.id.au/ From zen@shangri-la.dropbear.id.au Wed Jan 22 01:26:20 2003 From: zen@shangri-la.dropbear.id.au (Stuart Bishop) Date: Wed, 22 Jan 2003 12:26:20 +1100 Subject: [Python-Dev] Re: GadflyDA in core? Or as add-on-product? In-Reply-To: <200301220108.h0M18FE10664@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <82D9C667-2DA8-11D7-88CD-000393B63DDC@shangri-la.dropbear.id.au> On Wednesday, January 22, 2003, at 12:08 PM, Guido van Rossum wrote: >> Use granted for any purpose without fee, provided the Copyright and >> permission notices appear in all copies and supporting >> documentation. > > This is actually a pretty serious burden -- the list of licenses we > have to keep around in all copies and docs keeps growing. :-( Hmm... I assumed that I would simply cut & paste the agreement into the Gadfly documentation I'd have to adapt for the Library reference. 
Looks like Python licence but Aaron wants to be identified as the author ('CV-ware'). cc:'d to Aaron for the horses mouth opinion if the email address I have is still valid. From COPYRIGHT.txt: The gadfly and kjbuckets source is copyrighted, but you can freely use and copy it as long as you don't change or remove the copyright: Copyright Aaron Robert Watters, 1994 All Rights Reserved Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appears in all copies and that both that copyright notice and this permission notice appear in supporting documentation. AARON ROBERT WATTERS DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL AARON ROBERT WATTERS BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. -- Stuart Bishop http://shangri-la.dropbear.id.au/ From martin@v.loewis.de Wed Jan 22 01:44:27 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 22 Jan 2003 02:44:27 +0100 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net> References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com> <3E2C7EEB.3020209@v.loewis.de> <15916.33189.872024.702720@gargle.gargle.HOWL> <15917.58401.603261.82762@gargle.gargle.HOWL> <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > What's the problem with compiling mm_cfg.py or site.py? As long as > you don't delete the .py file, the .pyc file acts only as a cache. I > see no need to avoid compilation. 
If you change the configuration file, Python will try to regenerate the .pyc file. This is a problem for people who don't want pyc files written at program execution time. Regards, Martin From guido@python.org Wed Jan 22 01:50:29 2003 From: guido@python.org (Guido van Rossum) Date: Tue, 21 Jan 2003 20:50:29 -0500 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: Your message of "22 Jan 2003 02:44:27 +0100." References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com> <3E2C7EEB.3020209@v.loewis.de> <15916.33189.872024.702720@gargle.gargle.HOWL> <15917.58401.603261.82762@gargle.gargle.HOWL> <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200301220150.h0M1oT711039@pcp02138704pcs.reston01.va.comcast.net> > Guido van Rossum writes: > > > What's the problem with compiling mm_cfg.py or site.py? As long as > > you don't delete the .py file, the .pyc file acts only as a cache. I > > see no need to avoid compilation. > > If you change the configuration file, Python will try to regenerate > the .pyc file. This is a problem for people who don't want pyc files > written at program execution time. Ok, but weren't we going to give those people an explicit option? --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@python.org Wed Jan 22 02:42:36 2003 From: barry@python.org (Barry A. Warsaw) Date: Tue, 21 Jan 2003 21:42:36 -0500 Subject: [Python-Dev] disable writing .py[co] References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com> <3E2C7EEB.3020209@v.loewis.de> <15916.33189.872024.702720@gargle.gargle.HOWL> <15917.58401.603261.82762@gargle.gargle.HOWL> <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <15918.1308.79080.420117@gargle.gargle.HOWL> >>>>> "GvR" == Guido van Rossum writes: GvR> What's the problem with compiling mm_cfg.py or site.py? I don't think there is one. 
Maybe Matthias is concerned that unlike the other source files in the Mailman distro, mm_cfg.py is supposed to be where users configure the system, so it may change more frequently than other files (although still rare, I'd guess). It'll definitely change after the initial install+compileall step. -Barry From barry@python.org Wed Jan 22 03:49:58 2003 From: barry@python.org (Barry A. Warsaw) Date: Tue, 21 Jan 2003 22:49:58 -0500 Subject: [Python-Dev] Cookie.py too strict Message-ID: <15918.5350.111873.935732@gargle.gargle.HOWL> I'm unhappy about aspects of Cookie.py. My main gripe at the moment is the fact that if you feed SimpleCookie cookie data with colons in one of the keys, it raises a CookieError. This might seem reasonable behavior when creating cookies from a program, but I think it's unreasonable for reading cookies from http data, under the mantra of "be liberal what you accept and strict in what you produce". Aside: this is nailing me when visiting both Mailman 2.0 and 2.1 lists at python.org. MM2.0 used a key like mylist:admin and the colon is not strictly legal (apparently - it didn't jump out at me in a quick scan of RFCs 2068 and 2109). MM2.1 changed that to be mylist+admin as the key, but since mail.python.org shares url space between 2.0 and 2.1 lists, a cookie header may have a mix. It's very inconvenient for Cookie.py to throw the error in this case. Anyway, it turns out to be a PITA to work around this via say inheritance because of the design of the module and the embedding of the legal chars in Morsel.set() call. I'd like to fix this, but I'm not sure exactly how I'd like to fix it. "Must resist temptation to rewrite." Before I dig in, I thought I'd ask and see if anybody else cared about this, or had ideas they'd like to throw out. I want to do the simplest thing I can get away with, but I would like to fix it in Cookie.py rather than in a Mailman monkey patch. 
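A minimal sketch of the "liberal in what you accept, strict in what you produce" split described above. It does not touch Cookie.py; `parse_cookie_header` and `format_cookie` are hypothetical helper names, and the strict character set is an assumption chosen for illustration rather than a quotation from the RFCs:

```python
import re

# Characters a strict producer allows in a cookie name.
# This set is an assumption for the sketch, not taken from Cookie.py.
_STRICT_NAME = re.compile(r"^[A-Za-z0-9!#$%&'*+\-.^_`|~]+$")


def parse_cookie_header(raw):
    """Leniently split a Cookie header into a {name: value} dict.

    Names are accepted as-is, so a key such as 'mylist:admin' does not raise.
    """
    cookies = {}
    for chunk in raw.split(";"):
        name, sep, value = chunk.strip().partition("=")
        if not sep or not name:
            continue  # skip fragments that are not name=value pairs
        cookies[name] = value.strip().strip('"')
    return cookies


def format_cookie(name, value):
    """Strictly build a 'name=value' pair, rejecting illegal names."""
    if not _STRICT_NAME.match(name):
        raise ValueError("illegal cookie name: %r" % (name,))
    return "%s=%s" % (name, value)
```

With this split, a header mixing `mylist:admin` and `mylist+admin` keys parses without error, while only the producing side refuses the colon.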
-Barry From martin@v.loewis.de Wed Jan 22 08:27:36 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 22 Jan 2003 09:27:36 +0100 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: <200301220150.h0M1oT711039@pcp02138704pcs.reston01.va.comcast.net> References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com> <3E2C7EEB.3020209@v.loewis.de> <15916.33189.872024.702720@gargle.gargle.HOWL> <15917.58401.603261.82762@gargle.gargle.HOWL> <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net> <200301220150.h0M1oT711039@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum writes: > > If you change the configuration file, Python will try to regenerate > > the .pyc file. This is a problem for people who don't want pyc files > > written at program execution time. > > Ok, but weren't we going to give those people an explicit option? Sure, this thread is about how to do that. I said that there is this Debian bug, Barry said that this was a bug in Debian itself, for not compiling all mailman source code, Matthias said that this doesn't apply to mm_cfg.py (i.e. Debian can't ship a precompiled mm_cfg.py), you asked what the problem is, I said that generating mm_cfg.pyc is the problem. Net result: the Debian bug report is still unresolved. Now, I originally brought it up because the command line option might not help, in this case. Mailman is invoked through scripts that start with #! /usr/local/bin/python In #! scripts, adding command line options is a tricky business. In this specific case, you could add an option. For #! /usr/bin/env python you couldn't, as you can have only a single argument in #! scripts. So I withdraw the claim that the Debian bug report couldn't be solved with a command line option. However, similar cases might not be solvable with that option. Regards, Martin From martin@v.loewis.de Wed Jan 22 08:32:02 2003 From: martin@v.loewis.de (Martin v. 
=?iso-8859-15?q?L=F6wis?=) Date: 22 Jan 2003 09:32:02 +0100 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: <15918.1308.79080.420117@gargle.gargle.HOWL> References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com> <3E2C7EEB.3020209@v.loewis.de> <15916.33189.872024.702720@gargle.gargle.HOWL> <15917.58401.603261.82762@gargle.gargle.HOWL> <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net> <15918.1308.79080.420117@gargle.gargle.HOWL> Message-ID: barry@python.org (Barry A. Warsaw) writes: > I don't think there is one. Maybe Matthias is concerned that unlike > the other source files in the Mailman distro, mm_cfg.py is supposed to > be where users configure the system, so it may change more frequently > than other files (although still rare, I'd guess). It'll definitely > change after the initial install+compileall step. The real problem here is that some people don't want Python to write any .pyc files, period. So even if mailman byte-compiles all its source code, it can't stop Python from writing .pyc files that belong to mailman. That, in turn, can cause the problems discussed in this thread: there might be races in doing so, it might trigger security alarms, and it might write byte code files in configuration areas of the system where only text files are allowed, per policy. Regards, Martin From mal@lemburg.com Wed Jan 22 09:29:54 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 22 Jan 2003 10:29:54 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <20030122085353.AB36.ISHIMOTO@gembook.org> References: <15917.48707.303134.671017@gargle.gargle.HOWL> <20030122085353.AB36.ISHIMOTO@gembook.org> Message-ID: <3E2E6492.7000600@lemburg.com> Atsuo Ishimoto wrote: > On Tue, 21 Jan 2003 16:40:19 -0500 > barry@python.org (Barry A. Warsaw) wrote: > > >>>>>>>"MvL" == Martin v L?is writes: >> >> >> We also need active maintainers for the codecs. 
I think ideal >> >> would be to get Hisao share this load -- Hisao for the Python >> >> version and Tamito for the C one. >> >> MvL> That is a valid point. Has Hisao volunteered to maintain his >> MvL> code? Yes. >>AFAICT, Tamito is pretty good about maintaining his codec. I see new >>versions announced fairly regularly. I wasn't saying that he doesn't maintain the code. Indeed, he does a very good job at it. > (cc'ing another Tamito's mail addr. Tamito, are you wake up?) > > I believe he will continue to maintain it. Of cource, I and people in > the Japanese Python community will help him. I don't expect such kind of > community effort for Hisao's codec. Active users in Japan will continue > to use Tamio's one, and don't care Python version is broken or not. Hmm, there seems to be a very strong feeling towards Tamito's codecs in the Japanese community. The problem I see is size: Tamito's codecs have an installed size of 1790kB while Hisao's codecs are around 81kB. That's why I was suggesting to use Hisao's codecs as default and to revert to Tamito's in case they are installed (much like you'd use cStringIO instead of StringIO if it's installed). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From martin@v.loewis.de Wed Jan 22 10:18:15 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 22 Jan 2003 11:18:15 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E2E6492.7000600@lemburg.com> References: <15917.48707.303134.671017@gargle.gargle.HOWL> <20030122085353.AB36.ISHIMOTO@gembook.org> <3E2E6492.7000600@lemburg.com> Message-ID: "M.-A. Lemburg" writes: > The problem I see is size: Tamito's codecs have an installed > size of 1790kB while Hisao's codecs are around 81kB. 
It isn't quite that bad: You need to count the "c" directory only, which is 690kB on my system. > That's why I was suggesting to use Hisao's codecs as default and > to revert to Tamito's in case they are installed (much like you'd use > cStringIO instead of StringIO if it's installed). The analogy isn't that good here: it would be more similar if StringIO was incomplete, e.g. would be lacking a .readlines() function, so you would have no choice but to use cStringIO if you happen to need .readlines(). Regards, Martin From ishimoto@gembook.org Wed Jan 22 10:25:17 2003 From: ishimoto@gembook.org (Atsuo Ishimoto) Date: Wed, 22 Jan 2003 19:25:17 +0900 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E2E6492.7000600@lemburg.com> References: <20030122085353.AB36.ISHIMOTO@gembook.org> <3E2E6492.7000600@lemburg.com> Message-ID: <20030122184345.09C0.ISHIMOTO@gembook.org> On Wed, 22 Jan 2003 10:29:54 +0100 "M.-A. Lemburg" wrote: > The problem I see is size: Tamito's codecs have an installed > size of 1790kB while Hisao's codecs are around 81kB. > You cannot compare size of untared files here. Tamito's codecs package contains source of C version and Python version. About 1 MB in 1790kB is size of C sources. So, I'm proposing to add only C version of codec from JapaneseCodecs package. As I mentioned, size of C version is about 160 KB in Win32 binary form, excluding tests and documentations. I don't see a significant difference between them. If size of C sources(about 1 MB) is matter, we may be able to reduce it. > That's why I was suggesting to use Hisao's codecs as default and > to revert to Tamito's in case they are installed (much like you'd use > cStringIO instead of StringIO if it's installed). Hmm, I assume cStringIO is installed always. I use StringIO only if I want to subclass StringIO class. 
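The "use the C version if it is installed, otherwise fall back" pattern discussed here can be sketched roughly as follows. This is a generic illustration in current Python; `first_available` is a hypothetical helper, and the module names in the usage note are placeholders rather than the real codec packages:

```python
import importlib


def first_available(module_names):
    """Return the first module in module_names that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next, usually slower, candidate
    raise ImportError("none of %r could be imported" % (list(module_names),))
```

A caller would list the preferred implementation first, for example `first_available(["_hypothetical_c_codec", "json"])`, and get the fallback whenever the first name is not installed. The pattern only works when both candidates offer the same interface.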
-------------------------- Atsuo Ishimoto ishimoto@gembook.org Homepage:http://www.gembook.jp From mal@lemburg.com Wed Jan 22 11:45:27 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 22 Jan 2003 12:45:27 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: References: <15917.48707.303134.671017@gargle.gargle.HOWL> <20030122085353.AB36.ISHIMOTO@gembook.org> <3E2E6492.7000600@lemburg.com> Message-ID: <3E2E8457.3020608@lemburg.com> Martin v. Löwis wrote: > "M.-A. Lemburg" writes: > >>The problem I see is size: Tamito's codecs have an installed >>size of 1790kB while Hisao's codecs are around 81kB. > > It isn't quite that bad: You need to count the "c" directory only, > which is 690kB on my system. I was looking at the directory which gets installed to site-packages. That's 1790kB on my system. >>That's why I was suggesting to use Hisao's codecs as default and >>to revert to Tamito's in case they are installed (much like you'd use >>cStringIO instead of StringIO if it's installed). > > The analogy isn't that good here: it would be more similar if StringIO > was incomplete, e.g. would be lacking a .readlines() function, so you > would have no choice but to use cStringIO if you happen to need > .readlines(). But you get the picture ... ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jan 22 12:06:47 2003 From: mal@lemburg.com (M.-A.
Lemburg) Date: Wed, 22 Jan 2003 13:06:47 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <20030122184345.09C0.ISHIMOTO@gembook.org> References: <20030122085353.AB36.ISHIMOTO@gembook.org> <3E2E6492.7000600@lemburg.com> <20030122184345.09C0.ISHIMOTO@gembook.org> Message-ID: <3E2E8957.708@lemburg.com> Atsuo Ishimoto wrote: > On Wed, 22 Jan 2003 10:29:54 +0100 > "M.-A. Lemburg" wrote: > >>The problem I see is size: Tamito's codecs have an installed >>size of 1790kB while Hisao's codecs are around 81kB. >> > > You cannot compare size of untared files here. I was talking about the *installed* size, ie. the size of the package in site-packages: degas site-packages/japanese# du 337 ./c 1252 ./mappings 88 ./python 8 ./aliases 1790 . Hisao's Python codec is only 85kB in size. Now, if we took the only the C version of Tamito's codec, we'd end up with around 1790 - 1252 - 88 = 450 kB. Still a factor of 5... I wonder whether it wouldn't be possible to use the same tricks Hisao used in his codec for a C version. > Tamito's codecs package > contains source of C version and Python version. About 1 MB in 1790kB > is size of C sources. > > So, I'm proposing to add only C version of codec from JapaneseCodecs > package. As I mentioned, size of C version is about 160 KB in Win32 > binary form, excluding tests and documentations. I don't see a > significant difference between them. > > If size of C sources(about 1 MB) is matter, we may be able to reduce it. The source code size is not that important. The install size is and even more the memory footprint. Hisao's approach uses a single table which fits into 58kB Python source code. Boil that down to a static C table and you'll end up with something around 10-20kB for static C data. Hisao does still builds a dictionary using this data, but perhaps that step could be avoided using the same techniques that Fredrik used in boiling down the size of the unicodedata module (which holds the Unicode Database). 
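As a rough illustration of the kind of table compression alluded to above, here is a sketch of a two-level lookup table that stores identical blocks only once. It is not taken from either codec or from the unicodedata sources; `split_table` and `lookup` are hypothetical names, and the sketch assumes the mapping table contains many repeated runs:

```python
def split_table(table, block_size):
    """Compress a flat list into (index, blocks) by sharing identical blocks."""
    blocks = []   # unique blocks, each stored once
    seen = {}     # block contents -> position in blocks
    index = []    # for each chunk of the table, which block holds it
    for start in range(0, len(table), block_size):
        block = tuple(table[start:start + block_size])
        if block not in seen:
            seen[block] = len(blocks)
            blocks.append(block)
        index.append(seen[block])
    return index, blocks


def lookup(index, blocks, block_size, i):
    """Read entry i back out of the two-level structure."""
    return blocks[index[i // block_size]][i % block_size]
```

How much this saves depends entirely on how repetitive the real mapping data is, which would have to be measured.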
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@python.org Wed Jan 22 12:49:04 2003 From: barry@python.org (Barry A. Warsaw) Date: Wed, 22 Jan 2003 07:49:04 -0500 Subject: [Python-Dev] disable writing .py[co] References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com> <3E2C7EEB.3020209@v.loewis.de> <15916.33189.872024.702720@gargle.gargle.HOWL> <15917.58401.603261.82762@gargle.gargle.HOWL> <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net> <15918.1308.79080.420117@gargle.gargle.HOWL> Message-ID: <15918.37696.616235.448566@gargle.gargle.HOWL> >>>>> "MvL" == Martin v Löwis writes: MvL> The real problem here is that some people don't want Python MvL> to write any .pyc files, period. So even if mailman MvL> byte-compiles all its source code, it can't stop Python from MvL> writing .pyc files that belong to mailman. That, in turn, can MvL> cause the problems discussed in this thread: there might be MvL> races in doing so, it might trigger security alarms, and it MvL> might write byte code files in configuration areas of the MvL> system where only text files are allowed, per policy. I'm not against suppressing or redirecting pyc output. I kind of like Skip's approach. -Barry From perky@fallin.lv Wed Jan 22 12:49:45 2003 From: perky@fallin.lv (Hye-Shik Chang) Date: Wed, 22 Jan 2003 21:49:45 +0900 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E2E8957.708@lemburg.com> References: <20030122085353.AB36.ISHIMOTO@gembook.org> <3E2E6492.7000600@lemburg.com> <20030122184345.09C0.ISHIMOTO@gembook.org> <3E2E8957.708@lemburg.com> Message-ID: <20030122124945.GA96011@fallin.lv> On Wed, Jan 22, 2003 at 01:06:47PM +0100, M.-A.
Lemburg wrote: [snip] > > degas site-packages/japanese# du > 337 ./c > 1252 ./mappings > 88 ./python > 8 ./aliases > 1790 . > > Hisao's Python codec is only 85kB in size. > > Now, if we took the only the C version of Tamito's codec, we'd > end up with around 1790 - 1252 - 88 = 450 kB. Still a factor of > 5... > > I wonder whether it wouldn't be possible to use the same tricks > Hisao used in his codec for a C version. The trick must not be used in C version. Because C codecs need to keep both of encoding and decoding maps as constants so that share texts inter processes and load the data only once in the whole system. This does matter for multiprocess daemons especially. Hye-Shik =) From ishimoto@gembook.org Wed Jan 22 12:50:44 2003 From: ishimoto@gembook.org (Atsuo Ishimoto) Date: Wed, 22 Jan 2003 21:50:44 +0900 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E2E8957.708@lemburg.com> References: <20030122184345.09C0.ISHIMOTO@gembook.org> <3E2E8957.708@lemburg.com> Message-ID: <20030122214835.09DF.ISHIMOTO@gembook.org> On Wed, 22 Jan 2003 13:06:47 +0100 "M.-A. Lemburg" wrote: > I was talking about the *installed* size, ie. the size > of the package in site-packages: I'm sorry for my misunderstanding. > Now, if we took the only the C version of Tamito's codec, we'd > end up with around 1790 - 1252 - 88 = 450 kB. Still a factor of > 5... > Please try strip ./c/_japanese_codecs.so In my linux box, this reduces size of _japanese_codecs.so from 530 KB into 135 KB. I think this is reasonable size because it contains more tables than Hisao's version. > Hisao's approach uses a single table which fits into 58kB Python > source code. Boil that down to a static C table and you'll end up > with something around 10-20kB for static C data. 
Hisao does > still builds a dictionary using this data, but perhaps that step > could be avoided using the same techniques that Fredrik used > in boiling down the size of the unicodedata module (which holds > the Unicode Database). > Thank you for your advice. I will try it later, if you still think JapaneseCodec is too large. -------------------------- Atsuo Ishimoto ishimoto@gembook.org Homepage:http://www.gembook.jp From mchermside@ingdirect.com Wed Jan 22 13:00:46 2003 From: mchermside@ingdirect.com (Chermside, Michael) Date: Wed, 22 Jan 2003 08:00:46 -0500 Subject: [Python-Dev] Cookie.py too strict Message-ID: <7F171EB5E155544CAC4035F0182093F03CF6B7@INGDEXCHSANC1.ingdirect.com> [Barry writes that Cookie is always strict and that's occasionally a PITA, particularly when you want to be strict in what you produce but liberal in what you accept.] Seems like Cookie.load(rawdata) could become Cookie.load(rawdata, strict=true) without breaking anything. Putting it only on the .load() may be a little less flexible than some would like, but it does tend to enforce the "produce strict, accept liberal" mantra. -- Michael Chermside From mal@lemburg.com Wed Jan 22 13:33:03 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 22 Jan 2003 14:33:03 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <20030122124945.GA96011@fallin.lv> References: <20030122085353.AB36.ISHIMOTO@gembook.org> <3E2E6492.7000600@lemburg.com> <20030122184345.09C0.ISHIMOTO@gembook.org> <3E2E8957.708@lemburg.com> <20030122124945.GA96011@fallin.lv> Message-ID: <3E2E9D8F.9030209@lemburg.com> Hye-Shik Chang wrote: > On Wed, Jan 22, 2003 at 01:06:47PM +0100, M.-A. Lemburg wrote: > [snip] > >>degas site-packages/japanese# du >>337 ./c >>1252 ./mappings >>88 ./python >>8 ./aliases >>1790 . >> >>Hisao's Python codec is only 85kB in size. >> >>Now, if we took the only the C version of Tamito's codec, we'd >>end up with around 1790 - 1252 - 88 = 450 kB.
Still a factor of >>5... >> >>I wonder whether it wouldn't be possible to use the same tricks >>Hisao used in his codec for a C version. > > The trick must not be used in C version. Why not ? Anything that can trim down the memory footprint as well as the installation size is welcome :-) > Because C codecs need to keep > both of encoding and decoding maps as constants so that share texts > inter processes and load the data only once in the whole system. > This does matter for multiprocess daemons especially. Indeed, that's why the Unicode database is also stored this way. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From mal@lemburg.com Wed Jan 22 13:37:08 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 22 Jan 2003 14:37:08 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <20030122214835.09DF.ISHIMOTO@gembook.org> References: <20030122184345.09C0.ISHIMOTO@gembook.org> <3E2E8957.708@lemburg.com> <20030122214835.09DF.ISHIMOTO@gembook.org> Message-ID: <3E2E9E84.7090204@lemburg.com> Atsuo Ishimoto wrote: > On Wed, 22 Jan 2003 13:06:47 +0100 > "M.-A. Lemburg" wrote: > >>Now, if we took the only the C version of Tamito's codec, we'd >>end up with around 1790 - 1252 - 88 = 450 kB. Still a factor of >>5... >> > > Please try > strip ./c/_japanese_codecs.so > > In my linux box, this reduces size of _japanese_codecs.so from 530 KB > into 135 KB. I think this is reasonable size because it contains more > tables than Hisao's version. Ok, we're finally approaching a very reasonable size :-) BTW, why is it that Hisao can use one table for all supported encodings where Tamito uses 6 tables ? >>Hisao's approach uses a single table which fits into 58kB Python >>source code. 
Boil that down to a static C table and you'll end up >>with something around 10-20kB for static C data. Hisao does >>still builds a dictionary using this data, but perhaps that step >>could be avoided using the same techniques that Fredrik used >>in boiling down the size of the unicodedata module (which holds >>the Unicode Database). > > Thank you for your advice. I will try it later, if you still think > JapaneseCodec is too large. That would be great, thanks ! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From guido@python.org Wed Jan 22 13:58:37 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 22 Jan 2003 08:58:37 -0500 Subject: [Python-Dev] disable writing .py[co] In-Reply-To: Your message of "22 Jan 2003 09:27:36 +0100." References: <20030120223204.GP28870@epoch.metaslash.com> <200301202242.h0KMgT014476@odiug.zope.com> <3E2C7EEB.3020209@v.loewis.de> <15916.33189.872024.702720@gargle.gargle.HOWL> <15917.58401.603261.82762@gargle.gargle.HOWL> <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net> <200301220150.h0M1oT711039@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <200301221358.h0MDwbL12999@pcp02138704pcs.reston01.va.comcast.net> > I said that there is this Debian bug, Barry said that this was a bug > in Debian itself, for not compiling all mailman source code, Matthias > said that this doesn't apply to mm_cfg.py (i.e. Debian can't ship a > precompiled mm_cfg.py), you asked what the problem is, I said that > generating mm_cfg.pyc is the problem. Sorry, I *still* don't understand why shipping an mm_cfg.pyc (that's ignored because mm_cfg.py is newer) is a problem. Or is Debian shipping only .pyc files? 

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ishimoto@gembook.org Wed Jan 22 14:02:38 2003
From: ishimoto@gembook.org (Atsuo Ishimoto)
Date: Wed, 22 Jan 2003 23:02:38 +0900
Subject: [Python-Dev] Adding Japanese Codecs to the distro
In-Reply-To: <3E2E9E84.7090204@lemburg.com>
References: <20030122214835.09DF.ISHIMOTO@gembook.org>
 <3E2E9E84.7090204@lemburg.com>
Message-ID: <20030122225450.09E3.ISHIMOTO@gembook.org>

On Wed, 22 Jan 2003 14:37:08 +0100
"M.-A. Lemburg" wrote:

>
> Ok, we're finally approaching a very reasonable size :-)

I'm really glad to hear so.

>
> BTW, why is it that Hisao can use one table for all supported
> encodings where Tamito uses 6 tables ?

There are several kinds of character sets used in Japan. His codec
supports only two character sets called JIS X 0201 and 0208. Tamito's
codec supports other character sets such as JIS X 0212 or Microsoft's
extended character set called cp932.

> >
> > Thank you for your advice. I will try it later, if you still think
> > JapaneseCodec is too large.
>
> That would be great, thanks !
>

I'm not sure this is effective or not, though. Mapping tables under
current implementation are well condensed.

Anyway, I'll try to reduce size.

--------------------------
Atsuo Ishimoto
ishimoto@gembook.org
Homepage:http://www.gembook.jp


From martin@v.loewis.de Wed Jan 22 14:23:41 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 22 Jan 2003 15:23:41 +0100
Subject: [Python-Dev] Adding Japanese Codecs to the distro
In-Reply-To: <3E2E8957.708@lemburg.com>
References: <20030122085353.AB36.ISHIMOTO@gembook.org>
 <3E2E6492.7000600@lemburg.com>
 <20030122184345.09C0.ISHIMOTO@gembook.org>
 <3E2E8957.708@lemburg.com>
Message-ID:

"M.-A. Lemburg" writes:

> I was talking about the *installed* size, ie. the size
> of the package in site-packages:

Right.
And we are trying to tell you that this is irrelevant when
talking about the size increase to be expected when JapaneseCodecs is
incorporated into Python.

> degas site-packages/japanese# du
> 337 ./c
> 1252 ./mappings
> 88 ./python
> 8 ./aliases

You should ignore mappings and python in your counting, they are not
needed.

> I wonder whether it wouldn't be possible to use the same tricks
> Hisao used in his codec for a C version.

I believe it does use the same tricks. It's just that the
JapaneseCodecs package supports a number of widely-used encodings
which Hisao's package does not support.

> The source code size is not that important. The install size
> is and even more the memory footprint.

Computing the memory footprint is very difficult, of course.

> Hisao's approach uses a single table which fits into 58kB Python
> source code. Boil that down to a static C table and you'll end up
> with something around 10-20kB for static C data.

How did you obtain this number?

> Hisao does still builds a dictionary using this data, but perhaps
> that step could be avoided using the same techniques that Fredrik
> used in boiling down the size of the unicodedata module (which holds
> the Unicode Database).

Perhaps, yes. Have you studied the actual data to see whether these
techniques might help or not?

Regards,
Martin


From martin@v.loewis.de Wed Jan 22 14:27:04 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 22 Jan 2003 15:27:04 +0100
Subject: [Python-Dev] disable writing .py[co]
In-Reply-To: <200301221358.h0MDwbL12999@pcp02138704pcs.reston01.va.comcast.net>
References: <20030120223204.GP28870@epoch.metaslash.com>
 <200301202242.h0KMgT014476@odiug.zope.com>
 <3E2C7EEB.3020209@v.loewis.de>
 <15916.33189.872024.702720@gargle.gargle.HOWL>
 <15917.58401.603261.82762@gargle.gargle.HOWL>
 <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net>
 <200301220150.h0M1oT711039@pcp02138704pcs.reston01.va.comcast.net>
 <200301221358.h0MDwbL12999@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

Guido van Rossum writes:

> Sorry, I *still* don't understand why shipping an mm_cfg.pyc (that's
> ignored because mm_cfg.py is newer) is a problem. Or is Debian
> shipping only .pyc files?

If you ship it, and the user modifies it, the .pyc file will be
outdated, and regenerated the next time mailman is run. Right? This is
a problem in itself - python *will* try to write a .pyc file, even if
one was originally distributed.

Furthermore, mm_cfg is located in /etc/mailman/mm_cfg.py. I believe,
as a policy, you should not have binary files in /etc.

Regards,
Martin


From mal@lemburg.com Wed Jan 22 15:08:44 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 22 Jan 2003 16:08:44 +0100
Subject: [Python-Dev] Adding Japanese Codecs to the distro
In-Reply-To:
References: <20030122085353.AB36.ISHIMOTO@gembook.org>
 <3E2E6492.7000600@lemburg.com>
 <20030122184345.09C0.ISHIMOTO@gembook.org>
 <3E2E8957.708@lemburg.com>
Message-ID: <3E2EB3FC.2020107@lemburg.com>

Martin v. Löwis wrote:
> "M.-A. Lemburg" writes:
>
>>I was talking about the *installed* size, ie. the size
>>of the package in site-packages:
>
> Right. And we are trying to tell you that this is irrelevant when
> talking about the size increase to be expected when JapaneseCodecs is
> incorporated into Python.

Why is it irrelevant ?
If it would be irrelevant Fredrik wouldn't have invested so much
time in trimming down the footprint of the Unicode database.

What we need is a generic approach here which works for
more than just the Japanese codecs. I believe that those
codecs could provide a good basis for more codecs from the
Asian locale, but before adding megabytes of mapping tables,
I'd prefer to settle for a good design first.

>>Hisao's approach uses a single table which fits into 58kB Python
>>source code. Boil that down to a static C table and you'll end up
>>with something around 10-20kB for static C data.
>
> How did you obtain this number?

By looking at the code. It uses Unicode literals to define the table.

>>Hisao does still builds a dictionary using this data, but perhaps
>>that step could be avoided using the same techniques that Fredrik
>>used in boiling down the size of the unicodedata module (which holds
>>the Unicode Database).
>
> Perhaps, yes. Have you studied the actual data to see whether these
> techniques might help or not?

It's just a hint: mapping tables are all about fast lookup vs. memory
consumption and that's what Fredrik's approach of decomposition does
rather well (Tamito already uses such an approach). cdb would provide
an alternative approach, but there are licensing problems...

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/


From tjreedy@udel.edu Wed Jan 22 15:24:22 2003
From: tjreedy@udel.edu (Terry Reedy)
Date: Wed, 22 Jan 2003 10:24:22 -0500
Subject: [Python-Dev] Re: disable writing .py[co]
References: <20030120223204.GP28870@epoch.metaslash.com>
 <200301202242.h0KMgT014476@odiug.zope.com>
 <3E2C7EEB.3020209@v.loewis.de>
 <15916.33189.872024.702720@gargle.gargle.HOWL>
 <15917.58401.603261.82762@gargle.gargle.HOWL>
 <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net>
 <200301220150.h0M1oT711039@pcp02138704pcs.reston01.va.comcast.net>
 <200301221358.h0MDwbL12999@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

"Martin v. Löwis" wrote in message
news:m3lm1db6pz.fsf@mira.informatik.hu-berlin.de...
> Furthermore, mm_cfg is located in /etc/mailman/mm_cfg.py. I believe,
> as a policy, you should not have binary files in /etc.

For this and similar use cases, where the behavior switch needs to be
on a per-file or group-of-files basis, a possible solution would be
another .pyx extension (why not claim them all?) such as .pyf for
py-final, meaning "this python code is the last (and only) version we
want to see on the file system -- do not write anything else -- .pyc,
.pyo, or any future version of processed code".

Terry J. Reedy


From martin@v.loewis.de Wed Jan 22 15:50:56 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 22 Jan 2003 16:50:56 +0100
Subject: [Python-Dev] Adding Japanese Codecs to the distro
In-Reply-To: <3E2EB3FC.2020107@lemburg.com>
References: <20030122085353.AB36.ISHIMOTO@gembook.org>
 <3E2E6492.7000600@lemburg.com>
 <20030122184345.09C0.ISHIMOTO@gembook.org>
 <3E2E8957.708@lemburg.com>
 <3E2EB3FC.2020107@lemburg.com>
Message-ID:

"M.-A. Lemburg" writes:

> > Right. And we are trying to tell you that this is irrelevant when
> > talking about the size increase to be expected when JapaneseCodecs is
> > incorporated into Python.
>
> Why is it irrelevant ?

Because the size increase you have reported won't be the size increase
observed if JapaneseCodecs is incorporated into Python.

> It's just a hint: mapping tables are all about fast lookup vs. memory
> consumption and that's what Fredrik's approach of decomposition does
> rather well (Tamito already uses such an approach). cdb would provide
> an alternative approach, but there are licensing problems...

The trie approach in unicodedata requires that many indices have equal
entries, and that, when grouping entries into blocks, multiple blocks
can be found. This is not the case for CJK mappings, as there is no
inherent correlation between the code points in some CJK encoding and
the equivalent Unicode code point. In Unicode, the characters have
seen Han Unification, and are sorted according to the sorting
principles of Han Unification. In other encodings, other sorting
principles have been applied, and no unification has taken place.

Insofar chunks of the encoding are more systematic, the JapaneseCodecs
package already employs algorithmic mappings, see _japanese_codecs.c,
e.g. for the mapping of ASCII, or the 0201 halfwidth characters.

Regards,
Martin


From kajiyama@grad.sccs.chukyo-u.ac.jp Wed Jan 22 16:14:04 2003
From: kajiyama@grad.sccs.chukyo-u.ac.jp (Tamito KAJIYAMA)
Date: Thu, 23 Jan 2003 01:14:04 +0900
Subject: [Python-Dev] Adding Japanese Codecs to the distro
In-Reply-To: <20030122085353.AB36.ISHIMOTO@gembook.org> (message from Atsuo
 Ishimoto on Wed, 22 Jan 2003 09:22:20 +0900)
References: <20030122085353.AB36.ISHIMOTO@gembook.org>
Message-ID: <200301221614.h0MGE4R17977@grad.sccs.chukyo-u.ac.jp>

Atsuo Ishimoto writes:
|
| (cc'ing another Tamito's mail addr. Tamito, are you wake up?)

Sorry for the late participation. Things go fast and my thought
is very slow...

I know the python-dev list is a highly technical place of
discussions, but I'd like to explain my personal situation and
related matters.

On my situation: I'm a doctoral candidate and my job has come to
a very tough period. I do want to volunteer for the great task
of incorporating JapaneseCodecs into the Python distro, but I'm
not sure that I have enough spare time to do it. I don't want
to admit I cannot do that, but it's very likely.

On the efficiency of my codecs: Honestly speaking, the
priorities with regard to time and space efficiencies during the
development of JapaneseCodecs were very low. I believe there is
much room for improvements. The set of mapping tables in the
pure Python codecs would be the very first candidate.

On Suzuki-san's codecs: I had never imagined that JapaneseCodecs
would have a competitor. I think my codecs package is a good
product, but I don't have such strong confidence that Suzuki-san
has on his work. I believe his codecs package deserves *the*
default Japanese codecs package only because of his positive
commitment among other advantages.

Anyway, I'm very glad that Atsuo has expressed his favor on my
codecs. And, thank you, Guido. I was really relieved with your
thoughtful kindness.

Regards,

--
KAJIYAMA, Tamito


From skip@pobox.com Wed Jan 22 16:33:20 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 22 Jan 2003 10:33:20 -0600
Subject: [Python-Dev] Re: disable writing .py[co]
In-Reply-To:
References: <20030120223204.GP28870@epoch.metaslash.com>
 <200301202242.h0KMgT014476@odiug.zope.com>
 <3E2C7EEB.3020209@v.loewis.de>
 <15916.33189.872024.702720@gargle.gargle.HOWL>
 <15917.58401.603261.82762@gargle.gargle.HOWL>
 <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net>
 <200301220150.h0M1oT711039@pcp02138704pcs.reston01.va.comcast.net>
 <200301221358.h0MDwbL12999@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15918.51152.476659.589596@montanaro.dyndns.org>

    Martin> Furthermore, mm_cfg is located in /etc/mailman/mm_cfg.py. I
    Martin> believe, as a policy, you should not have binary files in /etc.

I think this view is a bit extreme.
On the systems I have at hand, there
are plenty of binary files in /etc, put there for purposes similar to
Python's byte compilation - performance improvement. The example that
comes most readily to my mind is sendmail. You fiddle /etc/mail/access
and sendmail "compiles" it into a Berkeley DB hash file.

Skip


From martin@v.loewis.de Wed Jan 22 16:49:48 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 22 Jan 2003 17:49:48 +0100
Subject: [Python-Dev] Re: disable writing .py[co]
In-Reply-To: <15918.51152.476659.589596@montanaro.dyndns.org>
References: <20030120223204.GP28870@epoch.metaslash.com>
 <200301202242.h0KMgT014476@odiug.zope.com>
 <3E2C7EEB.3020209@v.loewis.de>
 <15916.33189.872024.702720@gargle.gargle.HOWL>
 <15917.58401.603261.82762@gargle.gargle.HOWL>
 <200301220101.h0M110M10543@pcp02138704pcs.reston01.va.comcast.net>
 <200301220150.h0M1oT711039@pcp02138704pcs.reston01.va.comcast.net>
 <200301221358.h0MDwbL12999@pcp02138704pcs.reston01.va.comcast.net>
 <15918.51152.476659.589596@montanaro.dyndns.org>
Message-ID:

Skip Montanaro writes:

> Martin> Furthermore, mm_cfg is located in /etc/mailman/mm_cfg.py. I
> Martin> believe, as a policy, you should not have binary files in /etc.
>
> I think this view is a bit extreme.

The classification of this view might be irrelevant, though, for the
issue at hand. The original reporter of the bug was using a
LIDS-enhanced Linux kernel, which, in itself, might be a bit extreme.
In any case, this kernel produces diagnostics, and he wants it to stop
producing those diagnostics, by changing the software that triggers
them.

> You fiddle /etc/mail/access and
> sendmail "compiles" it into a Berkeley DB hash file.

Sure. Many people don't have a problem with that. Some would go and
change sendmail so that it places its Berkeley DB files into /var.

Regards,
Martin


From guido@python.org Wed Jan 22 17:24:21 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 22 Jan 2003 12:24:21 -0500
Subject: [Python-Dev] Extended Function syntax
Message-ID: <200301221724.h0MHOLk13779@pcp02138704pcs.reston01.va.comcast.net>

A while ago there was a proposal floating around to add an optional
part to function/method definitions, that would replace the current
clumsy classmethod etc. notation, and could be used for other purposes
too. I think the final proposal looked like this:

   def name(arg, ...) [expr, ...]:
       ...body...

Does anyone remember or know where to find the thread where this
proposal was discussed? It ought to be turned into a PEP.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de Wed Jan 22 17:32:30 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 22 Jan 2003 18:32:30 +0100
Subject: [Python-Dev] Extended Function syntax
In-Reply-To: <200301221724.h0MHOLk13779@pcp02138704pcs.reston01.va.comcast.net>
References: <200301221724.h0MHOLk13779@pcp02138704pcs.reston01.va.comcast.net>
Message-ID:

Guido van Rossum writes:

> I think the final proposal looked like this:
>
>    def name(arg, ...) [expr, ...]:
>        ...body...
>

I don't think there was much discussion. The suggested semantics was
that this is equivalent to

   def name(arg, ...):
       ...body...
   name=expr(name)
   ...

I *think* there was discussion as to the namespace in which expr is
evaluated, or whether certain identifiers have keyword or predefined
meaning (so you can write 'static' instead of 'staticmethod').

I don't think there was ever a complete analysis whether this syntax
meets all requirements, i.e. whether you could use it for all newstyle
features. In particular, I don't recall what the proposal was how
properties should be spelled.
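
The equivalence described above can be tried out with the existing
def-then-rebind spelling. What follows is only a minimal sketch, not
part of the original proposal: the class Account and its methods are
hypothetical names chosen for illustration, and the bracketed form
appears only in comments because it is not valid Python.

```python
class Account:
    """Hypothetical class used only to illustrate the rebinding idiom."""

    def __init__(self, balance):
        self.balance = balance

    # Proposed spelling (not valid Python):
    #     def empty(cls) [classmethod]:
    def empty(cls):
        return cls(0)
    empty = classmethod(empty)      # name = expr(name)

    # Proposed spelling (not valid Python):
    #     def describe(balance) [staticmethod]:
    def describe(balance):
        return "balance: %d" % balance
    describe = staticmethod(describe)   # name = expr(name)


print(Account.empty().balance)
print(Account.describe(5))
```

The rebinding happens inside the class body, which is one reason the
question of the namespace in which expr is evaluated matters.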

Regards,
Martin


From mwh@python.net Wed Jan 22 17:50:13 2003
From: mwh@python.net (Michael Hudson)
Date: 22 Jan 2003 17:50:13 +0000
Subject: [Python-Dev] Extended Function syntax
In-Reply-To: Guido van Rossum's message of "Wed, 22 Jan 2003 12:24:21 -0500"
References: <200301221724.h0MHOLk13779@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2mptqpkrai.fsf@starship.python.net>

Guido van Rossum writes:

> A while ago there was a proposal floating around to add an optional
> part to function/method definitions, that would replace the current
> clumsy classmethod etc. notation, and could be used for other purposes
> too. I think the final proposal looked like this:
>
>    def name(arg, ...) [expr, ...]:
>        ...body...

That was the one I came up with.

> Does anyone remember or know where to find the thread where this
> proposal was discussed? It ought to be turned into a PEP.

I think it was here on python-dev. I was going to turn it into a PEP,
but thought I'd wait until 2.3 was done.

Discussion here:

http://mail.python.org/pipermail/python-dev/2002-February/020005.html

The patch linked to in that mail still applies, remarkably enough; I
haven't tested whether it *works* recently...

Cheers,
M.

--
  Now this is what I don't get. Nobody said absolutely anything
  bad about anything. Yet it is always possible to just pull
  random flames out of ones ass.
         -- http://www.advogato.org/person/vicious/diary.html?start=60


From jim@zope.com Wed Jan 22 18:19:33 2003
From: jim@zope.com (Jim Fulton)
Date: Wed, 22 Jan 2003 13:19:33 -0500
Subject: [Python-Dev] Extended Function syntax
In-Reply-To:
References: <200301221724.h0MHOLk13779@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E2EE0B5.6040802@zope.com>

Martin v. Löwis wrote:
> Guido van Rossum writes:
>
>
>>I think the final proposal looked like this:
>>
>>   def name(arg, ...) [expr, ...]:
>>       ...body...
>>
>
>
> I don't think there was much discussion.
> The suggested semantics was
> that this is equivalent to
>
>    def name(arg, ...):
>        ...body...
>    name=expr(name)
>    ...

In particular:

   def name(arg, ...) [expr1, expr2, expr3]:
       ...body...

would be equivalent to (some variation on):

   def name(arg, ...):
       ...body...

   name=expr1(expr2(expr3(name)))

I wonder if the same mechanism could be used in class statements.
If I had this, I might say what interface a class implements
with:

   class foo(spam, eggs) [implements(IFoo, IBar)]:
       body of class

or (wo parens):

   class foo [implements(IFoo, IBar)]:
       body of class

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From nas@python.ca Wed Jan 22 19:37:40 2003
From: nas@python.ca (Neil Schemenauer)
Date: Wed, 22 Jan 2003 11:37:40 -0800
Subject: [Python-Dev] Extended Function syntax
In-Reply-To: <200301221724.h0MHOLk13779@pcp02138704pcs.reston01.va.comcast.net>
References: <200301221724.h0MHOLk13779@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030122193740.GA19418@glacier.arctrix.com>

Guido van Rossum wrote:
> I think the final proposal looked like this:
>
>    def name(arg, ...) [expr, ...]:
>        ...body...

I thought it was:

   def name [expr, ...] (arg, ...):
       ...body...

That's the extension we are using for PTL.

  Neil


From skip@pobox.com Wed Jan 22 20:26:15 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 22 Jan 2003 14:26:15 -0600
Subject: [Python-Dev] Proto-PEP regarding writing bytecode files
Message-ID: <15918.65127.667962.533847@montanaro.dyndns.org>

Folks,

Here's a first stab at a PEP about controlling generation of bytecode
files. Feedback appreciated.

Skip

----------------------------------------------------------------------------
PEP: NNN
Title: Controlling generation of bytecode files
Version: $Revision: $
Last-Modified: $Date: $
Author: Skip Montanaro
Status: Active
Type: Draft
Content-Type: text/x-rst
Created: 22-Jan-2003
Post-History:


Abstract
========

This PEP outlines a mechanism for controlling the generation and
location of compiled Python bytecode files. This idea originally
arose as a patch request [1]_ and evolved into a discussion thread on
the python-dev mailing list [2]_. The introduction of an environment
variable will allow people installing Python or Python-based
third-party packages to control whether or not bytecode files should
be generated, and if so, where they should be written.


Proposal
========

Add a new environment variable, PYCROOT, to the mix of environment
variables which Python understands. Its interpretation is:

- If not present or present but with an empty string value, Python
  bytecode is generated in exactly the same way as is currently done.

- If present and it refers to an existing directory, bytecode files
  are written into a directory structure rooted at that location.

- If present and it does not refer to an existing directory,
  generation of bytecode files is suppressed altogether.

sys.path is not modified.

If PYCROOT is set and valid, during module lookup, the bytecode file
will be looked for first in the same directory as the source file,
then in the directory formed by prefixing the source file's directory
with the PYCROOT directory, e.g., in a Unix environment:

    os.path.join(os.environ["PYCROOT"], os.path.split(sourcefile)[0])

(Under Windows the above operation, while conceptually similar, will
almost certainly differ in detail.)


Rationale
=========

In many environments it is not possible for non-root users to write
into the directory containing the source file. Most of the time, this
is not a problem except for reduced performance.
In some cases it can
be an annoyance, if nothing else. [3]_ In other situations where
bytecode files are writable, it can be a source of file corruption if
multiple processes attempt to write the same bytecode file at the same
time. [4]_

In environments with ramdisks available, it may be desirable from a
performance standpoint to write bytecode files to a directory on such
a disk.


Alternatives
============

The only other alternative proposed so far [1]_ seems to be to add a
-R flag to the interpreter to disable writing bytecode files
altogether. This proposal subsumes that.


Issues
======

- When looking for a bytecode file should the directory holding the
  source file be considered as well, or just the location implied by
  PYCROOT? If so, which should be searched first? It seems to me
  that if a module lives in /usr/local/lib/python2.3/mod.py and was
  installed by root without PYCROOT set, you'd want to use the
  bytecode file there if it was up-to-date without ever considering
  os.environ["PYCROOT"] + "/usr/local/lib/python2.3/". Only if you
  need to write out a bytecode file would anything turn up there.

- Operation on multi-root file systems (e.g., Windows). On Windows
  each drive is fairly independent. If PYCROOT is set to C:\TEMP and
  a module is located in D:\PYTHON22\mod.py, where should the bytecode
  file be written? I think a scheme similar to what Cygwin uses
  (treat drive letters more-or-less as directory names) would work in
  practice, but I have no direct experience to draw on. The above
  might cause C:\TEMP\D\PYTHON22\mod.pyc to be written.

  What if PYCROOT doesn't include a drive letter? Perhaps the current
  drive at startup should be assumed.

- Interpretation of a module's __file__ attribute. I believe the
  __file__ attribute of a module should reflect the true location of
  the bytecode file. If people want to locate a module's source code,
  they should use imp.find_module(module).

- Security - What if root has PYCROOT set?
  Yes, this can present a
  security risk, but so can many things the root user does. The root
  user should probably not set PYCROOT except during installation.
  Still, perhaps this problem can be minimized. When running as root
  the interpreter should check to see if PYCROOT refers to a
  world-writable directory. If so, it could raise an exception or
  warning and reset PYCROOT to the empty string. Or, see the next
  item.

- More security - What if PYCROOT refers to a general directory (say,
  /tmp)? In this case, perhaps loading of a preexisting bytecode file
  should occur only if the file is owned by the current user or root.
  (Does this matter on Windows?)

- Runtime control - should there be a variable in sys (say,
  sys.pycroot) which takes on the value of PYCROOT (or an empty string
  or None) and which is modifiable on-the-fly? Should sys.pycroot be
  initialized from PYCROOT and then PYCROOT ignored (that is, what if
  they differ)?

- Should there be a command-line flag for the interpreter instead of
  or in addition to an environment variable? This seems like it would
  be less flexible. During Python installation, the user frequently
  doesn't have ready access to the interpreter command line. Using an
  environment variable makes it easier to control behavior.

- Should PYCROOT be interpreted differently during installation than
  at runtime? I have no idea. (Maybe it's just a stupid thought, but
  the thought occurred to me, so I thought I'd mention it.)


Examples
========

In all the examples which follow, the urllib module is used as an
example. Unless otherwise indicated, it lives in
/usr/local/lib/python2.3/urllib.py and /usr/local/lib/python2.3 is not
writable by the current, non-root user.

- PYCROOT is set to /tmp. /usr/local/lib/python2.3/urllib.pyc exists,
  but is out-of-date. When urllib is imported, the generated bytecode
  file is written to /tmp/usr/local/lib/python2.3/urllib.pyc.
  Intermediate directories will be created as needed.

- PYCROOT is not set. No urllib.pyc file is found.
  When urllib is
  imported, no bytecode file is written.

- PYCROOT is set to /tmp. No urllib.pyc file is found. When urllib
  is imported, the generated bytecode file is written to
  /tmp/usr/local/lib/python2.3/urllib.pyc, again, creating
  intermediate directories as needed.


References
==========

.. [1] patch 602345, Option for not writing .py[co] files, Klose
   (http://www.python.org/sf/602345)

.. [2] python-dev thread, Disable writing .py[co], Norwitz
   (http://mail.python.org/pipermail/python-dev/2003-January/032270.html)

.. [3] Debian bug report, Mailman is writing to /usr in cron, Wegner
   (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=96111)

.. [4] python-dev thread, Parallel pyc construction, Dubois
   (http://mail.python.org/pipermail/python-dev/2003-January/032060.html)


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:


From mal@lemburg.com Wed Jan 22 20:36:17 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 22 Jan 2003 21:36:17 +0100
Subject: [Python-Dev] Adding Japanese Codecs to the distro
In-Reply-To: <200301221614.h0MGE4R17977@grad.sccs.chukyo-u.ac.jp>
References: <20030122085353.AB36.ISHIMOTO@gembook.org>
 <200301221614.h0MGE4R17977@grad.sccs.chukyo-u.ac.jp>
Message-ID: <3E2F00C1.7060905@lemburg.com>

Tamito KAJIYAMA wrote:
> Atsuo Ishimoto writes:
> |
> | (cc'ing another Tamito's mail addr. Tamito, are you wake up?)
>
> Sorry for the late participation. Things go fast and my thought
> is very slow...

Thanks for joining in. I had hoped to hear a word from you on
the subject :-)

> I know the python-dev list is a highly technical place of
> discussions, but I'd like to explain my personal situation and
> related matters.
>
> On my situation: I'm a doctoral candidate and my job has come to
> a very tough period.
> I do want to volunteer for the great task
> of incorporating JapaneseCodecs into the Python distro, but I'm
> not sure that I have enough spare time to do it. I don't want
> to admit I cannot do that, but it's very likely.
>
> On the efficiency of my codecs: Honestly speaking, the
> priorities with regard to time and space efficiencies during the
> development of JapaneseCodecs were very low. I believe there is
> much room for improvements. The set of mapping tables in the
> pure Python codecs would be the very first candidate.

Ok, how about this: we include the C versions of your codecs
in the distribution and you take over maintenance as soon
as time permits.

Still, I'd love to see some further improvement of the
size and performance of the codecs (and maybe support for the
new error callbacks; something which Hisao has integrated
into his codecs).

Would it be possible for you two to team up for the further
development of the Japanese codecs ?

Perhaps Hye-Shik Chang could join you in the effort, since he's
the author of the KoreanCodecs package which has somewhat
similar problem scope (that of stateful encodings with a huge
number of mappings) ?

Thanks,
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/


From bac@OCF.Berkeley.EDU Wed Jan 22 20:52:46 2003
From: bac@OCF.Berkeley.EDU (Brett Cannon)
Date: Wed, 22 Jan 2003 12:52:46 -0800 (PST)
Subject: [Python-Dev] Proto-PEP regarding writing bytecode files
In-Reply-To: <15918.65127.667962.533847@montanaro.dyndns.org>
References: <15918.65127.667962.533847@montanaro.dyndns.org>
Message-ID:

[Skip Montanaro]
> Folks,
>
> Here's a first stab at a PEP about controlling generation of bytecode
> files. Feedback appreciated.
>

Now I don't need to worry as much about summarizing this thread. =)

> Issues
> ======
>
> - When looking for a bytecode file should the directory holding the
> source file be considered as well, or just the location implied by
> PYCROOT? If so, which should be searched first? It seems to me
> that if a module lives in /usr/local/lib/python2.3/mod.py and was
> installed by root without PYCROOT set, you'd want to use the
> bytecode file there if it was up-to-date without ever considering
> os.environ["PYCROOT"] + "/usr/local/lib/python2.3/". Only if you
> need to write out a bytecode file would anything turn up there.
>

In other words you are wondering about the situation of where root
installs Python (and thus generates .pyo files in the install
directory) but PYCROOT is set later on. Seems reasonable to check the
holding directory, just don't know how often this situation will come
up.

But if you are going to all the trouble of having a separate place to
keep your byte-compiled files, shouldn't you keep *all* of the
byte-compiled files you want to use together? Since you are storing
them it is a one-time compile and thus not that big of a deal.

> - Operation on multi-root file systems (e.g., Windows). On Windows
> each drive is fairly independent. If PYCROOT is set to C:\TEMP and
> a module is located in D:\PYTHON22\mod.py, where should the bytecode
> file be written? I think a scheme similar to what Cygwin uses
> (treat drive letters more-or-less as directory names) would work in
> practice, but I have no direct experience to draw on. The above
> might cause C:\TEMP\D\PYTHON22\mod.pyc to be written.
>

Using the drive letter as a directory name seems like a good solution.

> What if PYCROOT doesn't include a drive letter? Perhaps the current
> drive at startup should be assumed.
>

Probably should.

> - Interpretation of a module's __file__ attribute. I believe the
> __file__ attribute of a module should reflect the true location of
> the bytecode file.
> If people want to locate a module's source code,
> they should use imp.find_module(module).
>
> - Security - What if root has PYCROOT set? Yes, this can present a
> security risk, but so can many things the root user does. The root
> user should probably not set PYCROOT except during installation.
> Still, perhaps this problem can be minimized. When running as root
> the interpreter should check to see if PYCROOT refers to a
> world-writable directory. If so, it could raise an exception or
> warning and reset PYCROOT to the empty string. Or, see the next
> item.
>
> - More security - What if PYCROOT refers to a general directory (say,
> /tmp)? In this case, perhaps loading of a preexisting bytecode file
> should occur only if the file is owned by the current user or root.
> (Does this matter on Windows?)
>

To comment on both security issues: if someone is worrying about
security they should just turn byte-compiling off completely.

> - Runtime control - should there be a variable in sys (say,
> sys.pycroot) which takes on the value of PYCROOT (or an empty string
> or None) and which is modifiable on-the-fly? Should sys.pycroot be
> initialized from PYCROOT and then PYCROOT ignored (that is, what if
> they differ)?
>

I think this is YAGNI (hey, I am using the abbreviations I learned from
the list; I'm learning, I'm learning! =). Why the heck would your
needs for compiling bytecode files change while running a program?
Just turn off the compiling if you think you are not going to need it.

> - Should there be a command-line flag for the interpreter instead of
> or in addition to an environment variable? This seems like it would
> be less flexible. During Python installation, the user frequently
> doesn't have ready access to the interpreter command line. Using an
> environment variable makes it easier to control behavior.
>

I think it should be one or the other but not both.
If the feature of having PYCROOT is worth it then there is no need to deal with a command-line option *unless* having a different setting for different interpreter invocations under the same user is considered more important. In that case go with the command-line and ditch the environment variable. > - Should PYCROOT be interpreted differently during installation than > at runtime? I have no idea. (Maybe it's just a stupid thought, but > the thought occurred to me, so I thought I'd mention it.) > No. Should be the same. -Brett From martin@v.loewis.de Wed Jan 22 21:53:24 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Wed, 22 Jan 2003 22:53:24 +0100 Subject: [Python-Dev] Proto-PEP regarding writing bytecode files In-Reply-To: <15918.65127.667962.533847@montanaro.dyndns.org> References: <15918.65127.667962.533847@montanaro.dyndns.org> Message-ID: <3E2F12D4.5020404@v.loewis.de> Skip Montanaro wrote: > This PEP outlines a mechanism for controlling the generation and > location of compiled Python bytecode files. [...] > Add a new environment variable, PYCROOT, to the mix of environment > variables which Python understands. Its interpretation is: I believe this is currently underspecified: It only talks about where .pyc files are written. Where are they read from? Any answer to that question should take into account that there might be existing .pyc files from a compileall run. Regards, Martin From skip@pobox.com Wed Jan 22 21:57:34 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 22 Jan 2003 15:57:34 -0600 Subject: [Python-Dev] Proto-PEP regarding writing bytecode files In-Reply-To: References: <15918.65127.667962.533847@montanaro.dyndns.org> Message-ID: <15919.5070.6621.177402@montanaro.dyndns.org> >>>>> "Brett" == Brett Cannon writes: Brett> [Skip Montanaro] >> Folks, >> >> Here's a first stab at a PEP about controlling generation of bytecode >> files. Feedback appreciated.
>> Brett> Now I don't need to worry as much about summarizing this thread. =) Brett> >> Issues >> ====== >> >> - When looking for a bytecode file should the directory holding the >> source file be considered as well, or just the location implied by >> PYCROOT? If so, which should be searched first? It seems to me that >> if a module lives in /usr/local/lib/python2.3/mod.py and was >> installed by root without PYCROOT set, you'd want to use the bytecode >> file there if it was up-to-date without ever considering >> os.environ["PYCROOT"] + "/usr/local/lib/python2.3/". Only if you >> need to write out a bytecode file would anything turn up there. >> Brett> In other words you are wondering about the situation of where Brett> root installs Python (and thus generates .pyo files in the Brett> install directory) but PYCROOT is set later on. Seems reasonable Brett> to check the holding directory, just don't know how often this Brett> situation will come up. But if you are going to all the trouble Brett> of having a separate place to keep your byte-compiled files, Brett> shouldn't you keep *all* of the byte-compiled files you want to Brett> use together? Since you are storing them it is a one-time Brett> compile and thus not that big of a deal. Here's a situation to consider: Shared .py files outside the normal distribution are stored in a read-only directory without .pyc's. Each user might set PYCROOT to $HOME/tmp/Python-N.M. A single version of those files could be safely shared by multiple installed versions of Python. You might always search the directory with the .py file, then the private repository. Or did I misunderstand what you were getting at? >> - Runtime control - should there be a variable in sys (say, >> sys.pycroot) ... Brett> Why the heck would your needs for compiling bytecode files change Brett> while running a program? I don't know, but there is a compile() builtin, so people might want to control its behavior. 
Maybe pychecker wants to compile modules without dropping .pyc files on the disk which it needs to clean up later, even in the face of the user's PYCROOT setting. It could simply set sys.pycroot at startup. Of course, it could also putenv("PYCROOT", "/dev/null") as well. I guess it's mostly a matter of convenience. (Also, sys.pycroot won't affect forked subprocesses.) Skip From martin@v.loewis.de Wed Jan 22 22:07:10 2003 From: martin@v.loewis.de ("Martin v. Löwis") Date: Wed, 22 Jan 2003 23:07:10 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E2F00C1.7060905@lemburg.com> References: <20030122085353.AB36.ISHIMOTO@gembook.org> <200301221614.h0MGE4R17977@grad.sccs.chukyo-u.ac.jp> <3E2F00C1.7060905@lemburg.com> Message-ID: <3E2F160E.5030201@v.loewis.de> M.-A. Lemburg wrote: > Perhaps Hye-Shik Chang could join you in the effort, since he's > the author of the KoreanCodecs package which has somewhat > similar problem scope (that of stateful encodings with a huge > number of mappings) ? I believe (without checking in detail) that the "statefulness" is also an issue in these codecs. Many of the CJK encodings aren't stateful beyond being multi-byte (except for the iso-2022 ones). IOW, there is a non-trivial state only if you process the input byte-for-byte: you have to know whether you are at the first or second byte (and what the first byte was if you are at the second byte). AFAICT, both Japanese codecs assume that you can always look at the second byte when you get the first byte. Of course, this assumption is wrong if you operate in a stream mode, and read the data in, say, chunks of 1024 bytes: such a chunk may split exactly between a first and second byte (*). In these cases, I believe, both codecs would give incorrect results. Please correct me if I'm wrong. Regards, Martin (*) The situation is worse for GB 18030, which also has 4-byte encodings.
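
The chunk-boundary problem described above can be sketched in present-day Python. This is only an illustration under stated assumptions: it uses Python 3 and the standard library's built-in "euc_jp" codec together with codecs.getincrementaldecoder(), which was added to Python after this thread, so it says nothing about how the JapaneseCodecs package itself behaved. The sample text and the split offset are made up for the demonstration.

```python
import codecs

# Sample text (an assumption for the demo); every character here
# occupies two bytes in EUC-JP.
text = "日本語のテキスト"
data = text.encode("euc_jp")

# Split at an odd offset, i.e. between the first and second byte of
# a two-byte character, as a fixed-size read might do.
chunks = [data[:3], data[3:]]


def decode_naively(parts):
    """Decode every chunk on its own; breaks when a chunk ends mid-character."""
    return "".join(part.decode("euc_jp") for part in parts)


def decode_incrementally(parts):
    """Let the decoder remember a pending lead byte between chunks."""
    decoder = codecs.getincrementaldecoder("euc_jp")()
    pieces = [decoder.decode(part) for part in parts]
    # final=True flushes the decoder and raises if the input was truncated.
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)


try:
    decode_naively(chunks)
    naive_ok = True
except UnicodeDecodeError:
    naive_ok = False

result = decode_incrementally(chunks)
```

The only state the incremental decoder has to keep here is the dangling first byte, which matches the "non-trivial state only if you process the input byte-for-byte" observation.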
From bac@OCF.Berkeley.EDU Wed Jan 22 22:24:36 2003 From: bac@OCF.Berkeley.EDU (Brett Cannon) Date: Wed, 22 Jan 2003 14:24:36 -0800 (PST) Subject: [Python-Dev] Proto-PEP regarding writing bytecode files In-Reply-To: <15919.5070.6621.177402@montanaro.dyndns.org> References: <15918.65127.667962.533847@montanaro.dyndns.org> <15919.5070.6621.177402@montanaro.dyndns.org> Message-ID: [Skip Montanaro] > Brett> In other words you are wondering about the situation of where > Brett> root installs Python (and thus generates .pyo files in the > Brett> install directory) but PYCROOT is set later on. Seems reasonable > Brett> to check the holding directory, just don't know how often this > Brett> situation will come up. But if you are going to all the trouble > Brett> of having a separate place to keep your byte-compiled files, > Brett> shouldn't you keep *all* of the byte-compiled files you want to > Brett> use together? Since you are storing them it is a one-time > Brett> compile and thus not that big of a deal. > > Here's a situation to consider: Shared .py files outside the normal > distribution are stored in a read-only directory without .pyc's. Each user > might set PYCROOT to $HOME/tmp/Python-N.M. A single version of those files > could be safely shared by multiple installed versions of Python. You might > always search the directory with the .py file, then the private repository. > OK, but how are they going to specify that universal read-only directory? Is it going to be set in PYTHONPATH? If so, then you still don't need to check the directory where the module exists for a .pyc file since it was specified in such a way that checking there will be fruitless. Same goes for if you set this read-only directory in sys.path. > Or did I misunderstand what you were getting at? > No, I think you got it. > >> - Runtime control - should there be a variable in sys (say, > >> sys.pycroot) ... 
> > Brett> Why the heck would your needs for compiling bytecode files change > Brett> while running a program? > > I don't know, but there is a compile() builtin, so people might want to > control its behavior. Maybe pychecker wants to compile modules without > dropping .pyc files on the disk which it needs to clean up later, even in > the face of the user's PYCROOT setting. It could simply set sys.pycroot at > startup. Of course, it could also putenv("PYCROOT", "/dev/null") as well. > I guess it's mostly a matter of convenience. (Also, sys.pycroot won't > affect forked subprocesses.) > I agree that it is a matter of convenience and I just don't see it being enough of one. -0 vote from me on this feature. -Brett From kajiyama@grad.sccs.chukyo-u.ac.jp Wed Jan 22 22:23:50 2003 From: kajiyama@grad.sccs.chukyo-u.ac.jp (Tamito KAJIYAMA) Date: Thu, 23 Jan 2003 07:23:50 +0900 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E2F00C1.7060905@lemburg.com> (mal@lemburg.com) References: <200301221632.h0MGWCI18042@grad.sccs.chukyo-u.ac.jp> Message-ID: <200301222223.h0MMNoH18983@grad.sccs.chukyo-u.ac.jp> "M.-A. Lemburg" writes: | | Ok, how about this: we include the C versions of your codecs | in the distribution and you take over maintenance as soon | as time permits. I agree. -- KAJIYAMA, Tamito From kajiyama@grad.sccs.chukyo-u.ac.jp Wed Jan 22 22:50:24 2003 From: kajiyama@grad.sccs.chukyo-u.ac.jp (Tamito KAJIYAMA) Date: Thu, 23 Jan 2003 07:50:24 +0900 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E2F160E.5030201@v.loewis.de> (message from "Martin v. Löwis" on Wed, 22 Jan 2003 23:07:10 +0100) References: <200301221632.h0MGWCI18042@grad.sccs.chukyo-u.ac.jp> Message-ID: <200301222250.h0MMoOI19022@grad.sccs.chukyo-u.ac.jp> "Martin v. Löwis" writes: | | M.-A.
Lemburg wrote: | > Perhaps Hye-Shik Chang could join you in the effort, since he's | > the author of the KoreanCodecs package which has somewhat | > similar problem scope (that of stateful encodings with a huge | > number of mappings) ? | | I believe (without checking in detail) that the "statefulness" is also | an issue in these codecs. | | Many of the CJK encodings aren't stateful beyond being multi-byte | (except for the iso-2022 ones). IOW, there is a non-trivial state only | if you process the input byte-for-byte: you have to know whether you are | at the first or second byte (and what the first byte was if you are at | the second byte). AFAICT, both Japanese codecs assume that you can | always look at the second byte when you get the first byte. Right, as far as my codecs are concerned. All decoders in the JapaneseCodecs package assume that the input byte sequence does not end in the middle of a multi-byte character. The iso-2022 decoders even assume that the input sequence is a "valid" text as defined in RFC1468 (i.e. the text must end in the US ASCII mode). However, AFAIK, these assumptions in the decoders seem well-accepted in real-world applications. The StreamReader/Writer classes in JapaneseCodecs can cope with the statefulness, BTW. -- KAJIYAMA, Tamito From tdelaney@avaya.com Wed Jan 22 22:54:16 2003 From: tdelaney@avaya.com (Delaney, Timothy) Date: Thu, 23 Jan 2003 09:54:16 +1100 Subject: [Python-Dev] Proto-PEP regarding writing bytecode files Message-ID: > From: Skip Montanaro [mailto:skip@pobox.com] > > Add a new environment variable, PYCROOT, to the mix of environment > variables which Python understands. Its interpretation is: > > - If not present or present but with an empty string value, Python > bytecode is generated in exactly the same way as is currently done. > > - If present and it refers to an existing directory, bytecode > files are written into a directory structure rooted at that > location.
> > - If present and it does not refer to an existing directory, > generation of bytecode files is suppressed altogether. I think this is wrong behaviour. IMO it should be as follows: - If not present, Python bytecode is generated in exactly the same way as is currently done. - If present and it refers to an existing directory, bytecode files are written into a directory structure rooted at that location. - If present but empty, generation of bytecode files is suppressed altogether. - If present and it does not refer to an existing directory, a warning is displayed and generation of bytecode files is suppressed altogether. My reasoning is as follows: 1. To suppress bytecode generation, you should not have to choose an invalid directory. Suppressing bytecode should definitely be a positive action (setting PYCROOT), but should not have false information associated with it (setting PYCROOT to non-empty). 2. If the specified non-existing directory ever starts existing, bytecode would begin being written to it. This would at the very least put "garbage" into that directory, and could potentially cause all kinds of errors. > - When looking for a bytecode file should the directory holding the > source file be considered as well, or just the location implied by > PYCROOT? If so, which should be searched first? It seems to me > that if a module lives in /usr/local/lib/python2.3/mod.py and was > installed by root without PYCROOT set, you'd want to use the > bytecode file there if it was up-to-date without ever considering > os.environ["PYCROOT"] + "/usr/local/lib/python2.3/". Only if you > need to write out a bytecode file would anything turn up there. I think it should always use PYCROOT.
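
The four cases listed earlier in this message (unset, empty, existing directory, non-existing directory) can be written down as a small decision function. This is a rough sketch only: PYCROOT was a proposal under discussion, and the function name bytecode_policy and the action labels it returns are invented here for illustration, not taken from the proto-PEP or from any Python release.

```python
import os
import warnings


def bytecode_policy(environ):
    """Map an environment mapping to a (hypothetical) (action, directory) pair."""
    if "PYCROOT" not in environ:
        # Not present: behave exactly as before.
        return ("default", None)
    value = environ["PYCROOT"]
    if value == "":
        # Present but empty: an explicit request to write no bytecode.
        return ("suppress", None)
    if os.path.isdir(value):
        # Present and valid: write bytecode under this root.
        return ("redirect", value)
    # Present but not an existing directory: warn, then write nothing.
    warnings.warn("PYCROOT does not refer to an existing directory: %r" % value)
    return ("suppress", None)
```

Passing the mapping in as an argument, rather than reading os.environ directly, keeps the sketch easy to exercise with different settings.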
Tim Delaney From goodger@python.org Wed Jan 22 22:57:58 2003 From: goodger@python.org (David Goodger) Date: Wed, 22 Jan 2003 17:57:58 -0500 Subject: [Python-Dev] Extended Function syntax In-Reply-To: <200301221724.h0MHOLk13779@pcp02138704pcs.reston01.va.comcast.net> Message-ID: Guido van Rossum wrote: > A while ago there was a proposal floating around to add an optional > part to function/method definitions, that would replace the current > clumsy classmethod etc. notation, and could be used for other purposes > too. I think the final proposal looked like this: > > def name(arg, ...) [expr, ...]: > ...body... > > Does anyone remember or know where to find the thread where this > proposal was discussed? Some other interesting related posts: - http://mail.python.org/pipermail/python-list/2001-July/056224.html - http://mail.python.org/pipermail/python-list/2001-July/056416.html - http://mail.python.org/pipermail/python-dev/2001-July/016287.html > It ought to be turned into a PEP. John Williams sent a very rough candidate PEP in October that was interesting (below). I sent back some suggestions (below the PEP), but I haven't received anything back yet. Perhaps a joint PEP with syntax alternatives? PEP: XXX Title: Multiword Method Names Version: $Revision:$ Last-Modified: $Date: 2002/10/09 21:11:59 $ Author: John Williams Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 09-Oct-2002 Post-History: Python-Version: 2.3 Abstract ======== This PEP proposes allowing the space character in method names. Motivation ========== With the addition of static and class methods to Python, there are now three distinct kinds of methods. All are declared using the same syntax, and static and class methods are indicated by the use of the built-in "wrapper" types staticmethod and classmethod. This technique is unfortunate because it requires two statements to define a single method and requires the method name to appear three times. 
These problems could be solved by allowing the space character in method names and adding metaclass support for methods with certain names, in effect allowing arbitrary pseudo-keywords to be added to method declarations.

Specification
=============

The core of this proposal is to allow an arbitrary sequence of identifiers and keywords to appear between the keyword "def" and the opening parenthesis of the function declaration. The result would be identical to existing "def" statements, except that the name of the new function would consist of the words joined together with space characters. Although no syntax exists for working with variables whose names contain spaces, the name would be accessible through explicit use of the underlying variable dictionaries.

Using the new syntax along with special metaclass support, properties and static and class methods could be declared like this::

    class MyClass(object):
        def class myClassMethod(cls): . . .
        def static myStaticMethod(): . . .
        def get myProperty(self): . . .
        def set myProperty(self, value): . . .

The declaration above would be equivalent to::

    class MyClass(object):
        def myClassMethod(cls): . . .
        myClassMethod = classmethod(myClassMethod)
        def myStaticMethod(): . . .
        myStaticMethod = staticmethod(myStaticMethod)
        def __getMyProperty(self): . . .
        def __setMyProperty(self, value): . . .
        myProperty = property(__getMyProperty, __setMyProperty)

Copyright
=========

This document has been placed in the public domain.

Here's my reply to John:

This PEP proposal is interesting, although the idea has come up before (not a bad thing; see references below). The first thing I notice is that the title is misleading. The title and abstract ("This PEP proposes allowing the space character in method names") make me think you're proposing that this kind of declaration would be legal and somehow callable::

    def a multi word method name(): pass

This would be too large a syntax change IMO.
I think the "multiword" aspect and "allowing spaces" are merely side-effects of what the PEP is proposing. They're implementation details. The real issue is the lack of a unifying syntax for descriptors. The text of the PEP is looking at the issue from the wrong direction IMO: from the bottom up (implementation) instead of from the top down (concept). I haven't wrapped my head around descriptors yet, and I'm not familiar with Python's internals. I have no idea if what you're proposing is even feasible. But here are some ideas and suggestions for the PEP: * I think the PEP should be called "Syntax for Descriptors". (I also thought of "Pseudo-Keywords in Method Declarations", but it's not as good a title.) * Rather than "pseudo-keywords", what about calling these tokens "modifiers"? * Perhaps metaclasses could grow a mechanism to add new pseudo-keywords of their own? Maybe not right away. When someone comes up with a novel use for descriptors, it would be nice to be able to implement it with syntax. * Expand the examples to include the "delete" property method. (I guess the modifier can't be "del", since that's already a bona-fide keyword. But then again...) * It may not be feasible to use "class" as a modifier, since it's already a keyword. Perhaps "classmethod", as suggested in one of the threads below. * Add some blanks to the examples to make them easier to read. The same idea was brought up back in July 2001. See the following threads: - http://mail.python.org/pipermail/python-list/2001-July/056224.html (A reply from Guido: http://mail.python.org/pipermail/python-list/2001-July/056416.html) - http://mail.python.org/pipermail/python-dev/2001-July/016287.html There may have been other similar discussions. Guido's reply is discouraging, but he may have changed his mind since. Having a PEP clearly proposing the syntax change would be useful even if it's rejected. 
I would recommend you revise the PEP and send it to Python-Dev to gather feedback before resubmitting it for a PEP number. -- David Goodger Python Enhancement Proposal (PEP) Editor (Please cc: all PEP correspondence to .) From guido@python.org Wed Jan 22 23:03:48 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 22 Jan 2003 18:03:48 -0500 Subject: [Python-Dev] Proto-PEP regarding writing bytecode files In-Reply-To: Your message of "Wed, 22 Jan 2003 14:26:15 CST." <15918.65127.667962.533847@montanaro.dyndns.org> References: <15918.65127.667962.533847@montanaro.dyndns.org> Message-ID: <200301222303.h0MN3mo16216@pcp02138704pcs.reston01.va.comcast.net> Quick comments: - The envvar needs to have a name starting with "PYTHON". See Misc/setuid-prog.c for the reason. - PYC may not be the best name to identify the feature, since there's also .pyo. Maybe PYTHONBYTECODEDIR? I don't mind if it's long, the feature is obscure enough to deserve that. - If the directory it refers to is not writable, attempts to write are skipped too (rather than always attempting to write and catching the failure). - There are two problems in this line: os.path.join(os.environ["PYCROOT"], os.path.split(sourcefile)[0]) (1) os.path.split(sourcefile)[0] is a lousy way of writing os.path.dirname(sourcefile). :-) (2) The way os.path.join() is defined, the first argument is ignored when the second argument is an absolute path, which it will almost always be (now that sys.path is being absolutized). I think the solution for (2) may be to take the path relative to the sys.path entry from which it was taken, and tack that onto the end of PYCROOT. This means that submodule M of package X always gets its .pyc code written to PYCROOT/X/M.pyc. There can't be more than one of those (at least not unless you have multiple Python interpreters with different sys.path values sharing the same PYCROOT). 
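
Point (2) above is easy to check directly. The sketch below is an illustration, not the proto-PEP's implementation: it uses posixpath so the result is the same on every platform, and relative_location is a hypothetical helper written only to show the "path relative to the sys.path entry" idea. The directory names are made up.

```python
import posixpath


def naive_location(pycroot, sourcefile):
    """The expression from the proto-PEP, using dirname() instead of split()[0]."""
    return posixpath.join(pycroot, posixpath.dirname(sourcefile))


def relative_location(pycroot, path_entry, sourcefile):
    """Hypothetical helper: root the .pyc under pycroot, keeping only the
    part of the source path that lies below the sys.path entry."""
    rel = posixpath.relpath(sourcefile, path_entry)
    base, _ext = posixpath.splitext(rel)
    return posixpath.join(pycroot, base + ".pyc")


pycroot = "/home/user/pyc"                 # assumed PYCROOT value
path_entry = "/usr/local/lib/python2.3"    # assumed sys.path entry
source = "/usr/local/lib/python2.3/X/M.py"

# join() discards pycroot because the second argument is absolute.
lost = naive_location(pycroot, source)

# The relative form keeps the result inside pycroot.
kept = relative_location(pycroot, path_entry, source)
```

With these inputs the naive expression yields a directory outside PYCROOT altogether, while the relative form gives PYCROOT/X/M.pyc as described.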
--Guido van Rossum (home page: http://www.python.org/~guido/) From python@rcn.com Wed Jan 22 23:28:43 2003 From: python@rcn.com (Raymond Hettinger) Date: Wed, 22 Jan 2003 18:28:43 -0500 Subject: [Python-Dev] Extended Function syntax References: Message-ID: <006f01c2c26e$0132e920$125ffea9@oemcomputer> > Using the new syntax along with special metaclass support, properties > and static and class methods could be declared like this:: > > class MyClass(object): > def class myClassMethod(cls): . . . > def static myStaticMethod(): . . . This syntax is clear, explicit, and attractive. > def get myProperty(self): . . . > def set myProperty(self, value): . . . This one doesn't extend as cleanly: def del myProperty(self, value): "Oops, del is a keyword" def doc myProperty ... ? what goes here Also, can two properties share a getter or setter as they can now? x = property(notifyOneWay, setx) y = property(notifyOneWay, sety) Raymond Hettinger From skip@pobox.com Thu Jan 23 03:07:04 2003 From: skip@pobox.com (Skip Montanaro) Date: Wed, 22 Jan 2003 21:07:04 -0600 Subject: [Python-Dev] Name space pollution (fwd) Message-ID: <15919.23640.247117.822059@montanaro.dyndns.org> --bYfm1zvCpf Content-Type: text/plain; charset=us-ascii Content-Description: message body text Content-Transfer-Encoding: 7bit I forward this along for the folks here who don't see c.l.py but can address this issue.
Skip --bYfm1zvCpf Content-Type: message/rfc822 Content-Description: forwarded message Content-Transfer-Encoding: 7bit Message-Id: <200301222336.h0MNaRe29779@guava.araneus.fi> From: gson@nominum.com (Andreas Gustafsson) Sender: python-list-admin@python.org To: python-list@python.org Subject: Name space pollution Date: Wed, 22 Jan 2003 15:36:27 -0800 (PST) I'm trying to get Python to coexist with the State Threads library (http://state-threads.sourceforge.net/), but I'm having problems due to the fact that Python defines the word "destructor" as a typedef. Specifically, after including Python.h, the following line in the State Threads header file st.h does not compile: extern int st_key_create(int *keyp, void (*destructor)(void *)); It seems to me that Python should not be polluting the namespace this way - shouldn't the Python typedef be something like PyDestructor instead of just "destructor"? Python also pollutes a number of other common symbols in a similar way, for example, "cmpfunc", "hashfunc", and "initproc". Is there any chance this will be fixed?
-- Andreas Gustafsson, gson@nominum.com -- http://mail.python.org/mailman/listinfo/python-list --bYfm1zvCpf-- From guido@python.org Thu Jan 23 03:20:18 2003 From: guido@python.org (Guido van Rossum) Date: Wed, 22 Jan 2003 22:20:18 -0500 Subject: [Python-Dev] Name space pollution (fwd) In-Reply-To: Your message of "Wed, 22 Jan 2003 21:07:04 CST." <15919.23640.247117.822059@montanaro.dyndns.org> References: <15919.23640.247117.822059@montanaro.dyndns.org> Message-ID: <200301230320.h0N3KIP17388@pcp02138704pcs.reston01.va.comcast.net> > I'm trying to get Python to coexist with the State Threads library > (http://state-threads.sourceforge.net/), but I'm having problems due > to the fact that Python defines the word "destructor" as a typedef. > Specifically, after including Python.h, the following line in the > State Threads header file st.h does not compile: > > extern int st_key_create(int *keyp, void (*destructor)(void *)); > > It seems to me that Python should not be polluting the namespace this > way - shouldn't the Python typedef be something like PyDestructor > instead of just "destructor"? > > Python also pollutes a number of other common symbols in a similar way, > for example, "cmpfunc", "hashfunc", and "initproc". > > Is there any chance this will be fixed? If you care about this, submit a patch. But it may be a lot easier to do some #define magic, e.g.
    #define destructor PyDestructor
    #include "Python.h"
    #undef destructor
    #include "st.h"
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@alum.mit.edu Thu Jan 23 04:19:53 2003 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Wed, 22 Jan 2003 23:19:53 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning?
In-Reply-To: <006a01c2bf2a$ee6dcc60$652b6992@alpha> References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> Message-ID: <15919.28009.351512.227783@localhost.localdomain> I thought one reason for the current names is that they match the log4j tool and all the other language's versions of that tool. I'd like to see the package keep the standard names to make it easier for someone to pick up this package. Those names are debug, info, warn, error, fatal. Jeremy From just@letterror.com Thu Jan 23 08:44:44 2003 From: just@letterror.com (Just van Rossum) Date: Thu, 23 Jan 2003 09:44:44 +0100 Subject: [Python-Dev] Extended Function syntax In-Reply-To: <3E2EE0B5.6040802@zope.com> Message-ID: Jim Fulton wrote: > In particular: > > def name(arg, ...) [expr1, expr2, expr3]: > ...body... > > would be equivalent to (some variation on): > > def name(arg, ...): > ...body... > > name=expr1(expr2(expr3(name))) With Michael's patch (which indeed still works) it's actually name = expr3(expr2(expr1(name))) > I wonder if the same mechanism could be used in class statements. > If I had this, I might say what interface a class implements > with: > > class foo(spam, eggs) [implements(IFoo, IBar)]: > body of class > > or (wo parens): > > class foo [implements(IFoo, IBar)]: > body of class I don't know how Zope interfaces work, but I can imagine the following could be made to work as well: class foo(spam, eggs) [IFoo, IBar]: body of class I think this would be wonderful. Would someone be interested in extending Michael's patch to also cover the class statement? Just From vinay_sajip@red-dove.com Thu Jan 23 09:52:43 2003 From: vinay_sajip@red-dove.com (Vinay Sajip) Date: Thu, 23 Jan 2003 09:52:43 -0000 Subject: [Python-Dev] Re: logging package -- rename warn to warning? 
References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> <15919.28009.351512.227783@localhost.localdomain> Message-ID: <00d401c2c2c5$2e264100$652b6992@alpha> > I thought one reason for the current names is that they match the > log4j tool and all the other language's versions of that tool. I'd > like to see the package keep the standard names to make it easier for > someone to pick up this package. I think the main source of familiarity for log4j users (or java.util.logging) will be that the "main" design and classes in the package mirror those in the Java world - Loggers and Handlers, dotted-namespace hierarchy for loggers, etc. I don't think that having slightly different names would be a problem, given that we think our names are better! Several versions of log4j (e.g. log4p, log4net) seem to have been a little too faithful in their translation of log4j. For example, levels are first-class objects - which I think is overkill. But then, class proliferation seems to be a Java idiom in some quarters :-( Regards, Vinay From Paul.Moore@atosorigin.com Thu Jan 23 11:16:30 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Thu, 23 Jan 2003 11:16:30 -0000 Subject: [Python-Dev] Proto-PEP regarding writing bytecode files Message-ID: <16E1010E4581B049ABC51D4975CEDB886199B3@UKDCX001.uk.int.atosorigin.com> From: Skip Montanaro [mailto:skip@pobox.com] > Here's a first stab at a PEP about controlling generation > of bytecode files. Feedback appreciated. As someone else (Martin?) mentioned, this proposal needs to be a bit more explicit about the way *reading* of bytecode files is affected, as well as writing. This is of particular interest to me, as it will potentially affect how people write import hooks. For example, I can't immediately see how this proposal will interact with the new zipimport module.
And that module doesn't attempt to handle writing of PYC modules, something I can imagine an import hook wanting to handle (although I'd have to say that this is probably unusual). In particular, the PEP needs to say something about what, if any, requirements it places on the writers of import hooks ("If an import hook attempts to cache compiled bytecode, in a similar way to the builtin filesystem support, then it needs to check the environment variable, and...") > - Interpretation of a module's __file__ attribute. I believe the > __file__ attribute of a module should reflect the true location > of the bytecode file. If people want to locate a module's source > code, they should use imp.find_module(module). Again, this is going to be something import hook writers need to be aware of, and to cater for. Offhand, I can't recall what the import hook PEP considers the fate of imp.find_module to be [... click, click, browse, read...] OK, if you have a module loaded from an import hook, you don't have imp.find_module. You have a new function, imp.get_loader(), but that returns only a loader, not a file name. So you need to expand on this: "If people want to locate a module's source code, they will not be able to except by using imp.find_module(module), which does not take account of import hooks. Whether this is good enough will depend upon the application". [But in any case, PEP 302 points out about __file__ that "This must be a string, but it may be a dummy value, for example """ so anyone using __file__ already needs to be aware of border cases involving non-filesystem modules] Personally, I have no real need for, nor a great interest in, this feature. But others may care a lot about it, and have little need for import hooks. 
I therefore can't put a good judgement on whether these issues are significant - but I do think the PEP needs to address them, if only to say that the two features don't work well together, and that is understood and not considered to be a significant issue. Paul. PS It's also possible that this issue should be integrated into the import hook protocol somehow - have the loader.load_module() method take a "save_bytecode" parameter, or something. For a simple yes/no flag, that makes sense, but for a more general bytecode cache location, that wouldn't be sufficient. From mwh@python.net Thu Jan 23 11:30:13 2003 From: mwh@python.net (Michael Hudson) Date: 23 Jan 2003 11:30:13 +0000 Subject: [Python-Dev] Extended Function syntax In-Reply-To: Just van Rossum's message of "Thu, 23 Jan 2003 09:44:44 +0100" References: Message-ID: <2mfzrkrtmi.fsf@starship.python.net> Just van Rossum writes: > Jim Fulton wrote: > > > In particular: > > > > def name(arg, ...) [expr1, expr2, expr3]: > > ...body... > > > > would be equivalent to (some variation on): > > > > def name(arg, ...): > > ...body... > > > > name=expr1(expr2(expr3(name))) > > With Michael's patch (which indeed still works) it's actually > > name = expr3(expr2(expr1(name))) I can't remember if that was deliberate or accidental. I think deliberate. > > I wonder if the same mechanism could be used in class statements. > > If I had this, I might say what interface a class implements > > with: > > > > class foo(spam, eggs) [implements(IFoo, IBar)]: > > body of class > > > > or (wo parens): > > > > class foo [implements(IFoo, IBar)]: > > body of class > > I don't know how Zope interfaces work, but I can imagine the following > could be made to work as well: > > class foo(spam, eggs) [IFoo, IBar]: > body of class > > I think this would be wonderful. Would someone be interested in > extending Michael's patch to also cover the class statement? Maybe even me :-) It wasn't very hard. 
http://starship.python.net/crew/mwh/hacks/meth-syntax-sugar-2.diff You still can't use lambdas in the filter list. I thought I'd fixed that ages ago, but it seems not. Cheers, M. -- I have no disaster recovery plan for black holes, I'm afraid. Also please be aware that if it one looks imminent I will be out rioting and setting fire to McDonalds (always wanted to do that) and probably not reading email anyway. -- Dan Barlow From guido@python.org Thu Jan 23 12:37:38 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 07:37:38 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: Your message of "Wed, 22 Jan 2003 23:19:53 EST." <15919.28009.351512.227783@localhost.localdomain> References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net> <006a01c2bf2a$ee6dcc60$652b6992@alpha> <15919.28009.351512.227783@localhost.localdomain> Message-ID: <200301231237.h0NCbdt18478@pcp02138704pcs.reston01.va.comcast.net> > I thought one reason for the current names is that they match the > log4j tool and all the other language's versions of that tool. I'd > like to see the package keep the standard names to make it easier for > someone to pick up this package. Those names are debug, info, warn, > error, fatal. OK. I tried changing warn to warning, and was worried about the amount of code I'd have to change (at least two versions of Zope have their own copy of the logging package now). Changing critical to fatal would be less work, because there are fewer invocations. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 23 12:45:33 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 07:45:33 -0500 Subject: [Python-Dev] distutils and scripts Message-ID: <200301231245.h0NCjXA18540@pcp02138704pcs.reston01.va.comcast.net> Before I report a bug, is there a reason why the distutils build command doesn't give the scripts that it copies into the build/scripts* directory execute permission? It correctly mangles the #! line, but the scripts cannot be run directly out of there unless you do chmod +x build/scripts*/* first. Bug or feature? (The install command copies the scripts to /bin, and there they do get the right permission.) --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@v.loewis.de Thu Jan 23 13:05:26 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 23 Jan 2003 14:05:26 +0100 Subject: [Python-Dev] distutils and scripts In-Reply-To: <200301231245.h0NCjXA18540@pcp02138704pcs.reston01.va.comcast.net> References: <200301231245.h0NCjXA18540@pcp02138704pcs.reston01.va.comcast.net> Message-ID: <3E2FE896.7080508@v.loewis.de> Guido van Rossum wrote: > Before I report a bug, is there a reason why the distutils build > command doesn't give the scripts that it copies into the > build/scripts* directory execute permission? I don't think that this was done on purpose, but I see no need to change that, either. Typically, the scripts won't work until the package is installed, anyway (as they import stuff that is not available in the standard locations yet). Regards, Martin From uche.ogbuji@fourthought.com Thu Jan 23 14:05:05 2003 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Thu, 23 Jan 2003 07:05:05 -0700 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: Message from Jeremy Hylton of "Wed, 22 Jan 2003 23:19:53 EST." 
<15919.28009.351512.227783@localhost.localdomain> Message-ID: > I thought one reason for the current names is that they match the > log4j tool and all the other language's versions of that tool. I'd > like to see the package keep the standard names to make it easier for > someone to pick up this package. Those names are debug, info, warn, > error, fatal. Personally, I'd moot for better English, rather than conformance to another package. +1 for debug, info, warning, error, [critical/severe] All of which are appropriate enough adjectives (though "debugging" would probably be less awkward). -- Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://4Suite.org http://fourthought.com The open office file format - http://www-106.ibm.com/developerworks/xml/library/x-think15/ Python Generators + DOM - http://www.xml.com/pub/a/2003/01/08/py-xml.html 4Suite Repository Features - https://www6.software.ibm.com/reg/devworks/dw-x4suite5-i/ XML class warfare - http://www.adtmag.com/article.asp?id=6965 From guido@python.org Thu Jan 23 14:33:05 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 09:33:05 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: Your message of "Thu, 23 Jan 2003 09:52:43 GMT." <00d401c2c2c5$2e264100$652b6992@alpha> References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net><006a01c2bf2a$ee6dcc60$652b6992@alpha> <15919.28009.351512.227783@localhost.localdomain> <00d401c2c2c5$2e264100$652b6992@alpha> Message-ID: <200301231433.h0NEX5Z06226@odiug.zope.com> OTOH, that log4j uses fatal instead of critical significantly weakens the arguments against using fatal. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 23 14:38:06 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 09:38:06 -0500 Subject: [Python-Dev] Extended Function syntax In-Reply-To: Your message of "23 Jan 2003 11:30:13 GMT."
<2mfzrkrtmi.fsf@starship.python.net> References: <2mfzrkrtmi.fsf@starship.python.net> Message-ID: <200301231438.h0NEc6J06291@odiug.zope.com> > > > In particular: > > > > > > def name(arg, ...) [expr1, expr2, expr3]: > > > ...body... > > > > > > would be equivalent to (some variation on): > > > > > > def name(arg, ...): > > > ...body... > > > > > > name=expr1(expr2(expr3(name))) > > > > With Michael's patch (which indeed still works) it's actually > > > > name = expr3(expr2(expr1(name))) > > I can't remember if that was deliberate or accidental. I think > deliberate. It certainly surprises me less -- this is left-to-right (applying expr1 first) which is goodness. --Guido van Rossum (home page: http://www.python.org/~guido/) From vinay_sajip@red-dove.com Thu Jan 23 15:20:21 2003 From: vinay_sajip@red-dove.com (Vinay Sajip) Date: Thu, 23 Jan 2003 15:20:21 -0000 Subject: [Python-Dev] Re: logging package -- rename warn to warning? References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net><006a01c2bf2a$ee6dcc60$652b6992@alpha> <15919.28009.351512.227783@localhost.localdomain> <00d401c2c2c5$2e264100$652b6992@alpha> <200301231433.h0NEX5Z06226@odiug.zope.com> Message-ID: <013b01c2c2f2$f60cbfa0$652b6992@alpha> > OTOH, that log4j uses fatal instead of critical significantly weakens > the arguments against using fatal. Only by as much as it weakens the argument against warn/WARN ;-) Regards, Vinay From guido@python.org Thu Jan 23 15:22:30 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 10:22:30 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: Your message of "Thu, 23 Jan 2003 15:20:21 GMT." 
<013b01c2c2f2$f60cbfa0$652b6992@alpha> References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net><006a01c2bf2a$ee6dcc60$652b6992@alpha> <15919.28009.351512.227783@localhost.localdomain> <00d401c2c2c5$2e264100$652b6992@alpha> <200301231433.h0NEX5Z06226@odiug.zope.com> <013b01c2c2f2$f60cbfa0$652b6992@alpha> Message-ID: <200301231522.h0NFMU306646@odiug.zope.com> > > OTOH, that log4j uses fatal instead of critical significantly weakens > > the arguments against using fatal. > > Only by as much as it weakens the argument against warn/WARN ;-) Yes. I already reversed my opinion on that one after seeing how much Zope3 code I would have to fix. --Guido van Rossum (home page: http://www.python.org/~guido/) From vinay_sajip@red-dove.com Thu Jan 23 15:28:06 2003 From: vinay_sajip@red-dove.com (Vinay Sajip) Date: Thu, 23 Jan 2003 15:28:06 -0000 Subject: [Python-Dev] Re: logging package -- rename warn to warning? References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net><006a01c2bf2a$ee6dcc60$652b6992@alpha> <15919.28009.351512.227783@localhost.localdomain> <00d401c2c2c5$2e264100$652b6992@alpha> <200301231433.h0NEX5Z06226@odiug.zope.com> <013b01c2c2f2$f60cbfa0$652b6992@alpha> <200301231522.h0NFMU306646@odiug.zope.com> Message-ID: <016b01c2c2f4$099a2160$652b6992@alpha> > > > > Only by as much as it weakens the argument against warn/WARN ;-) > > Yes. I already reversed my opinion on that one after seeing how much > Zope3 code I would have to fix. Is it more than a search-and-replace operation? Vinay From jrw@pobox.com Thu Jan 23 15:28:19 2003 From: jrw@pobox.com (John Williams) Date: Thu, 23 Jan 2003 09:28:19 -0600 Subject: [Python-Dev] Extended Function syntax In-Reply-To: References: Message-ID: <3E300A13.6020303@pobox.com> David Goodger wrote: > John Williams sent a very rough candidate PEP in October that was > interesting (below). I sent back some suggestions (below the PEP), but I > haven't received anything back yet. 
Perhaps a joint PEP with syntax > alternatives? Thanks for bringing this up. I've been following python-dev but I've mostly gotten sidetracked from Python stuff since then. I'm also having second thoughts about the whole idea of my proposal, since I basically wrote it in a fit of excitement over possibilities of metaclasses. Compared to the other proposal going around (which I'll call Guido's, since he brought it up), the really big advantage of my proposal is that you can use it to do something like adding a property to a class implicitly by defining its getter and setter methods: class A(object): def get foo(self): "Getter for property 'foo'." return self.__foo def set foo(self, foo): "Setter for property 'foo'." self.__foo = foo It's critical here that neither of these declarations actually defines the name "foo"; they define names like "get foo" and "set foo", and it's up to the metaclass to expose these methods in a sane way (i.e. by creating a property named "foo" in this case). I agree with the criticisms that others have made about this proposal, but the real killer (for me, at least) is that using method name modifiers would require case-by-case support in the metaclass to achieve the desired effects. Guido's proposal (adding expressions after the argument list) is easy to extend and generally much less magical, since the modifier expressions don't have to have their meaning "interpreted" by a metaclass. At this stage I'd much rather see Guido's proposal implemented, unless someone comes up with a truly ingenious way to combine the advantages of both. From guido@python.org Thu Jan 23 15:32:18 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 10:32:18 -0500 Subject: [Python-Dev] Re: logging package -- rename warn to warning? In-Reply-To: Your message of "Thu, 23 Jan 2003 15:28:06 GMT." 
<016b01c2c2f4$099a2160$652b6992@alpha> References: <200301181724.h0IHONi10847@pcp02138704pcs.reston01.va.comcast.net><006a01c2bf2a$ee6dcc60$652b6992@alpha> <15919.28009.351512.227783@localhost.localdomain> <00d401c2c2c5$2e264100$652b6992@alpha> <200301231433.h0NEX5Z06226@odiug.zope.com> <013b01c2c2f2$f60cbfa0$652b6992@alpha> <200301231522.h0NFMU306646@odiug.zope.com> <016b01c2c2f4$099a2160$652b6992@alpha> Message-ID: <200301231532.h0NFWIh06843@odiug.zope.com> > > > Only by as much as it weakens the argument against warn/WARN ;-) > > > > Yes. I already reversed my opinion on that one after seeing how much > > Zope3 code I would have to fix. > > Is it more than a search-and-replace operation? That's all there's to it, but it affects many files. I'd like to bite the bullet and change critical to fatal, and leave it at that. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 23 15:40:08 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 10:40:08 -0500 Subject: [Python-Dev] logging package -- spurious package contents In-Reply-To: Your message of "Thu, 23 Jan 2003 10:32:18 EST." Message-ID: <200301231540.h0NFe8s06989@odiug.zope.com> I have another issue with the logging package. The submodule logging/config.py contains code that I feel should not be there. I experimented with the config file format it implements, and it appears very painful. It appears mostly undocumented, typos in the file are not always reported, and you seem to have to specify the filename and mode twice. (Example: [handler_normal] class=FileHandler level=NOTSET formatter=common args=('z3.log', 'a') <----\ filename=z3.log <---- >- this is ugly mode=a <----/ .) Since configuring the logging package with a few programmatic calls is so easy, and applications that need serious logging configurability typically already have some configuration mechanism, I propose to drop this from the Python distribution. 
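As a rough illustration of the "few programmatic calls" mentioned above, the handler section from the example could be set up directly in code. This is a minimal sketch, not code from the logging package: the logger name "z3" and the format string are assumptions, since the post does not show how the "common" formatter is defined.

```python
import logging
import os
import tempfile


def configure_file_logging(path, mode="a", level=logging.NOTSET):
    """Attach one FileHandler to a logger using plain API calls."""
    handler = logging.FileHandler(path, mode)
    handler.setLevel(level)
    # Assumed format string; the post does not define the "common" formatter.
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handler.setFormatter(formatter)
    logger = logging.getLogger("z3")  # hypothetical logger name
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return logger, handler


# Write to a temporary directory so the sketch leaves no file behind
# in the current working directory.
log_path = os.path.join(tempfile.mkdtemp(), "z3.log")
logger, handler = configure_file_logging(log_path)
logger.warning("disk space low")

# Detach and close the handler so the file is released.
logger.removeHandler(handler)
handler.close()
```

The filename and mode appear once here, as arguments to FileHandler, which is the duplication the config file example is criticised for.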
I'm similarly not convinced of the utility of the logging/handlers.py submodule, but I've never tried to use it, so lacking any particular negative experience all I can say against it is YAGNI. Does anyone on python-dev think the logging package would lose utility if I removed those submodules? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Jan 23 15:55:28 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 10:55:28 -0500 Subject: [Python-Dev] Re: [Zope3-dev] PosixTimeZone tzinfo implementation In-Reply-To: Your message of "Thu, 23 Jan 2003 10:47:19 EST." References: Message-ID: <200301231555.h0NFtTM07072@odiug.zope.com> > Guido's pure-Python Local.py (in the Python CVS sandbox, > nondist/sandbox/datetime/Local.py) may be trying to get at the same > kind of thing -- unsure. It does no such thing. It supports only the currently configured local time, for which you don't need tm_gmtoff. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu Jan 23 16:07:28 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 23 Jan 2003 11:07:28 -0500 Subject: [Python-Dev] logging package -- spurious package contents In-Reply-To: <200301231540.h0NFe8s06989@odiug.zope.com> References: <200301231540.h0NFe8s06989@odiug.zope.com> Message-ID: <15920.4928.966394.252783@grendel.zope.com> Guido van Rossum writes: > The submodule logging/config.py contains code that I feel should not > be there. I experimented with the config file format it implements, > and it appears very painful. It appears mostly undocumented, typos in > the file are not always reported, and you seem to have to specify the > filename and mode twice. (Example: ... > Since configuring the logging package with a few programmatic calls is > so easy, and applications that need serious logging configurability > typically already have some configuration mechanism, I propose to drop > this from the Python distribution. I agree. 
The configuration machinery isn't very helpful unless you're desperate, in which case you have other problems. > I'm similarly not convinced of the utility of the logging/handlers.py > submodule, but I've never tried to use it, so lacking any particular > negative experience all I can say against it is YAGNI. This is, on the other hand, quite useful. Several handlers defined in this module are useful. > Does anyone on python-dev think the logging package would lose utility > if I removed those submodules? +1 on losing logging.config -1 on losing logging.handlers -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mal@lemburg.com Thu Jan 23 16:12:22 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 23 Jan 2003 17:12:22 +0100 Subject: [Python-Dev] Extended Function syntax In-Reply-To: <200301231438.h0NEc6J06291@odiug.zope.com> References: <2mfzrkrtmi.fsf@starship.python.net> <200301231438.h0NEc6J06291@odiug.zope.com> Message-ID: <3E301466.5070204@lemburg.com> Guido van Rossum wrote: >>>>In particular: >>>> >>>> def name(arg, ...) [expr1, expr2, expr3]: >>>> ...body... >>>> >>>>would be equivalent to (some variation on): >>>> >>>> def name(arg, ...): >>>> ...body... >>>> >>>> name=expr1(expr2(expr3(name))) >>> >>>With Michael's patch (which indeed still works) it's actually >>> >>> name = expr3(expr2(expr1(name))) >> >>I can't remember if that was deliberate or accidental. I think >>deliberate. > > > It certainly surprises me less -- this is left-to-right (applying > expr1 first) which is goodness. +1 I suppose the following would also be possible, provided that types(...) returns a callable, right ? def myfunc(x,y,z) [types(int, int, float), cacheglobals]: return math.sin(x*y/z) .... and cacheglobals would be able to rewrite the byte code too. Nice :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From barry@python.org Thu Jan 23 16:16:32 2003 From: barry@python.org (Barry A. Warsaw) Date: Thu, 23 Jan 2003 11:16:32 -0500 Subject: [Python-Dev] Proto-PEP regarding writing bytecode files References: Message-ID: <15920.5472.27584.653038@gargle.gargle.HOWL> >>>>> "TD" == Timothy Delaney writes: TD> I think this is wrong behaviour. IMO it should be as follows: TD> - If not present Python bytecode is generated in exactly the TD> same way as is currently done. TD> - If present and it refers to an existing directory, bytecode TD> files are written into a directory structure rooted at that TD> location. TD> - If present but empty, generation of bytecode files is TD> suppressed altogether. TD> - If present and it does not refer to an existing directory, TD> a warning is displayed and generation of bytecode files is TD> suppressed altogether. +1 for exactly the reasons you state. -Barry From vinay_sajip@red-dove.com Thu Jan 23 16:28:49 2003 From: vinay_sajip@red-dove.com (Vinay Sajip) Date: Thu, 23 Jan 2003 16:28:49 -0000 Subject: [Python-Dev] Re: logging package -- spurious package contents References: <200301231540.h0NFe8s06989@odiug.zope.com> Message-ID: <01af01c2c2fc$83962ec0$652b6992@alpha> > The submodule logging/config.py contains code that I feel should not > be there. I experimented with the config file format it implements, > and it appears very painful. It's not ideal, but a ConfigParser-based format seemed the thing to use since it's already part of Python. You don't need to specify things twice - that's a side effect of using the GUI configurator to create a config file. (N.B. The GUI configurator is not part of the package proper.) For a handler, you need only specify the class, the level, the formatter, and the args for the constructor. I agree that the config file format documentation leaves a lot of room for improvement. 
But it's not as bad as all that, once you get past the original irritation. > Since configuring the logging package with a few programmatic calls is > so easy, and applications that need serious logging configurability > typically already have some configuration mechanism, I propose to drop > this from the Python distribution. -0. > I'm similarly not convinced of the utility of the logging/handlers.py > submodule, but I've never tried to use it, so lacking any particular > negative experience all I can say against it is YAGNI. -1. That may be because you've never yet wanted to do anything other than log to console or file. I think it should be left in as without it, logging is most definitely not "batteries included". For example, it provides syslog, socket, datagram, email, HTTP and memory-buffering handlers. Perhaps these questions could be asked on python-list? I am aware of many users who use the handlers in the logging.handlers module (though there are fewer who have given feedback about the config capability). Vinay From guido@python.org Thu Jan 23 16:33:52 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 11:33:52 -0500 Subject: [Python-Dev] Re: logging package -- spurious package contents In-Reply-To: Your message of "Thu, 23 Jan 2003 16:28:49 GMT." <01af01c2c2fc$83962ec0$652b6992@alpha> References: <200301231540.h0NFe8s06989@odiug.zope.com> <01af01c2c2fc$83962ec0$652b6992@alpha> Message-ID: <200301231633.h0NGXqq07476@odiug.zope.com> > > The submodule logging/config.py contains code that I feel should not > > be there. I experimented with the config file format it implements, > > and it appears very painful. > > It's not ideal, but a ConfigParser-based format seemed the thing to use > since it's already part of Python. You don't need to specify things twice - > that's a side effect of using the GUI configurator to create a config file. > (N.B. The GUI configurator is not part of the package proper.) 
For a > handler, you need only specify the class, the level, the formatter, and the > args for the constructor. > > I agree that the config file format documentation leaves a lot of room for > improvement. But it's not as bad as all that, once you get past the original > irritation. > > > Since configuring the logging package with a few programmatic calls is > > so easy, and applications that need serious logging configurability > > typically already have some configuration mechanism, I propose to drop > > this from the Python distribution. > > -0. > > > I'm similarly not convinced of the utility of the logging/handlers.py > > submodule, but I've never tried to use it, so lacking any particular > > negative experience all I can say against it is YAGNI. > > -1. That may be because you've never yet wanted to do anything other than > log to console or file. I think it should be left in as without it, logging > is most definitely not "batteries included". For example, it provides > syslog, socket, datagram, email, HTTP and memory-buffering handlers. > > Perhaps these questions could be asked on python-list? I am aware of many > users who use the handlers in the logging.handlers module (though there are > fewer who have given feedback about the config capability). I'm willing to keep these things if you fix the documentation. The documentation for the whole module is currently in a rather sorry state. Example: "Using the package doesn't get much simpler." is more sales talk than documentation, and the example following that phrase produces less and different output than quoted. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From paul@pfdubois.com Thu Jan 23 17:08:24 2003 From: paul@pfdubois.com (Paul F Dubois) Date: Thu, 23 Jan 2003 09:08:24 -0800 Subject: [Python-Dev] Re: Proto-PEP regarding writing bytecode files In-Reply-To: <20030123032101.14337.75715.Mailman@mail.python.org> Message-ID: <000201c2c302$0afdeec0$6601a8c0@NICKLEBY> This proposal has a kind of system-oriented view: there is one Python, and the issue is whether or not to make bytecode files, and if so where to put them. This view does not match reality very well. From the point of view of someone distributing a Python-based application, one of the hardest things to control is the user environment, and yet control of those environment settings might be an absolute requirement. For example, a parallel application might need to suppress byte-code writing when it would not need to be repressed for other purposes or other codes. This application might have its own Python and it would be required that that Python load modules only from itself, not from some random bytecode directory that got added to my path because of an environment variable that my sysadmin put in the central startup files unbeknownst to me. I realize that for any objection I raise here there is a solution. For example, make my "code" a shell that sets the environment before executing the real code. But that is annoying when developers have to debug the thing. I can set the environment variable from the application too but that raises the issue of whether or not I have done this "in time" for it to be effective or whether some other part of Python has already used the setting (like changing the path, for example) before I got my chance. The patch as posted suited me because I could just set the C flag or argv entry from my own main code before I initialized Python; or I could set the command line flag when executing with a shell script or as an alias.
I'm unclear about how this works if I have one environment variable but am running different installations of Python. Couldn't I end up with a pyc newer than any of its diverse sources and thus used even though it corresponds to just one of them? It is useless to suppose that I am "careful" and always remember what the PYCROOT ought to be. First of all, I'm not careful, but even if I were I might be running Python via some other application and not even realize it. It all starts to sound like DLLs. Bottom line: I find any messing with my path to be suicidal. From guido@python.org Thu Jan 23 17:21:34 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 12:21:34 -0500 Subject: [Python-Dev] Re: Proto-PEP regarding writing bytecode files In-Reply-To: Your message of "Thu, 23 Jan 2003 09:08:24 PST." <000201c2c302$0afdeec0$6601a8c0@NICKLEBY> References: <000201c2c302$0afdeec0$6601a8c0@NICKLEBY> Message-ID: <200301231721.h0NHLY307771@odiug.zope.com> > I'm unclear about how this works if I have one environment variable but am > running different installations of Python. Couldn't I end up with a pyc > newer than any of its diverse sources and thus used even though it > corresponds to just one of them? No, a pyc file records the mtime of the corresponding py file in its header, and is only used if that is *exactly* the same. Newer/older doesn't matter. --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh@python.net Thu Jan 23 17:27:11 2003 From: mwh@python.net (Michael Hudson) Date: 23 Jan 2003 17:27:11 +0000 Subject: [Fwd: Re: [Python-Dev] Extended Function syntax] Message-ID: <1043342831.9908.162.camel@pc150.maths.bris.ac.uk> --=-xSDoeBG8mTT5bqo9cMfx Content-Type: text/plain Content-Transfer-Encoding: 7bit Gah, playing with new MUAs & all.. this was meant to go to the list.
--=-xSDoeBG8mTT5bqo9cMfx Content-Disposition: inline Content-Description: Forwarded message - Re: [Python-Dev] Extended Function syntax Content-Type: message/rfc822 Subject: Re: [Python-Dev] Extended Function syntax From: Michael Hudson To: "M.-A. Lemburg" In-Reply-To: <3E301466.5070204@lemburg.com> References: <2mfzrkrtmi.fsf@starship.python.net> <200301231438.h0NEc6J06291@odiug.zope.com> <3E301466.5070204@lemburg.com> Content-Type: text/plain Organization: Message-Id: <1043342086.9909.145.camel@pc150.maths.bris.ac.uk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.1 Date: 23 Jan 2003 17:14:46 +0000 Content-Transfer-Encoding: 7bit On Thu, 2003-01-23 at 16:12, M.-A. Lemburg wrote: > I suppose the following would also be possible, provided > that types(...) returns a callable, right ? > > def myfunc(x,y,z) [types(int, int, float), cacheglobals]: > return math.sin(x*y/z) Of course. The only thing you can't do is put a lambda in the filter list and that's only because of a bug. > .... and cacheglobals would be able to rewrite the byte > code too. Nice :-) If that's your nice I don't want to see your nasty :) Cheers, M. --=-xSDoeBG8mTT5bqo9cMfx-- From skip@pobox.com Thu Jan 23 17:40:17 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 23 Jan 2003 11:40:17 -0600 Subject: [Python-Dev] Proto-PEP regarding writing bytecode files In-Reply-To: References: <15918.65127.667962.533847@montanaro.dyndns.org> <15919.5070.6621.177402@montanaro.dyndns.org> Message-ID: <15920.10497.509459.481356@montanaro.dyndns.org> >> Here's a situation to consider: Shared .py files outside the normal >> distribution are stored in a read-only directory without .pyc's. >> Each user might set PYCROOT to $HOME/tmp/Python-N.M. A single >> version of those files could be safely shared by multiple installed >> versions of Python. You might always search the directory with the >> .py file, then the private repository. 
Brett> OK, but how are they going to specify that universal read-only Brett> directory? Is it going to be set in PYTHONPATH? If so, then you Brett> still don't need to check the directory where the module exists Brett> for a .pyc file since it was specified in such a way that Brett> checking there will be fruitless. Same goes for if you set this Brett> read-only directory in sys.path. I'm still not sure I understand what you're asking. At runtime you have no way of knowing that (for example) /usr/local/lib/python2.3 will never contain bytecode files. You really have to check there. In the latest version of the pep (see below) I nailed down the reading and writing mechanisms better. There are also more examples. If one of them doesn't address this issue, please feel free to shoot me a concrete example which demonstrates your concerns. Brett> No, I think you got it. >> >> - Runtime control - should there be a variable in sys (say, >> >> sys.pycroot) ... Brett> Why the heck would your needs for compiling bytecode files change Brett> while running a program? Brett> I agree that it is a matter of convenience and I just don't see Brett> it being enough of one. -0 vote from me on this feature. Another use of sys.pycroot (now called sys.pythonbytecodebase) occurred to me. There is a fair amount of work necessary to see if the value of PYTHONBYTECODEBASE (used to be PYCROOT) is valid (make the directory reference absolute, check that it exists and is writable by the current user). I propose that check should be performed once at program startup and sys.pythonbytecodebase set accordingly. This also allows users to set a relative directory for PYTHONBYTECODEBASE without worrying that it will silently move around as the program changes its working directory. I suppose sys.pythonbytecodebase could be forced to be read-only, but I see no particular reason to limit it. Hopefully its length will discourage people from modifying it without good reason. Thank the BDFL for that name. 
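The one-time startup check described above, together with the four cases listed earlier in the thread (variable absent, valid directory, empty, invalid), could look roughly like the following. This is only a sketch under stated assumptions: the function name resolve_bytecode_base and its (write_enabled, base_dir) return value are hypothetical, and the actual behaviour is whatever pep-0304 specifies.

```python
import os
import tempfile
import warnings


def resolve_bytecode_base(environ):
    """Return (write_enabled, base_dir) for a PYTHONBYTECODEBASE setting.

    base_dir is None when bytecode is written beside the source file
    or when writing is suppressed altogether.
    """
    value = environ.get("PYTHONBYTECODEBASE")
    if value is None:
        # Variable not present: bytecode is generated as it is today.
        return True, None
    if value == "":
        # Present but empty: suppress bytecode generation.
        return False, None
    # Make the reference absolute once, so a relative setting does not
    # move around when the program changes its working directory.
    base = os.path.abspath(value)
    if os.path.isdir(base) and os.access(base, os.W_OK):
        return True, base
    # Present but not an existing, writable directory: warn and suppress.
    warnings.warn(
        "PYTHONBYTECODEBASE %r is not a writable directory; "
        "bytecode files will not be written" % value
    )
    return False, None


# Example: a freshly created temporary directory is accepted as the base.
example_dir = tempfile.mkdtemp()
print(resolve_bytecode_base({"PYTHONBYTECODEBASE": example_dir}))
```

Passing the environment in as a mapping, rather than reading os.environ inside the function, keeps the check easy to exercise for each of the four cases.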
At any rate, I believe I incorporated everybody's suggestions (thanks for the feedback) and checked it in as pep-0304 (the number David Goodger assigned me). You can now grab it from CVS or view it online at

    http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/peps/pep-0304.txt

More feedback is welcome. Once the uproar^H^H^H^H^H^Hfeedback dies down a bit, I will look at implementing the idea.

Skip

From skip@pobox.com Thu Jan 23 17:41:13 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 23 Jan 2003 11:41:13 -0600
Subject: [Python-Dev] Proto-PEP regarding writing bytecode files
In-Reply-To: <3E2F12D4.5020404@v.loewis.de>
References: <15918.65127.667962.533847@montanaro.dyndns.org> <3E2F12D4.5020404@v.loewis.de>
Message-ID: <15920.10553.578666.562145@montanaro.dyndns.org>

>> This PEP outlines a mechanism for controlling the generation and
>> location of compiled Python bytecode files.

Martin> I believe this is currently underspecified: It only talks about
Martin> where .pyc files are written. Wherefrom are they read?

Fixed in the latest draft, checked into cvs as pep-0304.

Skip

From skip@pobox.com Thu Jan 23 17:47:37 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 23 Jan 2003 11:47:37 -0600
Subject: [Python-Dev] Proto-PEP regarding writing bytecode files
In-Reply-To: <200301222303.h0MN3mo16216@pcp02138704pcs.reston01.va.comcast.net>
References: <15918.65127.667962.533847@montanaro.dyndns.org> <200301222303.h0MN3mo16216@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15920.10937.38611.336982@montanaro.dyndns.org>

Guido> - The envvar needs to have a name starting with "PYTHON". See
Guido> Misc/setuid-prog.c for the reason.

Guido> - PYC may not be the best name to identify the feature, since
Guido> there's also .pyo. Maybe PYTHONBYTECODEDIR? I don't mind if
Guido> it's long, the feature is obscure enough to deserve that.

I switched to PYTHONBYTECODEBASE. "...DIR" suggests that all bytecode will be written to that one directory.
"...BASE" implied more that it's the root of a tree. I suppose "...ROOT" would also be okay. Guido> - If the directory it refers to is not writable, attempts to Guido> write are skipped too (rather than always attempting to write Guido> and catching the failure). Yes, see the pep. At startup, sys.pythonbytecodebase is set based upon what it finds in the environment. Its value is then used at runtime (faster, and under user control should there be a crying need to make runtime changes to the location). Guido> - There are two problems in this line: ... Both fixed, thanks. Skip From skip@pobox.com Thu Jan 23 17:44:48 2003 From: skip@pobox.com (Skip Montanaro) Date: Thu, 23 Jan 2003 11:44:48 -0600 Subject: [Python-Dev] Proto-PEP regarding writing bytecode files In-Reply-To: References: Message-ID: <15920.10768.166307.390827@montanaro.dyndns.org> Tim> I think this is wrong behaviour. IMO it should be as follows: ... Fixed. View the damage in pep0304. >> - When looking for a bytecode file should the directory holding the >> source file be considered as well, or just the location implied by >> PYCROOT? If so, which should be searched first? It seems to me >> that if a module lives in /usr/local/lib/python2.3/mod.py and was >> installed by root without PYCROOT set, you'd want to use the >> bytecode file there if it was up-to-date without ever considering >> os.environ["PYCROOT"] + "/usr/local/lib/python2.3/". Only if you >> need to write out a bytecode file would anything turn up there. Tim> I think it should always use PYCROOT. I have it first looking in the directory in sys.path, then in the augmented directory (roughly PYTHONBYTECODEBASE+sys.path[n]). Can you offer reasons to only consider the augmented directory? 
Skip From gmccaughan@synaptics-uk.com Thu Jan 23 17:48:49 2003 From: gmccaughan@synaptics-uk.com (Gareth McCaughan) Date: Thu, 23 Jan 2003 17:48:49 +0000 Subject: [Python-Dev] Re: Extended Function syntax Message-ID: <200301231748.49734.gmccaughan@synaptics-uk.com> John Williams wrote: > Compared to the other proposal going around (which I'll call Guido's, > since he brought it up), For the record, it was mine. And, in response to another thread: yes, the choice that "def foo() [a,b,c]:" should do a, then b, then c, was deliberate. "Define foo, make it a, make it b, and make it c". :-) (I take no credit for the implementation; that was Michael's alone.) -- Gareth McCaughan From jeremy@zope.com Thu Jan 23 17:51:07 2003 From: jeremy@zope.com (Jeremy Hylton) Date: Thu, 23 Jan 2003 12:51:07 -0500 Subject: [Python-Dev] test_ossaudiodev hangs Message-ID: <15920.11147.953648.673699@slothrop.zope.com> When I run the Python test suite, the ossaudiodev test hangs. Does anyone else see this hang? Is it expected? I'm going to have to disable the test in my local checkout, but that seems wrong. Jeremy slothrop:~/src/python/dist/src/build-pydebug> ./python ../Lib/test/regrtest.py test_ossaudiodev test_ossaudiodev Traceback (most recent call last): File "../Lib/test/regrtest.py", line 969, in ? main() File "../Lib/test/regrtest.py", line 259, in main ok = runtest(test, generate, verbose, quiet, testdir) File "../Lib/test/regrtest.py", line 389, in runtest the_package = __import__(abstest, globals(), locals(), []) File "/home/jeremy/src/python/dist/src/Lib/test/test_ossaudiodev.py", line 95, in ? 
    test()
  File "/home/jeremy/src/python/dist/src/Lib/test/test_ossaudiodev.py", line 92, in test
    play_sound_file(data, rate, ssize, nchannels)
  File "/home/jeremy/src/python/dist/src/Lib/test/test_ossaudiodev.py", line 55, in play_sound_file
    a.write(data)
KeyboardInterrupt

From skip@pobox.com Thu Jan 23 17:59:30 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 23 Jan 2003 11:59:30 -0600
Subject: [Python-Dev] Proto-PEP regarding writing bytecode files
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB886199B3@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB886199B3@UKDCX001.uk.int.atosorigin.com>
Message-ID: <15920.11650.166688.759968@montanaro.dyndns.org>

Paul> This is of particular interest to me, as it will potentially
Paul> affect how people write import hooks. For example, I can't
Paul> immediately see how this proposal will interact with the new
Paul> zipimport module.

I thought the consensus was that zip files shouldn't normally contain bytecode anyway. If so, then perhaps it could always be treated as a read-only directory. (Nothing about zip files is in the pep at present. Feel free to suggest text. I'm largely unfamiliar with them and haven't done much to keep abreast of the saga of the import hook.)

In the absence of this proposal, what happens today if a source file is located in a zip file and there is no bytecode with it? Does a bytecode file get written? If so, is it actually inserted into the zip file?

Paul> In particular, the PEP needs to say something about what, if any,
Paul> requirements it places on the writers of import hooks ("If an
Paul> import hook attempts to cache compiled bytecode, in a similar way
Paul> to the builtin filesystem support, then it needs to check the
Paul> environment variable, and...")

I'll save your message, but I'd like to defer this issue for the time being.

>> - Interpretation of a module's __file__ attribute....
Paul> Again, this is going to be something import hook writers need to Paul> be aware of, and to cater for. Offhand, I can't recall what the Paul> import hook PEP considers the fate of imp.find_module to be Paul> [... click, click, browse, read...] Paul> OK, if you have a module loaded from an import hook, you don't Paul> have imp.find_module. You have a new function, imp.get_loader(), Paul> but that returns only a loader, not a file name. I don't understand. Are you saying imp.find_module()'s results are undefined or that it simply doesn't exist? I still believe the module's __file__ attribute should reflect where the bytecode was found. Paul> So you need to expand on this: "If people want to locate a Paul> module's source code, they will not be able to except by using Paul> imp.find_module(module), which does not take account of import Paul> hooks. Whether this is good enough will depend upon the Paul> application". Feel free. ;-) I'll get to it eventually, but if you can come up with something plausible I'll gladly use it. Paul> [But in any case, PEP 302 points out about __file__ that "This Paul> must be a string, but it may be a dummy value, for example Paul> """ so anyone using __file__ already needs to be aware of Paul> border cases involving non-filesystem modules] I think this is wrong. If you can provide a valid way for the user to locate the bytecode you are obligated to do so. The only situation where I think a dummy value is appropriate is if the source code came from a string. Paul> Personally, I have no real need for, nor a great interest in, this Paul> feature. But others may care a lot about it, and have little need Paul> for import hooks. I therefore can't put a good judgement on Paul> whether these issues are significant - but I do think the PEP Paul> needs to address them, if only to say that the two features don't Paul> work well together, and that is understood and not considered to Paul> be a significant issue. 
I'm not sure I see a lot of conflict, but I'm more likely to be one of those people who ignores import hooks. I haven't used one yet, so it's unlikely I'll start tomorrow. ;-) Skip From guido@python.org Thu Jan 23 18:04:34 2003 From: guido@python.org (Guido van Rossum) Date: Thu, 23 Jan 2003 13:04:34 -0500 Subject: [Python-Dev] test_ossaudiodev hangs In-Reply-To: Your message of "Thu, 23 Jan 2003 12:51:07 EST." <15920.11147.953648.673699@slothrop.zope.com> References: <15920.11147.953648.673699@slothrop.zope.com> Message-ID: <200301231804.h0NI4YF08209@odiug.zope.com> > When I run the Python test suite, the ossaudiodev test hangs. Does > anyone else see this hang? Is it expected? Remove your build subdir and start over; the ossaudiodev module is no longer being built by default, because it doesn't compile. --Guido van Rossum (home page: http://www.python.org/~guido/) From walter@livinglogic.de Thu Jan 23 18:05:45 2003 From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Thu, 23 Jan 2003 19:05:45 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <3E2F00C1.7060905@lemburg.com> References: <20030122085353.AB36.ISHIMOTO@gembook.org> <200301221614.h0MGE4R17977@grad.sccs.chukyo-u.ac.jp> <3E2F00C1.7060905@lemburg.com> Message-ID: <3E302EF9.8080902@livinglogic.de> M.-A. Lemburg wrote: > [...] > Still, I'd would love to see some further improvement of the > size and performance of the codecs (and maybe support for the > new error callbacks; something which Hisao has integrated > into his codecs). As far as I can tell SF patch #666484 does not include full support for error callbacks, it only special cases "backslashreplace" in util.py, but there's never any call to codecs.lookup_error() to deal with unknown handler names. > [...] 
Bye,
   Walter Dörwald

From guido@python.org Thu Jan 23 18:05:28 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 23 Jan 2003 13:05:28 -0500
Subject: [Python-Dev] Proto-PEP regarding writing bytecode files
In-Reply-To: Your message of "Thu, 23 Jan 2003 11:59:30 CST." <15920.11650.166688.759968@montanaro.dyndns.org>
References: <16E1010E4581B049ABC51D4975CEDB886199B3@UKDCX001.uk.int.atosorigin.com> <15920.11650.166688.759968@montanaro.dyndns.org>
Message-ID: <200301231805.h0NI5Sh08237@odiug.zope.com>

> I thought the consensus was that zip files shouldn't normally
> contain bytecode anyway.

Oh, to the contrary. For efficient distribution, zip files containing only byte code are very useful.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From jeremy@zope.com (Jeremy Hylton) Thu Jan 23 18:10:21 2003
From: jeremy@zope.com (Jeremy Hylton) (Jeremy Hylton)
Date: Thu, 23 Jan 2003 13:10:21 -0500
Subject: [Python-Dev] test_ossaudiodev hangs
In-Reply-To: <200301231804.h0NI4YF08209@odiug.zope.com>
References: <15920.11147.953648.673699@slothrop.zope.com> <200301231804.h0NI4YF08209@odiug.zope.com>
Message-ID: <15920.12301.717365.829725@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum writes:

>> When I run the Python test suite, the ossaudiodev test hangs.
>> Does anyone else see this hang? Is it expected?

GvR> Remove your build subdir and start over; the ossaudiodev module
GvR> is no longer being built by default, because it doesn't
GvR> compile.

Oh! That's much better than having the test hang :-). Perhaps we should remove it until it's in a stable state? The linuxaudiodev module isn't great, but at least it compiles and passes its test suite.
Jeremy

From skip@pobox.com Thu Jan 23 20:16:52 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 23 Jan 2003 14:16:52 -0600
Subject: [Python-Dev] Proto-PEP regarding writing bytecode files
In-Reply-To: <200301231805.h0NI5Sh08237@odiug.zope.com>
References: <16E1010E4581B049ABC51D4975CEDB886199B3@UKDCX001.uk.int.atosorigin.com> <15920.11650.166688.759968@montanaro.dyndns.org> <200301231805.h0NI5Sh08237@odiug.zope.com>
Message-ID: <15920.19892.424375.533068@montanaro.dyndns.org>

>>>>> "Guido" == Guido van Rossum writes:

>> I thought the consensus was that zip files shouldn't normally
>> contain bytecode anyway.

Guido> Oh, to the contrary. For efficient distribution, zip files
Guido> containing only byte code are very useful.

Okay, let me rephrase:

I thought the consensus was that zip files containing source (.py) files shouldn't normally contain bytecode files.

Skip

From lists@morpheus.demon.co.uk Thu Jan 23 20:41:34 2003
From: lists@morpheus.demon.co.uk (Paul Moore)
Date: Thu, 23 Jan 2003 20:41:34 +0000
Subject: [Python-Dev] Proto-PEP regarding writing bytecode files
References: <16E1010E4581B049ABC51D4975CEDB886199B3@UKDCX001.uk.int.atosorigin.com> <15920.11650.166688.759968@montanaro.dyndns.org>
Message-ID:

Skip Montanaro writes:

> Paul> This is of particular interest to me, as it will potentially
> Paul> affect how people write import hooks. For example, I can't
> Paul> immediately see how this proposal will interact with the new
> Paul> zipimport module.
>
> I thought the consensus was that zip files shouldn't normally contain
> bytecode anyway. If so, then perhaps it could always be treated as a
> read-only directory. (Nothing about zip files is in the pep at present.
> Feel free to suggest text. I'm largely unfamiliar with them and haven't
> done much to keep abreast of the saga of the import hook.)

Zip files can and usually should contain bytecode. If bytecode is there, it will be read as normal.
> In the absence of this proposal, what happens today if a source file is > located in a zip file and there is no bytecode with it? Does a bytecode > file get written? If so, is it actually inserted into the zip file? There is no writing. Zipfiles are treated as read-only. > Paul> In particular, the PEP needs to say something about what, if any, > Paul> requirements it places on the writers of import hooks ("If an > Paul> import hook attempts to cache compiled bytecode, in a similar way > Paul> to the builtin filesystem support, then it needs to check the > Paul> environment variable, and...") > > I'll save your message, but I'd like to defer this issue for the time being. That's entirely reasonable, but it needs to be looked at at some point. Can I suggest an "Open issues" section for now? > I don't understand. Are you saying imp.find_module()'s results are > undefined or that it simply doesn't exist? PEP 302 says that imp.find_module() will be documented as acting as if hooks did not exist at all. Specifically, the wording becomes "they expose the basic *unhooked* built-in import mechanism". Read the PEP for details, but basically, the imp module API isn't really correct in the presence of hooks, so the PEP proposes a new hook-aware API, and redefines the existing API as ignoring hooks. It's not ideal, but the idea was that it's better than deprecating the existing API, just to cater for the (rare) case where hooks are involved. > I still believe the module's __file__ attribute should reflect where the > bytecode was found. I don't disagree, just pointing out that for a hook, it's the hook's responsibility to do this. > Paul> So you need to expand on this: "If people want to locate a > Paul> module's source code, they will not be able to except by using > Paul> imp.find_module(module), which does not take account of import > Paul> hooks. Whether this is good enough will depend upon the > Paul> application". > > Feel free. 
;-) I'll get to it eventually, but if you can come up with > something plausible I'll gladly use it. I'll see what I can do. In all honesty, though, the whole issue of locating a module's source code is an open one. There's a new module in 2.3a1 (can't recall the name) which is designed to help with the problem. Guido wrote it when the problem became evident with zip imports. It should probably be updated to cope with this change, and the PEP should refer to it. The usual reason for looking for the module source is so that people can find data files stored relative to the module. I don't do this, so I'm not sure what the requirements might be - relative to the bytecode or the source? > Paul> [But in any case, PEP 302 points out about __file__ that "This > Paul> must be a string, but it may be a dummy value, for example > Paul> """ so anyone using __file__ already needs to be aware of > Paul> border cases involving non-filesystem modules] > > I think this is wrong. If you can provide a valid way for the user to > locate the bytecode you are obligated to do so. The only situation where I > think a dummy value is appropriate is if the source code came from a string. Again I don't disagree, but it depends on what you mean by "locate". If the code gets loaded from a SQL database, a value of __file__ which goes database_name/table_name/id_code may be an entirely reasonable "location", and indeed the only reasonable thing to put in __file__, but you wouldn't get very far doing os.listdir(os.path.dirname(__file__)) on it! None of this is news for import hooks, though. The Python language definition makes no guarantees that modules exist in filesystems, or that things like sys.path, module.__file__, or whatever, are filenames. 
But it's a common assumption which does no harm in 99% of cases (and in the other 1%, you're probably not going to do better than silently ignoring the issue).

> I'm not sure I see a lot of conflict, but I'm more likely to be one of those
> people who ignores import hooks. I haven't used one yet, so it's unlikely
> I'll start tomorrow. ;-)

Oh, go on. Feel the seduction of the dark side :-)

Paul.
--
This signature intentionally left blank

From lists@morpheus.demon.co.uk Thu Jan 23 21:12:45 2003
From: lists@morpheus.demon.co.uk (Paul Moore)
Date: Thu, 23 Jan 2003 21:12:45 +0000
Subject: [Python-Dev] Proto-PEP regarding writing bytecode files
References: <16E1010E4581B049ABC51D4975CEDB886199B3@UKDCX001.uk.int.atosorigin.com> <15920.11650.166688.759968@montanaro.dyndns.org> <200301231805.h0NI5Sh08237@odiug.zope.com> <15920.19892.424375.533068@montanaro.dyndns.org>
Message-ID:

Skip Montanaro writes:

> Okay, let me rephrase:
>
> I thought the consensus was that zip files containing source (.py) files
> shouldn't normally contain bytecode files.

Not as far as I know. Do whatever you prefer. Personally, I'd rather have both, for the same reason I prefer to avoid binary-only distributions.

Paul.
--
This signature intentionally left blank

From gward@python.net Thu Jan 23 21:24:46 2003
From: gward@python.net (Greg Ward)
Date: Thu, 23 Jan 2003 16:24:46 -0500
Subject: [Python-Dev] test_ossaudiodev hangs
In-Reply-To: <15920.12301.717365.829725@slothrop.zope.com>
References: <15920.11147.953648.673699@slothrop.zope.com> <200301231804.h0NI4YF08209@odiug.zope.com> <15920.12301.717365.829725@slothrop.zope.com>
Message-ID: <20030123212446.GA2956@cthulhu.gerg.ca>

On 23 January 2003, Jeremy Hylton said:
> Oh! That's much better than having the test hang :-). Perhaps we
> should remove it until it's in a stable state? The linuxaudiodev
> module isn't great, but at least it compiles and passes its test
> suite.

Please don't remove it entirely!
The last time I looked at ossaudiodev (a couple of weeks ago), it was working fine -- I suspect it just needs to be a bit more liberal about older kernels. Disabling it in setup.py ought to be enough until I (or anyone else) fix the build problems.

        Greg
--
Greg Ward http://www.gerg.ca/
I just forgot my whole philosophy of life!!!

From skip@pobox.com Thu Jan 23 21:31:00 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 23 Jan 2003 15:31:00 -0600
Subject: [Python-Dev] Re: Proto-PEP regarding writing bytecode files
In-Reply-To: <000201c2c302$0afdeec0$6601a8c0@NICKLEBY>
References: <20030123032101.14337.75715.Mailman@mail.python.org> <000201c2c302$0afdeec0$6601a8c0@NICKLEBY>
Message-ID: <15920.24340.955369.511758@montanaro.dyndns.org>

Paul> The patch as posted suited me because I could just set the C flag
Paul> or argv entry from my own main code before I initialized Python;
Paul> or I could set the command line flag when executing with a shell
Paul> script or as an alias.

I see no particular reason a command-line flag can't be provided, but I think for a small extra effort significantly more control can be provided than to simply do or don't generate bytecode files. A flag works for you, but for other people (e.g., running an application in which all the .py files live in a read-only zip file), it makes some sense to provide them with a way to cache the byte compilation step.

Paul> I'm unclear about how this works if I have one environment
Paul> variable but am running different installations of
Paul> Python.

The PYCROOT (now PYTHONBYTECODEBASE) environment variable refers to an alternate root directory for writing bytecode files. Have a look at the examples in the latest version of PEP 304 and see if they don't answer your questions. Briefly, if PYTHONBYTECODEBASE is set to /tmp and urllib is found in /usr/lib/python2.3/urllib.py, given the right circumstances the bytecode would be written to /tmp/usr/lib/python2.3/urllib.pyc, not /tmp/urllib.pyc.
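To make that mapping concrete, here is a rough sketch of how the write location could be derived. The function name `augmented_bytecode_path` is made up for this example, and it assumes absolute POSIX paths and a plain `.pyc` suffix; it is not taken from the PEP's implementation:

```python
import posixpath


def augmented_bytecode_path(source_path, base):
    """Map an absolute source path into the tree rooted at `base`.

    Hypothetical helper for illustration: the full directory of the
    source file is preserved under `base`, so modules with the same
    name in different installations do not collide.
    """
    stem, _ext = posixpath.splitext(source_path)
    # Drop the leading "/" so join() appends to base instead of
    # discarding it, then swap the .py suffix for .pyc.
    return posixpath.join(base, stem.lstrip("/")) + ".pyc"


print(augmented_bytecode_path("/usr/lib/python2.3/urllib.py", "/tmp"))
# prints /tmp/usr/lib/python2.3/urllib.pyc
```

Keeping the whole source directory under the base is what lets one environment variable serve several Python installations at once, since each installation's modules land in a distinct subtree.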
Paul> Bottom line: I find any messing with my path to be suicidal.

Which is why I won't mess with your path. ;-) Again, consider that you have /usr/lib/python2.3 in sys.path and PYTHONBYTECODEBASE is set to /tmp. When the importer looks for the urllib module, it will first look for urllib.pyc in /usr/lib/python2.3 (because that's where urllib.py exists). If it's not found there (or if what's there is out-of-date), a second check will be made in /tmp/usr/lib/python2.3. If a bytecode file needs to be written, it will be written to /tmp/usr/lib/python2.3/urllib.pyc. This should solve your problem if you set PYTHONBYTECODEBASE to the empty string (meaning don't write bytecode files at all) or if each of your 384 parallel processes sets it to a unique directory, perhaps something like:

    #!/bin/sh
    ...
    export PYTHONBYTECODEBASE=/tmp/`domainname --fqdn`/$$
    mkdir -p $PYTHONBYTECODEBASE
    python ...
    rm -rf $PYTHONBYTECODEBASE

The post-run cleanup won't be necessary if you can avoid making the directory pid-specific.

Skip

From skip@pobox.com Thu Jan 23 21:12:57 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 23 Jan 2003 15:12:57 -0600
Subject: [Python-Dev] Proto-PEP regarding writing bytecode files
In-Reply-To:
References: <16E1010E4581B049ABC51D4975CEDB886199B3@UKDCX001.uk.int.atosorigin.com> <15920.11650.166688.759968@montanaro.dyndns.org>
Message-ID: <15920.23257.921376.296367@montanaro.dyndns.org>

>> In the absence of this proposal, what happens today if a source file
>> is located in a zip file and there is no bytecode with it? Does a
>> bytecode file get written? If so, is it actually inserted into the
>> zip file?

Paul> There is no writing. Zipfiles are treated as read-only.

That's fine with me, which means PEP 304 addresses a potential need for that community.
Paul> In particular, the PEP needs to say something about what, if any, Paul> requirements it places on the writers of import hooks ("If an Paul> import hook attempts to cache compiled bytecode, in a similar way Paul> to the builtin filesystem support, then it needs to check the Paul> environment variable, and...") >> >> I'll save your message, but I'd like to defer this issue for the time >> being. Paul> That's entirely reasonable, but it needs to be looked at at some Paul> point. Can I suggest an "Open issues" section for now? Sure. I'll add one. >> I don't understand. Are you saying imp.find_module()'s results are >> undefined or that it simply doesn't exist? Paul> Read [PEP 302] for details, ... Will do. Paul> In all honesty, though, the whole issue of locating a module's Paul> source code is an open one. There's a new module in 2.3a1 (can't Paul> recall the name) which is designed to help with the problem. Guido Paul> wrote it when the problem became evident with zip imports. It Paul> should probably be updated to cope with this change, and the PEP Paul> should refer to it. I remember something (in the sandbox perhaps). I'll scrounge around for it. S From tim.one@comcast.net Fri Jan 24 01:08:58 2003 From: tim.one@comcast.net (Tim Peters) Date: Thu, 23 Jan 2003 20:08:58 -0500 Subject: [Python-Dev] Looking for a Windows bsddb fan Message-ID: This is a multi-part message in MIME format. --Boundary_(ID_Sk7bWSjv58rsTKm5OLz6Fw) Content-type: text/plain; charset=iso-8859-1 Content-transfer-encoding: 7BIT Barry said I should switch Python 2.3 on Windows to using Sleepycat 4.1.25. A patch for doing so is attached. The full test_bsddb3 looks like a disaster then, and I don't have time to dig into it: C:\Code\python\PCbuild>python ../lib/test/regrtest.py -v -u bsddb test_bsddb3 test_bsddb3 test01_associateWithDB (bsddb.test.test_associate.AssociateHashTestCase) ... ERROR test02_associateAfterDB (bsddb.test.test_associate.AssociateHashTestCase) ... 
ERROR test01_associateWithDB (bsddb.test.test_associate.AssociateBTreeTestCase) ... ERROR test02_associateAfterDB (bsddb.test.test_associate.AssociateBTreeTestCase) ... ERROR test01_associateWithDB (bsddb.test.test_associate.AssociateRecnoTestCase) ... ERROR test02_associateAfterDB (bsddb.test.test_associate.AssociateRecnoTestCase) ... ERROR test01_associateWithDB (bsddb.test.test_associate.ShelveAssociateHashTestCase) ... ERROR etc. It's been running a long time, and unittest doesn't show what's wrong before it's all done (and it's not done yet, so no clue here). Maybe it's shallow. Plain test_bsddb passes. Later: Ran 182 tests in 588.640s FAILED (errors=22) test test_bsddb3 failed -- errors occurred; run in verbose mode for details 1 test failed: test_bsddb3 All the errors are of this form (this is the last one): ====================================================================== ERROR: test01_basics (bsddb.test.test_dbshelve.EnvThreadHashShelveTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\CODE\PYTHON\lib\bsddb\test\test_dbshelve.py", line 70, in test01_basics self.do_open() File "C:\CODE\PYTHON\lib\bsddb\test\test_dbshelve.py", line 233, in do_open self.env.open(homeDir, self.envflags | db.DB_INIT_MPOOL | db.DB_CREATE) DBAgainError: (11, 'Resource temporarily unavailable -- unable to join the environment') Huh? Win98SE, in case that's a clue. 
--Boundary_(ID_Sk7bWSjv58rsTKm5OLz6Fw) Content-type: text/plain; name=patch.txt Content-transfer-encoding: 7BIT Content-disposition: attachment; filename=patch.txt Index: _bsddb.dsp =================================================================== RCS file: /cvsroot/python/python/dist/src/PCbuild/_bsddb.dsp,v retrieving revision 1.2 diff -c -r1.2 _bsddb.dsp *** _bsddb.dsp 23 Nov 2002 18:48:06 -0000 1.2 --- _bsddb.dsp 24 Jan 2003 01:03:38 -0000 *************** *** 44,50 **** # PROP Target_Dir "" F90=df.exe # ADD BASE CPP /nologo /MT /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /YX /FD /c ! # ADD CPP /nologo /MD /W3 /GX /Zi /O2 /I "..\Include" /I "..\PC" /I "..\..\db-4.0.14\build_win32" /D "NDEBUG" /D "WIN32" /D "_WINDOWS" /YX /FD /c # ADD BASE MTL /nologo /D "NDEBUG" /mktyplib203 /o "NUL" /win32 # ADD MTL /nologo /D "NDEBUG" /mktyplib203 /o "NUL" /win32 # ADD BASE RSC /l 0x409 /d "NDEBUG" --- 44,50 ---- # PROP Target_Dir "" F90=df.exe # ADD BASE CPP /nologo /MT /W3 /GX /O2 /D "WIN32" /D "NDEBUG" /D "_WINDOWS" /YX /FD /c ! # ADD CPP /nologo /MD /W3 /GX /Zi /O2 /I "..\Include" /I "..\PC" /I "..\..\db-4.1.25\build_win32" /D "NDEBUG" /D "WIN32" /D "_WINDOWS" /YX /FD /c # ADD BASE MTL /nologo /D "NDEBUG" /mktyplib203 /o "NUL" /win32 # ADD MTL /nologo /D "NDEBUG" /mktyplib203 /o "NUL" /win32 # ADD BASE RSC /l 0x409 /d "NDEBUG" *************** *** 54,60 **** # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:windows /dll /machine:I386 ! 
# ADD LINK32 user32.lib kernel32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib ..\..\db-4.0.14\build_win32\Release_static\libdb40s.lib /nologo /base:"0x1e180000" /subsystem:windows /dll /debug /machine:I386 /nodefaultlib:"msvcrt" /out:"./_bsddb.pyd" # SUBTRACT LINK32 /pdb:none !ELSEIF "$(CFG)" == "_bsddb - Win32 Debug" --- 54,60 ---- # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:windows /dll /machine:I386 ! # ADD LINK32 user32.lib kernel32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib ..\..\db-4.1.25\build_win32\Release_static\libdb41s.lib /nologo /base:"0x1e180000" /subsystem:windows /dll /debug /machine:I386 /nodefaultlib:"msvcrt" /out:"./_bsddb.pyd" # SUBTRACT LINK32 /pdb:none !ELSEIF "$(CFG)" == "_bsddb - Win32 Debug" *************** *** 72,78 **** # PROP Target_Dir "" F90=df.exe # ADD BASE CPP /nologo /MTd /W3 /Gm /GX /Zi /Od /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /YX /FD /c ! # ADD CPP /nologo /MDd /W3 /Gm /GX /Zi /Od /I "..\Include" /I "..\PC" /I "..\..\db-4.0.14\build_win32" /D "_DEBUG" /D "WIN32" /D "_WINDOWS" /YX /FD /c # ADD BASE MTL /nologo /D "_DEBUG" /mktyplib203 /o "NUL" /win32 # ADD MTL /nologo /D "_DEBUG" /mktyplib203 /o "NUL" /win32 # ADD BASE RSC /l 0x409 /d "_DEBUG" --- 72,78 ---- # PROP Target_Dir "" F90=df.exe # ADD BASE CPP /nologo /MTd /W3 /Gm /GX /Zi /Od /D "WIN32" /D "_DEBUG" /D "_WINDOWS" /YX /FD /c ! 
# ADD CPP /nologo /MDd /W3 /Gm /GX /Zi /Od /I "..\Include" /I "..\PC" /I "..\..\db-4.1.25\build_win32" /D "_DEBUG" /D "WIN32" /D "_WINDOWS" /YX /FD /c # ADD BASE MTL /nologo /D "_DEBUG" /mktyplib203 /o "NUL" /win32 # ADD MTL /nologo /D "_DEBUG" /mktyplib203 /o "NUL" /win32 # ADD BASE RSC /l 0x409 /d "_DEBUG" *************** *** 82,88 **** # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:windows /dll /debug /machine:I386 /pdbtype:sept ! # ADD LINK32 user32.lib kernel32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib ..\..\db-4.0.14\build_win32\Release_static\libdb40s.lib /nologo /base:"0x1e180000" /subsystem:windows /dll /debug /machine:I386 /nodefaultlib:"msvcrtd" /out:"./_bsddb_d.pyd" /pdbtype:sept # SUBTRACT LINK32 /pdb:none !ENDIF --- 82,88 ---- # ADD BSC32 /nologo LINK32=link.exe # ADD BASE LINK32 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib /nologo /subsystem:windows /dll /debug /machine:I386 /pdbtype:sept ! # ADD LINK32 user32.lib kernel32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib ..\..\db-4.1.25\build_win32\Release_static\libdb41s.lib /nologo /base:"0x1e180000" /subsystem:windows /dll /debug /machine:I386 /nodefaultlib:"msvcrtd" /out:"./_bsddb_d.pyd" /pdbtype:sept # SUBTRACT LINK32 /pdb:none !ENDIF Index: readme.txt =================================================================== RCS file: /cvsroot/python/python/dist/src/PCbuild/readme.txt,v retrieving revision 1.38 diff -c -r1.38 readme.txt *** readme.txt 30 Dec 2002 00:40:40 -0000 1.38 --- readme.txt 24 Jan 2003 01:03:38 -0000 *************** *** 163,192 **** _bsddb ! 
XXX The Sleepycat release we use will probably change before ! XXX 2.3a1. ! Go to Sleepycat's patches page: ! http://www.sleepycat.com/update/index.html ! and download ! 4.0.14.zip ! from the download page. The file name is db-4.0.14.zip. Unpack into ! dist\db-4.0.14 ! ! Apply the patch file bsddb_patch.txt in this (PCbuild) directory ! against the file ! dist\db-4.0.14\db\db_reclaim.c ! ! Go to ! http://www.sleepycat.com/docs/ref/build_win/intro.html ! and follow the instructions for building the Sleepycat software. ! Build the Release version. ! NOTE: The instructions are for a later release of the software, ! so use your imagination. Berkeley_DB.dsw in this release was ! also pre-MSVC6, so you'll be prompted to upgrade the format (say ! yes, of course). Choose configuration "db_buildall - Win32 Release", ! and build db_buildall.exe. ! XXX We're actually linking against Release_static\libdb40s.lib. XXX This yields the following warnings: """ Compiling... --- 163,185 ---- _bsddb ! Go to Sleepycat's download page: ! http://www.sleepycat.com/download/ ! and download version 4.1.25. The file name is db-4.1.25.NC.zip. ! XXX with or without strong cryptography? I picked "without". ! ! Unpack into ! dist\db-4.1.25 ! ! Open ! dist\db-4.1.25\docs\index.html ! ! and follow the Windows instructions for building the Sleepycat ! software. Build the Release version ("build_all -- Win32 Release"). ! Note that Berkeley_DB.dsw is in the build_win32 subdirectory. ! ! XXX We're actually linking against Release_static\libdb41s.lib. XXX This yields the following warnings: """ Compiling... *************** *** 201,206 **** --- 194,200 ---- """ XXX This isn't encouraging, but I don't know what to do about it. + To run extensive tests, pass "-u bsddb" to regrtest.py. _ssl Python wrapper for the secure sockets library. 
--Boundary_(ID_Sk7bWSjv58rsTKm5OLz6Fw)-- From jeremy@alum.mit.edu Fri Jan 24 04:40:01 2003 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 23 Jan 2003 23:40:01 -0500 Subject: [Python-Dev] PyConDC sprints In-Reply-To: <15912.11314.856979.394880@slothrop.zope.com> References: <15912.11314.856979.394880@slothrop.zope.com> Message-ID: <15920.50081.917252.87551@localhost.localdomain> Last week I asked some questions about the sprint session planned for PyConDC, but I think I asked the wrong questions :-). Let me try again. Is anyone on the list planning to attend the sprint session and do some work on python-dev projects? We hope the cost will be less than $25/day. If I have a better sense of what developers are coming, I can try to round up some coaches with complementary interests. I had asked for topic suggestions and coach volunteers. I got one potential coach volunteer and several topic suggestions without coaches. They are: documentation, general bug fixing, and a BCD library. I'd be happy to organize an ast-branch sprint if there is sufficient interest. Jeremy From martin@v.loewis.de Fri Jan 24 09:04:52 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 24 Jan 2003 10:04:52 +0100 Subject: [Python-Dev] Adding Japanese Codecs to the distro In-Reply-To: <200301222250.h0MMoOI19022@grad.sccs.chukyo-u.ac.jp> References: <200301221632.h0MGWCI18042@grad.sccs.chukyo-u.ac.jp> <200301222250.h0MMoOI19022@grad.sccs.chukyo-u.ac.jp> Message-ID: Tamito KAJIYAMA writes: > The StreamReader/Writer classes in JapaneseCodecs can cope with > the statefulness, BTW. I see. I was really concerned about the stream reader only; I agree that it is perfectly reasonable to assume that an individual string is "complete" with regard to the encoding. Notice that your codec is thus advanced over both the Python UTF-8 codec, and Hisao's codec: neither of those care about this issue; this is a bug in both. 
Regards, Martin From martin@v.loewis.de Fri Jan 24 09:13:45 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 24 Jan 2003 10:13:45 +0100 Subject: [Python-Dev] Looking for a Windows bsddb fan In-Reply-To: References: Message-ID: Tim Peters writes: > It's been running a long time, and unittest doesn't show what's wrong before > it's all done (and it's not done yet, so no clue here). Maybe it's shallow. > Plain test_bsddb passes. I suggest not to worry about this too much. I get the impression that the bsddb3 test suite is quite harsh, and that it makes assumptions (e.g. about how threads interleave) that hold only on the system of the author of the test. It also seems that once a test failed in an unpredicted manner (e.g. by not giving back a lock it holds), subsequent tests will fail. I have settled to wait for user cooperation, here: the "plain" bsddb modus seems to work fine, the problems only occur if you use advanced features like locking and transactions. If people stumble over it, some may contribute patches for 2.3.1. Before someone complains about this attitude: I do believe that there ain't any backward compatibility issues with those bugs, i.e. code that used to work with 2.2 will continue to work. You have to deliberately make use of the new features to run into the new bugs. Regards, Martin From Paul.Moore@atosorigin.com Fri Jan 24 09:53:25 2003 From: Paul.Moore@atosorigin.com (Moore, Paul) Date: Fri, 24 Jan 2003 09:53:25 -0000 Subject: [Python-Dev] Proto-PEP regarding writing bytecode files Message-ID: <16E1010E4581B049ABC51D4975CEDB880113D873@UKDCX001.uk.int.atosorigin.com> From: Skip Montanaro [mailto:skip@pobox.com] > At any rate, I believe I incorporated everybody's suggestions > (thanks for the feedback) and checked it in as pop-0304 (the > number David Goodger assigned me). 
You can now grab it from
> CVS or view it online at
>
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/peps/pep-0304.txt
>
> More feedback is welcome. Once the uproar^H^H^H^H^H^Hfeedback
> dies down a bit, I will look at implementing the idea.

Reading this, it seems to me that this could probably be implemented
by an import hook. Just's plan is to expose the (currently internal)
filesystem module loading code as a hook for 2.3a2. If you discuss
this with him, I'd imagine that it would be possible to

a) Ensure that however it gets exposed, it has enough customisability
   to allow you to implement this feature
b) Actually implement this feature as an import hook, which could
   then be loaded by people who need it, and ignored by people (like
   me) who don't care.

The PEP still needs an entry in the Issues section for import hooks.
For now, just putting "Not clear how this interacts with import hooks"
should be enough...

Paul.

From duncan@rcp.co.uk Fri Jan 24 11:42:55 2003
From: duncan@rcp.co.uk (Duncan Booth)
Date: Fri, 24 Jan 2003 11:42:55 +0000
Subject: [Python-Dev] Extended Function syntax
References: <3E300A13.6020303@pobox.com>
Message-ID:

John Williams wrote in news:3E300A13.6020303@pobox.com:

> Compared to the other proposal going around (which I'll call Guido's,
> since he brought it up), the really big advantage of my proposal is that
> you can use it to do something like adding a property to a class
> implicitly by defining its getter and setter methods:
>
> class A(object):
>
>     def get foo(self):
>         "Getter for property 'foo'."
>         return self.__foo
>
>     def set foo(self, foo):
>         "Setter for property 'foo'."
>         self.__foo = foo
>
> At this stage I'd much rather see Guido's proposal implemented, unless
> someone comes up with a truly ingenious way to combine the advantages of
> both.

How about this:

class A(object):

    def foo(self, foo) [property.set]:
        "Setter for property 'foo'."
        self.__foo = foo

    def foo(self) [property.get]:
        "Getter for property 'foo'."
        return self.__foo

Then add static methods to property that look something like this:

def set(fn):
    if isinstance(fn, property):
        return property(fn.fget, fn, fn.fdel, fn.__doc__)
    else:
        return property(fset=fn)

def get(fn): ...
def delete(fn): ...

--
Duncan Booth duncan@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?

From martin@v.loewis.de Fri Jan 24 11:52:32 2003
From: martin@v.loewis.de ("Martin v. Löwis")
Date: Fri, 24 Jan 2003 12:52:32 +0100
Subject: [Python-Dev] Extended Function syntax
In-Reply-To:
References: <3E300A13.6020303@pobox.com>
Message-ID: <3E312900.90100@v.loewis.de>

Duncan Booth wrote:

> How about this:
>
> class A(object):
>
>     def foo(self, foo) [property.set]:
>         "Setter for property 'foo'."
>         self.__foo = foo
>
>     def foo(self) [property.get]:
>         "Getter for property 'foo'."
>         return self.__foo

This is beautiful, but it does not work: when defining the getter, you
need both the old property, and the new function object.

Looking at your code

def set(fn):
    if isinstance(fn, property):
        return property(fn.fget, fn, fn.fdel, fn.__doc__)

you first assume fn is the property object, and then assume it is the
setter function.

Of course, there is no reason why the namespace-under-construction
couldn't be passed to the annotation, but that would be an extension
to the protocol.

Martin

From duncan@rcp.co.uk Fri Jan 24 13:22:04 2003
From: duncan@rcp.co.uk (Duncan Booth)
Date: Fri, 24 Jan 2003 13:22:04 +0000
Subject: [Python-Dev] Extended Function syntax
References: <3E300A13.6020303@pobox.com> <3E312900.90100@v.loewis.de>
Message-ID:

"Martin v. Löwis" wrote in news:3E312900.90100@v.loewis.de:

> Duncan Booth wrote:
>> How about this:
>>
>> class A(object):
>>
>>     def foo(self, foo) [property.set]:
>>         "Setter for property 'foo'."
>>         self.__foo = foo
>>
>>     def foo(self) [property.get]:
>>         "Getter for property 'foo'."
>>         return self.__foo
>
> This is beautiful, but it does not work: when defining the getter, you
> need both the old property, and the new function object.

Gah!, I must be asleep today.

Something like this might work (although it is getting a bit messy):

def set(fn):
    get, delete, doc = None, None, fn.__doc__
    namespace = inspect.currentframe().f_back.f_locals
    oldfn = namespace.get(fn.__name__)
    if isinstance(oldfn, property):
        get, delete, doc = oldfn.fget, oldfn.fdel, oldfn.__doc__
    return property(get, fn, delete, doc)

--
Duncan Booth duncan@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?

From barry@python.org Fri Jan 24 13:35:02 2003
From: barry@python.org (Barry A. Warsaw)
Date: Fri, 24 Jan 2003 08:35:02 -0500
Subject: [Python-Dev] Looking for a Windows bsddb fan
References:
Message-ID: <15921.16646.977988.547638@gargle.gargle.HOWL>

>>>>> "TP" == Tim Peters writes:

    TP> Barry said I should switch Python 2.3 on Windows to using
    TP> Sleepycat 4.1.25. A patch for doing so is attached. The full
    TP> test_bsddb3 looks like a disaster then, and I don't have time
    TP> to dig into it:

I forwarded this message to pybsddb-users@lists.sf.net. I'm not sure
if Greg Smith reads python-dev and he's probably the best guy to look
at Windows problems. Needless to say, the extended tests work(ed)
fine for me on Linux.

-Barry

From barry@python.org Fri Jan 24 13:58:01 2003
From: barry@python.org (Barry A. Warsaw)
Date: Fri, 24 Jan 2003 08:58:01 -0500
Subject: [Python-Dev] Looking for a Windows bsddb fan
References: <15921.16646.977988.547638@gargle.gargle.HOWL>
Message-ID: <15921.18025.776570.83329@gargle.gargle.HOWL>

>>>>> "BAW" == Barry A Warsaw writes:

    BAW> I forwarded this message to pybsddb-users@lists.sf.net.
I'm BAW> not sure if Greg Smith reads python-dev and he's probably the BAW> best guy to look at Windows problems. Needless to say, the BAW> extended tests work(ed) fine for me on Linux. I take that back, I'm getting similar failures on Linux after cvs upping and rebuilding with db 4.1.25. Will investigate. -Barry -------------------- snip snip -------------------- ... ====================================================================== ERROR: test01_associateWithDB (bsddb.test.test_associate.ThreadedAssociateRecnoTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/barry/projects/python/Lib/bsddb/test/test_associate.py", line 96, in setUp self.env.open(homeDir, db.DB_CREATE | db.DB_INIT_MPOOL | DBAgainError: (11, 'Resource temporarily unavailable -- unable to join the environment') ====================================================================== ERROR: test02_associateAfterDB (bsddb.test.test_associate.ThreadedAssociateRecnoTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/barry/projects/python/Lib/bsddb/test/test_associate.py", line 96, in setUp self.env.open(homeDir, db.DB_CREATE | db.DB_INIT_MPOOL | DBAgainError: (11, 'Resource temporarily unavailable -- unable to join the environment') ---------------------------------------------------------------------- Ran 182 tests in 493.527s FAILED (errors=18) [26387 refs] From barry@python.org Fri Jan 24 14:00:39 2003 From: barry@python.org (Barry A. Warsaw) Date: Fri, 24 Jan 2003 09:00:39 -0500 Subject: [Python-Dev] Looking for a Windows bsddb fan References: <15921.16646.977988.547638@gargle.gargle.HOWL> <15921.18025.776570.83329@gargle.gargle.HOWL> Message-ID: <15921.18183.972692.821964@gargle.gargle.HOWL> >>>>> "BAW" == Barry A Warsaw writes: >>>>> "BAW" == Barry A Warsaw writes: BAW> I forwarded this message to pybsddb-users@lists.sf.net. 
I'm BAW> not sure if Greg Smith reads python-dev and he's probably the BAW> best guy to look at Windows problems. Needless to say, the BAW> extended tests work(ed) fine for me on Linux. BAW> I take that back, I'm getting similar failures on Linux after BAW> cvs upping and rebuilding with db 4.1.25. Arg. It's gotta be some kind of race condition. Second run, no failures. :( -Barry From tim.one@comcast.net Fri Jan 24 17:33:13 2003 From: tim.one@comcast.net (Tim Peters) Date: Fri, 24 Jan 2003 12:33:13 -0500 Subject: [Python-Dev] Looking for a Windows bsddb fan In-Reply-To: Message-ID: [martin@v.loewis.de] > I suggest not to worry about this too much. I get the impression that > the bsddb3 test suite is quite harsh, and that it makes assumptions > (e.g. about how threads interleave) that hold only on the system of > the author of the test. It clearly shows timing-dependent behavior. The tests all passed on Win2K (once ), so I checked it in. > It also seems that once a test failed in an unpredicted manner (e.g. by > not giving back a lock it holds), subsequent tests will fail. Easy to believe. > I have settled to wait for user cooperation, here: the "plain" bsddb > modus seems to work fine, the problems only occur if you use advanced > features like locking and transactions. If people stumble over it, > some may contribute patches for 2.3.1. > > Before someone complains about this attitude: ... Not me. It would be better if the tests were robust, but I realize it can be supernaturally difficult to achieve that (at PLabs we write and run tests for ZODB and ZEO too -- sporadic failures are a fact of life, and often due to the difficulty of writing controlled multi-threaded tests). From greg@electricrain.com Fri Jan 24 20:24:57 2003 From: greg@electricrain.com (Gregory P. 
Smith) Date: Fri, 24 Jan 2003 12:24:57 -0800 Subject: [pybsddb] Re: [Python-Dev] Looking for a Windows bsddb fan In-Reply-To: <15921.18025.776570.83329@gargle.gargle.HOWL> References: <15921.16646.977988.547638@gargle.gargle.HOWL> <15921.18025.776570.83329@gargle.gargle.HOWL> Message-ID: <20030124202457.GB14096@zot.electricrain.com> I don't get these errors under linux or windows using the latest pybsddb cvs and python 2.2.2. (berkeleydb 4.1.25) Occasionally there are DBDeadlockErrors that come up in the threading tests (more often on windows) but they are appropriate and non fatal to the tests. I haven't tried python 2.3a from cvs. Can you verify that the build/db_home directory created to use for the test DB environment & DB files is cleared out when you run test.py (i believe part of the unit tests themselves removes the entire directory and contents; but I remember errors long ago that were due to using an old messed up DB environment rather than a fresh one) -G On Fri, Jan 24, 2003 at 08:58:01AM -0500, Barry A. Warsaw wrote: > > >>>>> "BAW" == Barry A Warsaw writes: > > BAW> I forwarded this message to pybsddb-users@lists.sf.net. I'm > BAW> not sure if Greg Smith reads python-dev and he's probably the > BAW> best guy to look at Windows problems. Needless to say, the > BAW> extended tests work(ed) fine for me on Linux. > > I take that back, I'm getting similar failures on Linux after > cvs upping and rebuilding with db 4.1.25. > > Will investigate. > -Barry > > -------------------- snip snip -------------------- > ... 
> ====================================================================== > ERROR: test01_associateWithDB (bsddb.test.test_associate.ThreadedAssociateRecnoTestCase) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/home/barry/projects/python/Lib/bsddb/test/test_associate.py", line 96, in setUp > self.env.open(homeDir, db.DB_CREATE | db.DB_INIT_MPOOL | > DBAgainError: (11, 'Resource temporarily unavailable -- unable to join the environment') ... From mgarcia@cole-switches.com Fri Jan 24 20:26:48 2003 From: mgarcia@cole-switches.com (Manuel Garcia) Date: Fri, 24 Jan 2003 12:26:48 -0800 Subject: [Python-Dev] Re: Extended Function syntax Message-ID: I always considered stat_m = staticmethod(stat_m) class_m = classmethod(class_m) to be less about an awkward notation, and more about the standard Python idiom for one technique to keep namespaces clean of unused identifiers. Like this: i = int(i) # i held a string, now holds an int Instead of: i = int(i_string) # will never use i_string again With the benefit of clean namespaces being less identifiers to trip over when using any tool for introspection, preventing accidental references to identifiers we really wanted to be temporary, etc. In my mind, staticmethod(), classmethod(), and property() are the same, they take functions and "package" them in a way suitable for implementing methods or properties. Since everyone loves functional programming :-/ , functions that act on functions are nothing to be ashamed of, and don't deserve to be hidden behind syntactical sugar. (Things like stat_m = staticmethod(stat_m) is one reason of why C++ and Java people consider Python "not really object oriented", as if throwing keywords and keyword phrases at every damn OO construct is the best way to be object oriented. Yes, syntactical sugar like def name() [staticmethod]: helps Python gain more respect with the C++/Java crowd.) 
The most explicit way of using staticmethod(), classmethod(), property() forces you to take extra steps to keep your namespace clean: class klass01(object): def _pure_set_x(x): klass01.x = x static_set_x = staticmethod(_pure_set_x) def _pure_get_x(): return klass01.x static_get_x = staticmethod(_pure_get_x) def _pure_set_y(cls, y): cls.y = y class_set_y = classmethod(_pure_set_y) def _pure_get_y(cls): return cls.y class_get_y = classmethod(_pure_get_y) del _pure_set_x, _pure_get_x, _pure_set_y, _pure_get_y def _get_j(self): return self._j def _set_j(self, j): self._j = j j = property(_get_j, _set_j, None, 'dynamite!') del _get_j, _set_j ( This argument is more about property() than staticmethod(), classmethod(). I have to admit "stat_m = staticmethod(stat_m)" doesn't really bother me very much. I have a little more to say about staticmethod below.) I always wanted some syntactical sugar for this idiom: def _junk(): a = long_expression01(x0, y0) b = long_expression02(x1, y1) c = long_expression03(x2, y2) return min(a,b,c), max(a,b,c) d, e = _junk() del _junk() with the purpose being keeping a,b,c out of our nice clean namespace. The notation would be: d, e = block: a = long_expression01(x0, y0) b = long_expression02(x1, y1) c = long_expression03(x2, y2) return min(a,b,c), max(a,b,c) We would set up a property like this (if there is no benefit to having _get_j and _set_j in the namespace): j = block: def _get_j(self): return self._j def _set_j(self, j): self._j = j return property(_get_j, _set_j, None, 'dynamite!') I would like this way to define properties. Everything is for "j" is kept together, with the temporary names that make everything clear disappearing when we leave the block. Considering the most general form of the proposed extended function syntax: def name(arg, ...) [expr1, expr2, expr3]: ...body... we would have instead: name = block: def _(arg, ...): ...body... 
return expr3(expr2(expr1(_))) It takes 2 more lines, but you have the benefit of not having to explain that [expr1, expr2, expr3] means expr3(expr2(expr1(name))) Also less likely to need a lambda, etc. I admit this is almost no help at all for staticmethod(), classmethod() except perhaps in this case: stat1, stat2, stat3 = block: def pure1(): ...body... def pure2(): ...body... def pure3(): ...body... return staticmethod( (pure1, pure2, pure3) ) where staticmethod() now also accepts a tuple of functions. Contrived, but if you had more than one staticmethod, they would probably be grouped together anyway. We only saved 1 lousy line for 3 staticmethods. However, we have made the idea of turning pure functions into staticmethods explicit. I admit we have gained very little. We gain nothing here (except dubious explicitness): stat1 = block: def pure1(): ...body... return staticmethod(pure1) instead of: def stat1(): ...body... stat1 = staticmethod(stat1) or: def stat1() [staticmethod]: ...body... Some mention was made of classes for a similar extended syntax: klass1 = block: class klass1(): ...body... return I02(I01(klass1)) Obvious problems: All this stinks of trying to come up with a poor substitute of "super-lambda" (a way to make an anonymous function allowing multiple lines and statements). I don't deny it. The main problem is that on the right hand side of the equals sign is something that is not an expression. We could solve that problem with this abomination: block: global d, e a = long_expression01(x0, y0) b = long_expression02(x1, y1) c = long_expression03(x2, y2) d,e = min(a,b,c), max(a,b,c) I associate use of 'global' in my programs with a design flaw, I would not defend this. Finishing up: block: ...body... when you don't care about the return value. g = block: ...body... yield x Would be the same as: def g(): ...body... yield x but probably should raise an error because it doesn't match what happens with 'return', not even matching in spirit. 
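
A minimal sketch of the idiom discussed in the message above, written for present-day Python 3 rather than 2.3. The property is assembled inside a throwaway helper so that the getter and setter names never reach the class namespace, which is roughly what the proposed `block:` form is meant to achieve. The names `Klass`, `_make_j`, `Calc`, `_pure_add` and `apply_wrappers` are assumptions made up for this illustration and are not part of any proposal in the thread; `apply_wrappers` only spells out the stated reading of `[expr1, expr2, expr3]` as `expr3(expr2(expr1(name)))`.

```python
class Klass(object):
    def _make_j():
        # Temporary names stay local to this helper.
        def _get_j(self):
            return self._j

        def _set_j(self, j):
            self._j = j

        return property(_get_j, _set_j, None, 'dynamite!')

    j = _make_j()
    del _make_j  # only 'j' is left behind in the class namespace


def apply_wrappers(func, *wrappers):
    """Hypothetical helper: apply the wrappers left to right."""
    for wrapper in wrappers:
        func = wrapper(func)
    return func


def _pure_add(x, y):
    return x + y


class Calc(object):
    # Same spirit as the proposed: def add(x, y) [staticmethod]: ...
    add = apply_wrappers(_pure_add, staticmethod)
```

The helper function costs two extra lines per property compared with the `del _get_j, _set_j` spelling, but nothing has to be deleted by name afterwards except the helper itself.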
From tim@zope.com Fri Jan 24 22:34:51 2003
From: tim@zope.com (Tim Peters)
Date: Fri, 24 Jan 2003 17:34:51 -0500
Subject: [Python-Dev] Mixed-type datetime comparisons
Message-ID:

I checked in changes so that datetime and date comparison return
NotImplemented (instead of raising TypeError) if "the other" argument
has a timetuple attribute. This gives other kinds of datetime objects
a chance to intercept the comparison and implement it themselves.

Note that this doesn't help for mixed-type time or timedelta
comparison: datetime's time and timedelta objects don't have timetuple
methods themselves, and their comparison implementations still raise
TypeError whenever they don't recognize the other comparand's type
(this is needed to prevent comparison against objects of arbitrary
types from falling back to the default comparison of object addresses).

From vinay_sajip@red-dove.com Fri Jan 24 23:27:48 2003
From: vinay_sajip@red-dove.com (Vinay Sajip)
Date: Fri, 24 Jan 2003 23:27:48 -0000
Subject: [Python-Dev] logging package -- documentation
References: <200301231540.h0NFe8s06989@odiug.zope.com> <01af01c2c2fc$83962ec0$652b6992@alpha> <200301231633.h0NGXqq07476@odiug.zope.com>
Message-ID: <000f01c2c400$38377d80$652b6992@alpha>

> I'm willing to keep these things if you fix the documentation. The
> documentation for the whole module is currently in a rather sorry
> state. Example: "Using the package doesn't get much simpler." is more
> sales talk than documentation, and the example following that phrase
> produces less and different output than quoted.

Yes, but I've already uploaded a better LaTeX documentation file:

http://sourceforge.net/tracker/?func=detail&aid=638299&group_id=5470&atid=305470

This was uploaded quite a while ago - on 14 November 2002. The version
in CVS appears to be a version Skip put together from the web page
(http://www.red-dove.com/python_logging.html); hence the "sales talk".
The above item is assigned to Skip and you've asked him to merge my upload with his work, or override as he sees fit. The status is still open so I'm not sure whether Skip has had a chance to do anything. Admittedly, that upload does not have complete details on the configuration file format. I can easily rectify this, but can you (or someone) please look at the above upload and tell me if the documentation is otherwise OK? Regards, Vinay From skip@pobox.com Fri Jan 24 23:37:02 2003 From: skip@pobox.com (Skip Montanaro) Date: Fri, 24 Jan 2003 17:37:02 -0600 Subject: [Python-Dev] logging package -- documentation In-Reply-To: <000f01c2c400$38377d80$652b6992@alpha> References: <200301231540.h0NFe8s06989@odiug.zope.com> <01af01c2c2fc$83962ec0$652b6992@alpha> <200301231633.h0NGXqq07476@odiug.zope.com> <000f01c2c400$38377d80$652b6992@alpha> Message-ID: <15921.52766.84550.718271@montanaro.dyndns.org> Vinay> Yes, but I've already uploaded a better LaTeX documentation file: Vinay> http://sourceforge.net/tracker/?func=detail&aid=638299&group_id=5470&atid=305470 Vinay> This was uploaded quite a while ago - on 14 November 2002. I should just replace what I wrote with what Vinay wrote. I started out trying to "merge" them, but quickly got bogged down because the two documents didn't have a common heritage. Vinay> The status is still open so I'm not sure whether Skip has had a Vinay> chance to do anything. Vinay, do you have checkin privileges? If so, please go ahead and do the replacement. Otherwise, let me know and I'll take care of it. My apologies for falling down on the job. 
Skip

From gtalvola@nameconnector.com Fri Jan 24 23:37:38 2003
From: gtalvola@nameconnector.com (Geoffrey Talvola)
Date: Fri, 24 Jan 2003 18:37:38 -0500
Subject: [Python-Dev] the new 2.3a1 settimeout() with httplib and SSL
Message-ID: <61957B071FF421419E567A28A45C7FE54010DC@mailbox.nameconnector.com>

I'm trying to get the new socket.settimeout() in Python 2.3a1 to work in
conjunction with httplib and SSL. This code seems to work fine:

import httplib
conn = httplib.HTTPConnection('ncsdevtest.nameconnector.com', 80)
conn.connect()
conn.sock.settimeout(90)
conn.request('GET', '/cgi-bin/Pause30.cgi')
response = conn.getresponse()
print response.status, response.reason
data = response.read()
print 'read', len(data), 'bytes'
conn.close()

Where Pause30.cgi is a cgi script that simply sleeps for 30 seconds then
sends back a simple response.

As-is, this program returns after 30 seconds. If I adjust the timeout of 90
to be, let's say, 5 seconds, I correctly get a timeout exception after 5
seconds. So far, so good.

But if I change HTTPConnection to HTTPSConnection and change 80 to 443, I
have trouble -- my CPU usage goes up to 100%, the python process sucks up
more and more memory, and it doesn't time out at all. It does still return
the correct response after 30 seconds.

Is there a way to do this? Should I enter a bug report?

- Geoff

From guido@python.org Sat Jan 25 00:58:15 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 24 Jan 2003 19:58:15 -0500
Subject: [Python-Dev] the new 2.3a1 settimeout() with httplib and SSL
In-Reply-To: Your message of "Fri, 24 Jan 2003 18:37:38 EST." <61957B071FF421419E567A28A45C7FE54010DC@mailbox.nameconnector.com>
References: <61957B071FF421419E567A28A45C7FE54010DC@mailbox.nameconnector.com>
Message-ID: <200301250058.h0P0wFG22464@pcp02138704pcs.reston01.va.comcast.net>

> I'm trying to get the new socket.settimeout() in Python 2.3a1 to work in
> conjunction with httplib and SSL.
This code seems to work fine:
>
> import httplib
> conn = httplib.HTTPConnection('ncsdevtest.nameconnector.com', 80)
> conn.connect()
> conn.sock.settimeout(90)
> conn.request('GET', '/cgi-bin/Pause30.cgi')
> response = conn.getresponse()
> print response.status, response.reason
> data = response.read()
> print 'read', len(data), 'bytes'
> conn.close()
>
> Where Pause30.cgi is a cgi script that simply sleeps for 30 seconds then
> sends back a simple response.
>
> As-is, this program returns after 30 seconds. If I adjust the timeout of 90
> to be, let's say, 5 seconds, I correctly get a timeout exception after 5
> seconds. So far, so good.
>
> But if I change HTTPConnection to HTTPSConnection and change 80 to 443, I
> have trouble -- my CPU usage goes up to 100%, the python process sucks up
> more and more memory, and it doesn't time out at all. It does still return
> the correct response after 30 seconds.
>
> Is there a way to do this? Should I enter a bug report?

Hm, when I added the timeout feature, I didn't think of SSL at all. I
imagine that SSL gets an error and keeps retrying immediately, rather
than using select() to block until more data is available.

Part of this is that this simply doesn't work for SSL -- you shouldn't
do that. (Sorry if you want it -- it's beyond my capabilities to hack
this into the SSL code.)

Part of this is that the SSL code should refuse a socket that's in
nonblocking mode, *or* perhaps should restore blocking mode; I'm not
sure.

Anyway, please do enter a bug report. (A patch would be even cooler!)
--Guido van Rossum (home page: http://www.python.org/~guido/) From vinay_sajip@red-dove.com Sat Jan 25 12:39:55 2003 From: vinay_sajip@red-dove.com (Vinay Sajip) Date: Sat, 25 Jan 2003 12:39:55 -0000 Subject: [Python-Dev] logging package -- documentation References: <200301231540.h0NFe8s06989@odiug.zope.com> <01af01c2c2fc$83962ec0$652b6992@alpha> <200301231633.h0NGXqq07476@odiug.zope.com> <000f01c2c400$38377d80$652b6992@alpha> <15921.52766.84550.718271@montanaro.dyndns.org> Message-ID: <000601c2c46e$deccb700$652b6992@alpha> > Vinay, do you have checkin privileges? If so, please go ahead and do the > replacement. Otherwise, let me know and I'll take care of it. > > My apologies for falling down on the job. No problem. I don't have checkin privileges, but I do have a revised liblogging.tex which adds a section for the config file format. I will upload this later today, so please do the checkin of the revised file. Thanks, Vinay From mal@lemburg.com Sat Jan 25 17:39:19 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 25 Jan 2003 18:39:19 +0100 Subject: [Python-Dev] Mixed-type datetime comparisons In-Reply-To: References: Message-ID: <3E32CBC7.6090108@lemburg.com> Tim Peters wrote: > I checked in changes so that datetime and date comparison return > NotImplemented (instead of raising TypeError) if "the other" argument has a > timetuple attribute. This gives other kinds of datetime objects a chance to > intercept the comparison and implement it themselves. Nice. > Note that this doesn't help for mixed-type time or timedelta comparison: > datetime's time and timedelta objects don't have timetuple methods > themselves, and their comparison implementations still raise TypeError > whenever they don't recognize the other comparand's type (this is needed to > prevent comparison against objects of arbitrary types from falling back to > the default comparison of object addresses). Why not add them (setting the date parts to None) ? 
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim@zope.com Sat Jan 25 19:25:22 2003 From: tim@zope.com (Tim Peters) Date: Sat, 25 Jan 2003 14:25:22 -0500 Subject: [Python-Dev] Mixed-type datetime comparisons In-Reply-To: <3E32CBC7.6090108@lemburg.com> Message-ID: [TIm] > Note that this doesn't help for mixed-type time or timedelta comparison: > datetime's time and timedelta objects don't have timetuple methods > themselves, and their comparison implementations still raise TypeError > whenever they don't recognize the other comparand's type (this > is needed to prevent comparison against objects of arbitrary types from > falling back to the default comparison of object addresses). [M.-A. Lemburg] > Why not add them (setting the date parts to None) ? Because they're of no use, and at least timedelta.timetuple() wouldn't even make hallucinogenic sense (a timetuple is in units of days, seconds and microseconds). From tim.one@comcast.net Sat Jan 25 19:29:45 2003 From: tim.one@comcast.net (Tim Peters) Date: Sat, 25 Jan 2003 14:29:45 -0500 Subject: [pybsddb] Re: [Python-Dev] Looking for a Windows bsddb fan In-Reply-To: <20030124202457.GB14096@zot.electricrain.com> Message-ID: [Gregory P. Smith] > I don't get these errors under linux or windows using the latest pybsddb > cvs and python 2.2.2. (berkeleydb 4.1.25) Which flavor of Windows? I have much better luck on Win2K, where the tests have often passed with no glitches. The DBAgainError: (11, 'Resource temporarily unavailable -- unable to join the environment') business has occurred every time on Win98SE, although how often, and in which tests, varies across runs. 
> Occasionally there are DBDeadlockErrors that come up in the threading > tests (more often on windows) but they are appropriate and non fatal to > the tests. Good to know. I've seen that too, but much less often than DBAgainError on Win98SE. > I haven't tried python 2.3a from cvs. > > Can you verify that the build/db_home directory created to use for the > test DB environment & DB files is cleared out when you run test.py What would the parent directory of build/db_home be? I see no such directory on Win98SE. From mal@lemburg.com Sat Jan 25 20:13:43 2003 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 25 Jan 2003 21:13:43 +0100 Subject: [Python-Dev] Mixed-type datetime comparisons In-Reply-To: References: Message-ID: <3E32EFF7.1090400@lemburg.com> Tim Peters wrote: > [TIm] > >>Note that this doesn't help for mixed-type time or timedelta comparison: >>datetime's time and timedelta objects don't have timetuple methods >>themselves, and their comparison implementations still raise TypeError >>whenever they don't recognize the other comparand's type (this >>is needed to prevent comparison against objects of arbitrary types from >>falling back to the default comparison of object addresses). > > > [M.-A. Lemburg] > >>Why not add them (setting the date parts to None) ? > > Because they're of no use, and at least timedelta.timetuple() wouldn't even > make hallucinogenic sense (a timetuple is in units of days, seconds and > microseconds). Yes... I don't see your point :-) (mxDateTime has DateTime and DateTimeDelta types; no Time type because it's not needed) If the name bothers, you why not define a different eye-catcher, e.g. datetime.tuple(), time.tuple() and timedelta.tuple() ?! (or whatever you prefer) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH _______________________________________________________________________ eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,... 
Python Consulting: http://www.egenix.com/ Python Software: http://www.egenix.com/files/python/ From tim@zope.com Sat Jan 25 21:14:22 2003 From: tim@zope.com (Tim Peters) Date: Sat, 25 Jan 2003 16:14:22 -0500 Subject: [Python-Dev] Mixed-type datetime comparisons In-Reply-To: <3E32EFF7.1090400@lemburg.com> Message-ID: >> Because they're of no use, and at least timedelta.timetuple() >> wouldn't even make hallucinogenic sense (a timetuple is in units of >> days, seconds and microseconds). [M.-A. Lemburg] > Yes... I don't see your point :-) (mxDateTime has DateTime and > DateTimeDelta types; no Time type because it's not needed) > > If the name bothers, you why not define a different eye-catcher, e.g. > datetime.tuple(), time.tuple() and timedelta.tuple() ?! (or whatever > you prefer) For what purpose? What should they do? Why bother? If this is about dealing with mixed-type comparisons, defining senseless methods isn't a reasonable approach. datetime.timetuple() made sense for datetime.datetime objects, and was part of /F's "minimal proposal". Well, sniffing for timetuple was a "maybe" there, the actual proposal was that all datetime types derive from a new basetime class, but IIRC you didn't want that. The proposal isn't clear to me on this point, but it didn't seem to address pure time-of-day (exclusive of day) or duration/delta types (it mentions the latter, but didn't say how they were to be identified). I'm out of time for doing anything else on 2.3's datetime, so I won't respond to this topic again (sorry, I just don't have time for it). If you and /F can agree on a proposal that does makes sense, and it's not too elaborate, I'll try to make time to add it after the discussion is over. 
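To make the checked-in behaviour discussed in this thread concrete, here is a minimal sketch of a foreign datetime-like type intercepting a mixed-type comparison. The Stamp class and its _key helper are hypothetical illustrations, not part of datetime or mxDateTime. The only assumption taken from the thread is that datetime returns NotImplemented, rather than raising TypeError, when the other comparand has a timetuple attribute:

```python
from datetime import datetime


class Stamp(object):
    """Hypothetical third-party date type (not part of datetime)."""

    def __init__(self, year, month, day):
        self.ymd = (year, month, day)

    def timetuple(self):
        # Having this attribute is what the datetime comparison sniffs for.
        return self.ymd + (0, 0, 0, 0, 0, -1)

    @staticmethod
    def _key(other):
        # Reduce either kind of object to a comparable (year, month, day).
        return tuple(other.timetuple())[:3]

    # datetime declines the comparison (NotImplemented), so Python falls
    # back to these methods with the operands swapped where needed.
    def __eq__(self, other):
        return self.ymd == self._key(other)

    def __lt__(self, other):
        return self.ymd < self._key(other)

    def __gt__(self, other):
        return self.ymd > self._key(other)


d = datetime(2003, 1, 25)
later_than_d = d < Stamp(2003, 1, 26)   # resolved by Stamp.__gt__
same_day = d == Stamp(2003, 1, 25)      # resolved by Stamp.__eq__
```

An expression such as d < Stamp(2003, 1, 26) is offered to datetime first; once datetime declines, the reflected Stamp.__gt__ decides the result. As Tim notes, time and timedelta comparisons do not get this treatment.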
From neal@metaslash.com Sat Jan 25 21:35:05 2003 From: neal@metaslash.com (Neal Norwitz) Date: Sat, 25 Jan 2003 16:35:05 -0500 Subject: [Python-Dev] Re: logging package -- spurious package contents In-Reply-To: <200301231633.h0NGXqq07476@odiug.zope.com> References: <200301231540.h0NFe8s06989@odiug.zope.com> <01af01c2c2fc$83962ec0$652b6992@alpha> <200301231633.h0NGXqq07476@odiug.zope.com> Message-ID: <20030125213504.GC24222@epoch.metaslash.com> On Thu, Jan 23, 2003 at 11:33:52AM -0500, Guido van Rossum wrote: > > I'm willing to keep these things if you fix the documentation. The > documentation for the whole module is currently in a rather sorry > state. Example: "Using the package doesn't get much simpler." is more > sales talk than documentation, and the example following that phrase > produces less and different output than quoted. I just checked in Vinay's doc. I fixed the markup, but it needs review. Neal From fdrake@acm.org Sat Jan 25 21:47:24 2003 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Sat, 25 Jan 2003 16:47:24 -0500 Subject: [Python-Dev] Re: logging package -- spurious package contents In-Reply-To: <20030125213504.GC24222@epoch.metaslash.com> References: <200301231540.h0NFe8s06989@odiug.zope.com> <01af01c2c2fc$83962ec0$652b6992@alpha> <200301231633.h0NGXqq07476@odiug.zope.com> <20030125213504.GC24222@epoch.metaslash.com> Message-ID: <15923.1516.445702.228531@grendel.zope.com> Neal Norwitz writes: > I just checked in Vinay's doc. I fixed the markup, but it needs review. Thanks, Neal! I'll try and look at this sometime next week. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From hpk@trillke.net Sat Jan 25 23:53:21 2003 From: hpk@trillke.net (holger krekel) Date: Sun, 26 Jan 2003 00:53:21 +0100 Subject: [Python-Dev] ann: Minimal PyPython sprint Message-ID: <20030126005321.E10805@prim.han.de> ------------------------------------------------------ Not-So-Mini Sprint towards Minimal PyPython 17th-23rd Feb. 
2003 "Trillke-Gut", Hildesheim, Germany ------------------------------------------------------ Everybody is invited to join our first sprint for PyPython. Or was it Minimal Python? Nevermind! We will have one or - if needed - two big rooms, beamers for presentations (and movies?), a kitchen, internet and a piano. There is a big park and some forest in case you need some fresh air. Short Ad-Hoc presentations about our areas of interest, projects or plain code will certainly be appreciated. ------------------------------------------------------ Goals of the first Minimal PyPython Marathon ------------------------------------------------------ - codify ideas that were recently discussed on the pypy-dev codespeak list. - port your favorite C-module to Python (and maintain it :-) - rebuild CPython C-implementations in pure python, e.g. the interpreter, import mechanism, builtins ... - build & enhance infrastructure (python build system, webapps, email, subversion/cvs, automated testing, ...) - further explore the 'ctypes' approach from Thomas Heller to perform C/Machine-level calls from Python without extra C-bindings. - settle on concepts - focus on having releasable results at the end - have a lot of fun meeting like minded people. ------------------------------------------------------ Current Underlying Agreements (TM) ------------------------------------------------------ Please note, that we have reached some agreement on a number of basic points: a) the Python language is not to be modified. b) simplicity wins. especially against optimization. c) if in doubt we follow the CPython/PEP lead. d) CPython core abstractions are to be cleanly formulated in simple Python. e) Macro-techniques may be used at some levels if there really is no good way to avoid them. But a) to c) still hold. f) our Pythonic core "specification" is intended to be used for all kinds of generational steps e.g to C, Bytecode, Objective-C, assembler and last-but-not-least PSYCO-Specialization. 
g) if in doubt we follow Armin's and Christian's lead regarding
   the right abstraction levels. And any other python(dev) people
   taking responsibility.

h) there are a lot of wishes, code, thoughts, suggestions and ...
   names :-)

------------------------------------------------------
How to proceed if you are interested
------------------------------------------------------

If you want to come - even part-time - then please subscribe at

    http://codespeak.net/mailman/listinfo/pypy-sprint

where organisational stuff will be communicated. The sprint-location's city is called Hildesheim which is close to Hannover (200 km south of Hamburg). Code-related discussions continue to take place on the pypy-dev list.

For people who don't want or can't spend the money for external rooms I can probably arrange something but it will not be luxurious.

FYI there are already 7-8 people who will come, among them Christian Tismer, Armin Rigo and Michael Hudson.

------------------------------------------------------
Disclaimer
------------------------------------------------------

There is no explicit commercial intention behind the organisation of the sprint (yet). On the implicit hand socializing with like-minded people tends to bring up future opportunities.

The physical sprint location is provided by the people living in this house:

    http://www.trillke.net/images/trillke_schnee.png

although snow is gone for the moment.

and-who-said-the-announcement-would-be-minimal'ly y'rs,

    holger

From tim.one@comcast.net Sun Jan 26 02:38:35 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 25 Jan 2003 21:38:35 -0500
Subject: [Python-Dev] xmlparse.c no longer compiles on Windows
Message-ID:

It starts with this:

    /* Handle the case where memmove() doesn't exist. */
    #ifndef HAVE_MEMMOVE
    #ifdef HAVE_BCOPY
    #define memmove(d,s,l) bcopy((s),(d),(l))
    #else
    #error memmove does not exist on this platform, nor is a substitute available
    #endif /* HAVE_BCOPY */
    #endif /* HAVE_MEMMOVE */

memmove() is a std C function, and Python requires std C, so there's no point to this block. Neither of the symbols are defined on Windows, and I don't want to define them -- other parts of Python use memmove() freely (like listobject.c).

If that's ripped out, there are lots of syntax errors in xmlparse.c (over 100, at which point the compiler gives up on it). They have the character of cascading msgs due to some macro confusion; the first one is here:

    typedef struct prefix {
      const XML_Char *name;
      BINDING *binding;
    } PREFIX;           The error msg points to this line

and gives the unhelpful msg

    C:\Code\python\Modules\expat\xmlparse.c(142) : error C2059: syntax error : 'string'

I assume this is because PREFIX is a macro used by Modules/getpath.c, and PC/pyconfig.h defines it like so:

    #define PREFIX ""

From james@daa.com.au Sun Jan 26 11:15:22 2003
From: james@daa.com.au (James Henstridge)
Date: Sun, 26 Jan 2003 19:15:22 +0800
Subject: [Python-Dev] Extension modules, Threading, and the GIL
Message-ID: <3E33C34A.6050405@daa.com.au>

This is a multi-part message in MIME format.
--------------090900070709010305040001
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

I originally sent this as private mail to David, but he suggested I repost it here as well.

The Python GIL has been causing trouble getting threading to work correctly in my Python bindings for GTK 2.x. Both Python and GTK have an idea about how threading should work, and meshing the two together isn't quite successful. It would be great if Python made this kind of thing easier.

James.
-- Email: james@daa.com.au WWW: http://www.daa.com.au/~james/ --------------090900070709010305040001 Content-Type: message/rfc822; name="your threading/extensions posts on python-dev.eml" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="your threading/extensions posts on python-dev.eml" Message-ID: <3E251157.3020809@daa.com.au> Date: Wed, 15 Jan 2003 15:44:23 +0800 From: James Henstridge User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2.1) Gecko/20021130 X-Accept-Language: en-au, en MIME-Version: 1.0 To: dave@boost-consulting.com Subject: your threading/extensions posts on python-dev Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Hi, I am the author/maintainer of the Python bindings for GTK+ and GNOME, so have run into a number of these problems as well. If changing Python made threading more reliable in PyGTK, that would be great. Neither Python or GTK+ are free-threaded, which makes juggling the locks quite difficult (the thread support code in current 1.99.x releases of PyGTK doesn't really work). I don't know how different pygtk's requirements are to your Qt code are, so I will give a quick description of the relevant bits of pygtk. 1. GTK uses an event loop, so the program will spend most of its time in a call to gtk.main(). We obviously want to allow other threads to run during this call (=> release python GIL). 2. GTK has a signal system to handle notifications. Arbitrary numbers of callbacks can be attached to signals. PyGTK has a generic marshal function to handle all signal callbacks, which uses introspection to handle the different signatures. Some signals will be emitted in response to method calls (ie. GIL is held), while others are emitted in the event loop (GIL not held). Some signals can be emitted in both conditions, so it isn't really possible for the signal handler marshal code to know if the GIL is being held or not. 3. 
Add to this mix GTK+'s global lock which must be held when executing gtk functions. This lock is held during signal emissions, but must be manually acquired if a thread other than the one running the main loop wants to execute gtk functions. The functions used to deallocate gtk objects count here, which means that the GTK lock must be held while finalising the Python wrappers. Having a recursive GIL in Python would help a lot with signal emission issue (to handle cases where the signal emission occurs while the Python GIL is not held, and while it is held). I am sure that it would be possible to clear up a lot of these issues by dropping the Python lock during every gtk API call, though this would require a fair bit of work to the code generator and testing (there are a few thousand gtk APIs). Sorry for a long winded email, but if you are going to work on Python threading, it would be good if the changes could help solve at least some of these problems. James Henstridge. -- Email: james@daa.com.au | Linux.conf.au http://linux.conf.au/ WWW: http://www.daa.com.au/~james/ | Jan 22-25 Perth, Western Australia. --------------090900070709010305040001-- From whisper@oz.net Sun Jan 26 11:31:45 2003 From: whisper@oz.net (David LeBlanc) Date: Sun, 26 Jan 2003 03:31:45 -0800 Subject: [Python-Dev] xmlparse.c no longer compiles on Windows In-Reply-To: Message-ID: Worked for me: expat-1.95.5 and VC 6sp5 (or sp6?) using the .dsw in the .bin distro for windows. David LeBlanc Seattle, WA USA > -----Original Message----- > From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On > Behalf Of Tim Peters > Sent: Saturday, January 25, 2003 18:39 > To: PythonDev > Subject: [Python-Dev] xmlparse.c no longer compiles on Windows > > > It starts with this: > > /* Handle the case where memmove() doesn't exist. 
*/ > #ifndef HAVE_MEMMOVE > #ifdef HAVE_BCOPY > #define memmove(d,s,l) bcopy((s),(d),(l)) > #else > #error memmove does not exist on this platform, nor is a substitute > available > #endif /* HAVE_BCOPY */ > #endif /* HAVE_MEMMOVE */ > > > memmove() is a std C function, and Python requires std C, so there's no > point to this block. Neither of the symbols are defined on Windows, and I > don't want to define them -- other parts of Python use memmove() freely > (like listobject.c). > > > If that's ripped out, there are lots of syntax errors in xmlparse.c (over > 100, at which point the compiler gives up on it). They have the character > of cascading msgs due to some macro confusion; the first one is here: > > typedef struct prefix { > const XML_Char *name; > BINDING *binding; > } PREFIX; The error msg points to this line > > and gives the unhepful msg > > C:\Code\python\Modules\expat\xmlparse.c(142) : error C2059: syntax > error : 'string' > > I assume this is because PREFIX is a macro used by Modules/getpath.c, and > PC/pyconfig.h defines it like so: > > #define PREFIX "" > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From whisper@oz.net Sun Jan 26 11:37:48 2003 From: whisper@oz.net (David LeBlanc) Date: Sun, 26 Jan 2003 03:37:48 -0800 Subject: [Python-Dev] xmlparse.c no longer compiles on Windows In-Reply-To: Message-ID: Also works for me on latest expat CVS d/l just now with VC 6.0sp5(or sp6?) David LeBlanc Seattle, WA USA > -----Original Message----- > From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On > Behalf Of Tim Peters > Sent: Saturday, January 25, 2003 18:39 > To: PythonDev > Subject: [Python-Dev] xmlparse.c no longer compiles on Windows > > > It starts with this: > > /* Handle the case where memmove() doesn't exist. 
*/ > #ifndef HAVE_MEMMOVE > #ifdef HAVE_BCOPY > #define memmove(d,s,l) bcopy((s),(d),(l)) > #else > #error memmove does not exist on this platform, nor is a substitute > available > #endif /* HAVE_BCOPY */ > #endif /* HAVE_MEMMOVE */ > > > memmove() is a std C function, and Python requires std C, so there's no > point to this block. Neither of the symbols are defined on Windows, and I > don't want to define them -- other parts of Python use memmove() freely > (like listobject.c). > > > If that's ripped out, there are lots of syntax errors in xmlparse.c (over > 100, at which point the compiler gives up on it). They have the character > of cascading msgs due to some macro confusion; the first one is here: > > typedef struct prefix { > const XML_Char *name; > BINDING *binding; > } PREFIX; The error msg points to this line > > and gives the unhepful msg > > C:\Code\python\Modules\expat\xmlparse.c(142) : error C2059: syntax > error : 'string' > > I assume this is because PREFIX is a macro used by Modules/getpath.c, and > PC/pyconfig.h defines it like so: > > #define PREFIX "" > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From martin@v.loewis.de Sun Jan 26 12:02:19 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 26 Jan 2003 13:02:19 +0100 Subject: [Python-Dev] Extension modules, Threading, and the GIL In-Reply-To: <3E33C34A.6050405@daa.com.au> References: <3E33C34A.6050405@daa.com.au> Message-ID: James Henstridge writes: > I am sure that it would be possible to clear up a lot of these issues > by dropping the Python lock during every gtk API call, though this > would require a fair bit of work to the code generator and testing > (there are a few thousand gtk APIs). This appears to be central to this issue. Can you please comment which of the following statements are incorrect? 1. 
At any time, any running thread must hold either the Python GIL or the Gtk lock, depending on whether it is executing Python code or Gtk code. 2. Each running thread needs to hold both locks only for the short periods of time where data is transferred from one world to the other. 3. Threads may hold neither lock only if they are about to block, waiting for one of the locks. 4. Currently, when running Gtk code, the thread may also hold the GIL. 5. Before invoking Gtk code, the Gtk lock is explicitly acquired. If this is all true, then it appears indeed that releasing the GIL for any Gtk call would solve your problem. I cannot see why this is difficult to implement, since you already have code in each wrapper to deal with locks, namely to acquire the Gtk lock. Then, in a signal callback, you can just acquire the GIL, and be done. Regards, Martin From martin@v.loewis.de Sun Jan 26 12:03:48 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 26 Jan 2003 13:03:48 +0100 Subject: [Python-Dev] xmlparse.c no longer compiles on Windows In-Reply-To: References: Message-ID: "David LeBlanc" writes: > Worked for me: expat-1.95.5 and VC 6sp5 (or sp6?) using the .dsw in the .bin > distro for windows. You misunderstand. Tim is not talking about the Expat CVS, or Expat releases. Instead, he is talking about the Python CVS (this being python-dev). 
Regards,
Martin

From skip@manatee.mojam.com Sun Jan 26 13:00:37 2003
From: skip@manatee.mojam.com (Skip Montanaro)
Date: Sun, 26 Jan 2003 07:00:37 -0600
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200301261300.h0QD0bLo019766@manatee.mojam.com>

Bug/Patch Summary
-----------------

336 open / 3246 total bugs (-1)
109 open / 1923 total patches (+2)

New Bugs
--------

StringIO doc doesn't say it's sometimes read-only (2003-01-20)
    http://python.org/sf/671447
time module: time tuple not returned by certain functions (2003-01-21)
    http://python.org/sf/671731
Py_Main() does not perform to spec (2003-01-21)
    http://python.org/sf/672035
Three __contains__ implementations (2003-01-21)
    http://python.org/sf/672098
Assignment to __bases__ of direct object subclasses (2003-01-21)
    http://python.org/sf/672115
registry functions don't handle null characters (2003-01-21)
    http://python.org/sf/672132
2.3a1 computes lastindex incorrectly (2003-01-22)
    http://python.org/sf/672491
python -S still displays 'Type "help", "cop...' (2003-01-22)
    http://python.org/sf/672614
setting socket timeout crashes SSL? (2003-01-23)
    http://python.org/sf/673797
re.compile fails for verbose re with more than one group (2003-01-24)
    http://python.org/sf/674264
Access to serial devices through Carbon.CF (2003-01-25)
    http://python.org/sf/674574

New Patches
-----------

Compile _randommodule.c and datetimemodule.c under cygwin (2003-01-20)
    http://python.org/sf/671176
test_pty hanging on hpux11 (2003-01-20)
    http://python.org/sf/671384
Optimize dictionary resizing (2003-01-20)
    http://python.org/sf/671454
Py_Main() removal of exit() calls. Return value instead (2003-01-21)
    http://python.org/sf/672053
securing pydoc server (2003-01-22)
    http://python.org/sf/672656
improve pydoc handling of extension types (2003-01-22)
    http://python.org/sf/672855
Expand dbshelve to have full shelve/dict interface (2003-01-24)
    http://python.org/sf/674396
test_htmlparser.py -- "," in attributes (2003-01-24)
    http://python.org/sf/674448

Closed Bugs
-----------

test_unicode fails in wide unicode build (2002-05-11)
    http://python.org/sf/554916
inspect.getsource shows incorrect output (2002-08-14)
    http://python.org/sf/595018
logging package undocumented (2002-11-13)
    http://python.org/sf/637847
py_compile does not return error code (2002-12-13)
    http://python.org/sf/653301
typeobject provides incorrect __mul__ (2002-12-30)
    http://python.org/sf/660144
datetimetz constructors behave counterintuitively (2.3a1) (2003-01-01)
    http://python.org/sf/660872
test_httplib fails on the mac (2003-01-02)
    http://python.org/sf/661340
test_signal hang on some Solaris boxes (2003-01-05)
    http://python.org/sf/662787
test_bsddb3 fails when run directly (2003-01-08)
    http://python.org/sf/664581
test_ossaudiodev fails to run (2003-01-08)
    http://python.org/sf/664584
Py_NewInterpreter() doesn't work (2003-01-15)
    http://python.org/sf/668708
socket.inet_aton() succeeds on invalid input (2003-01-17)
    http://python.org/sf/669859

Closed Patches
--------------

Use builtin boolean if present (2002-05-22)
    http://python.org/sf/559288
Better inspect.BlockFinder fixes bug (2002-11-06)
    http://python.org/sf/634557
general corrections to 2.2.2 refman, p.1 (2002-11-07)
    http://python.org/sf/634866
Filter unicode into unicode (2002-11-09)
    http://python.org/sf/636005
LaTeX documentation for logging package (2002-11-14)
    http://python.org/sf/638299
logging SysLogHandler proto type wrong (2002-11-23)
    http://python.org/sf/642974
fix memory (ref) leaks (2003-01-16)
    http://python.org/sf/669553
Patched test harness for logging (2003-01-18)
    http://python.org/sf/670390

From
skip@pobox.com Sun Jan 26 15:22:13 2003
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 26 Jan 2003 09:22:13 -0600
Subject: [Python-Dev] Extension modules, Threading, and the GIL
In-Reply-To: <3E33C34A.6050405@daa.com.au>
References: <3E33C34A.6050405@daa.com.au>
Message-ID: <15923.64805.130512.494202@montanaro.dyndns.org>

    James> I originally sent this as private mail to David, but he suggested
    James> I repost it here as well.

    ...

    James> 2. GTK has a signal system to handle notifications. Arbitrary
    James> numbers of callbacks can be attached to signals. PyGTK has a
    James> generic marshal function to handle all signal callbacks, which
    James> uses introspection to handle the different signatures.

I won't make any attempt to understand the threading issues James identified, but will point out for those python-dev readers unfamiliar with GTK that its signals are really nothing like Unix signals. It's more like a software bus. A GTK programmer attaches listeners to the bus and when the appropriate signal comes down the bus listeners interested in that signal respond.

Skip

From greg@cosc.canterbury.ac.nz Sun Jan 26 23:52:05 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 27 Jan 2003 12:52:05 +1300 (NZDT)
Subject: [Python-Dev] Extended Function syntax
In-Reply-To:
Message-ID: <200301262352.h0QNq5g10928@oma.cosc.canterbury.ac.nz>

Duncan Booth :

> Something like this might work (although it is getting a bit messy):
>
> def set(fn):
>     get, delete, doc = None, None, fn.__doc__
>     namespace = inspect.currentframe().f_back.f_locals
>     oldfn = namespace.get(fn.__name__)
>
>     if isinstance(oldfn, property):
>         get, delete, doc = oldfn.fget, oldfn.fdel, oldfn.__doc__
>     return property(get, fn, delete, doc)

I think there's something that needs to be decided before going any further with this: Do we want to be able to define and/or override get/set/del methods individually?

If so, it would be better to re-design the property mechanism to make it easier, instead of coming up with kludges to fit it on top of the existing mechanism.

Whatever the mechanism, here's my current thoughts on a property syntax:

    def foo.get(self):
        ...

    def foo.set(self, x):
        ...

    def foo.del(self):
        ...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From martin@v.loewis.de Mon Jan 27 00:18:15 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 27 Jan 2003 01:18:15 +0100
Subject: [Python-Dev] Extended Function syntax
In-Reply-To: <200301262352.h0QNq5g10928@oma.cosc.canterbury.ac.nz>
References: <200301262352.h0QNq5g10928@oma.cosc.canterbury.ac.nz>
Message-ID:

Greg Ewing writes:

> Do we want to be able to define and/or override
> get/set/del methods individually?

As opposed to? The rationale for introducing the extended function syntax is that extended functions should be introduced by means of a definition, not of an assignment. For the same reason, I think properties should not be introduced by means of an assignment. I can't picture how *not* to define them individually, unless you are thinking of something like

    property foo:
        def get(self):
            ...
        def set(self, value):
            ...

> Whatever the mechanism, here's my current thoughts
> on a property syntax:
>
> def foo.get(self):
>     ...
>
> def foo.set(self, x):
>     ...
>
> def foo.del(self):
>     ...

So would you also be in favour of defining static methods through

    def static foo():
        ...

?
Regards, Martin From greg@cosc.canterbury.ac.nz Mon Jan 27 01:05:04 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 27 Jan 2003 14:05:04 +1300 (NZDT) Subject: [Python-Dev] Extended Function syntax In-Reply-To: Message-ID: <200301270105.h0R154P11434@oma.cosc.canterbury.ac.nz> > I can't picture how *not* to > define them individually, unless you are thinking of something like > > property foo: > def get(self): > ... > def set(self, value): > ... Something like that, yes. The point is that, with the current implementation of properties, any kind of syntactic sugar which *doesn't* group the three methods together somehow goes against the grain. > The rationale for introducing the extended function syntax is that > extended functions should be introduced by means of a definition, > not of an assignment. I'm all in favour of that, but as things stand, the extended function syntax only lends itself well to function filters that take a single function as argument. The attempts I've seen so far to extend it to handle more than one function at once all seem prohibitively ugly to me. This means that either (a) we shouldn't try to use the extended function syntax for properties, or (b) properties need to be re-designed so that they fit the extended function syntax better. > So would you also be in favour of defining static methods through > > def static foo(): > ... I wouldn't object to it. I also wouldn't object to using the extended function syntax for static and class methods. I just don't want to see some horrible kludge stretching the extended function syntax to places it doesn't naturally want to go. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin@v.loewis.de Mon Jan 27 01:27:14 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 27 Jan 2003 02:27:14 +0100 Subject: [Python-Dev] Extended Function syntax In-Reply-To: <200301270105.h0R154P11434@oma.cosc.canterbury.ac.nz> References: <200301270105.h0R154P11434@oma.cosc.canterbury.ac.nz> Message-ID: <3E348AF2.6010408@v.loewis.de> Greg Ewing wrote: > I wouldn't object to it. I also wouldn't object to using the extended > function syntax for static and class methods. I just don't want to see > some horrible kludge stretching the extended function syntax to places > it doesn't naturally want to go. Is that a dislike towards the notation, or towards the implementation strategy. I agree that an implementation using getframe is ugly. However, I do think that the proposed notation is natural, and that there is a clean implementation for it, too (just provide the filter with a reference to the namespace-under-construction). Regards, Martin From greg@cosc.canterbury.ac.nz Mon Jan 27 03:38:55 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 27 Jan 2003 16:38:55 +1300 (NZDT) Subject: [Python-Dev] Extended Function syntax In-Reply-To: <3E348AF2.6010408@v.loewis.de> Message-ID: <200301270338.h0R3ctI11926@oma.cosc.canterbury.ac.nz> Martin Loewis: > Is that a dislike towards the notation, or towards the > implementation strategy. The notation is the important thing to get right, I suppose, since the implementation can be changed if need be. > there is a clean implementation for it, too (just provide the filter > with a reference to the namespace-under-construction). Yes, I can see that now (I was thinking that the property mechanism itself would need changing, but it wouldn't). But even so, I don't think it really works all that well for properties. 
There would be something distinctly odd about writing this sort of thing: def foo(self) [get_property]: ... def foo(self, x) [set_property]: ... because it looks like you're defining two things both called "foo". There would be too much hidden magic going on there for my taste. I've just had another thought: This would also make it hard to do a text-editor search to answer questions such as "where is the get-function for the foo property defined". With my proposal you could search for "def foo.get". Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From barry@python.org Mon Jan 27 04:46:54 2003 From: barry@python.org (Barry A. Warsaw) Date: Sun, 26 Jan 2003 23:46:54 -0500 Subject: [Python-Dev] Extended Function syntax References: <200301262352.h0QNq5g10928@oma.cosc.canterbury.ac.nz> Message-ID: <15924.47550.741884.127470@gargle.gargle.HOWL> >>>>> "GE" == Greg Ewing writes: GE> Whatever the mechanism, here's my current thoughts GE> on a property syntax: | def foo.get(self): | ... | def foo.set(self, x): | ... | def foo.del(self): | ... Interesting. I've always wanted to be able to write... class Foo: ... def Foo.baz(self, a, b, c): ... -Barry From greg@cosc.canterbury.ac.nz Mon Jan 27 05:32:22 2003 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 27 Jan 2003 18:32:22 +1300 (NZDT) Subject: [Python-Dev] Extended Function syntax In-Reply-To: <15924.47550.741884.127470@gargle.gargle.HOWL> Message-ID: <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> > Interesting. I've always wanted to be able to write... > > class Foo: > ... > > def Foo.baz(self, a, b, c): > ... Yes, that would be nice. It would be incompatible with that idea, unfortunately. 
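As a point of comparison for the dotted-def spellings in this thread, here is a minimal sketch of what they would presumably abbreviate, using only plain assignment and the existing property() built-in. The class and attribute names (Foo, baz, Thermostat, target) are hypothetical, and the mapping from "def foo.get" onto property() is an assumption rather than something the thread settles:

```python
class Foo(object):
    pass


def baz(self, a, b, c):
    # Roughly what "def Foo.baz(self, a, b, c)" would spell in one step.
    return (a, b, c)


Foo.baz = baz  # attach the function to the class after the fact


class Thermostat(object):
    def __init__(self):
        self._target = 20

    # The three accessors that "def target.get/set/del" would define.
    def _get_target(self):
        return self._target

    def _set_target(self, x):
        self._target = int(x)

    def _del_target(self):
        self._target = 20  # deleting resets to the default

    # property() takes all three at once, which is why sugar that
    # defines them one at a time fits the current mechanism poorly.
    target = property(_get_target, _set_target, _del_target,
                      "Target temperature.")
```

Both halves rely on assignment, which is exactly what the extended function syntax is meant to replace with a definition.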
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+
From python@rcn.com Mon Jan 27 12:27:22 2003 From: python@rcn.com (Raymond Hettinger) Date: Mon, 27 Jan 2003 07:27:22 -0500 Subject: [Python-Dev] itertools module References: <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> Message-ID: <002301c2c5ff$713c3a20$125ffea9@oemcomputer> The itertools module is ready for comment and review. It implements ten high speed, memory efficient looping constructs inspired by Haskell and SML. It is ready-to-run and packaged with setup.py, a news item, docs, and unittests. The files are in the sandbox at python/nondist/sandbox/itertools.
If you see a typo or clear error, feel free to edit the files directly. If you don't feel like reading the C code, the docs list all known issues and include pure python equivalent code for each function. Let me know if I omitted your favorite function (tabulate, partition, etc). Raymond Hettinger From oren-py-d@hishome.net Mon Jan 27 14:17:37 2003 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 27 Jan 2003 09:17:37 -0500 Subject: [Python-Dev] Extended Function syntax In-Reply-To: <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> References: <15924.47550.741884.127470@gargle.gargle.HOWL> <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> Message-ID: <20030127141737.GA9511@hishome.net> On Mon, Jan 27, 2003 at 06:32:22PM +1300, Greg Ewing wrote: > On Sun, Jan 26, 2003 at 11:46:54PM -0500, Barry A. Warsaw wrote: > ... > > Interesting. I've always wanted to be able to write... > > > > class Foo: > > ... > > > > def Foo.baz(self, a, b, c): > > ... > > Yes, that would be nice. It would be incompatible with that idea, > unfortunately. The statement "def foo(...):" is just a shortcut for the assignment "foo=new.function(CODE, GLOBS, 'foo')". So why not make "def foo.bar(...):" a shortcut for "foo.bar=new.function(CODE, GLOBS, 'foo.bar')" ? If the attributes of a property object are made assignable the following code will work: prop = property() def prop.fget(self): ... def prop.fset(self, value): ... This will also allow creating classes the way Barry suggested or creating new kinds of namespaces. "Namespaces are one honking great idea -- let's do more of those!"-ly yours, Oren From gsw@agere.com Mon Jan 27 14:57:21 2003 From: gsw@agere.com (Gerald S. 
Williams) Date: Mon, 27 Jan 2003 09:57:21 -0500 Subject: [Python-Dev] Re: Extended Function syntax In-Reply-To: <20030125170008.7499.72079.Mailman@mail.python.org> Message-ID: Manuel Garcia wrote: > def _junk(): > a = long_expression01(x0, y0) > b = long_expression02(x1, y1) > c = long_expression03(x2, y2) > return min(a,b,c), max(a,b,c) > d, e = _junk() > del _junk() > > with the purpose being keeping a,b,c out of our nice clean namespace. Don't you mean _junk? a,b,c are locals. I suppose you could do something like this (though it may seem a bit perverse here): def d(): a = long_expression01(x0, y0) b = long_expression02(x1, y1) c = long_expression03(x2, y2) return min(a,b,c), max(a,b,c) d, e = d() > j = block: > def _get_j(self): return self._j > def _set_j(self, j): self._j = j > return property(_get_j, _set_j, None, 'dynamite!') This doesn't seem quite as bad to me, though (YMMV): def j(): def _get_j(self): return self._j def _set_j(self, j): self._j = j return property(_get_j, _set_j, None, 'dynamite!') j = j() Not that I have any issues with your suggestion. I especially like the fact it can be used to implement "real" lambdas. -Jerry From skip@pobox.com Mon Jan 27 15:36:49 2003 From: skip@pobox.com (Skip Montanaro) Date: Mon, 27 Jan 2003 09:36:49 -0600 Subject: [Python-Dev] itertools module In-Reply-To: <002301c2c5ff$713c3a20$125ffea9@oemcomputer> References: <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> <002301c2c5ff$713c3a20$125ffea9@oemcomputer> Message-ID: <15925.21009.294335.887541@montanaro.dyndns.org> Raymond> If you see a typo or clear error, feel free to edit the files Raymond> directly. If you don't feel like reading the C code, the docs Raymond> list all known issues and include pure python equivalent code Raymond> for each function. Let me know if I omitted your favorite Raymond> function (tabulate, partition, etc). (Note, I've never used Haskell or SML, so have no direct experience with any of these iterators.) 
I fixed a couple of typos, but have a few (more subjective) comments:

    * islice() - The description seems a bit confusing to me - perhaps a
      simple example would be useful.

    * takewhile()/dropwhile() - I assume these only return a prefix of their
      iterable arguments.  Dropwhile()'s help suggests that, but
      takewhile()'s description is more vague about the notion.

    * imap() - It's not clear to me why it differs from map() other than the
      fact that it's an iterator.  Can you motivate why it stops when the
      shortest iterable is exhausted and doesn't accept None for its func
      arg?

    * loopzip() - It's not clear why its next() method should return a list
      instead of a tuple (again, a seemingly needless distinction with its
      builtin counterpart, zip()).

    * starmap() - How does it differ from imap() and map()?

    * times() - Why not declare times to take an optional argument to be
      returned?  (In general, it seems like count(), repeat() and times()
      overlap heavily.  Examples of their usage might be helpful in
      understanding when to use them, or at least when they are commonly
      used in their Haskell/SML roots environments.)

Skip

From pedronis@bluewin.ch Mon Jan 27 15:33:27 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 27 Jan 2003 16:33:27 +0100
Subject: [Python-Dev] Extended Function syntax
References: <15924.47550.741884.127470@gargle.gargle.HOWL> <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> <20030127141737.GA9511@hishome.net>
Message-ID: <00fc01c2c619$70a2e180$6d94fea9@newmexico>

From: "Oren Tirosh"

> The statement "def foo(...):" is just a shortcut for the assignment
> "foo=new.function(CODE, GLOBS, 'foo')". So why not make "def foo.bar(...):"
> a shortcut for "foo.bar=new.function(CODE, GLOBS, 'foo.bar')" ?
>
> If the attributes of a property object are made assignable the following
> code will work:
>
> prop = property()
>
> def prop.fget(self):
>     ...
>
> def prop.fset(self, value):
>     ...
> > This will also allow creating classes the way Barry suggested or creating > new kinds of namespaces. > > "Namespaces are one honking great idea -- let's do more of those!"-ly yours, I like this very much. From fuf@mageo.cz Mon Jan 27 16:03:50 2003 From: fuf@mageo.cz (Michal Vitecek) Date: Mon, 27 Jan 2003 17:03:50 +0100 Subject: [Python-Dev] extending readline functionality (patch) Message-ID: <20030127160350.GA31481@foof.i3.cz> hello everyone, attached is a patch against vanilla 2.2.2, which adds three new functions to module readline: remove_history(pos) -- remove history entry specified by pos replace_history_entry(pos, line) -- replace history entry specified by pos with the given line get_history_buffer_size() -- get current number of history entries the libreadline.tex is also modified. thank you for your consideration, -- fuf (fuf@mageo.cz) From barry@python.org Mon Jan 27 16:03:41 2003 From: barry@python.org (Barry A. Warsaw) Date: Mon, 27 Jan 2003 11:03:41 -0500 Subject: [Python-Dev] Extended Function syntax References: <15924.47550.741884.127470@gargle.gargle.HOWL> <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> <20030127141737.GA9511@hishome.net> Message-ID: <15925.22621.495193.167683@gargle.gargle.HOWL> >>>>> "OT" == Oren Tirosh writes: OT> The statement "def foo(...):" is just a shortcut for the OT> assignment "foo=new.function(CODE, GLOBS, 'foo')". So why not OT> make "def foo.bar(...):" a shortcut for OT> "foo.bar=new.function(CODE, GLOBS, 'foo.bar')" ? 
OT> If the attributes of a property object are made assignable the OT> following code will work: +1 -Barry From fuf@mageo.cz Mon Jan 27 16:09:47 2003 From: fuf@mageo.cz (Michal Vitecek) Date: Mon, 27 Jan 2003 17:09:47 +0100 Subject: [Python-Dev] Re: extending readline functionality (now with patch) In-Reply-To: <20030127160350.GA31481@foof.i3.cz> References: <20030127160350.GA31481@foof.i3.cz> Message-ID: <20030127160947.GA20119@foof.i3.cz> --zhXaljGHf11kAtnf Content-Type: text/plain; charset=us-ascii Content-Disposition: inline and here comes the patch :) Michal Vitecek wrote: > hello everyone, > > attached is a patch against vanilla 2.2.2, which adds three new > functions to module readline: > > remove_history(pos) -- remove history entry specified by pos > > replace_history_entry(pos, line) -- replace history entry specified > by pos with the given line > > get_history_buffer_size() -- get current number of history entries > > the libreadline.tex is also modified. > > > thank you for your consideration, -- fuf (fuf@mageo.cz) --zhXaljGHf11kAtnf Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="python-2.2.2-ext-readline.diff" Binary files Python-2.2.2/Doc/lib/.libreadline.tex.swp and Python-2.2.2-ext-readline/Doc/lib/.libreadline.tex.swp differ diff -uNr Python-2.2.2/Doc/lib/libreadline.tex Python-2.2.2-ext-readline/Doc/lib/libreadline.tex --- Python-2.2.2/Doc/lib/libreadline.tex Fri Oct 19 03:18:43 2001 +++ Python-2.2.2-ext-readline/Doc/lib/libreadline.tex Mon Jan 27 16:41:57 2003 @@ -100,6 +100,18 @@ Append a line to the history buffer, as if it was the last line typed. \end{funcdesc} +\begin{funcdesc}{remove_history}{pos} +Remove history entry specified by its position from the history. +\end{funcdesc} + +\begin{funcdesc}{replace_history_entry}{pos, line} +Replace history entry specified by its position with the given line. +\end{funcdesc} + +\begin{funcdesc}{get_history_buffer_length}{} +Get number of history entries. 
+\end{funcdesc} + \begin{seealso} \seemodule{rlcompleter}{Completion of Python identifiers at the Binary files Python-2.2.2/Modules/.readline.c.swp and Python-2.2.2-ext-readline/Modules/.readline.c.swp differ diff -uNr Python-2.2.2/Modules/readline.c Python-2.2.2-ext-readline/Modules/readline.c --- Python-2.2.2/Modules/readline.c Sun Oct 6 07:43:47 2002 +++ Python-2.2.2-ext-readline/Modules/readline.c Mon Jan 27 16:55:55 2003 @@ -302,6 +302,85 @@ add_history(string) -> None\n\ add a line to the history buffer"; +static PyObject * +py_remove_history(PyObject *self, PyObject *args) +{ + int entry_number; + HIST_ENTRY *entry; + char buf[80]; + + if (!PyArg_ParseTuple(args, "i:remove_history", &entry_number)) { + return NULL; + } + entry = remove_history(entry_number); + if (!entry) { + PyOS_snprintf(buf, sizeof(buf), + "No history entry at position %i", + entry_number); + PyErr_SetString(PyExc_ValueError, buf); + return NULL; + } + if (entry->line) + free(entry->line); + if (entry->data) + free(entry->data); + free(entry); + + Py_INCREF(Py_None); + return Py_None; +} + +static char doc_remove_history[] = "\ +remove_history(pos) -> None\n\ +removes history entry given by its position in history"; + +static PyObject * +py_replace_history_entry(PyObject *self, PyObject *args) +{ + int entry_number; + char *line; + HIST_ENTRY *old_entry; + char buf[80]; + + if (!PyArg_ParseTuple(args, "is:replace_history_entry", &entry_number, &line)) { + return NULL; + } + old_entry = replace_history_entry(entry_number, line, (void *)NULL); + if (!old_entry) { + PyOS_snprintf(buf, sizeof(buf), + "No history entry at position %i", + entry_number); + PyErr_SetString(PyExc_ValueError, buf); + return NULL; + } + if (old_entry->line) + free(old_entry->line); + if (old_entry->data) + free(old_entry->data); + free(old_entry); + + Py_INCREF(Py_None); + return Py_None; +} + +static char doc_replace_history_entry[] = "\ +replace_history_entry(pos, line) -> None\n\ +replaces history entry given by 
its position with contents of line"; + +static PyObject * +py_get_history_buffer_length(PyObject *self, PyObject *args) +{ + HISTORY_STATE *history_state; + + if (!PyArg_ParseTuple(args, ":get_history_buffer_length")) + return NULL; + history_state = history_get_history_state(); + return Py_BuildValue("i", history_state->length); +} + +static char doc_get_history_buffer_length[] = "\ +get_history_buffer_length() -> length\n\ +returns number of entries in the history"; /* get the tab-completion word-delimiters that readline uses */ @@ -391,8 +470,10 @@ {"set_completer_delims", set_completer_delims, METH_VARARGS, doc_set_completer_delims}, {"add_history", py_add_history, METH_VARARGS, doc_add_history}, - {"get_completer_delims", get_completer_delims, - METH_OLDARGS, doc_get_completer_delims}, + {"remove_history", py_remove_history, METH_VARARGS, doc_remove_history}, + {"replace_history_entry", py_replace_history_entry, METH_VARARGS, doc_replace_history_entry}, + {"get_history_buffer_length", py_get_history_buffer_length, METH_VARARGS, doc_get_history_buffer_length}, + {"get_completer_delims", get_completer_delims, METH_OLDARGS, doc_get_completer_delims}, {"set_startup_hook", set_startup_hook, METH_VARARGS, doc_set_startup_hook}, #ifdef HAVE_RL_PRE_INPUT_HOOK --zhXaljGHf11kAtnf-- From mwh@python.net Mon Jan 27 16:22:52 2003 From: mwh@python.net (Michael Hudson) Date: 27 Jan 2003 16:22:52 +0000 Subject: [Python-Dev] Extended Function syntax In-Reply-To: <20030127141737.GA9511@hishome.net> References: <15924.47550.741884.127470@gargle.gargle.HOWL> <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> <20030127141737.GA9511@hishome.net> Message-ID: <7h3lm1636lf.fsf@pc150.maths.bris.ac.uk> Oren Tirosh writes: > If the attributes of a property object are made assignable the following > code will work: > > prop = property() > > def prop.fget(self): > ... > > def prop.fset(self, value): > ... So we're around to suggesting def (params): ... ? 
Hey, here's a way of writing switches:

dispatch = {}

def dispatch['a'](a):
    print "it's a"

...

dispatch[var](param)

Not at all sure if I like this or not.  It's quite a change.

Cheers,
M.

--
  "An infinite number of monkeys at an infinite number of keyboards
   could produce something like Usenet."
  "They could do a better job of it."
              -- the corollaries to Gene Spafford's Axiom #2 of Usenet

From python@rcn.com Mon Jan 27 17:03:37 2003
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 27 Jan 2003 12:03:37 -0500
Subject: [Python-Dev] itertools module
References: <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> <002301c2c5ff$713c3a20$125ffea9@oemcomputer> <15925.21009.294335.887541@montanaro.dyndns.org>
Message-ID: <006e01c2c626$08fc3b00$125ffea9@oemcomputer>

> I fixed a couple typos, but have a few (more subjective) comments:

Thanks for the rapid review.

>
> * islice() - The description seems a bit confusing to me - perhaps a
>   simple example would be useful.

I'll clarify the docs and add simple examples for each function.
islice is one of the more powerful functions:

    for line in islice(afile, 10, 20, 2):
        print line   # Starting with line 10, prints every other line
                     # up to but not including line 20.

    nth = lambda iterable, n: islice(iterable,n,n+1).next()   # get the nth item

> * takewhile()/dropwhile() - I assume these only return a prefix of their
>   iterable arguments.  Dropwhile()'s help suggests that, but
>   takewhile()'s description is more vague about the notion.

I'll clarify the docs and add clear examples.

> * imap() - It's not clear to me why it differs from map() other than the
>   fact that it's an iterator.

The other differences are that it stops with the shortest iterable
and doesn't accept None for a func argument.

> Can you motivate why it stops when the
> shortest iterable is exhausted and doesn't accept None for its func
> arg?

Because one or more useful inputs are potentially infinite, filling in
Nones is less useful than stopping with the shortest iterator.
The function doesn't accept None for a function argument for
1) simplicity
2) we have zip() for that purpose

OTOH, if it is important to someone, I can easily re-embed that
functionality.

> * loopzip() - It's not clear why its next() method should return a list
>   instead of a tuple (again, a seemingly needless distinction with its
>   builtin counterpart, zip()).

I've wrestled with this one.  The short answer is that zip() already
does a pretty good job and that the only use for loopzip() is super high
speed looping.  To that end, reusing a single list instead of allocating
and building tuples is *much* faster.

> * starmap() - How does it differ from imap() and map()?

For computing operator.pow, if your data looks like this:
    b=[2,3,5] p=[3,5,7],
then use imap(operator.pow, b, p)

OTOH, if your data looks like this:
    c =[(2,3), (3,5), (5,7)],
then use starmap(operator.pow, c)

Essentially, it's the difference between f(a,b) and f(*c).

> * times() - Why not declare times to take an optional argument to be
>   returned?

The use case is for looping when you don't care about the value:

    for i in itertools.times(3):
        print "hello"

> (In general, it seems like count(), repeat() and times()
> overlap heavily.  Examples of their usage might be helpful in
> understanding when to use them, or at least when they are commonly
> used in their Haskell/SML roots environments.)

Yes.  I opted for using the atomic combinable building blocks rather
than constructing the more elaborate things like tabulate(f) which can
easily be made from the basic pieces: imap(f,count())

I'll add more examples so that the usage becomes more obvious.

Thanks again for the review.
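The answers in this reply can be tried out with a short script. This is only a sketch, and it assumes the itertools module as shipped with current Python 3 rather than the sandbox version under review: there the builtin map() plays the role of imap(), times() does not exist (repeat(None, n) is used in its place), and loopzip() is absent. The sample data and the helper names nth and tabulate are illustrative.

```python
import operator
from itertools import count, islice, repeat, starmap

lines = ["line %d" % i for i in range(30)]

# islice: start at index 10, stop before index 20, take every other item.
every_other = list(islice(lines, 10, 20, 2))

# The nth item of any iterable, via a one-element slice.
def nth(iterable, n):
    return next(islice(iterable, n, n + 1))

# Two parallel sequences: map (imap in the proposal) calls f(a, b).
bases = [2, 3, 5]
powers = [3, 5, 7]
from_parallel_args = list(map(operator.pow, bases, powers))

# One sequence of argument tuples: starmap calls f(*c).
pairs = [(2, 3), (3, 5), (5, 7)]
from_tuples = list(starmap(operator.pow, pairs))

# tabulate(f) built from the basic pieces: map f over the infinite count().
def tabulate(f):
    return map(f, count())

# The result is infinite, so only a finite slice of it is materialised.
squares = list(islice(tabulate(lambda n: n * n), 5))

# Looping a fixed number of times without caring about the loop value.
greetings = ["hello" for _ in repeat(None, 3)]
```

Because count() never ends, tabulate() only works if the mapping function stops with the shortest input (or, as here, is sliced), which is the design point argued above.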
Raymond

From skip@pobox.com Mon Jan 27 17:13:38 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 27 Jan 2003 11:13:38 -0600
Subject: [Python-Dev] itertools module
In-Reply-To: <006e01c2c626$08fc3b00$125ffea9@oemcomputer>
References: <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> <002301c2c5ff$713c3a20$125ffea9@oemcomputer> <15925.21009.294335.887541@montanaro.dyndns.org> <006e01c2c626$08fc3b00$125ffea9@oemcomputer>
Message-ID: <15925.26818.413619.155977@montanaro.dyndns.org>

    >> * imap() - It's not clear to me why it differs from map() other than
    >>   the fact that it's an iterator.

    Raymond> The other differences are that it stops with the shortest
    Raymond> iterable and doesn't accept None for a func argument.

I understand that.  I was questioning why with a name like "imap" you chose
to make it differ from map() in ways other than its iterator-ness.  The
other semantic differences make it more difficult to replace map() with
itertools.imap() than it might be.

    Raymond> Because one or more useful inputs are potentially infinite,
    Raymond> filling in Nones is less useful than stopping with the shortest
    Raymond> iterator.

Yes, but it still seems a gratuitous change from map() to me.

    >> * loopzip() - It's not clear why its next() method should return a
    >>   list instead of a tuple (again, a seemingly needless distinction
    >>   with its builtin counterpart, zip()).

    Raymond> I've wrestled with this one.  The short answer is that zip()
    Raymond> already does a pretty good job and that the only use for
    Raymond> loopzip() is super high speed looping.
To that end, reusing a Raymond> single list instead of allocating and building tuples is *much* Raymond> faster. How do you know the caller doesn't squirrel away the list you returned on the n-th iteration? I don't see how you can safely reuse the same list. Skip From aahz@pythoncraft.com Mon Jan 27 17:18:33 2003 From: aahz@pythoncraft.com (Aahz) Date: Mon, 27 Jan 2003 12:18:33 -0500 Subject: [Python-Dev] extending readline functionality (patch) In-Reply-To: <20030127160350.GA31481@foof.i3.cz> References: <20030127160350.GA31481@foof.i3.cz> Message-ID: <20030127171833.GA3213@panix.com> On Mon, Jan 27, 2003, Michal Vitecek wrote: > > attached is a patch against vanilla 2.2.2, which adds three new > functions to module readline: The patch was not in fact attached, thankfully. Please use SourceForge to upload the patch. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "Argue for your limitations, and sure enough they're yours." --Richard Bach From gtalvola@nameconnector.com Mon Jan 27 16:59:56 2003 From: gtalvola@nameconnector.com (Geoffrey Talvola) Date: Mon, 27 Jan 2003 11:59:56 -0500 Subject: [Python-Dev] the new 2.3a1 settimeout() with httplib and SSL Message-ID: <61957B071FF421419E567A28A45C7FE54010E2@mailbox.nameconnector.com> Guido van Rossum [mailto:guido@python.org] wrote: > Hm, when I added the timeout feature, I didn't think of SSL at all. I > imagine that SSL gets an error and keeps retrying immediately, rather > than using select() to block until more data is available. > > Part of this is that this simply doesn't work for SSL -- you shouldn't > do that. (Sorry if you want it -- it's beyond my capabilities to hack > this into the SSL code.) > > Part of this is that the SSL code should refuse a socket that's in > nonblocking mode, *or* perhaps should restore blocking mode; I'm not > sure. > > Anyway, please do enter a bug report. (A patch would be even cooler!) I entered a bug report. 
I'll look at the source and see if I can come up with a patch. - Geoff From brian@janrain.com Mon Jan 27 17:49:54 2003 From: brian@janrain.com (Brian Ellin) Date: Mon, 27 Jan 2003 09:49:54 -0800 Subject: [Python-Dev] non-blocking sockets and sendall Message-ID: Hello. I'm having issues with using socket.sendall() and non-blocking sockets in python 2.3a1. When sending large data chunks I receive a socket.error #11 'Resource temporarily unavailable' in the middle of the method. I believe this is due to a missing select call in the send loop. My question is whether or not sendall() is supported by socket objects in non-blocking mode? If not, then perhaps this should be noted in the documentation... if so, then i'll submit a bug report on the python sourceforge site. Thanks, Brian Ellin brian@janrain.com From guido@python.org Mon Jan 27 17:53:17 2003 From: guido@python.org (Guido van Rossum) Date: Mon, 27 Jan 2003 12:53:17 -0500 Subject: [Python-Dev] non-blocking sockets and sendall In-Reply-To: Your message of "Mon, 27 Jan 2003 09:49:54 PST." References: Message-ID: <200301271753.h0RHrHh10614@odiug.zope.com> > I'm having issues with using socket.sendall() and non-blocking sockets > in python 2.3a1. When sending large data chunks I receive a > socket.error #11 'Resource temporarily unavailable' in the middle of > the method. I believe this is due to a missing select call in the send > loop. > > My question is whether or not sendall() is supported by socket objects > in non-blocking mode? If not, then perhaps this should be noted in the > documentation... if so, then i'll submit a bug report on the python > sourceforge site. I think using non-blocking sockets and using sendall are incompatible. So yes, a documentation patch would be appropriate. Thanks for pointing this out! 
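The "select call in the send loop" that Brian describes can be written by hand on top of send(). The following is a rough sketch, not what sendall() does internally; the function name send_all_nonblocking, the per-wait timeout and the socketpair demonstration are all assumptions made for illustration, written for current Python 3.

```python
import select
import socket
import threading

def send_all_nonblocking(sock, data, timeout=5.0):
    """Send all of data on a non-blocking socket, waiting with select().

    timeout applies to each individual wait, not to the whole transfer.
    """
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        # Wait (up to timeout seconds) until the socket can take more data.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise socket.timeout("send timed out")
        try:
            sent += sock.send(view[sent:])
        except BlockingIOError:
            # The buffer filled up between select() and send(); wait again.
            continue
    return sent

# Demonstration over a local socket pair, with a thread draining the far end.
left, right = socket.socketpair()
left.setblocking(False)
received = []

def drain():
    while True:
        chunk = right.recv(65536)
        if not chunk:
            break
        received.append(chunk)

reader = threading.Thread(target=drain)
reader.start()
payload = b"x" * (4 * 1024 * 1024)   # large enough to overflow the send buffer
total = send_all_nonblocking(left, payload)
left.close()
reader.join()
right.close()
```

Without the select() call, the send() in the loop would raise the "Resource temporarily unavailable" error as soon as the kernel buffer filled, which matches the symptom reported above.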
--Guido van Rossum (home page: http://www.python.org/~guido/) From oren-py-d@hishome.net Mon Jan 27 18:24:06 2003 From: oren-py-d@hishome.net (Oren Tirosh) Date: Mon, 27 Jan 2003 20:24:06 +0200 Subject: [Python-Dev] Extended Function syntax In-Reply-To: <7h3lm1636lf.fsf@pc150.maths.bris.ac.uk>; from mwh@python.net on Mon, Jan 27, 2003 at 04:22:52PM +0000 References: <15924.47550.741884.127470@gargle.gargle.HOWL> <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> <20030127141737.GA9511@hishome.net> <7h3lm1636lf.fsf@pc150.maths.bris.ac.uk> Message-ID: <20030127202406.A27202@hishome.net> On Mon, Jan 27, 2003 at 04:22:52PM +0000, Michael Hudson wrote: > > So we're around to suggesting > > def (params): > ... I hope you're joking... Just in case you are not: a fully-qualified name consists of identifiers separated by dots. It's not an arbitrary expression. At least that's how it works in the Python module namespace. Oren From mwh@python.net Mon Jan 27 18:34:32 2003 From: mwh@python.net (Michael Hudson) Date: Mon, 27 Jan 2003 18:34:32 +0000 Subject: [Python-Dev] Extended Function syntax In-Reply-To: <20030127202406.A27202@hishome.net> (Oren Tirosh's message of "Mon, 27 Jan 2003 20:24:06 +0200") References: <15924.47550.741884.127470@gargle.gargle.HOWL> <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> <20030127141737.GA9511@hishome.net> <7h3lm1636lf.fsf@pc150.maths.bris.ac.uk> <20030127202406.A27202@hishome.net> Message-ID: <2mr8ayjvbb.fsf@starship.python.net> Oren Tirosh writes: > On Mon, Jan 27, 2003 at 04:22:52PM +0000, Michael Hudson wrote: >> >> So we're around to suggesting >> >> def (params): >> ... > > I hope you're joking... Well, not entirely. I was more trying to tease out of you what you intended. > Just in case you are not: a fully-qualified name consists of identifiers > separated by dots. I hadn't seen the phrase "fully-qualified name" used in this thread; it wasn't obvious to me that any such limitation was implied. 
I may have missed something, of course. > It's not an arbitrary expression. At least that's how > it works in the Python module namespace. But as a generalisation of "foo=new.function(CODE, GLOBS, 'foo')", what I mentioned would be valid too. I'm not sure it's *sensible*, but that's a different kettle of fish... Cheers, M. -- ... Windows proponents tell you that it will solve things that your Unix system people keep telling you are hard. The Unix people are right: they are hard, and Windows does not solve them, ... -- Tim Bradshaw, comp.lang.lisp From gtalvola@nameconnector.com Mon Jan 27 19:47:57 2003 From: gtalvola@nameconnector.com (Geoffrey Talvola) Date: Mon, 27 Jan 2003 14:47:57 -0500 Subject: [Python-Dev] the new 2.3a1 settimeout() with httplib and SSL Message-ID: <61957B071FF421419E567A28A45C7FE54010E6@mailbox.nameconnector.com> Guido van Rossum [mailto:guido@python.org] wrote: > > I'm trying to get the new socket.settimeout() in Python > > 2.3a1 to work in > > conjunction with httplib and SSL. This code seems to work fine: > > > > import httplib > > conn = > > httplib.HTTPConnection('ncsdevtest.nameconnector.com', 80) > > conn.connect() > > conn.sock.settimeout(90) > > conn.request('GET', '/cgi-bin/Pause30.cgi') > > response = conn.getresponse() > > print response.status, response.reason > > data = response.read() > > print 'read', len(data), 'bytes' > > conn.close() > > > > Where Pause30.cgi is a cgi script that simply sleeps for 30 > > seconds then > > sends back a simple response. > > > > As-is, this program returns after 30 seconds. If I adjust > > the timeout of 90 > > to be, lets say, 5 seconds, I correctly get a timeout > > exception after 5 > > seconds. So far, so good. > > > > But if I change HTTPConnection to HTTPSConnection and > > change 80 to 443, I > > have trouble -- my CPU usage goes up to 100%, the python > > process sucks up > > more and more memory, and it doesn't time out at all. 
It > > does still returns > > the correct response after 30 seconds. > > > > Is there a way to do this? Should I enter a bug report? > > Hm, when I added the timeout feature, I didn't think of SSL at all. I > imagine that SSL gets an error and keeps retrying immediately, rather > than using select() to block until more data is available. > > Part of this is that this simply doesn't work for SSL -- you shouldn't > do that. (Sorry if you want it -- it's beyond my capabilities to hack > this into the SSL code.) > > Part of this is that the SSL code should refuse a socket that's in > nonblocking mode, *or* perhaps should restore blocking mode; I'm not > sure. > > Anyway, please do enter a bug report. (A patch would be even cooler!) It doesn't look terribly hard to make the SSL wrapper obey the timeout, by calling select() on the "raw" socket before calling SSL_write or SSL_read. I'm willing to try to get this to work. I'm not able to get this far though, because when I try to build _ssl I have problems. When I run build_ssl.py it grinds away for several minutes building OpenSSL, then fails with this message while building openssl.exe: link /nologo /subsystem:console /machine:I386 /opt:ref /out:out32\openssl.exe @C:\temp\nnb00239. NMAKE : fatal error U1073: don't know how to make 'D:\Python-2.3a1-Source\openssl-0.9.7/out32.dbg/libeay32.lib' The weird thing is that that file already exists. Can anyone help me out here? This is Windows NT with VC++ 6.0. I've successfully built OpenSSL before, but there's something about the way it's being built by build_ssl.py that's failing. 
- Geoff From gtalvola@nameconnector.com Mon Jan 27 20:03:16 2003 From: gtalvola@nameconnector.com (Geoffrey Talvola) Date: Mon, 27 Jan 2003 15:03:16 -0500 Subject: [Python-Dev] the new 2.3a1 settimeout() with httplib and SSL Message-ID: <61957B071FF421419E567A28A45C7FE54010E7@mailbox.nameconnector.com> Geoffrey Talvola wrote: > I'm not able to get this far though, because when I try to > build _ssl I have > problems. When I run build_ssl.py it grinds away for several minutes > building OpenSSL, then fails with this message while building > openssl.exe: > > link /nologo /subsystem:console /machine:I386 /opt:ref > /out:out32\openssl.exe @C:\temp\nnb00239. > > NMAKE : fatal error U1073: don't know how to make > 'D:\Python-2.3a1-Source\openssl-0.9.7/out32.dbg/libeay32.lib' > > The weird thing is that that file already exists. > > Can anyone help me out here? This is Windows NT with VC++ 6.0. I've > successfully built OpenSSL before, but there's something > about the way it's > being built by build_ssl.py that's failing. Never mind -- I figured out the problem. I had DEBUG set to 1 in my environment for some reason, which confused the makefile for _ssl.pyd. Unsetting that variable fixed the problem. - Geoff From python@rcn.com Mon Jan 27 20:20:29 2003 From: python@rcn.com (Raymond Hettinger) Date: Mon, 27 Jan 2003 15:20:29 -0500 Subject: [Python-Dev] itertools module References: <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz> <002301c2c5ff$713c3a20$125ffea9@oemcomputer> <15925.21009.294335.887541@montanaro.dyndns.org> <006e01c2c626$08fc3b00$125ffea9@oemcomputer> <15925.26818.413619.155977@montanaro.dyndns.org> Message-ID: <001801c2c641$895ad340$125ffea9@oemcomputer> > Raymond> The other differences are that it stops with the shortest > Raymond> iterable and doesn't accept None for a func argument. > > I was questioning why with a name like "imap" you chose > to make it differ from map() in ways other than its iterator-ness. 
> The other semantic differences make it more difficult to replace map()
> with itertools.imap() than it might be.

Okay, it's no problem to put back in the function=None behavior.
I had thought it an outdated hack that could be left behind, but
there is no loss from including it. And, you're right, it may
help someone transition their code a little more easily.

> Raymond> Because one or more useful inputs are potentially infinite,
> Raymond> filling in Nones is less useful than stopping with the shortest
> Raymond> iterator.
>
> Yes, but it still seems a gratuitous change from map() to me.

I understand; however, for me, replicating quirks of map ranks
less in importance than creating a cohesive set of tools that
work well together. The SML/Haskell tools have a number of
infinite iterators as basic building blocks; using them requires
that other functions know when to shut off. I would like the
package to be unified by the idea that the iterators all terminate
with the shortest input (assuming they have one; some don't).

Also, I'm a little biased because that map feature has never been
helpful to me and more than once has gotten in the way. The
implementations in Haskell and SML also did not include a None
fill-in feature.

> >> * loopzip() - It's not clear why its next() method should return a
> >> list instead of a tuple (again, a seemingly needless distinction
> >> with its builtin counterpart, zip()).
>
> Raymond> I've wrestled with this one. The short answer is that zip()
> Raymond> already does a pretty good job and that the only use for
> Raymond> loopzip() is super high speed looping. To that end, reusing a
> Raymond> single list instead of allocating and building tuples is *much*
> Raymond> faster.
>
> How do you know the caller doesn't squirrel away the list you returned on
> the n-th iteration? I don't see how you can safely reuse the same list.

If needed, I can add in an izip() function that returns tuples
just like zip() does. I would like to keep loopzip().
It is very effective and efficient for the use case that zip was
meant to solve, namely lockstep iteration:

    for i, j in loopzip(ivector, jvector):  # results are immediately unpacked
        process(i, j)

This use case is even more prevalent with this package, where
loopzip can combine algebraically with other itertools or functionals:

    takewhile(binarypredicate, loopzip(ivec, jvec))

It's a terrible waste to constantly allocate tuples, build them,
pass them, unpack them, and throw them away on every pass.
Reuse is an optimization that is already built into the existing
implementations of filter() and map().

>
> Skip

Thanks again for the useful comments. I'll add the map(None, s1, s2, ...)
behavior and write an izip() function which can be used with full safety
for non-looping use cases.

Raymond Hettinger

From guido@python.org Mon Jan 27 20:49:30 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 27 Jan 2003 15:49:30 -0500
Subject: [Python-Dev] the new 2.3a1 settimeout() with httplib and SSL
In-Reply-To: Your message of "Mon, 27 Jan 2003 14:47:57 EST."
        <61957B071FF421419E567A28A45C7FE54010E6@mailbox.nameconnector.com>
References: <61957B071FF421419E567A28A45C7FE54010E6@mailbox.nameconnector.com>
Message-ID: <200301272049.h0RKnU523592@odiug.zope.com>

> It doesn't look terribly hard to make the SSL wrapper obey the timeout, by
> calling select() on the "raw" socket before calling SSL_write or SSL_read.
> I'm willing to try to get this to work.

That's cool. I don't know much about the SSL_read() API -- does it
promise to read exactly the requested number of bytes (unless the
socket is closed)? Then a single select() before it is called may not
be sufficient.
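
A minimal sketch of the concern raised here, assuming a plain socket in
place of the SSL connection: if a read may return fewer bytes than
requested, the caller has to read again, and a select() issued only before
the first read bounds only the first wait. The helper name recv_exactly and
its single shared deadline are illustrative assumptions, not the _ssl patch
discussed in this thread, and the code is written for current Python 3
rather than 2.3.

```python
import select
import socket
import time


def recv_exactly(sock, nbytes, timeout):
    """Read exactly nbytes from sock, or raise socket.timeout.

    Hypothetical helper: select() is called before *every* recv(),
    and all of the waits share one overall deadline.
    """
    deadline = time.monotonic() + timeout
    chunks = []
    remaining = nbytes
    while remaining > 0:
        time_left = deadline - time.monotonic()
        if time_left <= 0:
            raise socket.timeout("timed out")
        # Block (without spinning) until data arrives or time runs out.
        readable, _, _ = select.select([sock], [], [], time_left)
        if not readable:
            raise socket.timeout("timed out")
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Recomputing time_left on each pass is the design choice that matters: a
peer that trickles one byte at a time cannot stretch the total wait beyond
the requested timeout.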
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gtalvola@nameconnector.com Mon Jan 27 21:35:10 2003
From: gtalvola@nameconnector.com (Geoffrey Talvola)
Date: Mon, 27 Jan 2003 16:35:10 -0500
Subject: [Python-Dev] the new 2.3a1 settimeout() with httplib and SSL
Message-ID: <61957B071FF421419E567A28A45C7FE54010E8@mailbox.nameconnector.com>

Guido van Rossum [mailto:guido@python.org] wrote:
> > It doesn't look terribly hard to make the SSL wrapper obey the
> > timeout, by calling select() on the "raw" socket before calling
> > SSL_write or SSL_read. I'm willing to try to get this to work.
>
> That's cool. I don't know much about the SSL_read() API -- does it
> promise to read exactly the requested number of bytes (unless the
> socket is closed)? Then a single select() before it is called may not
> be sufficient.

I don't know any more than you do -- I've never looked at the OpenSSL docs
until today. It looks like SSL_read may indeed return fewer bytes than
requested: http://www.openssl.org/docs/ssl/SSL_read.html

But I still think that a single select() is OK. It seems to be working fine
in my testing. The select() times out only if there has been no activity on
the socket for the entire timeout period, which seems sufficient to me. I
don't think it matters that SSL_read may return fewer bytes than requested.
Maybe I'm missing something.

Anyhow, I'll do some more testing and then if it still seems to work I'll
upload a patch later today.

- Geoff

From guido@python.org Mon Jan 27 21:47:32 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 27 Jan 2003 16:47:32 -0500
Subject: [Python-Dev] the new 2.3a1 settimeout() with httplib and SSL
In-Reply-To: Your message of "Mon, 27 Jan 2003 16:35:10 EST."
        <61957B071FF421419E567A28A45C7FE54010E8@mailbox.nameconnector.com>
References: <61957B071FF421419E567A28A45C7FE54010E8@mailbox.nameconnector.com>
Message-ID: <200301272147.h0RLlWA23849@odiug.zope.com>

> > > It doesn't look terribly hard to make the SSL wrapper obey the
> > > timeout, by calling select() on the "raw" socket before calling
> > > SSL_write or SSL_read. I'm willing to try to get this to work.
> >
> > That's cool. I don't know much about the SSL_read() API -- does it
> > promise to read exactly the requested number of bytes (unless the
> > socket is closed)? Then a single select() before it is called may not
> > be sufficient.
>
> I don't know any more than you do -- I've never looked at the OpenSSL docs
> until today. It looks like SSL_read may indeed return fewer bytes than
> requested: http://www.openssl.org/docs/ssl/SSL_read.html

Hm, from that page it looks like the internal implementation may
actually repeatedly read from the socket, until it has processed a
full 16K block. But I may be mistaken, since it also refers to a
non-blocking underlying "BIO", whatever that is. :-(

> But I still think that a single select() is OK. It seems to be working fine
> in my testing. The select() times out only if there has been no activity on
> the socket for the entire timeout period, which seems sufficient to me. I
> don't think it matters that SSL_read may return fewer bytes than requested.
> Maybe I'm missing something.

If you can see no CPU activity while it's waiting to time out, it's
probably okay.

> Anyhow, I'll do some more testing and then if it still seems to work I'll
> upload a patch later today.

Great! Please assign to me so I hear about it.
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gtalvola@nameconnector.com Mon Jan 27 21:58:18 2003
From: gtalvola@nameconnector.com (Geoffrey Talvola)
Date: Mon, 27 Jan 2003 16:58:18 -0500
Subject: [Python-Dev] the new 2.3a1 settimeout() with httplib and SSL
Message-ID: <61957B071FF421419E567A28A45C7FE54010E9@mailbox.nameconnector.com>

Guido van Rossum [mailto:guido@python.org] wrote:
> > > > It doesn't look terribly hard to make the SSL wrapper obey the
> > > > timeout, by calling select() on the "raw" socket before calling
> > > > SSL_write or SSL_read. I'm willing to try to get this to work.
> > >
> > > That's cool. I don't know much about the SSL_read() API -- does it
> > > promise to read exactly the requested number of bytes (unless the
> > > socket is closed)? Then a single select() before it is called may
> > > not be sufficient.
> >
> > I don't know any more than you do -- I've never looked at the
> > OpenSSL docs until today. It looks like SSL_read may indeed return
> > fewer bytes than requested:
> > http://www.openssl.org/docs/ssl/SSL_read.html
>
> Hm, from that page it looks like the internal implementation may
> actually repeatedly read from the socket, until it has processed a
> full 16K block. But I may be mistaken, since it also refers to a
> non-blocking underlying "BIO", whatever that is. :-(

Yeah, I'm not really sure what that is.

>
> > But I still think that a single select() is OK. It seems to be
> > working fine in my testing. The select() times out only if there has
> > been no activity on the socket for the entire timeout period, which
> > seems sufficient to me. I don't think it matters that SSL_read may
> > return fewer bytes than requested. Maybe I'm missing something.
>
> If you can see no CPU activity while it's waiting to time out, it's
> probably okay.

Yes, it seems to work -- no CPU activity and it times out exactly when it's
supposed to.
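
One way to check the "no CPU activity" condition mentioned above, sketched
under assumptions: a plain socket.socketpair() stands in for the SSL
connection, measure_timeout is a hypothetical helper name, and the code
targets current Python 3. It compares wall-clock time with process CPU time
across a recv() that is expected to time out. A wait that blocks in the
kernel should use almost no CPU, whereas an immediate-retry loop would show
CPU time close to the wall time.

```python
import socket
import time


def measure_timeout(timeout):
    """Return (wall_seconds, cpu_seconds) spent in a recv() that times out."""
    # A connected pair on which nothing is ever sent, so recv() must wait.
    reader, writer = socket.socketpair()
    try:
        reader.settimeout(timeout)
        wall_start = time.monotonic()
        cpu_start = time.process_time()
        try:
            reader.recv(1)
        except socket.timeout:
            pass  # expected: no data arrived within the timeout
        wall = time.monotonic() - wall_start
        cpu = time.process_time() - cpu_start
        return wall, cpu
    finally:
        reader.close()
        writer.close()
```

Calling measure_timeout(0.3) and comparing the two numbers gives a rough
version of the manual observation described in the thread; exact values
depend on the platform and its timer resolution.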
> > Anyhow, I'll do some more testing and then if it still seems to
> > work I'll upload a patch later today.
>
> Great! Please assign to me so I hear about it.

I just uploaded the patch and assigned it to you.

- Geoff

From python@rcn.com Mon Jan 27 22:11:45 2003
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 27 Jan 2003 17:11:45 -0500
Subject: [Python-Dev] itertools module
References: <200301270532.h0R5WMx26396@oma.cosc.canterbury.ac.nz>
        <002301c2c5ff$713c3a20$125ffea9@oemcomputer>
        <15925.21009.294335.887541@montanaro.dyndns.org>
        <006e01c2c626$08fc3b00$125ffea9@oemcomputer>
        <15925.26818.413619.155977@montanaro.dyndns.org>
Message-ID: <001b01c2c651$14bd5c00$125ffea9@oemcomputer>

From: "Skip Montanaro"
> >> * loopzip() - It's not clear why its next() method should return a
> >> list instead of a tuple (again, a seemingly needless distinction
> >> with its builtin counterpart, zip()).
>
> Raymond> I've wrestled with this one. The short answer is that zip()
> Raymond> already does a pretty good job and that the only use for
> Raymond> loopzip() is super high speed looping. To that end, reusing a
> Raymond> single list instead of allocating and building tuples is *much*
> Raymond> faster.
>
> How do you know the caller doesn't squirrel away the list you returned on
> the n-th iteration? I don't see how you can safely reuse the same list.

After more thought, I think loopzip() is too unsavory and taints an
otherwise clean package, so I'll take it out unless someone wants to
stand up for it.

Too bad, it was an exceptionally fast solution to the lock-step
iteration problem.

Raymond

From munch@acm.org Mon Jan 27 22:19:24 2003
From: munch@acm.org (john paulson)
Date: Mon, 27 Jan 2003 14:19:24 -0800
Subject: [Python-Dev] HTMLParser patches
Message-ID:

I've submitted two patches for HTMLParser.py and
test_htmlparser.py. They were to fix two problems
lexing some html pages I found in the wild.

1. Allow "," in attributes
   A page had the attribute "color=rgb(1,2,3)", and the parser
   choked on the ",". Added the "," to the list of allowed
   characters.

2. More robust
   That could be a problem, since this is commonly used to
   support browsers that don't understand