From jack@oratrix.nl Sat Sep 1 00:10:43 2001 From: jack@oratrix.nl (Jack Jansen) Date: Sat, 01 Sep 2001 01:10:43 +0200 Subject: [Python-Dev] Would it make sense to issue an extra alpha next week? In-Reply-To: Message by Guido van Rossum , Fri, 31 Aug 2001 15:28:25 -0400 , <200108311928.PAA21836@cj20424-a.reston1.va.home.com> Message-ID: <20010831231048.94C92DDDEB@oratrix.oratrix.nl> I would be very much in favor of a 2.2a3 release next week, it would save me a lot of explaining differences between 2.2a2 on unix/win and on the Mac. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From gward@python.net Sat Sep 1 18:44:35 2001 From: gward@python.net (Greg Ward) Date: Sat, 1 Sep 2001 13:44:35 -0400 Subject: [Python-Dev] Would it make sense to issue an extra alpha next week? In-Reply-To: <200108311928.PAA21836@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Aug 31, 2001 at 03:28:25PM -0400 References: <200108311928.PAA21836@cj20424-a.reston1.va.home.com> Message-ID: <20010901134435.A513@gerg.ca> On 31 August 2001, Guido van Rossum said: > Does anybody object against the release of an extra release, 2.2a3, > around Sept. 5? We could do 2.2a4 on Sept. 19, or a week later if the > schedule gets too crowded. Makes sense to me -- "Release early, release often". Anyways, lots of big changes in 2.2 means lots of alpha releases are probably a good thing. Greg -- Greg Ward - Unix geek gward@python.net http://starship.python.net/~gward/ This quote intentionally left blank. From guido@python.org Sat Sep 1 20:15:27 2001 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Sep 2001 15:15:27 -0400 Subject: [Python-Dev] Proposal: get rid of compilerlike.py In-Reply-To: Your message of "Fri, 17 Aug 2001 15:46:29 EDT." 
<20010817154629.D1304@thyrsus.com> References: <200108171920.PAA10067@cj20424-a.reston1.va.home.com> <20010817154629.D1304@thyrsus.com> Message-ID: <200109011915.PAA25846@cj20424-a.reston1.va.home.com> About two weeks ago, Eric Raymond wrote: > OK. Don't wait on this, but I'm going to try to find time to check in > my stuff that provides compilerlike framework support for scripts. > > Code is tested, docs are written. The issue is packaging; I was going > to make it a separate ccframe module, but I'm thinking Greg Ewing's > suggestion that it should live in fileinput is on balance a good one. > But that means I have to merge in the docs. > > I'll probably get to this tonight. Eric didn't merge it with fileinput, but instead checked in a separate module "compilerlike". I have several comments on the code, but my main complaint is about process. This is a random bit of code that got checked in without any kind of discussion about whether it was worth checking into the standard library, and whether this particular implementation was right. There was some discussion afterwards, to which Eric did not respond. Given that Eric apparently doesn't care enough about his code to defend it, I propose to delete it from CVS. I'll do this as part of the 2.2a3 release which (given the encouraging feedback so far) I'll try to do around Sept. 5. --Guido van Rossum (home page: http://www.python.org/~guido/) From esr@thyrsus.com Sat Sep 1 20:41:36 2001 From: esr@thyrsus.com (Eric S. 
Raymond) Date: Sat, 1 Sep 2001 15:41:36 -0400 Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py In-Reply-To: <200109011915.PAA25846@cj20424-a.reston1.va.home.com>; from guido@python.org on Sat, Sep 01, 2001 at 03:15:27PM -0400 References: <200108171920.PAA10067@cj20424-a.reston1.va.home.com> <20010817154629.D1304@thyrsus.com> <200109011915.PAA25846@cj20424-a.reston1.va.home.com> Message-ID: <20010901154136.A22702@thyrsus.com> Guido van Rossum : > Eric didn't merge it with fileinput, but instead checked in a separate > module "compilerlike". I have several comments on the code, but my > main complaint is about process. This is a random bit of code that > got checked in without any kind of discussion about whether it was > worth checking into the standard library, and whether this particular > implementation was right. There was some discussion afterwards, to > which Eric did not respond. Given that Eric apparently doesn't care > enough about his code to defend it, I propose to delete it from CVS. > > I'll do this as part of the 2.2a3 release which (given the encouraging > feedback so far) I'll try to do around Sept. 5. Now, wait a second! There *was* in fact discussion of this thing beforehand. Several listmembers said they thought it was a good idea. And I have seen *no* discussion *at all* since it was checked in. What is going on here? Is it possible that you are mistaken about the timing of the checkin, and that what you thought was discussion afterwards was discussion before? Or am I somehow missing listmail? As for process issues...I agree that we need better procedures and criteria for what goes into the library. As you know I've made a start on developing same, but my understanding has been that *you* don't think you'll have the bandwidth for it until 2.2 is out. -- Eric S. Raymond As war and government prove, insanity is the most contagious of diseases. 
-- Edward Abbey From guido@python.org Sat Sep 1 21:08:51 2001 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Sep 2001 16:08:51 -0400 Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py In-Reply-To: Your message of "Sat, 01 Sep 2001 15:41:36 EDT." <20010901154136.A22702@thyrsus.com> References: <200108171920.PAA10067@cj20424-a.reston1.va.home.com> <20010817154629.D1304@thyrsus.com> <200109011915.PAA25846@cj20424-a.reston1.va.home.com> <20010901154136.A22702@thyrsus.com> Message-ID: <200109012008.QAA26004@cj20424-a.reston1.va.home.com> > > Eric didn't merge it with fileinput, but instead checked in a separate > > module "compilerlike". I have several comments on the code, but my > > main complaint is about process. This is a random bit of code that > > got checked in without any kind of discussion about whether it was > > worth checking into the standard library, and whether this particular > > implementation was right. There was some discussion afterwards, to > > which Eric did not respond. Given that Eric apparently doesn't care > > enough about his code to defend it, I propose to delete it from CVS. > > > > I'll do this as part of the 2.2a3 release which (given the encouraging > > feedback so far) I'll try to do around Sept. 5. > > Now, wait a second! Well, a response at last. > There *was* in fact discussion of this thing beforehand. Several > listmembers said they thought it was a good idea. And I have seen *no* > discussion *at all* since it was checked in. I have now re-read that discussion; it's in the archives starting this message: http://mail.python.org/pipermail/python-dev/2001-August/016629.html There were several suggestions to merge it with fileinput and some suggestions to restructure it. You seem to have ignored these except the criticism on the name "ccframe" (by choosing an even worse name :-). > What is going on here? 
Is it possible that you are mistaken about the > timing of the checkin, and that what you thought was discussion afterwards > was discussion before? Or am I somehow missing listmail? Your mail was probably broken -- it wouldn't be the first time :-(. There are two posts in the archives that start with a quote from the checkin mail: http://mail.python.org/pipermail/python-dev/2001-August/017131.html http://mail.python.org/pipermail/python-dev/2001-August/017132.html > As for process issues...I agree that we need better procedures and > criteria for what goes into the library. As you know I've made a > start on developing same, but my understanding has been that *you* > don't think you'll have the bandwidth for it until 2.2 is out. That's not an excuse for you to check in random bits of code. Some comments on the code: - A framework like this should be structured as a class or set of related classes, not a bunch of functions with function arguments. This would make the documentation easier to read as well; instead of having a bunch of functions you pass in, you customize the framework by overriding methods. - The name "compilerlike" is a really poor choice (there's nothing compiler-like in the code). - I would like to see failure to open the file handled differently (so the caller can issue decent error messages for inaccessible input files without having to catch all IOError exceptions), but again this is a policy issue that should be customizable. - The policy of not writing the output if it's identical to the input should be optional. There are contexts (like when the tool is invoked by a Makefile) where not writing the output could be harmful: if you touch the input without changing it, Make would invoke the tool over and over again because the output doesn't get touched by the tool.
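[The Makefile point above can be met halfway: write the output file only when its content actually changed, but still bump the target's timestamp so Make considers it up to date. A minimal sketch, not code from this thread -- the helper name, arguments, and return convention are all invented:]

```python
import os
import filecmp
import shutil

def commit_output(tmpfile, outfile, touch_for_make=True):
    # Hypothetical helper: the tool writes its result to tmpfile and
    # then calls this to decide whether outfile needs rewriting.
    if os.path.exists(outfile) and filecmp.cmp(tmpfile, outfile, shallow=False):
        # Content unchanged: discard the temp file, but update the
        # mtime so a Makefile rule still sees the target as fresh.
        os.remove(tmpfile)
        if touch_for_make:
            os.utime(outfile, None)
        return False
    # Content changed (or no previous output): install the temp file.
    shutil.move(tmpfile, outfile)
    return True
```

[With touch_for_make=False this is the original don't-rewrite behavior; with the default, a Makefile dependency on the output stays satisfied after a no-op run.]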
Moreover, there seem to be some bugs: if the output is the same as the input, the output file is not written even if a filename transformation was requested (making Make even less happy); when a transformation is specified by a string, an undefined variable 'stem' is used. Hasty work, Eric. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@rahul.net Sat Sep 1 21:55:34 2001 From: aahz@rahul.net (Aahz Maruch) Date: Sat, 1 Sep 2001 13:55:34 -0700 (PDT) Subject: [Python-Dev] Proposal: get rid of compilerlike.py In-Reply-To: <200109011915.PAA25846@cj20424-a.reston1.va.home.com> from "Guido van Rossum" at Sep 01, 2001 03:15:27 PM Message-ID: <20010901205534.818D9E8C6@waltz.rahul.net> Guido van Rossum wrote: > > Eric didn't merge it with fileinput, but instead checked in a separate > module "compilerlike". I have several comments on the code, but my > main complaint is about process. This is a random bit of code that > got checked in without any kind of discussion about whether it was > worth checking into the standard library, and whether this particular > implementation was right. There was some discussion afterwards, to > which Eric did not respond. Given that Eric apparently doesn't care > enough about his code to defend it, I propose to delete it from CVS. +1 Side note to Eric: as a subscriber to python-dev, but not currently a member of the actual development process, I generally don't see checkin messages unless someone else comments on them. I think at a minimum courtesy would suggest that you post an announcement of the checkin when it's feature work that hasn't been PEPed. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. From esr@thyrsus.com Sat Sep 1 22:52:10 2001 From: esr@thyrsus.com (Eric S.
Raymond) Date: Sat, 1 Sep 2001 17:52:10 -0400 Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py In-Reply-To: <200109012008.QAA26004@cj20424-a.reston1.va.home.com>; from guido@python.org on Sat, Sep 01, 2001 at 04:08:51PM -0400 References: <200108171920.PAA10067@cj20424-a.reston1.va.home.com> <20010817154629.D1304@thyrsus.com> <200109011915.PAA25846@cj20424-a.reston1.va.home.com> <20010901154136.A22702@thyrsus.com> <200109012008.QAA26004@cj20424-a.reston1.va.home.com> Message-ID: <20010901175210.A23313@thyrsus.com> Guido van Rossum : > I have now re-read that discussion; it's in the archives starting this > message: > > http://mail.python.org/pipermail/python-dev/2001-August/016629.html As have I. All the stuff in this thread was before the checkin; you were in fact mistaken about the timing of most of the discussion. > There were several suggestions to merge it with fileinput and some > suggestions to restructure it. You seem to have ignored these except > the criticism on the name "ccframe" (by choosing an even worse name > :-). I did not ignore these suggestions (one that I took was Greg Ward's suggestion that, after all, just throwing an exception was the right thing). And I was in fact planning to merge this thing with fileinput. Then I looked at what would have to be done to the documentation of fileinput -- in fact, I edited together a combined fileinput documentation page. The result was a mess that convinced me that this does indeed need to be a separate module. There wasn't enough coherence between the old fileinput stuff and my entry points to even make the *documentation* look like a logical unit, let alone the code. > > What is going on here? Is it possible that you are mistaken about the > > timing of the checkin, and that what you thought was discussion afterwards > > was discussion before? Or am I somehow missing listmail? > > Your mail was probably broken -- it wouldn't be the first time :-(. In the event, my mail was not broken.
> There are two posts in the archives that start with a quote from the > checkin mail: > > http://mail.python.org/pipermail/python-dev/2001-August/017131.html > http://mail.python.org/pipermail/python-dev/2001-August/017132.html Right...one of which completely misses the point by suggesting that this is a filter framework, and the other one of which is a "me too" basically addressing the naming issue. Guido, you are yourself *notorious* for dismissing naming issues with "that's unimportant" and "we can fix it later". How can you criticize me for doing likewise? > > As for process issues...I agree that we need better procedures and > > criteria for what goes into the library. As you know I've made a > > start on developing same, but my understanding has been that *you* > > don't think you'll have the bandwidth for it until 2.2 is out. > > That's not an excuse for you to check in random bits of code. So what, exactly, makes this 'random'? That, Guido, is not a rhetorical question. We don't have any procedures. We don't have any guidelines. We don't have any history of anything but discussing submissions on python-dev before somebody with commit access checks them in. If no -1 votes and the judgment of somebody with commit privileges who has already got a lot of stuff in the library is not sufficient, *what is*? I'm not trying to be difficult here, but this points at a weakness in our way of doing things. I want to play nice, but I can't if I don't know your actual rules. I don't know what *would* have been sufficient if what I did was not. I don't think anyone else does, either. > Some comments on the code: This is the sort of critique I was looking for two weeks ago, not a bunch of bikeshedding about how the thing should be named. > - A framework like this should be structured as a class or set of > related classes, not a bunch of functions with function arguments. 
> This would make the documentation easier to read as well; instead of > having a bunch of functions you pass in, you customize the framework > by overriding methods. Yes, I thought of this. There's a reason I didn't do it that way. Method override would work just fine as a way to pass in the filename transformer, but not the data transformer. The problem is this: the driver or "go do it" method of your hypothetical class (the one you'd pass sys.argv[1:]) can't know which overridden method to call in advance, because which one is right would depend on the argument signature of the hook function -- does it take filelike objects, does it take two strings, etc. Actually it's worse than that; two of the cases (the sponge and the line-by-line filtering) aren't even distinguishable by type signature. So, what the driver function could do is step through three method names looking to see which if any is overridden in the user-created subclass. But would that really be a gain in clarity over having three functions in the module? I'm willing to listen if you think the answer is "yes" and want to tell me why, but it didn't seem so to me. There's something else I could have done. I could have required that the hook function use specific unique formal argument names in each of the three cases and then had the driver code use inspect to dispatch among them -- but that seemed even more klugey. Maybe there is a really elegant and low-overhead method of wrapping these functions in a class, and I have just not found it yet. But if so, it is not (as you appear to believe) for lack of looking. If you have an insight that I have missed, I will cheerfully accept instruction on this issue. > - The name "compilerlike" is a really poor choice (there's nothing > compiler-like in the code). No, there isn't. It's called "compilerlike" because it's a framework for making compilerlike interfaces out of functions.
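[The "step through three method names" dispatch described above is easy enough to sketch. This is illustrative only -- the class and method names are invented, and the identity test on class attributes assumes modern method-lookup semantics (in 2001-era Python one would compare im_func instead):]

```python
class CompilerLikeBase:
    # The three possible hooks; a subclass is expected to override
    # exactly one of them.
    def transform_file(self, infp, outfp):
        raise NotImplementedError
    def transform_line(self, line):
        raise NotImplementedError
    def transform_data(self, data):
        raise NotImplementedError

    def pick_hook(self):
        # Two of the hooks take a single string, so argument signatures
        # cannot distinguish them; instead, check which method object
        # the subclass actually replaced.
        for name in ("transform_file", "transform_line", "transform_data"):
            if getattr(self.__class__, name) is not getattr(CompilerLikeBase, name):
                return name
        raise TypeError("no transform method overridden")

class Upcaser(CompilerLikeBase):
    def transform_line(self, line):
        return line.upper()

print(Upcaser().pick_hook())  # -> transform_line
```

[The driver would then call the chosen hook with the right arguments for its flavor; the probe itself is only a handful of lines, which is the trade-off being debated here.]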
But I'm not attached to that name; CompilerFramework or something of that sort would be fine. > - I would like to see failure to open the file handled differently (so > the caller can issue decent error messages for inaccessible input > files without having to catch all IOError exceptions), but again > this is a policy issue that should be customizable. Originally the code fielded file I/O errors by complaining to stderr and then exiting. At least two respondents argued that it should simply throw an exception and let the caller do policy, and upon reflection I came to agree with this (this is one of those suggestions you thought I was ignoring). I realize it's tempting to try and embed a range of policy options in the module to save time, but unless we can have reasonable confidence that they will cover all important cases I don't judge the complexity overhead to be worth it. Again, I am open to instruction on this. > - The policy of not writing the output if it's identical to the input > should be optional. There are contexts (like when the tool is > invoked by a Makefile) where not writing the output could be > harmful: if you touch the input without changing it, Make would > invoke the tool over and over again because the output doesn't get > touched by the tool. Interesting point. A better rule, perhaps, would be to suppress writing of output only if both the content *and* the transformed filename are identical -- that would avoid doing a spurious touch on a no-op modification in place, without confusing Make. > Moreover, there seem to be some bugs: if the > output is the same as the input, the output file is not written even > if a filename transformation was requested (making Make even less > happy); when a transformation is specified by a string, an undefined > variable 'stem' is used. Hasty work, Eric. :-( I'll take the hit for this; my test framework should have covered that case and didn't, because I was in a hurry to get in before the freeze.
However, I know the other cases work because I'm *using* them. OK, so here's how I see it: 1. I made a minor implementation error with one case; this can be fixed. 2. You were mistaken in believing that (a) there was no discussion or endorsement of the idea beforehand, and that (b) I did not defend or justify the design. 3. Some of the respondents simply missed the point; this thing is *not* a framework for creating filters, and shouldn't be named like one or put in the wrong library bin because of it. 4. There is room for technical debate about the interface design, but no choice I'm aware of that is *obviously* better than three functions -- the class-wrapper approach would have unobvious problems doing the hook function dispatch properly. 5. I was trying to do the right thing, but we sorely lack a useful set of norms for what constitutes `good' vs. `bad' library checkins. I am actively interested in helping solve this problem. -- Eric S. Raymond Non-cooperation with evil is as much a duty as cooperation with good. -- Mohandas Gandhi From bckfnn@worldonline.dk Sat Sep 1 23:13:50 2001 From: bckfnn@worldonline.dk (Finn Bock) Date: Sat, 01 Sep 2001 22:13:50 GMT Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: References: Message-ID: <3b915c2b.57340811@mail.wanadoo.dk> >> + + A new command line option, -D, is added ... I don't know if it is worth considering, but jython already uses the -D option to set registry properties on the command line. -Dprop=v : Set the property `prop' to value `v' Jython would have to come up with a command line option different from -D to control division behavior.
regards, finn From gmcm@hypernet.com Sun Sep 2 00:53:20 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 1 Sep 2001 19:53:20 -0400 Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py In-Reply-To: <20010901175210.A23313@thyrsus.com> References: <200109012008.QAA26004@cj20424-a.reston1.va.home.com>; from guido@python.org on Sat, Sep 01, 2001 at 04:08:51PM -0400 Message-ID: <3B913CB0.6947.2AA3D7DF@localhost> [Eric] > Right...one of which completely misses the point by suggesting > that this is a filter framework, True. It's a tad more general than just a filter. How about UnixishTextfileTransformFramework since that captures its most distinguishing feature (the fact that it follows a popular unix convention). Also hints that it would chew the hell out of a binary file on Windows. [Guido] > > - A framework like this should be structured as a class or set > > of > > related classes, not a bunch of functions with function > > arguments. [Eric] > Yes, I thought of this. There's a reason I didn't do it that > way. Method override would work just fine as a way to pass in the > filename transformer, but not the data transformer. > > The problem is this: the driver or "go do it" method of your > hypothetical class (the one you'd pass sys.argv[1:]) can't know > which overridden method to call in advance, because which one is > right would depend on the argument signature of the hook function Who cares about signatures? Try the attached.
- Gordon

-------------- Enclosure number 1 ----------------

    import sys, os, filecmp

    class UnixishFileTransformer:
        # The four hooks; a subclass overrides the transform flavor it
        # wants, plus (optionally) the filename transformation.
        def transformFile(self, infp, outfp):
            raise NotImplementedError
        def transformData(self, data):
            raise NotImplementedError
        def transformLine(self, line):
            raise NotImplementedError
        def transformFilename(self, nm):
            raise NotImplementedError

        def run(self, args):
            if not args:
                self.fnm = "stdin"
                self._do_one(sys.stdin, sys.stdout)
            else:
                for fnm in args:
                    if fnm == '-':
                        # treat '-' like the no-argument case: filter
                        # stdin to stdout, with no temp-file dance
                        self.fnm = "stdin"
                        self._do_one(sys.stdin, sys.stdout)
                        continue
                    infp = open(fnm, 'r')
                    self.fnm = fnm
                    tempfile = fnm + ".~%d~" % os.getpid()
                    outfp = open(tempfile, "w")
                    try:
                        self._do_one(infp, outfp)
                    except:
                        os.remove(tempfile)
                        raise
                    infp.close()
                    outfp.close()
                    try:
                        newnm = self.transformFilename(self.fnm)
                    except NotImplementedError:
                        # no renaming requested: skip the rewrite (and
                        # discard the temp file) if nothing changed
                        if filecmp.cmp(self.fnm, tempfile):
                            os.remove(tempfile)
                            continue
                        newnm = self.fnm
                    os.remove(self.fnm)
                    os.rename(tempfile, newnm)

        def _do_one(self, infp, outfp):
            # try the whole-file hook first, then line-by-line, then
            # the all-at-once "sponge"
            try:
                self.transformFile(infp, outfp)
            except NotImplementedError:
                try:
                    while 1:
                        line = infp.readline()
                        if not line:
                            break
                        outfp.write(self.transformLine(line))
                except NotImplementedError:
                    try:
                        # transformLine raised on the first line read,
                        # so `line` still holds it
                        lump = line + infp.read()
                        outfp.write(self.transformData(lump))
                    except NotImplementedError:
                        raise NotImplementedError, \
                              "no implementation of a transform method provided"

    if __name__ == '__main__':
        import getopt

        class T1(UnixishFileTransformer):
            def transformFilename(self, fnm):
                return fnm + '.out'
            def transformFile(self, infp, outfp):
                while 1:
                    line = infp.readline()
                    if not line:
                        break
                    if line == "\n":
                        outfp.write("------------------------------------------\n")
                    outfp.write(line)

        class T2(UnixishFileTransformer):
            def transformFilename(self, fnm):
                return fnm + '.out2'
            def transformLine(self, line):
                return "<" + line[:-1] + ">\n"

        class T3(UnixishFileTransformer):
            def transformFilename(self, fnm):
                return fnm + '.foo'
            def transformData(self, data):
                lines = data.split("\n")
                lines.reverse()
                return "\n".join(lines)

        (options, arguments) = getopt.getopt(sys.argv[1:], "fls")
        for (switch, val) in options:
            if switch == '-f':
                t = T1()
                t.run(arguments)
            elif switch == '-l':
                t = T2()
                t.run(arguments)
            elif switch == '-s':
                t = T3()
                t.run(arguments)
            else:
                print "Unknown option."

From guido@python.org Sun Sep 2 03:10:10 2001 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Sep 2001 22:10:10 -0400 Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: Your message of "Sat, 01 Sep 2001 22:13:50 GMT." <3b915c2b.57340811@mail.wanadoo.dk> References: <3b915c2b.57340811@mail.wanadoo.dk> Message-ID: <200109020210.WAA26637@cj20424-a.reston1.va.home.com> > >> + + A new command line option, -D, is added ... > > I don't know if it is worth considering, but jython already uses the -D > option to set registry properties on the command line. > > -Dprop=v : Set the property `prop' to value `v' > > Jython would have to come up with a command line option different from > -D to control division behavior. Darn. I wished you'd said something earlier -- I put this in PEP 237 weeks ago. Does Jython generally try to follow Python's command line options? Would it make sense to define a Jython property for the division behavior? Then I could change the syntax to -Ddivision=old, -Ddivision=warn, -Ddivision=new. That's a bit long, but acceptable. Otherwise, the only mnemonic I can think of would be -/old, -/warn, -/new. But that may be confusing for some users who are expecting options to be letters. I guess we could use -q for quotient. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Sep 2 03:36:07 2001 From: guido@python.org (Guido van Rossum) Date: Sat, 01 Sep 2001 22:36:07 -0400 Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py In-Reply-To: Your message of "Sat, 01 Sep 2001 17:52:10 EDT."
<20010901175210.A23313@thyrsus.com> References: <200108171920.PAA10067@cj20424-a.reston1.va.home.com> <20010817154629.D1304@thyrsus.com> <200109011915.PAA25846@cj20424-a.reston1.va.home.com> <20010901154136.A22702@thyrsus.com> <200109012008.QAA26004@cj20424-a.reston1.va.home.com> <20010901175210.A23313@thyrsus.com> Message-ID: <200109020236.WAA26773@cj20424-a.reston1.va.home.com> > Guido van Rossum : > > I have now re-read that discussion; it's in the archives starting this > > message: > > > > http://mail.python.org/pipermail/python-dev/2001-August/016629.html > > As have I. All the stuff in this thread was before the checkin; > you were in fact mistaken about the timing of most of the discussion. No, I was not mistaken about the timing; you must have misunderstood what I said about the timing. When I posted the URL for this thread this afternoon, I knew that it took place before your checkin. I did not see evidence either in the mailing list or in the code that you took any of the advice though. > > There were several suggestions to merge it with fileinput and some > > suggestions to restructure it. You seem to have ignored these except > > the criticism on the name "ccframe" (by choosing an even worse name > > :-). > > I did not ignore these suggestions (one that I took was Greg Ward's > suggestion that, after all, just throwing an exception was the right > thing). And I was in fact planning to merge this thing with fileinput. > > Then I looked at what would have to be done to the documentation of > fileinput -- in fact, I edited together a combined fileinput > documentation page. The result was a mess that convinced me that this > does indeed need to be a separate module. There wasn't enough > coherence between the old fileinput stuff and my entry points to even > make the *documentation* look like a logical unit, let alone the code. So, as a matter of process, you should not have checked it in without coming back to the list with your experience.
> > > What is going on here? Is it possible that you are mistaken > > > about the timing of the checkin, and that what you thought was > > > discussion afterwards was discussion before? Or am I somehow > > > missing listmail? > > > > Your mail was probably broken -- it wouldn't be the first time :-(. > > In the event, my mail was not broken. > > > There are two posts in the archives that start with a quote from the > > checkin mail: > > > > http://mail.python.org/pipermail/python-dev/2001-August/017131.html > > http://mail.python.org/pipermail/python-dev/2001-August/017132.html > > Right...one of which completely misses the point by suggesting that > this is a filter framework, and the other one of which is a "me too" > basically addressing the naming issue. Guido, you are yourself > *notorious* for dismissing naming issues with "that's unimportant" and > "we can fix it later". How can you criticize me for doing likewise? I am criticizing you for not responding at all to the feedback -- whether it was mistaken or not. That's another violation of process. Why is suggesting it is a filtering framework completely missing the point? If this is not a filter framework, WHAT IS IT? > > > As for process issues...I agree that we need better procedures and > > > criteria for what goes into the library. As you know I've made a > > > start on developing same, but my understanding has been that *you* > > > don't think you'll have the bandwidth for it until 2.2 is out. > > > > That's not an excuse for you to check in random bits of code. > > So what, exactly, makes this 'random'? > > That, Guido, is not a rhetorical question. We don't have any > procedures. We don't have any guidelines. We don't have any history > of anything but discussing submissions on python-dev before somebody > with commit access checks them in. If no -1 votes and the judgment of > somebody with commit privileges who has already got a lot of stuff > in the library is not sufficient, *what is*? 
Absence of -1 votes is not enough. I didn't see any +1 votes -- just suggestions to try a different tack. I happened to be too busy at the time you checked this in to weigh in, but I had a big -1 in my head which I thought was reflected by other comments. Eric, I respect you as a person, but as a Python developer, I don't trust your judgement enough to let you check stuff in without a clear green light from me. > I'm not trying to be difficult here, but this points at a weakness in > our way of doing things. I want to play nice, but I can't if I don't > know your actual rules. I don't know what *would* have been sufficient if > what I did was not. I don't think anyone else does, either. Everybody else who doesn't know the rules for sure starts a discussion, either here or on the patch manager. You are the only one of the 30+ committers who *repeatedly* commits controversial stuff. I'm not saying that the rules are clear enough (they clearly aren't if even you don't get them), but I think there's a better way to get clarity than by acting like a bull in a china cabinet. > > Some comments on the code: > > This is the sort of critique I was looking for two weeks ago, not a bunch > of bikeshedding about how the thing should be named. I'll respond to this later. First I want you to be clear on the process: commit privileges are not to be used to force an issue. (Admin privileges will be used to force an issue if necessary. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From esr@thyrsus.com Sun Sep 2 05:16:44 2001 From: esr@thyrsus.com (Eric S. 
Raymond) Date: Sun, 2 Sep 2001 00:16:44 -0400 Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py In-Reply-To: <3B913CB0.6947.2AA3D7DF@localhost>; from gmcm@hypernet.com on Sat, Sep 01, 2001 at 07:53:20PM -0400 References: <200109012008.QAA26004@cj20424-a.reston1.va.home.com>; <20010901175210.A23313@thyrsus.com> <3B913CB0.6947.2AA3D7DF@localhost> Message-ID: <20010902001644.D28462@thyrsus.com> Gordon McMillan : > True. It's a tad more general than just a filter. How about > UnixishTextfileTransformFramework > since that captures its most distinguishing feature (the fact > that it follows a popular unix convention). Also hints that it > would chew the hell out of a binary file on Windows. Eh? What's specific to text files in the design? -- Eric S. Raymond A wise and frugal government, which shall restrain men from injuring one another, which shall leave them otherwise free to regulate their own pursuits of industry and improvement, and shall not take from the mouth of labor the bread it has earned. This is the sum of good government, and all that is necessary to close the circle of our felicities. -- Thomas Jefferson, in his 1801 inaugural address From bckfnn@worldonline.dk Sun Sep 2 09:37:07 2001 From: bckfnn@worldonline.dk (Finn Bock) Date: Sun, 02 Sep 2001 08:37:07 GMT Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: <200109020210.WAA26637@cj20424-a.reston1.va.home.com> References: <3b915c2b.57340811@mail.wanadoo.dk> <200109020210.WAA26637@cj20424-a.reston1.va.home.com> Message-ID: <3b91e7cd.2126948@mail.wanadoo.dk> >> >> + + A new command line option, -D, is added ... >> >> I don't know if it is worth considering, but jython already uses the -D >> option to set registry properties on the command line.
>> >> -Dprop=v : Set the property `prop' to value `v' >> >> Jython would have to come up with a command line option different from >> -D to control division behavior. [GvR] >Darn. I wished you'd said something earlier -- I put this in PEP 237 >weeks ago. So many peps and so little time. Following the discussion of peps like this is a bit of a strain when I really don't care what division returns. I'll try to be more proactive about new pep revisions in the future. >Does Jython generally try to follow Python's command line options? We try to. -i, -S, -W, -v and -c are the same. A very recently added -E option has different semantics, but that is a bug. In addition we have the options -D, -jar, --help and --version. >Would it make sense to define a Jython property for the division >behavior? That would work. >Then I could change the syntax to -Ddivision=old, >-Ddivision=warn, -Ddivision=new. That's a bit long, but acceptable. Our normal property names are a lot longer: -Dpython.security.respectJavaAccessibility=false -Dpython.options.showJavaExceptions=true because there are situations where the jython properties must live in the same namespace as the properties for java, corba etc. As a practical solution, -Ddivision=warn would work fine. regards, finn From gmcm@hypernet.com Sun Sep 2 11:48:12 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Sun, 2 Sep 2001 06:48:12 -0400 Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py In-Reply-To: <20010902001644.D28462@thyrsus.com> References: <3B913CB0.6947.2AA3D7DF@localhost>; from gmcm@hypernet.com on Sat, Sep 01, 2001 at 07:53:20PM -0400 Message-ID: <3B91D62C.14172.2CFB65B0@localhost> > Gordon McMillan : > > ... it would chew the hell out of a > > binary file on Windows. [Eric] > Eh? What's specific to text files in the design? The lack of a "b" in the filemode.
- Gordon From ping@lfw.org Sun Sep 2 11:58:14 2001 From: ping@lfw.org (Ka-Ping Yee) Date: Sun, 2 Sep 2001 03:58:14 -0700 (PDT) Subject: [Python-Dev] Check-in process In-Reply-To: <20010901154136.A22702@thyrsus.com> Message-ID: > As for process issues...I agree that we need better procedures and criteria > for what goes into the library. For a start, my current understanding is: - Adding a module to the standard library is a big change. - Modifying a core type or the interpreter loop is a big change. - Breaking compatibility is a big change. - Any big change requires confirmation specifically from Guido. I was pretty surprised to see compilerlike.py go in. Even if i had seen +1s from everyone else on python-dev, i still wouldn't have been comfortable checking it in without hearing from the BDFL. -- ?!ng From gstein@lyra.org Sun Sep 2 12:19:27 2001 From: gstein@lyra.org (Greg Stein) Date: Sun, 2 Sep 2001 04:19:27 -0700 Subject: [Python-Dev] Proposal: get rid of compilerlike.py In-Reply-To: <20010901175210.A23313@thyrsus.com>; from esr@thyrsus.com on Sat, Sep 01, 2001 at 05:52:10PM -0400 References: <200108171920.PAA10067@cj20424-a.reston1.va.home.com> <20010817154629.D1304@thyrsus.com> <200109011915.PAA25846@cj20424-a.reston1.va.home.com> <20010901154136.A22702@thyrsus.com> <200109012008.QAA26004@cj20424-a.reston1.va.home.com> <20010901175210.A23313@thyrsus.com> Message-ID: <20010902041927.L4184@lyra.org> On Sat, Sep 01, 2001 at 05:52:10PM -0400, Eric S. Raymond wrote: > Guido van Rossum : >... > > > As for process issues...I agree that we need better procedures and > > > criteria for what goes into the library. As you know I've made a > > > start on developing same, but my understanding has been that *you* > > > don't think you'll have the bandwidth for it until 2.2 is out. > > > > That's not an excuse for you to check in random bits of code. > > So what, exactly, makes this 'random'? > > That, Guido, is not a rhetorical question. We don't have any > procedures. 
We don't have any guidelines. We don't have any history > of anything but discussing submissions on python-dev before somebody > with commit access checks them in. If no -1 votes and the judgment of > somebody with commit privileges who has already got a lot of stuff > in the library is not sufficient, *what is*? > > I'm not trying to be difficult here, but this points at a weakness in > our way of doing things. I want to play nice, but I can't if I don't > know your actual rules. I don't know what *would* have been sufficient if > what I did was not. I don't think anyone else does, either. I've got a couple modules that may or may not be going into Lib. I described the general outline in PEP 268, and will begin developing those modules in nondist/sandbox sometime this week. Despite having commit privs, I'm not about to just toss those modules right into Lib. While there seems to be a very gentle consensus that they be included, they aren't even written. I'm using the PEP to describe the overall design to people so they can provide steering/commentary before coding starts. I'll be using the sandbox to give people a chance to see them as they develop and *before* they go into Lib. Hell... it even gives people a way to *assist*. Once I feel they're "done enough for an alpha release", then I'll post for a final call to move them to Lib. Of course, if we're in the beta time frame by then, then I may have some problems :-) (but they shouldn't go that long) Yes, I could simply write them and check them in. I feel quite comfortable claiming expertise in HTTP-based networking. But an immediate checkin has a very direct perception: "I know what I'm doing and don't need feedback." I've got a lot of respect for the other developers in this forum, and want any feedback they may have. Thus, I'll do what I can to provide that opportunity. 
[ we're all busy, so I'll get very little, but giving people the *chance* is a good warm&fuzzy and for the hope to get that *one* comment that really slaps me around to realize there is a Better Way ] The point here is: visibility, ability to provide feedback, and a stepwise process for moving modules from inception to Lib integration. It doesn't need to be written. It is simply a social thing, based on respect for your peers. Cheers, -g -- Greg Stein, http://www.lyra.org/ From fredrik@effbot.org Sun Sep 2 13:29:54 2001 From: fredrik@effbot.org (Fredrik Lundh) Date: Sun, 2 Sep 2001 14:29:54 +0200 Subject: [Python-Dev] Proposal: get rid of compilerlike.py References: <200108171920.PAA10067@cj20424-a.reston1.va.home.com> <20010817154629.D1304@thyrsus.com> <200109011915.PAA25846@cj20424-a.reston1.va.home.com> Message-ID: <00c901c133ab$09d012e0$42fb42d5@hagrid> Guido wrote: > I propose to delete it from CVS. +1 (delete or move to the non-dist sandbox) > I'll do this as part of the 2.2a3 release which (given the encouraging > feedback so far) I'll try to do around Sept. 5. From gstein@lyra.org Sun Sep 2 18:22:54 2001 From: gstein@lyra.org (Greg Stein) Date: Sun, 2 Sep 2001 10:22:54 -0700 Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: <200109020210.WAA26637@cj20424-a.reston1.va.home.com>; from guido@python.org on Sat, Sep 01, 2001 at 10:10:10PM -0400 References: <3b915c2b.57340811@mail.wanadoo.dk> <200109020210.WAA26637@cj20424-a.reston1.va.home.com> Message-ID: <20010902102254.M4184@lyra.org> On Sat, Sep 01, 2001 at 10:10:10PM -0400, Guido van Rossum wrote: >... > Would it make sense to define a Jython property for the division > behavior? Then I could change the syntax to -Ddivision=old, > -Ddivision=warn, -Ddivision=new. That's a bit long, but acceptable. This would be similar to compilers which use -Dsym[=value]. Apache has a similar -Dsym[=value] option.
Note that a general -Dkey=value option would be quite nice. They could all end in sys.startoptions or somesuch. After the -D switches are processed, the startup code can look in the dictionary for "division" to determine what to do. sys.startoptions would be a generalized way to pass parameters into subsystems which otherwise have no control of the command line. (Apache 2.0 uses this to pass params to the modules which handle request processing) By using sys.startoptions, it would also be portable to systems which don't usually use a command line (Windows, Mac, GUIs, etc); they could potentially get those options from the registry or whatever. Cheers, -g -- Greg Stein, http://www.lyra.org/ From thomas@xs4all.net Sun Sep 2 19:02:38 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sun, 2 Sep 2001 20:02:38 +0200 Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: <20010902102254.M4184@lyra.org> References: <3b915c2b.57340811@mail.wanadoo.dk> <200109020210.WAA26637@cj20424-a.reston1.va.home.com> <20010902102254.M4184@lyra.org> Message-ID: <20010902200238.P874@xs4all.nl> On Sun, Sep 02, 2001 at 10:22:54AM -0700, Greg Stein wrote: > sys.startoptions would be a generalized way to pass parameters into > subsystems which otherwise have no control of the command line. (Apache 2.0 > uses this to pass params to the modules which handle request processing) By > using sys.startoptions, it would also be portable to systems which don't > usually use a command line (Windows, Mac, GUIs, etc); they could potentially > get those options from the registry or whatever. I'm not sure this is really that useful. In my experience, by far the most common way to execute python is as a script handler, where the -D mechanism is close to useless (if you want to pass such fairly static options into the script, you might as well hardcode them several lines lower.) 
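For illustration only, here is one way such general -Dkey=value switches could be split out of the argument list into a dictionary before the rest of the command line is handled; the function name and the returned dict are hypothetical, not actual interpreter code:

```python
def parse_d_options(argv):
    """Split -Dkey=value switches out of argv into an options dict."""
    options, rest = {}, []
    for arg in argv:
        if arg.startswith("-D") and "=" in arg:
            # "-Ddivision=warn" -> key "division", value "warn"
            key, _, value = arg[2:].partition("=")
            options[key] = value
        else:
            rest.append(arg)
    return options, rest

opts, rest = parse_d_options(["-Ddivision=warn", "script.py", "-v"])
assert opts == {"division": "warn"}
assert rest == ["script.py", "-v"]
```

A real implementation would of course live in the C startup code and stash the dict in something like the proposed sys.startoptions, but the bookkeeping is no more involved than this.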
Contrary to Apache, Python itself isn't often started directly. As for scripts themselves, they all do their own optionparsing anyway (or should.) Back-from-vacation-and-*almost*--caught-up-with-email-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From bckfnn@worldonline.dk Sun Sep 2 19:08:04 2001 From: bckfnn@worldonline.dk (Finn Bock) Date: Sun, 02 Sep 2001 18:08:04 GMT Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: <20010902102254.M4184@lyra.org> References: <3b915c2b.57340811@mail.wanadoo.dk> <200109020210.WAA26637@cj20424-a.reston1.va.home.com> <20010902102254.M4184@lyra.org> Message-ID: <3b927419.38042722@mail.wanadoo.dk> [Greg Stein] >Note that a general -Dkey=value option would be quite nice. They could all >end in sys.startoptions or somesuch. Jython calls it sys.registry: Jython 2.1b1 on java1.4.0-beta (JIT: null) Type "copyright", "credits" or "license" for more information. >>> import sys >>> sys.registry {python.path=d:\python\Python211\Lib} >After the -D switches are processed, >the startup code can look in the dictionary for "division" to determine what >to do. > >sys.startoptions would be a generalized way to pass parameters into >subsystems which otherwise have no control of the command line. (Apache 2.0 >uses this to pass params to the modules which handle request processing) By >using sys.startoptions, it would also be portable to systems which don't >usually use a command line (Windows, Mac, GUIs, etc); they could potentially >get those options from the registry or whatever. Jython reads such options from the files "${sys.prefix}/registry" and "${HOME}/.jython" in addition to the -D options on the command line. 
regards, finn From guido@python.org Sun Sep 2 22:28:13 2001 From: guido@python.org (Guido van Rossum) Date: Sun, 02 Sep 2001 17:28:13 -0400 Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: Your message of "Sun, 02 Sep 2001 10:22:54 PDT." <20010902102254.M4184@lyra.org> References: <3b915c2b.57340811@mail.wanadoo.dk> <200109020210.WAA26637@cj20424-a.reston1.va.home.com> <20010902102254.M4184@lyra.org> Message-ID: <200109022128.RAA02908@cj20424-a.reston1.va.home.com> > > Would it make sense to define a Jython property for the division > > behavior? Then I could change the syntax to -Ddivision=old, > > -Ddivision=warn, -Ddivision=new. That's a bit long, but acceptable. > > This would be similar to compilers which use -Dsym[=value]. Apache has a > similar -Dsym[=value] option. > > Note that a general -Dkey=value option would be quite nice. They could all > end in sys.startoptions or somesuch. After the -D switches are processed, > the startup code can look in the dictionary for "division" to determine what > to do. > > sys.startoptions would be a generalized way to pass parameters into > subsystems which otherwise have no control of the command line. (Apache 2.0 > uses this to pass params to the modules which handle request processing) By > using sys.startoptions, it would also be portable to systems which don't > usually use a command line (Windows, Mac, GUIs, etc); they could potentially > get those options from the registry or whatever. Yes, would be nice - but I have no time for that if I want to make a release this week (or later). I think I'll go for something stupid and simple now, like just using -q to turn on division warnings, and nothing else (the -Dnew option is not very useful I think). --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@loewis.home.cs.tu-berlin.de Sun Sep 2 22:44:07 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Sun, 2 Sep 2001 23:44:07 +0200 Subject: [Python-Dev] Change from "config.h" to "pyconfig.h" may be a ticking #include bomb Message-ID: <200109022144.f82Li7e06309@mira.informatik.hu-berlin.de> > To pick up this sort of bug, perhaps "make install" should attempt > to remove config.h if it exists in the target include directory? I don't think so. Once Python 2.2 is released, it will install its headers into /include/python2.2, which will be empty since it is the first usage ever of this directory; the previous *release* installed its files into a different location. Giving the advice to people to remove their /include/python2.2, and (most of) /lib/python2.2 if they update from an earlier alpha release might be a good idea, though: there might be other removed or renamed files whose absence they would not notice. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Sun Sep 2 22:57:57 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 2 Sep 2001 23:57:57 +0200 Subject: [Python-Dev] CVS tags, unix Python and MacPython Message-ID: <200109022157.f82Lvvs06479@mira.informatik.hu-berlin.de> > Things will get worse with the 2.2a2 release, as there are a couple > of things that I definitely want in there that aren't finished yet, > so it'll be at least another week until I'm done, and moreover I'll > have to be making a few (minor) mods to files outside the Mac > subtree. From a procedural, quality-guaranteeing point of view, I think you shouldn't release anything as "MacPython 2.2a2" which does not come from the 2.2a2 tag (and, as Guido explains, you should never move a tag). There is no such thing as the "Unix release" of Python 2.2a2; there is the source distribution, and there happens to be a Windows binary (which I trust came from the tagged source); everything coordinated by the release manager.
IOW, if your changes missed the release date by a few days, tough luck - PEP 251 is there precisely for the purpose to align your changes with the rest of the tree, and to enable you to get a consistent, if not perfect, code base included in the alpha release. In theory, all you have to do is to compile this to get a binary distribution, perhaps with any add-on modules that you also want to distribute. It seems that nobody else is making much fuss about this, so I'll shut up now... Regards, Martin From thomas@xs4all.net Sun Sep 2 23:10:45 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 3 Sep 2001 00:10:45 +0200 Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: <200109022128.RAA02908@cj20424-a.reston1.va.home.com> References: <3b915c2b.57340811@mail.wanadoo.dk> <200109020210.WAA26637@cj20424-a.reston1.va.home.com> <20010902102254.M4184@lyra.org> <200109022128.RAA02908@cj20424-a.reston1.va.home.com> Message-ID: <20010903001045.Q874@xs4all.nl> On Sun, Sep 02, 2001 at 05:28:13PM -0400, Guido van Rossum wrote: > I think I'll go for something stupid and simple now, like just using > -q to turn on division warnings, and nothing else (the -Dnew option is > not very useful I think). I'm fine with a single option, but I'd like to note two things: we already have a command line option to turn on specific warnings, do we really need another ? And -q, to me, *screams* 'quiet', exactly the wrong thing... How about -Q instead, if we really need a separate option for it ? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tim.one@home.com Sun Sep 2 23:48:58 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 2 Sep 2001 18:48:58 -0400 Subject: [Python-Dev] CVS tags, unix Python and MacPython In-Reply-To: <200109022157.f82Lvvs06479@mira.informatik.hu-berlin.de> Message-ID: [Martin v. Loewis] > ... 
> There is no such thing as the "Unix release" of Python 2.2a2; there is > the source distribution, and there happens to be a Windows binary > (which I trust came from the tagged source); everything coordinated by > the release manager. In theory, the Windows installer for release I.JAK can be reproduced exactly by checking out the CVS tree tagged for that release. In practice, it also relies on things like the MS runtime DLLs we pack, the expat and zlib and bsddb and Tcl/Tk binaries, and the version of Wise used to build the installer. The release manager hasn't had much to do with that, though, since the last time I was release manager . Instead the PythonLabs guys all talk about this a lot before a release, and Fred cuts a special HTML-doc snapshot for me (to pack into the Windows installer), and we shoot emails back and forth in private until the Windows (me) and Unix (them) parts are ready to go. Jack doesn't have that kind of intimate access to (and veto power over!) the release process, so there's going to be some kind of compromise somewhere. I think it should be limited to post-release changes in the Mac subtree alone, and that he should tag the whole tree with mac-rIJAK for easy reproducibility later. > ... > It seems that nobody else is making much fuss about this, so I'll shut > up now... I expect all of us running primarily on Macs gave it top priority . From barry@zope.com Mon Sep 3 01:34:36 2001 From: barry@zope.com (Barry A. Warsaw) Date: Sun, 2 Sep 2001 20:34:36 -0400 Subject: [Python-Dev] Check-in process References: <20010901154136.A22702@thyrsus.com> Message-ID: <15250.53276.889277.766068@anthem.wooz.org> >>>>> "KY" == Ka-Ping Yee writes: KY> I was pretty surprised to see compilerlike.py go in. Even if KY> i had seen +1s from everyone else on python-dev, i still KY> wouldn't have been comfortable checking it in without hearing KY> from the BDFL. 
Yeah, usually the BDFL has to berate you for weeks to get some minor addition, like oh, mimelib, into the standard library . -Barry (Yes, Guido, I'm working on it! :) From tim.one@home.com Mon Sep 3 01:46:23 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 2 Sep 2001 20:46:23 -0400 Subject: [Python-Dev] Check-in process In-Reply-To: <15250.53276.889277.766068@anthem.wooz.org> Message-ID: [Barry A. Warsaw] > Yeah, usually the BDFL has to berate you for weeks to get some minor > addition, like oh, mimelib, into the standard library . Yeah, I usually check something into Tools/scripts/ after 50 people ask for it. Guido never says anything, but I can sense him scowling. Then after a few years he demands to move it into the library. So I do. Then he complains that the exposed API is more suitable for a tool than a library, but that we have more important things to do than twiddle the API. It's a lot of fun to play within the system ! speaking-of-which-i-should-check-combgen.py-into-Tools/scripts-ly y'rs - tim From guido@python.org Mon Sep 3 02:52:42 2001 From: guido@python.org (Guido van Rossum) Date: Sun, 02 Sep 2001 21:52:42 -0400 Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: Your message of "Mon, 03 Sep 2001 00:10:45 +0200." <20010903001045.Q874@xs4all.nl> References: <3b915c2b.57340811@mail.wanadoo.dk> <200109020210.WAA26637@cj20424-a.reston1.va.home.com> <20010902102254.M4184@lyra.org> <200109022128.RAA02908@cj20424-a.reston1.va.home.com> <20010903001045.Q874@xs4all.nl> Message-ID: <200109030152.VAA03119@cj20424-a.reston1.va.home.com> > > I think I'll go for something stupid and simple now, like just using > > -q to turn on division warnings, and nothing else (the -Dnew option is > > not very useful I think). > > I'm fine with a single option, but I'd like to note two things: we already > have a command line option to turn on specific warnings, do we really need > another ? 
Yes, for efficiency reasons. A call to PyErr_Warn(), even if it doesn't print anything, is very expensive (compared to a division): it calls out to the warnings.py module which goes through a list of filters etc., etc. > And -q, to me, *screams* 'quiet', exactly the wrong thing... How > about -Q instead, if we really need a separate option for it ? OK. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Mon Sep 3 07:47:19 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 3 Sep 2001 08:47:19 +0200 Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) References: <3b915c2b.57340811@mail.wanadoo.dk> <200109020210.WAA26637@cj20424-a.reston1.va.home.com> <20010902102254.M4184@lyra.org> <200109022128.RAA02908@cj20424-a.reston1.va.home.com> <20010903001045.Q874@xs4all.nl> <200109030152.VAA03119@cj20424-a.reston1.va.home.com> Message-ID: <006601c13444$47fe4070$42fb42d5@hagrid> guido wrote: > Yes, for efficiency reasons. A call to PyErr_Warn(), even if it > doesn't print anything, is very expensive (compared to a division): it > calls out to the warnings.py module which goes through a list of > filters etc., etc. you could always special-case "-Wdefault:classic float division" (it's not exactly an option everyone will use all the time, is it?) From tim.one@home.com Mon Sep 3 08:05:53 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 3 Sep 2001 03:05:53 -0400 Subject: [Python-Dev] 3-arg float pow() Message-ID: Python's builtin 3-argument pow() exists because, e.g., >>> pow(3L, 500, 7) 2L >>> can be done very much faster, and with much less memory, than >>> 3L**500 % 7 2L >>> There isn't any compelling use I know of for 3-arg pow() given float arguments, though, and what you get back is a platform-dependent accident: >>> pow(3., 500., 7.) 
4.0 >>> You may get some other integer there, or an Infinity or a NaN, depending on the Python release, and the compiler used to build Python, and the configuration of your libm. That example was done under CVS 2.2a2+ on Windows; here's the same thing but under 2.2.1: >>> pow(3., 500., 7.) 0.0 >>> Since 3-argument float pow() *appears* to be at best useless, I'm taking it away in 2.2a3, unless someone can testify to a reasonable use case that's actually used. more-trouble-than-it's-worth-if-it-isn't-worth-anything-ly y'rs - tim From pma@heise.de Mon Sep 3 09:51:44 2001 From: pma@heise.de (c't-Shareware-Team) Date: Mon, 3 Sep 2001 10:51:44 +0200 (MEST) Subject: [Python-Dev] Python Message-ID: <200109030851.KAA20579@juan.heise.de> English version at the end of this e-mail Sehr geehrter Programmautor, wir betreiben unter http://www.heise.de/ct/shareware/ ein gut besuchtes Verzeichnis von Free- und Shareware- Programmen. Einer unserer Besucher hat uns auf Ihr Programm Python aufmerksam gemacht. Gerne würden wir Ihr Programm in unser Verzeichnis aufnehmen. Daher möchte ich Sie bitten Ihr Programm unter http://www.heise.de/ct/shareware/default.shtml?ad=af&v=13 einzutragen. Mit freundlichen Grüßen, Ihr c't-Shareware-Team *** Dear Program Author, At http://www.heise.de/ct/shareware/, we keep a popular Free and Shareware archive. One of our visitors has recommended your program Python. 
In order to be able to include your program in our archive, we would like to ask you to provide further details to your program at http://www.heise.de/ct/shareware/default.shtml?ad=af&v=13&lang=e Yours, c't Shareware Team From thomas@xs4all.net Mon Sep 3 10:52:39 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 3 Sep 2001 11:52:39 +0200 Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: <200109030152.VAA03119@cj20424-a.reston1.va.home.com> References: <3b915c2b.57340811@mail.wanadoo.dk> <200109020210.WAA26637@cj20424-a.reston1.va.home.com> <20010902102254.M4184@lyra.org> <200109022128.RAA02908@cj20424-a.reston1.va.home.com> <20010903001045.Q874@xs4all.nl> <200109030152.VAA03119@cj20424-a.reston1.va.home.com> Message-ID: <20010903115239.J872@xs4all.nl> On Sun, Sep 02, 2001 at 09:52:42PM -0400, Guido van Rossum wrote: > > I'm fine with a single option, but I'd like to note two things: we already > > have a command line option to turn on specific warnings, do we really need > > another ? > Yes, for efficiency reasons. A call to PyErr_Warn(), even if it > doesn't print anything, is very expensive (compared to a division): it > calls out to the warnings.py module which goes through a list of > filters etc., etc. Well, sure, but that says naught about not using the same *option*, just about not using the same framework :) It shouldn't be too hard to specialcase, say, -Wdiv. (or '-Wodiv' or '-Wtdiv', or whichever.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
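A special-cased switch along the lines of the -Wdiv Thomas suggests would, in spirit, expand into an ordinary warnings filter; a rough sketch of that expansion using the modern warnings module, with an illustrative message text rather than the actual warning Python emits:

```python
import warnings

# Roughly what a special-cased division switch could mean internally:
# enable one specific warning message while everything else stays ignored.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore")  # baseline: suppress all warnings
    # filterwarnings() prepends, so this filter is consulted first; the
    # message argument is a regex matched against the start of the text.
    warnings.filterwarnings("always", message="classic division")
    warnings.warn("classic division is deprecated", DeprecationWarning)
    warnings.warn("some unrelated warning", DeprecationWarning)

# Only the division warning made it through the filters.
assert len(caught) == 1
assert "classic division" in str(caught[0].message)
```

This also illustrates Guido's efficiency point: every warn() call walks the filter list even when the result is "ignore", which is why a dedicated C-level flag is cheaper for something as frequent as division.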
From jack@oratrix.nl Mon Sep 3 11:06:26 2001 From: jack@oratrix.nl (Jack Jansen) Date: Mon, 03 Sep 2001 12:06:26 +0200 Subject: [Python-Dev] CVS tags, unix Python and MacPython In-Reply-To: Message by "Tim Peters" , Sun, 2 Sep 2001 18:48:58 -0400 , Message-ID: <20010903100626.3B2D3303181@snelboot.oratrix.nl> > Jack doesn't have that kind of intimate access to (and veto power over!) the > release process, so there's going to be some kind of compromise somewhere. > I think it should be limited to post-release changes in the Mac subtree > alone, and that he should tag the whole tree with mac-rIJAK for easy > reproducibility later. Yes, I think I'll do just that. I don't like the proliferation of tags, but in general there's no way I can be sure to keep up with the unix/windows release process, there being other things needing my time too:-) Although for 2.2a3 this week things look pretty good, -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jack@oratrix.nl Mon Sep 3 13:35:49 2001 From: jack@oratrix.nl (Jack Jansen) Date: Mon, 03 Sep 2001 14:35:49 +0200 Subject: [Python-Dev] CVS tree doesn't build currently Message-ID: <20010903123549.E4E5D303181@snelboot.oratrix.nl> I can't build from the CVS tree right now (confirmed on Irix and OSX): setup.py crashes. Here's the stacktrace: CC='cc' LDSHARED='ld -shared -all' ./python -E ./setup.py build Traceback (most recent call last): File "./setup.py", line 708, in ?
main() File "./setup.py", line 702, in main scripts = ['Tools/scripts/pydoc'] File "/ufs/jack/src/python/Lib/distutils/core.py", line 101, in setup _setup_distribution = dist = klass(attrs) File "/ufs/jack/src/python/Lib/distutils/dist.py", line 129, in __init__ setattr(self, method_name, getattr(self.metadata, method_name)) AttributeError: DistributionMetadata instance has no attribute 'get___doc__' -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | ++++ see http://www.xs4all.nl/~tank/ ++++ From fdrake@acm.org Mon Sep 3 14:17:23 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 3 Sep 2001 09:17:23 -0400 Subject: [Python-Dev] CVS tags, unix Python and MacPython In-Reply-To: References: <200109022157.f82Lvvs06479@mira.informatik.hu-berlin.de> Message-ID: <15251.33507.300787.668687@grendel.digicool.com> Tim Peters writes: > about this a lot before a release, and Fred cuts a special HTML-doc snapshot > for me (to pack into the Windows installer), and we shoot emails back and Sorry to burst your bubble, but those are the released versions. I just make sure you know where it is in time. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido@python.org Mon Sep 3 15:55:25 2001 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Sep 2001 10:55:25 -0400 Subject: [Python-Dev] CVS tree doesn't build currently In-Reply-To: Your message of "Mon, 03 Sep 2001 14:35:49 +0200." <20010903123549.E4E5D303181@snelboot.oratrix.nl> References: <20010903123549.E4E5D303181@snelboot.oratrix.nl> Message-ID: <200109031455.KAA03993@cj20424-a.reston1.va.home.com> > I can't build from the CVS tree right now (confirmed on Irix and OSX): > setup.py crashes. Here's the stacktrace: > > CC='cc' LDSHARED='ld -shared -all' ./python -E ./setup.py build > Traceback (most recent call last): > File "./setup.py", line 708, in ? 
> main() > File "./setup.py", line 702, in main > scripts = ['Tools/scripts/pydoc'] > File "/ufs/jack/src/python/Lib/distutils/core.py", line 101, in setup > _setup_distribution = dist = klass(attrs) > File "/ufs/jack/src/python/Lib/distutils/dist.py", line 129, in __init__ > setattr(self, method_name, getattr(self.metadata, method_name)) > AttributeError: DistributionMetadata instance has no attribute 'get___doc__' Yes, I just noticed this too. When I did a cvs update, I saw no changes to the distutils tree, but these changes could be relevant: P Objects/dictobject.c P Objects/floatobject.c P Objects/intobject.c P Objects/longobject.c P Python/bltinmodule.c All of these from Tim's checkins, and Tim can't build on Unix to test this. I'll try to look into this later, but not right away -- it's a holiday here. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Mon Sep 3 20:57:24 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 3 Sep 2001 15:57:24 -0400 Subject: [Python-Dev] CVS tree doesn't build currently In-Reply-To: <20010903123549.E4E5D303181@snelboot.oratrix.nl> Message-ID: [Jack Jansen] > I can't build from the CVS tree right now (confirmed on Irix and OSX): > setup.py crashes. Here's the stacktrace: > > CC='cc' LDSHARED='ld -shared -all' ./python -E ./setup.py build > Traceback (most recent call last): > File "./setup.py", line 708, in ? > main() > File "./setup.py", line 702, in main > scripts = ['Tools/scripts/pydoc'] > File "/ufs/jack/src/python/Lib/distutils/core.py", line 101, in setup > _setup_distribution = dist = klass(attrs) > File"/ufs/jack/src/python/Lib/distutils/dist.py", line 129, in __init__ > setattr(self, method_name, getattr(self.metadata, method_name)) > AttributeError: DistributionMetadata instance has no attribute > 'get___doc__' Unless I miss my bet, Neil fixed this now, and it was due to that dir(instance) now returns the attributes of instance.__class__ in addition to the keys in instance.__dict__. 
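The change is easy to see with a toy class (the names below are made up for illustration; this is not the actual distutils code):

```python
class Metadata:
    """A doc string stored on the class, not the instance."""
    def get_name(self):
        return "demo"

m = Metadata()

# dir() on the instance now reports class attributes too:
assert "get_name" in dir(m)
assert "__doc__" in dir(m)          # comes from the class
assert "__doc__" not in m.__dict__  # not an instance attribute

# So code that blindly synthesizes "get_" + name from dir() output
# will manufacture bogus names like "get___doc__":
bogus = ["get_" + n for n in dir(m) if n.startswith("__")]
assert "get___doc__" in bogus
```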
Most relevant here, that means dir(instance) now contains '__doc__' (really an attribute of its class) but didn't before (and dist started synthesizing a non-existent "get"+"__doc__" method name as a consequence). From martin@loewis.home.cs.tu-berlin.de Mon Sep 3 22:09:02 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 3 Sep 2001 23:09:02 +0200 Subject: [Python-Dev] Python Message-ID: <200109032109.f83L92t01166@mira.informatik.hu-berlin.de> [I just filled out the form at the URL; if they get multiple registrations, they probably have to pick one of them]. Dear c't shareware team, This sounds an awful lot like an automated email; with a little research, the email address of Python's inventor would probably have turned up as well... In any case, I have now filled out your form, and I did *not* give the name of Python's inventor; as the address to be published I gave python-list@python.org. That address is also connected to a large mailing list as well as to comp.lang.python, so a user who writes to it reaches many readers. Regards, Martin From guido@python.org Tue Sep 4 04:21:45 2001 From: guido@python.org (Guido van Rossum) Date: Mon, 03 Sep 2001 23:21:45 -0400 Subject: [Python-Dev] -Dwarn, long->double overflow (was RE: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.219,1.220) In-Reply-To: Your message of "Mon, 03 Sep 2001 11:52:39 +0200."
<20010903115239.J872@xs4all.nl> References: <3b915c2b.57340811@mail.wanadoo.dk> <200109020210.WAA26637@cj20424-a.reston1.va.home.com> <20010902102254.M4184@lyra.org> <200109022128.RAA02908@cj20424-a.reston1.va.home.com> <20010903001045.Q874@xs4all.nl> <200109030152.VAA03119@cj20424-a.reston1.va.home.com> <20010903115239.J872@xs4all.nl> Message-ID: <200109040321.XAA05530@cj20424-a.reston1.va.home.com> > Well, sure, but that says naught about not using the same *option*, > just about not using the same framework :) It shouldn't be too hard > to specialcase, say, -Wdiv. (or '-Wodiv' or '-Wtdiv', or whichever.) Cute, but worries me a bit because normally -W options aren't syntax-checked at all by the command line processing code -- only by the warnings module (because the syntax is too complex). And this means that you don't get a warning about a bogus option at all if no warnings are ever issued. This would mean that a misspelling of -Wdiv would cause mysterious silence. Command line options aren't in such short demand that I can't pick a new letter. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Sep 4 05:04:33 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Sep 2001 00:04:33 -0400 Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py In-Reply-To: Your message of "Sat, 01 Sep 2001 17:52:10 EDT." <20010901175210.A23313@thyrsus.com> References: <200108171920.PAA10067@cj20424-a.reston1.va.home.com> <20010817154629.D1304@thyrsus.com> <200109011915.PAA25846@cj20424-a.reston1.va.home.com> <20010901154136.A22702@thyrsus.com> <200109012008.QAA26004@cj20424-a.reston1.va.home.com> <20010901175210.A23313@thyrsus.com> Message-ID: <200109040404.AAA06868@cj20424-a.reston1.va.home.com> Eric, are you interested in pursuing this discussion further? You stopped replying, but that could be due to the holiday weekend. 
Anyway, the conclusion seems to be: - You violated the process as commonly understood by checking in a new module without sufficient consensus. - What you checked in is no good; it needs to be redesigned and renamed. - If you absolutely do not want to use the patch manager for this, you can check it in under the nondist/sandbox part of the tree. You can move your code to the sandbox, or I can delete it for you. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@rahul.net Tue Sep 4 05:24:48 2001 From: aahz@rahul.net (Aahz Maruch) Date: Mon, 3 Sep 2001 21:24:48 -0700 (PDT) Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py In-Reply-To: <200109040404.AAA06868@cj20424-a.reston1.va.home.com> from "Guido van Rossum" at Sep 04, 2001 12:04:33 AM Message-ID: <20010904042449.4ED33E8C6@waltz.rahul.net> Guido van Rossum wrote: > > Anyway, the conclusion seems to be: > > - You violated the process as commonly understood by checking in a new > module without sufficient consensus. > > - What you checked in is no good; it needs to be redesigned and > renamed. > > - If you absolutely do not want to use the patch manager for this, you > can check it in under the nondist/sandbox part of the tree. I'd add one more thing: there seems to be enough confusion and disagreement that this probably should be a PEP. -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista I don't really mind a person having the last whine, but I do mind someone else having the last self-righteous whine. From esr@thyrsus.com Tue Sep 4 10:40:18 2001 From: esr@thyrsus.com (Eric S. 
Raymond) Date: Tue, 4 Sep 2001 05:40:18 -0400 Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py In-Reply-To: <200109040404.AAA06868@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Sep 04, 2001 at 12:04:33AM -0400 References: <200108171920.PAA10067@cj20424-a.reston1.va.home.com> <20010817154629.D1304@thyrsus.com> <200109011915.PAA25846@cj20424-a.reston1.va.home.com> <20010901154136.A22702@thyrsus.com> <200109012008.QAA26004@cj20424-a.reston1.va.home.com> <20010901175210.A23313@thyrsus.com> <200109040404.AAA06868@cj20424-a.reston1.va.home.com> Message-ID: <20010904054018.B31555@thyrsus.com> Guido van Rossum : > Eric, are you interested in pursuing this discussion further? You > stopped replying, but that could be due to the holiday weekend. I've been at Worldcon. Am digging my way out from under now. -- Eric S. Raymond "One of the ordinary modes, by which tyrants accomplish their purposes without resistance, is, by disarming the people, and making it an offense to keep arms." -- Constitutional scholar and Supreme Court Justice Joseph Story, 1840 From guido@python.org Tue Sep 4 14:23:06 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Sep 2001 09:23:06 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src setup.py,1.52,1.53 In-Reply-To: Your message of "Tue, 04 Sep 2001 02:05:14 PDT." References: Message-ID: <200109041323.JAA07972@cj20424-a.reston1.va.home.com> > Disabled _curses modules on MacOSX. The curses version is a 1994 BSD > curses, far too old for _cursesmodule.c. But what if someone installs their own, better version of curses, e.g. in /usr/local? Maybe you should only disable it if it's darwin *and* if it's in a standard location? Or if the header file contains a certain characteristic string? --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Tue Sep 4 14:50:18 2001 From: barry@zope.com (Barry A. 
Warsaw) Date: Tue, 4 Sep 2001 09:50:18 -0400 Subject: [Python-Dev] Re: Proposal: get rid of compilerlike.py References: <200109040404.AAA06868@cj20424-a.reston1.va.home.com> <20010904042449.4ED33E8C6@waltz.rahul.net> Message-ID: <15252.56346.551565.764214@anthem.wooz.org> >>>>> "AM" == Aahz Maruch writes: AM> I'd add one more thing: there seems to be enough confusion and AM> disagreement that this probably should be a PEP. Volunteers? From jack@oratrix.nl Tue Sep 4 15:29:37 2001 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 04 Sep 2001 16:29:37 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src setup.py,1.52,1.53 In-Reply-To: Message by Guido van Rossum , Tue, 04 Sep 2001 09:23:06 -0400 , <200109041323.JAA07972@cj20424-a.reston1.va.home.com> Message-ID: <20010904142938.1DF99303181@snelboot.oratrix.nl> > > Disabled _curses modules on MacOSX. The curses version is a 1994 BSD > > curses, far too old for _cursesmodule.c. > > But what if someone installs their own, better version of curses, > e.g. in /usr/local? Maybe you should only disable it if it's darwin > *and* if it's in a standard location? Or if the header file contains > a certain characteristic string? This is probably a good idea, but I know absolutely nothing about curses, so someone else will have to do this (I think the standard location trick isn't good enough, Apple could supply a more decent curses in a next release, so it'll have to be the string trick). And actually I don't _want_ to know anything about curses, really:-) Maybe you can reopen the bug and assign it to someone who uses curses? I think the bug isn't OSX specific, I would be very surprised if other FreeBSD's at least don't have the same problem (why would Apple take FreeBSD but put an older version of curses in, after all). 
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From loewis@informatik.hu-berlin.de Tue Sep 4 15:36:29 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Tue, 4 Sep 2001 16:36:29 +0200 (MEST) Subject: [Python-Dev] python/dist/src setup.py,1.52,1.53 Message-ID: <200109041436.QAA21250@pandora.informatik.hu-berlin.de>

> But what if someone installs their own, better version of curses, > e.g. in /usr/local? Maybe you should only disable it if it's darwin > *and* if it's in a standard location? Or if the header file contains > a certain characteristic string?

In that case, the user could still put the appropriate instructions into Modules/Setup, which they'd probably do anyway, since they need special -I/-L options. So I agree with the notion that setup.py should only do so much autoconfiguration, and leave special cases to manual configuration through Modules/Setup.

However, I dislike the frequent testing of the platform name in setup.py, of which this patch adds another instance. I believe the autoconf approach "test for features, not system names" is much more flexible. If the curses module requires a certain feature, find a way of testing whether the features are present, and only compile _curses if they are. Then it would not be necessary to check whether the system name is "darwin1". This is problematic in particular as different spellings of the platform are expected: It is "darwin1" in some places, and "Darwin1.2" at least in another place.

Regards, Martin

From guido@python.org Tue Sep 4 16:07:01 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Sep 2001 11:07:01 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src setup.py,1.52,1.53 In-Reply-To: Your message of "Tue, 04 Sep 2001 16:29:37 +0200."
<20010904142938.1DF99303181@snelboot.oratrix.nl> References: <20010904142938.1DF99303181@snelboot.oratrix.nl> Message-ID: <200109041507.f84F71p03974@odiug.digicool.com> > Maybe you can reopen the bug and assign it to someone who uses > curses? Reopened, but left unassigned. (SF bug #457633 right?) > I think the bug isn't OSX specific, I would be very > surprised if other FreeBSD's at least don't have the same problem > (why would Apple take FreeBSD but put an older version of curses in, > after all). Well, maybe Apple took a copy of FreeBSD several years ago... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Sep 4 16:09:30 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Sep 2001 11:09:30 -0400 Subject: [Python-Dev] python/dist/src setup.py,1.52,1.53 In-Reply-To: Your message of "Tue, 04 Sep 2001 16:36:29 +0200." <200109041436.QAA21250@pandora.informatik.hu-berlin.de> References: <200109041436.QAA21250@pandora.informatik.hu-berlin.de> Message-ID: <200109041509.f84F9UH03992@odiug.digicool.com> > However, I dislike the frequent usage of platform in setup.py, where > this patch adds another instance of. I believe the autoconf approach > "test for features, not system names" is much more flexible. If the > curses module requires certain feature, find a way of testing whether > the features are present, and only compile _curses if they are. Then > it would not be necessary to check whether the system name is > "darwin1". This is problematic in particular as different spellings of > the platform are expected: It is "darwin1" in some places, and > "Darwin1.2" atleast in another place. Does distutils have a way to grep a .h file for a certain string? I'm sure that we could come up with a suitable string to check for (a function or macro defined in recent curses versions but not in old curses versions); but I'd hate to have to write the test by hand. 
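[The grep-style check Guido asks about can be sketched in a few lines of plain Python; this is not distutils code, and the choice of KEY_MOUSE as the telltale macro (present in modern ncurses, absent from 1994-vintage BSD curses) is an assumption. As the thread goes on to note, a plain text scan can be fooled by nested includes and #ifdefs.]

```python
import re

def header_defines(header_text, symbol):
    # Crude textual probe: does the header mention the symbol at all?
    # A real check would have to follow #includes and honor #ifdefs.
    return re.search(r"\b%s\b" % re.escape(symbol), header_text) is not None

# Fabricated header fragments, for illustration only:
old_curses_h = "#define OK (0)\nextern int endwin();\n"
new_curses_h = "#define OK (0)\n#define KEY_MOUSE 0631\nextern int endwin(void);\n"

print(header_defines(old_curses_h, "KEY_MOUSE"))  # False
print(header_defines(new_curses_h, "KEY_MOUSE"))  # True
```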
--Guido van Rossum (home page: http://www.python.org/~guido/)

From akuchlin@mems-exchange.org Tue Sep 4 16:48:36 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Tue, 4 Sep 2001 11:48:36 -0400 Subject: [Python-Dev] python/dist/src setup.py,1.52,1.53 In-Reply-To: <200109041509.f84F9UH03992@odiug.digicool.com>; from guido@python.org on Tue, Sep 04, 2001 at 11:09:30AM -0400 References: <200109041436.QAA21250@pandora.informatik.hu-berlin.de> <200109041509.f84F9UH03992@odiug.digicool.com> Message-ID: <20010904114836.C31072@ute.mems-exchange.org>

On Tue, Sep 04, 2001 at 11:09:30AM -0400, Guido van Rossum wrote: >Does distutils have a way to grep a .h file for a certain string? I'm

There's the beginning of a 'config' command, but it's largely untested, and therefore buggy.

--amk

From nas@python.ca Tue Sep 4 18:03:44 2001 From: nas@python.ca (Neil Schemenauer) Date: Tue, 4 Sep 2001 10:03:44 -0700 Subject: [Python-Dev] Profiling of generators broken Message-ID: <20010904100344.A11941@glacier.arctrix.com>

Here's a simple test:

from __future__ import generators

def test():
    yield 1
    yield 2

import profile
profile.run("list(test())")

It looks like the profile module is getting confused by functions that get called once but return multiple times. My proposed solution is to modify ceval.c so that call_trace(..., PyTrace_CALL, ...) is called when a generator is resumed rather than when it is created. Sound reasonable?

Neil

From guido@python.org Tue Sep 4 19:28:46 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Sep 2001 14:28:46 -0400 Subject: [Python-Dev] Profiling of generators broken In-Reply-To: Your message of "Tue, 04 Sep 2001 10:03:44 PDT."
<20010904100344.A11941@glacier.arctrix.com> References: <20010904100344.A11941@glacier.arctrix.com> Message-ID: <200109041828.f84ISkf05386@odiug.digicool.com>

> Here's a simple test:
>
> from __future__ import generators
>
> def test():
>     yield 1
>     yield 2
>
> import profile
> profile.run("list(test())")
>
> It looks like the profile module is getting confused by functions that
> get called once but return multiple times. My proposed solution is to
> modify ceval.c so that call_trace(..., PyTrace_CALL, ...) is called
> when a generator is resumed rather than when it is created. Sound
> reasonable?

Yes. (Caveat: I haven't looked at the code.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Tue Sep 4 23:23:58 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Sep 2001 18:23:58 -0400 Subject: [Python-Dev] 2.2a3 release Message-ID: <200109042223.SAA08591@cj20424-a.reston1.va.home.com>

I plan to release 2.2a3 this Thursday or Friday, depending on how things go.

Jack, do you want to coordinate the Mac 2.2a3 release with this event?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Wed Sep 5 04:34:37 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Sep 2001 23:34:37 -0400 Subject: [Python-Dev] new page for Newbies Message-ID: <200109050334.XAA11811@cj20424-a.reston1.va.home.com>

I finally broke down and created a new page collecting the most important resources for newbies (people who are new at programming): http://www.python.org/doc/Newbies.html

We get several emails per day from people asking how to get started, so I figured this definitely serves a need. Please help us make this a better resource -- send your ideas for making the page more effective to webmaster@python.org. (Note: more effective could mean removing information as well as adding!) Please add this URL to other collections of links about Python.
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.one@home.com Wed Sep 5 06:04:04 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 5 Sep 2001 01:04:04 -0400 Subject: [Python-Dev] RE: two failing tests on Linux In-Reply-To: <200109050359.XAA12572@cj20424-a.reston1.va.home.com> Message-ID:

> With the latest CVS, test_long and test_long_future fail on Linux,
> because float(huge) doesn't raise OverflowError but returns inf. I'm
> guessing this wasn't your intention;

Interesting. Of course it wasn't my intention, but it is relying on the platform ldexp setting errno to ERANGE appropriately. Apparently glibc's does not. Worse, I see the C99 standard *never* requires a conforming libm to set errno for any call of any libm function anymore. From the C99 rationale:

    In C9X, errno is no longer required to be set to EDOM or ERANGE
    because that is an impediment to optimization. The Standard has
    been crafted to neither require nor preclude any popular
    floating-point implementation.

Fucking idiots <0.01 wink>. This means none of Python's fp overflow checking anywhere is worth spit anymore, not even the CHECK macro in mathmodule.c -- to the contrary, C99 *does* require overflowing libm functions to return +- HUGE_VAL, but the CHECK macro considers that a normal (not overflow) case. Without errno to distinguish them, it's impossible to determine whether HUGE_VAL is a normal result or an overflow indicator. The 754 Cabal did their job well here (i.e., they made *portable* numeric C impossible, except across 754 boxes using C99's new 754 gimmicks). Oh well.

I don't know how to fix this in general (as above, it sounds truly impossible to do so portably now, without platform #ifdefs). I expect I can hack something together for the specific failing tests, though (which I added specifically to find out how reliable ERANGE was -- no time like an alpha for that!).
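[The failing tests boil down to a probe like the following. This is a paraphrase, not the actual test_long code; the size of huge is arbitrary as long as it exceeds the ~1.8e308 range of an IEEE-754 double.]

```python
huge = 1 << 5000  # far beyond what a C double can hold

try:
    result = float(huge)
except OverflowError:
    result = "OverflowError"  # the conversion detected the overflow
# On a platform whose ldexp() quietly returns an infinity without setting
# errno to ERANGE, the conversion may instead hand back inf -- exactly the
# HUGE_VAL-vs-overflow ambiguity Tim describes above.
print(result)
```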
From DavidA@ActiveState.com Wed Sep 5 06:14:58 2001 From: DavidA@ActiveState.com (David Ascher) Date: Tue, 04 Sep 2001 22:14:58 -0700 Subject: [Python-Dev] REMINDER: 10th Python Conference -- Deadline reminder Message-ID: <3B95B4D2.CB487F82@ActiveState.com> The deadline for paper submissions for the 10th Python Conference is coming up soon: !!! October 8, 2001 !!! The conference will be February 4-7, 2002 in Alexandria, VA. See www.python10.org for details. Contact me if you have questions. -- David Ascher Program Chair 10th International Python Conference. From tim.one@home.com Wed Sep 5 06:42:20 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 5 Sep 2001 01:42:20 -0400 Subject: [Python-Dev] RE: two failing tests on Linux In-Reply-To: Message-ID: > With the latest CVS, test_long and test_long_future fail on Linux, Would someone who normally builds on Linux please update and run these now? I expect they'll get beyond the float(huge) tests failing for Guido, but much more new stuff is tested beyond that point (which Guido never got to). From jeremy@zope.com Wed Sep 5 06:52:11 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Wed, 5 Sep 2001 01:52:11 -0400 (EDT) Subject: [Python-Dev] RE: two failing tests on Linux In-Reply-To: References: Message-ID: <15253.48523.750508.222315@slothrop.digicool.com> >>>>> "TP" == Tim Peters writes: >> With the latest CVS, test_long and test_long_future fail on >> Linux, TP> Would someone who normally builds on Linux please update and run TP> these now? I expect they'll get beyond the float(huge) tests TP> failing for Guido, but much more new stuff is tested beyond that TP> point (which Guido never got to). test_long and test_long_future both work on my Linux box. 
Jeremy

From loewis@informatik.hu-berlin.de Wed Sep 5 09:49:54 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 5 Sep 2001 10:49:54 +0200 (MEST) Subject: [Python-Dev] python/dist/src setup.py,1.52,1.53 In-Reply-To: <200109041509.f84F9UH03992@odiug.digicool.com> (message from Guido van Rossum on Tue, 04 Sep 2001 11:09:30 -0400) References: <200109041436.QAA21250@pandora.informatik.hu-berlin.de> <200109041509.f84F9UH03992@odiug.digicool.com> Message-ID: <200109050849.KAA06559@pandora.informatik.hu-berlin.de>

> Does distutils have a way to grep a .h file for a certain string? I'm
> sure that we could come up with a suitable string to check for (a
> function or macro defined in recent curses versions but not in old
> curses versions); but I'd hate to have to write the test by hand.

Even if it works, we may find that this approach causes the module not to be built on systems which would support it. The problem is that curses.h may include other header files which contain the definition we are looking for. The only real test is the one that autoconf uses, i.e. AC_TRY_COMPILE. Supporting that in distutils is probably something that needs careful planning.

Regards, Martin

From jack@oratrix.nl Wed Sep 5 10:10:26 2001 From: jack@oratrix.nl (Jack Jansen) Date: Wed, 05 Sep 2001 11:10:26 +0200 Subject: [Python-Dev] 2.2a3 release In-Reply-To: Message by Guido van Rossum , Tue, 04 Sep 2001 18:23:58 -0400 , <200109042223.SAA08591@cj20424-a.reston1.va.home.com> Message-ID: <20010905091026.A2C20303181@snelboot.oratrix.nl>

> I plan to release 2.2a3 this Thursday or Friday, depending on how
> things go.
>
> Jack, do you want to coordinate the Mac 2.2a3 release with this event?

Just plod along, don't tag the Mac subtree, I'll do that later. I think things are up-to-date enough that I'll be able to build with the machine-independent files from the 2.2a3 tree, otherwise I'll add another tag.
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From thomas.heller@ion-tof.com Wed Sep 5 14:02:56 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 5 Sep 2001 15:02:56 +0200 Subject: PEP250 implemented for bdist_wininst (was: Re: [Python-Dev] PEP 250) References: <20010831093519.A16031@gerg.ca> Message-ID: <03b901c1360b$16353350$e000a8c0@thomasnotebook> From: "Greg Ward" > Thomas -- > > have you been following the saga of patch #449054, part of the PEP 250 > implementation. If I understand things correctly, this patch only > partly implements PEP 250 -- the rest is waiting on a change to > bdist_wininst. What's the status of this change? Going to make it in? > > Please see > > http://sourceforge.net/tracker/?func=detail&atid=305470&aid=449054&group_id=5470 > > if you haven't already... > > Greg I've just checked in the changes needed for the bdist_wininst command. Thomas From skip@pobox.com (Skip Montanaro) Wed Sep 5 15:24:48 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 5 Sep 2001 09:24:48 -0500 Subject: [Python-Dev] python/dist/src setup.py,1.52,1.53 In-Reply-To: <200109050849.KAA06559@pandora.informatik.hu-berlin.de> References: <200109041436.QAA21250@pandora.informatik.hu-berlin.de> <200109041509.f84F9UH03992@odiug.digicool.com> <200109050849.KAA06559@pandora.informatik.hu-berlin.de> Message-ID: <15254.13744.248551.696405@localhost.localdomain> >> Does distutils have a way to grep a .h file for a certain string? >> I'm sure that we could come up with a suitable string to check for (a >> function or macro defined in recent curses versions but not in old >> curses versions); but I'd hate to have to write the test by hand. Martin> Even if it works, we may find that this approach causes the Martin> module not to be build on systems which would support it. 
Or it might be #ifdef'd into or out of existence. Grep's just not going to cut it.

Martin> The problem is that curses.h may include other header files Martin> which contain the definition we are looking for. The only real Martin> test is the one that autoconf uses, i.e. AC_TRY_COMPILE. Martin> Supporting that in distutils is probably something that needs Martin> careful planning.

Part of that careful planning should be to evaluate the tradeoff between making this work in setup.py and getting autoconf and friends to play on the platforms we are interested in.

Skip

From skip@pobox.com (Skip Montanaro) Wed Sep 5 15:55:21 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Wed, 5 Sep 2001 09:55:21 -0500 Subject: [Python-Dev] RE: two failing tests on Linux In-Reply-To: References: Message-ID: <15254.15577.769030.480148@localhost.localdomain>

>> With the latest CVS, test_long and test_long_future fail on Linux,

Tim> Would someone who normally builds on Linux please update and run Tim> these now?

On my Mandrake 8.0 laptop I get the usual (for me) suspects:

1 test failed: test_linuxaudiodev
15 tests skipped: test_al test_cd test_cl test_dl test_gl test_imgfile test_largefile test_nis test_ntpath test_socket_ssl test_socketserver test_sunaudiodev test_unicode_file test_winreg test_winsound

Those skips are all expected on linux2.

S

From guido@python.org Wed Sep 5 16:10:05 2001 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Sep 2001 11:10:05 -0400 Subject: [Python-Dev] pycnfig.h.in and UW7 Message-ID: <200109051510.LAA16821@cj20424-a.reston1.va.home.com>

Hi Martin,

Reviewing the checkin message, I just saw some stuff that you added disappear:

***************
*** 716,725 ****
  #define DL_EXPORT(RTYPE) __declspec(dllexport) RTYPE
  #endif
- #endif
-
- /* Define the macros needed if on a UnixWare 7.x system. */
- #if defined(__USLC__) && defined(__SCO_VERSION__)
- #define SCO_ACCEPT_BUG /* Use workaround for UnixWare accept() bug */
- #define SCO_ATAN2_BUG /* Use workaround for UnixWare atan2() bug */
- #define STRICT_SYSV_CURSES /* Don't use ncurses extensions */
  #endif
--- 722,724 ----

The removed lines were added by you a few minutes earlier. I did a full cvs update, ran autoconf and autoheader, and checked in the resulting files (after testing everything first).

AFAIK, pyconfig.h.in is a generated file, written by autoheader, so those #defines that you added won't stick. I'm not sure where they *should* go though. The normal way to get a variable defined in pyconfig.h.in is complicated; you have to put a template using #undef in acconfig.h.in, and call AC_DEFINE() in configure.in.

I think there are a few places in configure.in where -D options are added to OPT or to CC -- I'm not sure why, it could be that the author of the patch didn't know the proper way, or it could be there was a special reason.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From loewis@informatik.hu-berlin.de Wed Sep 5 16:23:53 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 5 Sep 2001 17:23:53 +0200 (MEST) Subject: [Python-Dev] Re: pycnfig.h.in and UW7 In-Reply-To: <200109051510.LAA16821@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Wed, 05 Sep 2001 11:10:05 -0400) References: <200109051510.LAA16821@cj20424-a.reston1.va.home.com> Message-ID: <200109051523.RAA22695@pandora.informatik.hu-berlin.de>

> The removed lines were added by you a few minutes earlier. I did a
> full cvs update, ran autoconf and autoheader, and checked in the
> resulting files (after testing everything first).
>
> AFAIK, pyconfig.h.in is a generated file, written by autoheader, so
> those #defines that you added won't stick. I'm not sure where they
> *should* go though.

Thanks for the notice.
The right place is acconfig.h, which is the template for autoheader (autoheader also requires to put template definitions in there if it cannot determine the comment above itself). I've now corrected that. > I think there are a few places in configure.in where -D options are > added to OPT or to CC -- I'm not sure why, it could be that the author > of the patch didn't know the proper way, or it could be there was a > special reason. I guess it's both. Getting rid of -DINET6 is still on my agenda. In some cases, people apparently don't trust that pyconfig.h is always included before any system header. If that is guaranteed, I cannot think of any further special reason, unless there is some compiler that treats -D special (beyond defining the symbol). Regards, Martin From guido@python.org Wed Sep 5 16:28:06 2001 From: guido@python.org (Guido van Rossum) Date: Wed, 05 Sep 2001 11:28:06 -0400 Subject: [Python-Dev] Re: pycnfig.h.in and UW7 In-Reply-To: Your message of "Wed, 05 Sep 2001 17:23:53 +0200." <200109051523.RAA22695@pandora.informatik.hu-berlin.de> References: <200109051510.LAA16821@cj20424-a.reston1.va.home.com> <200109051523.RAA22695@pandora.informatik.hu-berlin.de> Message-ID: <200109051528.LAA16885@cj20424-a.reston1.va.home.com> > > The removed lines were added by you a few minutes earlier. I did a > > full cvs update, ran autoconf and autoheader, and checked in the > > resulting files (after testing everything first). > > > > AFAIK, pyconfig.h.in is a generated file, written by autoheader, so > > those #defines that you added won't stick. I'm not sure where they > > *should* go though. > > Thanks for the notice. The right place is acconfig.h, which is the > template for autoheader (autoheader also requires to put template > definitions in there if it cannot determine the comment above itself). > I've now corrected that. Thanks. I wasn't aware that you could have regular old #ifdef and stuff in acconfig.h. 
How does autoconf decide which portions of the file to copy to pyconfig.h.in? > > I think there are a few places in configure.in where -D options are > > added to OPT or to CC -- I'm not sure why, it could be that the author > > of the patch didn't know the proper way, or it could be there was a > > special reason. > > I guess it's both. Getting rid of -DINET6 is still on my agenda. In > some cases, people apparently don't trust that pyconfig.h is always > included before any system header. If that is guaranteed, I cannot > think of any further special reason, unless there is some compiler > that treats -D special (beyond defining the symbol). I think that mistrust is mistaken -- most system headers are included by Python.h, which includes pyconfig.h before any system headers. We can make a new rule: "Python.h must be included first, instead of most system headers and before any other system headers are included." --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Wed Sep 5 16:31:15 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 5 Sep 2001 11:31:15 -0400 Subject: [Python-Dev] Re: pycnfig.h.in and UW7 In-Reply-To: <200109051528.LAA16885@cj20424-a.reston1.va.home.com> References: <200109051510.LAA16821@cj20424-a.reston1.va.home.com> <200109051523.RAA22695@pandora.informatik.hu-berlin.de> <200109051528.LAA16885@cj20424-a.reston1.va.home.com> Message-ID: <15254.17731.761319.829763@grendel.digicool.com> Guido van Rossum writes: > I think that mistrust is mistaken -- most system headers are included > by Python.h, which includes pyconfig.h before any system headers. We > can make a new rule: "Python.h must be included first, instead of most > system headers and before any other system headers are included." Please file a documentation bug so that I won't forget to add this information to the C API and Extending & Embedding manuals. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation

From loewis@informatik.hu-berlin.de Wed Sep 5 17:58:20 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 5 Sep 2001 18:58:20 +0200 (MEST) Subject: [Python-Dev] Re: pycnfig.h.in and UW7 In-Reply-To: <200109051528.LAA16885@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Wed, 05 Sep 2001 11:28:06 -0400) References: <200109051510.LAA16821@cj20424-a.reston1.va.home.com> <200109051523.RAA22695@pandora.informatik.hu-berlin.de> <200109051528.LAA16885@cj20424-a.reston1.va.home.com> Message-ID: <200109051658.SAA00396@pandora.informatik.hu-berlin.de>

> Thanks. I wasn't aware that you could have regular old #ifdef and
> stuff in acconfig.h. How does autoconf decide which portions of the
> file to copy to pyconfig.h.in?

From the autoconf documentation:

    The file that `autoheader' creates contains mainly `#define' and
    `#undef' statements and their accompanying comments. If
    `./acconfig.h' contains the string `@TOP@', `autoheader' copies
    the lines before the line containing `@TOP@' into the top of the
    file that it generates. Similarly, if `./acconfig.h' contains the
    string `@BOTTOM@', `autoheader' copies the lines after that line
    to the end of the file it generates. Either or both of those
    strings may be omitted.

We currently use the BOTTOM part only.

> I think that mistrust is mistaken -- most system headers are included
> by Python.h, which includes pyconfig.h before any system headers. We
> can make a new rule: "Python.h must be included first, instead of most
> system headers and before any other system headers are included."

Sounds good to me.

Regards, Martin

From tim.one@home.com Wed Sep 5 21:53:15 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 5 Sep 2001 16:53:15 -0400 Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Objects complexobject.c,2.42,2.43 In-Reply-To: Message-ID:

[Martin v. Löwis]
> Update of /cvsroot/python/python/dist/src/Objects
> In directory usw-pr-cvs1:/tmp/cvs-serv12367/Objects
>
> Modified Files:
> complexobject.c
> Log Message:
> Patch #453627: Define the following macros when compiling on a
> UnixWare 7.x system:
> SCO_ATAN2_BUG, SCO_ACCEPT_BUG, and STRICT_SYSV_CURSES.
> Work around a bug in the SCO UnixWare atan2() implementation.

and so on. What's the plan for removing this stuff again when UnixWare fixes their bugs?

From martin@loewis.home.cs.tu-berlin.de Wed Sep 5 22:44:02 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 5 Sep 2001 23:44:02 +0200 Subject: [Python-Dev] python/dist/src/Objects complexobject.c,2.42,2.43 Message-ID: <200109052144.f85Li2408003@mira.informatik.hu-berlin.de>

> What's the plan for removing this stuff again when UnixWare fixes
> their bugs?

There is no plan for that at the moment. Please have a look at

http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=450710

for prior discussion on the matter; the submitter has considered coming up with an autoconf bug test for the problem, but hasn't followed up with that - testing for the system apparently turned out to be the easier route.

One possible approach is that UnixWare releases are regularly tested for the presence of the bug. That would be simplified if the nature of the problem was documented somewhere; that also hasn't happened. To retest, it might be easiest to disable the SCO_ macros, and re-run the test suite; test_math will fail if the bug is still present. Once the version of UnixWare is identified that first fixed the bugs, it hopefully is possible to check the value of __SCO_VERSION__ to tell apart the good, the bad, and the ugly.
Regards, Martin From tim.one@home.com Wed Sep 5 23:21:19 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 5 Sep 2001 18:21:19 -0400 Subject: [Python-Dev] RE: python/dist/src/Objects complexobject.c,2.42,2.43 In-Reply-To: <200109052144.f85Li2408003@mira.informatik.hu-berlin.de> Message-ID: [Tim] > What's the plan for removing this stuff again when UnixWare fixes > their bugs? [Martin v. Loewis] > There is no plan for that at the moment. Please have a look at > > http://sf.net/tracker/?group_id=5470&atid=305470&func=detail&aid=450710 > > for prior discussion on the matter; Ya, I knew about that -- it was a rhetorical question . > [wishful thinking deleted] For platforms that Python developers build on every day (Linux and Windows, possibly Macs), and tens or hundreds of thousands of people run on, such #ifdef crud probably will go away someday, but I suspect these temporary UnixWare hacks will infect the code base forever. So I'd rather let tests fail there, and simply note in the README that such failures are due to UnixWare bugs and will go away when UnixWare fixes them. That's what we do with, e.g., SGI -O bugs. As is, the code is obfuscated forever, and may well just be swapping one set of bugs for another. For example, the replacement for atan2 isn't correct, as it ignores the sign of the zero, and atan2 on a 754 box should pay attention to that. >>> from math import atan2 >>> z = 0.0 >>> atan2(z, -1) 3.1415926535897931 >>> atan2(-z, -1) -3.1415926535897931 >>> So now we've got a different UnixWare bug, which will persist even after UnixWare fixes *their* errors because we're #ifdef'ing their implementation away. I don't view that as a problem unique to this patch set, but as a predictable (in outline) consequence of trying to worm around minority-platform bugs via #ifdef. 
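[Tim's signed-zero point is easy to reproduce. The sketch below is hypothetical -- `naive_atan2` stands in for the general style of workaround being criticized, not the actual SCO patch code -- and assumes an IEEE-754 platform:

```python
import math

def naive_atan2(y, x):
    # Hypothetical replacement in the style Tim criticizes: the test
    # "y >= 0" is true for -0.0 as well (since -0.0 == 0.0), so the
    # sign of a negative zero is lost and both zeros map to +pi
    # when x is negative.
    if x > 0:
        return math.atan(y / x)
    if x < 0:
        return math.atan(y / x) + (math.pi if y >= 0 else -math.pi)
    return math.pi / 2 if y >= 0 else -math.pi / 2

print(math.atan2(0.0, -1.0))    # the real atan2: +pi
print(math.atan2(-0.0, -1.0))   # the real atan2 honors the sign: -pi
print(naive_atan2(-0.0, -1.0))  # the naive version collapses both to +pi
```

]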
From gward@python.net Thu Sep 6 03:53:11 2001 From: gward@python.net (Greg Ward) Date: Wed, 5 Sep 2001 22:53:11 -0400 Subject: [Python-Dev] python/dist/src setup.py,1.52,1.53 In-Reply-To: <200109050849.KAA06559@pandora.informatik.hu-berlin.de>; from loewis@informatik.hu-berlin.de on Wed, Sep 05, 2001 at 10:49:54AM +0200 References: <200109041436.QAA21250@pandora.informatik.hu-berlin.de> <200109041509.f84F9UH03992@odiug.digicool.com> <200109050849.KAA06559@pandora.informatik.hu-berlin.de> Message-ID: <20010905225311.A4895@gerg.ca> On 05 September 2001, Martin von Loewis said: > Even if it works, we may find that this approach causes the module not > to be build on systems which would support it. The problem is that > curses.h may include other header files which contain the definition > we are looking for. The only real test is the one that autoconf uses, > i.e. AC_TRY_COMPILE. Supporting that in distutils is probably > something that needs careful planning. The distutils "config" command was precisely that: a start at reimplementing Autoconf in Python. I didn't get very far, but I got far enough to convince myself that it's very much worth doing. For example, here's my analog to AC_TRY_COMPILE, the try_compile() method of the distutils.command.config.config class:

    def try_compile (self, body, headers=None, include_dirs=None, lang="c"):
        """Try to compile a source file built from 'body' and 'headers'.
        Return true on success, false otherwise.
        """
        from distutils.ccompiler import CompileError
        self._check_compiler()
        try:
            self._compile(body, headers, lang)
            ok = 1
        except CompileError:
            ok = 0
        self.announce(ok and "success!" or "failure.")
        self._clean()
        return ok

I'd much rather write Autoconf (and Autoconf scripts) in Python than in Bourne shell! I ran out of steam on the "config" command about the same time I ran out of steam on the Distutils in general, ie. right around when Python 2.0 was released. Sigh.
The idea is sound, the implementation is started, it just needs to be carried through. The main problem when I left off is what to do with config info between runs of the setup script; if it's treated just like any other Distutils command, you'd end up doing the equivalent of re-running "configure" every time you run "make", which would be mind-blowingly bogus. I suspect the answer is to drop a pickle of the configuration state somewhere. There's also a lot of grunt-work coding required to implement a usable subset of Autoconf. Greg -- Greg Ward - Unix weenie gward@python.net http://starship.python.net/~gward/ Jesus Saves -- but Moses gets the rebound, he shoots, he SCORES! From bga@mug.org Thu Sep 6 04:07:27 2001 From: bga@mug.org (Billy G. Allie) Date: Wed, 05 Sep 2001 23:07:27 -0400 Subject: [Python-Dev] Re: python/dist/src/Objects complexobject.c,2.42,2.43 In-Reply-To: Message from "Martin v. Loewis" of "Wed, 05 Sep 2001 23:44:02 +0200." <200109052144.f85Li2408003@mira.informatik.hu-berlin.de> Message-ID: <200109060307.f8637RU25850@bajor.mug.org> "Martin v. Loewis" wrote: > > What's the plan for removing this stuff again when UnixWare fixes > > their bugs? > > There is no plan for that at the moment. Please have a look at > > http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=450 > 710 > > for prior discussion on the matter; the submitter has considered > coming up with an autoconf bug test for the problem, but hasn't > followed up with that - testing for the system apparently turned out > to be the easier route. Easier only in that I am a beginner when it comes to using autoconf. If someone can point me in the right direction, I will work on such a test. I also try to keep current on patches with UnixWare. If I find that the bug has been fixed, I can back out the SCO specific changes. -- ____ | Billy G.
Allie | Domain....: Bill.Allie@mug.org | /| | 7436 Hartwell | MSN.......: B_G_Allie@email.msn.com |-/-|----- | Dearborn, MI 48126| |/ |LLIE | (313) 582-1540 | From guido@python.org Thu Sep 6 10:24:42 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Sep 2001 05:24:42 -0400 Subject: [Python-Dev] python/dist/src setup.py,1.52,1.53 In-Reply-To: Your message of "Wed, 05 Sep 2001 22:53:11 EDT."
<20010905225311.A4895@gerg.ca> References: <200109041436.QAA21250@pandora.informatik.hu-berlin.de> <200109041509.f84F9UH03992@odiug.digicool.com> <200109050849.KAA06559@pandora.informatik.hu-berlin.de> <20010905225311.A4895@gerg.ca> Message-ID: <200109060924.FAA00884@cj20424-a.reston1.va.home.com> > I'd much rather write Autoconf (and Autoconf scripts) in Python than in > Bourne shell! Of course, there's the slight problem that you can't write the autoconfiguration for Python itself in Python. This reduces the motivation, I suppose. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Wed Sep 5 04:34:37 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 04 Sep 2001 23:34:37 -0400 Subject: [Python-Dev] new page for Newbies Message-ID: I finally broke down and created a new page collecting the most important resources for newbies (people who are new at programming): http://www.python.org/doc/Newbies.html We get several emails per day from people asking how to get started, so I figured this definitely serves a need. Please help us make this a better resource -- send your ideas for making the page more effective to webmaster@python.org. (Note: more effective could mean removing information as well as adding!) Please add this URL to other collections of links about Python. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@pythonware.com Thu Sep 6 17:11:48 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 6 Sep 2001 18:11:48 +0200 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? Message-ID: <001801c136ee$b0f224a0$42fb42d5@hagrid> is there any reason why "print" cannot pass unicode strings on to the underlying write method?
it would be really nice if this piece of code worked as expected:

    import sys

    class Wrapper:
        def __init__(self, file):
            self.file = file
        def write(self, data):
            self.file.write(data.encode("iso-8859-1", "replace"))

    sys.stdout = Wrapper(sys.stdout)

    print u"åäö"

under 2.2a2 (and earlier versions), this gives me:

    Traceback (most recent call last):
      File "\test.py", line 8, in ?
        print u"åäö"
    UnicodeError: ASCII encoding error: ordinal not in range(128)

(I'm willing to implement this, if the BDFL says so) From guido@python.org Thu Sep 6 17:12:03 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Sep 2001 12:12:03 -0400 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? In-Reply-To: Your message of "Thu, 06 Sep 2001 18:11:48 +0200." <001801c136ee$b0f224a0$42fb42d5@hagrid> References: <001801c136ee$b0f224a0$42fb42d5@hagrid> Message-ID: <200109061612.f86GC6d16137@odiug.digicool.com> > is there any reason why "print" cannot pass unicode > strings on to the underlying write method? Mostly that currently print religiously calls PyObject_Str() on the objects to print, which always converts to 8-bit strings. > it would be really nice if this piece of code worked as > expected: > > import sys > > class Wrapper: > def __init__(self, file): > self.file = file > def write(self, data): > self.file.write(data.encode("iso-8859-1", "replace")) > > sys.stdout = Wrapper(sys.stdout) > > print u"åäö" > > under 2.2a2 (and earlier versions), this gives me: > > Traceback (most recent call last): > File "\test.py", line 8, in ? > print u"åäö" > UnicodeError: ASCII encoding error: ordinal not in range(128) > > (I'm willing to implement this, if the BDFL says so) I think this would be a good idea, but please fix the various outstanding SRE bugs first.
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From loewis@informatik.hu-berlin.de Thu Sep 6 17:56:57 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Thu, 6 Sep 2001 18:56:57 +0200 (MEST) Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? Message-ID: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> > is there any reason why "print" cannot pass unicode > strings on to the underlying write method? Mostly because there is no guarantee that every .write method will support Unicode objects. I see two options: either a stream might declare itself as supporting unicode on output (by, say, providing a unicode attribute), or all streams are required by BDFL pronouncement to accept Unicode objects. BTW, your wrapper example can be rewritten as import sys,codecs sys.stdout = codecs.lookup("iso-8859-1")[3](sys.stdout) I wish codecs.lookup returned a record with named fields, instead of a list, so I could write sys.stdout = codecs.lookup("iso-8859-1").writer(sys.stdout) (the other field names would be encode,decode, and reader). Regards, Martin From nas@python.ca Thu Sep 6 19:19:35 2001 From: nas@python.ca (Neil Schemenauer) Date: Thu, 6 Sep 2001 11:19:35 -0700 Subject: [Python-Dev] re-ordering sys.path Message-ID: <20010906111935.A18817@glacier.arctrix.com> jerome.marant@free.fr (Jérôme Marant) in debian-python@lists.debian.org: > I am the python-unit maintainer (upstream program called PyUnit) > for python 1.5 and 2.0. > PyUnit has been included in Python 2.1. but I won't give it away > since it does evolve separately from Python releases. > > However, because of PYTHONPATH, the latest version of the module > will never be loaded by the interpreter. 
> > In 2.0 (and quite the same in 2.1), PYTHONPATH looks like: > > amboise:~$ python2 -c 'import sys; print sys.path' > ['', > '/usr/lib/python2.0', > '/usr/lib/python2.0/plat-linux2', > '/usr/lib/python2.0/lib-dynload', > '/usr/local/lib/python2.0/site-packages', > '/usr/local/lib/site-python', > '/usr/lib/python2.0/site-packages', > '/usr/lib/site-python'] > > So, if the interpreter find the module in its core modules, it > will never see that a newer version of the module was installed > in site-packages. > > Usually, packages installed separately (site-packages) from the > core python modules are more recent that the same modules of > the core, because they evolve faster than python releases. > Manual installations of modules (/usr/local) are usually done > when packages are not up-to-date, so more recent than site modules. > > This is a well known problem since python-xml (PyXML) is both > in 2.0 core and in a separate package and PyXML people have > implemented an ugly hack to work this around. > > So, I'm proposition to reorganize the PYTHONPATH like this : > > ['', > '/usr/local/lib/python2.0/site-packages', > '/usr/local/lib/site-python', > '/usr/lib/site-python' > '/usr/lib/python2.0/site-packages', > '/usr/lib/python2.0', > '/usr/lib/python2.0/plat-linux2', > '/usr/lib/python2.0/lib-dynload', > > ] > > I know that it could lead to some problems but not > that much I think. > > Thanks in advance for your comments. > > PS: it would be usefull to talk to Brendan O'Dea who chose the > same ordering for Perl packages. Will this work? If so, why is it not already done? Neil From guido@python.org Thu Sep 6 19:27:19 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Sep 2001 14:27:19 -0400 Subject: [Python-Dev] re-ordering sys.path In-Reply-To: Your message of "Thu, 06 Sep 2001 11:19:35 PDT." 
<20010906111935.A18817@glacier.arctrix.com> References: <20010906111935.A18817@glacier.arctrix.com> Message-ID: <200109061827.f86IRJg17025@odiug.digicool.com> > > So, I'm proposition to reorganize the PYTHONPATH like this : ["site-packages first" proposal snipped] (Why do people write PYTHONPATH when they mean sys.path?) > Will this work? If so, why is it not already done? I think it will work just fine. I suppose I was afraid that people would abuse this to override standard modules that will break other stuff in subtle ways, but they can do that anyway using the $PYTHONPATH environment variable. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas.heller@ion-tof.com Thu Sep 6 19:47:57 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Thu, 6 Sep 2001 20:47:57 +0200 Subject: [Python-Dev] re-ordering sys.path References: <20010906111935.A18817@glacier.arctrix.com> Message-ID: <00d001c13704$726c1f90$e000a8c0@thomasnotebook> > > However, because of PYTHONPATH, the latest version of the module > > will never be loaded by the interpreter. Distutils has exactly the same problem (every module/package shipped in the standard python library has it). Here's an excerpt from the README: There's generally no need to install the Distutils under Python 1.6/2.0/2.1. However, if you'd like to upgrade the Distutils in your Python 1.6 installation, or if future Distutils releases get ahead of the Distutils included with Python, you might want to install a newer Distutils release into your Python installation's library. To do this, you'll need to hide the original Distutils package directory from Python, so it will find the version you install.
For example, under a typical Unix installation, the "stock" Distutils directory is /usr/local/lib/python1.6/distutils; you could hide this from Python as follows: cd /usr/local/lib/python1.6 # or 2.0 mv distutils distutils-orig On Windows, the stock Distutils installation is "Lib\distutils" under the Python directory ("C:\Python" by default with Python 1.6a2 and later). Again, you should just rename this directory, eg. to "distutils-orig", so that Python won't find it. Once you have renamed the stock Distutils directory, you can install the Distutils as described above. Thomas From tommy@ilm.com Thu Sep 6 20:28:14 2001 From: tommy@ilm.com (The Intermission Elf) Date: Thu, 6 Sep 2001 12:28:14 -0700 (PDT) Subject: [Python-Dev] very long python files In-Reply-To: <3B97CA3A.7408D4B4@ilm.com> References: <3B97CA3A.7408D4B4@ilm.com> Message-ID: <15255.52672.593538.948644@mace.lucasdigital.com> Hey Folks, I just got this message from a developer inside of ILM today. Before I start digging, can anyone tell me if this has been dealt with in any version of python more recent than 1.5.2? (and where I might look for this info myself- I'm very unfamiliar with sourceforge and its bug tracker)... thanks! Garrick Meeker writes: | ******** is starting to produce very long project files (over 32K lines) | which it currently can't read. (Because curves are imbedded in the | files, it's not difficult to hit that limit.) | | Here's what I've found so far (in both python 1.4 and 1.5): | | ******** read the file by calling PyRun_SimpleFile. This byte compiles | the entire file into memory and then runs that (which is probably not | ideal). | | The file is tokenized into nodes that have a 'short' for the line | number, so this wraps around to negative numbers. 
| | The line number is entered into the byte code with a call to | 'com_addint', which calls: | | com_addbyte(c, x & 0xff); | com_addbyte(c, x >> 8); /* XXX x should be positive */ | | (notice that 'x >> 8' for a negative number will be a negative number). | | com_addbyte has a sanity check: | | if (byte < 0 || byte > 255) { | com_error(c, PyExc_SystemError, "com_addbyte: byte out of range"); | | This is what prevents ******** from loading long files. The options I | see are: | | Fix our build of python to fix this problem. We can't change the line | number to 'int' because we'd be changing the byte code, but we could | stop the number from wrapping around. | | Manually read each line from the file and feed it to the interpreter as | if we were interactive. We'd lose the line number information if | there's an error in the file (but we can't save numbers over 32K | anyway). I think it would also be friendlier on memory because it | wouldn't byte compile the entire file first. There is a call to pass | strings in this manner, but I don't know if it will actually work this | way. | | Tommy, have you heard of this before? I know I still missing part of | the story because I can create a simple file with 40K lines of: | | a = 0 | | and python accepts it without error. From fdrake@acm.org Thu Sep 6 20:34:38 2001 From: fdrake@acm.org (Fred L. Drake) Date: Thu, 6 Sep 2001 15:34:38 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010906193438.C242828845@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Documentation for 2.2 alpha 3. 
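[The 16-bit line-number limit in the ILM report can be probed from Python itself. This is a present-day reproduction sketch -- the module text and the 70000-line count are made up for illustration -- and on interpreters with the fix described in the following message (Python 2.0 and later) it compiles and reports the correct line:

```python
import sys

# Build a module source whose last line number does not fit in a
# signed 16-bit short, byte-compile it, and run it.
source = "a = 0\n" * 70000 + "raise ValueError\n"
code = compile(source, "<big>", "exec")
try:
    exec(code)
except ValueError:
    tb = sys.exc_info()[2]
    while tb.tb_next:        # walk to the frame executing "<big>"
        tb = tb.tb_next
    lineno = tb.tb_lineno
print(lineno)                # 70001, well past the old 32K ceiling
```

]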
From thomas@xs4all.net Thu Sep 6 21:02:01 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 6 Sep 2001 22:02:01 +0200 Subject: [Python-Dev] very long python files In-Reply-To: <15255.52672.593538.948644@mace.lucasdigital.com> References: <3B97CA3A.7408D4B4@ilm.com> <15255.52672.593538.948644@mace.lucasdigital.com> Message-ID: <20010906220201.I2068@xs4all.nl> On Thu, Sep 06, 2001 at 12:28:14PM -0700, The Intermission Elf wrote: > I just got this message from a developer inside of ILM today. Before > I start digging, can anyone tell me if this has been dealt with in any > version of python more recent than 1.5.2? (and where I might look for > this info myself- I'm very unfamiliar with sourceforge and its bug > tracker)... Yes, it's fixed. As for how to find it, well, that can prove a bit challenging, as there are already hundreds of closed bugreports, and only a few of them deal with this ;P The easiest way is to use CVS. You know the code is in compile.c, so a quick browse of the CVS log might show it: ---------------------------- revision 2.133 date: 2000/08/24 00:32:09; author: fdrake; state: Exp; lines: +32 -2 Charles G. Waldman : Add the EXTENDED_ARG opcode to the virtual machine, allowing 32-bit arguments to opcodes instead of being forced to stick to the 16-bit limit. This is especially useful for machine-generated code, which can be too long for the SET_LINENO parameter to fit into 16 bits. This closes the implementation portion of SourceForge patch #100893. ---------------------------- (Checking the version numbers you'll see that it was fixed between Python 1.6-final and Python 2.0b1. The fix should be pretty backportable, if you want to insert it into 1.5.2 or older.... but that would kind of defeat the point of releases.) Or you can browse the bug or patch database online, first the open bugs/patches, and then the closed bugs/patches... 
but though in this case there's an entry in both the (closed) lists, it isn't true for all bugs, so checking the CVS logs is more fool-proof. Frankly, asking someone here might be the easiest solution :-) Of course, the best test is to copy one of those large files over to a machine with a recent Python, and checking to see if it works. Maybe ******** generates files so big, or big in a different way, that it still doesn't work. > Garrick Meeker writes: > | ******** is starting to produce very long project files (over 32K lines) > | which it currently can't read. (Because curves are imbedded in the > | files, it's not difficult to hit that limit.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From gward@python.net Thu Sep 6 21:56:05 2001 From: gward@python.net (Greg Ward) Date: Thu, 6 Sep 2001 16:56:05 -0400 Subject: [Python-Dev] python/dist/src setup.py,1.52,1.53 In-Reply-To: <200109060924.FAA00884@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Sep 06, 2001 at 05:24:42AM -0400 References: <200109041436.QAA21250@pandora.informatik.hu-berlin.de> <200109041509.f84F9UH03992@odiug.digicool.com> <200109050849.KAA06559@pandora.informatik.hu-berlin.de> <20010905225311.A4895@gerg.ca> <200109060924.FAA00884@cj20424-a.reston1.va.home.com> Message-ID: <20010906165605.B6812@gerg.ca> On 06 September 2001, Guido van Rossum said: > Of course, there's the slight problem that you can't write the > autoconfiguration for Python itself in Python. This reduces the > motivation, I suppose. :-( Well, we could do the "probe for bsddb"-like stuff right, and in Python. Just not the "is this a 64-bit left-handed albino operating system with a broken stdio?" stuff. Greg -- Greg Ward - Linux nerd gward@python.net http://starship.python.net/~gward/ Dyslexics of the world, untie! 
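[Greg's "probe for bsddb"-like checks can indeed be done in pure Python, with no compiler involved. A minimal sketch of what such a runtime probe might look like (the helper name is made up):

```python
def probe_module(name):
    """Return True if this interpreter can import `name`."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False

# Probes like this cover the "is module X available?" class of checks;
# only questions about the C toolchain and headers still need
# autoconf-style compile tests.
probes = {name: probe_module(name)
          for name in ("math", "bsddb", "no_such_module")}
print(probes)
```

]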
From gward@python.net Thu Sep 6 22:14:32 2001 From: gward@python.net (Greg Ward) Date: Thu, 6 Sep 2001 17:14:32 -0400 Subject: [Python-Dev] re-ordering sys.path In-Reply-To: <20010906111935.A18817@glacier.arctrix.com>; from nas@python.ca on Thu, Sep 06, 2001 at 11:19:35AM -0700 References: <20010906111935.A18817@glacier.arctrix.com> Message-ID: <20010906171432.A7013@gerg.ca> On 06 September 2001, Neil Schemenauer said: > Will this work? If so, why is it not already done? Beats me. It's one of the things that bugged me when I was writing the Distutils, and I another thing I failed to convince Guido of at the time. Greg -- Greg Ward - Linux nerd gward@python.net http://starship.python.net/~gward/ Time flies like an arrow; fruit flies like a banana. From guido@python.org Thu Sep 6 22:51:23 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Sep 2001 17:51:23 -0400 Subject: [Python-Dev] very long python files In-Reply-To: Your message of "Thu, 06 Sep 2001 12:28:14 PDT." <15255.52672.593538.948644@mace.lucasdigital.com> References: <3B97CA3A.7408D4B4@ilm.com> <15255.52672.593538.948644@mace.lucasdigital.com> Message-ID: <200109062151.f86LpNS18755@odiug.digicool.com> > Hey Folks, > > I just got this message from a developer inside of ILM today. Before > I start digging, can anyone tell me if this has been dealt with in any > version of python more recent than 1.5.2? (and where I might look for > this info myself- I'm very unfamiliar with sourceforge and its bug > tracker)... > > thanks! Line numbers > 32K? Yes, we do that since 2.0. Another reason to upgrade. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@zope.com Thu Sep 6 23:20:01 2001 From: barry@zope.com (Barry A. Warsaw) Date: Thu, 6 Sep 2001 18:20:01 -0400 Subject: [Python-Dev] Python 2.2a3 tagged Message-ID: <15255.63121.764945.362526@yyz.digicool.com> In preparation for the next alpha release, we've tagged the tree and created the branch. 
The tag on the trunk is r22a3-fork and the branch tag is r22a3-branch. The only people authorized to make checkins on the branch are Guido, Tim, and myself. You can still make checkins on the trunk, but please keep them to a minimum until 2.2a3 is released. If you make any trunk checkins that need to go into 2.2a3, please send Guido, Tim, and me a message directly asking us to merge your trunk change into the branch. Thanks, and watch for the release some time tomorrow. Cheers, -Barry From nhv@cape.com Fri Sep 7 01:34:43 2001 From: nhv@cape.com (Norman Vine) Date: Thu, 6 Sep 2001 20:34:43 -0400 Subject: [Python-Dev] RE: Bug in time.timezone of Python 2.1.1 In-Reply-To: Message-ID: <005f01c13734$e4383340$a300a8c0@nhv> Ivan J. Wagner writes: >Sent: Thursday, September 06, 2001 8:03 PM >To: cygwin@sources.redhat.com >Cc: jason@tishler.net >Subject: Bug in time.timezone of Python 2.1.1 > > >I maintain a CVS archive (using cvs 1.11.0-1) on a Cygwin 1.3.2 install >and use ViewCVS 0.7 to access it. ViewCVS uses python and today I >upgraded from python 2.1-1 to 2.1.1-1. However the upgrade >broke the Age >field in ViewCVS. The Age field is the time span between the >file's check >in and today. The Age field is displayed when you view a CVS directory >listing. I looked at the ViewCVS sources and tracked the problem to >time.timezone. In 2.1 it returns 18000 but in 2.1.1 it returns >1834228892. Does anybody have any idea what might be causing >this problem?
> >Thanks, >Ivan Wagner YES see attached patch Cheers Norman Vine Attached patch, timemodule.diff (decoded from the quoted-printable attachment):

    *** timemodule.bak Wed Jun 27 09:01:54 2001
    --- timemodule.c Thu Sep 6 20:29:28 2001
    ***************
    *** 599,605 ****
      /* Squirrel away the module's dictionary for the y2k check */
      Py_INCREF(d);
      moddict = d;
    ! #if defined(HAVE_TZNAME) && !defined(__GLIBC__)
      tzset();
      #ifdef PYOS_OS2
      ins(d, "timezone", PyInt_FromLong((long)_timezone));
    --- 599,605 ----
      /* Squirrel away the module's dictionary for the y2k check */
      Py_INCREF(d);
      moddict = d;
    ! #if defined(HAVE_TZNAME) && !defined(__GLIBC__) &&!defined(__CYGWIN__)
      tzset();
      #ifdef PYOS_OS2
      ins(d, "timezone", PyInt_FromLong((long)_timezone));

From jason@tishler.net Fri Sep 7 02:09:21 2001 From: jason@tishler.net (Jason Tishler) Date: Thu, 6 Sep 2001 21:09:21 -0400 Subject: [Python-Dev] Re: Bug in time.timezone of Python 2.1.1 In-Reply-To: <005f01c13734$e4383340$a300a8c0@nhv> Message-ID: <20010906210921.D1328@dothill.com> Norman, On Thu, Sep 06, 2001 at 08:34:43PM -0400, Norman Vine wrote: > Ivan J. Wagner writes: > >I looked at the ViewCVS sources and tracked the problem to > >time.timezone. In 2.1 it returns 18000 but in 2.1.1 it returns > >1834228892. Does anybody have any idea what might be causing > >this problem? > > YES > > see attached patch Have you submitted the above patch to SF for consideration? Thanks, Jason From guido@python.org Fri Sep 7 02:27:17 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 06 Sep 2001 21:27:17 -0400 Subject: [Python-Dev] RE: Bug in time.timezone of Python 2.1.1 In-Reply-To: Your message of "Thu, 06 Sep 2001 20:34:43 EDT."
<005f01c13734$e4383340$a300a8c0@nhv> References: <005f01c13734$e4383340$a300a8c0@nhv> Message-ID: <200109070127.VAA05817@cj20424-a.reston1.va.home.com> Please, please, don't mail patches to python-dev. They get lost that way. Use the SourceForge patch manager! --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@loewis.home.cs.tu-berlin.de Fri Sep 7 07:56:27 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Fri, 7 Sep 2001 08:56:27 +0200 Subject: [Python-Dev] CVS: python/dist/src setup.py,1.52,1.53 Message-ID: <200109070656.f876uRq01561@mira.informatik.hu-berlin.de> >> I think the bug isn't OSX specific, I would be very >> surprised if other FreeBSD's at least don't have the same problem >> (why would Apple take FreeBSD but put an older version of curses in, >> after all). > Well, maybe Apple took a copy of FreeBSD several years ago... Probably on the contrary. When researching a getaddrinfo bug on OS X, I found that, when given a choice, they took library functions from the NeXT, rather than taking them from BSD. In case of getaddrinfo, this was the wrong choice - the BSD version would have worked... Now, I don't know where NeXT got its curses implementation from, but I wouldn't be surprised if it was 1.0BSD, or some such :-) Regards, Martin From jeremy@zope.com Fri Sep 7 17:15:57 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 7 Sep 2001 12:15:57 -0400 (EDT) Subject: [Python-Dev] warning in _localemodule.c patch Message-ID: <15256.62141.675047.963048@slothrop.digicool.com> The most recent revision (2.23) adds some code that modifies Py_FileSystemDefaultEncoding. This variable is declared const char * in bltinmodule.c, but passed to free() in the new _localemodule. gcc warns about this: 'free' discards qualifiers from pointer target type. You can't free the memory Py_FSDE points to, since it wasn't allocated by malloc(). I guess the simple solution is to remove the const. Is there a better option?
Jeremy From loewis@informatik.hu-berlin.de Fri Sep 7 18:13:17 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Fri, 7 Sep 2001 19:13:17 +0200 (MEST) Subject: [Python-Dev] warning in _localemodule.c patch Message-ID: <200109071713.TAA14171@pandora.informatik.hu-berlin.de> > You can't free the memory Py_FSDE points to, since it wasn't > allocated by malloc(). Why is that? The memory was allocated by strdup in this case, so you can certainly free() it. > I guess the simple solution is to remove the const. Is there a > better option? The other option is to cast the pointer in the call to free(). I don't know which one is better; Py_FSDE certainly points to memory that should not be changed. Regards, Martin From guido@python.org Fri Sep 7 18:15:31 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Sep 2001 13:15:31 -0400 Subject: [Python-Dev] warning in _localemodule.c patch In-Reply-To: Your message of "Fri, 07 Sep 2001 19:13:17 +0200." <200109071713.TAA14171@pandora.informatik.hu-berlin.de> References: <200109071713.TAA14171@pandora.informatik.hu-berlin.de> Message-ID: <200109071715.NAA30088@cj20424-a.reston1.va.home.com> > The other option is to cast the pointer in the call to free(). Sounds like the better solution. Publicly, it's a const. Privately, you happen to malloc() and free() it. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Fri Sep 7 18:33:59 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 7 Sep 2001 13:33:59 -0400 (EDT) Subject: [Python-Dev] warning in _localemodule.c patch In-Reply-To: <200109071715.NAA30088@cj20424-a.reston1.va.home.com> References: <200109071713.TAA14171@pandora.informatik.hu-berlin.de> <200109071715.NAA30088@cj20424-a.reston1.va.home.com> Message-ID: <15257.1287.445573.817920@slothrop.digicool.com> >>>>> "GvR" == Guido van Rossum writes: >> The other option is to cast the pointer in the call to free(). GvR> Sounds like the better solution. 
GvR> Publicly, it's a const. Privately, you happen to malloc() and GvR> free() it. I'm not sure about the intersection of all the ifdefs. Is it possible to execute the new code in the _localemodule on Win32? I was looking at the first value, which is the literal "mbcs", which shouldn't be passed to free. Jeremy From barry@zope.com Fri Sep 7 18:46:09 2001 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 7 Sep 2001 13:46:09 -0400 Subject: [Python-Dev] RELEASED: Python 2.2a3 is out! Message-ID: <15257.2017.401346.242592@anthem.wooz.org> We've released Python 2.2a3, the third alpha for Python 2.2, for your excitement, enlightenment, and endless-amusement. Download it from: http://www.python.org/2.2/ Give it a good try, and report what breaks to the bug tracker: http://sourceforge.net/bugs/?group_id=5470 [Note: this is my first solo release as Release Manager. If you find any problems with the downloads, please let me know! -BAW] New features in this release include: - Conversion of long to float now raises OverflowError if the long is too big to represent as a C double. - The 3-argument builtin pow() no longer allows a third non-None argument in some situations. - The builtin dir() now returns more information. - Overflowing operations on plain ints now return a long int rather than raising OverflowError. - A new command line option, -Q, is added to control run-time warnings for the use of classic division. - Many built-in types can now be subclassed. - The dictionary constructor now takes an optional argument, a mapping-like object. - New built-in types `super' and `property' have been added. - The syntax of floating-point and imaginary literals has been liberalized, to allow leading zeroes. - Lots of bug fixes, contributed patches, and other stuff.
See the Misc/NEWS file in the distribution, or see the release notes on SourceForge: http://sourceforge.net/project/shownotes.php?release_id=51791 As usual, Andrew Kuchling is writing a gentle introduction to the most important changes (currently excluding type/class unification), titled "What's New in Python 2.2": http://www.amk.ca/python/2.2/ There is an introduction to the type/class unification at: http://www.python.org/2.2/descrintro.html Thanks to everybody who contributed to this release, including all the 2.2 alpha 1 and 2 testers! Enjoy! -Barry From guido@python.org Fri Sep 7 18:55:38 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Sep 2001 13:55:38 -0400 Subject: [Python-Dev] RELEASED: Python 2.2a3 is out! In-Reply-To: Your message of "Fri, 07 Sep 2001 13:46:09 EDT." <15257.2017.401346.242592@anthem.wooz.org> References: <15257.2017.401346.242592@anthem.wooz.org> Message-ID: <200109071755.NAA31707@cj20424-a.reston1.va.home.com> Thanks for doing the release, Barry! Congratulations on a job well done. It looks great! --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@effbot.org Fri Sep 7 20:50:17 2001 From: fredrik@effbot.org (Fredrik Lundh) Date: Fri, 7 Sep 2001 21:50:17 +0200 Subject: [Python-Dev] 2.2a3 oddities Message-ID: <000901c137d6$5b53eb80$42fb42d5@hagrid> anyone else seeing this? 1) using a local windows build, I keep getting complaints about a missing symbol in the python22.dll: _PyGC_Insert an easy way to get this is to import the "xmlrpclib" module into a clean interpreter. (fwiw, this doesn't stop Python -- once you click OK, the interpreter proceeds as if nothing happened...) 2) most C functions that expect 8-bit strings crash if you hand them a unicode string containing non-ascii characters. an example: >>> import socket >>> socket.gethostbyname(u"pythönware.com") this works as expected in 2.1.1 (that is, I get an exception). it doesn't work in a local build of 2.2a2.
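Fredrik's second symptom (C functions that expect 8-bit strings crashing on non-ASCII unicode) is a conversion problem at the C argument-parsing boundary. As a side note for readers following along in a modern Python, callers can sidestep this whole class of problem by converting the hostname to ASCII themselves before calling into the socket module. The sketch below is illustrative only, not something the thread proposes, and it uses today's stdlib "idna" codec, which is not what 2.2 did:

```python
# Illustrative sketch (not from the thread): encode a non-ASCII hostname
# to pure-ASCII bytes before handing it to byte-oriented resolver APIs.
host = "pyth\u00f6nware.com"   # the u"pythönware.com" from Fredrik's example

# The stdlib "idna" codec Punycode-encodes each non-ASCII label, so the
# result is plain ASCII with the "xn--" ACE prefix on the encoded label.
ascii_host = host.encode("idna")
print(ascii_host)
```

Passing `ascii_host` (or its `.decode("ascii")` form) to name-resolution functions means non-ASCII text never crosses the C boundary at all.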
From guido@python.org Fri Sep 7 20:49:46 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Sep 2001 15:49:46 -0400 Subject: [Python-Dev] 2.2a3 oddities In-Reply-To: Your message of "Fri, 07 Sep 2001 21:50:17 +0200." <000901c137d6$5b53eb80$42fb42d5@hagrid> References: <000901c137d6$5b53eb80$42fb42d5@hagrid> Message-ID: <200109071949.PAA11868@cj20424-a.reston1.va.home.com> > anyone else seeing this? > > 1) using a local windows build, I keep getting complaints > about a missing symbol in the python22.dll: > > _PyGC_Insert > > an easy way to get this is to import the "xmlrpclib" module > into a clean interpreter. > > (fwiw, this doesn't stop Python -- once you click OK, the > interpreter proceeds as if nothing happened...) Works fine in 2.2a3. Sounds like a version mismatch between DLL and PYD file -- Neil changed the GC API, but did it with macros. > 2) most C functions that expect 8-bit strings crash if you > hand them a unicode string containing non-ascii characters. > > an example: > > >>> import socket > >>> socket.gethostbyname(u"pythönware.com") > > this works as expected in 2.1.1 (that is, I get an exception). > it doesn't work in a local build of 2.2a2. Yes, confirmed in 2.2a3 :-( Tim? Time for a debug session? --Guido van Rossum (home page: http://www.python.org/~guido/) From nas@python.ca Fri Sep 7 20:57:47 2001 From: nas@python.ca (Neil Schemenauer) Date: Fri, 7 Sep 2001 12:57:47 -0700 Subject: [Python-Dev] 2.2a3 oddities In-Reply-To: <000901c137d6$5b53eb80$42fb42d5@hagrid>; from fredrik@effbot.org on Fri, Sep 07, 2001 at 09:50:17PM +0200 References: <000901c137d6$5b53eb80$42fb42d5@hagrid> Message-ID: <20010907125747.A22461@glacier.arctrix.com> Fredrik Lundh wrote: > anyone else seeing this? > > 1) using a local windows build, I keep getting complaints > about a missing symbol in the python22.dll: > > _PyGC_Insert > > an easy way to get this is to import the "xmlrpclib" module > into a clean interpreter. 
Has the xmlrpclib module been compiled using the 2.2a3 headers? I changed the binary API between 2.2a2 and 2.2a3 (sorry). If you did recompile the module then something is very wrong. Neil From guido@python.org Fri Sep 7 20:56:36 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 07 Sep 2001 15:56:36 -0400 Subject: [Python-Dev] 2.2a3 oddities In-Reply-To: Your message of "Fri, 07 Sep 2001 15:49:46 EDT." <200109071949.PAA11868@cj20424-a.reston1.va.home.com> References: <000901c137d6$5b53eb80$42fb42d5@hagrid> <200109071949.PAA11868@cj20424-a.reston1.va.home.com> Message-ID: <200109071956.PAA11944@cj20424-a.reston1.va.home.com> > > 2) most C functions that expect 8-bit strings crash if you > > hand them a unicode string containing non-ascii characters. > > > > an example: > > > > >>> import socket > > >>> socket.gethostbyname(u"pythönware.com") > > > > this works as expected in 2.1.1 (that is, I get an exception). > > it doesn't work in a local build of 2.2a2. > > Yes, confirmed in 2.2a3 :-( > > Tim? Time for a debug session? Never mind, Tim. :-) This crashes on Linux too. This patch fixes it, but I'm not sure if that's right -- more eyes, please? (Maybe the fix is simpler and it should just replace the test for Py_None with a test for NULL?) Index: getargs.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Python/getargs.c,v retrieving revision 2.63 diff -c -r2.63 getargs.c *** getargs.c 2001/08/28 16:37:51 2.63 --- getargs.c 2001/09/07 19:56:21 *************** *** 369,374 **** --- 369,375 ---- { assert (expected != NULL); sprintf(msgbuf, "must be %.50s, not %.50s", expected, + arg == NULL ? "NULL" : arg == Py_None ?
"None" : arg->ob_type->tp_name); return msgbuf; } --Guido van Rossum (home page: http://www.python.org/~guido/) From loewis@informatik.hu-berlin.de Fri Sep 7 21:31:13 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Fri, 7 Sep 2001 22:31:13 +0200 (MEST) Subject: [Python-Dev] warning in _localemodule.c patch In-Reply-To: <15257.1287.445573.817920@slothrop.digicool.com> (message from Jeremy Hylton on Fri, 7 Sep 2001 13:33:59 -0400 (EDT)) References: <200109071713.TAA14171@pandora.informatik.hu-berlin.de> <200109071715.NAA30088@cj20424-a.reston1.va.home.com> <15257.1287.445573.817920@slothrop.digicool.com> Message-ID: <200109072031.WAA14792@pandora.informatik.hu-berlin.de> > I'm not sure about the intersection of all the ifdefs. Is it possible > to execute the new code in the localmodule on Win32? I was looking at > the first value which is the literal "mbcs", which shouldn't be passed > to free. It's not possible, since there is no langinfo.h on Windows. Even if that was (which may be the case for cygwin, dunno), it will only use nl_langinfo(CODESET), strdup, and free, if it ever found Py_FSDE to be NULL. If you through the code the first time and Py_FSDE is already set, it won't attempt to change it, or free its current value. Since the code appears not to be obvious, I think I shall put a comment into it. Regards, Martin From tim.one@home.com Fri Sep 7 21:38:57 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 7 Sep 2001 16:38:57 -0400 Subject: [Python-Dev] 2.2a3 oddities In-Reply-To: <000901c137d6$5b53eb80$42fb42d5@hagrid> Message-ID: [Fredrik Lundh] > anyone else seeing this? > > 1) using a local windows build, I keep getting complaints > about a missing symbol in the python22.dll: > > _PyGC_Insert > > an easy way to get this is to import the "xmlrpclib" module > into a clean interpreter. Can't reproduce. Perhaps it has to do with this?: try: # optional xmlrpclib accelerator. 
for more information on this # component, contact info@pythonware.com import _xmlrpclib ... That module doesn't exist on my box, but I bet it does on yours. There are no references to _PyGC_Insert anywhere in core 2.2a3. > (fwiw, this doesn't stop Python -- once you click OK, the > interpreter proceeds as if nothing happened...) Consistent with the guess above if that causes the import to raise an exception. From wagner17@mandalore.com Fri Sep 7 22:05:45 2001 From: wagner17@mandalore.com (Ivan J. Wagner) Date: Fri, 7 Sep 2001 14:05:45 -0700 (PDT) Subject: [Python-Dev] RE: Bug in time.timezone of Python 2.1.1 In-Reply-To: <005f01c13734$e4383340$a300a8c0@nhv> Message-ID: Norman, Thanks a lot for the patch. I just tested it out and it fixed the problem. On Thu, 6 Sep 2001, Norman Vine wrote: > Ivan J. Wagner writes: > >Sent: Thursday, September 06, 2001 8:03 PM > >To: cygwin@sources.redhat.com > >Cc: jason@tishler.net > >Subject: Bug in time.timezone of Python 2.1.1 > > > > > >I maintain a CVS archive (using cvs 1.11.0-1) on a Cygwin 1.3.2 install > >and use ViewCVS 0.7 to access it. ViewCVS uses python and today I > >upgraded from python 2.1-1 to 2.1.1-1. However the upgrade > >broke the Age > >field in ViewCVS. The Age field is the time span between the > >file's check > >in and today. The Age field is displayed when you view a CVS directory > >listing. I looked at the ViewCVS sources and tracked the problem to > >time.timezone. In 2.1 it returns 18000 but in 2.1.1 it returns > >1834228892. Does anybody have any idea what might be causing > >this problem? > > > >Thanks, > >Ivan Wagner > > YES > > see attached patch > > Cheers > > Norman Vine > From jack@oratrix.nl Sat Sep 8 14:48:53 2001 From: jack@oratrix.nl (Jack Jansen) Date: Sat, 08 Sep 2001 15:48:53 +0200 Subject: [Python-Dev] test_descrtut failing Message-ID: <20010908134858.C81EFAB444@oratrix.oratrix.nl> Test_descrtut is driving me up the wall (or actually, the whole test suite is:-).
If I run it from regrtest it fails with the completely useless message test_descrtut The actual stdout doesn't match the expected stdout. This much did match (between asterisk lines): ********************************************************************** test_descrtut ********************************************************************** Then ... We expected (repr): '' But instead we got: '*****************************************************************\n' test test_descrtut failed -- Writing: '*****************************************************************\n', expected: '' If I run it standalone it doesn't work because all the typenames are different. This is a general problem, I've seen the same on unix. In general, I find the new test framework not much of an improvement. Whereas with old tests you would get some indication as to what was failing, currently it is at best a not very helpful "We expected '' But instead we got 'TestFailed'", and at worst something like the above. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | ++++ see http://www.xs4all.nl/~tank/ ++++ From guido@python.org Sat Sep 8 18:59:48 2001 From: guido@python.org (Guido van Rossum) Date: Sat, 08 Sep 2001 13:59:48 -0400 Subject: [Python-Dev] test_descrtut failing In-Reply-To: Your message of "Sat, 08 Sep 2001 15:48:53 +0200." <20010908134858.C81EFAB444@oratrix.oratrix.nl> References: <20010908134858.C81EFAB444@oratrix.oratrix.nl> Message-ID: <200109081759.NAA19166@cj20424-a.reston1.va.home.com> > Test_descrtut is driving me up the wall (or actually, the whole test > suite is:-). > > If I run it from regrtest it fails with the completely useless message > test_descrtut > The actual stdout doesn't match the expected stdout.
> This much did match (between asterisk lines): > ********************************************************************** > test_descrtut > ********************************************************************** > Then ... > We expected (repr): '' > But instead we got: > '*****************************************************************\n' > test test_descrtut failed -- Writing: > '*****************************************************************\n', > expected: '' > > If I run it standalone it doesn't work because all the typenames are > different. This is a general problem, I've seen the same on unix. > > In general, I find the new test framework not much of an > improvement. Whereas with old tests you would get some indication as > to what was failing, currently it is at best a not very helpful "We > expected '' But in stead we got 'TestFailed'", and at worst something > like the above. The problem seems to be that both doctest (which is just *one* of the new test frameworks -- the other one is unittest) and regrtest (the regression test suite driver) are trying to be friendly when they encounter a failed test. One of the tests fails, and doctest prints its failure message, which looks something like this: ***************************************************************** Failure in example: C().foo() from line #23 of test_descrtut.__test__.tut8 Expected: zcalled A.foo() Got: called A.foo() ***************************************************************** 1 items had failures: 1 of 6 in test_descrtut.__test__.tut8 ***Test Failed*** 1 failures. But regrtest, which is comparing the output to the contents of the file Lib/test/output/test_descrtut, doesn't expect that kind of output, so *it* attempts to print a report of a failure, which looks something like this: The actual stdout doesn't match the expected stdout. 
This much did match (between asterisk lines): ********************************************************************** test_descrtut ********************************************************************** Then ... We expected (repr): '' But instead we got: 'Blargh' test test_descrtut failed -- Writing: 'Blargh', expected: '' Except that if a doctest test fails, the unexpected output is not 'Blargh' but a line of asterisks followed by a newline. The unexpected output is repeated twice in the regrtest output. The trick to finding out what's wrong with the test is to run it as follows: ./python Lib/test/regrtest.py -v test_descrtut This runs the test in "verbose mode", and in that mode regrtest doesn't do its output comparison trick -- it just shows you all the test's output. Now, doctest also scans sys.argv for a -v (which I think is wrong because it's not the main program, but that's how it works), and this causes doctest to be much more verbose -- which may or may not be helpful but shouldn't be a problem unless you are without a scrollbar. You said that when you run the test standalone the type names are different. *How* do you run it standalone? If I run it like this: ./python Lib/test/test_descrtut.py it works fine (and -v works here too). But if I run it like this: ./python >>> import test.test_descrtut # this does not run the tests! >>> test.test_descrtut.test_main() # this does! then indeed doctest complains that instead of it saw (and other failures, all caused by the incorporation of the module name (__name__) in type names when they are printed). It is possible to rewrite the tests to avoid the dependency on the module name: change all occurrences of test_descrtut to test.test_descrtut, including the three places in the test_main() function. But since it's not my test, I'll leave that to Tim. (If you've been paying attention, you might have wondered why it succeeded when run from the command line. Shouldn't the module name be __main__ there?
This is because of a trick employed by doctest: it reimports the module!) This still doesn't solve your real problem (that test_descrtut fails), but the regrtest.py -v trick should make it possible for you to find out painlessly. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Sat Sep 8 20:49:16 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 8 Sep 2001 15:49:16 -0400 Subject: [Python-Dev] test_descrtut failing In-Reply-To: <200109081759.NAA19166@cj20424-a.reston1.va.home.com> Message-ID: [Jack Jansen] >> Test_descrtut is driving me up the wall (or actually, the whole test >> suite is:-). >> ... >> If I run it standalone it doesn't work because all the typenames are >> different. This is a general problem, I've seen the same on unix. What does "run it standalone" mean to you, exactly? Here are 4 possible meanings that work fine for me (running it alone with or without regrtest, and from the test directory or from my build directory; and I haven't been able to dream up another meaning!): C:\Code\python\PCbuild>python ../lib/test/test_descrtut.py C:\Code\python\PCbuild>python ../lib/test/regrtest.py test_descrtut.py test_descrtut 1 test OK. C:\Code\python\Lib\test>..\..\pcbuild\python test_descrtut.py C:\Code\python\Lib\test>..\..\pcbuild\python regrtest.py test_descrtut.py test_descrtut 1 test OK. >> In general, I find the new test framework not much of an >> improvement. The problem is more that we don't have a new framework: we've squashed additional frameworks *under* a catch-all old framework, and the latter doesn't really know how to deal with the former. unittest and doctest both print their failure reports to stdout, but stdout is what regrtest is looking at, and the latter complains if anything shows up on stdout that it doesn't expect. [Guido] > ... 
> But regrtest, which is comparing the output to the contents of the > file Lib/test/output/test_descrtut, Which latter doesn't exist -- that's another twist in the tale. > The actual stdout doesn't match the expected stdout. > This much did match (between asterisk lines): > ********************************************************************** > test_descrtut > ********************************************************************** When an expected-output file doesn't exist, regrtest "pretends" that it does exist, and consists of a single line containing the name of the test. That's why it says "This much did match" despite that output/test_descrtut doesn't actually exist. > ... > The trick to finding out what's wrong with the test is to run it as > follows: > > ./python Lib/test/regrtest.py -v test_descrtut That works well for unittest-based tests too. For a doctest-based test, ./python Lib/test/test_descrtut.py -v can simplify life by getting regrtest out of the picture. > This runs the test in "verbose mode", and in that mode regrtest > doesn't do its output comparison trick -- it just shows you all the > test's output. Now, doctest also scans sys.argv for a -v (which I > think is wrong because it's not the main program, but that's how it > works), Not really, that's how the doctest-based test_descrtut.py was *written*, so that the alternative line above works to get regrtest out of the equation entirely (when desired), i.e. so that you can run it exactly the same way you'd run a native doctested module (== one that never heard of regrtest). It didn't have to be coded that way. An alternative is shown by test_difflib.py, here in its entirety: from test_support import verbose import doctest, difflib doctest.testmod(difflib, verbose=verbose) There doctest is explicitly told which verbose mode to use, and in that case doctest doesn't look at sys.argv.
This makes its behavior easier to understand *in the context* of regrtest, but harder to understand as a doctest (which normally are run as "main programs"). > and this causes doctest to be much more verbose -- which may > or may not be helpful but shouldn't be a problem unless you are > without a scrollbar. If you're running on Win9x, you are indeed without a scrollbar (50 lines max), and that's part of why doctest never writes to stderr: even under a 50-line Win95 shell, you can pipe its output to 'more' and not lose anything (and Win9x shells don't support stderr redirection at all). > You said that when you run the test standalone the type names are > different. *How* do you run it standalone? If I run it like this: > > ./python Lib/test/test_descrtut.py > > it works fine (and -v works here too). But if I run it like this: > > ./python > >>> import test.test_descrtut # this does not run the tests! > >>> test.test_descrtut.test_main() # this does! > > then indeed doctest complains that instead of > > > > it saw > > > > (and other failures, all caused by the incorporation of the module > name (__name__) in type names when they are printed). Jack, is that really what you do? I'm finding that hard to believe. > It is possible to rewrite the tests to avoid the dependency > on the module name: change all occurrences of test_descrtut to > test.test_descrtut, including the three places in the test_main() > function. But since it's not my test, I'll leave that to Tim. I'd rather delete the test. But since I doubt Jack is running the test in the devious way shown above, I'll hold off until he surprises me . > ... > This still doesn't solve your real problem (that test_descrtut fails), > but the regrtest.py -v trick should make it possible for you to find > out painlessly. :-) For this particular test, ./python Lib/test/test_descrtut.py is the easiest way.
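Since Jack says later in the thread that this is the first time he has seen doctest, a complete doctest-style module is small enough to show inline. This is a made-up illustration (the square function and its docstring are hypothetical, not code from the thread), but it follows the same pattern as test_descrtut: a docstring full of interpreter transcripts, plus a testmod() call that scans sys.argv for -v:

```python
import doctest

def square(n):
    """Return n squared.

    The lines below are executable examples: doctest re-runs each >>> line
    and compares the real output against the text that follows it.

    >>> square(3)
    9
    >>> square(-4)
    16
    """
    return n * n

if __name__ == "__main__":
    # Like test_descrtut, let testmod() scan sys.argv, so running the
    # file with -v switches on doctest's verbose mode.
    doctest.testmod()
```

Run directly, such a module is silent on success; run with -v, it prints every example as it is tried, which is the verbose behavior described above.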
Then doctest will show all and only the tests that fail, and you don't even have to think about what regrtest may or may not be doing to the output. From jack@oratrix.nl Sat Sep 8 22:30:21 2001 From: jack@oratrix.nl (Jack Jansen) Date: Sat, 08 Sep 2001 23:30:21 +0200 Subject: [Python-Dev] MacPython 2.2a3 available Message-ID: <20010908213026.DD94E139843@oratrix.oratrix.nl> In an unprecedented move MacPython 2.2a3 is available only a scant 26 hours or so after the unix/windows distribution of 2.2a3. MacPython 2.2a3 is available via http://www.cwi.nl/~jack/macpython.html, as usual, and in hqx or macbinary form, as a full installer, an active installer and a source distribution (as usual:-). Aside from the general new 2.2a3 features there are three specific changes in MacPython that are worth mentioning: - The structure of the MacOS toolbox modules has changed. All the modules have been put into a "Carbon" package (which, despite the name, runs fine in the classic PPC runtime model). There is a backwards compatibility folder on sys.path that will keep imports with the old names working (with an obnoxious warning). - Plugin modules are now in :Lib:lib-dynload instead of in :Mac:PlugIns, to make the installed tree look more like the unix tree. - On input, unix line-endings are now acceptable for all text files. This is an experimental feature (awaiting a general solution, for which a PEP has been promised but not started yet, the guilty parties know who they are:-), and it can be turned off with a preference. The downside of the quick release is that the installer has only been tested on MacOSX 10.0.4 and MacOS 9.1. Please report problems on older releases of MacOS asap.
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | ++++ see http://www.xs4all.nl/~tank/ ++++ From jack@oratrix.nl Sat Sep 8 23:46:22 2001 From: jack@oratrix.nl (Jack Jansen) Date: Sun, 09 Sep 2001 00:46:22 +0200 Subject: [Python-Dev] test_descrtut failing In-Reply-To: Message by Guido van Rossum , Sat, 08 Sep 2001 13:59:48 -0400 , <200109081759.NAA19166@cj20424-a.reston1.va.home.com> Message-ID: <20010908224627.D33C2139843@oratrix.oratrix.nl> Recently, Guido van Rossum said: > You said that when you run the test standalone the type names are > different. *How* do you run it standalone? If I run it like this: > > ./python Lib/test/test_descrtut.py > > it works fine (and -v works here too). But if I run it like this: > > ./python > >>> import test.test_descrtut # this does not run the tests! > >>> test.test_descrtut.test_main() # this does! Indeed, I run it the second way (no command line on the Mac, remember:-). The fun thing is that if I run it completely standalone (by dragging test_descrtut.py to the interpreter, the test can handle this situation) it works fine! So, there must be something wrong with either the test framework (or maybe this re-import trick?) that doesn't work as expected on the Mac. Maybe regrtest can be taught which tests use unittest or doctest, and not try to be helpful in those cases? For now I've just put a note in the readme file for MacPython that this test is expected to fail, but I'd like to fix it eventually, of course. I have another gripe with the new unittest stuff (this is the first time I've seen the doctest thing, never knew it was there!), and that's that most of the test failures are difficult to interpret. Whereas the old-style tests simply "print math.sin(math.pi)" and tell you they expected the output to be 0 but got -1 instead, the unittest-based tests often don't give that information.
Most of the unittest-based tests use only failUnless/assert_ instead of the higher-level functions, i.e. test_time uses things like self.assert_(time.ctime(self.t) == time.asctime(time.localtime(self.t))) where the test results would be a lot easier to interpret if they used self.assertEqual(time.ctime(self.t), time.asctime(time.localtime(self.t)), "ctime(T) != asctime(localtime(T))") -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From jack@oratrix.nl Sat Sep 8 23:54:10 2001 From: jack@oratrix.nl (Jack Jansen) Date: Sun, 09 Sep 2001 00:54:10 +0200 Subject: [Python-Dev] test_descrtut failing In-Reply-To: Message by "Tim Peters" , Sat, 8 Sep 2001 15:49:16 -0400 , Message-ID: <20010908225415.A626F139843@oratrix.oratrix.nl> Recently, "Tim Peters" said: > I'd rather delete the test. But since I doubt Jack is running the test in > the devious way shown above, I'll hold off until he surprises me . Well, as you've seen from my previous message I'm indeed running the test in this devious way:-) But I must say that I'm mildly surprised that Windows users do not also do this. In the MacPython installation instructions I suggest people start by firing up the interpreter and doing "import test.regrtest; test.regrtest.main()". What does PythonWin suggest? That people fire up a dos shell to run the tests? Or does it simply not suggest anything? Or do the problems somehow not show up in PythonWin?
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From gmcm@hypernet.com Sun Sep 9 00:45:46 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Sat, 8 Sep 2001 19:45:46 -0400 Subject: [Python-Dev] test_descrtut failing In-Reply-To: <20010908225415.A626F139843@oratrix.oratrix.nl> References: Message by "Tim Peters" , Sat, 8 Sep 2001 15:49:16 -0400 , Message-ID: <3B9A756A.12702.4EA9705D@localhost> Jack Jansen wrote: > But I must say that I'm mildly surprised that Windows users do not > also do this. In the MacPython installation instructions I > suggest people start by firing up the interpreter and doing > "import test.regrtest; test.regrtest.main()". What does PythonWin > suggest? Nothing. It's part of the ActiveState download, not part of PythonLabs'. > That people fire up a dos shell to run the tests? That's Python on Windows' default. But thanks to Tim, it's been tested by the time we get it. or-at-least-floating-point-has-ly y'rs - Gordon From tim.one@home.com Sun Sep 9 00:59:56 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 8 Sep 2001 19:59:56 -0400 Subject: [Python-Dev] test_descrtut failing In-Reply-To: <20010908225415.A626F139843@oratrix.oratrix.nl> Message-ID: [Jack Jansen] > Well, as you've seen from my previous message I'm indeed running the > test in this devious way:-) Jack, you're still making me guess. Guido's example was > ./python > >>> import test.test_descrtut # this does not run the tests! > >>> test.test_descrtut.test_main() # this does! and despite your claim above I still don't believe that's what you do. Later in *this* msg you said you suggest that other people do "import test.regrtest; test.regrtest.main()" So which is it?
Because the names attached to Python modules depend on exactly how you import the modules, and test_descrtut compares output that includes module names, the precise way in which you run tests is the entire ball of wax here. It appears that test_descrtut is unique among all tests merely in verifying that stuff like >>> print defaultdict # show our type produces the expected output. If we had a straight regrtest that checked test_support.verify(str(defaultdict) == "") it would also fail for you; ditto for a unittest that bothered to check. I get the impression Guido doesn't think this kind of output *should* be verified, though, which is why I'm more inclined to throw the test away than to "fix it" for you. > But I must say that I'm mildly surprised that Windows users do not also > do this. The plain clumsiness of "import test.regrtest; test.regrtest.main()" should suggest it's not a usual way to run tests. > In the MacPython installation instructions I suggest people > start by firing up the interpreter and doing "import test.regrtest; > test.regrtest.main()". What does PythonWin suggest? PythonWin is the name of Mark Hammond's Windows IDE, so isn't relevant. The test suite cannot be run under IDLE (the Windows IDE we ship with the core), because the threaded tests confuse Tk, causing crashes and hangs; I don't know whether PythonWin has the same problem. I suppose there *may* be a Windows user somewhere who brings up a DOS box, and then starts a Python shell in it, and then does a bunch of obscure imports to run the test suite, but I haven't met one. The natural thing to do on Windows-- and the only thing I ever do --is to run regrtest.py from a DOS command line (and the PCbuild directory has a DOS batch file to partially automate the Windows testing process). But you don't have a command line, so it's not surprising you don't use one . > That people fire up a dos shell to run the tests? Or does it simply > not suggest anything?
We don't suggest anything, and I doubt if as many as 1 Windows user in a 1000 ever bothers trying to run the test suite. We ship binaries on Windows, so it's not like Windows users need to test their build process. If I ever ask a Windows user to run a test (maybe once a year?), the natural way to do it remains from the command line, in order to capture the output easily. From tim.one@home.com Sun Sep 9 02:27:11 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 8 Sep 2001 21:27:11 -0400 Subject: [Python-Dev] test_descrtut failing In-Reply-To: <20010908224627.D33C2139843@oratrix.oratrix.nl> Message-ID: [Jack Jansen] > ... > The fun thing is that if I run it completely standalone (by dragging > test_descrtut.py to the interpreter, the test can handle this > situation) it works fine! So, there must be something wrong with > either the test framework (or maybe this re-import trick?) that > doesn't work as expected on the Mac. It's that the name of a module depends on how it's first imported, and that's got nothing to do with Macs or the framework. It does have to do with the exact way in which you run the tests; apparently nobody else runs tests that way, or nobody else runs tests at all . BTW, the "re-import trick" is neither a trick nor a re-import: doctest takes a module object as input, and importing a module is the *obvious* way to get a module object. The module may or may not have been imported before, but even if it was, it wasn't imported before *by* the doctested module; so it's no more of a re-import than two modules both deciding to import, say, math or sys. Unfortunately, unlike as for math or sys, the name of a module in a package is fuzzy. > Maybe regrtest can be taught which tests use unittest or > doctest, and not try to be helpful in those cases? Sure -- but it's not in my plans. > For now I've just put a note in the readme file for MacPython that > this test is expected to fail, but I'd like to fix it eventually, of > course.
Or you can include the hacked version I just checked in (forcing "test." prefix gibberish to show up in the module name no matter how it's run, via Guido's suggested trick). > I have another gripe with the new unittest stuff (this is the first > time I've seen the doctest thing, never knew it was there!), and > that's that most of the test failures are difficult to > interpret. Whereas the old-style tests simply "print math.sin(math.pi)" > and tell you they expected the output to be 0 but got -1 in stead the > unittest-based tests often don't give that information. > Most of the unittest-based tests use only failUnless/assert_ in stead > of the higher-level functions, i.e. test_time uses things like > self.assert_(time.ctime(self.t) > == time.asctime(time.localtime(self.t))) > where the test results would be a lot easier to interpret if they used > self.assertEqual(time.ctime(self.t), > time.asctime(time.localtime(self.t)), > "ctime(T) != asctime(localtime(T)") The same gripe is just as true of many of the straight regrtests, using test_support.verify() lazily, as in verify(D.goo(1) == (D, 1)) verify(str(c1).find('C instance at ') >= 0) verify(d == {}) verify(lines == ["A\n", "B\n", "C", "D\n", "E\n", "F"]) verify(len(m) == 2*PAGESIZE) verify(str(e) == 'Truncated input file') verify(re.split("(?::*)", ":a:b::c") == ['', 'a', 'b', 'c']) and on & on & on. There are over 900 (static) calls of verify(), and it's rare to see the optional "explanation" argument. Many of those in turn were originally lazy uses of the assert stmt (also skipping the "reason" clause). This is human nature <0.1 wink>, and it played a large role in doctest's design. For example, a doctest for the last one there could look like: """ >>> import re >>> re.split("(?::*)", ":a:b::c") ['', 'a', 'b', 'c'] """ and if it fails you get a message showing you the example, the expected output, and the output you actually got. 
Like ***************************************************************** Failure in example: re.split("(?::*)", ":a:b::c") from line #2 of dc Expected: ['', 'a', 'b', ':c'] Got: ['', 'a', 'b', 'c'] ***************************************************************** Less typing and better error reports. Multiline "expected" and "got" stuff work just as slick, which regrtest doesn't handle well (it compares output at a low per-.write() level, not at line granularity; that's why when a straight regrtest fails, you'll sometimes see a string of "expected" characters that come out of the middle of one of the expected-output lines). From guido@python.org Sun Sep 9 04:06:15 2001 From: guido@python.org (Guido van Rossum) Date: Sat, 08 Sep 2001 23:06:15 -0400 Subject: [Python-Dev] test_descrtut failing In-Reply-To: Your message of "Sun, 09 Sep 2001 00:46:22 +0200." <20010908224627.D33C2139843@oratrix.oratrix.nl> References: <20010908224627.D33C2139843@oratrix.oratrix.nl> Message-ID: <200109090306.XAA20116@cj20424-a.reston1.va.home.com> > > You said that when you run the test standalone the type names are > > different. *How* do you run it standalone? If I run it like this: > > > > ./python Lib/test/test_descrtut.py > > > > it works fine (and -v works here too). But if I run it like this: > > > > ./python > > >>> import test.test_descrtut # this does not run the tests! > > >>> test.test_descrtut.test_main() # this does! > > Indeed, I run it the second way (no command line on the Mac, > remember:-). > > The fun thing is that if I run it completely standalone (by dragging > test_descrtut.py to the interpreter, the test can handle this > situation) it works fine! So, there must be something wrong with > either the test framework (or maybe this re-import trick?) that > doesn't work as expected on the Mac. 
To find out, try this: >>> import sys >>> sys.argv = ['regrtest.py', '-v', 'test_descrtut'] >>> import test.regrtest # this does not run any tests >>> test.regrtest.main() This should run just the one test, under regrtest, in verbose mode, so without regrtest comparing and interpreting the output of doctest. > Maybe test_regrtest can be thought which tests use unittest or > doctest, and not try to be helpful in those cases? That's up to Tim; sounds like a good idea to me but requires that regrtest knows which tests use doctest. > For now I've just put a note in the readme file for MacPython that > this test is expected to fail, but I'd like to fix it eventually, of > course. Of course. > I have another gripe with the new unittest stuff (this is the first > time I've seen the doctest thing, never knew it was there!), and > that's that most of the test failures are difficult to > interpret. Whereas the old-style tests simply "print math.sin(math.pi)" > and tell you they expected the output to be 0 but got -1 in stead the > unittest-based tests often don't give that information. > Most of the unittest-based tests use only failUnless/assert_ in stead > of the higher-level functions, i.e. test_time uses things like > self.assert_(time.ctime(self.t) > == time.asctime(time.localtime(self.t))) > where the test results would be a lot easier to interpret if they used > self.assertEqual(time.ctime(self.t), > time.asctime(time.localtime(self.t)), > "ctime(T) != asctime(localtime(T)") Absolutely! Doctest is actually better in this respect (when not thwarted by regrtest) because it does this automatically, but we should definitely try to use self.assertEqual(x, y) rather than self.assert_(x == y)! (If you find any examples, please fix them or at least report them.)
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Sun Sep 9 04:43:04 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 8 Sep 2001 23:43:04 -0400 Subject: [Python-Dev] 2.2a3 oddities In-Reply-To: <200109071956.PAA11944@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > This crashes on Linux too. This patch fixes it, but I'm not sure if > that's right -- more eyes, please? (Maybe the fix is simpler and it > should just replace the test for Py_None with a test for NULL?) Well, converterr gets called from 56(!) places, and I spaced out after looking at 10 of them. Still, I can't imagine that removing the Py_None test could hurt anything, and a NULL test is definitely needed. > Index: getargs.c > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Python/getargs.c,v > retrieving revision 2.63 > diff -c -r2.63 getargs.c > *** getargs.c 2001/08/28 16:37:51 2.63 > --- getargs.c 2001/09/07 19:56:21 > *************** > *** 369,374 **** > --- 369,375 ---- > { > assert (expected != NULL); > sprintf(msgbuf, "must be %.50s, not %.50s", expected, > + arg == NULL ? "NULL" : > arg == Py_None ? "None" : arg->ob_type->tp_name); > return msgbuf; > } From tim.one@home.com Sun Sep 9 07:22:40 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 9 Sep 2001 02:22:40 -0400 Subject: [Python-Dev] test_descrtut failing In-Reply-To: <200109090306.XAA20116@cj20424-a.reston1.va.home.com> Message-ID: [Jack] > Maybe test_regrtest can be thought which tests use unittest or > doctest, and not try to be helpful in those cases? [Guido] > That's up to Tim; sounds like a good idea to me but requires that > regrtest knows which tests use doctest. OK, done and checked in. 
For example, after planting two errors in test_generators.py now: >python ../lib/test/regrtest.py test_cookie test_generators test_descrtut test_cookie test_generators ***************************************************************** Failure in example: print list(g2()) from line #80 of test_generators.__test__.tut Expected: [24] Got: [42] ***************************************************************** Failure in example: list(zrange(5)) from line #128 of test_generators.__test__.tut Expected: [0, 1, 3, 4] Got: [0, 1, 2, 3, 4] ***************************************************************** 1 items had failures: 2 of 29 in test_generators.__test__.tut ***Test Failed*** 2 failures. test test_generators failed -- 2 of 136 doctests failed test_descrtut 2 tests OK. 1 test failed: test_generators From fredrik@pythonware.com Sun Sep 9 12:40:09 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sun, 9 Sep 2001 13:40:09 +0200 Subject: [Python-Dev] 2.2a3 oddities References: <000901c137d6$5b53eb80$42fb42d5@hagrid> <20010907125747.A22461@glacier.arctrix.com> Message-ID: <002301c13924$2fe2e790$42fb42d5@hagrid> neil wrote: > Has the xmlrpclib module been compiled using the 2.2a3 headers? xmlrpclib is a python module, but it picked up an old copy of sgmlop.pyd. The problem disappeared once I rebuilt the entire PY22 tree. From jeremy@zope.com Mon Sep 10 02:15:13 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Sun, 9 Sep 2001 21:15:13 -0400 (EDT) Subject: [Python-Dev] 2.2a3 oddities In-Reply-To: <200109071956.PAA11944@cj20424-a.reston1.va.home.com> References: <000901c137d6$5b53eb80$42fb42d5@hagrid> <200109071949.PAA11868@cj20424-a.reston1.va.home.com> <200109071956.PAA11944@cj20424-a.reston1.va.home.com> Message-ID: <15260.5153.668847.644067@slothrop.digicool.com> I have a better solution for the bug, which I will check in as soon as I come up with a good testcase. Here's the long explanation: The problem was introduced by a refactoring. 
convertsimple() used to be a wrapper that called convertsimple1() and generated an error message if convertsimple1() returned NULL. I turned it inside out: convertsimple() became converterr() and convertsimple1() became convertsimple(). When the old convertsimple1() returned NULL, the new convertsimple() calls converterr(). The problem is that the Unicode routines change the value of the arg variable and set it to NULL if an error occurs. Then when the error routine is called, the arg variable is set to NULL. In the old code, the error routine used the value of arg in the caller -- so the assignment in the Unicode path wasn't relevant. The test for NULL isn't a good solution, because the original value for arg should be getting passed instead of NULL. Jeremy From James_Althoff@i2.com Mon Sep 10 22:36:14 2001 From: James_Althoff@i2.com (James_Althoff@i2.com) Date: Mon, 10 Sep 2001 14:36:14 -0700 Subject: [Python-Dev] problem with inspect module and Jython Message-ID: Apologies for not being up to speed on the standard bug reporting process. There appears to be an incompatibility between the inspect module and Jython. The inspect module uses "type(xxx) is types.zzz" in a number of places. This seems to fail when inspect is used with Jython. Using "isinstance" instead works as shown in the example below. My understanding is that "isinstance" is the preferred idiom in any case. Jim =========================================== from the inspect module: def iscode(object): """Return true if the object is a code object. 
Code objects provide these attributes: co_argcount number of arguments (not including * or ** args) co_code string of raw compiled bytecode co_consts tuple of constants used in the bytecode co_filename name of file in which this code object was created co_firstlineno number of first line in Python source code co_flags bitmap: 1=optimized | 2=newlocals | 4=*arg | 8 =**arg co_lnotab encoded mapping of line numbers to bytecode indices co_name name with which this code object was defined co_names tuple of names of local variables co_nlocals number of local variables co_stacksize virtual machine stack space required co_varnames tuple of names of arguments and local variables""" ###return type(object) is types.CodeType # <<< returns 0 (before reload below) return isinstance(object,types.CodeType) # <<< returns 1 (after reload below) Jython 2.1b1 on java1.3.0 (JIT: null) Type "copyright", "credits" or "license" for more information. >>> from core.probe import tablepanel >>> import inspect >>> source = inspect.getsource(tablepanel.TablePanel.__init__) Traceback (innermost last): File "", line 1, in ? 
File "C:\_Dev\pnp\3rdparty\jython\Lib\inspect.py", line 411, in getsource File "C:\_Dev\pnp\3rdparty\jython\Lib\inspect.py", line 400, in getsourcelines File "C:\_Dev\pnp\3rdparty\jython\Lib\inspect.py", line 280, in findsource IOError: could not get source code >>> reload(inspect) >>> source = inspect.getsource(tablepanel.TablePanel.__init__) >>> source " def __init__(self,rowList=None,label=None):\n self.rowList = rowList or [['','','']]\n self.jtable = None\n from javax.swing.table import DefaultTableModel\n self.tabl eModel = DefaultTableModel(self.rowList,self.columnNameList)\n _super.__init__(self,label=lab el)\n" >>> From jack@oratrix.nl Mon Sep 10 23:04:30 2001 From: jack@oratrix.nl (Jack Jansen) Date: Tue, 11 Sep 2001 00:04:30 +0200 Subject: [Python-Dev] More test problems Message-ID: <20010910220435.5CCB5140253@oratrix.oratrix.nl> The recent mods to the test suite make my life a _lot_ simpler, thanks! I now have a new problem, one that I've seen in the past but always seems to go away all by itself. Urllib2 will fail when I run the whole regrtest suite: >>> import test.regrtest >>> test.regrtest.main() test_grammar [... many lines deleted] test_urllib2 test test_urllib2 crashed -- exceptions.AttributeError: 'module' object has no attribute 'error' But if I run only the urllib2 test in verbose mode it works fine: >>> sys.argv = ['regrtest.py', '-v', 'test_urllib2'] >>> test.regrtest.main() test_urllib2 test_urllib2 1 test OK. CAUTION: stdout isn't compared in verbose mode: a test that passes in verbose mode may fail without it. Does anyone know where I could start looking for this one? -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.cwi.nl/~jack | ++++ see http://www.xs4all.nl/~tank/ ++++ From martin@loewis.home.cs.tu-berlin.de Tue Sep 11 00:13:31 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Tue, 11 Sep 2001 01:13:31 +0200 Subject: [Python-Dev] More test problems Message-ID: <200109102313.f8ANDVv01479@mira.informatik.hu-berlin.de> > Does anyone know where I could start looking for this one? My guess is that something happens to socket.error. What this could be, I don't know. To analyse this, you could disable the "verbose" test in regrtest after the "crashed" line, and always print the traceback. Regards, Martin From guido@python.org Tue Sep 11 02:42:33 2001 From: guido@python.org (Guido van Rossum) Date: Mon, 10 Sep 2001 21:42:33 -0400 Subject: [Python-Dev] More test problems In-Reply-To: Your message of "Tue, 11 Sep 2001 00:04:30 +0200." <20010910220435.5CCB5140253@oratrix.oratrix.nl> References: <20010910220435.5CCB5140253@oratrix.oratrix.nl> Message-ID: <200109110142.VAA14309@cj20424-a.reston1.va.home.com> > I now have a new problem, one that I've seen in the past but always > seems to go away all by itself. Urllib2 will fail when I run the whole > regrtest suite: > >>> import test.regrtest > >>> test.regrtest.main() > test_grammar > [... many lines deleted] > test_urllib2 > test test_urllib2 crashed -- exceptions.AttributeError: 'module' > object has no attribute 'error' > > But if I run only the urllib2 test in verbose mode it works fine: > >>> sys.argv = ['regrtest.py', '-v', 'test_urllib2'] > >>> test.regrtest.main() > test_urllib2 > test_urllib2 > 1 test OK. > CAUTION: stdout isn't compared in verbose mode: a test > that passes in verbose mode may fail without it. > > Does anyone know where I could start looking for this one? Not beyond what Martin suggested. One of the prior tests probably screws you. Any tests *fail* before? I have done bisection of the set of test modules -- tedious, but effective: make an explicit list of the previous tests, and each time try with half of them removed until the result changes. 
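Guido's bisection recipe can be sketched in a few lines of Python. This is an illustration only: find_culprit and the predicate are made-up names, and in real use each probe would be a fresh interpreter run of regrtest over the trimmed test list followed by test_urllib2.

```python
def find_culprit(prior_tests, target_fails_after):
    """Find the single earlier test whose side effects break the target.

    target_fails_after(subset) must report whether running `subset`
    (in order) before the target makes the target fail; it is assumed
    to fail for the full list and pass for the empty list.
    """
    lo, hi = 0, len(prior_tests)  # invariant: tests[:lo] passes, tests[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if target_fails_after(prior_tests[:mid]):
            hi = mid  # culprit is in the first half
        else:
            lo = mid  # culprit is in the second half
    return prior_tests[hi - 1]

# Simulated suite: pretend some earlier test poisons test_urllib2.
tests = ["test_grammar", "test_types", "test_socket", "test_time", "test_re"]
culprit = find_culprit(tests, lambda subset: "test_socket" in subset)
print(culprit)  # -> test_socket
```

Each probe halves the candidate list, so even the full regression suite needs only a handful of reruns.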
--Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Sep 11 08:48:43 2001 From: gstein@lyra.org (Greg Stein) Date: Tue, 11 Sep 2001 00:48:43 -0700 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/nondist/sandbox/Lib README,NONE,1.1 davlib.py,NONE,1.1 httpx.py,NONE,1.1 In-Reply-To: ; from gstein@users.sourceforge.net on Mon, Sep 10, 2001 at 06:27:40PM -0700 References: Message-ID: <20010911004842.D23015@lyra.org> I've now created nondist/sandbox/Lib as a place where people can (cooperatively) develop modules intended for inclusion into the core's Lib directory. Of course, at your discretion, you can also create sandbox/big-project, but the sandbox/Lib directory could be handy for more people. I've checked in a non-working httpx, and the current davlib. These will get worked on over the next few weeks to prep them for the next release. Review and commentary are welcome! Cheers, -g On Mon, Sep 10, 2001 at 06:27:40PM -0700, Greg Stein wrote: > Update of /cvsroot/python/python/nondist/sandbox/Lib > In directory usw-pr-cvs1:/tmp/cvs-serv17467 > > Added Files: > README davlib.py httpx.py > Log Message: > Initial checkin of some files: > > * README: describe this directory and its contents > > * davlib.py: current, published davlib (only tweaked the header) > > * httpx.py: initial draft from some coding over the weekend (incomplete, > untested, and it doesn't even load :-) > > > > --- NEW FILE: README --- > This directory is for modules that are intended to go into the main Lib > directory of Python. They can be developed here until they are ready for > evaluation for inclusion into Python itself. > > (this prevents iteration of development within the core, yet also provides > for public development of (new) modules) > > Note: a module's presence here does not mean it *will* go into Lib, but > merely that (should it be accepted) the appropriate place is Lib. ... 
-- Greg Stein, http://www.lyra.org/ From fredrik@pythonware.com Tue Sep 11 11:33:27 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 11 Sep 2001 12:33:27 +0200 Subject: [Python-Dev] 2.2a3 error messages Message-ID: <016d01c13aad$31ee5260$0900a8c0@spiff> maybe it's just me, but I just spent five minutes trying to figure out why an innocent-looking line of code resulted in an "iter() of non-sequence" type error. I finally ran it under 2.1, and immediately realized what was wrong. is there any chance of getting the old, far more helpful "unpack non-sequence" and "loop over non-sequence" error messages back before 2.2 final? From loewis@informatik.hu-berlin.de Tue Sep 11 12:01:38 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Tue, 11 Sep 2001 13:01:38 +0200 (MEST) Subject: [Python-Dev] Free threading and borrowing references from mutable types Message-ID: <200109111101.NAA23829@pandora.informatik.hu-berlin.de> Considering the free threading issue (again), I found that functions returning borrowed references are problematic if the container is mutable. In traditional Python, extension modules could safely borrow references if they know that they maintain a reference to the container. If a thread switch is possible between getting the borrowed reference and using it, then this assumption is wrong: another thread may remove the reference from the container, so that the object dies. Therefore, I propose to deprecate these functions. I'm willing to write a PEP elaborating on that if necessary, but I'd like to perform a quick poll beforehand - whether people think that deprecating these functions is reasonable - whether it is sufficient to only have their abstract.c equivalents, or whether type-specific replacements that do return new references are needed - what else I'm missing. 
Specifically, I think the following functions are problematic: - PyList_GetItem, PyList_GET_ITEM, - PyDict_GetItem, PyDict_GetItemString Any comments appreciated, Martin From guido@python.org Tue Sep 11 13:52:46 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Sep 2001 08:52:46 -0400 Subject: [Python-Dev] Free threading and borrowing references from mutable types In-Reply-To: Your message of "Tue, 11 Sep 2001 13:01:38 +0200." <200109111101.NAA23829@pandora.informatik.hu-berlin.de> References: <200109111101.NAA23829@pandora.informatik.hu-berlin.de> Message-ID: <200109111252.IAA16821@cj20424-a.reston1.va.home.com> > Considering the free threading issue (again), I found that functions > returning borrowed references are problematic if the container is > mutable. > > In traditional Python, extension modules could safely borrow > references if they know that they maintain a reference to the > container. If a thread switch is possible between getting the borrowed > reference and using it, then this assumption is wrong: another thread > may remove the reference from the container, so that the object dies. Good point. I hadn't thought of this yet, but it's definitely yet another problem facing free threading. > Therefore, I propose to deprecate these functions. I'm willing to > write a PEP elaborating on that if necessary, but I'd like to perform > a quick poll beforehand > - whether people think that deprecating these functions is reasonable > - whether it is sufficient to only have their abstract.c equivalents, > or whether type-specific replacements that do return new references > are needed > - what else I'm missing. I'm personally not overly excited about free threading (Greg Stein agrees that it slows down the single-threaded case and expects that it will always remain optional). Therefore I'm at best lukewarm about this proposal.
But at a recent PythonLabs meeting, a very different motivation was brought up to deprecate the type-specific APIs (all of them!): if someone subclasses e.g. dictionary and overrides __getitem__, code calling PyDict_GetItem on its instances can be considered wrong, because it circumvents the additional processing in __getitem__ (where e.g. case normalization or other forms of key mapping could affect the outcome). Because it returns a borrowed value, PyDict_GetItem can't safely be fixed to check for this and call the __getitem__ slot. Since there are many sensible uses of dictionary subclasses that don't override __getitem__, I find it would be a shame to change PyDict_Check() to only accept "real" dictionaries (not subclasses) -- this would disallow using dictionary subclasses for many interesting situations. > Specifically, I think the following functions are problematic: > - PyList_GetItem, PyList_GET_ITEM, > - PyDict_GetItem, PyDict_GetItemString > > Any comments appreciated, I believe that these APIs are still useful for more limited situations. E.g. if I write C code to implement some algorithm using a dictionary, if I create the dictionary myself, and don't pass it on to outside code, I can trust that it won't be mutated, so my use of PyDict_GetItem is safe. Another situation where PyDict_GetItem is unique: it doesn't raise an exception when the item is not present. This often saves a lot of overhead in situations where a missing item simply means to try something else, rather than a failure of the algorithm. I think that we may need an API with this property, even if it returns a new reference when successful. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Sep 11 13:54:54 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Sep 2001 08:54:54 -0400 Subject: [Python-Dev] 2.2a3 error messages In-Reply-To: Your message of "Tue, 11 Sep 2001 12:33:27 +0200." 
<016d01c13aad$31ee5260$0900a8c0@spiff> References: <016d01c13aad$31ee5260$0900a8c0@spiff> Message-ID: <200109111254.IAA16840@cj20424-a.reston1.va.home.com> > maybe it's just me, but I just spent five minutes trying to figure > out why an innocent-looking line of code resulted in an "iter() of > non-sequence" type error. > > I finally ran it under 2.1, and immediately realized what was > wrong. > > is there any chance of getting the old, far more helpful "unpack > non-sequence" and "loop over non-sequence" error messages > back before 2.2 final? Can you show an example of what went wrong? Is it just the distinction between "unpack" vs. "loop over"? I would like to make the errors more helpful, but I'm not sure where to start. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Sep 11 14:38:18 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Sep 2001 09:38:18 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/nondist/sandbox/Lib README,NONE,1.1 davlib.py,NONE,1.1 httpx.py,NONE,1.1 In-Reply-To: Your message of "Tue, 11 Sep 2001 00:48:43 PDT." <20010911004842.D23015@lyra.org> References: <20010911004842.D23015@lyra.org> Message-ID: <200109111338.JAA17245@cj20424-a.reston1.va.home.com> > I've now created nondist/sandbox/Lib as a place where people can > (cooperatively) develop modules intended for inclusion into the core's Lib > directory. Of course, at your discretion, you can also create > sandbox/big-project, but the sandbox/Lib directory could be handy for more > people. > > I've checked in a non-working httpx, and the current davlib. These will get > worked on over the next few weeks to prep them for the next release. Review > and commentary are welcome! Excellent, Greg! I encourage Eric Raymond to check in his ccframe code. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Sep 11 16:00:03 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Sep 2001 11:00:03 -0400 Subject: [Python-Dev] problem with inspect module and Jython In-Reply-To: Your message of "Mon, 10 Sep 2001 14:36:14 PDT." References: Message-ID: <200109111500.LAA17997@cj20424-a.reston1.va.home.com> > Apologies for not being up to speed on the standard bug reporting process. Go to http://sourceforge.net/tracker/?group_id=5470&atid=105470 and click on the "Submit New" link. > There appears to be an incompatibility between the inspect module and > Jython. > > The inspect module uses "type(xxx) is types.zzz" in a number of places. > This seems to fail when inspect is used with Jython. > > Using "isinstance" instead works as shown in the example below. > > My understanding is that "isinstance" is the preferred idiom in any case. Indeed. We're going to have to fix inspect.py to work with new types anyway, so we'll try to take care of this. --Guido van Rossum (home page: http://www.python.org/~guido/) From klm@zope.com Tue Sep 11 16:08:47 2001 From: klm@zope.com (Ken Manheimer) Date: Tue, 11 Sep 2001 11:08:47 -0400 (EDT) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/nondist/sandbox/Lib README,NONE,1.1 davlib.py,NONE,1.1 httpx.py,NONE,1.1 In-Reply-To: <20010911004842.D23015@lyra.org> Message-ID: On Tue, 11 Sep 2001, Greg Stein wrote: > I've now created nondist/sandbox/Lib as a place where people can > (cooperatively) develop modules intended for inclusion into the core's Lib > directory. Of course, at your discretion, you can also create > sandbox/big-project, but the sandbox/Lib directory could be handy for more > people. As mostly a bystander in the python development process, i have a question about an alternative approach. I believe we (zope dev) use branches for this kind of thing, and wonder whether you've considered using that? 
I gather that this is for things that have not yet gotten approval for inclusion in the distribution, but that doesn't disqualify the branch approach - branches need not ever be merged. The two drawbacks i see with using branches this way: - More risk that configuration mistakes can lead to disruption, eg if someone thinks they've established their checkout in the branch. That said, mistakes using the version control system are generally painful, so care is needed in any event. - Pollution of the branch namespace. This can be mitigated by a good system for choosing names. The advantage is that the subject files are developed where they're used, instead of some arbitrary other place. This is easier on developers, and means that the initial development history is included in the version control history, rather than being in some separate (nondist/sandbox/...) place. Is it worth considering using branches for this kind of thing? Ken klm@zope.com From guido@python.org Tue Sep 11 16:13:45 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Sep 2001 11:13:45 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/nondist/sandbox/Lib README,NONE,1.1 davlib.py,NONE,1.1 httpx.py,NONE,1.1 In-Reply-To: Your message of "Tue, 11 Sep 2001 11:08:47 EDT." References: Message-ID: <200109111513.LAA18150@cj20424-a.reston1.va.home.com> > As mostly a bystander in the python development process, i have a question > about an alternative approach. I believe we (zope dev) use branches for > this kind of thing, and wonder whether you've considered using that? I > gather that this is for things that have not yet gotten approval for > inclusion in the distribution, but that doesn't disqualify the branch > approach - branches need not ever be merged. > > The two drawbacks i see with using branches this way: > > - More risk that configuration mistakes can lead to disruption, eg if > someone thinks they've established their checkout in the branch. 
> > That said, mistakes using the version control system are generally > painful, so care is needed in any event. > > - Pollution of the branch namespace. > > This can be mitigated by a good system for choosing names. > > The advantage is that the subject files are developed where they're used, > instead of some arbitrary other place. This is easier on developers, and > means that the initial development history is included in the version > control history, rather than being in some separate (nondist/sandbox/...) > place. > > Is it worth considering using branches for this kind of thing? I believe the sandbox is for an earlier stage in development. My experiences so far with branches is that they present many surprises and sources of confusion. How many times in Zope have you run into a bug that was already fixed on a branch -- or on a trunk? Just recently I reported a ZEO bug to Jeremy and he said "Oh yeah, that's fixed on branch so-and-so; I haven't looked at the trunk for ages". :-( I like using short-lived branches for things like releases, but doing the types/class unification on a branch probably caused more grief than light. --Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@zope.com Tue Sep 11 16:34:39 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Tue, 11 Sep 2001 11:34:39 -0400 (EDT) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/nondist/sandbox/Lib README,NONE,1.1 davlib.py,NONE,1.1 httpx.py,NONE,1.1 In-Reply-To: References: <20010911004842.D23015@lyra.org> Message-ID: <15262.12047.342946.542665@slothrop.digicool.com> There's a third possibility, too. It might not be necessary to mention, because I think it's a typical case. It's good to be sure that everyone considers it, though. I think it's best if a module can be developed as a standalone module, first, and incorporated into the standard library after it's matured.
It's not unusual for substantial user experience to influence the design and implementation of the module. So it's better to wait until after that experience has been achieved to include it in the standard library. A few examples come to mind. I'm sure there are others: asyncore & asynchat, unittest, and xmlrpclib. One possible result of that user experience is that the module isn't right. The problem needs to be solved in a different way, or there are very few people who use the module. In those cases, we're not saddled with a module that we don't want. One example here might be linuxaudiodev. After using this module and doing some maintenance, I'd prefer to see a pure Python version that provided a clearer API to the OSS interface. And the name is bad, too :-). It works on platforms other than Linux. I don't object to using the sandbox, particularly if other resources aren't available. But I'd also be happy to have new modules developed in their own homes. Jeremy From tim@zope.com Tue Sep 11 19:31:25 2001 From: tim@zope.com (Tim Peters) Date: Tue, 11 Sep 2001 14:31:25 -0400 Subject: [Python-Dev] 2.2a3 error messages In-Reply-To: <200109111254.IAA16840@cj20424-a.reston1.va.home.com> Message-ID: [/F] > ... > is there any chance of getting the old, far more helpful "unpack > non-sequence" and "loop over non-sequence" error messages > back before 2.2 final? [Guido] > Can you show an example of what went wrong? Is it just the > distinction between "unpack" vs. "loop over"? I would like to make > the errors more helpful, but I'm not sure where to start. "iter() of non-sequence" is the msg set by PyObject_GetIter() whenever it can't get an iterator *at all*, hence every call site for PyObject_GetIter() *may* end up leaving this msg as-is. Some do: >>> for i in 3: ... pass ... Traceback (most recent call last): File "", line 1, in ? TypeError: iter() of non-sequence Some don't: >>> map(str, 3) Traceback (most recent call last): File "", line 1, in ?
TypeError: argument 2 to map() must support iteration
>>>

But "iter() of non-sequence" isn't the only flavor of TypeError PyObject_GetIter() may raise, and indeed when I fiddled map's msg I was acutely aware that I may be stomping on some other kind of error entirely; but in the specific case of map, which accepts any number of arguments, I thought it was important to spell out which one was giving trouble.

>>> map(str, [], [], [], 3, [])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: argument 5 to map() must support iteration
>>>

OTOH, given that for-loop semantics are now defined in terms of iterators, I see nothing wrong with "iter() of non-sequence" in the for-loop example. Yes, it's different now, but so are for-loops. So I think this takes exhaustive case-by-case analysis, and since /F won't agree with me about the for-loop example anyway, it's a preference pit.

From klm@zope.com Tue Sep 11 19:42:39 2001
From: klm@zope.com (Ken Manheimer)
Date: Tue, 11 Sep 2001 14:42:39 -0400 (EDT)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/nondist/sandbox/Lib README,NONE,1.1 davlib.py,NONE,1.1 httpx.py,NONE,1.1
In-Reply-To: <200109111513.LAA18150@cj20424-a.reston1.va.home.com>
Message-ID:

On Tue, 11 Sep 2001, Guido van Rossum wrote:

> I believe the sandbox is for an earlier stage in development. My
> experiences so far with branches is that they present many surprises
> and sources of confusion. How many times in Zope have you run into a
> bug that was already fixed on a branch -- or on a trunk? Just
> recently I reported a ZEO bug to Jeremy and he said "Oh yeah, that's
> fixed on branch so-and-so; I haven't looked at the trunk for ages". :-(
>
> I like using short-lived branches for things like releases, but doing
> the types/class unification on a branch probably caused more grief
> than light.
Merge problems certainly increase the longer the developing code is isolated from the trunk - whether or not you're using a branch for the developing code. (When the project is big enough, that code skew is inevitable - at least when using a branch you have the option to occasionally merge from the trunk into the in-process branch. I didn't follow the types/class unification very carefully, but have the impression that you and tim did some of that - perhaps later in the process than you would have, in retrospect, preferred.)

I'm not the CVS practices maven here - brian lloyd has that pleasure;-) and in any case realize the cost/benefits balance is complicated. CVS branching does offer some real advantages, but the complication of realizing them can be an added burden, and CVS' foibles can muddy the cost/benefit balance a good bit, as well - so i don't mean to suggest there's a clear-cut win here, one way or the other...

Ken
klm@zope.com

From guido@python.org Tue Sep 11 19:46:58 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Sep 2001 14:46:58 -0400
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/nondist/sandbox/Lib README,NONE,1.1 davlib.py,NONE,1.1 httpx.py,NONE,1.1
In-Reply-To: Your message of "Tue, 11 Sep 2001 14:42:39 EDT."
References:
Message-ID: <200109111846.OAA26706@cj20424-a.reston1.va.home.com>

> Merge problems certainly increase the longer the developing code is
> isolated from the trunk - whether or not you're using a branch for the
> developing code.

Ah, but we're talking new module/package development here, not modifications to existing code. At least that's what the sandbox is for.
--Guido van Rossum (home page: http://www.python.org/~guido/) From klm@zope.com Tue Sep 11 19:57:42 2001 From: klm@zope.com (Ken Manheimer) Date: Tue, 11 Sep 2001 14:57:42 -0400 (EDT) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/nondist/sandbox/Lib README,NONE,1.1 davlib.py,NONE,1.1 httpx.py,NONE,1.1 In-Reply-To: <200109111846.OAA26706@cj20424-a.reston1.va.home.com> Message-ID: On Tue, 11 Sep 2001, Guido van Rossum wrote: > > Merge problems certainly increase the longer the developing code is > > isolated from the trunk - whether or not you're using a branch for the > > developing code. > > Ah, but we're talking new module/package development here, not > modifications to existing code. At least that's what the sandbox is > for. So drift (like you were mentioning for the type/class development) is hardly an issue for or against using branches here. I suppose the more relevant issue is polluting the repository with stuff that winds up never being accepted for incorporation to the distribution - forlorn files that live forever in the Attics, and directories that show up empty until you do a 'cvs up' with the "-P" option... Again, i don't mean to insist on using branches instead of the sandbox. I guess my concern is that branches do offer wins in some cases, particularly for stuff that is clearly on track to be incorporated... Ken From barry@zope.com Tue Sep 11 21:07:05 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 11 Sep 2001 16:07:05 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/nondist/sandbox/Lib README,NONE,1.1 davlib.py,NONE,1.1 httpx.py,NONE,1.1 References: <200109111846.OAA26706@cj20424-a.reston1.va.home.com> Message-ID: <15262.28393.924876.348031@anthem.wooz.org> >>>>> "KM" == Ken Manheimer writes: KM> Again, i don't mean to insist on using branches instead of the KM> sandbox. I guess my concern is that branches do offer wins in KM> some cases, particularly for stuff that is clearly on track to KM> be incorporated... 
The other problem with branches is that they get much less testing than the trunk. I think people in general don't check out branches unless they have to . Using the sandbox is great for small, single modules. I'd prefer to see larger candidates use something like SF to prototype. -Barry From fdrake@acm.org Wed Sep 12 03:13:42 2001 From: fdrake@acm.org (Fred L. Drake) Date: Tue, 11 Sep 2001 22:13:42 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010912021342.ABF4928845@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Miscellaneous updates, plus documentation for the new "hmac" module (located in the crypto chapter of the Library Reference). From guido@python.org Wed Sep 12 03:38:14 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 11 Sep 2001 22:38:14 -0400 Subject: [Python-Dev] interning string subclasses In-Reply-To: Your message of "Tue, 11 Sep 2001 19:18:32 PDT." References: Message-ID: <200109120238.WAA30906@cj20424-a.reston1.va.home.com> > Question: Should we complain if someone tries to intern an instance of > a string subclass? I hate to slow any code on those paths. I think in this case intern(s) should return intern(str(s)). The fast path checks ob_sinterned first, and that should always point to a real string for a string subclass. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Wed Sep 12 05:50:45 2001 From: tim.one@home.com (Tim Peters) Date: Wed, 12 Sep 2001 00:50:45 -0400 Subject: [Python-Dev] RE: interning string subclasses In-Reply-To: <200109120238.WAA30906@cj20424-a.reston1.va.home.com> Message-ID: [Tim] > Question: Should we complain if someone tries to intern an instance of > a string subclass? I hate to slow any code on those paths. [Guido] > I think in this case intern(s) should return intern(str(s)). 
> The fast
> path checks ob_sinterned first, and that should always point to a real
> string for a string subclass.

I think I'm wrestling with more than one problem here. The bogus one confusing everything else appears to be this:

>>> class S(str):
...     pass
...
[7098 refs]
>>> hash(S('abc'))
0
[7100 refs]
>>> hash(S('cdefg'))
0
[7102 refs]
>>>

That is, PyObject_Hash(s) is always 0 for any instance s of a subclass of str. This makes dict-based operations "almost always" believe that an s like this used as a key doesn't match a real string with the same value.

>>> d = {'a': 1}
[7106 refs]
>>> d[S('a')]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: a
[7143 refs]
>>>

This applies too to the intern dict. So first I think str_subtype_new has to finish initializing the subclass string object. Then I can get confused about interning for the right reasons.

From tim.one@home.com Wed Sep 12 06:52:37 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 12 Sep 2001 01:52:37 -0400
Subject: [Python-Dev] Free threading and borrowing references from mutable types
In-Reply-To: <200109111252.IAA16821@cj20424-a.reston1.va.home.com>
Message-ID:

[Guido]
> ...
> Because it returns a borrowed value, PyDict_GetItem can't safely be
> fixed to check for this and call the __getitem__ slot.

If PyDict_GetItem ended up calling a non-genuine-dict __getitem__ slot, what would stop it from decref'ing the result in that case before returning it (thus returning a borrowed reference even so)? That __getitem__ may synthesize a result object with a refcount of 1?

hard-to-approve-of-users-ly y'rs - tim

From guido@python.org Wed Sep 12 14:43:56 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 12 Sep 2001 09:43:56 -0400
Subject: [Python-Dev] Free threading and borrowing references from mutable types
In-Reply-To: Your message of "Wed, 12 Sep 2001 01:52:37 EDT."
References:
Message-ID: <200109121343.JAA02101@cj20424-a.reston1.va.home.com>

> [Guido]
> > ...
> > Because it returns a borrowed value, PyDict_GetItem can't safely be
> > fixed to check for this and call the __getitem__ slot.

[Tim]
> If PyDict_GetItem ended up calling a non-genuine-dict __getitem__ slot, what
> would stop it from decref'ing the result in that case before returning it
> (thus returning a borrowed reference even so)? That __getitem__ may
> synthesize a result object with a refcount of 1?

Exactly.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From Samuele Pedroni Thu Sep 13 22:09:04 2001
From: Samuele Pedroni (Samuele Pedroni)
Date: Thu, 13 Sep 2001 23:09:04 +0200 (MET DST)
Subject: [Python-Dev] Re: PEP 269
Message-ID: <200109132109.XAA21006@core.inf.ethz.ch>

Hi,

Personally I have the following concerns about PEP 269:

- If its purpose is to offer a framework for small-language support, there are already modules around that support that (SPARK, PLY, ...); the only advantage of PEP 269 is speed with respect to the pure Python solutions, because of the use of the internal CPython parser. OTOH, the other solutions are more flexible...

- Or, if its purpose is to help with experimenting on the grammar, it is a quite unfinished tool unless support for adding keywords is added.

Further, the PEP proposes to use the actual AST format of the parser module as its output format. To be honest, that format is quite awful, especially for general-purpose use.

It should be considered that Jython does not contain a parser similar to CPython's. Because of this, Jython does not offer parser module support, so implementing the PEP for Jython would require writing a Java or pure Python equivalent of the CPython parser. My plan for resolving the lack of parser module support was to implement a higher compatibility layer based on the AST format of tools/compiler, a nicer format. PEP 269 adds issues to this open problem, which I would like to see addressed by future revisions and by further discussions.
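(A later-Python aside, for readers following along today: the "nicer format" of tools/compiler is essentially what the standard ast module became, while the parser module and its tuple trees were eventually removed in Python 3.10. A minimal sketch of the named-node interface, for contrast with the nonterminal-number tuples under discussion:)

```python
import ast

# The old parser module encoded nonterminals as integers inside nested
# tuples; the ast module exposes named node classes instead.
tree = ast.parse("x = 1 + 2")
assign = tree.body[0]

assert isinstance(assign, ast.Assign)       # statement node, by name
assert isinstance(assign.value, ast.BinOp)  # expression node, by name
print(type(assign.value.op).__name__)       # operator node, e.g. Add
```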
I can live with PEP 269 implemented only for CPython, for lack of resources on the Jython side, if it is to be used for rare experimenting with the grammar. But it seems, as it is, a rather half-cooked solution to offer a module for mini-language support in the standard library.

regards, Samuele Pedroni.

> From: Jonathan Riehl
> To: Martin von Loewis
> cc: ,
> MIME-Version: 1.0
> Subject: [Types-sig] Re: PEP 269
> X-BeenThere: types-sig@python.org
> X-Mailman-Version: 2.0.6 (101270)
> List-Help:
> List-Post:
> List-Subscribe: ,
> List-Id: Special Interest Group on the Python type system
> List-Unsubscribe: ,
> List-Archive:
> Date: Thu, 13 Sep 2001 14:49:32 -0500 (CDT)
>
> Howdy all,
> I'm afraid Martin's attention to the PEP list has outed me
> before I was able to post about this myself. Anyway, for those
> interested, I wrote a PEP for the exposure of pgen to the Python
> interpreter. You may view it at:
>
> http://python.sourceforge.net/peps/pep-0269.html
>
> I am looking for comments on this PEP, and below, I address some
> interesting issues raised by Martin. Furthermore, I already have a
> partially functioning reference implementation, and should be pestered to
> make it available shortly.
>
> Thanks,
> -Jon
>
> On Tue, 11 Sep 2001, Martin von Loewis wrote:
>
> > Hi Jonathan,
> >
> > With interest I noticed your proposal to include Pgen into the
> > standard library. I'm not sure about the scope of the proposed change:
> > Do you view pgen as a candidate for a general-purpose parser toolkit,
> > or do you "just" contemplate using that for variations of the Python
> > grammar?
>
> I am thinking of going for the low hanging fruit first (a Python centric
> pgen module), and then adding more functionality for later releases of
> Python (see below.)
>
> > If the former, I think there should be a strategy already how
> > to expose pgen to the application; the proposed API seems
> > inappropriate. In particular:
> >
> > - how would I integrate an alternative tokenizer?
> > - how could I integrate semantic actions into the parse process,
> > instead of creating the canonical AST?
>
> The current change proposed is somewhat restrained by the Python 2.2
> release schedule, and will initially only address building parsers that
> use the Python tokenizer. If the module misses the 2.2 release, I'd like to
> make it more functional and provide the ability to override the Python
> tokenizer. I may also add methods to export all the data found in the DFA
> structure.
>
> I am unsure what integrating semantics into the parse
> process buys us besides lower memory overhead. In C/C++ such coupling is
> needed because of the TYPEDEF/IDENTIFIER tokenization problem, but I
> don't see Python and future Python-like, LL(1), languages needing such
> hacks. Finally, I am prone to enforce the separation of the backend
> actions from the AST. This allows the AST to be used for a variety of
> purposes, rather than those intended by the initial parser developer.
>
> > Of course, these questions are less interesting if the scope is to
> > parse Python: in that case, Python tokenization is fine, and everybody
> > is used to getting the Python AST.
>
> An interesting note to make about this is that, since the nonterminal
> integer values are generated by pgen, pgen ASTs are not currently
> compatible with the parser module ASTs. Perhaps such unification may be
> slated for future work (I know Fred left room in the parser AST datatype
> for identification of the grammar that generated the AST using an integer
> value, but using this would be questionable in a "rapid parser
> development" environment.)
>
> > On the specific API, I think you should drop the File functions
> > (parseGrammarFile, parseFile). Perhaps you can also drop the String
> > functions, and provide only functions that expect file-like objects.
>
> I am open to further discussion on this, but I would note that filename
> information is used (and useful) when reporting syntax errors. I think
> that the "streaming" approach to parsing is another holdover from days
> when memory constraints ruled (much like binding semantics to the parser
> itself.)
>
> > On the naming of the API functions: I propose to use an underscore
> > style instead of the mixedCaps style, or perhaps to leave out any
> > structure (parsegrammar, buildparser, parse, symbol2string,
> > string2symbolmap). That would be more in line with the parser module.
>
> I would like to hear more about this from the Pythonati. I am currently
> following the naming conventions I use at work, which of course is most
> natural for me at home. :)
>
> > Regards,
> > Martin
>
> _______________________________________________
> Types-SIG mailing list
> Types-SIG@python.org
> http://mail.python.org/mailman/listinfo/types-sig

From Barrett@stsci.edu Fri Sep 14 15:26:26 2001
From: Barrett@stsci.edu (Paul Barrett)
Date: Fri, 14 Sep 2001 10:26:26 -0400
Subject: [Python-Dev] A draft PEP for a new memory model
Message-ID: <3BA21392.FB9068DB@STScI.Edu>

The following is the beginnings of a PEP for a new memory model for Python. It currently contains only the motivation section and a description of a preliminary design. I'm submitting the PEP in its current form to get a feel for whether or not I should pursue this proposal and to find out if I am overlooking any details that would make it incompatible with Python's core implementation, i.e. implementing it would cause too much of an effect on Python's performance. I do plan to implement something along these lines, but may have to change my approach if I hear comments about this PEP to the contrary.
Cheers,
Paul

PEP: XXX
Title: A New Memory Management Model for Python
Version: $Revision: 1.3 $
Last-Modified: $Date: 2001/08/20 23:59:26 $
Author: barrett@stsci.edu (Paul Barrett)
Status: Draft
Type: Standards Track
Created: 05-Sep-2001
Python-Version: 2.3
Post-History:
Replaces: PEP 42

Abstract

This PEP proposes a new memory management model to provide better support for the various types of memory found in modern operating systems. The proposed model separates the memory object from its access method. In simplest terms, memory objects only allocate memory, while access objects only provide access to that memory. This separation allows various types of memory to share a common interface or access object and vice versa.

Motivation

There are three sequence objects which share similar interfaces, but have different intended uses. The first is the indispensable 'string' object. A 'string' is an immutable sequence of characters and supports slicing, indexing, concatenation, replication, and related string-type operations.

The second is the 'array' object. Like a 'list', it is a mutable sequence and supports slicing, indexing, concatenation, and replication, but its values are constrained to one of several basic types, namely characters, integers, and floating point numbers. This constraint enables efficient storage of the values.

The third object is the 'buffer', which behaves similar to a string object at the Python programming level: it supports slicing, indexing, concatenation, and related string-like operations. However, its data can come from either a block of memory or an object that exports the buffer interface, such as 'mmap', the memory-mapped file object which is its prime justification.

Each object has been used at one time or other as a way of allocating read-write memory from the heap. The 'string' object is often used at the C programming level because it is a standard Python object, but its use goes counter to its intended behavior of being immutable.
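(A later-Python aside: the common string-like interface the Motivation describes survives today in bytes, bytearray, and array.array; a quick sketch of the shared slicing behavior, using the modern names as rough stand-ins for the PEP's 'string' and 'array':)

```python
import array

s = b'hello world'           # immutable sequence, like the PEP's 'string'
a = array.array('b', s)      # mutable, typed storage, like the PEP's 'array'

assert s[0:5] == b'hello'            # slicing works the same way on both
assert a[0:5].tobytes() == b'hello'

a[0] = ord('H')              # mutable: element assignment is allowed
try:
    s[0] = ord('H')          # immutable: raises TypeError
except TypeError:
    print("strings are read-only")
```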
The preferred way of allocating such memory is the 'array' object, but its insistence on returning a representation of itself for both the 'repr' and 'str' methods makes it cumbersome to use. In addition, the use of a 'string' as an initializer during 'array' creation is inefficient, because the memory is temporarily allocated twice, once for the 'string' and once for the 'array'. This is particularly onerous when allocating tens of megabytes of memory.

The 'buffer' object also has its problems, some of which have been discussed on python-dev. Some of the more important ones are:

(1) The 'buffer' object always returns a read-only 'buffer', even for read-write objects. This is apparently a bug in the 'buffer' object, which is fixable.

(2) The buffer API provides no guarantee about the lifetime of the base pointer - even if the 'buffer' object holds a reference to the base object, since there is no locking mechanism associated with the base pointer. For example, if the initial 'buffer' is deleted, the memory pointer of the derived 'buffer' will refer to freed memory. This situation happens most often at the C programming level as in the following situation:

    PyObject *base = PyBuffer_New(100);
    PyObject *buffer = PyBuffer_FromObject(base);
    Py_DECREF(base);

This problem is also fixable.

(3) The 'buffer' object cannot easily be used to allocate read-write memory at the Python programming level. The obvious approach is to use a 'string' as the base object of the 'buffer'. Yet, a 'string' is immutable, which means the 'buffer' object derived from it is also immutable, even if problem (1) is fixed. The only alternative at the Python programming level is to use the cumbersome 'array' object or to create your own version of the 'buffer' object to allocate a block of memory.
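(A later-Python aside: the dangling base pointer of problem (2) above is exactly what the subsequent buffer redesign, PEP 3118's memoryview, addressed by letting the exporting object refuse destructive operations while a view is live. A sketch of that behavior in modern Python, offered only for contrast:)

```python
import array

a = array.array('b', b'hello')
view = memoryview(a)        # 'view' pins a's buffer for its lifetime

try:
    a.frombytes(b'!')       # resizing while the buffer is exported...
except BufferError as exc:  # ...is refused instead of leaving a dangling view
    print(exc)

view.release()
a.frombytes(b'!')           # fine once the view has been released
```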
We feel that the solution to these and other problems is best illustrated by problem (3), which can essentially be described as the simple operation of allocating a block of read-write memory from the heap. Python currently provides no standard way of doing this. It is instead done by subterfuge at the C programming level using the 'string', 'array', or 'buffer' APIs. A solution to this specific problem is to include a 'malloc' object as part of standard Python. This object will be used to allocate a block of memory from the heap, and the 'buffer' object will be used to access this memory just as it is used to access data from a memory-mapped file. Yet, this hints at a more general solution: the creation of two classes of objects, one for memory-allocation, and one for memory-access.

The Model

We propose a new memory-management model for Python which separates the allocation object from its access method. This mix-and-match memory model will enable various access objects, such as 'array', 'string', and 'file', to easily access the data from different types of memory, namely heap, shared, and memory-mapped files; or in other words, different types of memory can share a common interface (see figure below). It will also provide better support for the various types of memory found in modern operating systems.

    |---------------------------------------------------|
    |                  interface layer                  |
    |  ---------------------------------------------    |
    |    array    |    string    |    file    |  ...    |
    |===================================================|
    |                    data layer                     |
    |  ---------------------------------------------    |
    |  heap memory  | shared memory | memory mapped file|
    |---------------------------------------------------|

Memory Objects

Modern operating systems, such as Unix and Windows, provide access to several different types of memory, namely heap, shared, and memory-mapped files. These memory types share two common attributes: a pointer to the memory and the size of the memory.
This information is usually sufficient for objects whose data uses heap memory, since the object is expected to have sole control over that memory throughout the lifetime of the object. For objects whose data also uses shared and memory-mapped files, an additional attribute is necessary for access permission. However, the issue of how to handle memory persistence across processes does not appear well-defined in modern OSs, but appears to be left to the programmer to implement. In any case, a fourth attribute to handle memory persistence seems imperative.

Access Objects

Consider 'array', 'buffer', and 'string' objects. Each provides, more or less, the same string-like interface to its underlying data. They each support slicing, indexing, concatenation, and replication of the data. They differ primarily in the types of initializing data and the permissions associated with the underlying data.

Currently, the 'array' initializer accepts only 'list' and 'string' objects. If this was extended to include objects that support the 'buffer interface', then the distinction between the 'array' and 'buffer' objects would disappear, since they both support the sequence interface and the same set of base objects. The 'buffer' object is therefore redundant and no longer necessary.

The 'string' and 'array' objects would still be distinct, since the 'array' object encompasses more data-types than does the 'string' object. The 'array' object is also mutable, requiring its underlying data to be read-write, while the 'string' object is immutable, requiring read-only data. This new memory-management model therefore suggests that the 'string' object support the 'buffer interface' with the proviso that the data have read-only permission.

Implementation

References

Copyright

This document has been placed in the public domain.
--
Paul Barrett, PhD          Space Telescope Science Institute
Phone: 410-338-4475        ESS/Science Software Group
FAX: 410-338-4767          Baltimore, MD 21218

From guido@python.org Fri Sep 14 16:33:53 2001
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Sep 2001 11:33:53 -0400
Subject: [Python-Dev] A draft PEP for a new memory model
In-Reply-To: Your message of "Fri, 14 Sep 2001 10:26:26 EDT." <3BA21392.FB9068DB@STScI.Edu>
References: <3BA21392.FB9068DB@STScI.Edu>
Message-ID: <200109141533.LAA14522@cj20424-a.reston1.va.home.com>

Hi Paul,

Thanks for your PEP. After reading it, I'm still not quite sure what the real problem is that you're trying to solve. Your description hints at practical problems with using the buffer or array objects for managing large chunks of memory, but I'm not quite sure what kind of application you have in mind. Your diagram doesn't clarify things much -- it's too abstract, and can be used for many different designs. I guess I'm saying that I'm not sure what's different about your model.

I also have a feeling that by looking at a slightly different abstraction level you could get most of what you want with only relatively small changes to some existing objects. Rather than focus on the differences between string/array/buffer/mmap, focus on their similarities: they all support the sequence API. The array module can be used to allocate heap memory. The mmap module can be used to allocate shared memory. Could you write the application you have in mind in such a way that it can work with either kind of object?

You hint at a problem with array's repr(). What exactly is the problem here?

Your complaints about the buffer object are misguided. Pretend the buffer object doesn't exist -- it is *not* intended as a memory management tool. (I guess the name is misleading, because we get this confusion a lot.)

Maybe all you need to do is write your own object type that implements your model. How would it differ from the existing array and mmap implementations?
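(A later-Python aside: the "work with either kind of object" suggestion above — array for heap memory, mmap for shared memory, one sequence API over both — can be sketched directly in modern Python, with memoryview playing the role of the uniform access object. The names below are illustrative only:)

```python
import array
import mmap

heap = array.array('B', bytes(16))   # heap-backed read-write memory
shared = mmap.mmap(-1, 16)           # anonymous (shareable) memory

for mem in (heap, shared):
    view = memoryview(mem)           # one access object for both kinds
    view[0:4] = b'data'              # identical slice-assignment interface
    assert bytes(view[0:4]) == b'data'
    view.release()                   # unpin the underlying buffer

shared.close()
```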
--Guido van Rossum (home page: http://www.python.org/~guido/) From robin@alldunn.com Fri Sep 14 21:33:22 2001 From: robin@alldunn.com (Robin Dunn) Date: Fri, 14 Sep 2001 13:33:22 -0700 Subject: [Python-Dev] Preventing PyEval_AcquireLock deadlock Message-ID: <065f01c13d5c$8040cae0$0100a8c0@Rogue> Is there an easy way in the API to check if the current thread already has the interpreter lock so I can avoid calling PyEval_AcquireLock again? If so, is it available all the way back to 1.5.2? -- Robin Dunn Software Craftsman robin@AllDunn.com Java give you jitters? http://wxPython.org Relax with wxPython! From guido@python.org Fri Sep 14 21:40:51 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Sep 2001 16:40:51 -0400 Subject: [Python-Dev] Preventing PyEval_AcquireLock deadlock In-Reply-To: Your message of "Fri, 14 Sep 2001 13:33:22 PDT." <065f01c13d5c$8040cae0$0100a8c0@Rogue> References: <065f01c13d5c$8040cae0$0100a8c0@Rogue> Message-ID: <200109142040.QAA18940@cj20424-a.reston1.va.home.com> > Is there an easy way in the API to check if the current thread already has > the interpreter lock so I can avoid calling PyEval_AcquireLock again? If > so, is it available all the way back to 1.5.2? I don't think so. It's easy to check whether *some* thread has the lock, but the lock abstraction doesn't have a notion of ownership by a specific thread. Let's take a step back. Why do you need this? I'm guessing that you have a C++ library that calls C++ callbacks, and now you want to call a Python callback from your C++ callback. The proper solution is to make sure that you *always* release the Python lock before entering your event loop or anything else that could possibly call callbacks. See _tkinter for how I handled it there. It's ugly, but possible. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From robin@alldunn.com Fri Sep 14 21:52:44 2001 From: robin@alldunn.com (Robin Dunn) Date: Fri, 14 Sep 2001 13:52:44 -0700 Subject: [Python-Dev] Preventing PyEval_AcquireLock deadlock References: <065f01c13d5c$8040cae0$0100a8c0@Rogue> <200109142040.QAA18940@cj20424-a.reston1.va.home.com> Message-ID: <067601c13d5f$34da8c50$0100a8c0@Rogue> > Let's take a step back. Why do you need this? I'm guessing that you > have a C++ library that calls C++ callbacks, and now you want to call > a Python callback from your C++ callback. The proper solution is to > make sure that you *always* release the Python lock before entering > your event loop or anything else that could possibly call callbacks. > See _tkinter for how I handled it there. It's ugly, but possible. Yep, I've already got it working this way except there are a few code paths that result in a callback sometimes being called indirectly from a different part of the code where the Python lock is already acquired. I was hoping to be able to use a general solution instead of having to find all those situations and special-case them. Oh well. Thanks anyway. -- Robin Dunn Software Craftsman robin@AllDunn.com Java give you jitters? http://wxPython.org Relax with wxPython! From guido@python.org Fri Sep 14 21:58:22 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 14 Sep 2001 16:58:22 -0400 Subject: [Python-Dev] Preventing PyEval_AcquireLock deadlock In-Reply-To: Your message of "Fri, 14 Sep 2001 13:52:44 PDT." <067601c13d5f$34da8c50$0100a8c0@Rogue> References: <065f01c13d5c$8040cae0$0100a8c0@Rogue> <200109142040.QAA18940@cj20424-a.reston1.va.home.com> <067601c13d5f$34da8c50$0100a8c0@Rogue> Message-ID: <200109142058.QAA19001@cj20424-a.reston1.va.home.com> > > Let's take a step back. Why do you need this? I'm guessing that you > > have a C++ library that calls C++ callbacks, and now you want to call > > a Python callback from your C++ callback. 
The proper solution is to > > make sure that you *always* release the Python lock before entering > > your event loop or anything else that could possibly call callbacks. > > See _tkinter for how I handled it there. It's ugly, but possible. > > Yep, I've already got it working this way except there are a few > code paths that result in a callback sometimes being called > indirectly from a different part of the code where the Python lock > is already acquired. I was hoping to be able to use a general > solution instead of having to find all those situations and > special-case them. Oh well. I guess you could have your own global variable that says "I've already got the Python lock". The lock sequences would look like this: // before entering the event loop clear the flag unlock the GIL // after returning from the event loop lock the GIL set the flag --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Fri Sep 14 22:20:31 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 14 Sep 2001 17:20:31 -0400 Subject: [Python-Dev] Preventing PyEval_AcquireLock deadlock In-Reply-To: <200109142058.QAA19001@cj20424-a.reston1.va.home.com> Message-ID: Robin, you may also profit from looking at Mark Hammond's utility classes in the win32all extensions, and exploiting platform TLS (thread-local storage) can make it easy to give each thread a "do I own the lock?" flag -- but you need to establish choke points in your own code, i.e. a protocol of your own for dealing with threads, wrapping the raw Python C API. > I was hoping to be able to use a general solution instead of having to > find all those situations and special-case them. Oh well. You can, but the general solution has to come from you now. Callbacks in the presence of threads are a bitch, and redesigning this part of the Python C API is something the thread-SIG intended to tackle but didn't accomplish. 
From jriehl@spaceship.com Fri Sep 14 22:43:58 2001 From: jriehl@spaceship.com (Jonathan Riehl) Date: Fri, 14 Sep 2001 16:43:58 -0500 (CDT) Subject: [Python-Dev] Re: PEP 269 In-Reply-To: <200109140932.LAA28452@pandora.informatik.hu-berlin.de> Message-ID: On Fri, 14 Sep 2001, Martin von Loewis wrote: (In response to Samuele Pedroni) > > - if its purpose is to offer a framework for small languages > > support, there are already modules around that support that > > (SPARK, PLY ...), the only advantage of PEP 269 being speed > > wrt the pure python solutions, because of the use of the internal > > CPython parser, OTOH the other solutions are more flexible... > > I agree. I'd like to see (or perhaps write myself) a proposal for > adding one or two of these packages to the Python core (two are > probably better, since there is no one-size-fits-all parser framework, > and adding two avoids the impression that there is a single "blessed" > parser). I would like to note that integration with these other systems is in the plans for my Basil project (http://wildideas.org/basil/). I just felt that a pgen integration would be better suited to the native code base rather than copying the code over to another project and building it as an extension module (which was a route being explored by Mobius Python). > > It should be considered that Jython does not contain a parser > > similar to CPython one. Because of this jython does not offer parser > > module support. So implementing the PEP for Jython would require > > writing a Java or pure python equivalent of the CPython parser. I am all for writing a pgen implementation in pure Python. The reason I am not going this route from the get-go is to do what is easy before we do what is less easy. If, for example, a reference implementation of a Python type system were to be adopted as standard, I would think that making the new system easy to add to Jython would be a prerequisite.
Hence, we would need to develop a Jython parser that uses the grammar from CPython. > If the goal is to play with extensions to the Python grammar, I think > this is less of an issue. Of course, anybody wanting to extend the C > grammar could easily modify the Python interpreter itself. > So I think I'm -1 on this PEP, on the basis that this is code bloat > (i.e. new functionality used too rarely). As stated in the PEP, one of the primary motivations for the proposal is to allow grammar extensions to be prototyped in Python (esp. optional static typing). I would argue that making actual changes to CPython is much more expensive than writing a front end in Python. By adding a pgen module to Python, I feel that we are not bloating Python so much as we are exposing functionality already built into Python. Thanks, -Jon From Samuele Pedroni Fri Sep 14 23:54:44 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Sat, 15 Sep 2001 00:54:44 +0200 (MET DST) Subject: [Python-Dev] Re: [Types-sig] Re: PEP 269 Message-ID: <200109142254.AAA22101@core.inf.ethz.ch> [jriehl] > > On Fri, 14 Sep 2001, Martin von Loewis wrote: (In response to Samuele > Pedroni) > > > - if its purpose is to offer a framework for small languages > > > support, there are already modules around that support that > > > (SPARK, PLY ...), the only advantage of PEP 269 being speed > > > wrt the pure python solutions, because of the use of the internal > > > CPython parser, OTOH the other solutions are more flexible... > > > > I agree. I'd like to see (or perhaps write myself) a proposal for > > adding one or two of these packages to the Python core (two are > > probably better, since there is no one-size-fits-all parser framework, > > and adding two avoids the impression that there is a single "blessed" > > parser). > > I would like to note that integration with these other systems are in > the plans for my Basil project (http://wildideas.org/basil/). Which is not in the scope of the PEP.
> I just felt > that a pgen integration would be better suited to the native code base > rather than copying the code over to another project and building it as an > extension module (which was a route being explored by Mobius Python). I see, but as you can see there are other issues that come into play: Jython, and first of all the appropriateness of the interface. > > > > It should be considered that Jython does not contain a parser > > > similar to CPython one. Because of this jython does not offer parser > > > module support. So implementing the PEP for Jython would require > > > writing a Java or pure python equivalent of the CPython parser. > > I am all for writing a pgen implementation in pure Python. The reason I > am not going this route from the get-go is to do what is easy before we do > what is less easy. :) > If, for example, a reference implementation of a > Python type system were to be adopted as standard, I would think that > making the new system easy to add to Jython would be a prerequisite. > Hence, we would need to develop a Jython parser that uses the grammar > from CPython. Of course Jython has a (different and not run-time configurable) parser that could be extended if necessary, so I don't get the point. > > > If the goal is to play with extensions to the Python grammar, I think > > this is less of an issue. Of course, anybody wanting to extend the C > > grammar could easily modify the Python interpreter itself. > > > So I think I'm -1 on this PEP, on the basis that this is code bloat > > (i.e. new functionality used too rarely). > > As stated in the PEP, one of the primary motivations for the proposal is > to allow grammar extensions to be prototyped in Python (esp. optional > static typing). I would argue that making actual changes to CPython is > much more expensive than writing a front end in Python. But you could write that using one of the SPARK, PLY ...
tools. And in any case the PEP is ignoring the part about how to produce the actual code from a Python front-end, and how to add possibly necessary new bytecodes... > By adding a pgen > module to Python, I feel that we are not bloating Python so much as we are > exposing functionality already built into Python. > Yes, but how much it is worth exposing such functionality depends on the whole picture: how do you want to concretely use the exposed functionality? It is anything but clear how you can exploit a Python-exposed pgen and parser in order to make it as easy as possible for a casual user to experiment with a grammar extension. In an ideal scenario the user would install some file in his Python installation and start Python having access to the extension. How this can work is unanswered by the PEP. I think that a PEP that addresses the whole problem would make more sense and would be easier to evaluate. regards. From aahz@panix.com Mon Sep 17 05:23:09 2001 From: aahz@panix.com (aahz@panix.com) Date: Mon, 17 Sep 2001 00:23:09 -0400 (EDT) Subject: [Python-Dev] nested-scopes redux (fwd) In-Reply-To: Message-ID: <200109170423.f8H4N9u28311@panix1.panix.com> Haven't seen any response to this, and I think it deserves one (then again, I could be wrong): ------- start of forwarded message ------- From: Cliff Wells Newsgroups: comp.lang.python Subject: nested-scopes redux Date: Wed, 12 Sep 2001 15:45:17 -0700 Message-ID: Reply-To: logiplexsoftware@earthlink.net To: python-list@python.org Sorry to bring this subject back up again, but I just noticed a somewhat annoying feature of the current
nested-scope implementation. One of the big pluses of nested scopes is doing away with default arguments in lambda functions. Unfortunately the following bits of code behave differently:

foo = []
for i in [1, 2]:
    foo.append(lambda i = i: i)

and

from __future__ import nested_scopes
foo = []
for i in [1, 2]:
    foo.append(lambda: i)

In the first case the output from foo[0]() and foo[1]() is 1 and 2, respectively - and it's what one would probably want. In the second case, the output is always 2. This may be the expected behavior, but it kind of does away with the benefits of providing a cleaner lambda call. -- Cliff Wells Software Engineer Logiplex Corporation (www.logiplex.net) (503) 978-6726 x308 (800) 735-0555 x308 ------- end of forwarded message ------- From guido@python.org Mon Sep 17 05:32:13 2001 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Sep 2001 00:32:13 -0400 Subject: [Python-Dev] nested-scopes redux (fwd) In-Reply-To: Your message of "Mon, 17 Sep 2001 00:23:09 EDT." <200109170423.f8H4N9u28311@panix1.panix.com> References: <200109170423.f8H4N9u28311@panix1.panix.com> Message-ID: <200109170432.AAA29150@cj20424-a.reston1.va.home.com> > Haven't seen any response to this, and I think it deserves one (then > again, I could be wrong): Alas, few people here have time to scan comp.lang.python. Cliff expects variable references to be bound to the value. That's not how it works -- the idea of nested scopes is that the owner of the variable (the surrounding scope) can change the variable's value, and the user (the inner scope) will pick up the changes. This is useful e.g. when an outer function calls an inner function repeatedly. This is how nested scopes work in other languages. If Cliff wants to create N different lambdas with N different values for i, he'll have to do it the old way. Can you post this reply?
--Guido van Rossum (home page: http://www.python.org/~guido/) > ------- start of forwarded message ------- > From: Cliff Wells > Newsgroups: comp.lang.python > Subject: nested-scopes redux > Date: Wed, 12 Sep 2001 15:45:17 -0700 > Message-ID: > Reply-To: logiplexsoftware@earthlink.net > To: python-list@python.org > > Sorry to bring this subject back up again, but I just noticed a somewhat > annoying feature of the current nested-scope implementation. One of the big > pluses of nested scopes is doing away with default arguments in lambda > functions. Unfortunately the following bits of code behave differently: > > foo = [] > for i in [1, 2]: > foo.append(lambda i = i: i) > > and > > from __future__ import nested_scopes > foo = [] > for i in [1, 2]: > foo.append(lambda: i) > > In the first case the output from foo[0]() and foo[1]() is 1 and 2, > respectively - and it's what one would probably want. In the second case, > the output is always 2. This may be the expected behavior, but it kind of > does away with the benefits to providing a cleaner lambda call. > > -- > Cliff Wells > Software Engineer > Logiplex Corporation (www.logiplex.net) > (503) 978-6726 x308 > (800) 735-0555 x308 > > ------- end of forwarded message ------- > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From tim.one@home.com Mon Sep 17 05:52:41 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 17 Sep 2001 00:52:41 -0400 Subject: [Python-Dev] nested-scopes redux (fwd) In-Reply-To: <200109170423.f8H4N9u28311@panix1.panix.com> Message-ID: [Aahz] > Haven't seen any response to this, and I think it deserves one (then > again, I could be wrong): The semantics here were gone over in detail on c.l.py just a couple weeks ago, in response to Paul Rubin's questions. God knows I'm not averse to unbounded repetition , but not the same stuff *every* week. 
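The two behaviors Cliff and Guido discuss can be reproduced side by side. This is a small sketch for later Pythons where nested scopes are the default (so no `__future__` import is needed); the function names are made up:

```python
def make_default_arg_lambdas():
    # Early binding: each lambda captures the current value of i
    # through its default argument.
    fns = []
    for i in [1, 2]:
        fns.append(lambda i=i: i)
    return fns

def make_closure_lambdas():
    # Late binding: each lambda closes over the variable i itself,
    # so all of them see its final value after the loop ends.
    fns = []
    for i in [1, 2]:
        fns.append(lambda: i)
    return fns

print([f() for f in make_default_arg_lambdas()])  # [1, 2]
print([f() for f in make_closure_lambdas()])      # [2, 2]
```

This is exactly Guido's point: the inner scope sees changes made by the owner of the variable, so capturing a per-iteration value still requires the old default-argument trick.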
From loewis@informatik.hu-berlin.de Mon Sep 17 08:13:01 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Mon, 17 Sep 2001 09:13:01 +0200 (MEST) Subject: [Python-Dev] Re: PEP 269 In-Reply-To: (message from Jonathan Riehl on Fri, 14 Sep 2001 16:43:58 -0500 (CDT)) References: Message-ID: <200109170713.JAA17225@pandora.informatik.hu-berlin.de> > As stated in the PEP, one of the primary motivations for the proposal is > to allow grammar extensions to be prototyped in Python (esp. optional > static typing). I would argue that making actual changes to CPython is > much more expensive than writing a front end in Python. By adding a pgen > module to Python, I feel that we are not bloating Python so much as we are > exposing functionality already built into Python. The potential problem is that this new module must then be supported for a long time. People will propose extensions to it, which must be evaluated, and every change must be reviewed carefully for incompatibilities. I'm not opposed to changes. However, I fail to see the value of prototyping the grammar, since you'll need subsequent changes as well, to the byte code generation, and perhaps evaluation. Also, I still doubt that anybody interested in changing the grammar couldn't easily recompile Python. Regards, Martin From mal@lemburg.com Mon Sep 17 18:12:09 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 17 Sep 2001 19:12:09 +0200 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> Message-ID: <3BA62EE9.D8C4F0A5@lemburg.com> Martin von Loewis wrote: > > > is there any reason why "print" cannot pass unicode > > strings on to the underlying write method? > > Mostly because there is no guarantee that every .write method will > support Unicode objects.
I see two options: either a stream might > declare itself as supporting unicode on output (by, say, providing a > unicode attribute), or all streams are required by BDFL pronouncement > to accept Unicode objects. I think the latter option would go a long way: many file-like objects are written in C and will use the C parser markers. These can handle Unicode without problem (issuing an exception in case the conversion to ASCII fails). The only notable exception is the cStringIO module -- but this could probably be changed to be buffer interface compliant too. > BTW, your wrapper example can be rewritten as > > import sys,codecs > sys.stdout = codecs.lookup("iso-8859-1")[3](sys.stdout) > > I wish codecs.lookup returned a record with named fields, instead of a > list, so I could write > > sys.stdout = codecs.lookup("iso-8859-1").writer(sys.stdout) > > (the other field names would be encode,decode, and reader). Why don't you write a small helper function for the codecs.py module ?! E.g. codecs.info("iso-8859-1") could provide an alternative interface which returns a CodecInfo instance with attributes instead of tuple entries. Note that the tuple interface was chosen for sake of speed and better handling at C level (tuples can be cached and are easily parseable in C). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From jeremy@alum.mit.edu Mon Sep 17 18:23:36 2001 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Mon, 17 Sep 2001 13:23:36 -0400 (EDT) Subject: [Python-Dev] test_socketserver fails Message-ID: <200109171723.NAA27608@newman.concentric.net> I just tried to run the socketserver test and got an unexpected failure. I've never run the test before, so far as I know. Has anyone else? Should I expect it to work? 
Jeremy

Python 2.2a3+ (#336, Sep 17 2001, 12:56:00)
[GCC 2.95.2 19991024 (release)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import test.test_socketserver
ADDR = ('localhost', 13511)
CLASS = SocketServer.TCPServer
server created
server running
thread: creating server
thread: serving three times
test client 0
test client 1
test client 2
thread: done
waiting for server
done
ADDR = ('localhost', 13512)
CLASS = SocketServer.ThreadingTCPServer
server created
server running
thread: creating server
thread: serving three times
test client 0
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/local/lib/python2.2/test/test_socketserver.py", line 162, in ?
    main()
  File "/usr/local/lib/python2.2/test/test_socketserver.py", line 158, in main
    testall()
  File "/usr/local/lib/python2.2/test/test_socketserver.py", line 148, in testall
    testloop(socket.AF_INET, tcpservers, MyStreamHandler, teststream)
  File "/usr/local/lib/python2.2/test/test_socketserver.py", line 124, in testloop
    testfunc(proto, addr)
  File "/usr/local/lib/python2.2/test/test_socketserver.py", line 65, in teststream
    buf = data = receive(s, 100)
  File "/usr/local/lib/python2.2/test/test_socketserver.py", line 49, in receive
    raise RuntimeError, "timed out on %s" % `sock`
RuntimeError: timed out on

From guido@python.org Mon Sep 17 18:22:28 2001 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Sep 2001 13:22:28 -0400 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? In-Reply-To: Your message of "Mon, 17 Sep 2001 19:12:09 +0200." <3BA62EE9.D8C4F0A5@lemburg.com> References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> Message-ID: <200109171722.NAA01422@cj20424-a.reston1.va.home.com> > Martin von Loewis wrote: > > > > > is there any reason why "print" cannot pass unicode > > > strings on to the underlying write method?
> > > > Mostly because there is no guarantee that every .write method will > > support Unicode objects. I see two options: either a stream might > > declare itself as supporting unicode on output (by, say, providing a > > unicode attribute), or all streams are required by BDFL pronouncement > > to accept Unicode objects. > > I think the latter option would go a long way: many file-like > objects are written in C and will use the C parser markers. These > can handle Unicode without problem (issuing an exception in case > the conversion to ASCII fails). Agreed, but BDFL pronouncement doesn't make it so: individual modules still have to be modified if they don't do the right thing (especially 3rd party modules -- we have no control there). And then, what's the point of handling Unicode if we only accept Unicode-encoded ASCII strings? > The only notable exception is > the cStringIO module -- but this could probably be changed to > be buffer interface compliant too. Sure, just submit a patch. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Sep 17 18:27:40 2001 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Sep 2001 13:27:40 -0400 Subject: [Python-Dev] test_socketserver fails In-Reply-To: Your message of "Mon, 17 Sep 2001 13:23:36 EDT." <200109171723.NAA27608@newman.concentric.net> References: <200109171723.NAA27608@newman.concentric.net> Message-ID: <200109171727.NAA01497@cj20424-a.reston1.va.home.com> > I just tried to run the socketserver test and got an unexpected > failure. I've never run the test before, so far as I know. Has > anyone else? Should I expect it to work? > > Jeremy > > Python 2.2a3+ (#336, Sep 17 2001, 12:56:00) > [GCC 2.95.2 19991024 (release)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import test.test_socketserver I believe that's the problem. The import lock gets in the way. 
Try running it from the command line: python Lib/test/test_socketserver.py --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Sep 17 18:50:08 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 17 Sep 2001 19:50:08 +0200 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109171722.NAA01422@cj20424-a.reston1.va.home.com> Message-ID: <3BA637D0.CB5489DD@lemburg.com> Guido van Rossum wrote: > > > Martin von Loewis wrote: > > > > > > > is there any reason why "print" cannot pass unicode > > > > strings on to the underlying write method? > > > > > > Mostly because there is no guarantee that every .write method will > > > support Unicode objects. I see two options: either a stream might > > > declare itself as supporting unicode on output (by, say, providing a > > > unicode attribute), or all streams are required by BDFL pronouncement > > > to accept Unicode objects. > > > > I think the latter option would go a long way: many file-like > > objects are written in C and will use the C parser markers. These > > can handle Unicode without problem (issuing an exception in case > > the conversion to ASCII fails). > > Agreed, but BDFL pronouncement doesn't make it so: individual modules > still have to be modified if they don't do the right thing (especially > 3rd party modules -- we have no control there). True, but we are only talking about file objects which are used for sys.stdout -- I don't think that allowing Unicode to be passed to their .write() methods will break a whole lot of code. > And then, what's the point of handling Unicode if we only accept > Unicode-encoded ASCII strings? I was under the impression that Fredrik wants to let Unicode pass through from the print statement to the .write method of sys.stdout. 
If the sys.stdout object knows about Unicode then things will work just fine; if not, the internal Python machinery will either try to convert it to an ASCII string (e.g. if the file object uses "s#") or the file object will raise a TypeError (this is what cStringIO does). Currently, Python forces conversion to 8-bit strings for all printed objects (at least this is what it did last time I looked into this problem a long while ago). > > The only notable exception is > > the cStringIO module -- but this could probably be changed to > > be buffer interface compliant too. > > Sure, just submit a patch. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From jeremy@zope.com Mon Sep 17 18:50:42 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Mon, 17 Sep 2001 13:50:42 -0400 (EDT) Subject: [Python-Dev] test_socketserver fails In-Reply-To: <200109171727.NAA01497@cj20424-a.reston1.va.home.com> References: <200109171723.NAA27608@newman.concentric.net> <200109171727.NAA01497@cj20424-a.reston1.va.home.com> Message-ID: <15270.14322.443955.555134@slothrop.digicool.com> >>>>> "GvR" == Guido van Rossum writes: >> Python 2.2a3+ (#336, Sep 17 2001, 12:56:00) [GCC 2.95.2 19991024 >> (release)] on linux2 Type "help", "copyright", "credits" or >> "license" for more information. >> >>> import test.test_socketserver GvR> I believe that's the problem. The import lock gets in the way. GvR> Try running it from the command line: GvR> python Lib/test/test_socketserver.py Aha! Then the code at the top is really broken:

# XXX This must be run manually -- somehow the I/O redirection of the
# regression test breaks the test.

from test_support import verbose, verify, TESTFN, TestSkipped
if not verbose:
    raise TestSkipped, "test_socketserver can only be run manually"

There are two substantial problems.
First, the comment about I/O redirection is wrong. Second, if you do run regrtest.py -v, then the test will be run anyway. This is how I first stumbled over the problem. Another test was failing and I used -v to figure out why. But regrtest.py also reported that test_socketserver failed and then hung, because socketserver non-daemon threads were still running. So what's the right way to fix this test? If the test can't be run as part of regrtest and can't be imported, I wonder why it's in Lib/test to begin with. Jeremy From guido@python.org Mon Sep 17 19:24:45 2001 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Sep 2001 14:24:45 -0400 Subject: [Python-Dev] test_socketserver fails In-Reply-To: Your message of "Mon, 17 Sep 2001 13:50:42 EDT." <15270.14322.443955.555134@slothrop.digicool.com> References: <200109171723.NAA27608@newman.concentric.net> <200109171727.NAA01497@cj20424-a.reston1.va.home.com> <15270.14322.443955.555134@slothrop.digicool.com> Message-ID: <200109171824.OAA01709@cj20424-a.reston1.va.home.com> > GvR> python Lib/test/test_socketserver.py > > Aha! Then the code at the top is really broken: > > # XXX This must be run manually -- somehow the I/O redirection of the > # regression test breaks the test. > > from test_support import verbose, verify, TESTFN, TestSkipped > if not verbose: > raise TestSkipped, "test_socketserver can only be run manually" I guess when I say "run a test manually" I think "from the command line", not "from the interactive prompt". :-( > There are two substantial problems. First, the comment about I/O > redirection is wrong. I remember I couldn't figure out *why* it was failing. :-( > Second, if you do run regrtest.py -v, then the > test will be run anyway. This is how I first stumbled over the > problem. Another test was failing and I used -v to figure out why. > But regrtest.py also reported that test_socketserver failed and then > hung, because socketserver non-daemon threads were still running. 
Yup, that's a problem. > So what's the right way to fix this test? If the test can't be run as > part of regrtest and can't be imported, I wonder why it's in Lib/test > to begin with. pystone.py is also in Lib/test. It's a collection of tests. Maybe this test should be renamed. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Mon Sep 17 20:27:44 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 17 Sep 2001 15:27:44 -0400 Subject: [Python-Dev] test_socketserver fails In-Reply-To: <15270.14322.443955.555134@slothrop.digicool.com> Message-ID: [Jeremy Hylton] > ... > So what's the right way to fix this test? If the test can't be run as > part of regrtest and can't be imported, I wonder why it's in Lib/test > to begin with. It runs fine if run as Guido suggested (from the cmdline). I just checked in(*) a change that causes it to punt if the import lock is held. This prevents running it via import (whether directly from a Python shell or indirectly via regrtest). test_threaded_import is the pioneer in these miserable cases, and I suppose test_socketserver could/should also be reworked to exploit test_threaded_import's test_main() trick too (then it should be able to run under regrtest, but still not via direct import). BTW, when large magical things happen as a side-effect of merely importing a module, we complain when the context is Zope . The "test_main() trick" breaks the connection between importing a test module and actually running the test. (*) But it hasn't completed -- lines and lines of cvs server: [12:27:14] waiting for anoncvs_python's lock in cvsroot/python/python/dist/src/Lib/test Hope we don't have an immortal lock again! From tim.one@home.com Mon Sep 17 20:54:27 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 17 Sep 2001 15:54:27 -0400 Subject: [Python-Dev] test_socketserver fails In-Reply-To: Message-ID: [Tim] > ... 
> (*) But it hasn't completed -- lines and lines of > > cvs server: [12:27:14] waiting for anoncvs_python's lock in > cvsroot/python/python/dist/src/Lib/test > > Hope we don't have an immortal lock again! No change. I opened an SF support request: From tim.one@home.com Mon Sep 17 21:09:23 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 17 Sep 2001 16:09:23 -0400 Subject: [Python-Dev] test_socketserver fails In-Reply-To: Message-ID: While I can't check it in, I fiddled test_socketserver to use test_main() too, and now it runs fine when run via regrtest (but does not, and cannot be made to, run as a side-effect of getting imported -- requires >>> from test import test_socketserver >>> test_socketserver.test_main() instead, if you want to do it that way). My question now is *should* it run under regrtest by default? Or should it require the network resource (regrtest "-u network")? On Win98SE, it takes about 20 seconds to run. From DavidA@ActiveState.com Mon Sep 17 22:08:23 2001 From: DavidA@ActiveState.com (David Ascher) Date: Mon, 17 Sep 2001 14:08:23 -0700 Subject: [Python-Dev] REMINDER: 10th Python Conference -- Deadline reminder -- Accepting Papers! References: <3B95B4D2.CB487F82@ActiveState.com> Message-ID: <3BA66647.4C99F425@ActiveState.com> The deadline for paper submissions for the 10th Python Conference is coming up soon: !!! October 8, 2001 !!! *NEW*: The website is now accepting paper submissions -- see papers.python10.org for details. The conference will be February 4-7, 2002 in Alexandria, VA. See www.python10.org for details. Contact me if you have questions. -- David Ascher Program Chair 10th International Python Conference. From guido@python.org Tue Sep 18 02:45:28 2001 From: guido@python.org (Guido van Rossum) Date: Mon, 17 Sep 2001 21:45:28 -0400 Subject: [Python-Dev] test_socketserver fails In-Reply-To: Your message of "Mon, 17 Sep 2001 16:09:23 EDT." 
References: Message-ID: <200109180145.VAA04304@cj20424-a.reston1.va.home.com> > While I can't check it in, I fiddled test_socketserver to use test_main() > too, and now it runs fine when run via regrtest (but does not, and cannot be > made to, run as a side-effect of getting imported -- requires > > >>> from test import test_socketserver > >>> test_socketserver.test_main() > > instead, if you want to do it that way). > > My question now is *should* it run under regrtest by default? Or should it > require the network resource (regrtest "-u network")? On Win98SE, it takes > about 20 seconds to run. The time is not the problem, but when you don't have networking properly configured, it will take a lot longer for the DNS requests to time out, so I propose it should require the network resource. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Tue Sep 18 03:20:30 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 17 Sep 2001 22:20:30 -0400 Subject: [Python-Dev] test_socketserver fails In-Reply-To: <200109180145.VAA04304@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > The time is not the problem, but when you don't have networking > properly configured, it will take a lot longer for the DNS requests to > time out, so I propose it should require the network resource. OK, it does now. From loewis@informatik.hu-berlin.de Tue Sep 18 09:02:15 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Tue, 18 Sep 2001 10:02:15 +0200 (MEST) Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? 
In-Reply-To: <3BA62EE9.D8C4F0A5@lemburg.com> (mal@lemburg.com) References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> Message-ID: <200109180802.KAA01481@pandora.informatik.hu-berlin.de> > > I wish codecs.lookup returned a record with named fields, instead of a > > list, so I could write > > > > sys.stdout = codecs.lookup("iso-8859-1").writer(sys.stdout) > > > > (the other field names would be encode,decode, and reader). > > Why don't you write a small helper function for the codecs.py > module ?! Because I'd like to avoid an inflation of functions. Instead, I'd prefer codecs.lookup to return an object that has the needed fields, but behaves like a tuple for backwards compatibility. > Note that the tuple interface was chosen for sake of speed and > better handling at C level (tuples can be cached and are easily > parseable in C). It may be that an inherited tuple class might achieve the same effect. Can you identify the places where codecs.lookup is assumed to return tuples? Regards, Martin From loewis@informatik.hu-berlin.de Tue Sep 18 09:08:39 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Tue, 18 Sep 2001 10:08:39 +0200 (MEST) Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? In-Reply-To: <200109171722.NAA01422@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Mon, 17 Sep 2001 13:22:28 -0400) References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109171722.NAA01422@cj20424-a.reston1.va.home.com> Message-ID: <200109180808.KAA01749@pandora.informatik.hu-berlin.de> > > > Mostly because there is no guarantee that every .write method will > > > support Unicode objects. I see two options: either a stream might > > > declare itself as supporting unicode on output (by, say, providing a > > > unicode attribute), or all streams are required by BDFL pronouncement > > > to accept Unicode objects. 
> > > > I think the latter option would go a long way: many file-like > > objects are written in C and will use the C parser markers. These > > can handle Unicode without problem (issuing an exception in case > > the conversion to ASCII fails). > > Agreed, but BDFL pronouncement doesn't make it so: individual modules > still have to be modified if they don't do the right thing (especially > 3rd party modules -- we have no control there). > > And then, what's the point of handling Unicode if we only accept > Unicode-encoded ASCII strings? By accepting Unicode, I would specifically require that they, at least: - do not crash the interpreter when being passed Unicode objects - attempt to perform some conversion if they do not support Unicode directly; if they don't know any specific conversion, the default conversion should be used (i.e. that they don't give a TypeError). With these assumptions, it is possible to allow print to pass Unicode objects to the file's write method, instead of converting Unicode itself. This, in turn, enables users to replace sys.stdout with something that supports a different encoding. Of course, you still may get Unicode errors, since some streams may not support all Unicode characters (e.g. since the terminal does not support them). Regards, Martin From mal@lemburg.com Tue Sep 18 09:22:02 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 18 Sep 2001 10:22:02 +0200 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> Message-ID: <3BA7042A.B9D5425F@lemburg.com> Martin von Loewis wrote: > > > > I wish codecs.lookup returned a record with named fields, instead of a > > > list, so I could write > > > > > > sys.stdout = codecs.lookup("iso-8859-1").writer(sys.stdout) > > > > > > (the other field names would be encode,decode, and reader). 
> > Why don't you write a small helper function for the codecs.py
> > module ?!
>
> Because I'd like to avoid an inflation of functions. Instead, I'd
> prefer codecs.lookup to return an object that has the needed fields,
> but behaves like a tuple for backwards compatibility.

That won't be possible without breaking user code since it is well documented that codecs.lookup() returns a tuple.

BTW, I don't think that adding a class CodecInfo which takes the encoding name as constructor argument would introduce much inflation of functions here. You will have to provide such a class anyway to achieve what you are looking for, so I guess this is the way to go.

> > Note that the tuple interface was chosen for sake of speed and
> > better handling at C level (tuples can be cached and are easily
> > parseable in C).
>
> It may be that an inherited tuple class might achieve the same
> effect. Can you identify the places where codecs.lookup is assumed to
> return tuples?

I'd rather not make the interface more complicated. The C side certainly cannot be changed for the reasons given above and Python users could choose your new CodecInfo class to get access to a nicer interface.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From loewis@informatik.hu-berlin.de Tue Sep 18 10:56:42 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Tue, 18 Sep 2001 11:56:42 +0200 (MEST)
Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object?
In-Reply-To: <3BA7042A.B9D5425F@lemburg.com> (mal@lemburg.com)
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com>
Message-ID: <200109180956.LAA02780@pandora.informatik.hu-berlin.de>

> That won't be possible without breaking user code since
> it is well documented that codecs.lookup() returns a tuple.

Suppose codecs.lookup would return an instance of

_fields = {'encode':0,'decode':1,'reader':2,'writer':3}
class CodecInfo(tuple):
    __dynamic__ = 0
    def __getattr__(self, name):
        try:
            return self[_fields[name]]
        except KeyError:
            raise AttributeError, name

What user code exactly would break? Would that be a serious problem?

> BTW, I don't think that adding a class CodecInfo which takes
> the encoding name as constructor argument would introduce
> much inflation of functions here. You will have to provide
> such a class anyway to achieve what you are looking for, so
> I guess this is the way to go.

I guess not. If the codec.lookup return value is changed, then I can write

    encoder = codecs.lookup("latin-1").encode

Without that, I have to write

    encoder = codecs.CodecInfo(codecs.lookup("latin-1")).encode

This is overly complicated.

> > It may be that an inherited tuple class might achieve the same
> > effect. Can you identify the places where codecs.lookup is assumed to
> > return tuples?
>
> I'd rather not make the interface more complicated. The C side
> certainly cannot be changed for the reasons given above and
> Python users could choose your new CodecInfo class to get access
> to a nicer interface.

What code exactly would have to change if I wanted lookup to return a CodecInfo object?

Regards,
Martin

From fredrik@pythonware.com Tue Sep 18 11:22:03 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 18 Sep 2001 12:22:03 +0200
Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object?
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de>
Message-ID: <00ea01c1402b$c3001b20$0900a8c0@spiff>

martin wrote:
> I guess not. If the codec.lookup return value is changed, then I can
> write
>
>     encoder = codecs.lookup("latin-1").encode
>
> Without that, I have to write
>
>     encoder = codecs.CodecInfo(codecs.lookup("latin-1")).encode

or you could make things more readable, and add explicit get-functions for each property:

    encoder = codecs.getencoder("latin-1")

much easier to understand, especially for casual users (cf. os.path.getsize etc)

From mal@lemburg.com Tue Sep 18 11:30:35 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 18 Sep 2001 12:30:35 +0200
Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object?
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109171722.NAA01422@cj20424-a.reston1.va.home.com> <200109180808.KAA01749@pandora.informatik.hu-berlin.de>
Message-ID: <3BA7224B.EA8AFE17@lemburg.com>

Martin von Loewis wrote:
> > > > Mostly because there is no guarantee that every .write method will
> > > > support Unicode objects. I see two options: either a stream might
> > > > declare itself as supporting unicode on output (by, say, providing a
> > > > unicode attribute), or all streams are required by BDFL pronouncement
> > > > to accept Unicode objects.
> > >
> > > I think the latter option would go a long way: many file-like
> > > objects are written in C and will use the C parser markers. These
> > > can handle Unicode without problem (issuing an exception in case
> > > the conversion to ASCII fails).
> > > > Agreed, but BDFL pronouncement doesn't make it so: individual modules > > still have to be modified if they don't do the right thing (especially > > 3rd party modules -- we have no control there). > > > > And then, what's the point of handling Unicode if we only accept > > Unicode-encoded ASCII strings? > > By accepting Unicode, I would specifically require that they, at > least: > - do not crash the interpreter when being passed Unicode objects I don't see how this could happen. At worst users will see a TypeError or UnicodeError when passing a Unicode object to print with some sys.stdout hook in place which doesn't know about Unicode objects. > - attempt to perform some conversion if they do not support Unicode > directly; if they don't know any specific conversion, the default > conversion should be used (i.e. that they don't give a TypeError). That's what happens if the hook uses "s#" or "t#". Otherwise they'll raise a TypeError. > With these assumptions, it is possible to allow print to pass Unicode > objects to the file's write method, instead of converting Unicode > itself. This, in turn, enables users to replace sys.stdout with > something that supports a different encoding. > > Of course, you still may get Unicode errors, since some streams may > not support all Unicode characters (e.g. since the terminal does not > support them). Right. Now who will write the patch ? :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@python.org Tue Sep 18 14:16:16 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Sep 2001 09:16:16 -0400 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? In-Reply-To: Your message of "Tue, 18 Sep 2001 10:02:15 +0200." 
<200109180802.KAA01481@pandora.informatik.hu-berlin.de>
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de>
Message-ID: <200109181316.JAA14066@cj20424-a.reston1.va.home.com>

> Because I'd like to avoid an inflation of functions. Instead, I'd
> prefer codecs.lookup to return an object that has the needed fields,
> but behaves like a tuple for backwards compatibility.

Here's how you can do that in 2.2a3 (I wrote this incomplete example for another situation :-):

class Stat(tuple):
    def __new__(cls, t):
        assert len(t) == 9
        self = tuple.__new__(cls, t[:7])
        self.st_seven = t[7]
        self.st_eight = t[8]
        return self
    st_zero = property(lambda x: x[0])
    st_one = property(lambda x: x[1])
    # etc.

t = (0,1,2,3,4,5,6,7,8)
s = Stat(t)
a,b,c,d,e,f,g = s
assert (a, b, c, d, e, f, g) == t[:7]
assert t == s + (s.st_seven, s.st_eight)

> > Note that the tuple interface was chosen for sake of speed and
> > better handling at C level (tuples can be cached and are easily
> > parseable in C).

Alas, a tuple subclass loses some of the speed and size advantage -- the additional instance variables require allocation of a dictionary. (And no, you cannot use __slots__ here -- the slot access mechanism doesn't jive well with the variable length tuple structure. If we were to subclass a list, we could add

    __slots__ = ["st_seven", "st_eight"]

to the class. But that's not fully tuple-compatible.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Tue Sep 18 14:17:07 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Sep 2001 09:17:07 -0400
Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object?
In-Reply-To: Your message of "Tue, 18 Sep 2001 10:08:39 +0200."
<200109180808.KAA01749@pandora.informatik.hu-berlin.de> References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109171722.NAA01422@cj20424-a.reston1.va.home.com> <200109180808.KAA01749@pandora.informatik.hu-berlin.de> Message-ID: <200109181317.JAA14081@cj20424-a.reston1.va.home.com> > > And then, what's the point of handling Unicode if we only accept > > Unicode-encoded ASCII strings? > > By accepting Unicode, I would specifically require that they, at > least: > - do not crash the interpreter when being passed Unicode objects > - attempt to perform some conversion if they do not support Unicode > directly; if they don't know any specific conversion, the default > conversion should be used (i.e. that they don't give a TypeError). > > With these assumptions, it is possible to allow print to pass Unicode > objects to the file's write method, instead of converting Unicode > itself. This, in turn, enables users to replace sys.stdout with > something that supports a different encoding. > > Of course, you still may get Unicode errors, since some streams may > not support all Unicode characters (e.g. since the terminal does not > support them). OK. That's very reasonable. What do we need to change to make this happen? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Tue Sep 18 14:24:20 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Sep 2001 09:24:20 -0400 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? In-Reply-To: Your message of "Tue, 18 Sep 2001 11:56:42 +0200." 
<200109180956.LAA02780@pandora.informatik.hu-berlin.de>
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de>
Message-ID: <200109181324.JAA14154@cj20424-a.reston1.va.home.com>

> _fields = {'encode':0,'decode':1,'reader':2,'writer':3}
> class CodecInfo(tuple):
>     __dynamic__ = 0
>     def __getattr__(self, name):
>         try:
>             return self[_fields[name]]
>         except KeyError:
>             raise AttributeError, name

You want to change that raise statement into

    return tuple.__getattr__(self, name)

Remember, new-style __getattr__ *replaces* the built-in getattr operation; it doesn't only get invoked when the built-in getattr doesn't find the attribute, like it does for classic classes.

(I'm still considering whether this is too backwards incompatible; we could have two getattr hooks, one old-style and one new-style.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From mal@lemburg.com Tue Sep 18 14:52:00 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 18 Sep 2001 15:52:00 +0200
Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object?
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de>
Message-ID: <3BA75180.79470CAC@lemburg.com>

Martin von Loewis wrote:
> > That won't be possible without breaking user code since
> > it is well documented that codecs.lookup() returns a tuple.
> Suppose codecs.lookup would return an instance of
>
> _fields = {'encode':0,'decode':1,'reader':2,'writer':3}
> class CodecInfo(tuple):
>     __dynamic__ = 0
>     def __getattr__(self, name):
>         try:
>             return self[_fields[name]]
>         except KeyError:
>             raise AttributeError, name
>
> What user code exactly would break? Would that be a serious problem?

All code which assumes a tuple as return value. It's hard to say how much code makes such an assumption. Most Python code probably only uses the sequence interface, but the C interface was deliberately designed to return tuples so that C programmers can easily access the data.

> > BTW, I don't think that adding a class CodecInfo which takes
> > the encoding name as constructor argument would introduce
> > much inflation of functions here. You will have to provide
> > such a class anyway to achieve what you are looking for, so
> > I guess this is the way to go.
>
> I guess not. If the codec.lookup return value is changed, then I can
> write
>
>     encoder = codecs.lookup("latin-1").encode
>
> Without that, I have to write
>
>     encoder = codecs.CodecInfo(codecs.lookup("latin-1")).encode
>
> This is overly complicated.

No...

    codecs.CodecInfo("latin-1").encode

The __init__ constructor can do the call to lookup() and apply the needed initialization of the attributes.

> > > It may be that an inherited tuple class might achieve the same
> > > effect. Can you identify the places where codecs.lookup is assumed to
> > > return tuples?
> >
> > I'd rather not make the interface more complicated. The C side
> > certainly cannot be changed for the reasons given above and
> > Python users could choose your new CodecInfo class to get access
> > to a nicer interface.
>
> What code exactly would have to change if I wanted lookup to return a
> CodecInfo object?

I don't see a need to argue over this. It's no use putting a lot of work into inventing some overly complex (subclassing types, etc.)
strategy to maintain backwards compatibility when an easy solution is so close at hand.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From mal@lemburg.com Tue Sep 18 14:53:18 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 18 Sep 2001 15:53:18 +0200
Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object?
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de> <00ea01c1402b$c3001b20$0900a8c0@spiff>
Message-ID: <3BA751CE.3EF9F860@lemburg.com>

Fredrik Lundh wrote:
>
> martin wrote:
> > I guess not. If the codec.lookup return value is changed, then I can
> > write
> >
> >     encoder = codecs.lookup("latin-1").encode
> >
> > Without that, I have to write
> >
> >     encoder = codecs.CodecInfo(codecs.lookup("latin-1")).encode
>
> or you could make things more readable, and add explicit
> get-functions for each property:
>
>     encoder = codecs.getencoder("latin-1")
>
> much easier to understand, especially for casual users
> (cf. os.path.getsize etc)

+1

Funny, the C API provides APIs for this already ;-)

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From loewis@informatik.hu-berlin.de Tue Sep 18 17:21:02 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Tue, 18 Sep 2001 18:21:02 +0200 (MEST)
Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object?
In-Reply-To: <200109181324.JAA14154@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Tue, 18 Sep 2001 09:24:20 -0400)
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de> <200109181324.JAA14154@cj20424-a.reston1.va.home.com>
Message-ID: <200109181621.SAA13879@pandora.informatik.hu-berlin.de>

> > _fields = {'encode':0,'decode':1,'reader':2,'writer':3}
> > class CodecInfo(tuple):
> >     __dynamic__ = 0
> >     def __getattr__(self, name):
> >         try:
> >             return self[_fields[name]]
> >         except KeyError:
> >             raise AttributeError, name
>
> You want to change that raise statement into
>
>     return tuple.__getattr__(self, name)
>
> Remember, new-style __getattr__ *replaces* the built-in getattr
> operation

Originally, I had _fields as a class attribute, and was using self._fields inside getattr, which caused a StackOverflow. I could not figure out why it wouldn't just find _fields in the class before invoking __getattr__...

> (I'm still considering whether this is too backwards incompatible; we
> could have two getattr hooks, one old-style and one new-style.)

So far, it is not really incompatible, since it only applies to new-style classes. It is confusing to long-time users, so it deserves documentation.

Regards,
Martin

From loewis@informatik.hu-berlin.de Tue Sep 18 17:23:17 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Tue, 18 Sep 2001 18:23:17 +0200 (MEST)
Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object?
In-Reply-To: <3BA75180.79470CAC@lemburg.com> (mal@lemburg.com)
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de> <3BA75180.79470CAC@lemburg.com>
Message-ID: <200109181623.SAA14341@pandora.informatik.hu-berlin.de>

> > What user code exactly would break? Would that be a serious problem?
>
> All code which assumes a tuple as return value. It's hard to
> say how much code makes such an assumption. Most Python
> code probably only uses the sequence interface, but the C interface
> was deliberately designed to return tuples so that C programmers
> can easily access the data.

I believe that code would continue to work if you got an instance of a tuple subtype.

> I don't see a need to argue over this. It's no use putting
> a lot of work into inventing some overly complex (subclassing
> types, etc.) strategy to maintain backwards compatibility
> when an easy solution is so close at hand.

Subclassing tuples is not at all overly complex.

Regards,
Martin

From guido@python.org Tue Sep 18 17:34:33 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Sep 2001 12:34:33 -0400
Subject: [Python-Dev] __getattr__
In-Reply-To: Your message of "Tue, 18 Sep 2001 18:21:02 +0200."
<200109181621.SAA13879@pandora.informatik.hu-berlin.de>
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de> <200109181324.JAA14154@cj20424-a.reston1.va.home.com> <200109181621.SAA13879@pandora.informatik.hu-berlin.de>
Message-ID: <200109181634.f8IGYYW28171@odiug.digicool.com>

(Changing subject)

> > > _fields = {'encode':0,'decode':1,'reader':2,'writer':3}
> > > class CodecInfo(tuple):
> > >     __dynamic__ = 0
> > >     def __getattr__(self, name):
> > >         try:
> > >             return self[_fields[name]]
> > >         except KeyError:
> > >             raise AttributeError, name
> >
> > You want to change that raise statement into
> >
> >     return tuple.__getattr__(self, name)
> >
> > Remember, new-style __getattr__ *replaces* the built-in getattr
> > operation
>
> Originally, I had _fields as a class attribute, and was using
> self._fields inside getattr, which caused a StackOverflow. I could not
> figure out why it wouldn't just find _fields in the class before
> invoking __getattr__...
>
> > (I'm still considering whether this is too backwards incompatible; we
> > could have two getattr hooks, one old-style and one new-style.)
>
> So far, it is not really incompatible, since it only applies to
> new-style classes. It is confusing to long-time users, so it deserves
> documentation.

It seems to be the number one point of confusion. It's also one of the few places where you have to change your code when porting a class to the new style -- which is otherwise very simple: either place a ``__metaclass__ = type'' line in your module or make all your base classes derive from object.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From mal@lemburg.com Tue Sep 18 18:05:04 2001
From: mal@lemburg.com (M.-A.
Lemburg) Date: Tue, 18 Sep 2001 19:05:04 +0200 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109171722.NAA01422@cj20424-a.reston1.va.home.com> Message-ID: <3BA77EC0.711B2EEF@lemburg.com> Guido van Rossum wrote: > > > The only notable exception is > > the cStringIO module -- but this could probably be changed to > > be buffer interface compliant too. > > Sure, just submit a patch. Done. See #462596. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Tue Sep 18 18:35:45 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 18 Sep 2001 19:35:45 +0200 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de> <3BA75180.79470CAC@lemburg.com> <200109181623.SAA14341@pandora.informatik.hu-berlin.de> Message-ID: <3BA785F1.C07E82F7@lemburg.com> Martin von Loewis wrote: > > > > What user code exactly would break? Would that be a serious problem? > > > > All code which assumes a tuple as return value. It's hard to > > say how much code makes such an assumption. Most Python > > code probably only uses the sequence interface, but the C interface > > was deliberately designed to return tuples so that C programmers > > can easily access the data. > > I believe that code would continue to work if you got a instance of a > tuple subtype. It might work at Python level, maybe even at C level, but I really don't see the point in trying to hack up a new type just for this purpose. 
Here's an implementation which pretty much solves the "problem":

--
### Helpers for codec lookup

def getencoder(encoding):

    """ Look up the codec for the given encoding and return
        its encoder function.

        Raises a LookupError in case the encoding cannot be found.

    """
    return lookup(encoding)[0]

def getdecoder(encoding):

    """ Look up the codec for the given encoding and return
        its decoder function.

        Raises a LookupError in case the encoding cannot be found.

    """
    return lookup(encoding)[1]

def getreader(encoding):

    """ Look up the codec for the given encoding and return
        its StreamReader class or factory function.

        Raises a LookupError in case the encoding cannot be found.

    """
    return lookup(encoding)[2]

def getwriter(encoding):

    """ Look up the codec for the given encoding and return
        its StreamWriter class or factory function.

        Raises a LookupError in case the encoding cannot be found.

    """
    return lookup(encoding)[3]
--

If no one objects, I'll check these into CVS along with some docs for libcodecs.tex.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company: http://www.egenix.com/
Python Software: http://www.lemburg.com/python/

From chrishbarker@home.net Tue Sep 18 19:47:36 2001
From: chrishbarker@home.net (Chris Barker)
Date: Tue, 18 Sep 2001 11:47:36 -0700
Subject: [Python-Dev] MacPython and line-endings
References: <20010908213026.DD94E139843@oratrix.oratrix.nl>
Message-ID: <3BA796C8.7E05841B@home.net>

This is a multi-part message in MIME format.
--------------54C7C351FE079B794391A813
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Jack Jansen wrote:
> - On input, unix line-endings are now acceptable for all text files. This
>   is an experimental feature (awaiting a general solution, for which a
>   PEP has been promised but not started yet, the guilty parties know who
>   they are:-), and it can be turned off with a preference.
Jack, I don't know if I qualify as one of the "guilty" parties, but I did volunteer to help with a PEP about this, and I'd still like to. I do have some ideas about what I'd like to see in that PEP. The one thing I have done is write a prototype in pure Python for how I would like platform-neutral text files to work. I've enclosed it with this message, and invite comments.

Has anyone started this PEP yet? If so, I'd like to help; if not, then the following is a very early draft of my thoughts. Note that I am writing this from memory, without going back to the archives to see what all the comments were at the time. I will do that before I call this a PEP. Here are my quick thoughts:

This started (the recent thread, anyway) with the need for MacPython (with the introduction of OS X) to be able to read both traditional Mac-style text files and Unix-style text files. An import hook was suggested, but then it was brought up that a lot of Python code can be read in other ways than an import, from execfile() and a whole lot of others, so an import hook would not be enough.

In general, the problem stems from the fact that while Python knows what system it is running on, a file that is being read may or may not be on that same system. This is most egregious with OS X, as you essentially have both Unix and MacOS running on the same machine at the same time, often sharing a file system. The issue also comes up with heterogeneous networks, where the file might reside on a server running on a different system than Python, and that file may be accessed by various systems. Some servers can do line-feed translation on the fly, but this is not universal or foolproof.

In addition to Python code, many Python programs need to read and write text files that are not in a native format, and the format may not be known by the programmer when the code is written.

My proposed solution to these problems is to have a new type of file: a "Universal" text file.
This would be a text file that would do line-feed translation to the internal representation on the fly as the file was being read (like the current text file type), but it would translate any of the known text file formats automatically (\r\n, \r, \n -- any others???). When the file was being written to, a single terminator would have to be specified, defaulting to the native one, or in the case of a file opened for appending, perhaps the one in the file when it is opened. The user could specify a non-native terminator when opening a file for writing.

Issues: The two big issues that came up in the discussion were backward compatibility and performance:

1) The Python open() function currently defaults to a text file type. However, on Posix systems, there is no difference between a text file and a binary file, so many programmers writing code that is designed to run only on such systems left the "b" flag off when opening files for binary reading and writing. If the behaviour of a file opened without the binary flag were to change, a lot of code would break.

2) In recent versions of Python, a lot of effort was put into improving performance of line-oriented text file reading. These optimisations require the use of native line endings. In order to get similar performance with non-native endings, some portions of the C stdio library would have to be re-written. This is a major undertaking, and no one has stepped up to volunteer.

The proposed solution to both of these problems is to introduce a new flag to the open() function: "t". If the "t" flag is present, the function returns a Universal Text File, rather than a standard text file. As this is a new flag, no old code should be broken. The default would return a standard text file with the current behaviour. This would allow the implementation to be written in a way that was robust, but perhaps not have optimum performance. If performance were critical, a programmer could always use the old-style text file.
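In outline, the on-the-fly translation described above amounts to normalizing all three known terminators to one internal representation. A minimal sketch, for illustration only (the names here are mine, not from the attached prototype):

```python
def normalize_endings(text):
    """Translate DOS ("\r\n"), old-Mac ("\r") and Unix ("\n") line
    terminators to a single internal representation ("\n").

    Mixed terminators within one file are handled, since each kind
    is translated independently."""
    # Replace the two-character DOS terminator first, so the bare-\r
    # rule below cannot split it into two line breaks.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```

The order of the two replacements is the only subtle point: doing the bare "\r" substitution first would turn every "\r\n" into two line breaks.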
If, at some point, code is written that allows the performance of Universal Text Files to approach that of standard text files, perhaps the two could be merged. It is unfortunate that the default would be the performance-optimised but less generally useful case, but that is a reasonable price to be paid for backward compatibility. Perhaps the default could be changed at some point in the future when other incompatibilities are introduced (Python 3?). In the case of Python code being read, performance of the file read is unlikely to be critical to the performance of the application as a whole.

Issues / questions:

Some systems (VMS?) store text files in the file system as a series of lines, rather than just a string of bytes like most common systems today. It would take a little more code to accommodate this, but it could be done.

Should a file being read be required to have a single line-termination type, or could they be mixed and matched? The prototype code allows mix and match, but I'm not married to that idea. If it requires a single terminator, then some performance could be gained by checking the terminator type when opening the file, and using the existing native text file code when it is a native file.

Other issues???

I'd love to hear all your feedback on this write-up, as well as my code. Please either CC me or the MacPython list, as I'm not subscribed to python-dev.

-Chris

--
Christopher Barker, Ph.D.
ChrisHBarker@home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------

--------------54C7C351FE079B794391A813 Content-Type: text/plain; charset=us-ascii; name="TextFile.py" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="TextFile.py"

#!/usr/bin/env python
"""
TextFile.py : a module that provides a UniversalTextFile class, and a
replacement for the native python "open" command that provides an
interface to that class.

It would usually be used as:

from TextFile import open

then you can use the new open just like the old one (with some added
flags and arguments), or:

import TextFile
file = TextFile.open(filename, flags, [bufsize], [LineEndingType], [LineBufferSize])

please send bug reports, helpful hints, and/or feature requests to:

Chris Barker
ChrisHBarker@home.net

Copyright/licence is the same as whatever version of python you are
running.
"""

import os

## Re-map the open function
_OrigOpen = open

def open(filename, flags="", bufsize=-1, LineEndingType="native", LineBufferSize=""):
    """ A new open function, that returns a regular python file object
    for the old calls, and returns a new nifty universal text file
    when required.

    This works just like the regular open command, except that a new
    flag and a new parameter have been added.

    The new flag is "t", which indicates that the file to be opened is
    a universal text file. While the standard open() function defaults
    to a text file, on Posix systems there is no difference between a
    text file and a binary file, so there is a lot of code out there
    that opens files as text when a binary file is really required. 
    This code currently works just fine on Posix systems, so it was
    necessary to introduce a new flag to maintain backward
    compatibility. The old-style, line-ending-dependent text file will
    also provide better performance.

    To Call:

    file = open(filename, flags="", bufsize=-1, LineEndingType="native"):

    - filename is the name of the file to be opened
    - flags is a string of one letter flags, the same as the standard
      open command, plus a "t" for universal text file.
    - - "b" means binary file, this returns the standard binary file object
    - - "t" means universal text file
    - - "r" for read only
    - - "w" for write. If there are both "w" and "t" then the user can
        specify a line ending type to be used with the LineEndingType
        parameter.
    - - "a" means append to existing file
    - bufsize specifies the buffer size to be used by the system. Same
      as the regular open function
    - LineEndingType is used only for writing (and appending) files,
      to specify a non-native line ending to be written.
    - - The options are: "native", "DOS", "Posix", "Unix", "Mac", or
        the characters themselves ("\r\n", etc.). "native" will result
        in using the standard file object, which uses whatever is
        native for the system that python is running on.
    - LineBufferSize is the size of the buffer used to read data in a
      readline() operation. The default is currently set to 200
      characters. If you will be reading files with many lines over
      200 characters long, you should set this number to the largest
      expected line length.

    NOTE: I'm sure the flag checking could be more robust. 
""" if "t" in flags: # this is a universal text file if ("w" in flags) and (not "w+" in flags) and LineEndingType == "native": return _OrigOpen(filename,flags.replace("t",""), bufsize) return UniversalTextFile(filename,flags,LineEndingType,LineBufferSize) else: # this is a regular old file return _OrigOpen(filename,flags,bufsize) class UniversalTextFile: """ A class that acts just like a python file object, but has a mode that allows the reading of arbitrary formated text files, i.e. with either Unix, DOS or Mac line endings. [\n , \r\n, or \r] To keep it truly universal, it checks for each of these line ending possibilities at every line, so it should work on a file with mixed endings as well. """ def __init__(self,filename,flags = "",LineEndingType = "native",LineBufferSize = ""): self._file = _OrigOpen(filename,flags.replace("t","")+"b") LineEndingType = LineEndingType.lower() if LineEndingType == "native": self.LineSep = os.linesep() elif LineEndingType == "dos": self.LineSep = "\r\n" elif LineEndingType == "posix" or LineEndingType == "unix" : self.LineSep = "\n" elif LineEndingType == "mac": self.LineSep = "\r" else: self.LineSep = LineEndingType ## some attributes self.closed = 0 self.mode = flags self.softspace = 0 if LineBufferSize: self._BufferSize = LineBufferSize else: self._BufferSize = 100 def readline(self): start_pos = self._file.tell() ##print "Current file posistion is:", start_pos line = "" TotalBytes = 0 Buffer = self._file.read(self._BufferSize) while Buffer: ##print "Buffer = ",repr(Buffer) newline_pos = Buffer.find("\n") return_pos = Buffer.find("\r") if return_pos == newline_pos-1 and return_pos >= 0: # we have a DOS line line = Buffer[:return_pos]+ "\n" TotalBytes = newline_pos+1 break elif ((return_pos < newline_pos) or newline_pos < 0 ) and return_pos >=0: # we have a Mac line line = Buffer[:return_pos]+ "\n" TotalBytes = return_pos+1 break elif newline_pos >= 0: # we have a Posix line line = Buffer[:newline_pos]+ "\n" TotalBytes = 
newline_pos+1 break else: # we need a larger buffer NewBuffer = self._file.read(self._BufferSize) if NewBuffer: Buffer = Buffer + NewBuffer else: # we are at the end of the file, without a line ending. self._file.seek(start_pos + len(Buffer)) return Buffer self._file.seek(start_pos + TotalBytes) return line def readlines(self,sizehint = None): """ readlines acts like the regular readlines, except that it understands any of the standard text file line endings ("\r\n", "\n", "\r"). If sizehint is used, it will read a a maximum of that many bytes. It will never round up, as the regular readline sometimes does. This means that if your buffer size is less than the length of the next line, you'll get an empty string, which could incorrectly be interpreted as the end of the file. """ if sizehint: Data = self._file.read(sizehint) else: Data = self._file.read() if len(Data) == sizehint: #print "The buffer is full" FullBuffer = 1 else: FullBuffer = 0 Data = Data.replace("\r\n","\n").replace("\r","\n") Lines = [line + "\n" for line in Data.split('\n')] ## If the last line is only a linefeed it is an extra line if Lines[-1] == "\n": del Lines[-1] ## if it isn't then the last line didn't have a linefeed, so we need to remove the one we put on. else: ## or it's the end of the buffer if FullBuffer: self._file.seek(-(len(Lines[-1])-1),1) # reset the file position del(Lines[-1]) else: Lines[-1] = Lines[-1][:-1] return Lines def readnumlines(self,NumLines = 1): """ readnumlines is an extension to the standard file object. It returns a list containing the number of lines that are requested. I have found this to be very useful, and allows me to avoid the many loops like: lines = [] for i in range(N): lines.append(file.readline()) Also, If I ever get around to writing this in C, it will provide a speed improvement. 
""" Lines = [] while len(Lines) < NumLines: Lines.append(self.readline()) return Lines def read(self,size = None): """ read acts like the regular read, except that it tranlates any of the standard text file line endings ("\r\n", "\n", "\r") into a "\n" If size is used, it will read a maximum of that many bytes, before translation. This means that if the line endings have more than one character, the size returned will be smaller. This could be fixed, but it didn't seem worth it. If you want that much control, use a binary file. """ if size: Data = self._file.read(size) else: Data = self._file.read() return Data.replace("\r\n","\n").replace("\r","\n") def write(self,string): """ write is just like the regular one, except that it uses the line separator specified when the file was opened for writing or appending. """ self._file.write(string.replace("\n",self.LineSep)) def writelines(self,list): for line in list: self.write(line) # The rest of the standard file methods mapped def close(self): self._file.close() self.closed = 1 def flush(self): self._file.flush() def fileno(self): return self._file.fileno() def seek(self,offset,whence = 0): self._file.seek(offset,whence) def tell(self): return self._file.tell() --------------54C7C351FE079B794391A813-- From fredrik@effbot.org Tue Sep 18 19:53:08 2001 From: fredrik@effbot.org (Fredrik Lundh) Date: Tue, 18 Sep 2001 20:53:08 +0200 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? 
References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de> <3BA75180.79470CAC@lemburg.com> <200109181623.SAA14341@pandora.informatik.hu-berlin.de> <3BA785F1.C07E82F7@lemburg.com> Message-ID: <015601c14073$29bf8ac0$3afb42d5@hagrid> mal wrote: > Here's an implementation which pretty much solves the "problem": +1 (obviously ;-) > If noone objects, I'll check these into CVS along with some docs > for libcodecs.tex. go ahead. From Benjamin.Schollnick@usa.xerox.com Tue Sep 18 19:49:19 2001 From: Benjamin.Schollnick@usa.xerox.com (Schollnick, Benjamin) Date: Tue, 18 Sep 2001 14:49:19 -0400 Subject: [Python-Dev] RE: [Pythonmac-SIG] Mac Python and line-endings Message-ID: > Should a file being read be required to have a single line termination > type, or could they be mixed and matched? The prototype code allows mix > and match, but I'm not married to that idea. If it requires a single > terminator, then some performance could be gained by checking the > terminator type when opening the file, and using the existing native > text file code when it is a native file. I'm not aware of any type of text file that supports switching line delimiters inside of the same file.... Now, that doesn't mean it couldn't exist, but logically that would be a strange file.... I think it's an even bet that a file would have the same delimiter throughout the file. > My proposed solution to these problems is to have a new type of file: a > "Universal" text file. This would be a text file that would do line-feed > translation to the internal representation on the fly as the file was > being read (like the current text file type), but it would translate any > of the known text file formats automatically (\r\n, \r, \n Any > others???). That would be an interesting idea... 
I'm not sure how much of a performance hit we'd see, but that would certainly solve a PC / MAC issue I'm having.... (Any chance we could see the code?)

Regarding your points on changing the default text file type.... There are several different ways to solve this.

* Don't have the Universal be the default text file type; instead offer it as a different class, or as open('garbage.txt', "r", universal), or some other optional switch.
  - The advantage being that this gives the programmer control of the Universal usage.
* Just the opposite: the programmer explicitly tells Python not to support universal...
* Have the programmer subclass the File Type?
* Add a global directive?
* Specifically import a "universal" which will deprecate the "standard" text file IO routines...
  * Actually, I think this sounds like the easiest and fastest way to deal with it. This way, you could add an extension library to speed it up, or whatever... (This idea is very much along the lines (??) of the STRING / STROP import)

- Benjamin From guido@python.org Tue Sep 18 20:03:10 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Sep 2001 15:03:10 -0400 Subject: [Python-Dev] RE: [Pythonmac-SIG] Mac Python and line-endings In-Reply-To: Your message of "Tue, 18 Sep 2001 14:49:19 EDT." References: Message-ID: <200109181903.f8IJ3AQ01321@odiug.digicool.com> > I'm not aware of any type of text file that supports switching line > delimiters inside of the same file.... > > Now, that doesn't mean it couldn't exist, but logically that would be > a strange file.... > > I think it's an even bet that a file would have the same delimiter > throughout the file. I have observed this on Windows, where the text editor in VC++ can read files with \n line endings, and doesn't change those when it writes the file back, but always adds \r\n to lines it adds. So if you edit a file containing only \n line endings, inserting a few lines, you have mixed line endings. 
Also, Java supports this, and the algorithm to support it is not difficult: to read a line, read until you see either \r or \n; if you see \r, peek one character ahead and if that's a \n, include it in the line. (Haven't had the time to read the whole proposal, but a Java style text file implementation has been in my wish list for a long time.) --Guido van Rossum (home page: http://www.python.org/~guido/) From nathan@vividworks.com Tue Sep 18 20:11:47 2001 From: nathan@vividworks.com (Nathan Heagy) Date: Tue, 18 Sep 2001 13:11:47 -0600 (CST) Subject: [Python-Dev] RE: [Pythonmac-SIG] Mac Python and line-endings In-Reply-To: <200109181903.f8IJ3AQ01321@odiug.digicool.com> Message-ID: > Also, Java supports this, and the algorithm to support it is not > difficult: to read a line, read until you see either \r or \n; if you > see \r, peek one character ahead and if that's a \n, include it in the > line. What about the mac where \r *is* the line ending? Nathan From guido@python.org Tue Sep 18 20:16:42 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Sep 2001 15:16:42 -0400 Subject: [Python-Dev] RE: [Pythonmac-SIG] Mac Python and line-endings In-Reply-To: Your message of "Tue, 18 Sep 2001 13:11:47 MDT." References: Message-ID: <200109181916.f8IJGlv01419@odiug.digicool.com> > > Also, Java supports this, and the algorithm to support it is not > > difficult: to read a line, read until you see either \r or \n; if you > > see \r, peek one character ahead and if that's a \n, include it in the > > line. > > What about the mac where \r *is* the line ending? Works fine, as long as you only *peek* (i.e. don't actually consume the character following \r if it is not \n, so it's available for the next read). Requires a little smart buffer handling, which is one reason why it's hard to do using regular stdio. 
Also, interactive input must be treated specially (so that if the user types "foo" followed by \r, the peek doesn't force the user to type another line just so that we can peek at the character following the \r). --Guido van Rossum (home page: http://www.python.org/~guido/) From chrishbarker@home.net Tue Sep 18 20:39:55 2001 From: chrishbarker@home.net (Chris Barker) Date: Tue, 18 Sep 2001 12:39:55 -0700 Subject: [Python-Dev] Re: [Pythonmac-SIG] Mac Python and line-endings References: Message-ID: <3BA7A30B.5D48B18B@home.net> "Schollnick, Benjamin" wrote: > > Should a file being read be required to have a single line termination > > type, or could they be mixed and matched? The prototype code allows mix > > and match, but I'm not married to that idea. If it requires a single > > terminator, then some performance could be gained by checking the > > terminator type when opening the file, and using the existing native > > text file code when it is a native file. > > I'm not aware of any type of text file that supports switching line > delimiters > inside of the same file.... I agree that it wouldn't be generated on purpose, but I have seen files that got edited on different systems get mixed up... that doesn't mean we have to support it, though. > That would be an interesting idea... I'm not sure how much of a performance > hit > we'd see, but that would certainly solve a PC / MAC issue I'm having.... > (Any chance we could see the code?) I enclosed a Python version of the code with the message. If you didn't get it, let me know and I'll send you one directly. > Regarding your points on changing the default text file type.... > * Don't have the Universal be the default text file type, instead > offer it > as a different class, or as open('garbage.txt', "r", > universal), or some > other optional switch. Exactly. I proposed adding a "t" flag to open(). I guess I didn't write that as clearly as I might have liked. 
> * Just the opposite, the programmer explicitly tells Python not to > support > universal... Not good; old code could break. > * Have the programmer subclass the File Type? too much work > * Add a global directive? maybe... but it wouldn't allow mix and match > * Specifically import a "universal" which will deprecate the > "standard" text file > IO routines... That's an option too... but it would again not allow mix and match. Also, part of the point of this is to have Python use it when it imports code, so it would have to be pretty built in. > * Actually I think this sounds like the easiest and fastest > way to deal with it. > This way, you could add an extension library to speed > it up, or whatever... true, and that's exactly what I do with my prototype, which I am using in a bunch of my code already. Maybe some day I'll get around to writing it in C, although I'd love someone else to do it, as I am a pretty lame C programmer. -Chris -- Christopher Barker, Ph.D. 
ChrisHBarker@home.net --- --- --- http://members.home.net/barkerlohmann ---@@ -----@@ -----@@ ------@@@ ------@@@ ------@@@ Oil Spill Modeling ------ @ ------ @ ------ @ Water Resources Engineering ------- --------- -------- Coastal and Fluvial Hydrodynamics -------------------------------------- ------------------------------------------------------------------------ From phf@acm.org Tue Sep 18 20:29:48 2001 From: phf@acm.org (Peter H. Froehlich) Date: Tue, 18 Sep 2001 12:29:48 -0700 Subject: [Python-Dev] RE: [Pythonmac-SIG] Mac Python and line-endings In-Reply-To: <3BA7A3B8.8959AA4B@home.net> Message-ID: <200109181229.aa02775@gremlin-relay.ics.uci.edu> Hi there! On Tuesday, September 18, 2001, at 12:42 , Chris Barker wrote: > Nathan Heagy wrote: >> >>> Also, Java supports this, and the algorithm to support it is not >>> difficult: to read a line, read until you see either \r or \n; if you >>> see \r, peek one character ahead and if that's a \n, include it in the >>> line. >> >> What about the mac where \r *is* the line ending? > > then it won't be a \n, so it won't be included. I think the point is what to do if a file on the Mac includes the combination "\r\n" where the line should end at the "\r" but the "\n" should be returned as the next character or part of the next line... Peter (who thinks text files suck :-) -- Peter H. Froehlich @ http://www.ics.uci.edu/~pfroehli/ From guido@python.org Tue Sep 18 20:35:11 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Sep 2001 15:35:11 -0400 Subject: [Python-Dev] RE: [Pythonmac-SIG] Mac Python and line-endings In-Reply-To: Your message of "Tue, 18 Sep 2001 12:29:48 PDT." 
<200109181229.aa02775@gremlin-relay.ics.uci.edu> References: <200109181229.aa02775@gremlin-relay.ics.uci.edu> Message-ID: <200109181935.f8IJZCd01554@odiug.digicool.com> > I think the point is what to do if a file on the Mac includes the > combination "\r\n" where the line should end at the "\r" but the "\n" > should be returned as the next character or part of the next line... Then it's not a universal text file. --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Tue Sep 18 21:17:29 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Tue, 18 Sep 2001 16:17:29 -0400 Subject: [Python-Dev] Parametrized codecs Message-ID: Are there any codecs that can take parameters at initialization time? If not, how about supporting them? Here's a motivating example:

import codecs
f = codecs.open('output', 'wb', encoding='DES',  # arguments to open()
                key = '...', mode = DES.ECB)
f.write('This will be encrypted\n')
f.close()

For this to work, you'd need a few changes to codecs.py. Off the top of my head, I think StreamReader, StreamWriter, and StreamReaderWriter all need to grow **kwargs arguments for any additional arguments, as do codecs.open() and codecs.lookup(). Then someone could write a DES StreamReader/StreamWriter that took a key and feedback mode at creation time. (codecs.Codec instances are supposed to be stateless, right?) --amk From gmcm@hypernet.com Tue Sep 18 21:31:31 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Tue, 18 Sep 2001 16:31:31 -0400 Subject: [Python-Dev] Parametrized codecs In-Reply-To: Message-ID: <3BA776E3.16942.1164559D@localhost> Andrew wrote: > Are there any codecs that can take parameters at initialization > time? If not, how about supporting them? Here's a motivating > example: > > import codecs > f = codecs.open('output', 'wb', encoding='DES',  # arguments to open()
>                 key = '...', mode = DES.ECB)
> f.write('This will be encrypted\n')
> f.close()

RC4, maybe. 
But I think you're asking for trouble in trying to pretend a block-mode cipher is a stream. - Gordon From jeremy@zope.com Tue Sep 18 22:57:26 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Tue, 18 Sep 2001 17:57:26 -0400 (EDT) Subject: [Python-Dev] fredrik? Message-ID: <15271.49990.433899.692356@slothrop.digicool.com> I've tried to send mail to Fredrik as fredrik@pythonware.com and effbot@users.sourceforge.net. Both attempts today bounced. Has anyone else reached him lately? Perhaps a different email address? Jeremy From skip@pobox.com (Skip Montanaro) Tue Sep 18 23:22:57 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 18 Sep 2001 17:22:57 -0500 Subject: [Python-Dev] fredrik? In-Reply-To: <15271.49990.433899.692356@slothrop.digicool.com> References: <15271.49990.433899.692356@slothrop.digicool.com> Message-ID: <15271.51521.107812.977776@beluga.mojam.com> Jeremy> I've tried to send mail to Fredrik as fredrik@pythonware.com and Jeremy> effbot@users.sourceforge.net. Both attempts today bounced. Has Jeremy> anyone else reached him lately? Perhaps a different email Jeremy> address? No, but I've seen posts by him (responses to other email), so presumably he's receiving mail. It's also worth noting that there is another worm working its way through the Windows community (both servers and non-servers): http://news.cnet.com/news/0-1003-200-7215349.html?tag=lthd Skip From guido@python.org Tue Sep 18 23:50:36 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Sep 2001 18:50:36 -0400 Subject: [Python-Dev] fredrik? In-Reply-To: Your message of "Tue, 18 Sep 2001 17:57:26 EDT." <15271.49990.433899.692356@slothrop.digicool.com> References: <15271.49990.433899.692356@slothrop.digicool.com> Message-ID: <200109182250.SAA15559@cj20424-a.reston1.va.home.com> > I've tried to send mail to Fredrik as fredrik@pythonware.com and > effbot@users.sourceforge.net. Both attempts today bounced. Has anyone > else reached him lately? 
Perhaps a different email address? No. fredrik@effbot.org also bounces. --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@effbot.org Wed Sep 19 02:10:21 2001 From: fredrik@effbot.org (Fredrik Lundh) Date: Wed, 19 Sep 2001 03:10:21 +0200 Subject: [Python-Dev] fredrik? References: <15271.49990.433899.692356@slothrop.digicool.com> Message-ID: <027801c140a7$e217c810$3afb42d5@hagrid> > I've tried to send mail to Fredrik as fredrik@pythonware.com and > effbot@users.sourceforge.net. Both attempts today bounced. Has anyone > else reached him lately? Perhaps a different email address? mail to pythonware.com stopped working soon after the worm first appeared; don't know why (a copy of the bounce message would be appreciated). (effbot.org mail is currently routed via pythonware.com) mail to effbot@telia.com appears to work. if that doesn't work, send mail to effbot@hotmail.com and post a note to python-dev. From barry@zope.com Wed Sep 19 03:49:00 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 18 Sep 2001 22:49:00 -0400 Subject: [Python-Dev] Broken build Message-ID: <15272.1948.811531.692710@anthem.wooz.org> Does anybody else see this? ... gcc -g -O3 -Wall -Wstrict-prototypes -I. -I./Include -DHAVE_CONFIG_H -c ./Modules/posixmodule.c -o Modules/posixmodule.o ./Modules/posixmodule.c: In function `posix_nice': ./Modules/posixmodule.c:1145: warning: implicit declaration of function `getpriority' ./Modules/posixmodule.c:1145: `PRIO_PROCESS' undeclared (first use in this function) ./Modules/posixmodule.c:1145: (Each undeclared identifier is reported only once ./Modules/posixmodule.c:1145: for each function it appears in.) make: *** [Modules/posixmodule.o] Error 1 I just did a fresh cvs update -A, make distclean, configure, make. 
This on a RH6.1-ish Linux box, with uname -a: Linux anthem 2.2.18 #21 SMP Mon Jan 8 00:33:29 EST 2001 i686 unknown This bit of code doesn't appear to have changed in a while, so maybe something else in the configure/build process has changed to break this? From ./configure: ... checking for getpriority... yes ... checking for broken nice()... yes -Barry From skip@pobox.com (Skip Montanaro) Wed Sep 19 04:10:54 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 18 Sep 2001 22:10:54 -0500 Subject: [Python-Dev] Broken build In-Reply-To: <15272.1948.811531.692710@anthem.wooz.org> References: <15272.1948.811531.692710@anthem.wooz.org> Message-ID: <15272.3262.405850.412988@beluga.mojam.com> Based upon Barry's speculation that something might have changed in the configure process I mv'd my config.cache out of the way before configuring (like Barry, after a cvs up -A). I was a bit surprised to see these changes: % diff config.cache.save config.cache | less 185a186 > ac_cv_pthread_system_supported=${ac_cv_pthread_system_supported=no} 190c191 < ac_cv_sizeof_fpos_t=${ac_cv_sizeof_fpos_t=12} --- > ac_cv_sizeof_fpos_t=${ac_cv_sizeof_fpos_t=16} 194c195 < ac_cv_sizeof_off_t=${ac_cv_sizeof_off_t=4} --- > ac_cv_sizeof_off_t=${ac_cv_sizeof_off_t=8} Given that my system hasn't changed in the past couple of days, I wonder why it thought the default size of some of these offsets should change. (New update of configure, perhaps?) Still, I had no problem building on my Mandrake 8.0 system. Skip From guido@python.org Wed Sep 19 04:21:15 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 18 Sep 2001 23:21:15 -0400 Subject: [Python-Dev] Broken build In-Reply-To: Your message of "Tue, 18 Sep 2001 22:10:54 CDT." 
<15272.3262.405850.412988@beluga.mojam.com> References: <15272.1948.811531.692710@anthem.wooz.org> <15272.3262.405850.412988@beluga.mojam.com> Message-ID: <200109190321.XAA19446@cj20424-a.reston1.va.home.com> > Based upon Barry's speculation that something might have changed in the > configure process I mv'd my config.cache out of the way before configuring > (like Barry, after a cvs up -A). I was a bit surprised to see these > changes: > > % diff config.cache.save config.cache | less > 185a186 > > ac_cv_pthread_system_supported=${ac_cv_pthread_system_supported=no} > 190c191 > < ac_cv_sizeof_fpos_t=${ac_cv_sizeof_fpos_t=12} > --- > > ac_cv_sizeof_fpos_t=${ac_cv_sizeof_fpos_t=16} > 194c195 > < ac_cv_sizeof_off_t=${ac_cv_sizeof_off_t=4} > --- > > ac_cv_sizeof_off_t=${ac_cv_sizeof_off_t=8} > > Given that my system hasn't changed in the past couple of days, I wonder why > it thought the default size of some of these offsets should change. (New > update of configure, perhaps?) Still, I had no problem building on my > Mandrake 8.0 system. It has to do with this NEWS item: - Large file support (LFS) is now automatic when the platform supports it; no more manual configuration tweaks are needed. On Linux, at least, it's possible to have a system whose C library supports large files but whose kernel doesn't; in this case, large file support is still enabled but doesn't do you any good unless you upgrade your kernel or share your Python executable with another system whose kernel has large file support. I believe there's an SF bug report mentioning the same problem that Barry reports. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Wed Sep 19 04:35:58 2001 From: tim.one@home.com (Tim Peters) Date: Tue, 18 Sep 2001 23:35:58 -0400 Subject: [Python-Dev] Broken build In-Reply-To: <200109190321.XAA19446@cj20424-a.reston1.va.home.com> Message-ID: > I believe there's an SF bug report mentioning the same problem that > Barry reports. 
Not only that, but Barry reported fixing it two months ago(!): Date: 2001-07-11 18:10 Sender: bwarsaw Logged In: YES user_id=12800 Ploing! I just reported this same problem to python-dev. Attached is a patch that fixes it for me on my RH6.1-ish Linux box. That from http://sf.net/tracker/?group_id=5470&atid=105470&func=detail&aid=440522 It was reported (and closed) again in http://sf.net/tracker/?group_id=5470&atid=105470&func=detail&aid=443042 From barry@zope.com Wed Sep 19 04:36:34 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 18 Sep 2001 23:36:34 -0400 Subject: [Python-Dev] Broken build References: <15272.1948.811531.692710@anthem.wooz.org> <15272.3262.405850.412988@beluga.mojam.com> Message-ID: <15272.4802.741692.170176@anthem.wooz.org> Guido informs me that there was a similar bug report for 2.2a1, which Thomas closed. It's 443042, which I'm re-opening. Thomas, feel free to assign it to me for further investigation. -Barry From barry@zope.com Wed Sep 19 04:51:56 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 18 Sep 2001 23:51:56 -0400 Subject: [Python-Dev] Broken build References: <200109190321.XAA19446@cj20424-a.reston1.va.home.com> Message-ID: <15272.5724.442321.507059@anthem.wooz.org> >>>>> "TP" == Tim Peters writes: >> I believe there's an SF bug report mentioning the same problem >> that Barry reports. TP> Not only that, but Barry reported fixing it two months ago(!): Oh jeez. | Date: 2001-07-11 18:10 | Sender: bwarsaw | Logged In: YES | user_id=12800 | Ploing! I just reported this same problem to python-dev. | Attached is a patch that fixes it for me on my RH6.1-ish | Linux box. Yeah, but posixmodule.c has this bit in it:

#ifdef HAVE_NICE
#if defined(HAVE_BROKEN_NICE) && defined(HAVE_SYS_RESOURCE_H)
#if defined(HAVE_GETPRIORITY) && !defined(PRIO_PROCESS)
#include <sys/resource.h>
#endif
#endif

So why isn't sys/resource.h getting included? -Barry From barry@zope.com Wed Sep 19 05:15:26 2001 From: barry@zope.com (Barry A. 
Warsaw) Date: Wed, 19 Sep 2001 00:15:26 -0400 Subject: [Python-Dev] Broken build References: <200109190321.XAA19446@cj20424-a.reston1.va.home.com> <15272.5724.442321.507059@anthem.wooz.org> Message-ID: <15272.7134.282400.577500@anthem.wooz.org> >>>>> "BAW" == Barry A Warsaw writes: BAW> Yeah, but posixmodule.c has this bit in it: | #ifdef HAVE_NICE | #if defined(HAVE_BROKEN_NICE) && defined(HAVE_SYS_RESOURCE_H) | #if defined(HAVE_GETPRIORITY) && !defined(PRIO_PROCESS) | #include <sys/resource.h> | #endif | #endif BAW> So why isn't sys/resource.h getting included? ...because configure isn't finding it: checking for sys/resource.h... no so HAVE_SYS_RESOURCE_H isn't getting defined in pyconfig.h. It's there in /usr/include though: % ls -l /usr/include/sys/resource.h -rw-r--r-- 1 root root 3185 Sep 20 1999 /usr/include/sys/resource.h Strange. More investigation tomorrow. -Barry From barry@zope.com Wed Sep 19 05:38:29 2001 From: barry@zope.com (Barry A. Warsaw) Date: Wed, 19 Sep 2001 00:38:29 -0400 Subject: [Python-Dev] Broken build References: <200109190321.XAA19446@cj20424-a.reston1.va.home.com> <15272.5724.442321.507059@anthem.wooz.org> <15272.7134.282400.577500@anthem.wooz.org> Message-ID: <15272.8517.922778.538026@anthem.wooz.org> >>>>> "BAW" == Barry A Warsaw writes: BAW> So why isn't sys/resource.h getting included? BAW> ...because configure isn't finding it: BAW> checking for sys/resource.h... no Closer... /usr/include/sys/resource.h #include's /usr/include/bits/resource.h which in turn #include's /usr/include/asm/resource.h. asm/resource.h has this:

/*
 * SuS says limits have to be unsigned.
 * Which makes a ton more sense anyway.
 */
#define RLIM_INFINITY (~0UL)

which is not protected by #defines. bits/resource.h has this:

/* Value to indicate that there is no limit. */
#ifndef __USE_FILE_OFFSET64
# define RLIM_INFINITY ((long int)(~0UL >> 1))
#else
# define RLIM_INFINITY 0x7fffffffffffffffLL
#endif

which itself doesn't #undef RLIM_INFINITY or otherwise protect itself. 
So when configure runs we get a bunch of redefined warnings. Compiling just this simple file: -------------------- snip snip -------------------- #include <sys/resource.h> int main() { return 0; } -------------------- snip snip -------------------- Gives us: -------------------- snip snip -------------------- In file included from /usr/include/sys/resource.h:25, from main.c:1: /usr/include/bits/resource.h:109: warning: `RLIM_INFINITY' redefined /usr/include/asm/resource.h:26: warning: this is the location of the previous definition -------------------- snip snip -------------------- which must confuse configure into thinking resource.h isn't available. Looks like a glibc update might be in order: http://sources.redhat.com/ml/bug-glibc/2000-02/msg00026.html http://sources.redhat.com/ml/bug-glibc/2000-02/msg00027.html http://sources.redhat.com/ml/bug-glibc/2000-10/msg00078.html http://sources.redhat.com/ml/bug-glibc/2000-10/msg00084.html The odd bit is that I'm still running the 2.2.18 kernel, although glibc is fairly old. % rpm -q glibc glibc-2.1.2-11 Wonder if I can find glibc-2.1.3 somewhere. then-again-i've-been-looking-for-a-good-excuse-to-install-mandrake-8.0-ly y'rs, -Barry From martin@loewis.home.cs.tu-berlin.de Wed Sep 19 06:43:19 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 19 Sep 2001 07:43:19 +0200 Subject: [Python-Dev] Parametrized codecs Message-ID: <200109190543.f8J5hJ201414@mira.informatik.hu-berlin.de> > Are there any codecs that can take parameters at initialization time? _codecs.utf_16_encode takes an additional parameter to request a certain endianness. This is not used in the codecs modules, though, since there are also functions for each endianness. > If not, how about supporting them? I think this would be up to the individual codecs to support; I'm -0 on the changes to codecs.open.
>f = codecs.open('output', 'wb', encoding='DES', # arguments to open() > key = '...', mode = DES.ECB) Would this better be spelled f = codecs.open('output', 'wb', encoding='DES-ECB', # arguments to open() key = '...') To write it your way: Where did you get the DES module from? If the user has to specifically access the DES module, anyway, how about f = DES.open('output', 'wb', key = '...', mode = DES.ECB) > For this to work, you'd need a few changes to codecs.py. Off the > top of my head, I think StreamReader, StreamWriter, and > StreamReaderWriter all need to grow **kwargs arguments for any > additional arguments, Why is that? Shouldn't you instead provide a StreamReader etc. subclass that just accepts the additional arguments? Users could then write writer = codecs.lookup("DES")[3] f = writer(open('output', 'wb'), key="...", mode=DES.ECB) > as do codecs.open() and codecs.lookup() I cannot see at all why you want to pass arguments to lookup. Instead, wouldn't it be sufficient to pass them to the stream readers and codec functions returned? This also complicates caching of codecs, and probably the implementation of the codecs: how is lookup supposed to preserve the state in the tuple being returned? On arguments to open: mixing arguments that are consumed at different places (i.e. in .open, and in the reader/writer) makes me nervous. If we ever find the need for further standard arguments to codecs.open, it may break codecs that use these argument names for their own purposes. Regards, Martin From loewis@informatik.hu-berlin.de Wed Sep 19 07:28:36 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 19 Sep 2001 08:28:36 +0200 (MEST) Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? 
In-Reply-To: <3BA785F1.C07E82F7@lemburg.com> (mal@lemburg.com) References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de> <3BA75180.79470CAC@lemburg.com> <200109181623.SAA14341@pandora.informatik.hu-berlin.de> <3BA785F1.C07E82F7@lemburg.com> Message-ID: <200109190628.IAA11208@pandora.informatik.hu-berlin.de> > If noone objects, I'll check these into CVS along with some docs > for libcodecs.tex. Sounds good to me. Martin From mal@lemburg.com Wed Sep 19 12:01:39 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 19 Sep 2001 13:01:39 +0200 Subject: [Python-Dev] Parametrized codecs References: Message-ID: <3BA87B13.511A89FC@lemburg.com> Andrew Kuchling wrote: > > Are there any codecs that can take parameters at initialization time? > If not, how about supporting them? Even though this is not documented in the Unicode PEP, the extensibility of the constructor using extra (keyword) arguments is intended. > Here's a motivating example: > > import codecs > f = codecs.open('output', 'wb', encoding='DES', # arguments to open() > key = '...', mode = DES.ECB) > f.write('This will be encrypted\n') > f.close() > > For this to work, you'd need a few changes to codecs.py. Off the top > of my head, I think StreamReader, StreamWriter, and StreamReaderWriter > all need to grow **kwargs arguments for any additional arguments, as > do codecs.open() and codecs.lookup(). The codec design allows extending the constructors using keyword arguments, however I'd rather not add a generic **kws argument to the base classes since this would hide errors (unknown or unsupported keywords, mispellings, etc.). The codec.open() API is different though: it could grow a **kws argument which is then apply()ed to the codec constructor. 
The constructor will then raise any exceptions related to malformed keyword parameters. There's no need to add **kws to codec.lookup(). > Then someone could write a DES StreamReader/Streamwriter that took a > key and feedback mode at creation time. (codecs.Codec instances are > supposed to be stateless, right?) The encoder and decoder functions must work stateless. StreamReader/Writer instances should be used for everything that has to do with state since these are instantiatied per use whereas encoder and decoder functions fetched using codec.lookup() can be used multiple times. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Wed Sep 19 12:34:02 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 19 Sep 2001 13:34:02 +0200 Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109180802.KAA01481@pandora.informatik.hu-berlin.de> <3BA7042A.B9D5425F@lemburg.com> <200109180956.LAA02780@pandora.informatik.hu-berlin.de> <3BA75180.79470CAC@lemburg.com> <200109181623.SAA14341@pandora.informatik.hu-berlin.de> <3BA785F1.C07E82F7@lemburg.com> <200109190628.IAA11208@pandora.informatik.hu-berlin.de> Message-ID: <3BA882AA.443FDB70@lemburg.com> Martin von Loewis wrote: > > > If noone objects, I'll check these into CVS along with some docs > > for libcodecs.tex. > > Sounds good to me. Great. I just checked them in... -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From fdrake@acm.org Wed Sep 19 13:50:10 2001 From: fdrake@acm.org (Fred L. 
Drake) Date: Wed, 19 Sep 2001 08:50:10 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010919125010.D91A028845@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Miscellaneous minor updates, including docs for the new codec interfaces. From loewis@informatik.hu-berlin.de Wed Sep 19 14:23:40 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Wed, 19 Sep 2001 15:23:40 +0200 (MEST) Subject: [Python-Dev] why doesn't print pass unicode strings on to the file object? In-Reply-To: <200109181317.JAA14081@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Tue, 18 Sep 2001 09:17:07 -0400) References: <200109061656.SAA23098@pandora.informatik.hu-berlin.de> <3BA62EE9.D8C4F0A5@lemburg.com> <200109171722.NAA01422@cj20424-a.reston1.va.home.com> <200109180808.KAA01749@pandora.informatik.hu-berlin.de> <200109181317.JAA14081@cj20424-a.reston1.va.home.com> Message-ID: <200109191323.PAA08056@pandora.informatik.hu-berlin.de> > OK. That's very reasonable. > > What do we need to change to make this happen? Applying patch #462849 may be sufficient for the moment. If there are any file-like objects that are used in print but fail to convert Unicode objects, this hopefully will be found until the final release. Regards, Martin From akuchlin@mems-exchange.org Wed Sep 19 18:47:44 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 19 Sep 2001 13:47:44 -0400 Subject: [Python-Dev] Parametrized codecs In-Reply-To: <3BA776E3.16942.1164559D@localhost>; from gmcm@hypernet.com on Tue, Sep 18, 2001 at 04:31:31PM -0400 References: <3BA776E3.16942.1164559D@localhost> Message-ID: <20010919134744.C21095@ute.mems-exchange.org> On Tue, Sep 18, 2001 at 04:31:31PM -0400, Gordon McMillan wrote: >RC4, maybe. But I think you're asking for trouble in trying to >pretend a block-mode cipher is a stream. Dunno, but we'll hash that out on the python-crypto list. 
For now I just wanted to know if parametrized codecs would be an inherently bad idea. (Two ways of handling this would be requiring all write() calls to use a multiple of the block size, or reporting an error on the close() if you haven't written out an even number of blocks. I have no idea which is preferable; not sure I like either of them...) --amk From thomas.heller@ion-tof.com Wed Sep 19 18:51:53 2001 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Wed, 19 Sep 2001 19:51:53 +0200 Subject: [Python-Dev] ModuleFinder patch Message-ID: <00d501c14133$c4928a00$e000a8c0@thomasnotebook> I've uploaded a freeze/ModuleFinder patch to sf, #462936. This patch adds two improvements: 1. ModuleFinder now keeps track of which module is imported by whom. 2. ModuleFinder, when instantiated with the new scan_extdeps=1 argument, tries to track dependencies of builtin and extension modules. I'd love to hear some comments especially on the last point, so I'll shortly explain how it works. ModuleFinder starts a separate python interpreter process with a command line of '-S -v -c "import <module>"', pipes this through popen(), and scans the error-output for lines of the form 'import blabla # ....'. This is somewhat expensive, but it is the only way I've found so far to find out about these dependencies. Examples are: The multiarray extension needs the _numpy extension. Every extension using ExtensionClass needs this. cPickle needs copy_reg and other modules. Any other ideas?
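[The -v scanning approach Thomas describes can be sketched in a few lines. This is a loose reconstruction, not the patch itself: `subprocess.run` stands in for popen(), and the function name and regex are invented for illustration.]

```python
# Sketch of the scan described above: run a fresh interpreter with
# "-S -v -c 'import <module>'" and collect the "import <name> # ..."
# lines that -v writes to stderr.
import re
import subprocess
import sys

def scan_ext_deps(module_name):
    """Return the set of module names -v reports while importing module_name."""
    proc = subprocess.run(
        [sys.executable, "-S", "-v", "-c", "import " + module_name],
        capture_output=True, text=True)
    deps = set()
    for line in proc.stderr.splitlines():
        # -v lines look like "import zipimport # frozen" or
        # "import 'json.decoder' # <...SourceFileLoader...>"
        match = re.match(r"import '?([\w.]+)'?", line)
        if match:
            deps.add(match.group(1))
    deps.discard(module_name)
    return deps
```

[Note that everything imported during interpreter startup is reported too, so a real implementation would want to subtract a baseline run of `-c pass` before attributing dependencies to the target module.]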
Regards, Thomas From akuchlin@mems-exchange.org Wed Sep 19 19:01:55 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 19 Sep 2001 14:01:55 -0400 Subject: [Python-Dev] Parametrized codecs In-Reply-To: <3BA87B13.511A89FC@lemburg.com>; from mal@lemburg.com on Wed, Sep 19, 2001 at 01:01:39PM +0200 References: <3BA87B13.511A89FC@lemburg.com> Message-ID: <20010919140155.D21095@ute.mems-exchange.org> >The codec design allows extending the constructors using keyword >arguments, however I'd rather not add a generic **kws argument >to the base classes since this would hide errors (unknown or >unsupported keywords, mispellings, etc.). Isn't such an argument required at least for StreamReaderWriter, because it actually instantiates two objects? class StreamReaderWriter: def __init__(self, stream, Reader, Writer, errors='strict'): """ ... Reader, Writer must be factory functions or classes providing the StreamReader, StreamWriter interface resp. """ self.stream = stream self.reader = Reader(stream, errors) self.writer = Writer(stream, errors) Hmm... you'd better not need to have two different set of keyword arguments for Reader and Writer. Is that limitation acceptable? If it is, then the list of changes becomes 1) add **kw to codecs.open, and 2) add **kw to StreamReaderWriter's constructor, and apply it to the Reader() and Writer() calls. >The encoder and decoder functions must work stateless. OK, then they're not suitable for encryption (in general, though they'd work fine for ECB mode). 
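[The two changes Andrew lists could be sketched like this (in Python 3 spellings; `KwStreamReaderWriter` and the toy XOR "cipher" are invented stand-ins for the hypothetical DES codec, not stdlib additions):]

```python
# Change (2) above: a StreamReaderWriter variant that forwards extra
# keyword arguments to both the Reader and Writer factories, so one set
# of keywords parametrizes both directions.
import codecs
import io

class KwStreamReaderWriter(codecs.StreamReaderWriter):
    def __init__(self, stream, Reader, Writer, errors='strict', **kw):
        self.stream = stream
        self.reader = Reader(stream, errors, **kw)
        self.writer = Writer(stream, errors, **kw)

# A deliberately trivial parametrized codec standing in for DES:
class XorWriter(codecs.StreamWriter):
    def __init__(self, stream, errors='strict', key=0):
        codecs.StreamWriter.__init__(self, stream, errors)
        self.key = key
    def write(self, text):
        self.stream.write(bytes(b ^ self.key for b in text.encode('ascii')))

class XorReader(codecs.StreamReader):
    def __init__(self, stream, errors='strict', key=0):
        codecs.StreamReader.__init__(self, stream, errors)
        self.key = key
    def read(self, size=-1):
        return bytes(b ^ self.key for b in self.stream.read(size)).decode('ascii')

buf = io.BytesIO()
f = KwStreamReaderWriter(buf, XorReader, XorWriter, key=0x2A)
f.write('attack at dawn')     # key reaches the writer via **kw
buf.seek(0)
assert f.read() == 'attack at dawn'
```

[Both factories receive the same keywords, which is exactly the limitation discussed in this thread.]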
--amk From mark.favas@csiro.au Wed Sep 19 22:21:02 2001 From: mark.favas@csiro.au (Mark Favas) Date: Thu, 20 Sep 2001 05:21:02 +0800 Subject: [Python-Dev] Unaligned accesses in coercions, current CVS Message-ID: <3BA90C3E.F85B6EFC@csiro.au> On a Compaq Alpha (Tru64 Unix, 4.0F), the following code (from test_descr.py) produces unaligned access warnings when run: class L(long): pass coerce(L(0), 0) coerce(L(0), 0L) coerce(0, L(0)) coerce(0L, L(0)) Unaligned access pid=5070 va=0x1400c9266 pc=0x120031228 ra=0x120031218 inst=0xa6000000 Unaligned access pid=5070 va=0x1400d9086 pc=0x120031228 ra=0x120031218 inst=0xa6000000 Unaligned access pid=5070 va=0x1400c9266 pc=0x120031228 ra=0x120031218 inst=0xa6000000 Unaligned access pid=5070 va=0x1400c9266 pc=0x120031228 ra=0x120031218 inst=0xa6000000 I'll try to chase this further, but work is a bit fraught at the moment... If you want me to log it as a bug, let me know. -- Mark Favas - m.favas@per.dem.csiro.au CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA From guido@python.org Thu Sep 20 06:00:49 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Sep 2001 01:00:49 -0400 Subject: [Python-Dev] Unaligned accesses in coercions, current CVS In-Reply-To: Your message of "Thu, 20 Sep 2001 05:21:02 +0800." <3BA90C3E.F85B6EFC@csiro.au> References: <3BA90C3E.F85B6EFC@csiro.au> Message-ID: <200109200500.BAA31016@cj20424-a.reston1.va.home.com> > On a Compaq Alpha (Tru64 Unix, 4.0F), the following code (from > test_descr.py) produces unaligned access warnings when run: I think this is a duplicate of bug report #462848. See the second patch attached there and let me know if it fixes it!
--Guido van Rossum (home page: http://www.python.org/~guido/) From Mark.Favas@csiro.au Thu Sep 20 10:59:31 2001 From: Mark.Favas@csiro.au (Mark.Favas@csiro.au) Date: Thu, 20 Sep 2001 17:59:31 +0800 Subject: [Python-Dev] Unaligned accesses in coercions, current CVS Message-ID: <116D27C8E12BD411B3AB00B0D022B0B87C4673@yate.wa.CSIRO.AU> Yes, patch take 2 works just fine on Tru64 - thanks! MCF -----Original Message----- From: Guido van Rossum [mailto:guido@python.org] Sent: Thursday, 20 September 2001 1:01 PM To: Mark Favas Cc: python-dev@python.org Subject: Re: [Python-Dev] Unaligned accesses in coercions, current CVS > On a Compaq Alpha (Tru64 Unix, 4.0F), the following code (from > test_descr.py) produces unaligned access warnings when run: I think this is a duplicate of bug report #462848. See the second patch attached there and let me know if it fixes it! --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Sep 20 11:03:09 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 20 Sep 2001 12:03:09 +0200 Subject: [Python-Dev] Parametrized codecs References: <3BA87B13.511A89FC@lemburg.com> <20010919140155.D21095@ute.mems-exchange.org> Message-ID: <3BA9BEDD.946419AD@lemburg.com> Andrew Kuchling wrote: > > >The codec design allows extending the constructors using keyword > >arguments, however I'd rather not add a generic **kws argument > >to the base classes since this would hide errors (unknown or > >unsupported keywords, mispellings, etc.). > > Isn't such an argument required at least for StreamReaderWriter, > because it actually instantiates two objects? > > class StreamReaderWriter: > def __init__(self, stream, Reader, Writer, errors='strict'): > > """ ... > Reader, Writer must be factory functions or classes > providing the StreamReader, StreamWriter interface resp. > """ > self.stream = stream > self.reader = Reader(stream, errors) > self.writer = Writer(stream, errors) > > Hmm...
you'd better not need to have two different set of keyword > arguments for Reader and Writer. Is that limitation acceptable? Should be OK since both work on the same stream backend and thus will normally use the same encoding parameters. > If > it is, then the list of changes becomes 1) add **kw to codecs.open, > and 2) add **kw to StreamReaderWriter's constructor, and apply it to > the Reader() and Writer() calls. Right. > >The encoder and decoder functions must work stateless. > > OK, then they're not suitable for encryption (in general, though > they'd work fine for ECB mode). Right; would it be feasable to use ECB ciphers for the stateless part and also allow other modes for the StreamReader/Writer ? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Thu Sep 20 11:05:07 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 20 Sep 2001 12:05:07 +0200 Subject: [Python-Dev] Parametrized codecs References: <3BA776E3.16942.1164559D@localhost> <20010919134744.C21095@ute.mems-exchange.org> Message-ID: <3BA9BF53.67CA9FB4@lemburg.com> Andrew Kuchling wrote: > > On Tue, Sep 18, 2001 at 04:31:31PM -0400, Gordon McMillan wrote: > >RC4, maybe. But I think you're asking for trouble in trying to > >pretend a block-mode cipher is a stream. > > Dunno, but we'll hash that out on the python-crypto list. For now I > just wanted to know if parametrized codecs would be an inherently bad > idea. > > (Two ways of handling this would be requiring all write() calls to use > a multiple of the block size, or reporting an error on the close() if > you haven't written out an even number of blocks. I have no idea > which is preferable; not sure I like either of them...) I guess you'll need some kind of buffering in the reader/writer to handle this situation. 
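[The buffering suggested here might look like the following minimal sketch, which implements the report-an-error-on-close() option quoted above; the identity "cipher", the class name, and the 8-byte default block size are illustrative placeholders, not mxCrypto's actual code.]

```python
# Hold bytes back until a whole cipher block is available; flush only
# complete blocks; refuse to close() while a partial block remains.
import io

class BlockBufferedWriter:
    def __init__(self, stream, encrypt_block, block_size=8):
        self.stream = stream
        self.encrypt_block = encrypt_block   # callable: block -> block
        self.block_size = block_size
        self.pending = b''

    def write(self, data):
        self.pending += data
        while len(self.pending) >= self.block_size:
            block = self.pending[:self.block_size]
            self.pending = self.pending[self.block_size:]
            self.stream.write(self.encrypt_block(block))

    def close(self):
        if self.pending:
            raise ValueError('%d trailing byte(s) do not fill a block'
                             % len(self.pending))

buf = io.BytesIO()
w = BlockBufferedWriter(buf, lambda block: block, block_size=4)  # identity "cipher"
w.write(b'attack')
w.write(b'atdawn')            # 12 bytes -> three 4-byte blocks flushed
assert buf.getvalue() == b'attackatdawn'
w.close()                     # nothing pending, closes cleanly
```

[A real block mode would more likely pad the final partial block on close() than raise, which is the other option weighed in this thread.]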
The cipher.py tools in mxCrypto does this already, so it may serve as template for the needed code. (This is getting off-topic though for this list...) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From gmcm@hypernet.com Thu Sep 20 13:27:44 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Thu, 20 Sep 2001 08:27:44 -0400 Subject: [Python-Dev] Parametrized codecs In-Reply-To: <3BA9BF53.67CA9FB4@lemburg.com> Message-ID: <3BA9A880.18711.19F62524@localhost> MAL wrote: > Andrew Kuchling wrote: > > > > On Tue, Sep 18, 2001 at 04:31:31PM -0400, Gordon McMillan > > wrote: > > >RC4, maybe. But I think you're asking for trouble in trying to > > >pretend a block-mode cipher is a stream. > > > > Dunno, but we'll hash that out on the python-crypto list. For > > now I just wanted to know if parametrized codecs would be an > > inherently bad idea. > > > > (Two ways of handling this would be requiring all write() calls > > to use a multiple of the block size, or reporting an error on > > the close() if you haven't written out an even number of > > blocks. I have no idea which is preferable; not sure I like > > either of them...) > > I guess you'll need some kind of buffering in the reader/writer > to handle this situation. The cipher.py tools in mxCrypto does > this already, so it may serve as template for the needed code. Sigh. If it's a stream, you can hook a writer through IPC to a reader. Let's say the first 15 bytes are enough to tell the reading process whether it's interested in the rest. So unsupecting programmer sends 15 bytes to the writer, then waits for the reading process to tell him whether to continue. And waits. And waits... It's not a stream. > (This is getting off-topic though for this list...) 
It is on-topic to the extent that python dev-vers should be encouraged not to violate certain well-established notions. - Gordon From juergen.erhard@gmx.net Thu Sep 20 13:38:14 2001 From: juergen.erhard@gmx.net ("Jürgen A. Erhard") Date: Thu, 20 Sep 2001 14:38:14 +0200 Subject: [Python-Dev] copy, len and the like as 'object' methods? In-Reply-To: <3B83EE5B.CEB38C48@ActiveState.com> (message from Paul Prescod on Wed, 22 Aug 2001 09:39:39 -0800) References: <3B82AFFE.652A19F5@ActiveState.com> <200108220048.UAA24452@cj20424-a.reston1.va.home.com> <3B83DD5D.3260BC9C@ActiveState.com> <3B83EE5B.CEB38C48@ActiveState.com> Message-ID: <20092001.2@wanderer.local.jae.dyndns.org> >>>>> "Paul" == Paul Prescod writes: Paul> Also, every time we'd point to a feature of Python (OO Paul> syntax, exception handling, generators, *) that was clearly Paul> better than Perl, Randal claimed it was already slated to be Paul> fixed for Perl 6. We suggested he print t-shirts with that Paul> refrain. The funny thing is, you can make it to almost rhyme (enough for a pop song at least... you could do a filk on that, maybe). "Slated to be fixed / in Perl 6". ;-) Bye, J PS: Do I have to mention I'm behind in my email? ^___^ PPS: I don't have any SF-fandom and/or filking experience... I don't even know if I used the term correctly (do you call such a song a "filk"?). But I think the meaning is clear... -- Jürgen A. Erhard (juergen.erhard@gmx.net, jae@users.sf.net) MARS: http://members.tripod.com/Juergen_Erhard/mars_index.html Life's Better Without Braces (http://www.python.org) Comes in two sizes: huge and Oh-My-God.
From guido@python.org Thu Sep 20 14:54:05 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Sep 2001 09:54:05 -0400 Subject: [Python-Dev] test_string, test_unicode fail Message-ID: <200109201354.JAA01747@cj20424-a.reston1.va.home.com> Today, test_string and test_unicode have started to fail. I'm suspicious that this is the result of changes Marc-Andre checked in to make unicode() behave more like str(). A little investigation (not that the string test suite makes this easy :-( ) shows that line 133 of string_tests.py checks to make sure that '.'.join('a', u'b', 3) raise an exception. But it now silently casts the 3 to u'3', so the result is u'a.b.3'. Is this really an good idea? Was it an intended side effect? Also, Marc-Andre, please run the full test suite and check its results before checking in changes like this. --Guido van Rossum (home page: http://www.python.org/~guido/) From arigo@ulb.ac.be Thu Sep 20 17:31:54 2001 From: arigo@ulb.ac.be (Armin Rigo) Date: Thu, 20 Sep 2001 18:31:54 +0200 (CEST) Subject: [Python-Dev] Progress on Psyco Message-ID: Hello everybody, A short note about my progress on Psyco, the kind of compiler for Python. In its latest form it is a kind of just-in-time compiler, working at run-time. It consists of two parts: a language-independent "glue" part (which I more or less finished), and the inner loop of the Python interpreter rewritten in a special, restricted language. The "glue" part executes Python programs by running the interpreter, but not in the common way.
The run-time data are split into what must be known to produce good processor code, and what need not be. Typically, we need to know at least which Python code to interpret before we can make good processor code that performs similarly; in the case of Python we will probably also need to know the type fields of the actual Python objects fed to the Python code; the processor code can be very efficiently optimized depending on the type. On the other hand, if the Python objects are, say, integers, then the same processor code can easily manipulate all possible integer values. The "glue" code is responsible for choosing which data goes into which category, and managing caches of the produced code. The Python-dependent part of the code either makes computations at compile-time (for compile-time-known values) or emits code to do it at run-time (for the rest). The choice between the two categories of data is not done statically; I mean, I didn't say "Python code and type fields are in category 1, the rest in category 2", which would amount to writing a JIT compiler with type-analysis. No; I let this separation be done based on the uses of the data. In two words, if I have a "switch" on data, or a call to function pointer, then knowing the data at compile time enables significant improvements, so I mark that data as "category 1"; as long as this does not occur data is in "category 2". This makes the approach quite general, I think. Experiment will be needed to know whether it performs reasonably well. Another interesting point to note is that this idea applies to any other interpreted language; all you need to do is rewrite the inner loop of the interpreter for the other language in a special way. Orthogonally, you can change the target processor by adapting parts of the "glue" code. More about it later, Armin. From mal@lemburg.com Thu Sep 20 17:32:51 2001 From: mal@lemburg.com (M.-A.
Lemburg) Date: Thu, 20 Sep 2001 18:32:51 +0200 Subject: [Python-Dev] Re: test_string, test_unicode fail References: <200109201354.JAA01747@cj20424-a.reston1.va.home.com> Message-ID: <3BAA1A33.73E16F16@lemburg.com> Guido van Rossum wrote: > > Today, test_string and test_unicode have started to fail. I'm > suspicious that this is the result of changes Marc-Andre checked in to > make unicode() behave more like str(). > > A little investigation (not that the string test suite makes this easy > :-( ) shows that line 133 of string_tests.py checks to make sure that > > '.'.join('a', u'b', 3) > > raise an exception. But it now silently casts the 3 to u'3', so the > result is u'a.b.3'. Oops. I forgot to run test_string.py -- I did run test_unicode.py and it passes; could be that I need to update the output of that test. Sorry. > Is this really an good idea? Was it an intended side effect? The intention is to make str() and unicode() behave in the same way. It is not a side-effect. unicode() now behaves in the same way as str() always did. > Also, Marc-Andre, please run the full test suite and check its results > before checking in changes like this. Ok. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@python.org Thu Sep 20 17:42:18 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Sep 2001 12:42:18 -0400 Subject: [Python-Dev] Re: test_string, test_unicode fail In-Reply-To: Your message of "Thu, 20 Sep 2001 18:32:51 +0200." <3BAA1A33.73E16F16@lemburg.com> References: <200109201354.JAA01747@cj20424-a.reston1.va.home.com> <3BAA1A33.73E16F16@lemburg.com> Message-ID: <200109201642.f8KGgIi06829@odiug.digicool.com> > Guido van Rossum wrote: > > > > Today, test_string and test_unicode have started to fail. 
I'm > > suspicious that this is the result of changes Marc-Andre checked in to > > make unicode() behave more like str(). > > > > A little investigation (not that the string test suite makes this easy > > :-( ) shows that line 133 of string_tests.py checks to make sure that > > > > '.'.join('a', u'b', 3) > > > > raise an exception. But it now silently casts the 3 to u'3', so the > > result is u'a.b.3'. > > Oops. I forgot to run test_string.py -- I did run test_unicode.py > and it passes; could be that I need to update the output of that > test. I don't know how to tell whether test_unicode.py passes or not from looking at the output (which is inscrutable to me). All I know is that regrtest says it fails. > > Is this really an good idea? Was it an intended side effect? > > The intention is to make str() and unicode() behave in the same > way. It is not a side-effect. unicode() now behaves in the same > way as str() always did. Not true. "".join(["a", 3]) raises a TypeError, and this is how it should be. So I expect that u"".join(["a", 3]) also raises TypeError, not return u"a3" as it does now. I urge you to reconsider how this change is implemented. The implied unicode() call for list items in the join() method is probably just scratching the surface -- there are likely other places where now suddenly everything is auto-converted to a unicode string. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Sep 20 17:44:47 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 20 Sep 2001 18:44:47 +0200 Subject: [Python-Dev] Re: string .join() vs. Unicode .join() References: <200109201354.JAA01747@cj20424-a.reston1.va.home.com> Message-ID: <3BAA1CFF.EACDB4DA@lemburg.com> Guido van Rossum wrote: > > Today, test_string and test_unicode have started to fail. I'm > suspicious that this is the result of changes Marc-Andre checked in to > make unicode() behave more like str(). 
> > A little investigation (not that the string test suite makes this easy > :-( ) shows that line 133 of string_tests.py checks to make sure that > > '.'.join('a', u'b', 3) > > raise an exception. But it now silently casts the 3 to u'3', so the > result is u'a.b.3'. Some more investigation showed that Unicode .join() does an implicit unicode() on all objects in the list whereas the string .join() does not. I think that this is a bug in the Unicode .join() method (the string .join() method can only handle strings and Unicode objects). Should I change the Unicode .join() method to match the semantics of string.join() ?! -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@python.org Thu Sep 20 17:45:28 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Sep 2001 12:45:28 -0400 Subject: [Python-Dev] Bogus checkin In-Reply-To: Your message of "Thu, 20 Sep 2001 09:37:25 PDT." References: Message-ID: <200109201645.f8KGjS406857@odiug.digicool.com> > Index: test_unicode > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Lib/test/output/test_unicode,v > retrieving revision 1.10 > retrieving revision 1.11 > diff -C2 -d -r1.10 -r1.11 > *** test_unicode 2000/08/08 08:04:03 1.10 > --- test_unicode 2001/09/20 16:37:23 1.11 > *************** > *** 1,3 **** > --- 1,4 ---- > test_unicode > + * u' ' u'7 hello 123' > Testing Unicode comparisons... done. > Testing Unicode contains method... done. Marc-Andre, please go back to bed, sleep in, and try again. Or at least have another cup of coffee and go for a walk. For Chrissakes, you're checking in a line that contains the addresses that an object happened to have when you ran the test. Don't you review the diffs before you check in any more? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Thu Sep 20 17:48:24 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Sep 2001 12:48:24 -0400 Subject: [Python-Dev] Re: string .join() vs. Unicode .join() In-Reply-To: Your message of "Thu, 20 Sep 2001 18:44:47 +0200." <3BAA1CFF.EACDB4DA@lemburg.com> References: <200109201354.JAA01747@cj20424-a.reston1.va.home.com> <3BAA1CFF.EACDB4DA@lemburg.com> Message-ID: <200109201648.f8KGmOV06908@odiug.digicool.com> > I think that this is a bug in the Unicode .join() method (the > string .join() method can only handle strings and Unicode > objects). Should I change the Unicode .join() method to match > the semantics of string.join() ?! Yes please. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Sep 20 18:01:37 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 20 Sep 2001 19:01:37 +0200 Subject: [Python-Dev] Re: string .join() vs. Unicode .join() References: <200109201354.JAA01747@cj20424-a.reston1.va.home.com> <3BAA1CFF.EACDB4DA@lemburg.com> <200109201648.f8KGmOV06908@odiug.digicool.com> Message-ID: <3BAA20F1.43A503FD@lemburg.com> Guido van Rossum wrote: > > > I think that this is a bug in the Unicode .join() method (the > > string .join() method can only handle strings and Unicode > > objects). Should I change the Unicode .join() method to match > > the semantics of string.join() ?! > > Yes please. Ok. I'll fix within the next hour. Sorry for the checkin mixup. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From mal@lemburg.com Thu Sep 20 18:40:18 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 20 Sep 2001 19:40:18 +0200 Subject: [Python-Dev] Re: test_string, test_unicode fail References: <200109201354.JAA01747@cj20424-a.reston1.va.home.com> <3BAA1A33.73E16F16@lemburg.com> <200109201642.f8KGgIi06829@odiug.digicool.com> Message-ID: <3BAA2A02.E6697AD4@lemburg.com> Guido van Rossum wrote: > > > ... > > > Is this really a good idea? Was it an intended side effect? > > > > The intention is to make str() and unicode() behave in the same > > way. It is not a side-effect. unicode() now behaves in the same > > way as str() always did. > > Not true. "".join(["a", 3]) raises a TypeError, and this is how it > should be. So I expect that u"".join(["a", 3]) also raises TypeError, > not return u"a3" as it does now. This is fixed now. It was a bug in the .join() method (it would have accepted instances with __str__ too, for example). > I urge you to reconsider how this change is implemented. The implied > unicode() call for list items in the join() method is probably just > scratching the surface -- there are likely other places where now > suddenly everything is auto-converted to a unicode string. Hmm, perhaps you are right and we should use a slightly extended version of PyObject_Unicode() for unicode() and leave PyUnicode_FromObject() as it was (it only converted strings, buffers and instances implementing __str__ to Unicode). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@python.org Thu Sep 20 19:38:59 2001 From: guido@python.org (Guido van Rossum) Date: Thu, 20 Sep 2001 14:38:59 -0400 Subject: [Python-Dev] Re: test_string, test_unicode fail In-Reply-To: Your message of "Thu, 20 Sep 2001 19:40:18 +0200."
<3BAA2A02.E6697AD4@lemburg.com> References: <200109201354.JAA01747@cj20424-a.reston1.va.home.com> <3BAA1A33.73E16F16@lemburg.com> <200109201642.f8KGgIi06829@odiug.digicool.com> <3BAA2A02.E6697AD4@lemburg.com> Message-ID: <200109201838.f8KIcxc07113@odiug.digicool.com> > > I urge you to reconsider how this change is implemented. The implied > > unicode() call for list items in the join() method is probably just > > scratching the surface -- there are likely other places where now > > suddenly everything is auto-converted to a unicode string. > > Hmm, perhaps you are right and we should use a slightly extended > version of PyObject_Unicode() for unicode() and leave > PyUnicode_FromObject() as it was (it only converted strings, > buffers and instances implementing __str__ to Unicode). This sounds like a much better idea to me. --Guido van Rossum (home page: http://www.python.org/~guido/) From eleven11@hushmail.com Thu Sep 20 06:10:20 2001 From: eleven11@hushmail.com (eleven11@hushmail.com) Date: Thu, 20 Sep 2001 22:10:20 +1700 Subject: [Python-Dev] 11 Message-ID: World Trade Center Disaster Facts The date of the attack 9/11 9+1+1=11 September 11th is the 254th day of the year: 2+5+4=11 After September 11th there are 111 days left to the end of the year. Twin Towers - standing side by side look like the number 11. The first plane to hit the towers was Flight 11. State of New York is the 11th state added to the union. New York City 11 letters The Pentagon 11 letters Afghanistan 11 letters Ranzi Yousef 11 letters (Convicted of orchestrating the 1993 attack on the WTC) September 11 11 letters Flight 11 had 92 on board 9+2=11 Flight 77 had 65 on board 6+5=11 From fredrik@pythonware.com Fri Sep 21 13:14:48 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 21 Sep 2001 14:14:48 +0200 Subject: [Python-Dev] where's 2.2a4 ;-) Message-ID: <00c001c14297$033ed970$0900a8c0@spiff> http://python.sourceforge.net/peps/pep-0251.html still says september 19th. 
From guido@python.org Fri Sep 21 13:39:35 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Sep 2001 08:39:35 -0400 Subject: [Python-Dev] where's 2.2a4 ;-) In-Reply-To: Your message of "Fri, 21 Sep 2001 14:14:48 +0200." <00c001c14297$033ed970$0900a8c0@spiff> References: <00c001c14297$033ed970$0900a8c0@spiff> Message-ID: <200109211239.IAA08650@cj20424-a.reston1.va.home.com> > http://python.sourceforge.net/peps/pep-0251.html > > still says september 19th. We've had some slippage (I don't want to blame it on terrorism, but last week wasn't particularly productive). I expect 2.2a4 to arrive sometime next week. You have an extra week to fix SRE bugs! :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Fri Sep 21 14:44:59 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 21 Sep 2001 15:44:59 +0200 Subject: [Python-Dev] str() vs. unicode() Message-ID: <3BAB445B.A10584F8@lemburg.com> I'd like to query for the common opinion on an issue which I've run into when trying to resynchronize unicode() and str() in terms of what happens when you pass arbitrary objects to these constructors which happen to implement tp_str (or __str__ for instances). Currently, str() will accept any object which supports the tp_str interface and revert to tp_repr in case that slot should not be available. unicode() supported strings, character buffers and instances having a __str__ method before yesterday's checkins. Now the goal of the checkins was to make str() and unicode() behave in a more compatible fashion. Both should accept the same kinds of objects and raise exceptions for all others. The path I chose was to fix PyUnicode_FromEncodedObject() to also accept tp_str compatible objects. This API is used by the unicode_new() constructor (which is exposed as unicode() in Python) to create a Unicode object from the input object. str() OTOH uses PyObject_Str() via string_new().
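[Editor's note: the fallback Marc-Andre describes, str() using the tp_str slot and reverting to tp_repr when it is missing, can be seen directly in modern Python, where __str__ and __repr__ play the same roles; class names here are illustrative:]

```python
# str() uses the object's __str__ (tp_str) when present, and falls
# back to __repr__ (tp_repr) when that slot is missing.
class HasStr:
    def __str__(self):
        return 'str slot'

class ReprOnly:
    def __repr__(self):
        return 'repr slot'

print(str(HasStr()))    # uses __str__
print(str(ReprOnly()))  # falls back to __repr__
```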
Now there also is a PyObject_Unicode() API which tries to mimic PyObject_Str(). However, it does not support the additional encoding and errors arguments which the unicode() constructor has. The problem which Guido raised about my checkins was that the changes to PyUnicode_FromEncodedObject() are seen not only in unicode(), but also all other instances where this API is used. OTOH, PyUnicode_FromEncodedObject() is the most generic constructor for Unicode objects there currently is in Python. So the questions are - should I revert the change in PyUnicode_FromEncodedObject() and instead extend PyObject_Unicode() to support encodings ? - should we make PyUnicode_Object() use PyUnicode_FromEncodedObject() instead of providing its own implementation ? The overall picture of all this auto-conversion stuff going on in str() and unicode() is very confusing. Perhaps what we really need is first to agree on a common understanding of which auto-conversion should take place and then make str() and unicode() support exactly the same interface ?! PS: Also see patch #446754 by Walter Dörwald: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=446754&group_id=5470 -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@python.org Fri Sep 21 15:59:27 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Sep 2001 10:59:27 -0400 Subject: [Python-Dev] str() vs. unicode() In-Reply-To: Your message of "Fri, 21 Sep 2001 15:44:59 +0200."
<3BAB445B.A10584F8@lemburg.com> References: <3BAB445B.A10584F8@lemburg.com> Message-ID: <200109211459.f8LExRc24735@odiug.digicool.com> > I'd like to query for the common opinion on an issue which I've > run into when trying to resynchronize unicode() and str() in terms > of what happens when you pass arbitrary objects to these constructors > which happen to implement tp_str (or __str__ for instances). > > Currently, str() will accept any object which supports the tp_str > interface and revert to tp_repr in case that slot should not > be available. > > unicode() supported strings, character buffers and instances > having a __str__ method before yesterday's checkins. > > Now the goal of the checkins was to make str() and unicode() > behave in a more compatible fashion. Both should accept > the same kinds of objects and raise exceptions for all others. Well, historically, str() has rarely raised exceptions, because there's a default implementation (same as for repr(), returning <object at ADDRESS>. This is used when neither tp_repr nor tp_str is set. Note that PyObject_Str() never looks at __str__ -- this is done by the tp_str handler of instances (and now also by the tp_str handler of new-style classes). I see no reason to change this. The question then becomes, do we want unicode() to behave similarly? > The path I chose was to fix PyUnicode_FromEncodedObject() > to also accept tp_str compatible objects. This API is used > by the unicode_new() constructor (which is exposed as unicode() > in Python) to create a Unicode object from the input object. > > str() OTOH uses PyObject_Str() via string_new(). > > Now there also is a PyObject_Unicode() API which tries to > mimic PyObject_Str(). However, it does not support the additional > encoding and errors arguments which the unicode() constructor > has.
> > The problem which Guido raised about my checkins was that > the changes to PyUnicode_FromEncodedObject() are seen not > only in unicode(), but also all other instances where this > API is used. > > OTOH, PyUnicode_FromEncodedObject() is the most generic constructor > for Unicode objects there currently is in Python. > > So the questions are > - should I revert the change in PyUnicode_FromEncodedObject() > and instead extend PyObject_Unicode() to support encodings ? > - should we make PyUnicode_Object() use > PyUnicode_FromEncodedObject() instead of providing its > own implementation ? > > The overall picture of all this auto-conversion stuff going > on in str() and unicode() is very confusing. Perhaps what > we really need is first to agree on a common understanding > of which auto-conversion should take place and then make > str() and unicode() support exactly the same interface ?! > > PS: Also see patch #446754 by Walter Dörwald: > http://sourceforge.net/tracker/?func=detail&atid=305470&aid=446754&group_id=5470 OK, let's take a step back. The str() function (now constructor) converts *anything* to a string; tp_str and tp_repr exist to allow objects to customize this. These slots, and the str() function, take no additional arguments. To invoke the equivalent of str() from C, you call PyObject_Str(). I see no reason to change this; we may want to make the Unicode situation as similar as possible. The unicode() function (now constructor) traditionally converted only 8-bit strings to Unicode strings, with additional arguments to specify the encoding (and error handling preference). There is no tp_unicode slot, but for some reason there are at least three C APIs that could correspond to unicode(): PyObject_Unicode() and PyUnicode_FromObject() take a single object argument, and PyUnicode_FromEncodedObject() takes object, encoding, and error arguments.
The first question is, do we want the unicode() constructor to be applicable in all cases where the str() constructor is? I guess that we do, since we want to be able to print to streams that support Unicode. Unicode strings render themselves as Unicode characters to such a stream, and it's reasonable to allow other objects to also customize their rendition in Unicode. Now, what should be the signature of this conversion? If we print object X to a Unicode stream, should we invoke unicode(X), or unicode(X, encoding, error)? I believe it should be just unicode(X), since the encoding used by the stream shouldn't enter into the picture here: that's just used for converting Unicode characters written to the stream to some external format. How should an object be allowed to customize its Unicode rendition? We could add a tp_unicode slot to the type object, but there's no need: we can just look for a __unicode__ method and call it if it exists. The signature of __unicode__ should take no further arguments: unicode(X) should call X.__unicode__(). As a fallback, if the object doesn't have a __unicode__ attribute, PyObject_Str() should be called and the resulting string converted to Unicode using the default encoding. Regarding the "long form" of unicode(), unicode(X, encoding, error), I see no reason to treat this with the same generality. This form should restrict X to something that supports the buffer API (IOW, 8-bit string objects and things that are treated the same as these in most situations). (Note that it already balks when X is a Unicode string.) So about those C APIs: I propose that PyObject_Unicode() correspond to the one-arg form of unicode(), taking any kind of object, and that PyUnicode_FromEncodedObject() correspond to the three-arg form. PyUnicode_FromObject() shouldn't really need to exist. 
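[Editor's note: the lookup order proposed above, try __unicode__, else fall back to the str() rendition, can be sketched as a small helper. This is a hypothetical illustration, not the eventual implementation; the name `to_unicode` and the example class are invented for this sketch:]

```python
# Hypothetical helper mirroring the proposed unicode(X) lookup order:
# use __unicode__ if the type defines it, else fall back to str().
def to_unicode(x):
    hook = getattr(type(x), '__unicode__', None)
    if hook is not None:
        return hook(x)          # custom Unicode rendition
    return str(x)               # fallback: the str() result

class Currency:
    def __unicode__(self):
        return u'\u20ac 10'     # renders itself with a Unicode char

print(to_unicode(Currency()))   # uses __unicode__
print(to_unicode(42))           # falls back to str()
```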
I don't see a reason for PyUnicode_From[Encoded]Object() to use the __unicode__ customization -- it should just take the bytes provided by the object and decode them according to the given encoding. PyObject_Unicode(), on the other hand, should look for __unicode__ first and then PyObject_Str(). I hope this helps. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz@rahul.net Fri Sep 21 16:26:09 2001 From: aahz@rahul.net (Aahz Maruch) Date: Fri, 21 Sep 2001 08:26:09 -0700 (PDT) Subject: [Python-Dev] Re: 2.2a1: classmethod() and class attributes In-Reply-To: <200108151525.LAA27011@cj20424-a.reston1.va.home.com> from "Guido van Rossum" at Aug 15, 2001 11:25:53 AM Message-ID: <20010921152610.3E0DBE90F@waltz.rahul.net> [okay, so I'm following up on old stuff -- sue me] [there's also excessive quoting to maintain context] Guido van Rossum wrote: >Aahz: >> >> +1 on changing __dynamic__ or at least enabling some kind of class >> variable mutability by default. > > After converting the Tools/compiler package to the new class system, I > tend to agree (I already said so on c.l.py in my last post there). Yeah, you did, but I didn't see you directly address my point about the necessary mutability of class variables. > For dynamic classes, I must assume that at any time someone can add a > special method (like __iter__) to a class. It's hard to set up things > so that the dispatch function is set in the type object at the moment > C.__iter__ is assigned: it would require a class to keep track of all > its subclasses, without keeping those subclasses alive. While I know > I can do that using weak references, I don't like having to maintain > all that administration. So at the moment, when a class is dynamic, I > just stick all dispatch functions in the type object -- the dispatch > functions will raise AttributeError when their corresponding method is > not found. (This is the same approach used for classic classes, BTW.) 
> > This is slower than it should be -- the fully dynamic Tools/compiler > package compiles itself about 25% slower this way. If I tweak it to > use all static classes (not very hard), it runs at about the same > speed as it does with classic classes. I imagine I could make it > faster by using __slots__, but I don't know enough about the internals > yet to be able to do that. > > My goal (before I'm happy with making __dynamic__=1 the default) is > that dynamic classes should be at least as fast as classic classes. I > haven't profiled it yet -- it's possible that there's a cheap hack > possible by making more conservative assumptions about __getattr__ > alone -- classic classes special-case __getattr__ too.) Okay, keeping in mind that I don't actually understand what I'm talking about, what is the problem involved in permitting existing attributes to be mutable? That is, as I think James Althoff has pointed out, there are at least two levels of mutability, one of which is the ability to mutate existing attributes, and another of which is the ability to add attributes. Would making this all finer-grained help? Do we actually need to control whether existing attributes are mutable? -- --- Aahz (@pobox.com) Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/ Androgynous poly kinky vanilla queer het Pythonista We must not let the evil of a few trample the freedoms of the many. From guido@python.org Fri Sep 21 16:51:23 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Sep 2001 11:51:23 -0400 Subject: [Python-Dev] Re: 2.2a1: classmethod() and class attributes In-Reply-To: Your message of "Fri, 21 Sep 2001 08:26:09 PDT." <20010921152610.3E0DBE90F@waltz.rahul.net> References: <20010921152610.3E0DBE90F@waltz.rahul.net> Message-ID: <200109211551.f8LFpNI29827@odiug.digicool.com> > >Aahz: > >> > >> +1 on changing __dynamic__ or at least enabling some kind of class > >> variable mutability by default. 
> Guido van Rossum wrote: > > After converting the Tools/compiler package to the new class system, I > > tend to agree (I already said so on c.l.py in my last post there). [Aahz] > Yeah, you did, but I didn't see you directly address my point about the > necessary mutability of class variables. I agree that they are necessary. What else do you want me to say? > > For dynamic classes, I must assume that at any time someone can add a > > special method (like __iter__) to a class. It's hard to set up things > > so that the dispatch function is set in the type object at the moment > > C.__iter__ is assigned: it would require a class to keep track of all > > its subclasses, without keeping those subclasses alive. While I know > > I can do that using weak references, I don't like having to maintain > > all that administration. So at the moment, when a class is dynamic, I > > just stick all dispatch functions in the type object -- the dispatch > > functions will raise AttributeError when their corresponding method is > > not found. (This is the same approach used for classic classes, BTW.) > > > > This is slower than it should be -- the fully dynamic Tools/compiler > > package compiles itself about 25% slower this way. If I tweak it to > > use all static classes (not very hard), it runs at about the same > > speed as it does with classic classes. I imagine I could make it > > faster by using __slots__, but I don't know enough about the internals > > yet to be able to do that. > > > > My goal (before I'm happy with making __dynamic__=1 the default) is > > that dynamic classes should be at least as fast as classic classes. I > > haven't profiled it yet -- it's possible that there's a cheap hack > > possible by making more conservative assumptions about __getattr__ > > alone -- classic classes special-case __getattr__ too.) 
> > Okay, keeping in mind that I don't actually understand what I'm talking > about, what is the problem involved in permitting existing attributes to > be mutable? > > That is, as I think James Althoff has pointed out, there are at least > two levels of mutability, one of which is the ability to mutate existing > attributes, and another of which is the ability to add attributes. > Would making this all finer-grained help? Do we actually need to > control whether existing attributes are mutable? The problem is that the only kind of control that is easy to implement is to make *everything* immutable. Given that I strive for __dynamic__=1 as the default, I don't want to add more code that's only temporary. Maybe I'll just make __dynamic__=1 the default in 2.2a4, and work on the performance issues later. (But there are subtle semantics differences as well that make life more complicated than it should be with __dynamic__=1.) --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Fri Sep 21 22:19:58 2001 From: fdrake@acm.org (Fred L. Drake) Date: Fri, 21 Sep 2001 17:19:58 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010921211958.C4F3924231@grendel.zope.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Added more discussion of user-defined exceptions, more descriptions for the xml.parsers.expat module. From guido@python.org Fri Sep 21 22:30:43 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 21 Sep 2001 17:30:43 -0400 Subject: [Python-Dev] python-checkins mail blocked somewhere Message-ID: <200109212130.RAA13662@cj20424-a.reston1.va.home.com> I'm not getting my python-checkins mail. The messages are getting archived, but I don't get them in my inbox. Does anybody else still get them? Barry, can you see if there's a blockage somewhere? 
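[Editor's note: the dynamic behavior being weighed here, adding a special method such as __iter__ to a class after instances exist and having dispatch notice it, is what classes eventually got by default; a quick illustration in today's Python:]

```python
# Mutating a class after an instance exists: the type's dispatch
# picks up the newly added special method, which is the behavior
# __dynamic__=1 was meant to enable.
class C:
    pass

c = C()                                    # instance created first
C.__iter__ = lambda self: iter([1, 2, 3])  # special method added later
print(list(c))                             # dispatch sees the new slot
```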
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Fri Sep 21 22:39:37 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 21 Sep 2001 17:39:37 -0400 Subject: [Python-Dev] python-checkins mail blocked somewhere In-Reply-To: <200109212130.RAA13662@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > I'm not getting my python-checkins mail. The messages are getting > archived, but I don't get them in my inbox. Does anybody else still > get them? I'm getting them; they come to my @Home address. From jeremy@zope.com Fri Sep 21 22:48:09 2001 From: jeremy@zope.com (Jeremy Hylton) Date: Fri, 21 Sep 2001 17:48:09 -0400 (EDT) Subject: [Python-Dev] python-checkins mail blocked somewhere In-Reply-To: <200109212130.RAA13662@cj20424-a.reston1.va.home.com> References: <200109212130.RAA13662@cj20424-a.reston1.va.home.com> Message-ID: <15275.46489.437644.203863@slothrop.digicool.com> I've seen them recently. They go to jeremy@alum.mit.edu and then to my POP server at XO/Concentric. Jeremy From fdrake@acm.org Fri Sep 21 22:46:15 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 21 Sep 2001 17:46:15 -0400 Subject: [Python-Dev] python-checkins mail blocked somewhere In-Reply-To: References: <200109212130.RAA13662@cj20424-a.reston1.va.home.com> Message-ID: <15275.46375.511942.466212@grendel.digicool.com> Tim Peters writes: > I'm getting them; they come to my @Home address. Same here, though they arrive indirectly. Mine do not go through zope.com at all. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From mal@lemburg.com Sat Sep 22 17:14:41 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 22 Sep 2001 18:14:41 +0200 Subject: [Python-Dev] str() vs. 
unicode() References: <3BAB445B.A10584F8@lemburg.com> <200109211459.f8LExRc24735@odiug.digicool.com> Message-ID: <3BACB8F1.235DDE92@lemburg.com> Guido van Rossum wrote: > > > I'd like to query for the common opinion on an issue which I've > > run into when trying to resynchronize unicode() and str() in terms > > of what happens when you pass arbitrary objects to these constructors > > which happen to implement tp_str (or __str__ for instances). > > > > Currently, str() will accept any object which supports the tp_str > > interface and revert to tp_repr in case that slot should not > > be available. > > > > unicode() supported strings, character buffers and instances > > having a __str__ method before yesterday's checkins. > > > > Now the goal of the checkins was to make str() and unicode() > > behave in a more compatible fashion. Both should accept > > the same kinds of objects and raise exceptions for all others. > > Well, historically, str() has rarely raised exceptions, because > there's a default implementation (same as for repr(), returning <object > at ADDRESS>. This is used when neither tp_repr nor tp_str is > set. Note that PyObject_Str() never looks at __str__ -- this is done > by the tp_str handler of instances (and now also by the tp_str handler > of new-style classes). I see no reason to change this. Me neither; what str() does not do (and unicode() does) is try the char buffer interface before trying tp_str. > The question then becomes, do we want unicode() to behave similarly? Given that porting an application from strings to Unicode should be easy, I'd say: yes. > > The path I chose was to fix PyUnicode_FromEncodedObject() > > to also accept tp_str compatible objects. This API is used > > by the unicode_new() constructor (which is exposed as unicode() > > in Python) to create a Unicode object from the input object. > > > > str() OTOH uses PyObject_Str() via string_new().
> > > > Now there also is a PyObject_Unicode() API which tries to > > mimic PyObject_Str(). However, it does not support the additional > > encoding and errors arguments which the unicode() constructor > > has. > > > > The problem which Guido raised about my checkins was that > > the changes to PyUnicode_FromEncodedObject() are seen not > > only in unicode(), but also all other instances where this > > API is used. > > > > OTOH, PyUnicode_FromEncodedObject() is the most generic constructor > > for Unicode objects there currently is in Python. > > > > So the questions are > > - should I revert the change in PyUnicode_FromEncodedObject() > > and instead extend PyObject_Unicode() to support encodings ? > > - should we make PyUnicode_Object() use > > PyUnicode_FromEncodedObject() instead of providing its > > own implementation ? > > > > The overall picture of all this auto-conversion stuff going > > on in str() and unicode() is very confusing. Perhaps what > > we really need is first to agree on a common understanding > > of which auto-conversion should take place and then make > > str() and unicode() support exactly the same interface ?! > > > > PS: Also see patch #446754 by Walter Dörwald: > > http://sourceforge.net/tracker/?func=detail&atid=305470&aid=446754&group_id=5470 > > OK, let's take a step back. > > The str() function (now constructor) converts *anything* to a string; > tp_str and tp_repr exist to allow objects to customize this. These > slots, and the str() function, take no additional arguments. To > invoke the equivalent of str() from C, you call PyObject_Str(). I see > no reason to change this; we may want to make the Unicode situation as > similar as possible. Right. > The unicode() function (now constructor) traditionally converted only > 8-bit strings to Unicode strings, Slightly incorrect: it converted 8-bit strings, objects compatible to the char buffer interface and instances having a __str__ method to Unicode.
To synchronize unicode() with str() we'd have to replace the __str__ lookup with a tp_str lookup (this will also allow things like unicode(2) and unicode(instance_having__str__)) and maybe also add the charbuf lookup to str() (this would make str() compatible with memory mapped files and probably a few other char buffer aware objects as well). Note that in a discussion we had some time ago we decided that __str__ should be allowed to return Unicode objects as well (instead of defining a separate __unicode__ method/slot for this purpose). str() will convert a Unicode return value to an 8-bit string using the default encoding while unicode() takes the return value as-is. This was done to simplify moving from strings to Unicode. > with additional arguments to specify > the encoding (and error handling preference). There is no tp_unicode > slot, but for some reason there are at least three C APIs that could > correspond to unicode(): PyObject_Unicode() and PyUnicode_FromObject() > take a single object argument, and PyUnicode_FromEncodedObject() takes > object, encoding, and error arguments. > > The first question is, do we want the unicode() constructor to be > applicable in all cases where the str() constructor is? Yes. > I guess that > we do, since we want to be able to print to streams that support > Unicode. Unicode strings render themselves as Unicode characters to > such a stream, and it's reasonable to allow other objects to also > customize their rendition in Unicode. > > Now, what should be the signature of this conversion? If we print > object X to a Unicode stream, should we invoke unicode(X), or > unicode(X, encoding, error)? I believe it should be just unicode(X), > since the encoding used by the stream shouldn't enter into the picture > here: that's just used for converting Unicode characters written to > the stream to some external format. > > How should an object be allowed to customize its Unicode rendition?
> We could add a tp_unicode slot to the type object, but there's no > need: we can just look for a __unicode__ method and call it if it > exists. The signature of __unicode__ should take no further > arguments: unicode(X) should call X.__unicode__(). As a fallback, if > the object doesn't have a __unicode__ attribute, PyObject_Str() should > be called and the resulting string converted to Unicode using the > default encoding. I'd rather leave things as they are: __str__/tp_str are allowed to return Unicode objects and if an object wishes to be rendered as Unicode it can simply return a Unicode object through the __str__/tp_str interface. > Regarding the "long form" of unicode(), unicode(X, encoding, error), I > see no reason to treat this with the same generality. This form > should restrict X to something that supports the buffer API (IOW, > 8-bit string objects and things that are treated the same as these in > most situations). Hmm, but this would restrict users from implementing string like objects (i.e. objects having the __str__ method to make it compatible to str()). > (Note that it already balks when X is a Unicode > string.) True -- since you normally cannot decode Unicode into Unicode using some 8-bit character encoding. As a result encodings which convert Unicode to Unicode (e.g. normalizations) cannot use this interface, but since these are probably only rarely used, I think it's better to prevent accidental usage of an 8-bit character codec on Unicode. > So about those C APIs: I propose that PyObject_Unicode() correspond to > the one-arg form of unicode(), taking any kind of object, and that > PyUnicode_FromEncodedObject() correspond to the three-arg form. Ok. I'll fix this once we've reached consensus on what to do about str() and unicode(). > PyUnicode_FromObject() shouldn't really need to exist. Note: PyUnicode_FromObject() was extended by PyUnicode_FromEncodedObject() and only exists for backward compatibility reasons.
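[Editor's note: the decoding restriction discussed here survives in today's Python, where the three-arg unicode(X, encoding, errors) form became str(bytes_like, encoding, errors); a quick sketch of both halves of the behavior:]

```python
# The three-arg form decodes raw bytes via a codec; handing it
# already-decoded text "balks", just as unicode() did for Unicode input.
decoded = str(b'caf\xc3\xa9', 'utf-8')   # bytes -> text
print(decoded)

try:
    str('caf\xe9', 'utf-8')              # text input is rejected
    outcome = 'no error'
except TypeError:
    outcome = 'TypeError'
print(outcome)
```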
> I don't see a > reason for PyUnicode_From[Encoded]Object() to use the __unicode__ > customization -- it should just take the bytes provided by the object > and decode them according to the given encoding. PyObject_Unicode(), > on the other hand, should look for __unicode__ first and then > PyObject_Str(). > > I hope this helps. Thanks for the summary. -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From tim.one@home.com Sun Sep 23 04:29:49 2001 From: tim.one@home.com (Tim Peters) Date: Sat, 22 Sep 2001 23:29:49 -0400 Subject: [Python-Dev] A baffler in test_repr Message-ID: test_repr (very) recently started failing for me, on Windows, but only when doing a full run of the test suite. Same symptom in release or debug builds. test_repr does not fail when run alone, and regardless of whether run directly from cmdline, or via "regrtest test_repr". Here's the failure: test_repr test test_repr failed -- Traceback (most recent call last): File "../lib/test\test_repr.py", line 156, in test_descriptors self.failUnless(repr(x).startswith('<staticmethod object')) AssertionError If I add print >> sys.stderr, '*' * 30, repr(x) before the last line, when it fails repr(x) is actually: <staticmethod instance at 0x...> when it fails (although the address varies, of course). How can it be either "an object" or "an instance" depending on how the test is run? A staticmethod doesn't contain enough data that it's possible to forget to initialize any of it <0.7 wink>. From tim.one@home.com Sun Sep 23 09:52:21 2001 From: tim.one@home.com (Tim Peters) Date: Sun, 23 Sep 2001 04:52:21 -0400 Subject: [Python-Dev] A baffler in test_repr In-Reply-To: Message-ID: [Tim, reports test_repr failure due to repr(a_static_method), but only when the whole test suite is run, not in isolation] + The same thing happens with classmethod (repr is normally <classmethod object at 0x...> but magically turns into <classmethod instance at 0x...> "sometimes").
+ The gross cause is that tp_repr is normally NULL in the classmethod and staticmethod types, and PyObject_Repr then prints "object at". But by the time test_repr starts running (but not in isolation), their tp_repr slots point to object_repr (which prints "instance at"). The tp_str and tp_hash slots have also changed. + To provoke a failure via regrtest, it's necessary and sufficient to run test_inspect before test_repr, in the same run:

C:\Code\python\PCbuild>python ../lib/test/regrtest.py test_repr         # ok
test_repr
1 test OK.

C:\Code\python\PCbuild>python ../lib/test/regrtest.py test_inspect test_repr  # fails
test_inspect
test_repr
test test_repr failed -- Traceback (most recent call last):
  File "../lib/test\test_repr.py", line 156, in test_descriptors
    self.failUnless(repr(x).startswith('<staticmethod object'))
AssertionError

+ Turns out I provoked this by adding classify_class_attrs() to inspect.py, and then foolishly added a test for it <wink>. The symptom can be provoked by one judiciously chosen line:

C:\Code\python\PCbuild>python
Python 2.2a3+ (#23, Sep 20 2001, 19:43:51) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> staticmethod(42)            # "object at" output
<staticmethod object at 0x...>
>>> len(staticmethod.__dict__)  # and you think this is harmless
10
>>> staticmethod(42)            # whoa! now it's "instance at" output
<staticmethod instance at 0x...>
>>>

That is, just asking for a type's __dict__ can change the values in the type slots. This doesn't seem right -- or does it? gunshyly y'rs - tim From GhenadieP@HOTMAIL.COM Sun Sep 23 21:39:26 2001 From: GhenadieP@HOTMAIL.COM (GhenadieP@HOTMAIL.COM) Date: Sun, 23 Sep 2001 22:39:26 +0200 (CEST) Subject: [Python-Dev] regarding job Message-ID: <20010923203926.54C077CAE@shark.amis.net> Dear Sir, I found your e-mail address on the net. I see your activity is related to software development. I hope you are interested in experienced software developers. I am currently seeking work.
If you have a site where I should fill in a form, please reply to me, so that I can go there and fill in that form. Below is my resume. If in your consideration I can be useful for you, please call me or send me a message. Thank you. RESUME --------------------------------------------------------------- Name: Ghenadie Plingau Tel: +386 31 710 519 ; +373 2 292976 Email: ghenadiep@hotmail.com; pghena@hotmail.com Language: English, French IT experience: 5+ Years Knowledge summary and basic concepts: - Component based design concepts - Object oriented thinking - Windows applications development experience - Microsoft development technologies stickler - database design and SQL transactions experience - Web UI orientation Good knowledge of Delphi/VCL, C++/COM/ATL, Win32 API, ADO, ODBC, XML, HTTP. Programming tools experience: Visual C++, Visual Basic, Borland C++/Delphi Databases: MS SQLServer, Oracle, Sybase, Access Education: * 1989-1991 Mathematical Lyceum "C.Negruzzi", Iasi, Romania, BSc in Mathematics * 1994-1999 Academy of Economic Studies of Moldova, Banking & SE Career Summary Project: RISP-SQL Time: Dec.2000- Organization: E.R.S. Rokada Inzeniring d.o.o. Ljubljana, Slovenia (rokada@amis.net) Position: Software developer Tools: Delphi, Visual C++, SQL Server 2000 Description: Developing a database application for SQL Server 2000 using Delphi, COM/OLE for internal data interchange between modules, VCL and the ExpressQuantumGrid Suite for UI. Developed multithreaded applications, wrote recursive stored procedures and extended procedures for SQL Server, and ActiveX components for AutoCAD and MS Office automation. Got experience in analyzing database performance, record fetching, SQL coding, Data Warehousing, Data Mining, Data Modeling. Especially focused on developing data-aware components and implementing new features in existing components by adding new properties and events.
Created both data-aware and non-data-aware components for performing direct lookups on the server by runtime generation of SQL statements and sending them to the server during information input, which allows only the input of names and codes that exist in the database directory table. Created an application wizard for automatically upgrading the structure of a target database based on a template script, with the condition of keeping the data in the target database, using SQL DMO. Created a report designer based on the Quick Report libraries that allows runtime customization of report items and bands and stores all settings in an XML file. The designer looks similar to the Delphi form designer. Created a runtime assembler of the application menu out of an XML file that stores the GUID and interface method number of each menu item command. The interface is unique and supported by all COM objects. Managed source code using MS Source Safe and created batch files for automating the build process and the creation of the installation kit. Developed modules for Web deployment of reports and dynamic generation of HTTP for web pages. Project: e-tools Time: May 2000 - Sep 2000 Organization: Edifecs Commerce Inc, Seattle, USA (joed@edifecs.com) Position: Independent Contractor Tools: MS Visual C++, WinCVS Description: Developed the server side and implemented user interfaces of a component, part of an application designed for business process modeling based on EDI standards. Designed and developed a search engine that searches a specially formatted file that contains EDI standards. Search results are displayed in a tree on the UI or saved in XML format. Designed classes and the protocols of data interchange between them. Worked with linked lists, binary trees, STL containers, templates. Got experience in designing object classes for well-encapsulated objects and implementing recursive member functions. Analyzed performance, performed unit tests.
Project: SQL-Buch Time: Feb 1999 - May 2000 Organization: Technical University of Moldova, Chisinau, Moldova (perju@adm.utm.md) Position: Software Design Lead Task: Provided technical leadership to a team of developers, developing software for business and financial accounting. Tools: MS Visual C++, MFC, Visual Basic, Delphi, SQL Server, Sybase SQL, MySQL (LINUX), ODBC, ADO, MIDAS, DCOM Description: Was responsible for planning and scheduling technical assignments and goals, defining and analyzing requirements, training, and contributing to performance reviews. Administered and designed relational databases, wrote SQL code for transactions, implemented user interfaces, and developed multi-tier database applications. Got experience in API-level Windows programming. Gained excellent skills in working with all Visual Studio tools. Created COM-based application servers for providing services over a local network and via the internet. Created installation kits and CAB files for Web deployment of ActiveX controls. Project: A-BANCO Time: Apr 1996 - Feb 1999 Organization: «A-BANCO» SRL Chisinau, Moldova Position: Programmer Task: Informational support Tools: MS Access, Visual Basic, dBase Description: Developed the information system of the company, including bookkeeping, production control and resource management. Worked jointly with book-keepers, addressing their needs and requirements. Designed databases, developed applications for MS Access using Access tools, VBA coding and MS Office application automation. Developed a set of MS Access, VB and Delphi applications which are able to provide any small company in Moldova with the most often used and most needed documents and reports, that meet all accounting standards and are able to facilitate bookkeeping and document management. Lots of C and Pascal programming under DOS. From guido@python.org Mon Sep 24 14:30:14 2001 From: guido@python.org (Guido van Rossum) Date: Mon, 24 Sep 2001 09:30:14 -0400 Subject: [Python-Dev] str() vs.
unicode() In-Reply-To: Your message of "Sat, 22 Sep 2001 18:14:41 +0200." <3BACB8F1.235DDE92@lemburg.com> References: <3BAB445B.A10584F8@lemburg.com> <200109211459.f8LExRc24735@odiug.digicool.com> <3BACB8F1.235DDE92@lemburg.com> Message-ID: <200109241330.JAA31034@cj20424-a.reston1.va.home.com> > > Well, historically, str() has rarely raised exceptions, because > > there's a default implementation (same as for repr(), returning > > <CLASSNAME object at ADDRESS>. This is used when neither tp_repr nor tp_str is > > set. Note that PyObject_Str() never looks at __str__ -- this is done > > by the tp_str handler of instances (and now also by the tp_str handler > > of new-style classes). I see no reason to change this. > > Me neither; what str() does not do (and unicode() does) is try > the char buffer interface before trying tp_str. The meanings of these two are different: tp_str means "give me a string that's useful for printing"; the buffer API means "let me treat you as a sequence of 8-bit bytes (or 8-bit characters)". They are different e.g. when you consider a PIL image, whose str() probably returns something like '<PIL image object>' while its buffer API probably gives access to the raw image buffer. The str() function should map directly to tp_str(). You *might* claim that the 8-bit string type constructor *ought to* look at the buffer API, but I'd say that it's easy enough for a type to provide a tp_str implementation that does what the type wants. I guess "convert yourself to string" is different than "display yourself as a string". > > The question then becomes, do we want unicode() to behave similarly? > > Given that porting an application from strings to Unicode should > be easy, I'd say: yes. Fearing this ends up being a trick question, I'll say +0. If we end up with something I don't like, I reserve the right to change my opinion on this. > > The str() function (now constructor) converts *anything* to a string; > > tp_str and tp_repr exist to allow objects to customize this.
These > > slots, and the str() function, take no additional arguments. To > > invoke the equivalent of str() from C, you call PyObject_Str(). I see > > no reason to change this; we may want to make the Unicode situation as > > similar as possible. > > Right. > > > The unicode() function (now constructor) traditionally converted only > > 8-bit strings to Unicode strings, > > Slightly incorrect: it converted 8-bit strings, objects compatible > to the char buffer interface and instances having a __str__ method to > Unicode. That's a rather random collection of APIs, if you ask me... Also, do you really mean *instances* (i.e. objects for which PyInstance_Check() returns true), or do you mean anything for which getattr(x, "__str__") is true? If the latter, you're in for a surprise in 2.2 -- almost all built-in objects now respond to that method, due to the type/class unification: whenever something has a tp_str slot, a __str__ attribute is synthesized (and vice versa). (Exceptions are a few obscure types and maybe 3rd party extension types.) > To synchronize unicode() with str() we'd have to replace the __str__ > lookup with a tp_str lookup (this will also allow things like unicode(2) > and unicode(instance_having__str__)) and maybe also add the charbuf > lookup to str() (this would make str() compatible with memory mapped > files and probably a few other char buffer aware objects as well). I definitely don't want the latter change to str(); see above. If you want unicode(x) to behave as much like str(x) as possible, I recommend removing the use of the buffer API. > Note that in a discussion we had some time ago we decided that __str__ > should be allowed to return Unicode objects as well (instead of > defining a separate __unicode__ method/slot for this purpose). str() > will convert a Unicode return value to an 8-bit string using the > default encoding while unicode() takes the return value as-is. > > This was done to simplify moving from strings to Unicode.
I'm now not so sure if this was the right decision. > > with additional arguments to specify > > the encoding (and error handling preference). There is no tp_unicode > > slot, but for some reason there are at least three C APIs that could > > correspond to unicode(): PyObject_Unicode() and PyUnicode_FromObject() > > take a single object argument, and PyUnicode_FromEncodedObject() takes > > object, encoding, and error arguments. > > > > The first question is, do we want the unicode() constructor to be > > applicable in all cases where the str() constructor is? > > Yes. > > > I guess that > > we do, since we want to be able to print to streams that support > > Unicode. Unicode strings render themselves as Unicode characters to > > such a stream, and it's reasonable to allow other objects to also > > customize their rendition in Unicode. > > > > Now, what should be the signature of this conversion? If we print > > object X to a Unicode stream, should we invoke unicode(X), or > > unicode(X, encoding, error)? I believe it should be just unicode(X), > > since the encoding used by the stream shouldn't enter into the picture > > here: that's just used for converting Unicode characters written to > > the stream to some external format. > > > > How should an object be allowed to customize its Unicode rendition? > > We could add a tp_unicode slot to the type object, but there's no > > need: we can just look for a __unicode__ method and call it if it > > exists. The signature of __unicode__ should take no further > > arguments: unicode(X) should call X.__unicode__(). As a fallback, if > > the object doesn't have a __unicode__ attribute, PyObject_Str() should > > be called and the resulting string converted to Unicode using the > > default encoding. > > I'd rather leave things as they are: __str__/tp_str are allowed > to return Unicode objects and if an object wishes to be rendered > as Unicode it can simply return a Unicode object through the > __str__/tp_str interface.
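The two positions can be made concrete with a toy class that has both an 8-bit-safe rendition and a richer Unicode one. This is a sketch only: `Temperature` and `to_unicode` are illustrative names (Python 3 has no __unicode__ hook, so an ordinary method stands in for it, and `str` stands in for the 2.x `unicode` type):

```python
class Temperature:
    """Toy class with two renditions: ASCII-safe and full Unicode."""
    def __init__(self, celsius):
        self.celsius = celsius

    def __str__(self):
        # "render as 8-bit string": restricted to ASCII
        return '%g C' % self.celsius

    def to_unicode(self):
        # stand-in for the proposed __unicode__ hook: full Unicode output
        return '%g \N{DEGREE SIGN}C' % self.celsius

t = Temperature(21.5)
assert str(t) == '21.5 C'
assert t.to_unicode() == '21.5 \xb0C'
```

With a single __str__ that may return Unicode (MAL's preference), the two renditions above would have to be collapsed into one method.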
Can you explain your motivation? In the long run, it seems better to me to think of __str__ as "render as 8-bit string" and __unicode__ as "render as Unicode string". > > Regarding the "long form" of unicode(), unicode(X, encoding, error), I > > see no reason to treat this with the same generality. This form > > should restrict X to something that supports the buffer API (IOW, > > 8-bit string objects and things that are treated the same as these in > > most situations). > > Hmm, but this would restrict users from implementing string-like > objects (i.e. objects having the __str__ method to make them compatible > with str()). Having __str__ doesn't make something a string-like object! A string-like object (at least the way I understand this term) would behave like a string, e.g. have string methods. The UserString module is an example, and in 2.2 subclasses of the 'str' type are prime examples. To convert one of these to Unicode given an encoding, shouldn't their decode() method be used? > > (Note that it already balks when X is a Unicode > > string.) > > True -- since you normally cannot decode Unicode into Unicode using > some 8-bit character encoding. As a result encodings which convert > Unicode to Unicode (e.g. normalizations) cannot use this interface, > but since these are probably only rarely used, I think it's better > to prevent accidental usage of an 8-bit character codec on Unicode. Sigh. More special cases. Unicode objects do have a tp_str/__str__ slot, but they are not acceptable to unicode(). Really, this is such an incredible morass of APIs that I wonder if we shouldn't start over... There are altogether too many places in the code where PyUnicode_Check() is used. I wish there were a better way... > > So about those C APIs: I propose that PyObject_Unicode() correspond to > > the one-arg form of unicode(), taking any kind of object, and that > > PyUnicode_FromEncodedObject() correspond to the three-arg form. > > Ok.
I'll fix this once we've reached consensus on what to do > about str() and unicode(). Alas, this is harder than we seem to have thought, collectively. I want someone to sit back and rethink how this should eventually work (say in Python 2.9), and then work backwards from there to a reasonable API to be used in 2.2. The current piling of hack upon hack seems hopeless. We have some time: 2.2a4 will be released this week, but 2.2b1 isn't due until Oct 10, and we can even slip that a bit. Compatibility with previous 2.2 alpha releases is not necessary; the hard compatibility baseline is 2.1.1. > > PyUnicode_FromObject() shouldn't really need to exist. > > Note: PyUnicode_FromObject() was extended by PyUnicode_FromEncodedObject() > and only exists for backward compatibility reasons. Excellent. > > I don't see a > > reason for PyUnicode_From[Encoded]Object() to use the __unicode__ > > customization -- it should just take the bytes provided by the object > > and decode them according to the given encoding. PyObject_Unicode(), > > on the other hand, should look for __unicode__ first and then > > PyObject_Str(). > > > > I hope this helps. > > Thanks for the summary. Alas, we're not done. :-( I don't have much time for this -- there still are important pieces of the type/class unification missing (e.g. comparisons and pickling don't work right, and _ must be able to make __dynamic__ the default). --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Sep 24 16:32:50 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 24 Sep 2001 17:32:50 +0200 Subject: [Python-Dev] str() vs.
unicode() References: <3BAB445B.A10584F8@lemburg.com> <200109211459.f8LExRc24735@odiug.digicool.com> <3BACB8F1.235DDE92@lemburg.com> <200109241330.JAA31034@cj20424-a.reston1.va.home.com> Message-ID: <3BAF5222.14A88E31@lemburg.com> Guido van Rossum wrote: > > > > Well, historically, str() has rarely raised exceptions, because > > > there's a default implementation (same as for repr(), returning > > > <CLASSNAME object at ADDRESS>. This is used when neither tp_repr nor tp_str is > > > set. Note that PyObject_Str() never looks at __str__ -- this is done > > > by the tp_str handler of instances (and now also by the tp_str handler > > > of new-style classes). I see no reason to change this. > > > > Me neither; what str() does not do (and unicode() does) is try > > the char buffer interface before trying tp_str. > > The meanings of these two are different: tp_str means "give me a > string that's useful for printing"; the buffer API means "let me treat > you as a sequence of 8-bit bytes (or 8-bit characters)". They are > different e.g. when you consider a PIL image, whose str() probably > returns something like '<PIL image object>' while its buffer API > probably gives access to the raw image buffer. > > The str() function should map directly to tp_str(). You *might* claim > that the 8-bit string type constructor *ought to* look at the buffer > API, but I'd say that it's easy enough for a type to provide a tp_str > implementation that does what the type wants. I guess "convert > yourself to string" is different than "display yourself as a string". Sure is :-) Ok, so let's remove the buffer API check from the list of str()/unicode() conversion checks. > > > The question then becomes, do we want unicode() to behave similarly? > > > > Given that porting an application from strings to Unicode should > > be easy, I'd say: yes. > > Fearing this ends up being a trick question, I'll say +0. If we end > up with something I don't like, I reserve the right to change my > opinion on this. Ok.
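The "long form" unicode(X, encoding, errors) under discussion here survives in today's Python as the bytes.decode() method; a quick illustration of the encoding and error-handler arguments (the byte values are chosen arbitrarily for the example):

```python
data = b'caf\xc3\xa9'  # UTF-8 encoding of 'café'

# Strict decoding with an explicit encoding:
assert data.decode('utf-8') == 'caf\xe9'

# The errors argument selects a handler instead of raising on bad input:
assert data.decode('ascii', 'replace') == 'caf\ufffd\ufffd'  # U+FFFD markers
assert data.decode('ascii', 'ignore') == 'caf'               # bad bytes dropped
```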
> > > The str() function (now constructor) converts *anything* to a string; > > > tp_str and tp_repr exist to allow objects to customize this. These > > > slots, and the str() function, take no additional arguments. To > > > invoke the equivalent of str() from C, you call PyObject_Str(). I see > > > no reason to change this; we may want to make the Unicode situation as > > > similar as possible. > > > > Right. > > > > > The unicode() function (now constructor) traditionally converted only > > > 8-bit strings to Unicode strings, > > > > Slightly incorrect: it converted 8-bit strings, objects compatible > > to the char buffer interface and instances having a __str__ method to > > Unicode. > > That's a rather random collection of APIs, if you ask me... It was modelled after the PyObject_Str() API at the time. Don't know how the buffer interface ended up in there, but I guess it was a left-over from early revisions in the design. > Also, do you really mean *instances* (i.e. objects for which > PyInstance_Check() returns true), or do you mean anything for which > getattr(x, "__str__") is true? Looking at the code from Python 2.1:

    if (!PyInstance_Check(v) ||
        (func = PyObject_GetAttr(v, strstr)) == NULL) {
        PyErr_Clear();
        res = PyObject_Repr(v);
    }
    else {
        res = PyEval_CallObject(func, (PyObject *)NULL);
        Py_DECREF(func);
    }
    ...

instances which have the __str__ attribute. > If the latter, you're in for a > surprise in 2.2 -- almost all built-in objects now respond to that > method, due to the type/class unification: whenever something has a > tp_str slot, a __str__ attribute is synthesized (and vice versa). > (Exceptions are a few obscure types and maybe 3rd party extension > types.)
Nice :-) > > To synchronize unicode() with str() we'd have to replace the __str__ > > lookup with a tp_str lookup (this will also allow things like unicode(2) > > and unicode(instance_having__str__)) and maybe also add the charbuf > > lookup to str() (this would make str() compatible with memory mapped > > files and probably a few other char buffer aware objects as well). > > I definitely don't want the latter change to str(); see above. If you > want unicode(x) to behave as much like str(x) as possible, I recommend > removing the use of the buffer API. Ok, let's remove the buffer API from unicode(). Should it still be maintained for unicode(obj, encoding, errors)? > > Note that in a discussion we had some time ago we decided that __str__ > > should be allowed to return Unicode objects as well (instead of > > defining a separate __unicode__ method/slot for this purpose). str() > > will convert a Unicode return value to an 8-bit string using the > > default encoding while unicode() takes the return value as-is. > > > > This was done to simplify moving from strings to Unicode. > > I'm now not so sure if this was the right decision. Hmm, perhaps we do need a __unicode__/tp_unicode slot after all. It would certainly help clarify the communication between the interpreter and the object. > > > with additional arguments to specify > > > the encoding (and error handling preference). There is no tp_unicode > > > slot, but for some reason there are at least three C APIs that could > > > correspond to unicode(): PyObject_Unicode() and PyUnicode_FromObject() > > > take a single object argument, and PyUnicode_FromEncodedObject() takes > > > object, encoding, and error arguments. > > > > > > The first question is, do we want the unicode() constructor to be > > > applicable in all cases where the str() constructor is? > > > > Yes. > > > > > I guess that > > > we do, since we want to be able to print to streams that support > > > Unicode.
Unicode strings render themselves as Unicode characters to > > > such a stream, and it's reasonable to allow other objects to also > > > customize their rendition in Unicode. > > > > > > Now, what should be the signature of this conversion? If we print > > > object X to a Unicode stream, should we invoke unicode(X), or > > > unicode(X, encoding, error)? I believe it should be just unicode(X), > > > since the encoding used by the stream shouldn't enter into the picture > > > here: that's just used for converting Unicode characters written to > > > the stream to some external format. > > > > > > How should an object be allowed to customize its Unicode rendition? > > > We could add a tp_unicode slot to the type object, but there's no > > > need: we can just look for a __unicode__ method and call it if it > > > exists. The signature of __unicode__ should take no further > > > arguments: unicode(X) should call X.__unicode__(). As a fallback, if > > > the object doesn't have a __unicode__ attribute, PyObject_Str() should > > > be called and the resulting string converted to Unicode using the > > > default encoding. > > > > I'd rather leave things as they are: __str__/tp_str are allowed > > to return Unicode objects and if an object wishes to be rendered > > as Unicode it can simply return a Unicode object through the > > __str__/tp_str interface. > > Can you explain your motivation? In the long run, it seems better to > me to think of __str__ as "render as 8-bit string" and __unicode__ as > "render as Unicode string". The motivation was the idea of a unification of strings and Unicode. You may be right, though, that this idea is not really practical. > > > Regarding the "long form" of unicode(), unicode(X, encoding, error), I > > > see no reason to treat this with the same generality. This form > > > should restrict X to something that supports the buffer API (IOW, > > > 8-bit string objects and things that are treated the same as these in > > > most situations). 
> > > > Hmm, but this would restrict users from implementing string-like > > objects (i.e. objects having the __str__ method to make them compatible > > with str()). > > Having __str__ doesn't make something a string-like object! A > string-like object (at least the way I understand this term) would > behave like a string, e.g. have string methods. The UserString module > is an example, and in 2.2 subclasses of the 'str' type are prime > examples. > > To convert one of these to Unicode given an encoding, shouldn't their > decode() method be used? Right... perhaps we don't need __unicode__ after all: the .decode() method already provides this functionality (on strings at least). > > > (Note that it already balks when X is a Unicode > > > string.) > > > > True -- since you normally cannot decode Unicode into Unicode using > > some 8-bit character encoding. As a result encodings which convert > > Unicode to Unicode (e.g. normalizations) cannot use this interface, > > but since these are probably only rarely used, I think it's better > > to prevent accidental usage of an 8-bit character codec on Unicode. > > Sigh. More special cases. Unicode objects do have a tp_str/__str__ > slot, but they are not acceptable to unicode(). > > Really, this is such an incredible morass of APIs that I wonder if we > shouldn't start over... There are altogether too many places in the > code where PyUnicode_Check() is used. I wish there was a better > way... Ideally, we'd need a new base class for strings and then have 8-bit and Unicode be subclasses of this base class. There are several problems with this approach though; one certainly being the different memory allocation mechanisms used (strings store the value in the object, Unicode references an external buffer), the other being the different nature: strings don't carry meta-information while Unicode is in many ways restricted in use.
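The shared-base-class idea can be sketched with a small abstract base. This is purely illustrative: `BaseString` and its subclasses are hypothetical names (nothing like this existed in 2.x, and CPython ultimately solved the problem differently), but the shape MAL describes is roughly:

```python
from abc import ABC, abstractmethod

class BaseString(ABC):
    """Hypothetical common base for 8-bit and Unicode strings."""
    @abstractmethod
    def as_unicode(self):
        """Return the value as a Unicode string."""

class ByteString(BaseString):
    """8-bit flavour: stores raw bytes plus the encoding needed to decode."""
    def __init__(self, data, encoding='ascii'):
        self.data = data
        self.encoding = encoding
    def as_unicode(self):
        return self.data.decode(self.encoding)

class TextString(BaseString):
    """Unicode flavour: already text, conversion is the identity."""
    def __init__(self, text):
        self.text = text
    def as_unicode(self):
        return self.text

assert ByteString(b'abc').as_unicode() == 'abc'
assert TextString('xyz').as_unicode() == 'xyz'
```

The memory-layout objection in the text is exactly what such a design papers over: the two subclasses store their payloads quite differently and share only the interface.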
> > > So about those C APIs: I propose that PyObject_Unicode() correspond to > > > the one-arg form of unicode(), taking any kind of object, and that > > > PyUnicode_FromEncodedObject() correspond to the three-arg form. > > > > Ok. I'll fix this once we've reached consensus on what to do > > about str() and unicode(). > > Alas, this is harder than we seem to have thought, collectively. I > want someone to sit back and rethink how this should eventually work > (say in Python 2.9), and then work backwards from there to a > reasonable API to be used in 2.2. The current piling of hack upon > hack seems hopeless. Agreed. > We have some time: 2.2a4 will be released this week, but 2.2b1 isn't > due until Oct 10, and we can even slip that a bit. Compatibility with > previous 2.2 alpha releases is not necessary; the hard compatibility > baseline is 2.1.1. > > > > PyUnicode_FromObject() shouldn't really need to exist. > > > > Note: PyUnicode_FromObject() was extended by PyUnicode_FromEncodedObject() > > and only exists for backward compatibility reasons. > > Excellent. I would like to boil this down to one API if possible, which then implements unicode(obj) and unicode(obj, encoding, errors) -- if no encoding is given, the semantics of PyObject_Str() are closely followed; with an encoding, the semantics of PyUnicode_FromEncodedObject() as it stands are used (with the buffer interface logic removed). In a first step, I'd use tp_str/__str__ for unicode(obj) as well. Later we can add a tp_unicode/__unicode__ lookup before trying tp_str/__str__ as fallback. If this sounds reasonable, I'll give it a go... > > > I don't see a > > > reason for PyUnicode_From[Encoded]Object() to use the __unicode__ > > > customization -- it should just take the bytes provided by the object > > > and decode them according to the given encoding. PyObject_Unicode(), > > > on the other hand, should look for __unicode__ first and then > > > PyObject_Str(). > > > > > > I hope this helps.
> > > > Thanks for the summary. > > Alas, we're not done. :-( > > I don't have much time for this -- there still are important pieces of > the type/class unification missing (e.g. comparisons and pickling > don't work right, and _ must be able to make __dynamic__ the default). -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@python.org Mon Sep 24 17:12:25 2001 From: guido@python.org (Guido van Rossum) Date: Mon, 24 Sep 2001 12:12:25 -0400 Subject: [Python-Dev] A baffler in test_repr In-Reply-To: Your message of "Sun, 23 Sep 2001 04:52:21 EDT." References: Message-ID: <200109241612.MAA01367@cj20424-a.reston1.va.home.com> > + Turns out I provoked this by adding classify_class_attrs() to > inspect.py, and then foolishly added a test for it <wink>. The > symptom can be provoked by one judiciously chosen line:
>
> C:\Code\python\PCbuild>python
> Python 2.2a3+ (#23, Sep 20 2001, 19:43:51) [MSC 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> staticmethod(42)            # "object at" output
> <staticmethod object at 0x...>
> >>> len(staticmethod.__dict__)  # and you think this is harmless
> 10
> >>> staticmethod(42)            # whoa! now it's "instance at" output
> <staticmethod instance at 0x...>
> >>>
>
> That is, just asking for a type's __dict__ can change the values in the type > slots. This doesn't seem right -- or does it? Argh. Sorry. The short answer: my bad, fixed in CVS now. The long answer: There's a default repr() implementation in object.c, which uses "object". This is in PyObject_Repr() when the object doesn't have a tp_repr slot. But there's also a default repr() implementation in typeobject.c, and this one used "instance". (The fix was to change it to use "object" too; it was clearly my intention that this would yield the same output as the default in PyObject_Repr().)
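The behavior Guido describes is easy to observe in modern CPython, where the fix long ago landed and there is effectively one default, object.__repr__, using the "object at" wording (the class name and address below vary, of course):

```python
class C:
    pass

r = repr(C())
# The default repr inherited from object uses the "object at" wording,
# e.g. "<__main__.C object at 0x7f...>".
assert 'object at 0x' in r
# C did not define __repr__, so it inherited the slot from object:
assert C.__repr__ is object.__repr__
```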
The version in typeobject.c is the tp_repr slot of 'object', the universal base class. When a type's initialization is completed by PyType_Ready(), a NULL tp_repr slot is replaced by the tp_repr slot inherited from its base class -- and if there's no explicit base class, the base class 'object' is assumed. Because we currently don't explicitly initialize all type objects, most types are auto-initialized on the first attribute requested from one of their instances, by PyObject_GenericGetAttr(): this calls PyType_Ready(tp) if the type is not fully initialized. Asking for a type's attribute also auto-initializes the type, because the getattr operation does the same thing that PyObject_GenericGetAttr() does. So repr(T()) does not initialize T, because it doesn't request an attribute (it merely looks at the tp_repr slot). But asking for T.__dict__ *does* initialize T. Yes, this is a mess. We may be better off requiring that all types be initialized explicitly; for most standard types, we can do that in a new function called from Py_Initialize(). But there's a catch: PyType_Ready() uses some other types as helpers, in particular it may create tuples and dictionaries and use them. So at least these two types (and of course strings, which are used as keys) have to be usable without being initialized, in order for their own initialization to be able to proceed. Maybe others. --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas@xs4all.net Mon Sep 24 20:44:50 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 24 Sep 2001 21:44:50 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.276,2.277 In-Reply-To: References: Message-ID: <20010924214450.A669@xs4all.nl> On Mon, Sep 24, 2001 at 12:32:03PM -0700, Thomas Wouters wrote: > This fixes SF bugs #463359 and #462937, and possibly other, *very* obscure > bugs with very deeply nested loops that continue the loop and then break out > of it or raise an exception.
One thing I didn't do yet was make a test case out of this. Should I, and if so, where should I put it ? BTW, this bug is living proof of Karma. I fix an insignificant little bug, Guido goes ahead and mentions it on the Python conference opening keynote with something like "thomas fixed what I couldn't", and presto, it ends up being a broken fix all along! :) Back-to-safe-obscurity-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tim.one@home.com Mon Sep 24 22:25:55 2001 From: tim.one@home.com (Tim Peters) Date: Mon, 24 Sep 2001 17:25:55 -0400 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.276,2.277 In-Reply-To: <20010924214450.A669@xs4all.nl> Message-ID: [Thomas Wouters] >> This fixes SF bugs #463359 and #462937, and possibly other, >> *very* obscure bugs with very deeply nested loops that continue >> the loop and then break out of it or raise an exception. [Thomas Wouters] > One thing I didn't do yet was make a test case out of this. > Should I, Absolutely. > and if so, where should I put it ? In a test_xxx.py file under Lib/test/, except for test_descr.py . That is, pick one where it doesn't obviously not fit, e.g. test_grammar.py. > BTW, this bug is living proof of Karma. I fix an insignificant little > bug, Guido goes ahead and mentions it on the Python conference opening > keynote with something like "thomas fixed what I couldn't", and presto, > it ends up being a broken fix all along! :) Another possibility is that Guido got so honked about you fixing something he couldn't, that he used the time machine to make it appear that you had always been swapping the arguments "by mistake". We've all suspected chicanery of this nature, right? > Back-to-safe-obscurity-ly y'rs, No no! Code more! I like your code, and it's done Python good. All obscurity will buy you is time enough to contract venereal diseases. 
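[A hypothetical sketch of the loop shape the ceval.c fix covers; this is not the regression test that was actually checked in, just an illustration of "a nested loop that continues, then breaks".]

```python
# An inner loop that first continues, then breaks.  A regression in the
# interpreter's block-stack handling would make the break or continue
# target the wrong loop.
def nested():
    hits = []
    for i in range(3):
        for j in range(3):
            for k in range(3):
                if k == 0:
                    continue   # continue the innermost loop ...
                if k == 2:
                    break      # ... then break out of it
                hits.append((i, j, k))
    return hits

# every (i, j) pair records exactly one hit, at k == 1
assert nested() == [(i, j, 1) for i in range(3) for j in range(3)]
```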
From fdrake@acm.org Mon Sep 24 22:22:50 2001 From: fdrake@acm.org (Fred L. Drake) Date: Mon, 24 Sep 2001 17:22:50 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010924212250.BC6E628845@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Added documentation for several of the new C functions added for Python 2.2 in the Python/C API reference manual. From guido@python.org Tue Sep 25 06:17:07 2001 From: guido@python.org (Guido van Rossum) Date: Tue, 25 Sep 2001 01:17:07 -0400 Subject: [Python-Dev] str() vs. unicode() In-Reply-To: Your message of "Mon, 24 Sep 2001 17:32:50 +0200." <3BAF5222.14A88E31@lemburg.com> References: <3BAB445B.A10584F8@lemburg.com> <200109211459.f8LExRc24735@odiug.digicool.com> <3BACB8F1.235DDE92@lemburg.com> <200109241330.JAA31034@cj20424-a.reston1.va.home.com> <3BAF5222.14A88E31@lemburg.com> Message-ID: <200109250517.BAA11617@cj20424-a.reston1.va.home.com> > Ok, let's remove the buffer API from unicode(). Should it still be > maintained for unicode(obj, encoding, errors) ? I think so yes. > Hmm, perhaps we do need a __unicode__/tp_unicode slot after all. > It would certainly help clarify the communication between the > interpreter and the object. Would you settle for a __unicode__ method but no tp_unicode slot? It's easy enough to define a C method named __unicode__ if the need arises. This should always be tried first, not just for classic instances. Adding a slot is a bit painful now that there are so many new slots already (adding it to the end means you have to add tons of zeros, adding it to the middle means I have to edit every file). > > To convert one of these to Unicode given an encoding, shouldn't their > > decode() method be used? > > Right... perhaps we don't need __unicode__ after all: the .decode() > method already provides this functionality (on strings at least). 
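[For comparison, the two spellings under discussion both survived into modern Python 3, where decoding bytes via the method and via the str constructor are equivalent; shown only as an illustration, since the thread itself is about Python 2.2's unicode().]

```python
# Decoding raw bytes two ways; both routes produce the same text object.
data = b"caf\xc3\xa9"          # UTF-8 encoding of "café"

via_method = data.decode("utf-8")      # the .decode() spelling
via_constructor = str(data, "utf-8")   # the constructor spelling

assert via_method == via_constructor == "caf\u00e9"
```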
So maybe we should deprecate unicode(obj, encoding[, error]) and recommend obj.decode(encoding[, error]) instead. But this means that objects with a buffer API but no decode() method cannot efficiently be decoded. That's what unicode(obj, encoding[, error]) was good for. To decide, we need to know how useful it is in practice to be able to decode buffers -- I doubt it is very useful, since most types supporting the buffer API are not text but raw data like memory-mapped files, arrays, PIL images. > > Really, this is such an incredible morass of APIs that I wonder if we > > shouldn't start over... There are altogether too many places in the > > code where PyUnicode_Check() is used. I wish there was a better > > way... > > Ideally, we'd need a new base class for strings and then have 8-bit > and Unicode be subclasses of the this base class. There are several > problems with this approach though; one certainly being the different > memory allocation mechanisms used (strings store the value in the > object, Unicode references an external buffer), the other > being the different nature: strings don't carry meta-information > while Unicode is in many ways restricted in use. I've thought of defining an abstract base class "string" from which both str and unicode derive. Since str and unicode don't share representation, they shouldn't share implementation, but they could still share interface. Certainly conceptually this is how we think of strings. Useless thought: the string class would have unbound methods that are almost the same as the functions defined in the string module, e.g. string.split(s) and string.strip(s) could be made to call s.split() and s.strip(), just like the module. The class could have data attributes for string.whitespace etc. But string.join() would have a different signature: the class method is join(s, list) while the function is join(list, s). So we can't quite make the module an alias for the class. 
:-(

> I would like to boil this down to one API if possible which then
> implements unicode(obj) and unicode(obj, encoding, errors) -- if
> no encoding is given the semantics of PyObject_Str() are closely
> followed, with encoding the semantics of PyUnicode_FromEncodedObject(),
> as it was, are used (with the buffer interface logic removed).

I would actually recommend using two different C level APIs:
PyObject_Unicode() to implement unicode(obj), which should follow
str(obj), and PyUnicode_FromEncodedObject() to implement
unicode(obj, encoding[, error]), which should use the buffer API on obj.

> In a first step, I'd use the tp_str/__str__ for unicode(obj) as
> well. Later we can add a tp_unicode/__unicode__ lookup before
> trying tp_str/__str__ as fallback.

I would add __unicode__ support without tp_unicode right away.
I would use tp_str without even looking at __str__.

> If this sounds reasonable, I'll give it a go...

Yes.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From walter@livinglogic.de  Tue Sep 25 14:36:04 2001
From: walter@livinglogic.de (Walter Dörwald)
Date: Tue, 25 Sep 2001 15:36:04 +0200
Subject: [Python-Dev] str() vs. unicode()
References: <3BAB445B.A10584F8@lemburg.com> <200109211459.f8LExRc24735@odiug.digicool.com> <3BACB8F1.235DDE92@lemburg.com> <200109241330.JAA31034@cj20424-a.reston1.va.home.com> <3BAF5222.14A88E31@lemburg.com> <200109250517.BAA11617@cj20424-a.reston1.va.home.com>
Message-ID: <3BB08844.80700@livinglogic.de>

Guido van Rossum wrote:

>> Ok, let's remove the buffer API from unicode(). Should it still be
>> maintained for unicode(obj, encoding, errors) ?
>
> I think so yes.
>
>> Hmm, perhaps we do need a __unicode__/tp_unicode slot after all.
>> It would certainly help clarify the communication between the
>> interpreter and the object.
>
> Would you settle for a __unicode__ method but no tp_unicode slot?
> It's easy enough to define a C method named __unicode__ if the need
> arises. This should always be tried first, not just for classic
> instances. Adding a slot is a bit painful now that there are so many
> new slots already (adding it to the end means you have to add tons of
> zeros, adding it to the middle means I have to edit every file).

Hmm, what about a type object initialisation function that takes
"named arguments" via varargs:

PyType_Initialize(&PyUnicode_Type,
    TYPE_TYPE, &PyType_Type,
    TYPE_NAME, "unicode",
    SLOT_DESTRUCTOR, _PyUnicode_Free,
    SLOT_CMP, unicode_compare,
    SLOT_REPR, unicode_repr,
    SLOT_SEQ, unicode_as_sequence,
    SLOT_HASH, unicode_hash,
    DONE
)

The SLOT_xxx arguments would be #defines like this

#define DONE            0
#define TYPE_TYPE       1
#define TYPE_NAME       2
#define SLOT_DESTRUCTOR 3
#define SLOT_CMP        4

Adding a new slot would require much less work: define a new slot
*somewhere* in the struct, define a new SLOT_xxx and add
SLOT_xxx, foo_xxx to the call to the initializer for every type that
implements this struct.  Performance shouldn't be a problem, because
this function would only be called once for every type.  And we could
get rid of the problem with static initialization of ob_type with
some compilers.

> [...]
>
> I would add __unicode__ support without tp_unicode right away.

I like this idea.  There is no need to piggyback unicode
representation of objects onto tp_str/__str__.  Both PyObject_Str
and PyObject_Unicode will get much simpler.

But we will need int.__unicode__, float.__unicode__ etc.
(or fallback to __str__)

BTW, what about __repr__? Should this be allowed to return unicode
objects? (currently it is, and uses PyUnicode_AsUnicodeEscapeString)

Bye,
   Walter Dörwald

From guido@python.org  Tue Sep 25 20:57:09 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 25 Sep 2001 15:57:09 -0400
Subject: [Python-Dev] str() vs. unicode()
In-Reply-To: Your message of "Tue, 25 Sep 2001 15:36:04 +0200."
<3BB08844.80700@livinglogic.de>
References: <3BAB445B.A10584F8@lemburg.com> <200109211459.f8LExRc24735@odiug.digicool.com> <3BACB8F1.235DDE92@lemburg.com> <200109241330.JAA31034@cj20424-a.reston1.va.home.com> <3BAF5222.14A88E31@lemburg.com> <200109250517.BAA11617@cj20424-a.reston1.va.home.com> <3BB08844.80700@livinglogic.de>
Message-ID: <200109251957.f8PJv9r21446@odiug.digicool.com>

> > Adding a slot is a bit painful now that there are so many
> > new slots already (adding it to the end means you have to add tons of
> > zeros, adding it to the middle means I have to edit every file).
>
> Hmm, what about a type object initialisation function that takes
> "named arguments" via varargs:
> PyType_Initialize(&PyUnicode_Type,
>     TYPE_TYPE, &PyType_Type,
>     TYPE_NAME, "unicode",
>     SLOT_DESTRUCTOR, _PyUnicode_Free,
>     SLOT_CMP, unicode_compare,
>     SLOT_REPR, unicode_repr,
>     SLOT_SEQ, unicode_as_sequence,
>     SLOT_HASH, unicode_hash,
>     DONE
> )
>
> The SLOT_xxx arguments would be #defines like this
> #define DONE 0
> #define TYPE_TYPE 1
> #define TYPE_NAME 2
> #define SLOT_DESTRUCTOR 3
> #define SLOT_CMP 4
>
> Adding a new slot would require much less work: define a new slot
> *somewhere* in the struct, define a new SLOT_xxx and add
> SLOT_xxx, foo_xxx
> to the call to the initializer for every type that implements this
> struct. Performance shouldn't be a problem, because this function
> would only be called once for every type. And we could get rid of
> the problem with static initialization of ob_type with some
> compilers.

Cool idea.  It would definitely be worth pursuing this when starting
from scratch.  Right now, it would only slow us down to convert all
the existing statically initialized types to use this mechanism.
Also, for some of the built-in types we'd have to decide on a point
in the initialization sequence where to initialize them.

> > [...]
> >
> > I would add __unicode__ support without tp_unicode right away.
>
> I like this idea.
> There is no need to piggyback unicode representation of objects
> onto tp_str/__str__. Both PyObject_Str and PyObject_Unicode will
> get much simpler.
>
> But we will need int.__unicode__, float.__unicode__ etc.
> (or fallback to __str__)

We should fallback to tp_str -- for most of these types there's never
a need to generate non-ASCII characters so using the ASCII
representation and converting that to Unicode would work just fine.

> BTW, what about __repr__? Should this be allowed to return unicode
> objects? (currently it is, and uses PyUnicode_AsUnicodeEscapeString)

But this is rarely what the caller expects, and it violates the
guideline that repr() should return something that can be fed back to
the parser.  I'd rather change the rules to require that __repr__ and
tp_repr return an 8-bit string at all times.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gward@mems-exchange.org  Tue Sep 25 22:25:48 2001
From: gward@mems-exchange.org (Greg Ward)
Date: Tue, 25 Sep 2001 17:25:48 -0400
Subject: [Python-Dev] Null bytes in source code
Message-ID: <20010925172548.A30282@mems-exchange.org>

Weird: in the current CVS, certain files in Lib have null bytes (ASCII
0) in them.  Install-time byte-compilation complains about this, so
"make install" fails.

Eg. here's what "less Lib/reconvert.py" shows me:

------------------------------------------------------------------------
#! /usr/bin/env python1.5

r"""Convert old ("regex") regular expressions to new syntax ("re").

When imported as a module, there are two functions, with their own
strings:

  convert(s, syntax=None) -- convert a regex regular expression
      to re syntax

  ^@^@^@^@te(s) -- return a quoted string literal
[...]
------------------------------------------------------------------------

Those "^@"'s are really ASCII 0 -- I checked with "od -c".

Am I on drugs here?
-- Greg Ward - software developer gward@mems-exchange.org MEMS Exchange http://www.mems-exchange.org From skip@pobox.com (Skip Montanaro) Tue Sep 25 22:40:11 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Tue, 25 Sep 2001 16:40:11 -0500 Subject: [Python-Dev] Null bytes in source code In-Reply-To: <20010925172548.A30282@mems-exchange.org> References: <20010925172548.A30282@mems-exchange.org> Message-ID: <15280.63931.38641.567914@beluga.mojam.com> Greg> Weird: in the current CVS, certain files in Lib have null bytes Greg> (ASCII 0) in them. Greg, I just did a cvs up. Lib/reconvert.py looks fine to me. It was not modified by the cvs up, so I assume nobody fixed any bugs. I suggest you delete (or move out of the way) any files containing ASCII NUL and cvs up again. On second thought, perhaps you should perform a quick backup of your important files first. Maybe your disk drive is going on holiday... :-( Skip From barry@zope.com Tue Sep 25 22:41:18 2001 From: barry@zope.com (Barry A. Warsaw) Date: Tue, 25 Sep 2001 17:41:18 -0400 Subject: [Python-Dev] Null bytes in source code References: <20010925172548.A30282@mems-exchange.org> Message-ID: <15280.63998.551309.563245@anthem.wooz.org> >>>>> "GW" == Greg Ward writes: GW> Weird: in the current CVS, certain files in Lib have null GW> bytes (ASCII 0) in them. Install-time byte-compilation GW> complains about this, so "make install" fails. GW> Eg. here's what "less Lib/reconvert.py" shows me me: GW> ------------------------------------------------------------------------ GW> #! /usr/bin/env python1.5 GW> r"""Convert old ("regex") regular expressions to new syntax GW> ("re"). GW> When imported as a module, there are two functions, with their GW> own strings: GW> convert(s, syntax=None) -- convert a regex regular GW> expression to re syntax GW> ^@^@^@^@te(s) -- return a quoted string literal [...] 
GW> ------------------------------------------------------------------------ GW> Those "^@"'s are really ASCII 0 -- I checked with "od -c". GW> Am I on drugs here? Probably! A cvs update as of 2 minutes ago showed nothing like this in the file for me. say-hi-to-the-flying-pink-elephants-for-me-ly y'rs, -Barry From gward@mems-exchange.org Tue Sep 25 22:50:05 2001 From: gward@mems-exchange.org (Greg Ward) Date: Tue, 25 Sep 2001 17:50:05 -0400 Subject: [Python-Dev] Null bytes in source code In-Reply-To: <15280.63998.551309.563245@anthem.wooz.org> References: <20010925172548.A30282@mems-exchange.org> <15280.63998.551309.563245@anthem.wooz.org> Message-ID: <20010925175005.A31334@mems-exchange.org> On 25 September 2001, Barry A. Warsaw said: > Probably! A cvs update as of 2 minutes ago showed nothing like this > in the file for me. As Skip suggested, I deleted the affected files, cvs up'd, and now no problems. OK, where's that disk diagnostic program again... or maybe I should just take more drugs and forget about all my worries... > say-hi-to-the-flying-pink-elephants-for-me-ly y'rs, No way, the flying pink elephants are carrying MACHINE GUNS! Aiiee!! Time for a kinder, gentler hallucinogen... Greg From tim@zope.com Tue Sep 25 22:50:55 2001 From: tim@zope.com (Tim Peters) Date: Tue, 25 Sep 2001 17:50:55 -0400 Subject: [Python-Dev] Null bytes in source code In-Reply-To: <20010925172548.A30282@mems-exchange.org> Message-ID: [Greg Ward] > Weird: in the current CVS, certain files in Lib have null bytes (ASCII > 0) in them. Install-time byte-compilation complains about this, so > "make install" fails. > > Eg. here's what "less Lib/reconvert.py" shows me me: > > ------------------------------------------------------------------------ > #! /usr/bin/env python1.5 > > r"""Convert old ("regex") regular expressions to new syntax ("re"). 
>
> When imported as a module, there are two functions, with their own
> strings:
>
>   convert(s, syntax=None) -- convert a regex regular expression
>       to re syntax
>
>   ^@^@^@^@te(s) -- return a quoted string literal
> [...]
> ------------------------------------------------------------------------
>
> Those "^@"'s are really ASCII 0 -- I checked with "od -c".
>
> Am I on drugs here?

This is a problem with running on a system other than Windows.  I get
this output from the attached program:

    checking .py files in C:\Code\python\Lib
    checked 165 files

    checking .py files in C:\Code\python\Lib\test
    checked 201 files

import os

dir = os.getcwd()
print "checking .py files in", dir
count = 0
for fname in os.listdir(dir):
    if fname.endswith('.py'):
        count += 1
        f = file(fname, 'rb')
        guts = f.read()
        i = guts.find('\x00')
        if i >= 0:
            print "Whoa! Null byte at offset", i, "in", fname
        f.close()
print "checked", count, "files"

no-nulls-under-a-real-os-ly y'rs  - tim

From thomas@xs4all.net  Wed Sep 26 13:48:58 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 26 Sep 2001 14:48:58 +0200
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.276,2.277
In-Reply-To:
References: <20010924214450.A669@xs4all.nl>
Message-ID: <20010926144858.E844@xs4all.nl>

On Mon, Sep 24, 2001 at 05:25:55PM -0400, Tim Peters wrote:

> That is, pick one where it doesn't obviously not fit, e.g. test_grammar.py.

Done.

> > Back-to-safe-obscurity-ly y'rs,
> All obscurity will buy you is time enough to contract venereal diseases.

And you would know, eh ? :-)

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From akuchlin@mems-exchange.org  Wed Sep 26 15:00:56 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 26 Sep 2001 10:00:56 -0400
Subject: [Python-Dev] Crude Python->Parrot compiler
Message-ID:

Over the last few days I've been experimenting with a simple
Python->Parrot compiler.
This morning I finally implemented binding variables to a register, so
it now can actually compute something.  See
http://www.mems-exchange.org/software/files/parrot-gen.py for the code.

Limitations:

* Currently this only understands a *really* limited subset of Python.

* The only allowable type is integer; strings, floats, long ints, &c.,
  aren't supported at all.

* It will die with an assertion if you feed it a language construct it
  doesn't handle (def, class, most operators).

* Code structure is suboptimal; this is just a quick-and-dirty hack.

Example usage:

ute parrot>cat euclid.py
# Python implementation of Euclid's algorithm
m = 96
n = 64
print m,n
r = m % n
while r != 0:
    m = n
    n = r
    r = m % n
print n

ute parrot>python euclid.py
96 64
32

ute parrot>python parrot-gen.py -r euclid.py
96 64
32

ute parrot>cat euclid.pasm
main:
        set I3, 96
        set I10, 64
        set I9, 0
        add I0, I3, I9
        ...
...

Currently the Parrot interpreter only supports integer, floating
point, and string registers.  There's no way to store the contents of
a register in memory as far as I can tell, and PMCs -- the polymorphic
objects that would correspond to PyObjects -- aren't implemented
either.  This means it's not possible to handle general variables, and
therefore we'll have to wait for PMCs before general Python programs
can be handled.

--amk

From guido@python.org  Wed Sep 26 15:25:35 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 26 Sep 2001 10:25:35 -0400
Subject: [Python-Dev] Crude Python->Parrot compiler
In-Reply-To: Your message of "Wed, 26 Sep 2001 10:00:56 EDT."
References:
Message-ID: <200109261425.f8QEPZd23217@odiug.digicool.com>

Wow.  Cool!  You should cc the language-dev list.  I got some
complaints from the Perlers there that the Python community seems
uninterested in Parrot -- this is proof we *are* interested!
--Guido van Rossum (home page: http://www.python.org/~guido/) From simon@netthink.co.uk Wed Sep 26 15:24:19 2001 From: simon@netthink.co.uk (Simon Cozens) Date: Wed, 26 Sep 2001 15:24:19 +0100 Subject: [Python-Dev] Crude Python->Parrot compiler In-Reply-To: <200109261425.f8QEPZd23217@odiug.digicool.com> References: <200109261425.f8QEPZd23217@odiug.digicool.com> Message-ID: <20010926152419.A929@netthink.co.uk> On Wed, Sep 26, 2001 at 10:25:35AM -0400, Guido van Rossum wrote: > Wow. Cool! You should cc the language-dev list. I got some > complaints from the Perlers there that the Python community seems > uninterested in Parrot -- this is proof we *are* interested! I've seen. I'm convinced. Uhm. Wow, cool. Andrew, are you planning to do any more on this? If so, let me know if we can do anything that will make it easier for you - more information, more things implemented, or whatever. You know, it would be quite an embarrassment to me if there was a Python->Parrot compiler before a Perl->Parrot one. Not that I'm offering a challenge, or anything. ;) -- UNIX was not designed to stop you from doing stupid things, because that would also stop you from doing clever things. -- Doug Gwyn From pedroni@inf.ethz.ch Wed Sep 26 16:08:54 2001 From: pedroni@inf.ethz.ch (Samuele Pedroni) Date: Wed, 26 Sep 2001 17:08:54 +0200 Subject: [Python-Dev] Crude Python->Parrot compiler References: <200109261425.f8QEPZd23217@odiug.digicool.com> <20010926152419.A929@netthink.co.uk> Message-ID: <005201c1469d$297619a0$4d5821c0@newmexico> [Simon Cozens] > You know, it would be quite an embarrassment to me if there was a > Python->Parrot compiler before a Perl->Parrot one. Not that I'm offering a > challenge, or anything. ;) Don't worry, spitting out bytecodes is the easy part ... anyway ;) When there is PMCs support in place and someone begins to reimplement Python runtime and submitting a lot of patches to adjust things ... then you should worry ;) regards. 
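[As a quick check on the example program being compiled in this thread: Euclid's algorithm as euclid.py runs it, rewritten as a function in modern Python; the original used Python 2 print statements.]

```python
# Euclid's algorithm, the computation performed by the euclid.py example
# discussed above: repeatedly replace (m, n) by (n, m % n).
def gcd(m, n):
    r = m % n
    while r != 0:
        m = n
        n = r
        r = m % n
    return n

# the example's inputs: gcd(96, 64) is 32, matching the session output
assert gcd(96, 64) == 32
```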
From akuchlin@mems-exchange.org Wed Sep 26 17:01:00 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 26 Sep 2001 12:01:00 -0400 Subject: [Python-Dev] Crude Python->Parrot compiler In-Reply-To: <20010926152419.A929@netthink.co.uk>; from simon@netthink.co.uk on Wed, Sep 26, 2001 at 03:24:19PM +0100 References: <200109261425.f8QEPZd23217@odiug.digicool.com> <20010926152419.A929@netthink.co.uk> Message-ID: <20010926120100.B14177@ute.mems-exchange.org> On Wed, Sep 26, 2001 at 03:24:19PM +0100, Simon Cozens wrote: >Andrew, are you planning to do any more on this? If so, let me know if we can >do anything that will make it easier for you - more information, more things >implemented, or whatever. It's been entertaining so far, and I'll probably keep playing with it, but I have no intention of becoming the person responsible for Parrot/Python interfacing long-term. The only helpful thing that needs to be implemented would be PMCs, and that's scheduled for Parrot 0.0.3. Without PMCs, there seems little else productive that can be done. (Oh, I did send a trivial patch to Gregor to make assemble.pl return a non-zero status when there's an assembly error; otherwise parrot-gen.py has no way to detect a problem in its output.) Question for Guido and Simon: Should I check this into sandbox/ in the Python CVS tree, or do you want to pull it into the Parrot CVS? I'd like to have it in a public CVS tree someplace, preferably one where I have checkin privileges. sandbox/ is fine with me. --amk From walter@livinglogic.de Wed Sep 26 18:11:00 2001 From: walter@livinglogic.de (Walter =?ISO-8859-1?Q?D=F6rwald?=) Date: Wed, 26 Sep 2001 19:11:00 +0200 Subject: [Python-Dev] str() vs. 
unicode()
References: <3BAB445B.A10584F8@lemburg.com> <200109211459.f8LExRc24735@odiug.digicool.com> <3BACB8F1.235DDE92@lemburg.com> <200109241330.JAA31034@cj20424-a.reston1.va.home.com> <3BAF5222.14A88E31@lemburg.com> <200109250517.BAA11617@cj20424-a.reston1.va.home.com> <3BB08844.80700@livinglogic.de> <200109251957.f8PJv9r21446@odiug.digicool.com>
Message-ID: <3BB20C24.4090805@livinglogic.de>

Guido van Rossum wrote:

> [...]
>
>> BTW, what about __repr__? Should this be allowed to return unicode
>> objects? (currently it is, and uses PyUnicode_AsUnicodeEscapeString)
>
> But this is rarely what the caller expects, and it violates the
> guideline that repr() should return something that can be fed back to
> the parser.

I'd say this is a bug in the parser! ;)

> I'd rather change the rules to require that __repr__ and
> tp_repr return an 8-bit string at all times.

Sounds reasonable and again makes the implementation simpler
(as long as we're not in an all unicode world).

Bye,
   Walter Dörwald

From guido@python.org  Wed Sep 26 18:51:50 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 26 Sep 2001 13:51:50 -0400
Subject: [Python-Dev] Crude Python->Parrot compiler
In-Reply-To: Your message of "Wed, 26 Sep 2001 12:01:00 EDT." <20010926120100.B14177@ute.mems-exchange.org>
References: <200109261425.f8QEPZd23217@odiug.digicool.com> <20010926152419.A929@netthink.co.uk> <20010926120100.B14177@ute.mems-exchange.org>
Message-ID: <200109261751.f8QHppn23941@odiug.digicool.com>

> Question for Guido and Simon: Should I check this into sandbox/ in the
> Python CVS tree, or do you want to pull it into the Parrot CVS?  I'd
> like to have it in a public CVS tree someplace, preferably one where I
> have checkin privileges.  sandbox/ is fine with me.

If Simon can give you write access in a corner of the Parrot tree,
that might be the best solution; if not, sandbox is fine with me too.
--Guido van Rossum (home page: http://www.python.org/~guido/) From gmcm@hypernet.com Wed Sep 26 21:40:03 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Wed, 26 Sep 2001 16:40:03 -0400 Subject: [Python-Dev] imputil and modulefinder replacements Message-ID: <3BB204E3.23339.3A9F068C@localhost> Dev-er's After numerous false-starts and dead-ends, I've come up with two modules that do what imputil and modulefinder do, but better. Code and detailed descriptions are available here: http://www.mcmillan-inc.com/importhooks.html I would like to propose these (or something quite like them) as replacements for the official versions. The code is quite similar (in fact, the modulefinder code could have been written by subclassing the imputil stuff, but I wrote them the other way 'round). If the charter of the Import-SIG is not as dead as the list is, I would promote the basic structure as a potential model for a reformed import. For now, though, it's enough to consider the code. The differences are too extreme to consider these patches, but the subject hardly seems PEPable so I bring it up here. - Gordon From guido@python.org Wed Sep 26 21:47:03 2001 From: guido@python.org (Guido van Rossum) Date: Wed, 26 Sep 2001 16:47:03 -0400 Subject: [Python-Dev] Crude Python->Parrot compiler In-Reply-To: Your message of "Wed, 26 Sep 2001 13:51:50 EDT." <200109261751.f8QHppn23941@odiug.digicool.com> References: <200109261425.f8QEPZd23217@odiug.digicool.com> <20010926152419.A929@netthink.co.uk> <20010926120100.B14177@ute.mems-exchange.org> <200109261751.f8QHppn23941@odiug.digicool.com> Message-ID: <200109262047.f8QKl3O25053@odiug.digicool.com> Andrew, which version of the compiler package do you use? When I try to use the version that Jeremy just checked into the standard library, I get a traceback: $ ~/python/src/linux/python parrot-gen.py -S euclid.py Traceback (most recent call last): File "parrot-gen.py", line 405, in ? 
    main()
  File "parrot-gen.py", line 395, in main
    generate_assembly(filename, asm_filename)
  File "parrot-gen.py", line 355, in generate_assembly
    lnf = walk(ast, LocalNameFinder(), 0)
  File "/home/guido/python/src/Lib/compiler/visitor.py", line 114, in walk
    walker.preorder(tree, visitor)
AttributeError: 'int' object has no attribute 'preorder'
$

--Guido van Rossum (home page: http://www.python.org/~guido/)

From thomas@xs4all.net  Thu Sep 27 00:50:55 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 27 Sep 2001 01:50:55 +0200
Subject: [Python-Dev] SOURCEFORGE.NET UPDATE - 2001-09-26 (fwd)
Message-ID: <20010927015055.H844@xs4all.nl>

For those of you who don't read this thingy they sent out, see the below
abridged version. We might need to create a checkins list with a single
member, python-checkins@python.org :) The same goes for other projects
hosted at SF, such as Mailman or whatever your pet project is.

----- Forwarded message from Mailer -----

Date: Wed, 26 Sep 2001 16:26:25 -0700
From: Mailer
To: ""
Subject: SOURCEFORGE.NET UPDATE - 2001-09-26

[ snip ]

5) SF.NET E-MAIL POLICY CHANGES

[ snip ]

SENDING EMAIL FROM SOURCEFORGE.NET SERVERS

SourceForge.net has seen an increase in abuse of site resources by
spammers over the past few months.  As a result, steps are being taken
to prevent most SourceForge.net servers from sending e-mail outside
the SourceForge.net site; this will compartmentalize any issues which
remain present.

The following SourceForge.net domains will still be able to receive
e-mail from SourceForge.net hosts:

- sourceforge.net (also sf.net)
- lists.sourceforge.net (lists.sf.net)
- users.sourceforge.net (users.sf.net)

This policy change has been active on the project shell servers for
the past couple of months.  In the next month, this change will be
implemented on the project CVS servers.  If your project currently
makes use of syncmail (or similar mechanisms) to send e-mail when CVS
commits occur, please take notice of this change.
It will be necessary for all e-mail originating from the project CVS
servers to be directed to a SourceForge.net user account
(username@users.sourceforge.net) or to a SourceForge.net-hosted
mailing list.

[ snip ]

----- End forwarded message -----

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From guido@python.org  Thu Sep 27 16:54:20 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 27 Sep 2001 11:54:20 -0400
Subject: [Python-Dev] SOURCEFORGE.NET UPDATE - 2001-09-26 (fwd)
In-Reply-To: Your message of "Thu, 27 Sep 2001 01:50:55 +0200." <20010927015055.H844@xs4all.nl>
References: <20010927015055.H844@xs4all.nl>
Message-ID: <200109271554.f8RFsK625849@odiug.digicool.com>

> For those of you who don't read this thingy they sent out, see the below
> abridged version. We might need to create a checkins list with a single
> member, python-checkins@python.org :) The same goes for other projects
> hosted at SF, such as Mailman or whatever your pet project is.

Sigh.  I trust you & Barry to take care of this, OK?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org  Thu Sep 27 17:38:33 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 27 Sep 2001 12:38:33 -0400
Subject: [Python-Dev] Release 2.2a4 going out tomorrow
Message-ID: <200109271638.f8RGcYr31119@odiug.digicool.com>

Unless bad things happen, we'll release 2.2a4 tomorrow.  I've just
created the release branch (r22a4-branch) and will be making some
small version-adjusting changes to it next.  As before, please don't
make checkins to this branch -- if you have something that you'd like
to go in, please check it in to the trunk and send an email to
python-dev.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake@acm.org  Thu Sep 27 22:09:33 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 27 Sep 2001 17:09:33 -0400 Subject: [Python-Dev] Python 2.2a4 docs frozen Message-ID: <15283.38285.231035.778383@grendel.zope.com> The documentation for Python 2.2a4 is now frozen; the HTML packages have been pushed to the server and the online version is available on python.org from the Python 2.2 pages. SourceForge will be updated momentarily. The trunk is *not* frozen; updates can continue there. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake@acm.org Thu Sep 27 22:18:17 2001 From: fdrake@acm.org (Fred L. Drake) Date: Thu, 27 Sep 2001 17:18:17 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010927211817.5C90128694@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Documentation as released for Python 2.2 alpha 4. From loewis@informatik.hu-berlin.de Fri Sep 28 08:22:19 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Fri, 28 Sep 2001 09:22:19 +0200 (MEST) Subject: [Python-Dev] Canned Responses Message-ID: <200109280722.JAA28446@pandora.informatik.hu-berlin.de> I think the Canned Responses are a nice feature. However, being in a non-admin position, I cannot find out what the exact text of the response is. I've used "Missing Upload" myself, so I know what text to expect and when to use it. I can also guess what "RH 7 Compile (LONG_BIT)" is likely going to say (having typed in a similar response quite a few times). What about the others ("Float precision","Feature -> PEP42")? Can somebody please post the text associated with them? 
Thanks, Martin From loewis@informatik.hu-berlin.de Fri Sep 28 09:12:16 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Fri, 28 Sep 2001 10:12:16 +0200 (MEST) Subject: [Python-Dev] stat return types Message-ID: <200109280812.KAA03126@paros.informatik.hu-berlin.de> Nick Mathewson has developed a patch that allows stat fields to be accessed by name, rather than by tuple index, while preserving backwards compatibility; see http://sourceforge.net/tracker/?func=detail&atid=305470&aid=462296&group_id=5470 One issue is non-posix stat implementations, which currently live in riscos and mac. Any comments on the patch are appreciated, in particular with regard to the following questions:

- should we attempt to modify those modules even though we cannot test the modifications, or should we accept that field access will not be possible on these platforms until somebody with platform access fixes them?
- should the patch go into 2.2 (personally, I'd like to see it integrated)?

Regards, Martin From guido@python.org Fri Sep 28 14:54:08 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 28 Sep 2001 09:54:08 -0400 Subject: [Python-Dev] Canned Responses In-Reply-To: Your message of "Fri, 28 Sep 2001 09:22:19 +0200." <200109280722.JAA28446@pandora.informatik.hu-berlin.de> References: <200109280722.JAA28446@pandora.informatik.hu-berlin.de> Message-ID: <200109281354.JAA25665@cj20424-a.reston1.va.home.com> > I think the Canned Responses are a nice feature. However, being in a > non-admin position, I cannot find out what the exact text of the > response is. I've used "Missing Upload" myself, so I know what text > to expect and when to use it. I can also guess what "RH 7 Compile > (LONG_BIT)" is likely going to say (having typed in a similar > response quite a few times). > > What about the others ("Float precision","Feature -> PEP42")? Can > somebody please post the text associated with them? Geez, SF's permissions are really totally incomprehensible.
(The same problem I have with Zope permissions: there are so many of them that you never know which permission is needed for a particular operation.) I've given you some more permissions (all except project admin :-), see if you can see these now. Also, here are the standard messages we currently have:

In the Bugs tracker:

Feature->PEP42

I've added this feature request to PEP 42.

Float precision

This is not a bug. Binary floating point cannot represent decimal fractions exactly, so some rounding always occurs (even in Python 1.5.2). What changed is that Python 2.0 shows more precision than before in certain circumstances (repr() and the interactive prompt). You can use str() or print to get the old, rounded output:

>>> print 0.1+0.1
0.2
>>>

Follow the link for a detailed example: http://www.python.org/cgi-bin/moinmoin/RepresentationError

Missing upload

There's no uploaded file! You have to check the checkbox labeled "Check to Upload & Attach File" when you upload a file. Please try again. (This is a SourceForge annoyance that we can do nothing about. :-( )

RH 7 compile (LONG_BIT)

The GCC version that comes with Red Hat 7.0 is not fit for distribution. In particular, it defines LONG_BIT as 64 on 32-bit machines under certain circumstances, and letting this go unchecked would cause it to generate bad code for various Python arithmetic operations. The solution is to download a valid version of GCC. See http://www.python.org/cgi-bin/moinmoin/FrequentlyAskedQuestions#line53 for more information.

In the Patches tracker:

Missing upload

There's no uploaded file! You have to check the checkbox labeled "Check to Upload & Attach File" when you upload a file. Please try again. (This is a SourceForge annoyance that we can do nothing about.
:-( ) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Sep 28 15:11:02 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 28 Sep 2001 10:11:02 -0400 Subject: [Python-Dev] stat return types In-Reply-To: Your message of "Fri, 28 Sep 2001 10:12:16 +0200." <200109280812.KAA03126@paros.informatik.hu-berlin.de> References: <200109280812.KAA03126@paros.informatik.hu-berlin.de> Message-ID: <200109281411.KAA28663@cj20424-a.reston1.va.home.com> > Nich Mathewson has developed a patch that allows to access stat fields > by name, rather than by tuple index, while preserving backwards > compatibility, see > > http://sourceforge.net/tracker/?func=detail&atid=305470&aid=462296&group_id=5470 (I still have a comment on the patch.) > One issue is non-posix stat implementations, which currently live in > riscos and mac. Any comments on the patch are appreciated, in > particular with regard to the following question: > > - should we attempt to modify those modules even though we cannot test > the modifications, or should we accept that field access will not be > possible on these platforms until somebody with platform access > fixes them? We should attempt to fix these implementations and notify the authors. > - should the patch go into 2.2 (personally, I'd like to see it > integrated)? Yes. 
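[For reference, the by-name access discussed in this thread did land: os.stat has returned a stat_result with named fields ever since, while remaining usable as a tuple. A minimal sketch of both access styles, run against the current directory:]

```python
import os

# os.stat() returns a stat_result whose fields can be read by name or,
# for backward compatibility, by tuple index.
st = os.stat(".")

# st_mode is index 0 and st_size is index 6 in the old tuple layout.
assert st.st_mode == st[0]
assert st.st_size == st[6]

# The result still behaves like the classic 10-tuple.
mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime = st
assert size == st.st_size
```

The backwards-compatible design means old code that indexes or unpacks the result keeps working unchanged.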
--Guido van Rossum (home page: http://www.python.org/~guido/) From loewis@informatik.hu-berlin.de Fri Sep 28 15:39:34 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Fri, 28 Sep 2001 16:39:34 +0200 (MEST) Subject: [Python-Dev] Canned Responses In-Reply-To: <200109281354.JAA25665@cj20424-a.reston1.va.home.com> (message from Guido van Rossum on Fri, 28 Sep 2001 09:54:08 -0400) References: <200109280722.JAA28446@pandora.informatik.hu-berlin.de> <200109281354.JAA25665@cj20424-a.reston1.va.home.com> Message-ID: <200109281439.QAA17794@paros.informatik.hu-berlin.de> > I've given you some more permissions (all except project admin :-), > see if you can see these now. Thanks, with these permissions, I can now see all the canned responses, as well as performing the admin actions for the trackers (create categories, add and modify canned responses, ...) Regards, Martin From barry@zope.com Fri Sep 28 18:02:52 2001 From: barry@zope.com (Barry A. Warsaw) Date: Fri, 28 Sep 2001 13:02:52 -0400 Subject: [Python-Dev] RELEASED: Python 2.2a4 is out! Message-ID: <15284.44348.794514.487554@anthem.wooz.org> We've released Python 2.2a4, the fourth and likely last alpha of Python 2.2, for your erudition, escapism, or enthusiastic epithets. Download it from: http://www.python.org/2.2/ Give it a good try, and report what breaks to the bug tracker: http://sourceforge.net/bugs/?group_id=5470 New features in this release include: - pydoc and inspect are now aware of new-style classes - ExtensionClass modules can now safely be used with Python 2.2 and Zope 2.4.1 now works with Python 2.2. - For new-style classes, what was previously called __getattr__ is now called __getattribute__, since its semantics are different than classic __getattr__. - The builtin file type can now be subclassed. "file" is the name of a builtin type, file() is a builtin constructor, and has the same signature as builtin open(). - More support for iterators in some standard classes. 
- A new package called "email" has been added; a new module called SimpleXMLRPCServer has been added; the codecs module has grown four new helper APIs; smtplib now supports various authentication and security features; a new module called hmac has been added... - Large file support (LFS) is now automatic when the underlying platform supports it. - Compaq's iPAQ handheld, running the "familiar" Linux distribution (http://familiar.handhelds.org) is supported. As usual, Andrew Kuchling is writing a gentle introduction to the most important changes (currently excluding type/class unification), titled "What's New in Python 2.2": http://www.amk.ca/python/2.2/ There is an introduction to the type/class unification at: http://www.python.org/2.2/descrintro.html Thanks to everybody who contributed to this release, including all the 2.2 alpha 1, 2, and 3 testers! Enjoy! -Barry From tim.one@home.com Fri Sep 28 20:06:25 2001 From: tim.one@home.com (Tim Peters) Date: Fri, 28 Sep 2001 15:06:25 -0400 Subject: [Python-Dev] Canned Responses In-Reply-To: <200109281354.JAA25665@cj20424-a.reston1.va.home.com> Message-ID: > Float precision > This is not a bug. > > Binary floating point cannot represent decimal fractions exactly, > so some rounding always occurs (even in Python 1.5.2). > > What changed is that Python 2.0 shows more precision than before > in certain circumstances (repr() and the interactive prompt). > > You can use str() or print to get the old, rounded output: > > >>> print 0.1+0.1 > 0.2 > >>> > > Follow the link for a detailed example: > > http://www.python.org/cgi-bin/moinmoin/RepresentationError Note that I've since changed the last two lines: Follow the link for more information: http://python.sourceforge.net/devel-docs/tut/node14.html From mal@lemburg.com Fri Sep 28 21:43:07 2001 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Fri, 28 Sep 2001 22:43:07 +0200 Subject: [Python-Dev] Re: __op__ and __rop__ ([Python-checkins] CVS: python/dist/src PLAN.txt,1.10,1.11) References: Message-ID: <3BB4E0DB.7DB3CC82@lemburg.com> Guido van Rossum wrote:
>
> + Treat all binary operators the same way as I just did for rich
> + comparison: in a <op> b, if isinstance(b, type(a)), try b.__rop__(a)
> + before trying a.__op__(b).

Some comments (could be that I'm misunderstanding your note...):

- doesn't this cause an incompatibility with classic instances (__rop__ is a fallback to __op__, not the other way around)?
- it seems a huge performance loss to first check for __rop__ (in my experience, __rop__ is only rarely implemented)

The classic scheme for this is documented in abstract.c:

Calling scheme used for binary operations:

  v     w       Action
  -------------------------------------------------------------------
  new   new     v.op(v,w), w.op(v,w)
  new   old     v.op(v,w), coerce(v,w), v.op(v,w)
  old   new     w.op(v,w), coerce(v,w), v.op(v,w)
  old   old     coerce(v,w), v.op(v,w)

Legend:
-------
* new == new style number
* old == old style number
* Action indicates the order in which operations are tried until either a valid result is produced or an error occurs.

Most (if not all) Python builtin numeric types are new-style numbers. How would your plan fit into this picture? -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@python.org Fri Sep 28 21:50:07 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 28 Sep 2001 16:50:07 -0400 Subject: [Python-Dev] channews.rdf Message-ID: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> Looking at the new webstats (still at http://www.python.org/~thomas/wwwstats.new/ -- Thomas, when are you going to move these to a better place?)
I noticed that /channews.rdf is the most requested file after / -- but the channews.rdf file hasn't been updated in ages! This page used to be used by the "my netscape" python channel, but I can't find that any more ("my netscape" no longer seems to support personalized channels?). I wonder what other services reference it automatically? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Sep 28 22:10:03 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 28 Sep 2001 17:10:03 -0400 Subject: [Python-Dev] Re: __op__ and __rop__ ([Python-checkins] CVS: python/dist/src PLAN.txt,1.10,1.11) In-Reply-To: Your message of "Fri, 28 Sep 2001 22:43:07 +0200." <3BB4E0DB.7DB3CC82@lemburg.com> References: <3BB4E0DB.7DB3CC82@lemburg.com> Message-ID: <200109282110.RAA12336@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> >
> > + Treat all binary operators the same way as I just did for rich
> > + comparison: in a <op> b, if isinstance(b, type(a)), try b.__rop__(a)
> > + before trying a.__op__(b).

I didn't write the whole constraint: I should have added "if type(a) is not type(b)" before "isinstance(b, type(a))".

> Some comments (could be that I'm misunderstanding your note...):

I hope so.

> - doesn't this cause an incompatibility with classic instances
> (__rop__ is a fallback to __op__, not the other way around)?

This change only applies to new-style classes.

> - it seems a huge performance loss to first check for __rop__
> (in my experience, __rop__ is only rarely implemented)

But that presumably means that you are only using your operators with both operands being of the same type, and then __op__ is tried first.
> The classic scheme for this is documented in abstract.c:
>
> Calling scheme used for binary operations:
>
>   v     w       Action
>   -------------------------------------------------------------------
>   new   new     v.op(v,w), w.op(v,w)
>   new   old     v.op(v,w), coerce(v,w), v.op(v,w)
>   old   new     w.op(v,w), coerce(v,w), v.op(v,w)
>   old   old     coerce(v,w), v.op(v,w)
>
> Legend:
> -------
> * new == new style number
> * old == old style number
> * Action indicates the order in which operations are tried until either
>   a valid result is produced or an error occurs.
>
> Most (if not all) Python builtin numeric types are new-style
> numbers. How would your plan fit into this picture ?

Because of the constraint isinstance(b, type(a)), this would only be done for the "new new" case. I have to admit I haven't done enough research to figure out if this is feasible; in 2.2a4 it's implemented for rich comparisons only, so you can have a look at the code there and play with it. For most purposes, rich comparisons behave similar to new-style numeric operations, except that the __op__ <--> __rop__ mapping is different (see swapped_op). --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Fri Sep 28 23:03:18 2001 From: fdrake@acm.org (Fred L. Drake) Date: Fri, 28 Sep 2001 18:03:18 -0400 (EDT) Subject: [Python-Dev] [development doc updates] Message-ID: <20010928220318.14E2C28697@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Various small adjustments and bug fixes. Added preliminary docs for the SimpleXMLRPCServer module.
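[Returning to the __op__/__rop__ exchange above: the rule Guido proposed -- when the right operand's type is a proper subclass of the left's and overrides the reflected method, try it first -- is the behavior Python ended up with. A small sketch with made-up class names:]

```python
class Base:
    def __add__(self, other):
        return "Base.__add__"
    def __radd__(self, other):
        return "Base.__radd__"

class Derived(Base):
    def __radd__(self, other):
        return "Derived.__radd__"

# Same types: the left operand's __add__ wins as usual.
assert Base() + Base() == "Base.__add__"

# The right operand's type is a proper subclass of the left operand's
# type and overrides __radd__, so its reflected method is tried first.
assert Base() + Derived() == "Derived.__radd__"
```

This is what lets a subclass refine the behavior of mixed-type operations without the base class having to know about it.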
From thomas@xs4all.net Sat Sep 29 00:28:00 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sat, 29 Sep 2001 01:28:00 +0200 Subject: [Python-Dev] Re: channews.rdf In-Reply-To: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> References: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> Message-ID: <20010929012800.N844@xs4all.nl> On Fri, Sep 28, 2001 at 04:50:07PM -0400, Guido van Rossum wrote: > Looking at the new webstats (still at > http://www.python.org/~thomas/wwwstats.new/ -- Thomas, when are you going > to move these to a better place?) When I figure out what the PSF wants ;) I guess I'll go with Barry's idea, because that one at least got two votes (mine and his) and the others just one. So if anyone disagrees with that, speak up. I did update the stats, btw, both www and ftp, and removed the old pages; you can now access both using just ~thomas/{www,ftp}stats. Moving them to the proper spot and removing some of the info should be peanuts. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido@python.org Sat Sep 29 00:32:00 2001 From: guido@python.org (Guido van Rossum) Date: Fri, 28 Sep 2001 19:32:00 -0400 Subject: [Python-Dev] Re: channews.rdf In-Reply-To: Your message of "Sat, 29 Sep 2001 01:28:00 +0200." <20010929012800.N844@xs4all.nl> References: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> <20010929012800.N844@xs4all.nl> Message-ID: <200109282332.TAA21299@cj20424-a.reston1.va.home.com> > When I figure out what the PSF wants ;) I guess I'll go with Barry's idea, > because that one at least got two votes (mine and his) and the others just > one. So if anyone disagrees with that, speak up. Sounds good. I think you got all the help from the PSF membership that you can expect. :-( > I did update the stats, btw, both www and ftp, and removed the old pages; > you can now access both using just ~thomas/{www,ftp}stats. 
Moving them to the proper spot and removing some of the info should be peanuts. Cool. Can you tell me which sites frequently request /channews.rdf? --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas@xs4all.net Sat Sep 29 11:31:07 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sat, 29 Sep 2001 12:31:07 +0200 Subject: [Python-Dev] Re: channews.rdf In-Reply-To: <200109282332.TAA21299@cj20424-a.reston1.va.home.com> References: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> <20010929012800.N844@xs4all.nl> <200109282332.TAA21299@cj20424-a.reston1.va.home.com> Message-ID: <20010929123107.O844@xs4all.nl> On Fri, Sep 28, 2001 at 07:32:00PM -0400, Guido van Rossum wrote: > Can you tell me which sites frequently request /channews.rdf? The requester list is quite long: 2829 distinct ip-addresses since the start of August, with an average of almost 100 requests each. Here is a top 5:

25180  24.24.52.209
11087  204.192.44.242
 5439  212.125.76.33
 5341  64.89.70.32
 5152  194.236.124.44

... and then it slowly dwindles down to 1. My guess is that whatever this channel thing is (I honestly don't know), you don't need to be listed on a site somewhere for people to still access the channel, probably every time they start their browser. I think the number of people is actually quite a bit less than that 2800, due to dynamic ipaddresses; just over 850 distinct ip-addresses requested it ten times or more in those two months. Referrer data confirms that we aren't actually listed in many places anymore.
Here's the entire list:

222150 -
 11676 http://radiodiscuss.userland.com/myUserLandOnTheDesktop
   551 http://frontier.userland.com/xmlAggregator
     3 http://www.xmltree.com/dir/viewResource.html?urlID=25170
     1 http://www.python.org/~thomas/wwwstats.new/usage_200109.html#TOPURLS
     1 http://www.python.org/~thomas/wwwstats.new/usage_200108.html
     1 http://www.ourfavoritesongs.com/
     1 http://www.google.com/search?q=python%20and%20RSS&sourceid=opera&num=0
     1 http://www.google.com/search?client=googlet&q=python%20rss
     1 http://www.director-online.com/newshound.cfm
     1 http://www.bilug.linux.it/htmlheadline/newsinf.htm
     1 http://lists.eazel.com/pipermail/nautilus-list/2001-May/002845.html
     1 http://lists.eazel.com/pipermail/nautilus-list/2001-May/002843.html
     1 http://home.talkcity.com/PicassoPl/dinoch/news.htm

The topmost one is 'no referrer' and are probably all bookmarks and 'channels' and stuff. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From skip@pobox.com (Skip Montanaro) Sat Sep 29 14:58:02 2001 From: skip@pobox.com (Skip Montanaro) (Skip Montanaro) Date: Sat, 29 Sep 2001 08:58:02 -0500 Subject: [Python-Dev] Re: channews.rdf In-Reply-To: <20010929123107.O844@xs4all.nl> References: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> <20010929012800.N844@xs4all.nl> <200109282332.TAA21299@cj20424-a.reston1.va.home.com> <20010929123107.O844@xs4all.nl> Message-ID: <15285.54122.317981.314455@beluga.mojam.com>

Thomas> 222150 -
Thomas> 11676 http://radiodiscuss.userland.com/myUserLandOnTheDesktop
Thomas> 551 http://frontier.userland.com/xmlAggregator
Thomas> 3 http://www.xmltree.com/dir/viewResource.html?urlID=25170
Thomas> 1 http://www.python.org/~thomas/wwwstats.new/usage_200109.html#TOPURLS
Thomas> The topmost one is 'no referrer' and are probably all bookmarks
Thomas> and 'channels' and stuff.

My guess is that almost nobody has channews.rdf bookmarked since it's not intended for human consumption.
The large no-referrer count indicates there are indeed lots of servers grabbing the rdf file and using it to update news items on their own websites. My guess is that many of these "channel servers" probably detect that channews.rdf has not been updated recently, so they will back off on their check frequency until they notice it changing again. Skip From fredrik@pythonware.com Sat Sep 29 17:23:14 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sat, 29 Sep 2001 18:23:14 +0200 Subject: [Python-Dev] Re: channews.rdf References: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> <20010929012800.N844@xs4all.nl> <200109282332.TAA21299@cj20424-a.reston1.va.home.com> <20010929123107.O844@xs4all.nl> <15285.54122.317981.314455@beluga.mojam.com> Message-ID: <00cd01c14903$0cc45ed0$b3fa42d5@hagrid> skip wrote: > My guess is that many of these "channel servers" probably detect > that channews.rdf has not been updated recently, so they will back > off on their check frequency until they notice it changing again. Optimist. In my experience, the only thing dumber than the typical RDF robot is the person who wrote it. I've seen robots trying to download a non-existent PDF file once an hour, years after it disappeared from our site. From barry@zope.com Sat Sep 29 17:44:58 2001 From: barry@zope.com (Barry A. Warsaw) Date: Sat, 29 Sep 2001 12:44:58 -0400 Subject: [Python-Dev] Re: channews.rdf References: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> <20010929012800.N844@xs4all.nl> <200109282332.TAA21299@cj20424-a.reston1.va.home.com> <20010929123107.O844@xs4all.nl> <15285.54122.317981.314455@beluga.mojam.com> <00cd01c14903$0cc45ed0$b3fa42d5@hagrid> Message-ID: <15285.64138.852232.267969@anthem.wooz.org> >>>>> "FL" == Fredrik Lundh writes: FL> In my experience, the only thing dumber than the typical RDF FL> robot is the person who wrote it. I've seen robots trying to FL> download a non-existent PDF file once an hour, years after it FL> disappeared from our site. 
And I was just going to suggest we just wax the file and see who complains... ;/ -Barry From mal@lemburg.com Sat Sep 29 12:42:16 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 29 Sep 2001 13:42:16 +0200 Subject: [Python-Dev] Re: channews.rdf References: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> <20010929012800.N844@xs4all.nl> <200109282332.TAA21299@cj20424-a.reston1.va.home.com> Message-ID: <3BB5B398.8B2B466D@lemburg.com> Guido van Rossum wrote: > > > When I figure out what the PSF wants ;) I guess I'll go with Barry's idea, > > because that one at least got two votes (mine and his) and the others just > > one. So if anyone disagrees with that, speak up. > > Sounds good. I think you got all the help from the PSF membership > that you can expect. :-( I suppose that you can count non-votes as +0 for whichever solution you choose :-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/ From guido@python.org Sat Sep 29 19:22:55 2001 From: guido@python.org (Guido van Rossum) Date: Sat, 29 Sep 2001 14:22:55 -0400 Subject: [Python-Dev] Re: channews.rdf In-Reply-To: Your message of "Sat, 29 Sep 2001 12:44:58 EDT." <15285.64138.852232.267969@anthem.wooz.org> References: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> <20010929012800.N844@xs4all.nl> <200109282332.TAA21299@cj20424-a.reston1.va.home.com> <20010929123107.O844@xs4all.nl> <15285.54122.317981.314455@beluga.mojam.com> <00cd01c14903$0cc45ed0$b3fa42d5@hagrid> <15285.64138.852232.267969@anthem.wooz.org> Message-ID: <200109291822.OAA25398@cj20424-a.reston1.va.home.com> > And I was just going to suggest we just wax the file and see who > complains... ;/ Actually, during the brief period that python.org was down and out, I got an email from Netscape once a day complaining about the missing channews.rdf... 
--Guido van Rossum (home page: http://www.python.org/~guido/) From thomas@xs4all.net Sat Sep 29 21:34:46 2001 From: thomas@xs4all.net (Thomas Wouters) Date: Sat, 29 Sep 2001 22:34:46 +0200 Subject: [Python-Dev] Re: channews.rdf In-Reply-To: <15285.64138.852232.267969@anthem.wooz.org> References: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> <20010929012800.N844@xs4all.nl> <200109282332.TAA21299@cj20424-a.reston1.va.home.com> <20010929123107.O844@xs4all.nl> <15285.54122.317981.314455@beluga.mojam.com> <00cd01c14903$0cc45ed0$b3fa42d5@hagrid> <15285.64138.852232.267969@anthem.wooz.org> Message-ID: <20010929223446.A846@xs4all.nl> On Sat, Sep 29, 2001 at 12:44:58PM -0400, Barry A. Warsaw wrote: > >>>>> "FL" == Fredrik Lundh writes: > FL> In my experience, the only thing dumber than the typical RDF > FL> robot is the person who wrote it. I've seen robots trying to > FL> download a non-existent PDF file once an hour, years after it > FL> disappeared from our site. > And I was just going to suggest we just wax the file and see who > complains... ;/ Getting people to complain is easy. Just replace the .rdf file with hot topics from the monthly newsletter of the Bible Reader Group of Butler, KY. Of course, we could just add a flashy news item saying the channel is heavily out of date :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido@python.org Sun Sep 30 00:22:03 2001 From: guido@python.org (Guido van Rossum) Date: Sat, 29 Sep 2001 19:22:03 -0400 Subject: [Python-Dev] Re: channews.rdf In-Reply-To: Your message of "Sat, 29 Sep 2001 12:31:07 +0200." 
<20010929123107.O844@xs4all.nl> References: <200109282050.QAA12145@cj20424-a.reston1.va.home.com> <20010929012800.N844@xs4all.nl> <200109282332.TAA21299@cj20424-a.reston1.va.home.com> <20010929123107.O844@xs4all.nl> Message-ID: <200109292322.TAA25640@cj20424-a.reston1.va.home.com> I propose that we put something useful in it, like the links we have in the blue box at the top of python.org. --Guido van Rossum (home page: http://www.python.org/~guido/) From Samuele Pedroni Sat Sep 29 21:01:59 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Sat, 29 Sep 2001 22:01:59 +0200 (MET DST) Subject: [Python-Dev] Re: descr PLAN.txt (ii, oops) Message-ID: <200109292001.WAA07180@core.inf.ethz.ch> > - please, Guido advertise if you delete it, I know I can > resurrect it with the right cvs incantation but ... This is irrelevant, I can still see it in the Attic through viewcvs. Right? regards. From Samuele Pedroni Sat Sep 29 20:55:09 2001 From: Samuele Pedroni (Samuele Pedroni) Date: Sat, 29 Sep 2001 21:55:09 +0200 (MET DST) Subject: [Python-Dev] descr PLAN.txt Message-ID: <200109291955.VAA06923@core.inf.ethz.ch> Hi. Some questions, I have discovered the PLAN.txt in the CVS for the descr changes sometimes ago but more or less in trance :) Now I have developed some more awareness, some questions: - to which extent is what is listed there complete? - please, Guido advertise if you delete it, I know I can resurrect it with the right cvs incantation but ... I think and it seems it contains some useful information for when we try to port the descr changes to jython, or will all this stuff be completely documented somewhere else? Thanks. From guido@python.org Sun Sep 30 04:30:52 2001 From: guido@python.org (Guido van Rossum) Date: Sat, 29 Sep 2001 23:30:52 -0400 Subject: [Python-Dev] Re: descr PLAN.txt In-Reply-To: Your message of "Sat, 29 Sep 2001 21:55:09 +0200." 
<200109291955.VAA06923@core.inf.ethz.ch> References: <200109291955.VAA06923@core.inf.ethz.ch> Message-ID: <200109300330.XAA25843@cj20424-a.reston1.va.home.com> > Some questions, I have discovered the PLAN.txt in the CVS for the descr > changes sometimes ago but more or less in trance :) > > Now I have developed some more awareness, some questions: > - to which extent is what is listed there complete? > - please, Guido advertise if you delete it, I know I can > resurrect it with the right cvs incantation but ... > > I think and it seems it contains some useful information > for when we try to port the descr changes to jython, > or will all this stuff be completely documented somewhere else? Hopefully the PEPs will be much more complete records. PLAN.txt was just my own to-do list, indicating how far I was along the realization of the PEPs. I realize that the PEPs are currently behind, but it is in the PLAN.txt to remedy this. :-) I'm glad you are planning to copy this effort in Jython. (It should actually be simpler in Jython as it doesn't have so much of a dichotomy between types and classes as C-Python does.) If there's anything I can do to help (short of coding in Java :-) please let me know. In particular, if there's anything unclear in the PEPs or PLAN.txt or in python.org/2.2/descrintro.html or anywhere else (including the C code), I'd be happy to clarify it at your (or anybody else's) request. (Is Finn Bock still around in Jython?) --Guido van Rossum (home page: http://www.python.org/~guido/) From rnd@onego.ru Sun Sep 30 13:19:18 2001 From: rnd@onego.ru (Roman Suzi) Date: Sun, 30 Sep 2001 16:19:18 +0400 (MSD) Subject: [Python-Dev] Python 2.2a* getattr suggestion and question Message-ID: Well, now every attr access goes thru __getattr__-method, so this could cause situations which give not so clear diagnostics: ----------------------------------------------------------------- Python 2.2a3 (#1, Sep 26 2001, 22:42:46) [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2 Type "help", "copyright", "credits" or "license" for more information. HELP loaded. Readline loaded. History loaded. >>> class A: ... def aa(self): ... print self ... >>> class B(A): ... def __getattr__(self, x): ... print "getattr:", x ... >>> b = B() >>> b.aa getattr: __repr__ Traceback (most recent call last): File "", line 1, in ? TypeError: object is not callable: None ----------------------------------------------------------------- The problem above is that __repr__ can't be called because __getattr__ intercepts it's call, giving None. Could something be done with this to make it easy to trace such kinds of problems? Also, the question is what is the proper way (1 and only 1 way) to check if attribute exists inside __getattr__ method? How can it be done by one simple check like: def __getattr__(self, attr): if hasattr(self, attr): ....
Or do I need some other tricks?

Sincerely yours, Roman Suzi -- _/ Russia _/ Karelia _/ Petrozavodsk _/ rnd@onego.ru _/ _/ Sunday, September 30, 2001 _/ Powered by Linux RedHat 6.2 _/ _/ "Killer Rabbit's Motto: "Lettuce Prey."" _/

From gmcm@hypernet.com Sun Sep 30 14:08:58 2001 From: gmcm@hypernet.com (Gordon McMillan) Date: Sun, 30 Sep 2001 09:08:58 -0400 Subject: [Python-Dev] Python 2.2a* getattr suggestion and question In-Reply-To: Message-ID: <3BB6E12A.17205.4D9B7C8E@localhost>

Roman Suzi wrote:

> Well, now every attr access goes thru __getattr__-method,
> so this could cause situations which give not so clear
> diagnostics:
>
> Python 2.2a3 (#1, Sep 26 2001, 22:42:46)
> [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
> Type "help", "copyright", "credits" or "license" for more
> information. HELP loaded. Readline loaded. History loaded.
> >>> class A:
> ...     def aa(self):
> ...         print self
> ...
> >>> class B(A):
> ...     def __getattr__(self, x):
> ...         print "getattr:", x
> ...
> >>> b = B()
> >>> b.aa
> getattr: __repr__
> Traceback (most recent call last):
>   File "", line 1, in ?
> TypeError: object is not callable: None

Hmm. If that last line you typed into the interp was, in fact, "b.aa()", then there's nothing new here. The "print" asked for __repr__ and got None. You'll get something very similar in any version of Python.

If you really typed "b.aa", then something's really strange, because you didn't ask to call anything, yet B's __getattr__ was asked for "__repr__", not "aa". Since I doubt Guido has adopted VB's call-with-no-args-doesn't-need-parens, I bet you misquoted your session.
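Gordon's diagnosis — a __getattr__ body that falls off the end implicitly returns None, so the failure only surfaces later, at whatever point tries to use the attribute — is easy to reproduce. In today's Python 3 the implicit special-method lookup for __repr__ no longer goes through instance __getattr__, but the implicit-None pitfall itself is unchanged (class and attribute names below are invented for illustration):

```python
class B:
    def __getattr__(self, name):
        print("getattr:", name)
        # No return and no raise: Python hands the caller None.

b = B()
method = b.missing       # __getattr__ fires and silently yields None
try:
    method()             # the error surfaces here, far from the real bug
except TypeError as exc:
    print(exc)           # 'NoneType' object is not callable
```

The TypeError points at the call site, not at the __getattr__ that produced the bogus None — which is exactly why Roman finds the diagnostics hard to trace.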
- Gordon

From guido@python.org Sun Sep 30 14:06:39 2001 From: guido@python.org (Guido van Rossum) Date: Sun, 30 Sep 2001 09:06:39 -0400 Subject: [Python-Dev] Python 2.2a* getattr suggestion and question In-Reply-To: Your message of "Sun, 30 Sep 2001 16:19:18 +0400." References: Message-ID: <200109301306.JAA32301@cj20424-a.reston1.va.home.com>

> Well, now every attr access goes thru __getattr__-method,
> so this could cause situations which give not so clear
> diagnostics:

Please update to 2.2a4. Every new-style instance attribute access goes through __getattribute__; __getattr__ does the same as for classic classes.

It's true that if you screw with __getattribute__, you'll still break your objects; but that's hard to avoid given how things work.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From rnd@onego.ru Sun Sep 30 14:48:36 2001 From: rnd@onego.ru (Roman Suzi) Date: Sun, 30 Sep 2001 17:48:36 +0400 (MSD) Subject: [Python-Dev] Python 2.2a* getattr suggestion and question In-Reply-To: <3BB6E12A.17205.4D9B7C8E@localhost> Message-ID:

On Sun, 30 Sep 2001, Gordon McMillan wrote:

> Hmm. If that last line you typed into the interp was, in fact,
> "b.aa()", then there's nothing new here. The "print" asked for
> __repr__ and got None. You'll get something very similar in
> any version of Python.
>
> If you really typed "b.aa", then something's really strange,
> because you didn't ask to call anything, yet B's __getattr__
> was asked for "__repr__", not "aa". Since I doubt Guido has
> adopted VB's call-with-no-args-doesn't-need-parens, I bet you
> misquoted your session.

No, I have it right. It was my intention to try b.aa. Every Python object has the ability to represent itself as a string; that is what I wanted here.
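As for Roman's other question — whether __getattr__ can use hasattr(self, attr) to test for an attribute — it can't: hasattr() performs a full attribute lookup, and since __getattr__ only runs after that lookup has already failed, the check recurses. A sketch of the problem and of the usual alternative, with class names made up for illustration (in the 2.2 era the overflow surfaced as a RuntimeError; RecursionError is its modern subclass):

```python
class Bad:
    def __getattr__(self, name):
        # hasattr() does a full attribute lookup; we are only here
        # because that lookup already failed, so this recurses forever.
        if hasattr(self, name):
            return object.__getattribute__(self, name)

try:
    Bad().anything
except RecursionError:
    print("hasattr inside __getattr__ recurses")

class Good:
    def __init__(self):
        self._extra = {"answer": 42}

    def __getattr__(self, name):
        # Consult a plain dict instead, and raise AttributeError for
        # unknown names so hasattr()/getattr() keep working normally.
        try:
            return self._extra[name]
        except KeyError:
            raise AttributeError(name)

g = Good()
print(g.answer)            # 42
print(hasattr(g, "nope"))  # False
```

Raising AttributeError for unknown names is the key: it is the signal hasattr() and getattr()'s default machinery rely on.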
Sincerely yours, Roman Suzi -- _/ Russia _/ Karelia _/ Petrozavodsk _/ rnd@onego.ru _/ _/ Sunday, September 30, 2001 _/ Powered by Linux RedHat 6.2 _/ _/ "Killer Rabbit's Motto: "Lettuce Prey."" _/

From rnd@onego.ru Sun Sep 30 14:56:40 2001 From: rnd@onego.ru (Roman Suzi) Date: Sun, 30 Sep 2001 17:56:40 +0400 (MSD) Subject: [Python-Dev] Python 2.2a* getattr suggestion and question In-Reply-To: <200109301306.JAA32301@cj20424-a.reston1.va.home.com> Message-ID:

On Sun, 30 Sep 2001, Guido van Rossum wrote:

> > Well, now every attr access goes thru __getattr__-method,
> > so this could cause situations which give not so clear
> > diagnostics:
>
> Please update to 2.2a4. Every new-style instance attribute access
> goes through __getattribute__; __getattr__ does the same as for
> classic classes.

That is good to hear! I forgot to upgrade to the latest version before posting complaints, sorry.

> It's true that if you screw with __getattribute__, you'll still break
> your objects; but that's hard to avoid given how things work.

Sure. Still, I think interpreter diagnostics should point to the exact place of the trouble. At the least, __getattribute__ ought to appear somewhere in the traceback, to give a hint about where the __repr__ call was attempted from.

Sincerely yours, Roman Suzi -- _/ Russia _/ Karelia _/ Petrozavodsk _/ rnd@onego.ru _/ _/ Sunday, September 30, 2001 _/ Powered by Linux RedHat 6.2 _/ _/ "Killer Rabbit's Motto: "Lettuce Prey."" _/

From loewis@informatik.hu-berlin.de Sun Sep 30 15:53:06 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Sun, 30 Sep 2001 16:53:06 +0200 (MEST) Subject: [Python-Dev] Integrating Expat Message-ID: <200109301453.QAA21436@paros.informatik.hu-berlin.de>

[I know I've asked this before, but Fred wanted me to ask it again :-]

What do you think about an integration of Expat into Python, so that pyexpat can always be built (and always against the same Expat version)? Which version of Expat would you use?
Would you put the expat files into a separate directory, or all directly into Modules?

Here is my proposal: integrate Expat 1.95.2 for release together with Python 2.2, into an expat subdirectory of Modules (taking only the lib files of expat).

This would affect build procedures on all targets; in particular, pyexpat would not link to a shared expat DLL, but incorporate the object files.

Regards, Martin

From mal@lemburg.com Sun Sep 30 17:43:05 2001 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 30 Sep 2001 18:43:05 +0200 Subject: [Python-Dev] Integrating Expat References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> Message-ID: <3BB74B99.3B230398@lemburg.com>

Martin von Loewis wrote:
>
> [I know I've asked this before, but Fred wanted me to ask it again :-]
>
> What do you think about an integration of Expat into Python, to be
> always able to build pyexpat (and with the same version also)?
> Which version of Expat would you use? Would you put the expat files
> into a separate directory, or all into modules?
>
> Here is my proposal: Integrate Expat 1.95.2 for release together with
> Python 2.2; into an expat subdirectory of Modules (taking only the lib
> files of expat).
>
> This would affect build procedures on all targets; in particular,
> pyexpat would not link to a shared expat DLL, but incorporate the
> object files.

Are you sure that we should choose expat as the "native" XML parser? There are other candidates which would fit this role just as well (in particular, Fredrik's sgmlop looks like a nice extension, since it works not only with XML but with many other meta-languages too).

If you want a very fast validating XML parser, RXP would also be a good choice -- AFAIK, the RXP folks would allow us to ship RXP under a different license than the GPL, bound to Python.

Given the many alternatives, I am not sure whether going with expat is the right path... may be wrong though.
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Consulting & Company: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/

From loewis@informatik.hu-berlin.de Sun Sep 30 18:08:46 2001 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Sun, 30 Sep 2001 19:08:46 +0200 (MEST) Subject: [Python-Dev] Integrating Expat In-Reply-To: <3BB74B99.3B230398@lemburg.com> (mal@lemburg.com) References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> <3BB74B99.3B230398@lemburg.com> Message-ID: <200109301708.TAA22863@paros.informatik.hu-berlin.de>

> Are you sure that we should choose expat as "native" XML parser ?

It wouldn't necessarily be the only parser. To process XML, different applications have different needs. However, since the expatreader is the only SAX reader included in the standard library at the moment, guaranteeing the presence of pyexpat is an oft-requested feature. Notice that pyexpat.c is also in the standard library already.

> There are other candidates which would fit this role just
> as well (in particular, Fredrik's sgmlop looks like a nice
> extension since it not only works with XML but also many
> other meta languages).

Not many candidates would actually work as well. For example, sgmlop has a number of known bugs, and a few unknown ones. Guido once complained that it is easy to crash sgmlop with ill-formed input, and rejected inclusion of sgmlop when xmlrpclib was integrated. A known problem is that entity references are not expanded in attributes.

Beyond that, I'm not aware of many more pure-C parsers that could reasonably be integrated into the core. There are many XML parsers, but many of them are written in C++ or Java.

> If you want a very fast validating XML parser, RXP would also
> be a good choice -- AFAIK, the RXP folks would allow us to
> ship RXP under a different license than GPL which is then
> bound to Python.

RXP would indeed be a choice.
Of course, integrating it is much harder; you'd have to write the C module first, plus documentation, plus a SAX driver, plus test cases. I'm not sure how much code you could inherit from PyLTXML.

On performance: please have a look at http://www.xml.com/lpt/a/Benchmark/exec.html which suggests that expat still has a speed advantage over rxp (assuming that the measurements were done carefully, i.e. with validation disabled in RXP).

> Given the many alternatives, I am not sure whether going with
> expat is the right path... may be wrong though.

It shouldn't be the only path. pyexpat is already integrated into the Python library; all I'm suggesting is that we promise it will be available on every 2.2 Python installation. Any volunteers working on RXP integration are certainly welcome to do so; code contributions to PyXML will be welcome (provided the GPL issue gets resolved). Code contributions to the Python core would require some review, of course - it took quite some time to get pyexpat stable, and I guess any other C-integrated parser won't work from scratch, either.

Regards, Martin

From fredrik@pythonware.com Sun Sep 30 18:30:55 2001 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sun, 30 Sep 2001 19:30:55 +0200 Subject: [Python-Dev] Integrating Expat References: <200109301453.QAA21436@paros.informatik.hu-berlin.de> Message-ID: <01f001c149d5$ac48abf0$b3fa42d5@hagrid>

martin wrote:

> Here is my proposal: Integrate Expat 1.95.2 for release together with
> Python 2.2; into an expat subdirectory of Modules (taking only the lib
> files of expat).

I have a terrible fever, so my thinking is probably muddled, but I'm pretty sure I said +1 the last time this issue was raised. And I'm pretty sure I still agree with myself.
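The pyexpat module whose guaranteed availability Martin is arguing for did become a permanent part of the standard library, where it still lives as xml.parsers.expat. A minimal sketch of its callback-style (push) API — the handler and the tiny document are invented for illustration:

```python
from xml.parsers.expat import ParserCreate

seen = []
parser = ParserCreate()
# Expat is a push parser: register handlers, then feed it the document.
parser.StartElementHandler = lambda name, attrs: seen.append(name)
parser.Parse("<root><child attr='1'/></root>", True)  # True = final chunk
print(seen)  # ['root', 'child']
```

This is the low-level interface; the SAX expatreader mentioned above wraps it behind the standard xml.sax handler classes.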
From guido@python.org Sun Sep 30 19:24:14 2001 From: guido@python.org (Guido van Rossum) Date: Sun, 30 Sep 2001 14:24:14 -0400 Subject: [Python-Dev] Python 2.2a* getattr suggestion and question In-Reply-To: Your message of "Sun, 30 Sep 2001 17:48:36 +0400." References: Message-ID: <200109301824.OAA32548@cj20424-a.reston1.va.home.com>

> > If you really typed "b.aa", then something's really strange,
> > because you didn't ask to call anything, yet B's __getattr__
> > was asked for "__repr__", not "aa". Since I doubt Guido has
> > adopted VB's call-with-no-args-doesn't-need-parens, I bet you
> > misquoted your session.

For Gordon: the repr() call was implied because the retrieved value was about to be printed by the interactive interpreter.

> No, I have it right. It was my intention to try b.aa. Every Python object
> has ability to represent itself as string. That is what I wanted here.

For Roman: *most* objects have this ability, but a bug in a program may cause it to fail. It's not a guarantee.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Sun Sep 30 19:37:00 2001 From: guido@python.org (Guido van Rossum) Date: Sun, 30 Sep 2001 14:37:00 -0400 Subject: [Python-Dev] Python 2.2a* getattr suggestion and question In-Reply-To: Your message of "Sun, 30 Sep 2001 17:56:40 +0400." References: Message-ID: <200109301837.OAA32659@cj20424-a.reston1.va.home.com>

> Sure. Still, I think interpreter diagnostics should point to the
> exact place of the trouble. At least, __getattribute__ must appear
> somewhere in the traceback to give a hint about where the __repr__
> call was attempted from.

When you write a faulty __getattribute__ that returns None instead of raising AttributeError, it's not realistic to expect __getattribute__ to show up in the stack trace.

--Guido van Rossum (home page: http://www.python.org/~guido/)
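Guido's closing point can be stated as a positive rule: an overriding __getattribute__ should delegate to object.__getattribute__ (or raise AttributeError itself) rather than return None, so failures still surface at the access site. A sketch of that pattern, using an invented tracing class:

```python
class Traced:
    def __init__(self):
        self.x = 1

    def __getattribute__(self, name):
        # Delegate to the default machinery: missing names then raise
        # AttributeError right here, at the point of access, instead of
        # smuggling a None out to fail somewhere far away.
        value = object.__getattribute__(self, name)
        print("access:", name)
        return value

t = Traced()
print(t.x)  # prints "access: x", then 1
try:
    t.missing
except AttributeError:
    print("missing attribute reported at the access site")
```

Because the AttributeError originates inside the override, a traceback for a missing attribute does point through __getattribute__ — the behavior Roman was asking for.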