From gnn at neville-neil.com Mon Aug 1 02:33:37 2005
From: gnn at neville-neil.com (George V. Neville-Neil)
Date: Mon, 01 Aug 2005 09:33:37 +0900
Subject: [Python-Dev] Extension of struct to handle non byte aligned values?
Message-ID:

Hi,

I'm attempting to write a Packet class, and a few other classes for use in writing protocol conformance tests. For the most part this is going well except that I'd like to be able to pack and unpack byte strings with values that are not 8 bit based quantities. As an example, I'd like to be able to grab just a single bit from a byte string, and I'd also like to modify, for example, 13 bits. These are all reasonable quantities in an IPv4 packet.

I have looked at doing this all in Python within my own classes but I believe this is a general extension that would be good for the struct module. I could also write a new module, bitstruct, to do this but that seems silly. I did not find anything out there that handles this case, so if I missed that then please let me know.

My proposal would be for a new format character, 'z', which is followed by a position in bits from 0 to 31 so that we get either a byte, halfword, or longword based byte string back and then an optional 'r' (for run length, and because 'l' and 's' are already used) followed by a number of bits. The default length is 1 bit. I believe this is sufficient for most packet protocols I know of because, for the most part, protocols try to be 32 or 64bit aligned. This would ALWAYS unpack into an int type.

So, you would see this:

    bytestring = pack("z0r3z3r13", flags, fragment)

this would pack the flags and fragment offset in a packet at bits 0-3 and 3-13 respectively and return a 2 byte byte-string.

    header_length = unpack("z4r4", packet.bytes)

would retrieve the header length from the packet, which is from bits 4 through 8.

Thoughts?
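
For comparison, here is a minimal pure-Python sketch of the bit-field access described above, without any change to the struct module. The helper names get_bits and set_bits are hypothetical, not part of struct or of any proposal in this thread, and the sketch assumes bits are numbered from the most significant bit of the first byte, as in IPv4 header diagrams.

```python
def get_bits(data: bytes, offset: int, length: int = 1) -> int:
    """Return `length` bits starting `offset` bits into `data` (MSB first)."""
    total = len(data) * 8
    if offset < 0 or length < 1 or offset + length > total:
        raise ValueError("bit range out of bounds")
    value = int.from_bytes(data, "big")
    shift = total - offset - length      # bits to the right of the field
    return (value >> shift) & ((1 << length) - 1)


def set_bits(data: bytes, offset: int, length: int, field: int) -> bytes:
    """Return a copy of `data` with the given bit field replaced by `field`."""
    total = len(data) * 8
    if offset < 0 or length < 1 or offset + length > total:
        raise ValueError("bit range out of bounds")
    if not 0 <= field < (1 << length):
        raise ValueError("value does not fit in the field")
    shift = total - offset - length
    mask = ((1 << length) - 1) << shift
    value = int.from_bytes(data, "big")
    value = (value & ~mask) | (field << shift)   # clear the field, then fill it
    return value.to_bytes(len(data), "big")


# Hypothetical usage: a 3-bit flags field followed by a 13-bit fragment
# offset, packed into the same two bytes.
word = set_bits(bytes(2), 0, 3, 0b010)
word = set_bits(word, 3, 13, 185)
flags = get_bits(word, 0, 3)
fragment = get_bits(word, 3, 13)
```

Going through one big integer keeps the sketch short; a struct-level format character would presumably avoid that intermediate conversion.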
Thanks,
George

From greg.ewing at canterbury.ac.nz Mon Aug 1 04:57:18 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 01 Aug 2005 14:57:18 +1200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <42EB8402.10902@gmail.com>
References: <42EB576F.3060309@egenix.com> <42EB8402.10902@gmail.com>
Message-ID: <42ED8F8E.4070404@canterbury.ac.nz>

Nick Coghlan wrote:

> New Hierarchy
> =============
>
> Raisable (formerly Exception)
> +-- CriticalException (new)
>     +-- KeyboardInterrupt
>     +-- MemoryError
>     +-- SystemError
> +-- ControlFlowException (new)
>     +-- GeneratorExit
>     +-- StopIteration
>     +-- SystemExit
> +-- Exception (formerly StandardError)

If CriticalException and ControlFlowException are to be siblings of Exception rather than subclasses of it, they should be renamed so that they don't end with "Exception". Otherwise there will be a confusing mismatch between the actual inheritance hierarchy and the one suggested by the naming.

Also, I'm not entirely happy about Exception no longer being at the top, because so far the word "exception" in relation to Python has invariably meant "anything that can be raised". This terminology is even embedded in the syntax with the try-except statement. Changing this could lead to some awkward circumlocutions in the documentation and confusion in discussions.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz     +--------------------------------------+

From stephen at xemacs.org Mon Aug 1 08:54:17 2005
From: stephen at xemacs.org (Stephen J.
Turnbull)
Date: Mon, 01 Aug 2005 15:54:17 +0900
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1122607673.9665.38.camel@geddy.wooz.org> (Barry Warsaw's message of "Thu, 28 Jul 2005 23:27:53 -0400")
References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org>
Message-ID: <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "BAW" == Barry Warsaw writes:

    BAW> So are you saying that moving to svn will let us do more long
    BAW> lived branches? Yay!

Yes, but you still have to be disciplined about it. svn is not much better than cvs about detecting and ignoring spurious conflicts due to code that gets merged from branch A to branch B, then back to branch A. Unrestricted cherry-picking is still out.

--
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.

From metawilm at gmail.com Mon Aug 1 10:49:46 2005
From: metawilm at gmail.com (Willem Broekema)
Date: Mon, 1 Aug 2005 10:49:46 +0200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To:
References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com>
Message-ID:

On 7/31/05, Brett Cannon wrote:
> On 7/31/05, Willem Broekema wrote:
> > It does not seem right to me to think of KeyboardInterrupt as a means
> > to cause program halting. An interpreter could in principle recover
> > from it and resume execution of the program.
> >
>
> Same goes for MemoryError as well, but you probably don't want to
> catch that exception either.

Well, a possible scenario is that if allocation of memory fails, then the interpreter (not the Python program in it) can detect that it is not caught explicitly and print possible ways of execution, like "try the allocation again" or "abort the program", letting the user determine how to proceed. Although in this case immediately retrying the allocation will fail again, so the user has to have a way to free some objects in the meantime.

I realize it's major work to add recovery features to the CPython interpreter, so I don't think CPython will have anything like it soon and therefore also Python-the-language will not. Instead, my reason for mentioning this is to get the _concept_ of recoveries across. I think including (hypothetical, for now) recovery features in a discussion about exceptions is valuable, because that influences whether one thinks a label like "critical" for an exception is appropriate.

I'm working on an implementation of Python in Common Lisp. The CL condition system offers recovery features, so this implementation could, too. Instead of the interpreter handling the interrupt in an application-specific way, as Fred said, the interpreter could handle the interrupt by leaving the choice to the user.

Concretely, this is how KeyboardInterrupt is handled by a CL interpreter, and thus also how a Python interpreter could handle it:

    (defun foo () (loop for i from 0 do (format t "~A " i)))

    (foo)
    => 0 1 2 3
    Error: Received signal number 2 (Keyboard interrupt) [condition type: INTERRUPT-SIGNAL]

    Restart actions (select using :continue):
     0: continue computation
     1: Return to Top Level (an "abort" restart).
     2: Abort entirely from this process.

    :continue 0
    => 4 5 6 ...

> But it doesn't sound like you are arguing against putting
> KeyboardInterrupt under CriticalException, but just the explanation I
> gave, right?

I hope the above makes the way I'm thinking more clear. Like Phillip J.
Eby, I think that labeling KeyboardInterrupt a CriticalException seems wrong; it is not an error and not critical.

- Willem

From mwh at python.net Mon Aug 1 12:26:15 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 01 Aug 2005 11:26:15 +0100
Subject: [Python-Dev] Extension of struct to handle non byte aligned values?
In-Reply-To: (George V. Neville-Neil's message of "Mon, 01 Aug 2005 09:33:37 +0900")
References:
Message-ID: <2m4qaa0x88.fsf@starship.python.net>

"George V. Neville-Neil" writes:

> Hi,
>
> I'm attempting to write a Packet class, and a few other classes for
> use in writing protocol conformance tests. For the most part this is
> going well except that I'd like to be able to pack and unpack byte
> strings with values that are not 8 bit based quantities.

[...]

> Thoughts?

Well, the main thing that comes to mind is that I wouldn't regard the struct interface as being something totally wonderful and perfect. I am aware of a few attempts to make up a better interface, such as ctypes and Bob's rather similar looking ptypes from macholib:

http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py

and various silly unreleased things I've done. They all work on the basic idea of a class schema that describes the binary structure, eg:

    class Sound(Message):
        code = 0x06
        layout = [('mask', BYTE()),
                  ('vol', CDI(1, SDI(BYTE(), 1/255.0), 1.0)),
                  ('attenuation', CDI(2, SDI(BYTE(), 1/64.0), 1.0)),
                  ('entitychan', SHORT()),
                  ('soundnum', BYTE()),
                  ('origin', COORD()*3)]

You may want to do something similar (presumably the struct module or some other c stuff would be under the hood somewhere).

I don't really see a need to change CPython here, unless some general binary parsing scheme becomes best-of-breed and a candidate for stdlib inclusion.

Cheers,
mwh

PS: This is probably more comp.lang.python material.

--
  The use of COBOL cripples the mind; its teaching should, therefore,
  be regarded as a criminal offence.
      -- Edsger W.
Dijkstra, SIGPLAN Notices, Volume 17, Number 5

From mwh at python.net Mon Aug 1 12:33:26 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 01 Aug 2005 11:33:26 +0100
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: (Willem Broekema's message of "Mon, 1 Aug 2005 10:49:46 +0200")
References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com>
Message-ID: <2mzms2ymix.fsf@starship.python.net>

Willem Broekema writes:

> I realize it's major work to add recovery features to the CPython
> interpreter, so I don't think CPython will have anything like it soon
> and therefore also Python-the-language will not. Instead, my reason
> for mentioning this is to get the _concept_ of recoveries across. I
> think including (hypothetical, for now) recovery features in a
> discussion about exceptions is valuable, because that influences
> whether one thinks a label like "critical" for an exception is
> appropriate.

Heh, I talked about this at EuroPython...

http://starship.python.net/crew/mwh/recexc.pdf

The technical barriers are insignificant, really.

Cheers,
mwh

--
  Our Constitution never promised us a good or efficient government,
  just a representative one. And that's what we got.
      -- http://www.advogato.org/person/mrorganic/diary.html?start=109

From stephen at xemacs.org Mon Aug 1 15:52:06 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 01 Aug 2005 22:52:06 +0900
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: (Willem Broekema's message of "Mon, 1 Aug 2005 10:49:46 +0200")
References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com>
Message-ID: <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Willem" == Willem Broekema writes:

    Willem> I hope the above makes the way I'm thinking more clear.
    Willem> Like Phillip J.
Eby, I think that labeling
    Willem> KeyboardInterrupt a CriticalException seems wrong; it is
    Willem> not an error and not critical.

Uh, according to your example in Common LISP it is indeed an error, and if an unhandled signal whose intended interpretation is "drop the gun and put your hands on your head!" isn't critical, what is?

I didn't miss your point, but I don't see a good reason to oppose that label based on the usual definitions of the words or Common LISP usage, either.

It seems to me the relevant question is "is it likely that catching KeyboardInterrupt with 'except Exception:' will get sane behavior from a generic user-defined handler?" I think not; usually you'd like generic error recovery to _not_ bother the user, but KeyboardInterrupt sort of demands interaction with the user, no? So you're going to need a separate routine for KeyboardInterrupt, anyway. I expect that's going to be the normal case. So I would say KeyboardInterrupt should derive from CriticalException, not from Exception.

I definitely agree that implementing recovery features is a good idea, and in interactive operation (or with an option to the interpreter), to allow for such recovery in the interpreter itself. For example, the interpreter could keep a small nest egg of memory for the purpose of interacting with the user; this would be harder for a program to do. And in many quickie scripts it would be convenient if the interpreter would drop into interactive mode, not die, if the program encounters a critical exception.

But it's still a critical exception to the program written in Python, even if it's easy for the user to handle and the interpreter provides the capability to pass the buck to the user. The program has completely lost control!
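
The question about 'except Exception:' can be tried directly. The sketch below assumes a current Python 3 interpreter, where KeyboardInterrupt sits outside Exception (under BaseException); that is the hierarchy as it exists today, not the pre-PEP layout debated in this thread. The helper names run_guarded, raise_value_error and raise_interrupt are hypothetical.

```python
def run_guarded(task):
    """Run `task` under a generic handler that only catches Exception."""
    try:
        return task()
    except Exception as exc:
        # Ordinary errors are absorbed by the generic recovery path.
        return f"recovered from {type(exc).__name__}"


def raise_value_error():
    raise ValueError("bad input")


def raise_interrupt():
    # Simulates the user pressing Ctrl-C while the task is running.
    raise KeyboardInterrupt


print(run_guarded(raise_value_error))

try:
    run_guarded(raise_interrupt)
except KeyboardInterrupt:
    # The generic handler did not see the interrupt; it propagated here.
    print("interrupt reached the caller")
```

A separate 'except KeyboardInterrupt:' clause is still available for code that wants its own interrupt routine.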

--
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN

From gabriel.becedillas at corest.com Mon Aug 1 19:36:42 2005
From: gabriel.becedillas at corest.com (Gabriel Becedillas)
Date: Mon, 01 Aug 2005 14:36:42 -0300
Subject: [Python-Dev] Syscall Proxying in Python
Message-ID: <42EE5DAA.8040200@corest.com>

Hi,
We embedded Python 2.0.1 in our product a few years ago and we'd like to upgrade to Python 2.4.1. This was not a simple task, because we needed to execute syscalls on a remote host. We modified Python's source code in several places to call our own versions of some functions. For example, instead of calling fopen(...), the source code was modified to call remote_fopen(...), and the same was done with other libc functions. Socket functions were hooked too (we modified socket.c), Windows Registry functions, etc.

There are some syscalls that we don't want to execute remotely. For example when importing a module. That has to be local, and we didn't modify that. Python scripts are executed locally, but syscalls are executed on a remote host, thus giving the illusion that the script is executing on the remote host.

As I said before, we're in the process of upgrading and we don't want to make such unmaintainable changes to Python's code. We'd like to make as few changes as possible. The approach we're trying this time is far less intrusive: We'd like to link Python with special libraries that override those functions that we want to execute remotely. This way the only code that has to be changed is the one that has to be executed locally.

I wrote this mail to ask you guys for any useful advice in making these changes to Python's core. The only places I figure out right now that have to execute locally all the time are import.c and pythonrun.c, but I'm not sure at all. Maybe you guys figure out another way to achieve what we need.

Thanks in advance.
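
One alternative to relinking libc functions is to hook at the Python level. The following is only a rough sketch of that idea, not the approach described above: it temporarily swaps builtins.open for a proxy so that script-level file access is redirected, while the import machinery keeps using its own local I/O. FakeRemoteHost and proxied_open are hypothetical names, and the "remote host" is an in-memory dict standing in for whatever transport a real product would use.

```python
import builtins
import io
from contextlib import contextmanager


class FakeRemoteHost:
    """Stand-in for a remote endpoint; a real one would forward the call."""

    def __init__(self, files):
        self.files = dict(files)

    def open(self, path, mode="r", *args, **kwargs):
        if mode != "r":
            raise NotImplementedError("this sketch only supports text reads")
        try:
            return io.StringIO(self.files[path])
        except KeyError:
            raise FileNotFoundError(path) from None


@contextmanager
def proxied_open(host):
    """Route open() calls to `host` for the duration of the with-block."""
    original = builtins.open
    builtins.open = host.open
    try:
        yield
    finally:
        builtins.open = original   # always restore the local open()


host = FakeRemoteHost({"/etc/motd": "hello from the remote side\n"})
with proxied_open(host):
    with open("/etc/motd") as handle:
        text = handle.read()
```

The same pattern would have to be repeated for the os and socket entry points, and code that bypasses these Python-level names (C extensions, for example) would not be redirected.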

--
Gabriel Becedillas
Developer
CORE SECURITY TECHNOLOGIES

Florida 141 - 2º cuerpo - 7º piso
C1005AAC Buenos Aires - Argentina
Tel/Fax: (54 11) 5032-CORE (2673)
http://www.corest.com

From abo at minkirri.apana.org.au Mon Aug 1 19:52:03 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 01 Aug 2005 10:52:03 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <1122918723.9680.33.camel@warna.corp.google.com>

On Sun, 2005-07-31 at 23:54, Stephen J. Turnbull wrote:
> >>>>> "BAW" == Barry Warsaw writes:
>
> BAW> So are you saying that moving to svn will let us do more long
> BAW> lived branches? Yay!
>
> Yes, but you still have to be disciplined about it. svn is not much
> better than cvs about detecting and ignoring spurious conflicts due to
> code that gets merged from branch A to branch B, then back to branch
> A. Unrestricted cherry-picking is still out.

Yeah. IMHO the saddest thing about SVN is it doesn't do branch/merge properly. All the other cool stuff like renames etc is kinda undone by that. For a definition of properly, see;

http://prcs.sourceforge.net/merge.html

This is why I don't bother migrating any existing CVS projects to SVN; the benefits don't yet outweigh the pain of migrating. For new projects sure, SVN is a better choice than CVS.

--
Donovan Baarda

From abo at minkirri.apana.org.au Mon Aug 1 20:08:51 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 01 Aug 2005 11:08:51 -0700
Subject: [Python-Dev] Syscall Proxying in Python
In-Reply-To: <42EE5DAA.8040200@corest.com>
References: <42EE5DAA.8040200@corest.com>
Message-ID: <1122919731.9688.43.camel@warna.corp.google.com>

On Mon, 2005-08-01 at 10:36, Gabriel Becedillas wrote:
> Hi,
> We embedded Python 2.0.1 in our product a few years ago and we'd like to
> upgrade to Python 2.4.1. This was not a simple task, because we needed
> to execute syscalls on a remote host. We modified Python's source code
> in several places to call our own versions of some functions. For
> example, instead of calling fopen(...), the source code was modified to
> call remote_fopen(...), and the same was done with other libc functions.
> Socket functions were hooked too (we modified socket.c), Windows
> Registry functions, etc.

Wow... you guys sure did it the hard way. If you had done it at the Python level, you would have had a much easier time of both implementing and updating it.

As an example, have a look at my osVFS stuff. This is a replacement for the os module and open() that tricks Python into using a virtual file system;

http://minkirri.apana.org.au/~abo/projects/osVFS

--
Donovan Baarda

From metawilm at gmail.com Mon Aug 1 22:53:19 2005
From: metawilm at gmail.com (Willem Broekema)
Date: Mon, 1 Aug 2005 22:53:19 +0200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID:

On 8/1/05, Stephen J.
Turnbull wrote:
> Uh, according to your example in Common LISP it is indeed an error,

I think you are referring to the first word of this line:

    Error: Received signal number 2 (Keyboard interrupt) [condition type: INTERRUPT-SIGNAL]

Well, that refers to the fact that it was raised with (error ...). It says nothing about the type of a Keyboard interrupt condition. (The functions 'error' vs 'signal' mark the distinction between raising conditions that must be handled, otherwise you'll end up in the debugger, and conditions that when not handled are silently ignored.)

The CL ANSI standard does not define what kind of condition a Keyboard interrupt is, so the implementations have to make that decision. Although this implementation (Allegro CL) has currently defined it as a subclass of 'error', I'm told it should have been a 'serious-condition' instead ('error' is a subclass of 'serious-condition', which is a subclass of 'condition'), precisely because forms like ignore-errors, like a bare except in Python, will catch it right now when they shouldn't. I assume most of the other Lisp implementations have already defined it as serious-condition.

So, in short, Keyboard interrupts in Lisp are a serious-condition, not an error.

(And what is labeled CriticalException in this discussion has its counterpart in Lisp's serious-condition.)

> and if an unhandled signal whose intended interpretation is "drop the
> gun and put your hands on your head!" isn't critical, what is?

Eh, are you serious?

> I didn't miss your point, but I don't see a good reason to oppose that
> label based on the usual definitions of the words or Common LISP
> usage, either.

Well, I'm not opposed to KeyboardInterrupt being in a class that's not a subclass of 'Exception', when the latter is the class used in a bare 'except'.

But when CriticalException, despite its name, is not a subclass of Exception, that is a bit strange.
I'd prefer the 'condition' and 'error' terminology, and to label a keyboard interrupt a condition, not any kind of exception or error.

> It seems to me the relevant question is "is it likely that catching
> KeyboardInterrupt with 'except Exception:' will get sane behavior from
> a generic user-defined handler?"

I agree with you that it should not be caught in a bare 'except' (or an 'except Exception', when that is equivalent).

- Willem

From tdelaney at avaya.com Tue Aug 2 01:12:07 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 2 Aug 2005 09:12:07 +1000
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F0742A5@au3010avexu1.global.avaya.com>

Nick Coghlan wrote:

> +-- Exception (formerly StandardError)
>     +-- AttributeError
>     +-- NameError
>         +-- UnboundLocalError
>     +-- RuntimeError
>         +-- NotImplementedError

Time to wade in ...

I've actually been wondering if NotImplementedError should actually be a subclass of AttributeError.

Everywhere I can think of where I would want to catch NotImplementedError, I would also want to catch AttributeError. My main question is whether I would want the reverse to also be true - anywhere I want to catch AttributeError, I would want to catch NotImplementedError.

Perhaps instead it should be the other way around - AttributeError inherits from NotImplementedError. This does make some kind of sense - the attribute hasn't been implemented.

Both seem to have some advantages, but neither really feels right to me. Thoughts?

Anyway, I came to this via another thing - NotImplementedError doesn't play very well with super(). In many ways it's worse to call super().method() that raises NotImplementedError than super().method() where the attribute doesn't exist.
In both cases, the class calling super() needs to know whether or not it's at the end of the MRO for that method - possible to find out in most cases that would raise AttributeError, but impossible for a method that raises NotImplementedError.

The only way I can think of to deal with this is to do a try: except (AttributeError, NotImplementedError) around every super() attribute call. This seems bad.

Tim Delaney

From bcannon at gmail.com Tue Aug 2 02:03:54 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 1 Aug 2005 17:03:54 -0700
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742A5@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742A5@au3010avexu1.global.avaya.com>
Message-ID:

On 8/1/05, Delaney, Timothy (Tim) wrote:
> Nick Coghlan wrote:
>
> > +-- Exception (formerly StandardError)
> >     +-- AttributeError
> >     +-- NameError
> >         +-- UnboundLocalError
> >     +-- RuntimeError
> >         +-- NotImplementedError
>
> Time to wade in ...
>
> I've actually been wondering if NotImplementedError should actually be a
> subclass of AttributeError.
>
> Everywhere I can think of where I would want to catch
> NotImplementedError, I would also want to catch AttributeError. My main
> question is whether I would want the reverse to also be true - anywhere
> I want to catch AttributeError, I would want to catch
> NotImplementedError.
>
> Perhaps instead it should be the other way around - AttributeError
> inherits from NotImplementedError. This does make some kind of sense -
> the attribute hasn't been implemented.
>
> Both seem to have some advantages, but neither really feels right to me.
> Thoughts?

The problem with subclassing NotImplementedError is you need to remember it is used to signal that a magic method does not work for a specific type and thus should try the __r*__ version. That is not a case, I feel, that has anything to do with attributes but implementation support.
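
For reference, a minimal sketch of how the __r*__ fallback looks in practice. Note that, in current Python at least, the fallback is triggered by returning the NotImplemented singleton from the magic method; NotImplementedError is a separate exception class that plays no part in operator dispatch. The Metres class is hypothetical and exists only for illustration.

```python
class Metres:
    """Toy numeric type used only to show binary-operator fallback."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        # Not an exception: tells the interpreter to try other.__radd__.
        return NotImplemented

    def __radd__(self, other):
        if other == 0:
            # Lets sum() start from its default integer 0.
            return Metres(self.value)
        return NotImplemented


total = sum([Metres(1), Metres(2)])

try:
    Metres(1) + "a string"
except TypeError:
    # Raised by the interpreter once both operands returned NotImplemented.
    outcome = "TypeError"
else:
    outcome = "no error"
```

Here total.value ends up as 3 and the unsupported addition surfaces as a TypeError rather than a NotImplementedError.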

I am not going to subclass NotImplementedError unless there is a huge push for it in a very specific direction.

-Brett (who is waiting on a PEP number...)

From anthony at interlink.com.au Tue Aug 2 02:21:16 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon, 1 Aug 2005 17:21:16 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42E93940.6080708@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>
Message-ID: <200508011721.18567.anthony@interlink.com.au>

On Thursday 28 July 2005 13:00, Martin v. Löwis wrote:
> I'd like to see the Python source be stored in Subversion instead
> of CVS,

I'm +1 on this, assuming we use the fsfs backend, and not the berkeley DB one. I'm -1 if we're using the bdb backend (I've had nothing but pain from it).

> CVS has a number of limitations that have been eliminated by
> Subversion. For the development of Python, the most notable improvements
> are:
> - ability to rename files and directories, and to remove directories,
>   while keeping the history of these files.
> - support for change sets (sets of correlated changes to multiple
>   files) through global revision numbers.
> - support for offline diffs, which is useful when creating patches.

- tagging for releases will no longer cause the release manager to experience fits of burning rage (personal record was something like 1h45m for 'cvs tag' to finish, from memory).

My only concern is that we have sufficient volunteers to manage the system. I'm happy to be one of these, but that's assuming we have other people also volunteering. . .

Anthony
--
Anthony Baxter
It's never too late to have a happy childhood.

From stephen at xemacs.org Tue Aug 2 04:07:50 2005
From: stephen at xemacs.org (Stephen J.
Turnbull)
Date: Tue, 02 Aug 2005 11:07:50 +0900
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1122918723.9680.33.camel@warna.corp.google.com> (Donovan Baarda's message of "Mon, 01 Aug 2005 10:52:03 -0700")
References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com>
Message-ID: <8764up147d.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Donovan" == Donovan Baarda writes:

    Donovan> Yeah. IMHO the saddest thing about SVN is it doesn't do
    Donovan> branch/merge properly. All the other cool stuff like
    Donovan> renames etc is kinda undone by that. [...] This is why
    Donovan> I don't bother migrating any existing CVS projects to
    Donovan> SVN; the benefits don't yet outweigh the pain of
    Donovan> migrating.

FWIW, XEmacs just had this discussion, and we basically came to the conclusion that for a multi-developer project it's _definitely_ worth the effort if it can be done by cvs2svn (which for us it probably can't, due to some black magic we did on the CVS repository a few years ago :-( ).

For the record, I was opposed for exactly the reason you give, but changed my mind. The point is that with several developers there's almost surely someone enthusiastic enough about svn to bear the burden of fooling with the script for a couple of hours to see if it works, a fascist policy about migrating account names makes that almost trivial, and after that it's all gravy: the administration does not look any worse, the security issues are similar, and the change is likely to incite only a few people to press for account name changes after the move.

--
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.

From greg.ewing at canterbury.ac.nz Tue Aug 2 04:57:40 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 02 Aug 2005 14:57:40 +1200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To:
References: <2773CAC687FD5F4689F526998C7E4E5F0742A5@au3010avexu1.global.avaya.com>
Message-ID: <42EEE124.7070406@canterbury.ac.nz>

Brett Cannon wrote:

> The problem with subclassing NotImplementedError is you need to
> remember it is used to signal that a magic method does not work for a
> specific type and thus should try the __r*__ version.

No, that's done by *returning* NotImplemented, not by raising an exception at all.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz     +--------------------------------------+

From stephen at xemacs.org Tue Aug 2 05:25:48 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 02 Aug 2005 12:25:48 +0900
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: (Willem Broekema's message of "Mon, 1 Aug 2005 22:53:19 +0200")
References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <871x5d10lf.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Willem" == Willem Broekema writes:

    Willem> So, in short, Keyboard interrupts in Lisp are a
    Willem> serious-condition, not an error.

    Willem> (And what is labeled CriticalException in this discussion
    Willem> has its counterpart in Lisp's serious-condition.)

I don't see it that way.
Rather, "Raisable" is the closest equivalent to "serious-condition", and "CriticalException" is an intermediate class that has no counterpart in Lisp usage.

    >> and if an unhandled signal whose intended interpretation is
    >> "drop the gun and put your hands on your head!" isn't critical,
    >> what is?

    Willem> Eh, are you serious?

Yes. Unhandled, KeyboardInterrupt means that the user has forcibly taken control away from the program without giving it a chance to preserve state, finish responding to (realtime) external conditions, or even activate vacation(1), and the program is entirely at the mercy of the user. Usually, the program then proceeds to die without dignity. If it's a realtime application, killing it is probably the only merciful thing to do.

If you were the program, wouldn't you consider that critical?

    Willem> But when CriticalException, despite its name, is not a
    Willem> subclass of Exception, that is a bit strange.

Granted. It doesn't bother me, but since it bothers both you and Phillip Eby, I concede the point; we should find a better name (or not bother with such a class, see below).

    Willem> I'd prefer the 'condition' and 'error' terminology, and to
    Willem> label a keyboard interrupt a condition, not any kind of
    Willem> exception or error.

Now, that does bother me. Anything we will not permit a program to ignore with a bare "except: pass" if it so chooses had better be more serious than merely a "condition". Also, to me a "condition" is something that I poll for, it does not interrupt me. To me, a condition (even a serious one) is precisely the kind of thing that I should be able to ignore with a bare except!

Your description of the CL hierarchy makes me wonder if there's any benefit to having a class between Raisable and KeyboardInterrupt. Unlike SystemShutdown or PowerFailure, KeyboardInterrupt does imply presence of a user demanding attention; I suppose that warrants special treatment.
On the other hand, I don't see a need for a class whose members share only the property that they are not catchable with a bare except, leading to

    Raisable -+- Exception
              +- KeyboardInterrupt
              +- SystemShutdown
              +- PowerFailure
              +- (etc)

or even

    Exception -+- CatchableException
               +- KeyboardInterrupt
               +- SystemShutdown
               +- PowerFailure
               +- (etc)

The latter is my mental model, and would work well with bare excepts. It also would encourage the programmer to think about whether an Exception should be catchable or is a special case, but I don't think that's really helpful except for Python developers, who presumably would be aware of the issues.

The former would be a compromise to allow "except Exception" to be a natural idiom, which I prefer to bare excepts on stylistic grounds. On balance, that's what I advocate.

--
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.

From gnn at neville-neil.com Tue Aug 2 04:08:11 2005
From: gnn at neville-neil.com (George V. Neville-Neil)
Date: Tue, 02 Aug 2005 11:08:11 +0900
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1122918723.9680.33.camel@warna.corp.google.com>
References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com>
Message-ID:

At Mon, 01 Aug 2005 10:52:03 -0700, Donovan Baarda wrote:
>
> On Sun, 2005-07-31 at 23:54, Stephen J. Turnbull wrote:
> > >>>>> "BAW" == Barry Warsaw writes:
> >
> > BAW> So are you saying that moving to svn will let us do more long
> > BAW> lived branches? Yay!
> > > > Yes, but you still have to be disciplined about it. svn is not much > > better than cvs about detecting and ignoring spurious conflicts due to > > code that gets merged from branch A to branch B, then back to branch > > A. Unrestricted cherry-picking is still out. > > Yeah. IMHO the sadest thing about SVN is it doesn't do branch/merge > properly. All the other cool stuff like renames etc is kinda undone by > that. For a definition of properly, see; > > http://prcs.sourceforge.net/merge.html > > This is why I don't bother migrating any existing CVS projects to SVN; > the benefits don't yet outweigh the pain of migrating. For new projects > sure, SVN is a better choice than CVS. Since Python is Open Source are you looking at Per Force which you can use for free and seems to be a happy medium between something like CVS and something horrific like Clear Case? Later, George From pje at telecommunity.com Tue Aug 2 06:31:42 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 02 Aug 2005 00:31:42 -0400 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: <871x5d10lf.fsf@tleepslib.sk.tsukuba.ac.jp> References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> At 12:25 PM 8/2/2005 +0900, Stephen J. Turnbull wrote: > >>>>> "Willem" == Willem Broekema writes: > > Willem> So, in short, Keyboard interrupts in Lisp are a > Willem> serious-condition, not an error. > > Willem> (And what is labeled CriticalException in this discussion, > Willem> has in serious-condition Lisp's counterpart.) > >I don't see it that way. Rather, "Raisable" is the closest equivalent >to "serious-condition", I don't think that Lisp's idea of an exception hierarchy has much bearing here. 
> >> and if an unhandled signal whose intended interpretation is > >> "drop the gun and put your hands on your head!" isn't critical, > >> what is? > > Willem> Eh, are you serious? > >Yes. Unhandled, KeyboardInterrupt means that the user has forcibly >taken control away from the program without giving it a chance to >preserve state, finish responding to (realtime) external conditions, >or even activate vacation(1), and the program is entirely at the mercy >of the user. Usually, the program then proceeds to die without >dignity. If it's a realtime application, killing it is probably the >only merciful thing to do. > >If you were the program, wouldn't you consider that critical? You just said, "Unhandled, KeyboardInterrupt means..." If the program doesn't *want* to handle KeyboardInterrupt, then it obviously *isn't* critical, because it doesn't care. Conversely, if it *does* handle KeyboardInterrupt, then once again, it's not critical by your definition. So, clearly, KeyboardInterrupt is thus *not* critical, and doesn't belong in the CriticalException hierarchy. Note, by the way, that Python programs can disable a KeyboardInterrupt from ever occurring in the first place, whereas none of the other CriticalException classes can be "disabled" because they're actually *error* conditions, while KeyboardInterrupt is just an asynchronous notification - for control flow purposes. Ergo, it's a control flow exception. (Similarly, a Python program can avoid raising any of the other control flow errors; they are by and large optional features.) > Willem> I'd prefer the 'condition' and 'error' terminology, and to > Willem> label a keyboard interrupt a condition, not any kind of > Willem> exception or error. > >Now, that does bother me. Anything we will not permit a program >to ignore with a bare "except: pass" if it so chooses had better be >more serious than merely a "condition". Also, to me a "condition" is >something that I poll for, it does not interrupt me. 
To me, a >condition (even a serious one) is precisely the kind of thing that I >should be able to ignore with a bare except! On the contrary, it is control-flow exceptions that bare except clauses are most harmful to: StopIteration, SystemExit, and... you guessed it... KeyboardInterrupt. An exception that's being used for control flow is precisely the kind of thing you don't want anything but an explicit except clause to catch. Whether critical errors should also pass bare except clauses is a distinct issue, one which KeyboardInterrupt really doesn't enter into. If you think that a KeyboardInterrupt is an error, then it's an indication that Python's documentation and the current exception class hierarchy has failed to educate you sufficiently, and that we *really* need to add a class like ControlFlowException into the hierarchy to help make sure that other people don't end up sharing your misunderstanding. ;-) From stephen at xemacs.org Tue Aug 2 09:13:10 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 02 Aug 2005 16:13:10 +0900 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> (Phillip J. Eby's message of "Tue, 02 Aug 2005 00:31:42 -0400") References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> Message-ID: <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Phillip" == Phillip J Eby writes: Phillip> You just said, "Unhandled, KeyboardInterrupt means..." Phillip> If the program doesn't *want* to handle Phillip> KeyboardInterrupt, then it obviously *isn't* critical, Phillip> because it doesn't care. Conversely, if it *does* handle Phillip> KeyboardInterrupt, then once again, it's not critical by Phillip> your definition. That's not my definition. 
By that argument, no condition that can be handled can be critical. By my definition, the condition only needs to prevent the program from continuing normally when it arises. KeyboardInterrupt is a convention that is used to tell a program that continuing normally is not acceptable behavior, and therefore "critical" by my definition. Under either definition, we'll still need to do something special with MemoryError, KeyboardInterrupt, et amicae, and they still shouldn't be caught by a generic "except Exception". We agree on that, don't we? Phillip> Note, by the way, that Python programs can disable a Phillip> KeyboardInterrupt [...]. Ergo, it's a control flow Phillip> exception. Sure, in some sense---but not in the Python language AFAIK. Which control constructs in the Python language define semantics for continuation after KeyboardInterrupt occurs? Anything that can stop a program but the language doesn't define semantics for continuation is critical and exceptional by my definition. Willem> I'd prefer the 'condition' and 'error' terminology, and to Willem> label a keyboard interrupt a condition, not any kind of Willem> exception or error. >> Now, that does bother me. [...] Phillip> On the contrary, it is control-flow exceptions that bare Phillip> except clauses are most harmful to: StopIteration, Phillip> SystemExit, and... you guessed it... KeyboardInterrupt. That is a Python semantics issue, but as far as I can see there's unanimity on it. I and (AFAICS) Willem were discussing the connotations of the _names_ at this point, and whether they were suggestive of the semantics we (all!) seem to agree on. I do not find the word "condition" suggestive of the "things 'bare except' should not catch" semantics. I believe enough others will agree with me that the word "condition", even "serious condition", should be avoided. 
Phillip> An exception that's being used for control flow is Phillip> precisely the kind of thing you don't want anything but Phillip> an explicit except clause to catch. Which is exactly the conclusion I reached: [It] makes me wonder if there's any benefit to having a class [ie, CriticalException] between Raisable and KeyboardInterrupt. ...I don't see a need for a class whose members share only the property that they are not catchable with a bare except.... Now, somebody proposed: Raisable -+- Exception +- ... +- ControlFlowException -+- StopIteration +- KeyboardInterrupt As I wrote above, I see no use for that; I think that's what you're saying too, right? AIUI, you want Raisable -+- Exception +- ... +- StopIteration +- KeyboardInterrupt so that only the appropriate control construct or an explicit except can catch a control flow exception. At least, you've convinced me that "critical exception" is not a concept that should be implemented in the Python language specification. Rather, (for those who think as I do, if there are others) "critical exception" would be an intuitive guide to a subclass of exceptions that shouldn't be caught by a bare except (or a handler for any superclass except Raisable, for that matter). By the same token, "control flow exception" is a pedagogical concept, not something that should be reified in a ControlFlowException class, right? Phillip> If you think that a KeyboardInterrupt is an error, I have used the word "error" only in quoting Willem, and that's quite deliberate. I don't think that a condition need be an error to be "critical". -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
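A minimal sketch of the flat layout discussed in the message above, in which control flow exceptions are siblings of the Exception branch rather than children of it. All three class names are hypothetical stand-ins invented for this illustration (they are not built-ins and not anyone's proposed implementation); the point is only that a generic handler for the ordinary branch leaves the sibling untouched.

```python
# Hypothetical stand-ins mirroring the layout
#   Raisable -+- Exception
#             +- StopIteration
#             +- KeyboardInterrupt
# None of these classes exist in Python under these names.
class Raisable(BaseException):
    """Stand-in for the proposed root of everything that can be raised."""


class OrdinaryException(Raisable):
    """Stand-in for the proposed 'Exception' branch."""


class Interrupt(Raisable):
    """Stand-in for KeyboardInterrupt as a sibling of the Exception branch."""


def guarded(action):
    """Run action with a generic handler for the ordinary branch only."""
    try:
        action()
    except OrdinaryException as exc:
        return "handled " + type(exc).__name__
    return "ok"


def raise_ordinary():
    raise OrdinaryException("application fault")


def raise_interrupt():
    raise Interrupt()


def interrupt_escapes():
    """Return True if the generic handler lets Interrupt propagate."""
    try:
        guarded(raise_interrupt)
    except Interrupt:
        return True
    return False
```

Under this layout only an explicit `except Interrupt` (or a handler for the root class) sees the interrupt, which is the behaviour both sides of the exchange say they want.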
From martin at v.loewis.de Tue Aug 2 09:58:12 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 02 Aug 2005 09:58:12 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> Message-ID: <42EF2794.1000209@v.loewis.de> George V. Neville-Neil wrote: > Since Python is Open Source are you looking at Per Force which you can > use for free and seems to be a happy medium between something like CVS > and something horrific like Clear Case? No. The PEP is only about Subversion. Why should we be looking at Per Force? Only because Python is Open Source? I think anything but Subversion is ruled out because: - there is no offer to host that anywhere (for subversion, there is already svn.python.org) - there is no support for converting a CVS repository (for subversion, there is cvs2svn) Regards, Martin From mal at egenix.com Tue Aug 2 11:56:59 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 02 Aug 2005 11:56:59 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42EB5AD1.60703@v.loewis.de> References: <42E93940.6080708@v.loewis.de> <42EA061A.9040609@egenix.com> <42EA98CC.4060003@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> Message-ID: <42EF436B.3050308@egenix.com> Martin v. Löwis wrote: > M.-A.
Lemburg wrote: > > The PSF does have a reasonable budget, so why not use it to > > maintain the infrastructure needed for Python development and > > let a company do the administration of the needed servers and > > the importing of the CVS and tracker items into their > > systems ? > > In principle, this might be a good idea. In practice, it falls > short of details: which company, what precisely are their procedures, > etc. It's not always the case that giving money to somebody really > gives you back the value you expect. True, but if we never ask, we'll never know :-) My question was: Would asking a professional hosting company be a reasonable approach ? From the answers, I take it that there's not much trust in these offers, so I guess there's not much desire to put PSF money into this. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 02 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ncoghlan at gmail.com Tue Aug 2 12:00:42 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 02 Aug 2005 20:00:42 +1000 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp> References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <42EF444A.4040108@gmail.com> Stephen J. Turnbull wrote: > Now, somebody proposed: > > Raisable -+- Exception > +- ...
> +- ControlFlowException -+- StopIteration > +- KeyboardInterrupt > > As I wrote above, I see no use for that The use for it is : try: # do stuff except ControlFlowException: raise except Raisable: # handle anything else Sure, you could write it as: try: # do stuff except (CriticalException, Exception, Warning): # handle anything else But the former structure better reflects the programmers intent (handle everything except control flow exceptions). It's a fact that Python uses exceptions for control flow - KeyboardInterrupt [1], StopIteration, SystemExit (and soon to be GeneratorExit as well). Grouping them under a common parent allows them to be dealt with as a group, rather than their names being spelt out explicitly. Actually having this in the exception hierarchy is beneficial from a pedagogical point of view as well - the hierarchy is practically the first thing you encounter when you run "help ('exceptions')" at the interactive prompt. I have a Python 2.5 candidate hierarchy below, which uses dual inheritance to avoid breaking backward compatibility - any existing except clauses will catch all of the exceptions they used to catch. The only new inheritance introduced is to new exceptions, also avoiding backward compatibility problems, as any existing except clauses will let by all of the exceptions they used to let by. There are no removals, but the deprecation process is started in order to change the names of ReferenceError and RuntimeWarning to WeakReferenceError and SemanticsWarning. With this hierarchy, the recommended parent class for application errors becomes Error, and "except Error:" is preferred to any of "except:", "except Exception:" and "except StandardError:" (although these three continue to catch everything they used to catch). 
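A minimal sketch of the two ideas in the paragraphs above: re-raising control flow exceptions ahead of a generic handler, and using dual inheritance so that handlers written against the old parent keep working. `LegacyStandardError` and `Interrupt` are hypothetical stand-ins for StandardError and KeyboardInterrupt, and `ControlFlowException` is defined locally because no such built-in exists; this illustrates the pattern and is not the proposed implementation.

```python
# Hypothetical stand-ins; none of these are built-in classes.
class ControlFlowException(Exception):
    """Proposed parent for exceptions used purely for control flow."""


class LegacyStandardError(Exception):
    """Stand-in for the old StandardError parent."""


class Interrupt(ControlFlowException, LegacyStandardError):
    """Dual inheritance: visible to handlers for either parent."""


def old_style_handler(action):
    """Existing code that catches the legacy parent still sees Interrupt."""
    try:
        action()
    except LegacyStandardError:
        return "caught by legacy handler"
    return "ok"


def new_style_handler(action):
    """Let control flow exceptions through, handle everything else."""
    try:
        action()
    except ControlFlowException:
        raise
    except Exception as exc:
        return "handled " + type(exc).__name__
    return "ok"


def raise_interrupt():
    raise Interrupt()


def raise_value_error():
    raise ValueError("bad input")


def propagates(handler, action):
    """Return True if a control flow exception escapes the handler."""
    try:
        handler(action)
    except ControlFlowException:
        return True
    return False
```

The order of the except clauses matters in `new_style_handler`: because `Interrupt` is also an `Exception` here, the re-raising clause has to come first.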
The recommended workaround for libraries raising errors which still inherit directly from Exception is: try: # Use library except (ControlFlowException, CriticalError): raise except Exception: # Do stuff (Remove the 'Exception' part if the library is so outdated that it still raises string exceptions) Applications which use exceptions to control the flow of execution rather than to indicate an error (e.g. breaking out of multiple nested loops) are free to use ControlFlowException directly, or else define their own subclasses of ControlFlowException. This hierarchy achieves my main goal for the exception reorganisation, which is to make it easy for scripts and applications to avoid inadvertently swallowing the control flow exceptions and critical errors, while still being able to provide generic error handlers for application faults. (Hmm, the pre-PEP doesn't include that as a goal in the 'Philosophy' section. . .) Python 2.4 Compatible Improved Exception Hierarchy v 0.1 ======================================================== Exception +-- ControlFlowException (new) +-- GeneratorExit (new) +-- StopIteration +-- SystemExit +-- KeyboardInterrupt (dual-inheritance new) +-- StandardError +-- KeyboardInterrupt (dual-inheritance new) +-- CriticalError (new) +-- MemoryError +-- SystemError +-- Error (new) +-- AssertionError +-- AttributeError +-- EOFError +-- ImportError +-- TypeError +-- ReferenceError (deprecated), WeakReferenceError (new alias) +-- ArithmeticError +-- FloatingPointError +-- DivideByZeroError +-- OverflowError +-- EnvironmentError +-- OSError +-- WindowsError +-- IOError +-- LookupError +-- IndexError +-- KeyError +-- NameError +-- UnboundLocalError +-- RuntimeError +-- NotImplementedError +-- SyntaxError +-- IndentationError +-- TabError +-- ValueError +-- UnicodeError +-- UnicodeDecodeError +-- UnicodeEncodeError +-- UnicodeTranslateError +-- Warning +-- DeprecationWarning +-- FutureWarning +-- PendingDeprecationWarning +-- RuntimeWarning 
(deprecated), SemanticsWarning (new alias) +-- SyntaxWarning +-- UserWarning Cheers, Nick. [1] PJE has convinced me that I was right in thinking that KeyboardInterrupt was a better fit under ControlFlowExceptions than it was under CriticalError. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From mal at egenix.com Tue Aug 2 12:07:57 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 02 Aug 2005 12:07:57 +0200 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: <42EF444A.4040108@gmail.com> References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp> <42EF444A.4040108@gmail.com> Message-ID: <42EF45FD.5090800@egenix.com> Nick Coghlan wrote: > I have a Python 2.5 candidate hierarchy below, which uses dual inheritance to > avoid breaking backward compatibility - any existing except clauses will catch > all of the exceptions they used to catch. The only new inheritance introduced > is to new exceptions, also avoiding backward compatibility problems, as any > existing except clauses will let by all of the exceptions they used to let by. > There are no removals, but the deprecation process is started in order to > change the names of ReferenceError and RuntimeWarning to WeakReferenceError > and SemanticsWarning. +1. I like this approach of using multiple inheritance to solve the b/w compatibility problem. -- Marc-Andre Lemburg eGenix.com
From mwh at python.net Tue Aug 2 12:07:59 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 02 Aug 2005 11:07:59 +0100 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <1122918723.9680.33.camel@warna.corp.google.com> (Donovan Baarda's message of "Mon, 01 Aug 2005 10:52:03 -0700") References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> Message-ID: <2mslxszm68.fsf@starship.python.net> Donovan Baarda writes: > This is why I don't bother migrating any existing CVS projects to SVN; > the benefits don't yet outweigh the pain of migrating. I think they do. I was on dialup for a while, and would have _loved_ Python to be using SVN then -- and given how long diffs can take even over my broadband connection... Cheers, mwh PS: Wot, no one's suggested git yet? :) -- C++ is a siren song. It *looks* like a HLL in which you ought to be able to write an application, but it really isn't.
-- Alain Picard, comp.lang.lisp From mark.russell at redmoon.me.uk Tue Aug 2 14:24:07 2005 From: mark.russell at redmoon.me.uk (Mark Russell) Date: Tue, 02 Aug 2005 13:24:07 +0100 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: <42EF444A.4040108@gmail.com> References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp> <42EF444A.4040108@gmail.com> Message-ID: <1122985447.6108.9.camel@localhost> On Tue, 2005-08-02 at 11:00, Nick Coghlan wrote: > With this hierarchy, the recommended parent class for application errors > becomes Error, ... And presumably Error could also be the recommended exception for quick'n'dirty scripts. Mark Russell From pinard at iro.umontreal.ca Tue Aug 2 16:49:08 2005 From: pinard at iro.umontreal.ca (François Pinard) Date: Tue, 2 Aug 2005 10:49:08 -0400 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42EF2794.1000209@v.loewis.de> References: <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> Message-ID: <20050802144908.GA7898@alcyon.progiciels-bpi.ca> [Martin von Löwis] > The PEP is only about Subversion. I think anything but Subversion is > ruled out because: > - there is no offer to host that anywhere (for subversion, there is > already svn.python.org) > - there is no support for converting a CVS repository (for subversion, > there is cvs2svn) I quickly discussed Subversion with a few friends.
While some say Subversion is the most reasonable avenue nowadays, others told me they found something more appealing than Subversion: http://www.venge.net/monotone/ The hosting paradigm is fairly different, and for a few weeks now, they have had a CVS repository converter. In my very naive eyes, the centralised aspects of Python development are better represented with Subversion. It is notable also that Subversion is more Python-friendly than Monotone, which uses Lua-based scripting. I did not dig into why, but at first glance, Monotone does not seduce me. On the other hand, the two guys speaking well of Monotone are well informed (and also well known), so I would not dismiss their opinion so lightly. So, it might be worth at least a quick look? :-) -- François Pinard http://pinard.progiciels-bpi.ca From foom at fuhm.net Tue Aug 2 16:53:33 2005 From: foom at fuhm.net (James Y Knight) Date: Tue, 2 Aug 2005 10:53:33 -0400 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> References: <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> Message-ID: <9EDA49FB-1E9B-4558-9441-90A65ECC5A52@fuhm.net> On Aug 2, 2005, at 12:31 AM, Phillip J. Eby wrote: > If you think that a KeyboardInterrupt is an error, then it's an > indication > that Python's documentation and the current exception class > hierarchy has > failed to educate you sufficiently, and that we *really* need to add a > class like ControlFlowException into the hierarchy to help make > sure that > other people don't end up sharing your misunderstanding. ;-) No... KeyboardInterrupt (just like other asynchronous exceptions) really should be treated as a critical error.
Doing anything other than killing your process off after receiving it is just inviting disaster. Because the exception can have occurred absolutely anywhere, it is unsuitable for normal use. Aborting a function between two arbitrary bytecodes and trying to continue operation is simply a recipe for disaster. For example, in threadable.py between line 200 "saved_state = self._release_save()" and 201 "try: # restore state no matter what (e.g., KeyboardInterrupt)" would be a bad place to hit control-c if you ever need to use that Condition again. This kind of problem is pervasive and unavoidable. If you want to do a clean shutdown on control-c, the only sane way is to install a custom signal handler that doesn't throw an asynchronous exception at you. There's a reason asynchronously killing off threads was deprecated in Java. James From raymond.hettinger at verizon.net Tue Aug 2 16:55:43 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 02 Aug 2005 10:55:43 -0400 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <20050802144908.GA7898@alcyon.progiciels-bpi.ca> Message-ID: <000001c59772$4224c8a0$92b2958d@oemcomputer> [François Pinard] > While some say Subversion is the most reasonable avenue nowadays, others > told me they found something more appealing than Subversion: > > http://www.venge.net/monotone/ The current release is 0.21 which suggests that it is not ready for primetime.
Raymond From foom at fuhm.net Tue Aug 2 17:08:11 2005 From: foom at fuhm.net (James Y Knight) Date: Tue, 2 Aug 2005 11:08:11 -0400 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: <5.1.1.6.0.20050731124043.027e3768@mail.telecommunity.com> References: <42EC21F8.3040704@gmail.com> <42EB576F.3060309@egenix.com> <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <5.1.1.6.0.20050731124043.027e3768@mail.telecommunity.com> Message-ID: <42688208-3A8E-492F-86D8-E4FE76FB294D@fuhm.net> On Jul 31, 2005, at 12:49 PM, Phillip J. Eby wrote: > I think you're ignoring the part where most exception handlers are > already broken. At least adding CriticalException and > ControlFlowException makes it possible to add this: > > try: > ... > except (CriticalException,ControlFlowException): > raise > except: > ... > > This isn't great, I admit, but at least it would actually *work*. > > I also don't see how changing the recommended base class from > Exception to Error causes *problems* for every library. Sure, it > forces them to move (eventually!), but it's a trivial change, and > makes it *possible* to do the right thing with exceptions (e.g. > except Error:) as soon as all the libraries you depend on have > moved to using Error. Exactly. That is the problem. Adding a new class above Exception in the hierarchy allows everything to work nicely *now*. Recommended practice has been to have exceptions derive from Exception for a looong time. Changing everybody now will take approximately forever, which means the Error class is pretty much useless. By keeping the definition of Exception as "the standard thing you should derive from and catch", and adding a superclass with things you shouldn't catch, you make conversion a lot simpler. If you're not worried about compatibility with ye olde string exceptions, you can start using "except Exception" immediately. 
If you are, you can do as in your example above. And when Python v.Future comes around, "except Exception" will be the only reasonable thing to do. If, on the other hand, we use Exception as the base class and Error as the thing you should use, I predict that even by the time Python v.Future comes out, many libraries/programs will still have exceptions deriving from Exception, thus making the Exception/Error distinction somewhat broken. James From tjreedy at udel.edu Tue Aug 2 17:09:30 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 2 Aug 2005 11:09:30 -0400 Subject: [Python-Dev] __autoinit__ (Was: Proposal: reducing self.x=x; self.y=y; self.z=z boilerplate code) References: <139701967.20050731214526@intercable.ru> Message-ID: "falcon" wrote in message news:139701967.20050731214526 at intercable.ru... > Hello python-list, > > As I Understood, semantic may be next: [snip] This was properly posted to the general Python discussion group/list. Reposted here, to the Python development list/group, it is off-topic. If you did not get a satisfactory answer to your first post to the general group/list, it may be because your question is confusing. So you might want to try again there with different words. Terry J. Reedy From pje at telecommunity.com Tue Aug 2 17:39:19 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 02 Aug 2005 11:39:19 -0400 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp> References: <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050802113712.025aa098@mail.telecommunity.com> At 04:13 PM 8/2/2005 +0900, Stephen J. Turnbull wrote: >Now, somebody proposed: > >Raisable -+- Exception > +- ...
> +- ControlFlowException -+- StopIteration > +- KeyboardInterrupt > >As I wrote above, I see no use for that; I think that's what you're >saying too, right? AIUI, you want > >Raisable -+- Exception > +- ... > +- StopIteration > +- KeyboardInterrupt > >so that only the appropriate control construct or an explicit except >can catch a control flow exception. No, I want ControlFlowException to exist as a parent so that code today can work around the fact that bare "except:" and "except Exception:" catch everything. In Python 3.0, we should have "except Error:" and be able to have it catch everything but control flow exceptions and possibly critical errors. From pje at telecommunity.com Tue Aug 2 17:48:03 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 02 Aug 2005 11:48:03 -0400 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: <42EF444A.4040108@gmail.com> References: <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp> <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <5.1.1.6.0.20050802113926.02895d08@mail.telecommunity.com> At 08:00 PM 8/2/2005 +1000, Nick Coghlan wrote: >Python 2.4 Compatible Improved Exception Hierarchy v 0.1 >======================================================== > >Exception >+-- ControlFlowException (new) > +-- GeneratorExit (new) > +-- StopIteration > +-- SystemExit > +-- KeyboardInterrupt (dual-inheritance new) >+-- StandardError > +-- KeyboardInterrupt (dual-inheritance new) > +-- CriticalError (new) > +-- MemoryError > +-- SystemError > +-- Error (new) Couldn't we make Error a parent of StandardError, here, and then make the CriticalError subclasses dual-inherit StandardError, i.e.: Error CriticalError MemoryError (also subclass StandardError) SystemError (also subclass 
StandardError) StandardError ... In this way, we can encourage people to inherit from Error. Or maybe we should just make the primary hierarchy the way we want it to be, and only cross-link exceptions to StandardError that were previously under StandardError, i.e.: Raisable ControlFlowException ... (cross-inherit to StandardError as needed) CriticalError ... (cross-inherit to StandardError as needed) Exception ... This wouldn't avoid "except Exception" and bare except being problems, but at least you can catch the uncatchables and reraise them. Hm. Maybe we should include a Reraisable base for ControlFlowException and CriticalError? Then you could do "except Reraisable: raise" as a nice way to do the right thing until Python 3.0. It seems to me that multiple inheritance is definitely the right idea, though. That way, we can get the hierarchy we really want with only a minimum of boilerplate in pre-3.0 to make it actually work. From pje at telecommunity.com Tue Aug 2 17:57:15 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 02 Aug 2005 11:57:15 -0400 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: <9EDA49FB-1E9B-4558-9441-90A65ECC5A52@fuhm.net> References: <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> <42EB576F.3060309@egenix.com> <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050802115018.027f4360@mail.telecommunity.com> At 10:53 AM 8/2/2005 -0400, James Y Knight wrote: >No... KeyboardInterrupt (just like other asynchronous exceptions) >really should be treated as a critical error. Doing anything other >than killing your process off after receiving it is just inviting >disaster. Because the exception can have occurred absolutely >anywhere, it is unsuitable for normal use. 
>Aborting a function
>between two arbitrary bytecodes and trying to continue operation is
>simply a recipe for disaster. For example, in threading.py between
>line 200 "saved_state = self._release_save()" and 201 "try: #
>restore state no matter what (e.g., KeyboardInterrupt)" would be a
>bad place to hit control-c if you ever need to use that Condition
>again. This kind of problem is pervasive and unavoidable.

In my personal experience with using KeyboardInterrupt I've only ever needed to do some minor cleanup of external state, such as removing lockfiles, abandoning connections, etc., so I haven't encountered this issue before. I can see, however, why it would be a problem if you were trying to keep the program *running* - but I've been assuming that KeyboardInterrupt is something that always means "attempt to shut down gracefully". I suppose considering it a critical error might put it more clearly in that category. I'm not 100% convinced, but you've definitely given me something to think about.

On the other hand, any exception can happen "between two arbitrary bytecodes", so there are always circumstances that need special attention, or require a "with block_signals" statement or something.

I suppose this issue may have to come down to BDFL pronouncement.

From pinard at iro.umontreal.ca Tue Aug 2 18:06:20 2005
From: pinard at iro.umontreal.ca (François Pinard)
Date: Tue, 2 Aug 2005 12:06:20 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <000001c59772$4224c8a0$92b2958d@oemcomputer>
References: <20050802144908.GA7898@alcyon.progiciels-bpi.ca> <000001c59772$4224c8a0$92b2958d@oemcomputer>
Message-ID: <20050802160620.GA9652@alcyon.progiciels-bpi.ca>

[Raymond Hettinger]

> > http://www.venge.net/monotone/

> The current release is 0.21 which suggests that it is not ready for
> primetime.

It suggests it, yes, and to me as well.
On the other hand, there is a common prejudice that something requires many releases, or frequent releases, to be qualified as good. While it might be true on average, this is not necessarily true: some packages need not so many steps for becoming very usable, mature or stable. (Note that I'm not asserting anything about Monotone, here.) We should merely keep an open mind.

--
François Pinard   http://pinard.progiciels-bpi.ca

From bcannon at gmail.com Tue Aug 2 18:56:05 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 2 Aug 2005 09:56:05 -0700
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <5.1.1.6.0.20050802113926.02895d08@mail.telecommunity.com>
References: <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp> <42EF444A.4040108@gmail.com> <5.1.1.6.0.20050802113926.02895d08@mail.telecommunity.com>
Message-ID:

On 8/2/05, Phillip J. Eby wrote:
> At 08:00 PM 8/2/2005 +1000, Nick Coghlan wrote:
[SNIP]
> Or maybe we
> should just make the primary hierarchy the way we want it to be, and only
> cross-link exceptions to StandardError that were previously under
> StandardError, i.e.:
>
> Raisable
>     ControlFlowException
>         ... (cross-inherit to StandardError as needed)
>     CriticalError
>         ... (cross-inherit to StandardError as needed)
>     Exception
>         ...
>
> This wouldn't avoid "except Exception" and bare except being problems, but
> at least you can catch the uncatchables and reraise them.
>

I think that is acceptable. Using multiple inheritance for the exceptions that have been moved out of the main exception branch seems like it will be the best solution for giving some form of backwards-compatibility for now while allowing things to still move forward and not cripple the changes we want to make.

> Hm. Maybe we should include a Reraisable base for ControlFlowException and
> CriticalError?
> Then you could do "except Reraisable: raise" as a nice way
> to do the right thing until Python 3.0.
>

As in exceptions that don't inherit from Error/StandardError/whatever_the_main_exception_is can easily be caught separately?

> It seems to me that multiple inheritance is definitely the right idea,
> though. That way, we can get the hierarchy we really want with only a
> minimum of boilerplate in pre-3.0 to make it actually work.
>

Yeah. I think name aliasing and multiple inheritance will take us a long way. Warnings should be able to take us the rest of the way.

-Brett (who is still waiting for a number; Barry, David, you out there?)

From gabriel.becedillas at corest.com Tue Aug 2 20:59:58 2005
From: gabriel.becedillas at corest.com (Gabriel Becedillas)
Date: Tue, 02 Aug 2005 15:59:58 -0300
Subject: [Python-Dev] Syscall Proxying in Python
In-Reply-To: <1122919731.9688.43.camel@warna.corp.google.com>
References: <42EE5DAA.8040200@corest.com> <1122919731.9688.43.camel@warna.corp.google.com>
Message-ID: <42EFC2AE.6020205@corest.com>

Donovan Baarda wrote:
> On Mon, 2005-08-01 at 10:36, Gabriel Becedillas wrote:
>
>>Hi,
>>We embedded Python 2.0.1 in our product a few years ago and we'd like to
>>upgrade to Python 2.4.1. This was not a simple task, because we needed
>>to execute syscalls on a remote host. We modified Python's source code
>>in several places to call our own versions of some functions. For
>>example, instead of calling fopen(...), the source code was modified to
>>call remote_fopen(...), and the same was done with other libc functions.
>>Socket functions were hooked too (we modified socket.c), Windows
>>Registry functions, etc..
>
>
> Wow... you guys sure did it the hard way. If you had done it at the
> Python level, you would have had a much easier time of both implementing
> and updating it.
>
> As an example, have a look at my osVFS stuff.
> This is a replacement for
> the os module and open() that tricks Python into using a virtual file
> system;
>
> http://minkirri.apana.org.au/~abo/projects/osVFS
>
>

Hi, thanks for your reply.
The problem I see with the approach you're suggesting is that I would have to rewrite a lot of code to make it work the way I want. We already have the syscall proxying stuff with a stdio layer on top of it. I would have to rewrite some parts of some modules and use my own versions of stdio functions, and that is pretty much the same as we have done before.
There are also native objects that use stdio functions, and I would have to replace those too, as well as modules that have some native code that uses stdio or sockets. I would have to duplicate those files and do the same kind of search/replace work that we have done previously and that we'd like to avoid.
Please let me know if I misunderstood you.
Thanks again.

--
Gabriel Becedillas
Developer
CORE SECURITY TECHNOLOGIES

Florida 141 - 2º cuerpo - 7º piso
C1005AAC Buenos Aires - Argentina
Tel/Fax: (54 11) 5032-CORE (2673)
http://www.corest.com

From metawilm at gmail.com Tue Aug 2 21:39:49 2005
From: metawilm at gmail.com (Willem Broekema)
Date: Tue, 2 Aug 2005 21:39:49 +0200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <871x5d10lf.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <871x5d10lf.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID:

On 8/2/05, Stephen J. Turnbull wrote:
> I don't see it that way. Rather, "Raisable" is the closest equivalent
> to "serious-condition", and "CriticalException" is an intermediate
> class that has no counterpart in Lisp usage.

That would imply that all raisables are 'serious' in the Lisp sense, which is defined as "all conditions serious enough to require interactive intervention if not handled".
Yet Python warnings are raisable (as raisable is the root), but are certainly not serious in the Lisp sense. (This is complicated by the fact that warnings are raised using 'signal'. More below.)

Willem:
> I'd prefer the 'condition' and 'error' terminology, and to
> label a keyboard interrupt a condition, not any kind of
> exception or error.

To clarify myself: a 'serious-condition' in CL stands for "all conditions serious enough to require interactive intervention if not handled"; I meant to label KI a 'serious-condition'.

Stephen:
> Now, that does bother me. Anything we will not permit a program
> to ignore with a bare "except: pass" if it so chooses had better be
> more serious than merely a "condition". Also, to me a "condition" is
> something that I poll for, it does not interrupt me. To me, a
> condition (even a serious one) is precisely the kind of thing that I
> should be able to ignore with a bare except!

If I understand your position correctly, it is probably not changed yet by the above clarification.

It may surprise you that in Lisp a bare except (ignore-errors) does not catch non-serious things like warnings. And if left uncaught, a warning leaks out to the top level, gets printed and is subsequently ignored. That's because non-serious conditions are (usually) raised using 'signal', not 'error'. The default top-level warnings handler just prints it, but does not influence the program control flow, so the execution resumes just after the (warn ..) form.

This probably marks a very important difference between Python and CL. I think one could say that where in Python one would use a bare except to catch both non-serious and serious exceptions, in CL one normally doesn't bother with catching the non-serious ones because they will not create havoc at an outer level anyway. So in Python a warning must be caught by a bare except, while in Lisp it would not. And from this follow different constraints on the hierarchy.
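To make the Python side of this comparison concrete, here is a minimal sketch, assuming a Python version that provides ``warnings.catch_warnings`` (it did not exist in 2.4). The helper names ``risky`` and ``run_with_filter`` are invented for illustration. It suggests that a warning only interrupts control flow, and so only needs catching, once the warnings filter has been set to turn it into an exception; with other filter actions execution simply continues after the ``warn`` call, much like the CL behaviour described above.

```python
import warnings


def risky():
    # Issue a warning in the middle of otherwise normal work.
    warnings.warn("something looks off", UserWarning)
    return "finished"


def run_with_filter(action):
    """Run risky() under the given warnings filter action (hypothetical helper)."""
    with warnings.catch_warnings():
        warnings.simplefilter(action)
        try:
            return risky()
        except Warning:
            # Only reached when the filter turns the warning into an exception.
            return "interrupted"


print(run_with_filter("ignore"))  # warning suppressed, control flow untouched
print(run_with_filter("error"))   # warning raised as an exception and caught
```

So the constraint on the hierarchy applies to warnings that have been promoted to exceptions by the filter, not to every call to ``warnings.warn``.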
By the way, this is the condition hierarchy in Allegro CL (most of which is prescribed by the ANSI standard):

- Willem

From martin at v.loewis.de Tue Aug 2 23:16:05 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 02 Aug 2005 23:16:05 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42EF436B.3050308@egenix.com>
References: <42E93940.6080708@v.loewis.de> <42EA061A.9040609@egenix.com> <42EA98CC.4060003@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com>
Message-ID: <42EFE295.6040906@v.loewis.de>

M.-A. Lemburg wrote:
> True, but if we never ask, we'll never know :-)
>
> My question was: Would asking a professional hosting company
> be a reasonable approach ?

It would be an option, yes, of course. It's not an approach that *I* would be willing to implement, though.

> From the answers, I take it that there's not much trust in these
> offers, so I guess there's not much desire to PSF money into this.

I haven't received any offers to make a qualified statement. I only know that I would oppose an approach to ask somebody but our volunteers to do it for free, and I also know that I don't want to spend my time researching commercial alternatives (although I wouldn't mind if you spent your time).
Regards,
Martin

From martin at v.loewis.de Tue Aug 2 23:25:56 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 02 Aug 2005 23:25:56 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050802144908.GA7898@alcyon.progiciels-bpi.ca>
References: <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050802144908.GA7898@alcyon.progiciels-bpi.ca>
Message-ID: <42EFE4E4.4020507@v.loewis.de>

François Pinard wrote:
> So, it might be worth at least a quick look? :-)

Certainly not my look - although I'm willing to integrate anything that people contribute into the PEP.

Regards,
Martin

From bcannon at gmail.com Wed Aug 3 02:34:01 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 2 Aug 2005 17:34:01 -0700
Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0
Message-ID:

OK, having taken in all of the suggestions, here is another revision round. I think I still have a place or two where I partially ignored people just because there was not a severe uproar and I still think the original idea is good (renaming RuntimeError, for instance). I also added notes on handling the transition and rejected ideas.

There is now only one open issue, which is whether ControlFlowException should be removed.

And I am still waiting on a PEP number to be able to check this into CVS and push me to flesh out the references.
=)

--------------------------------------------------------------
PEP: XXX
Title: Exception Reorganization for Python 3.0
Version: $Revision: 1.5 $
Last-Modified: $Date: 2005/06/07 13:17:37 $
Author: Brett Cannon
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Jul-2005
Post-History: XX-XXX-XXX

.. contents::

Abstract
========

Python, as of version 2.4, has 38 exceptions (including warnings) in the built-in namespace in a rather shallow hierarchy. This list of classes has grown over the years without a chance to learn from mistakes and clean up the hierarchy. This PEP proposes doing a reorganization for Python 3.0 when backwards-compatibility is not an issue. Along with this reorganization, adding a requirement that all objects passed to a ``raise`` statement must inherit from a specific superclass is proposed. Lastly, the removal of bare ``except`` clauses is suggested.

Rationale
=========

Exceptions are a critical part of Python. While exceptions are traditionally used to signal errors in a program, they have also grown to be used for flow control for things such as iterators. Their importance is great.

But the organization of the exception hierarchy is suboptimal. Mostly for backwards-compatibility reasons, the hierarchy has stayed very flat and old exceptions whose usefulness has not been proven have been left in. Making exceptions more hierarchical would help facilitate exception handling by making catching exceptions using inheritance much more logical. This should also help lead to fewer errors from being too broad in what exceptions are caught in an ``except`` clause.

A required superclass for all exceptions is also being proposed [Summary2004-08-01]_. By requiring any object that is used in a ``raise`` statement to inherit from a specific superclass, certain attributes (such as those laid out in PEP 344 [PEP344]_) can be guaranteed to exist. This also will lead to the planned removal of string exceptions.
Lastly, bare ``except`` clauses are to be removed [XXX Guido's reply to my initial draft]_. Often people use a bare ``except`` when what they really wanted was for non-critical exceptions to be caught while more system-specific ones, such as MemoryError, pass through and halt the interpreter. This leads to errors that can be hard to debug thanks to exceptions' sometimes unpredictable execution flow. Removing bare clauses also makes ``except`` statements follow the "explicit is better than implicit" tenet of Python [XXX]_.

Philosophy of Reorganization
============================

There are several goals in this reorganization that defined the philosophy used to guide the work. One goal was to prune out unneeded exceptions. Extraneous exceptions should not be left in since they just serve to clutter the built-in namespace. Unneeded exceptions also dilute the importance of other exceptions by splitting uses between several exceptions when all uses should have been under a single exception.

Another goal was to introduce any exceptions that were deemed needed to fill any holes in the hierarchy. Most new exceptions were added to flesh out the inheritance hierarchy to make it easier to catch a category of exceptions with a simpler ``except`` clause.

Changing inheritance to make it more reasonable was a goal. As stated above, having proper inheritance allows for more accurate ``except`` statements when catching exceptions based on the inheritance tree.

Lastly, any renaming to make an exception's use more obvious from its name was done. Having to look up what an exception is meant to be used for because the name does not properly reflect its usage is annoying and slows down debugging. Having a proper name also makes debugging easier on new programmers. But for the simplicity of existing users and for transitioning to Python 3.0, only exceptions whose names were fairly out of alignment with their stated purpose have been renamed.
New Hierarchy
=============

  Exception
  +-- CriticalException (new)
      +-- KeyboardInterrupt
      +-- MemoryError
      +-- SystemError
  +-- ControlFlowException (new)
      +-- StopIteration
      +-- GeneratorExit
      +-- SystemExit
  +-- StandardError
      +-- AssertionError
      +-- SyntaxError
          +-- IndentationError
              +-- TabError
      +-- UserException (rename of RuntimeError)
      +-- ArithmeticError
          +-- FloatingPointError
          +-- DivideByZeroError
          +-- OverflowError
      +-- UnicodeError
          +-- UnicodeDecodeError
          +-- UnicodeEncodeError
          +-- UnicodeTranslateError
      +-- LookupError
          +-- IndexError
          +-- KeyError
      +-- TypeError
      +-- AttributeError
      +-- EnvironmentError
          +-- OSError
          +-- IOError
              +-- EOFError (new inheritance)
      +-- ImportError
      +-- NotImplementedError (new inheritance)
      +-- NamespaceError (rename of NameError)
          +-- UnboundGlobalError (new)
          +-- UnboundLocalError
          +-- UnboundFreeError (new)
      +-- WeakReferenceError (rename of ReferenceError)
      +-- ValueError
  +-- Warning
      +-- UserWarning
      +-- AnyDeprecationWarning (new)
          +-- PendingDeprecationWarning
          +-- DeprecationWarning
      +-- SyntaxWarning
      +-- SemanticsWarning (rename of RuntimeWarning)
      +-- FutureWarning

Differences Compared to Python 2.4
==================================

Changes to exceptions from Python 2.4 can take shape in three forms: removal, renaming, or change in their superclass. There are also new exceptions introduced in the proposed hierarchy.

New Exceptions
--------------

CriticalException
'''''''''''''''''

The superclass for exceptions for which a severe error has occurred that one would not want to recover from. The name is meant to reflect the point that these exceptions are usually raised only when the interpreter should most likely be terminated. All classes that inherit from this class are raised when the virtual machine has an asynchronous exception to raise about its state.

ControlFlowException
''''''''''''''''''''

This exception exists as a superclass for all exceptions that directly deal with control flow.
Inheriting from Exception instead of StandardError prevents them from being caught accidentally when one wants to catch errors. The name, by not mentioning "Error", does not lead one to mistake the subclasses for errors.

UnboundGlobalError
''''''''''''''''''

Raised when a global variable is not found.

UnboundFreeError
''''''''''''''''

Raised when a free variable is not found.

AnyDeprecationWarning
'''''''''''''''''''''

A common superclass for all deprecation-related exceptions. While having DeprecationWarning inherit from PendingDeprecationWarning was suggested because a DeprecationWarning can be viewed as a PendingDeprecationWarning that is happening now, the logic was not agreed upon by a majority. But since the exceptions are related, creating a common superclass is warranted.

Removed Exceptions
------------------

WindowsError
''''''''''''

Too OS-specific to be kept in the built-in exception hierarchy.

Renamed Exceptions
------------------

RuntimeError
''''''''''''

Renamed UserException.

Meant for use as a generic exception when one does not want to create a new exception class but also does not want to raise an exception that might be caught based on inheritance, RuntimeError is poorly named. Its name in Python 2.4 seems to suggest an error that occurred at runtime, possibly an error in the VM. Renaming the exception to UserException more clearly states the purpose for the exception as a quick-and-dirty exception for the user to use. The name also keeps it in line with UserWarning.

ReferenceError
''''''''''''''

Renamed WeakReferenceError.

ReferenceError was added to the built-in exception hierarchy in Python 2.2 [exceptionsmodule]_. Taken directly from the ``weakref`` module, its name comes directly from its original name when it resided in the module. Unfortunately its name does not suggest its connection to weak references and thus deserves a renaming.

NameError
'''''''''

Renamed NamespaceError.
While NameError suggests its common use, it is not entirely apparent. Making it more of a superclass for namespace-related exceptions warrants a renaming to make its use abundantly clear. Plus the documentation of the exception module [XXX]_ states that it is actually meant for global names and not for just any exception.

RuntimeWarning
''''''''''''''

Renamed SemanticsWarning.

RuntimeWarning is meant to represent semantic changes coming in the future. But while saying that it affects "runtime" is true, flat-out stating it is a semantic change is much clearer, eliminating any possible association of "runtime" with the virtual machine specifically.

Changed Inheritance
-------------------

AttributeError
''''''''''''''

Inherits from StandardError.

Originally inheriting from NotImplementedError, AttributeError is typically raised because of the lack of an attribute, which does not necessarily mean it was not implemented but just not set yet. Thus it has been decoupled from NotImplementedError.

EOFError
''''''''

Subclasses IOError.

Since an EOF comes from I/O it only makes sense that it be considered an I/O error.

Required Superclass for ``raise``
=================================

By requiring all objects passed to a ``raise`` statement to inherit from a specific superclass, one is guaranteed that all exceptions will have certain attributes. If PEP 344 [PEP344]_ is accepted, the attributes outlined there will be guaranteed to be on all exceptions raised. This should help facilitate debugging by making the querying of information from exceptions much easier.

The proposed hierarchy has Exception as the required class that one must inherit from.

Implementation
--------------

Enforcement is straightforward. Modifying ``RAISE_VARARGS`` to do an inheritance check first before raising an exception should be enough. For the C API, all functions that set an exception will have the same inheritance check.
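The inheritance check described above can be sketched at the Python level. This is only an illustration under stated assumptions, not the proposed change to ``RAISE_VARARGS`` itself: the function names ``checked_raise`` and ``outcome`` are hypothetical, and the choice of ``TypeError`` for a rejected object is an assumption, since the draft does not say which exception a failed check should produce.

```python
def checked_raise(obj, required_base=Exception):
    """Raise obj only if it is a class or instance deriving from required_base.

    Hypothetical helper mimicking the inheritance check the draft proposes
    for the interpreter's raise machinery.
    """
    if isinstance(obj, type):
        ok = issubclass(obj, required_base)
    else:
        ok = isinstance(obj, required_base)
    if not ok:
        # Assumption: the draft does not name the error for a failed check.
        raise TypeError("exceptions must derive from %s" % required_base.__name__)
    raise obj


def outcome(obj):
    """Return the name of whatever exception checked_raise(obj) ends up raising."""
    try:
        checked_raise(obj)
    except Exception as exc:
        return type(exc).__name__


print(outcome(ValueError("bad value")))    # an instance with proper inheritance
print(outcome(KeyError))                   # a class with proper inheritance
print(outcome("a string exception"))       # rejected by the check
```

With a check like this in place, string exceptions and arbitrary objects are rejected before they can propagate, which is what lets a handler rely on the attributes of the common superclass.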
Removal of Bare ``except`` Clauses
==================================

One of Python's basic tenets is "explicit is better than implicit". Unfortunately a bare ``except`` clause implicitly states it should catch all exceptions. While useful as a way to catch all exceptions when any object can be raised, requiring a specific superclass be inherited in order to raise an object gives a single class to catch to cover all exceptions. With this in mind, the removal of bare ``except`` statements is justified.

Implementation
--------------

A simple change to the grammar is all that is needed for implementation.

Transition Plan
===============

Exception Hierarchy Changes
---------------------------

New Exceptions
''''''''''''''

New exceptions can simply be added to the built-in namespace. Any pre-existing objects with the same name will mask the new exceptions, preserving backwards-compatibility.

Renamed Exceptions
''''''''''''''''''

Renamed exceptions will directly subclass the new names. When the old exceptions are instantiated (which occurs when an exception is caught, either by a ``try`` statement or by propagating to the top of the execution stack), a PendingDeprecationWarning will be raised.

This should properly preserve backwards-compatibility as old usage won't change and the new names can be used to also catch exceptions using the old name. The warning of the deprecation is also kept simple.

New Inheritance for Old Exceptions
''''''''''''''''''''''''''''''''''

Using multiple inheritance to our advantage, exceptions whose inheritance has changed in such a way as for them to not necessarily be caught by pre-existing ``except`` clauses can be made backwards-compatible. By inheriting from both the new superclasses as well as the original superclasses, existing ``except`` clauses will continue to work as before while allowing the new inheritance to be used for new clauses.
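Both transition mechanisms can be sketched with ordinary classes. The following is an illustration under stated assumptions, not the planned implementation: ``OldRuntimeError``, ``OldStandardError``, ``DemoInterrupt`` and ``is_caught`` are invented stand-ins so the example does not shadow real built-ins, and issuing the warning from ``__init__`` is only one way the instantiation-time warning might be realized.

```python
import warnings


class UserException(Exception):
    """New name proposed in the draft for RuntimeError."""


class OldRuntimeError(UserException):
    """Hypothetical stand-in for the old name; it subclasses the new name."""

    def __init__(self, *args):
        # Assumption: the warning is issued when the old name is instantiated.
        warnings.warn("OldRuntimeError has been renamed UserException",
                      PendingDeprecationWarning, stacklevel=2)
        super().__init__(*args)


class CriticalException(Exception):
    """Stand-in for the proposed new superclass."""


class OldStandardError(Exception):
    """Stand-in for the superclass an exception used to have."""


class DemoInterrupt(CriticalException, OldStandardError):
    """Inherits from both the new and the original superclass."""


def is_caught(exc, handler):
    """Report whether `except handler` catches the raised exception exc."""
    try:
        raise exc
    except handler:
        return True
    except Exception:
        return False


with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # A clause written against the new name also catches the old name.
    print(is_caught(OldRuntimeError("boom"), UserException))

# Old-style and new-style clauses both catch the dual-inheritance class.
print(is_caught(DemoInterrupt(), OldStandardError))
print(is_caught(DemoInterrupt(), CriticalException))
```

The design choice worth noting is the direction of the subclassing: because the old name derives from the new one, a clause naming the old class does not catch instances created under the new name, so only code that has already migrated sees the broader behaviour.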
A PendingDeprecationWarning will be raised based on whether the bytecode ``COMPARE_OP(10)`` results in an exception being caught that would not have been caught under the new hierarchy. This will require hard-coding in the implementation of the bytecode.

Removed Exceptions
''''''''''''''''''

Exceptions scheduled for removal will be transitioned much like the old names of renamed exceptions. Upon instantiation a PendingDeprecationWarning will be raised stating that the exception is due to be removed by Python 3.0.

Required Superclass for ``raise``
---------------------------------

A SemanticsWarning will be raised when an object is passed to ``raise`` that does not have the proper inheritance.

Removal of Bare ``except`` Clauses
----------------------------------

A PendingDeprecationWarning will be raised when a bare ``except`` clause is used.

Rejected Ideas
==============

Threads on python-dev discussing this PEP can be found at [XXX]_.

KeyboardInterrupt inheriting from ControlFlowException
------------------------------------------------------

KeyboardInterrupt has been a contentious point within this hierarchy. Some view the exception as more control flow being caused by the user. But with its asynchronous cause, thanks to the user being able to trigger the exception at any point in code, it has a more proper place inheriting from CriticalException. It also keeps the name of the exception from being "CriticalError".

Renaming Exception to Raisable, StandardError to Exception
----------------------------------------------------------

While the naming makes sense and emphasizes the required superclass as what must be inherited from for raising an object, the naming is not required. Keeping the existing names minimizes the code changes needed.
DeprecationWarning Inheriting From PendingDeprecationWarning ------------------------------------------------------------ Originally proposed because a DeprecationWarning can be viewed as a PendingDeprecationWarning that is being removed in the next version. But enough people thought the inheritance could logically work the other way the idea was dropped. AttributeError Inheriting From TypeError or NameError ----------------------------------------------------- Viewing attributes as part of the interface of a type caused the idea of inheriting from TypeError. But that partially defeats the thinking of duck typing and thus was dropped. Inheriting from NameError was suggested because objects can be viewed as having their own namespace that the attributes lived in and when they are not found it is a namespace failure. This was also dropped as a possibility since not everyone shared this view. Removal of EnvironmentError --------------------------- Originally proposed based on the idea that EnvironmentError was an unneeded distinction, the BDFL overruled this idea [XXX]_. Introduction of MacError and UnixError -------------------------------------- Proposed to add symmetry to WindowsError, the BDFL said they won't be used enough [XXX]_. The idea of then removing WindowsError was proposed and accepted as reasonable, thus completely negating the idea of adding these exceptions. SystemError Subclassing SystemExit ---------------------------------- Proposed because a SystemError is meant to lead to a system exit, the idea was removed since CriticalException signifies this better. ControlFlowException Under StandardError ---------------------------------------- It has been suggested that ControlFlowException inherit from StandardError. This idea has been rejected based on the thinking that control flow exceptions are typically not desired to be caught in a generic fashion as StandardError will usually be used. Open Issues =========== Remove ControlFlowException? 
----------------------------

It has been suggested that ControlFlowException is not needed. Since the desire to catch any control flow exception will be atypical, the suggestion is to just remove the exception and let the exceptions that inherited from it inherit directly from Exception. This still preserves the separation from StandardError which is one of the driving factors behind the introduction of the exception.

Acknowledgements
================

Thanks to Robert Brewer, Josiah Carlson, Nick Coghlan, Timothy Delaney, Jack Diedrich, Fred L. Drake, Jr., Phillip J. Eby, Greg Ewing, James Y. Knight, MA Lemburg, Guido van Rossum, Stephen J. Turnbull and everyone else I missed for participating in the discussion.

References
==========

.. [PEP342] PEP 342 (Coroutines via Enhanced Generators)
   (http://www.python.org/peps/pep-0342.html)

.. [PEP344] PEP 344 (Exception Chaining and Embedded Tracebacks)
   (http://www.python.org/peps/pep-0344.html)

.. [exceptionsmodule] 'exceptions' module
   (http://docs.python.org/lib/module-exceptions.html)

.. [Summary2004-08-01] python-dev Summary (An exception is an exception, unless it doesn't inherit from Exception)
   (http://www.python.org/dev/summary/2004-08-01_2004-08-15.html#an-exception-is-an-exception-unless-it-doesn-t-inherit-from-exception)

.. [Summary2004-09-01] python-dev Summary (Cleaning the Exception House)
   (http://www.python.org/dev/summary/2004-09-01_2004-09-15.html#cleaning-the-exception-house)

.. [python-dev1] python-dev email (Exception hierarchy)
   (http://mail.python.org/pipermail/python-dev/2004-August/047908.html)

.. [python-dev2] python-dev email (Dangerous exceptions)
   (http://mail.python.org/pipermail/python-dev/2004-September/048681.html)

Copyright
=========

This document has been placed in the public domain.

From stephen at xemacs.org Wed Aug 3 02:49:15 2005
From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Wed, 03 Aug 2005 09:49:15 +0900 Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0 In-Reply-To: (Willem Broekema's message of "Tue, 2 Aug 2005 21:39:49 +0200") References: <5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com> <42EC21F8.3040704@gmail.com> <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <871x5d10lf.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <87wtn3hmk4.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Willem" == Willem Broekema writes: Willem> On 8/2/05, Stephen J. Turnbull wrote: >> I don't see it that way. Rather, "Raisable" is the closest >> equivalent to "serious-condition", and "CriticalException" is >> an intermediate class that has no counterpart in Lisp usage. Willem> That would imply that all raisables are 'serious' in the Willem> Lisp sense, No, it implies that Phillip was right when he wrote that the Lisp hierarchy of signals is not relevant (as a whole) to the discussion of Python Raisables. Of course partial analogies are useful. In any case, Nick's idiom of "except ControlFlowException: raise" clarified everything for me. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
From abo at minkirri.apana.org.au Wed Aug 3 03:03:04 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Tue, 02 Aug 2005 18:03:04 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050802160620.GA9652@alcyon.progiciels-bpi.ca>
References: <20050802144908.GA7898@alcyon.progiciels-bpi.ca> <000001c59772$4224c8a0$92b2958d@oemcomputer> <20050802160620.GA9652@alcyon.progiciels-bpi.ca>
Message-ID: <1123030984.1821.124.camel@warna.corp.google.com>

On Tue, 2005-08-02 at 09:06, François Pinard wrote:
> [Raymond Hettinger]
>
> > > http://www.venge.net/monotone/
>
> > The current release is 0.21 which suggests that it is not ready for
> > primetime.
>
> It suggests it, yes, and to me as well. On the other hand, there is
> a common prejudice that something requires many releases, or frequent
> releases, to be qualified as good. While it might be true on average,
> this is not necessarily true: some packages need not so many steps for
> becoming very usable, mature or stable. (Note that I'm not asserting
> anything about Monotone, here.) We should merely keep an open mind.

It is true that some well designed/developed software becomes reliable very quickly. However, it still takes heavy use over time to prove that. You don't want to be the guy who finds out that this is not one of those bits of software.

IMHO you need maturity for revision control software... you are relying on it for history.

The only open source options worth considering for Python are CVS and SVN, and even SVN is questionable (see bdb backend issues).

--
Donovan Baarda

From python at rcn.com Wed Aug 3 03:02:55 2005
From: python at rcn.com (Raymond Hettinger)
Date: Tue, 2 Aug 2005 21:02:55 -0400
Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0
In-Reply-To:
Message-ID: <000301c597c7$1551dde0$92b2958d@oemcomputer>

The Py3.0 PEPs are a bit disconcerting.
Without 3.0 actively in development, it is difficult to get the participation, interest, and seriousness of thought that we apply to the current release. The PEPs may have the effect of prematurely finalizing discussions on something that still has an ethereal if not pie-in-the-sky quality to it. I would hate for 3.0 development to start with constraints that got set in stone before the project became a reality. With respect to exception re-organization, the conversation has been thought provoking but a little too much of a design-from-scratch quality. Each proposed change needs to be rooted in a specific problem with the current hierarchy (i.e. what use cases cannot currently be dealt with under the existing tree). Setting a high threshold for change will increase the likelihood that old code can be easily ported and decrease the likelihood of either throwing away previous good decisions or adopting new ideas that later prove unworkable. IOW, unless the current tree is thought to be really bad, then the new tree ought to be very close to what we have now. Raymond > -----Original Message----- > From: python-dev-bounces+python=rcn.com at python.org [mailto:python-dev- > bounces+python=rcn.com at python.org] On Behalf Of Brett Cannon > Sent: Tuesday, August 02, 2005 8:34 PM > To: Python Dev > Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 > > OK, having taken in all of the suggestions, here is another revision > round. I think I still have a place or two I partially ignored people > just because there was not a severe uproar and I still think the > original idea is good (renaming RuntimeError, for instance). I also > added notes on handling the transition and rejected idea. > > There is now only one open issue, which is whether > ControlFlowException should be removed. > > And I am still waiting on a PEP number to be able to check this into > CVS and push me to flesh out the references. 
=) From pje at telecommunity.com Wed Aug 3 03:17:56 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 02 Aug 2005 21:17:56 -0400 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: <000301c597c7$1551dde0$92b2958d@oemcomputer> References: Message-ID: <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com> At 09:02 PM 8/2/2005 -0400, Raymond Hettinger wrote: >The Py3.0 PEPs are a bit disconcerting. Without 3.0 actively in >development, it is difficult to get the participation, interest, and >seriousness of thought that we apply to the current release. The PEPs >may have the effect of prematurely finalizing discussions on something >that still has an ethereal if not pie-in-the-sky quality to it. I would >hate for 3.0 development to start with constraints that got set in stone >before the project became a reality. > >With respect to exception re-organization, the conversation has been >thought provoking but a little too much of a design-from-scratch >quality. Each proposed change needs to be rooted in a specific problem >with the current hierarchy (i.e. what use cases cannot currently be >dealt with under the existing tree). Setting a high threshold for >change will increase the likelihood that old code can be easily ported >and decrease the likelihood of either throwing away previous good >decisions or adopting new ideas that later prove unworkable. IOW, >unless the current tree is thought to be really bad, then the new tree >ought to be very close to what we have now. +1. The main things that need fixing, IMO, are the need for critical and control flow exceptions to be distinguished from "normal" errors. The rest is mostly too abstract for me to care about in 2.x. 
From abo at minkirri.apana.org.au Wed Aug 3 03:22:28 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Tue, 02 Aug 2005 18:22:28 -0700
Subject: [Python-Dev] Syscall Proxying in Python
In-Reply-To: <42EFC2AE.6020205@corest.com>
References: <42EE5DAA.8040200@corest.com> <1122919731.9688.43.camel@warna.corp.google.com> <42EFC2AE.6020205@corest.com>
Message-ID: <1123032148.1859.131.camel@warna.corp.google.com>

On Tue, 2005-08-02 at 11:59, Gabriel Becedillas wrote:
> Donovan Baarda wrote:
[...]
> > Wow... you guys sure did it the hard way. If you had done it at the
> > Python level, you would have had a much easier time of both implementing
> > and updating it.
[...]
> Hi, thanks for your reply.
> The problem I see with the approach you're suggesting is that I have to
> rewrite a lot of code to make it work the way I want. We already have
> the syscall proxying stuff with an stdio layer on top of it. I should
> have to rewrite some parts of some modules and use my own versions of
> stdio functions, and that is pretty much the same as we have done before.
> There are also native objects that use stdio functions, and I should
> replace those ones too, or modules that have some native code that uses
> stdio, or sockets. I should duplicate those files, and make the same
> kind of search/replace work that we have done previously and that we'd
> like to avoid.
> Please let me know if I misunderstood you.

Nope... you got it all figured out. I guess it depends on what degree of "proxying" you want... I thought there was some stuff you wanted re-directed, and some you didn't. The point is, you _can_ do this at the Python level, and you only have to modify Python code, not C Python source.

However, if you want to proxy everything, then the glib wrapper is probably the best approach, provided you really want to code in C and have your own Python binary.
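A minimal sketch of what redirecting at the Python level could look like, assuming only text-mode reads through the builtin open() need to be proxied. REMOTE_FILES, proxied_open and proxy_file_access are hypothetical names used for illustration, and the in-memory dict merely stands in for a real proxy transport. Native code that calls stdio directly is not affected by this, which is the limitation Gabriel raises above.

```python
import builtins
import contextlib
import io

# Hypothetical in-memory "remote" file store standing in for a real proxy.
REMOTE_FILES = {"/etc/motd": "hello from the remote side\n"}


def proxied_open(path, mode="r", *args, **kwargs):
    """Serve text reads from REMOTE_FILES instead of the local filesystem."""
    if mode != "r":
        raise NotImplementedError("this sketch only proxies text reads")
    try:
        return io.StringIO(REMOTE_FILES[path])
    except KeyError:
        raise FileNotFoundError(path) from None


@contextlib.contextmanager
def proxy_file_access():
    """Temporarily route the builtin open() through proxied_open()."""
    original = builtins.open
    builtins.open = proxied_open
    try:
        yield
    finally:
        builtins.open = original  # always restore the real open()


with proxy_file_access():
    with open("/etc/motd") as fh:  # resolved through builtins at call time
        text = fh.read()
```

Only Python code that looks up open() while the context manager is active is redirected; the real open() is restored on exit, even if the body raises.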
--
Donovan Baarda

From pinard at iro.umontreal.ca Wed Aug 3 03:27:54 2005
From: pinard at iro.umontreal.ca (François Pinard)
Date: Tue, 2 Aug 2005 21:27:54 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1123030984.1821.124.camel@warna.corp.google.com>
References: <20050802144908.GA7898@alcyon.progiciels-bpi.ca> <000001c59772$4224c8a0$92b2958d@oemcomputer> <20050802160620.GA9652@alcyon.progiciels-bpi.ca> <1123030984.1821.124.camel@warna.corp.google.com>
Message-ID: <20050803012754.GA21052@alcyon.progiciels-bpi.ca>

[Donovan Baarda]

> It is true that some well designed/developed software becomes reliable
> very quickly. However, it still takes heavy use over time to prove that.

There is wisdom in your say! :-)

--
François Pinard http://pinard.progiciels-bpi.ca

From kbk at shore.net Wed Aug 3 04:28:58 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Tue, 2 Aug 2005 22:28:58 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200508030228.j732SwHG022094@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  354 open ( -3) /  2888 closed ( +3) /  3242 total ( +0)
Bugs    :  909 open (+11) /  5152 closed ( +8) /  6061 total (+19)
RFE     :  191 open ( +0) /   178 closed ( +0) /   369 total ( +0)

Patches Closed
______________

PEP 342 Generator enhancements (2005-06-18)
    http://python.org/sf/1223381 closed by pje

Provide tuple of "special" exceptions (2004-10-01)
    http://python.org/sf/1038256 closed by ncoghlan

Patch for (Doc) #1243553 (2005-07-24)
    http://python.org/sf/1243910 closed by montanaro

New / Reopened Bugs
___________________

"new" not marked as deprecated in the docs (2005-07-30)
    http://python.org/sf/1247765 opened by Jürgen Hermann

error in popen2() reference (2005-07-30) CLOSED
    http://python.org/sf/1248036 opened by Lorenzo Luengo

pdb 'next' does not skip list comprehension (2005-07-31)
    http://python.org/sf/1248119 opened by Joseph Heled

set of pdb breakpoint fails (2005-07-31)
    http://python.org/sf/1248127 opened by Joseph Heled

shelve .sync operation not documented (2005-07-31)
    http://python.org/sf/1248199 opened by paul rubin

dir should accept dirproxies for __dict__ (2005-07-31)
    http://python.org/sf/1248658 opened by Ronald Oussoren

2.3.5 SRPM fails to build without tkinter installed (2005-07-31)
    http://python.org/sf/1248997 opened by Laurie Harper

rfc822 module, bug in parsedate_tz (2005-08-01)
    http://python.org/sf/1249573 opened by nemesis

isinstance() fails depending on how modules imported (2005-08-01)
    http://python.org/sf/1249615 opened by Hugh Gibson

Encodings and aliases do not match runtime (2005-08-01)
    http://python.org/sf/1249749 opened by liturgist

container methods raise KeyError not IndexError (2005-08-01)
    http://python.org/sf/1249837 opened by Wilfredo Sanchez

numarray in debian python 2.4.1 (2005-08-01) CLOSED
    http://python.org/sf/1249867 opened by LovePanda

numarray in debian python 2.4.1 (2005-08-01) CLOSED
    http://python.org/sf/1249873 opened by LovePanda

numarray in debian python 2.4.1 (2005-08-01)
    http://python.org/sf/1249903 opened by LovePanda

IDLE does not start. 2.4.1 (2005-08-01) CLOSED
    http://python.org/sf/1249965 opened by codepyro

gethostbyname(gethostname()) fails on misconfigured system (2005-08-02)
    http://python.org/sf/1250170 opened by Tadeusz Andrzej Kadlubowski

incorrect description of range function (2005-08-02)
    http://python.org/sf/1250306 opened by John Gleeson

The -m option to python does not search zip files (2005-08-02)
    http://python.org/sf/1250389 opened by Paul Moore

Tix: PanedWindow.panes nonfunctional (2005-08-02)
    http://python.org/sf/1250469 opened by Majromax

Bugs Closed
___________

Segfault in Python interpreter 2.3.5 (2005-07-26)
    http://python.org/sf/1244864 closed by birkenfeld

logging module doc needs to note that it changed in 2.4 (2005-07-25)
    http://python.org/sf/1244683 closed by vsajip

manual.cls contains an invalid pdf-inquiry (2005-07-14)
    http://python.org/sf/1238210 closed by fdrake

error in popen2() reference (2005-07-30)
    http://python.org/sf/1248036 closed by birkenfeld

numarray in debian python 2.4.1 (2005-08-01)
    http://python.org/sf/1249867 closed by birkenfeld

numarray in debian python 2.4.1 (2005-08-01)
    http://python.org/sf/1249873 closed by birkenfeld

IDLE does not start. 2.4.1 (2005-08-01)
    http://python.org/sf/1249965 closed by codepyro

Incorrect documentation of re.UNICODE (2005-07-22)
    http://python.org/sf/1243192 closed by birkenfeld

From ncoghlan at gmail.com Wed Aug 3 11:05:54 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 03 Aug 2005 19:05:54 +1000
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To:
References: <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp> <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com> <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp> <42EF444A.4040108@gmail.com> <5.1.1.6.0.20050802113926.02895d08@mail.telecommunity.com>
Message-ID: <42F088F2.3020908@gmail.com>

Brett Cannon wrote:
> On 8/2/05, Phillip J. Eby wrote:
>>It seems to me that multiple inheritance is definitely the right idea,
>>though.
That way, we can get the hierarchy we really want with only a >>minimum of boilerplate in pre-3.0 to make it actually work. > > Yeah. I think name aliasing and multiple inheritance will take us a > long way. Warnings should be able to take us the rest of the way. > > -Brett (who is still waiting for a number; Barry, David, you out there?) And it will let us get rid of some of the ugliness in my v 0.1 proposal, too (like Error being a child of StandardError, instead of the other way around). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Wed Aug 3 15:10:33 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 03 Aug 2005 23:10:33 +1000 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com> References: <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com> Message-ID: <42F0C249.20007@gmail.com> Phillip J. Eby wrote: > +1. The main things that need fixing, IMO, are the need for critical and > control flow exceptions to be distinguished from "normal" errors. The rest > is mostly too abstract for me to care about in 2.x. I guess, before we figure out "where would we like to go?", we really need to know "what's wrong with where we are right now?" Like you, the only real problem I have with the current hierarchy is that "except Exception:" is just as bad as a bare except in terms of catching exceptions it shouldn't (like SystemExit). I find everything else about the hierarchy is pretty workable (mainly because it *is* fairly flat - if I want to catch a couple of different exception types, which is fairly rare, I can just list them). 
I like James's suggestion that instead of trying to switch people to using something other than "except Exception:", we just aim to adjust the hierarchy so that "except Exception" becomes the right thing to do. Changing the inheritance structure a bit is far easier than trying to shift several years of accumulated user experience. . . Anyway, with the hierarchy below, "except Exception:" still overreaches, but can be corrected by preceding it with "except (ControlFlow, CriticalError): raise". "except Exception:" stops overreaching once the links from Exception to StopIteration and SystemExit, and the links from StandardError to KeyboardInterrupt, SystemError and MemoryError are removed (probably difficult to do before Py3k but not impossible). This hierarchy also means that inheriting application and library errors from Exception can continue to be recommended practice. Adapting the language to fit the users rather than the other way around seems to be a pretty good call on this point. . . 
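The interim idiom described above can be sketched as follows. This is only an illustration under stated assumptions: every class here is a hypothetical stand-in for the proposed names (none of them exists under these names in a released Python), and ProposedException is used so that the builtin Exception is not rebound.

```python
# Hypothetical stand-ins for the proposed hierarchy; illustration only.
class Raisable(BaseException):
    """Proposed root of everything that can be raised."""


class ControlFlow(Raisable):
    """Proposed base for control-flow signals."""


class CriticalError(Raisable):
    """Proposed base for errors that should normally propagate."""


class ProposedException(Raisable):
    """Stand-in for the proposed Exception class."""


class ProposedSystemExit(ControlFlow, ProposedException):
    """Transitional case: still reachable through 'Exception' as well."""


class AppError(ProposedException):
    """An ordinary application error."""


def run(task):
    """Run task, swallowing ordinary errors but not control flow."""
    try:
        task()
    except (ControlFlow, CriticalError):
        raise  # let exits and critical failures propagate untouched
    except ProposedException as exc:
        return "handled " + type(exc).__name__
    return "ok"
```

Because the first clause is tried before the second, the transitional ProposedSystemExit propagates even though it also inherits from the stand-in Exception class, while AppError is handled as usual.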
The only changes from the Python 2.4 hierarchy are:

New exceptions:
- Raisable (new base)
- ControlFlow (inherits from Raisable)
- CriticalError (inherits from Raisable)
- GeneratorExit (inherits from ControlFlow)

Added inheritance:
- Exception from Raisable
- StopIteration, SystemExit, KeyboardInterrupt from ControlFlow
- SystemError, MemoryError from CriticalError

Python 2.4 Compatible Improved Exception Hierarchy v 0.2 [1]
============================================================

Raisable (new)
+-- ControlFlow (new)
    +-- GeneratorExit (new)
    +-- StopIteration (inheritance new)
    +-- SystemExit (inheritance new)
    +-- KeyboardInterrupt (inheritance new)
+-- CriticalError (new)
    +-- MemoryError (inheritance new)
    +-- SystemError (inheritance new)
+-- Exception (inheritance new)
    +-- StopIteration
    +-- SystemExit
    +-- StandardError
        +-- KeyboardInterrupt
        +-- MemoryError
        +-- SystemError
        +-- AssertionError
        +-- AttributeError
        +-- EOFError
        +-- ImportError
        +-- TypeError
        +-- ReferenceError
        +-- ArithmeticError
            +-- FloatingPointError
            +-- DivideByZeroError
            +-- OverflowError
        +-- EnvironmentError
            +-- OSError
                +-- WindowsError
            +-- IOError
        +-- LookupError
            +-- IndexError
            +-- KeyError
        +-- NameError
            +-- UnboundLocalError
        +-- RuntimeError
            +-- NotImplementedError
        +-- SyntaxError
            +-- IndentationError
                +-- TabError
        +-- ValueError
            +-- UnicodeError
                +-- UnicodeDecodeError
                +-- UnicodeEncodeError
                +-- UnicodeTranslateError
    +-- Warning
        +-- DeprecationWarning
        +-- FutureWarning
        +-- PendingDeprecationWarning
        +-- RuntimeWarning
        +-- SyntaxWarning
        +-- UserWarning

Cheers,
Nick.

[1] I've started putting version numbers on these suggestions, since someone referred to "Nick's exception hierarchy" in one of the threads, and I had no idea which of my suggestions they meant. I think I'm up to three or four different variants by now. . .
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From pje at telecommunity.com Wed Aug 3 15:24:27 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 03 Aug 2005 09:24:27 -0400 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: <42F0C249.20007@gmail.com> References: <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com> <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050803092046.025af648@mail.telecommunity.com> At 11:10 PM 8/3/2005 +1000, Nick Coghlan wrote: > New exceptions: > - Raisable (new base) > - ControlFlow (inherits from Raisable) > - CriticalError (inherits from Raisable) > - GeneratorExit (inherits from ControlFlow) > Added inheritance: > - Exception from Raisable > - StopIteration, SystemExit, KeyboardInterrupt from ControlFlow > - SystemError, MemoryError from CriticalError +1 I'd also like to see a "Reraisable" or something like that to cover both CriticalError and ControlFlow, but it could be a tuple of those two bases rather than a class. But that's just a "would be nice" feature. From rrr at ronadam.com Wed Aug 3 18:33:31 2005 From: rrr at ronadam.com (Ron Adam) Date: Wed, 03 Aug 2005 12:33:31 -0400 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: <42F0C249.20007@gmail.com> References: <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com> <42F0C249.20007@gmail.com> Message-ID: <42F0F1DB.7050704@ronadam.com> Nick Coghlan wrote: > Phillip J. Eby wrote: > >>+1. The main things that need fixing, IMO, are the need for critical and >>control flow exceptions to be distinguished from "normal" errors. The rest >>is mostly too abstract for me to care about in 2.x. > > > I guess, before we figure out "where would we like to go?", we really need to > know "what's wrong with where we are right now?" 
> > Like you, the only real problem I have with the current hierarchy is that > "except Exception:" is just as bad as a bare except in terms of catching > exceptions it shouldn't (like SystemExit). I find everything else about the > hierarchy is pretty workable (mainly because it *is* fairly flat - if I want > to catch a couple of different exception types, which is fairly rare, I can > just list them). More often than not, 9 out 10 times, when ever I use "except Exception:" or a bare except:, what I am doing is the equivalent to: try: # will either fail or not except: pass else: Usually I end up using "if hasattr():" or some other way to pre test the statement if possible as I find "except: pass" to be ugly. And putting both the statement that may fail together with the depending statements in the try:, catches too much. Finding subtle errors hidden by a try block can be rather difficult at times. Could inverse exceptions be an option? Exceptions don't work this way so it would probably need to be sugar for "except :pass; else:". Possibly? try: except not : "except None:" as an option? Ok, this isn't exactly clear, and probably a -2 for several reasons. The exception tree organization should also take into account inverse relationships as well. Exceptions used for control flow are often of the type "if not exception: do something". Cheers, Ron From bcannon at gmail.com Wed Aug 3 18:55:56 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 3 Aug 2005 09:55:56 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: <000301c597c7$1551dde0$92b2958d@oemcomputer> References: <000301c597c7$1551dde0$92b2958d@oemcomputer> Message-ID: On 8/2/05, Raymond Hettinger wrote: > The Py3.0 PEPs are a bit disconcerting. Without 3.0 actively in > development, it is difficult to get the participation, interest, and > seriousness of thought that we apply to the current release. 
The PEPs
> may have the effect of prematurely finalizing discussions on something
> that still has an ethereal if not pie-in-the-sky quality to it. I would
> hate for 3.0 development to start with constraints that got set in stone
> before the project became a reality.
>

I don't view this PEP (nor any other Python 3000 PEP) as set in stone until we are one or two versions away from Python 3.0. I view these PEPs as just provoking discussion and getting the ball rolling now instead of rushing to get it done when it does come time to start thinking about these things. Even if we get everyone to agree on this PEP I still won't consider it finalized until there is one more round of discussion just before we start implementing for Python 3.0.

> With respect to exception re-organization, the conversation has been
> thought provoking but a little too much of a design-from-scratch
> quality. Each proposed change needs to be rooted in a specific problem
> with the current hierarchy (i.e. what use cases cannot currently be
> dealt with under the existing tree). Setting a high threshold for
> change will increase the likelihood that old code can be easily ported
> and decrease the likelihood of either throwing away previous good
> decisions or adopting new ideas that later prove unworkable. IOW,
> unless the current tree is thought to be really bad, then the new tree
> ought to be very close to what we have now.
>

So are you saying that the renaming is bad, or the whole reorg? It seems everyone agrees with the moving of the control flow exceptions and CriticalException, although the renamings might just be me wishing for them.
-Brett

From gvanrossum at gmail.com Wed Aug 3 19:18:42 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 3 Aug 2005 10:18:42 -0700
Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0
In-Reply-To:
References: <000301c597c7$1551dde0$92b2958d@oemcomputer>
Message-ID:

So here's a radical proposal (hear the scratching of the fingernail on the blackboard? :-).

Start with Brett's latest proposal. Goal: keep bare "except:" but change it to catch only the part of the hierarchy rooted at StandardError.

- Call the root of the hierarchy Raisable.
- Rename CriticalException to CriticalError (this should happen anyway).
- Rename ControlFlowException to ControlFlowRaisable (anything except Error or Exception).
- Rename StandardError to Exception.
- Make Warning a subclass of Exception.

I'd want the latter point even if the rest of this idea is rejected; when a Warning is raised (as opposed to just printing a message or being suppressed altogether) it should be treated just like any other normal exception, i.e. StandardError.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com Wed Aug 3 19:28:32 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 10:28:32 -0700
Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0
In-Reply-To: <42F0C249.20007@gmail.com>
References: <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com> <42F0C249.20007@gmail.com>
Message-ID:

On 8/3/05, Nick Coghlan wrote:
> Phillip J. Eby wrote:
> > +1. The main things that need fixing, IMO, are the need for critical and
> > control flow exceptions to be distinguished from "normal" errors. The rest
> > is mostly too abstract for me to care about in 2.x.
>
> I guess, before we figure out "where would we like to go?", we really need to
> know "what's wrong with where we are right now?"
>
> Like you, the only real problem I have with the current hierarchy is that
> "except Exception:" is just as bad as a bare except in terms of catching
> exceptions it shouldn't (like SystemExit). I find everything else about the
> hierarchy is pretty workable (mainly because it *is* fairly flat - if I want
> to catch a couple of different exception types, which is fairly rare, I can
> just list them).
>

Does no one else feel some of the names could be improved upon? While we might all have gotten used to them I still don't believe newbies necessarily grasp what they are all for based on their names.

Then again, if this PEP is viewed more as handling macro issues with the current hierarchy, and name changes can be done when we get closer to Python 3.0, I am happy to drop renaming until we are closer to actual implementation, with a section just listing suggested name changes and stating that they are only possible renamings that will not be finalized until we are closer to Python 3.0.

> I like James's suggestion that instead of trying to switch people to using
> something other than "except Exception:", we just aim to adjust the hierarchy
> so that "except Exception" becomes the right thing to do. Changing the
> inheritance structure a bit is far easier than trying to shift several years
> of accumulated user experience. . .
>
> Anyway, with the hierarchy below, "except Exception:" still overreaches, but
> can be corrected by preceding it with "except (ControlFlow, CriticalError):
> raise".
>
> "except Exception:" stops overreaching once the links from Exception to
> StopIteration and SystemExit, and the links from StandardError to
> KeyboardInterrupt, SystemError and MemoryError are removed (probably difficult
> to do before Py3k but not impossible).
>
> This hierarchy also means that inheriting application and library errors from
> Exception can continue to be recommended practice.
Adapting the language to > fit the users rather than the other way around seems to be a pretty good call > on this point. . . > Well, then StandardError becomes kind of stupid. The only use it would serve is a superclass for all non-critical, non-control-flow built-in exceptions. I really don't know how often that is going to be needed. I do realize it keeps inheritance working for existing code, though. I guess that would just have to be a trade-off for backwards-compatibility. OK, I am convinced; I will revert back to Raisable/Exception instead of Exception/StandardError. -Brett From bcannon at gmail.com Wed Aug 3 19:29:45 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 3 Aug 2005 10:29:45 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: <5.1.1.6.0.20050803092046.025af648@mail.telecommunity.com> References: <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com> <42F0C249.20007@gmail.com> <5.1.1.6.0.20050803092046.025af648@mail.telecommunity.com> Message-ID: On 8/3/05, Phillip J. Eby wrote: > At 11:10 PM 8/3/2005 +1000, Nick Coghlan wrote: > > New exceptions: > > - Raisable (new base) > > - ControlFlow (inherits from Raisable) > > - CriticalError (inherits from Raisable) > > - GeneratorExit (inherits from ControlFlow) > > Added inheritance: > > - Exception from Raisable > > - StopIteration, SystemExit, KeyboardInterrupt from ControlFlow > > - SystemError, MemoryError from CriticalError > > +1 > > I'd also like to see a "Reraisable" or something like that to cover both > CriticalError and ControlFlow, but it could be a tuple of those two bases > rather than a class. But that's just a "would be nice" feature. Eh, I am not so hot on this idea. I see your argument, Phillip, but I just don't think it will be useful enough to warrant its introduction. Could add to the exceptions module, though. 
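For reference, the tuple form Phillip suggests needs no new class, because an except clause (and issubclass) accepts a tuple of classes. A minimal sketch, again with hypothetical stand-in classes rather than real builtins:

```python
# Hypothetical stand-ins for the proposed bases; not real builtins.
class Raisable(BaseException):
    pass


class ControlFlow(Raisable):
    pass


class CriticalError(Raisable):
    pass


# A plain tuple of the two bases, as suggested, rather than a third class.
Reraisable = (ControlFlow, CriticalError)


def classify(exc):
    """Report which handler an 'except Reraisable' clause would select."""
    try:
        raise exc
    except Reraisable:
        return "reraise"
    except Raisable:
        return "handle"
```

The same tuple also works with issubclass() and isinstance(), so code that inspects exception types can share the single definition.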
-Brett From bcannon at gmail.com Wed Aug 3 19:44:30 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 3 Aug 2005 10:44:30 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: References: <000301c597c7$1551dde0$92b2958d@oemcomputer> Message-ID: On 8/3/05, Guido van Rossum wrote: > So here's a radical proposal (hear the scratching of the finglernail > on the blackboard? :-). > > Start with Brett's latest proposal. Including renaming (I want to know if you support the renamings at all, if I should make them more of an idea to be considered when we get closer to Python 3.0, or just drop them) and the new exceptions? > Goal: keep bare "except:" but > change it to catch only the part of the hierarchy rooted at > StandardError. > Why the change of heart? Backwards-compatibility? Way to keep newbies from choosing Raisable or such as what to catch? > - Call the root of the hierarchy Raisable. Fine by me. Will change it before I check in the PEP tonight. > - Rename CriticalException to CriticalError > (this should happen anyway). I thought I changed that in the latest version. I will change it. > - Rename ControlFlowException to ControlFlowRaisable > (anything except Error or Exception). No objection from me. > - Rename StandardError to Exception. So just ditch StandardError, which is fine by me, or go with Nick's v2 proposal and have all pre-existing exceptions inherit from it? I assume the latter since you said you wanted bare 'except' clauses to catch StandardError. > - Make Warning a subclass of Exception. > > I'd want the latter point even if the rest of this idea is rejected; > when a Warning is raised (as opposed to just printing a message or > being suppressed altogether) it should be treated just like any other > normal exception, i.e. StandardError. > Since warnings only become raised if the warnings filter lists it as an error I can see how this is a reasonable suggestion. 
And if bare 'except' clauses catch StandardError and not Exception they will still propagate to the top unless people explicitly catch Exception or lower which seems fair. -Brett From gvanrossum at gmail.com Wed Aug 3 21:00:58 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 3 Aug 2005 12:00:58 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: References: <000301c597c7$1551dde0$92b2958d@oemcomputer> Message-ID: On 8/3/05, Brett Cannon wrote: > On 8/3/05, Guido van Rossum wrote: > > So here's a radical proposal (hear the scratching of the finglernail > > on the blackboard? :-). > > > > Start with Brett's latest proposal. > > Including renaming (I want to know if you support the renamings at > all, if I should make them more of an idea to be considered when we > get closer to Python 3.0, or just drop them) and the new exceptions? Most of the renamings sound fine to me. > > Goal: keep bare "except:" but > > change it to catch only the part of the hierarchy rooted at > > StandardError. > > Why the change of heart? Backwards-compatibility? Way to keep > newbies from choosing Raisable or such as what to catch? The proposal accepts that there's a need to catch "all errors that are reasonable to catch": that's why it separates StandardError from the root exception class. So now we're going to recommend that everyone who was using bare 'except:' write 'except StandardError:' instead. So why not have a default? Because of EIBTI? Seems a weak argument; we have defaults for lots of things. > > - Call the root of the hierarchy Raisable. > > Fine by me. Will change it before I check in the PEP tonight. > > > - Rename CriticalException to CriticalError > > (this should happen anyway). > > I thought I changed that in the latest version. I will change it. I may have missed the change. > > - Rename ControlFlowException to ControlFlowRaisable > > (anything except Error or Exception). > > No objection from me. 
I actually find it ugly; but it's not an error and it would be weird if there was an xxxException that didn't derive from Exception. > > - Rename StandardError to Exception. > > So just ditch StandardError, which is fine by me, or go with Nick's v2 > proposal and have all pre-existing exceptions inherit from it? I > assume the latter since you said you wanted bare 'except' clauses to > catch StandardError. What do you think? Of course the critical and control flow ones should *not* inherit from it. [...brain hums...] OK, I'm changing my mind again about the names again. Exception as the root and StandardError can stay; the only new proposal would then be to make bare 'except:' call StandardError. > > - Make Warning a subclass of Exception. > > > > I'd want the latter point even if the rest of this idea is rejected; > > when a Warning is raised (as opposed to just printing a message or > > being suppressed altogether) it should be treated just like any other > > normal exception, i.e. StandardError. > > Since warnings only become raised if the warnings filter lists it as > an error I can see how this is a reasonable suggestion. And if bare > 'except' clauses catch StandardError and not Exception they will still > propagate to the top unless people explicitly catch Exception or lower > which seems fair. Unclear what you mean; I want bare except; to catch Warnings! IOW I want Warning to inherit from whatever the thing is that bare except: catches (if we keep it) and that is the start of all the "normal" exceptions excluding critical and control flow exceptions. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at egenix.com Wed Aug 3 21:01:10 2005 From: mal at egenix.com (M.-A. 
Lemburg) Date: Wed, 03 Aug 2005 21:01:10 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42EFE295.6040906@v.loewis.de> References: <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> Message-ID: <42F11476.9000507@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>True, but if we never ask, we'll never know :-) >> >>My question was: Would asking a professional hosting company >>be a reasonable approach ? > > It would be an option, yes, of course. It's not an approach that > *I* would be willing to implement, though. Fair enough. >>From the answers, I take it that there's not much trust in these >>offers, so I guess there's not much desire to PSF money into this. > > I haven't received any offers to make a qualified statement. I only > know that I would oppose an approach to ask somebody but our > volunteers to do it for free, and I also know that I don't want to > spend my time researching commercial alternatives (although I > wouldn't mind if you spent your time). I don't quite understand what you meant here: are you opposing spending PSF money on a hosting company if and only if volunteers who take on the job don't get paid ? I've done a bit of research on the subject and so far only found CollabNet and VA offering commercial services in this area. VA hosts SourceForge so that's a non-option, I guess :-) I know that Greg Stein worked for CollabNet, so thought it might be a good idea to ask him about the idea to move things to CollabNet. Of course, before taking this route, I wanted to get a feeling for the general attitude towards a commercial approach, which is why I tossed in the idea. Other non-commercial alternatives are Berlios and Savannah, but I'm not sure whether they'd offer Subversion support. BTW, have you considered using Trac as issue tracker on svn.python.org ? 
They have a very good subversion integration, it's easy to use, comes with a wiki and looks great. Oh, and it's written in Python :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 03 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From nick.bastin at gmail.com Wed Aug 3 21:18:35 2005 From: nick.bastin at gmail.com (Nicholas Bastin) Date: Wed, 3 Aug 2005 15:18:35 -0400 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42EF2794.1000209@v.loewis.de> References: <42E93940.6080708@v.loewis.de> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> Message-ID: <66d0a6e105080312181e25fa08@mail.gmail.com> On 8/2/05, "Martin v. Löwis" wrote: > George V. Neville-Neil wrote: > > Since Python is Open Source are you looking at Per Force which you can > > use for free and seems to be a happy medium between something like CVS > > and something horrific like Clear Case? > > No. The PEP is only about Subversion. Why should we be looking at Per > Force? Only because Python is Open Source? Perforce is a commercial product, but it can be had for free for verified Open Source projects, which Python shouldn't have any problem with. There are other problems, like you have to renew the agreement every year, but it might be worth considering, given the fact that it's an excellent system. 
> I think anything but Subversion is ruled out because: > - there is no offer to host that anywhere (for subversion, there is > already svn.python.org) We could host a Perforce repository just as easily, I would think. > - there is no support for converting a CVS repository (for subversion, > there is cvs2svn) I'd put $20 on the fact that cvs2svn will *not* work out of the box for converting the python repository. Just call it a hunch. In any case, the Perforce-supplied cvs2p4 should work at least as well. -- Nick From martin at v.loewis.de Wed Aug 3 21:22:10 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 03 Aug 2005 21:22:10 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F11476.9000507@egenix.com> References: <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> Message-ID: <42F11962.2070107@v.loewis.de> M.-A. Lemburg wrote: >>I haven't received any offers to make a qualified statement. I only >>know that I would oppose an approach to ask somebody but our >>volunteers to do it for free, and I also know that I don't want to >>spend my time researching commercial alternatives (although I >>wouldn't mind if you spent your time). > > > I don't quite understand what you meant here: are you opposing > spending PSF money on a hosting company if and only if volunteers > who take on the job don't get paid ? No. I'm opposed to approaching somebody to do it for free, except the somebody are the pydotorg volunteers (IOW, I won't take gifts from anybody else in this matter). > I've done a bit of research on the subject and so far only found > CollabNet and VA offering commercial services in this area. 
VA hosts > SourceForge so that's a non-option, I guess :-) It's not that I dislike VA - I personally think they are doing a great job with SourceForge, and I like SourceForge a lot. There are just some issues with it (like that they offer no Subversion). The question would be: what precisely is the commercial offering from VA: does it provide subversion? how is the user management done? etc. > I know that Greg Stein worked for CollabNet, so thought it might be a > good idea to ask him about the idea to move things to CollabNet. > Of course, before taking this route, I wanted to get a feeling > for the general attitude towards a commercial approach, which > is why I tossed in the idea. Ok - I expect that the project might be *done* before we even have a single commercial offer, with a precise service description, and a precise price tag. That makes commercial offers so difficult: that it is so time expensive to use them, that you might spend less time doing it yourself. > Other non-commercial alternatives are Berlios and Savannah, but > I'm not sure whether they'd offer Subversion support. For me, they fall into the "I won't take gifts" category. > BTW, have you considered using Trac as issue tracker on > svn.python.org ? You mean, me personally? I quite like the Subversion tracker, and don't want to trade it for anything else. I know Guido wants to use Roundup (which is also written in Python), and obviously so does Richard Jones. The main questions are the same as with this PEP: how to do the migration from SF (without losing data), and how to do the ongoing maintenance. It's just that finding answers to these questions is so much harder, therefore, this PEP is *only* about CVS. Regards, Martin From fdrake at acm.org Wed Aug 3 21:28:25 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Wed, 3 Aug 2005 15:28:25 -0400 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F11476.9000507@egenix.com> References: <42E93940.6080708@v.loewis.de> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> Message-ID: <200508031528.25776.fdrake@acm.org> On Wednesday 03 August 2005 15:01, M.-A. Lemburg wrote: > Other non-commercial alternatives are Berlios and Savannah, but > I'm not sure whether they'd offer Subversion support. Berlios does offer Subversion; the docutils project is using the Berlios Subversion and SourceForge for everything else. I don't know whether Savannah is offering Subversion right now, but the last time I looked at it, it appeared nearly un-maintained. But that may just be the understated nature of that community. :-) -Fred -- Fred L. Drake, Jr. From bcannon at gmail.com Wed Aug 3 22:10:58 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 3 Aug 2005 13:10:58 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: References: <000301c597c7$1551dde0$92b2958d@oemcomputer> Message-ID: On 8/3/05, Guido van Rossum wrote: > On 8/3/05, Brett Cannon wrote: > > On 8/3/05, Guido van Rossum wrote: > > > So here's a radical proposal (hear the scratching of the finglernail > > > on the blackboard? :-). > > > > > > Start with Brett's latest proposal. > > > > Including renaming (I want to know if you support the renamings at > > all, if I should make them more of an idea to be considered when we > > get closer to Python 3.0, or just drop them) and the new exceptions? > > Most of the renamings sound fine to me. > OK, great. I will leave in the new names and the new exceptions. > > > Goal: keep bare "except:" but > > > change it to catch only the part of the hierarchy rooted at > > > StandardError. > > > > Why the change of heart? Backwards-compatibility? Way to keep > > newbies from choosing Raisable or such as what to catch? 
> > The proposal accepts that there's a need to catch "all errors that are > reasonable to catch": that's why it separates StandardError from the > root exception class. > > So now we're going to recommend that everyone who was using bare > 'except:' write 'except StandardError:' instead. > > So why not have a default? > Because you can easily write it without a default. > Because of EIBTI? > Don't know the acronym (and neither does acronymfinder.com). > Seems a weak argument; we have defaults for lots of things. > OK. I was fine with bare 'except' clauses to begin with so this is not a huge point of contention for me personally. [SNIP] > > So just ditch StandardError, which is fine by me, or go with Nick's v2 > > proposal and have all pre-existing exceptions inherit from it? I > > assume the latter since you said you wanted bare 'except' clauses to > > catch StandardError. > > What do you think? Of course the critical and control flow ones should > *not* inherit from it. > Well, Nick and Jame's point of tweaking the names so that they more reflect what people expect instead of what they are meant to actually be is interesting. But, in terms of backwards-compatibility, Exception/StandardError is most exacting in terms of matching what already exists. But with renamings I don't know how critical this kind of low-level backwards-compatibility is critical. Personally I just prefer the names Exception/StandardError for unexplained aesthetic reasons. > [...brain hums...] > > OK, I'm changing my mind again about the names again. > > Exception as the root and StandardError can stay; the only new > proposal would then be to make bare 'except:' call StandardError. > OK. I will then also leave ControlFlowException as-is. > > > - Make Warning a subclass of Exception. 
> > > > > > I'd want the latter point even if the rest of this idea is rejected; > > > when a Warning is raised (as opposed to just printing a message or > > > being suppressed altogether) it should be treated just like any other > > > normal exception, i.e. StandardError. > > > > Since warnings only become raised if the warnings filter lists it as > > an error I can see how this is a reasonable suggestion. And if bare > > 'except' clauses catch StandardError and not Exception they will still > > propagate to the top unless people explicitly catch Exception or lower > > which seems fair. > > Unclear what you mean; I want bare except; to catch Warnings! IOW I > want Warning to inherit from whatever the thing is that bare except: > catches (if we keep it) and that is the start of all the "normal" > exceptions excluding critical and control flow exceptions. > OK, that squares that one away. And it makes sense since you can view Warnings as even less critical exceptions than the non-control and non-critical exceptions and thus should be caught by a default `except' clause. -Brett From mwh at python.net Wed Aug 3 22:13:43 2005 From: mwh at python.net (Michael Hudson) Date: Wed, 03 Aug 2005 21:13:43 +0100 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: (Guido van Rossum's message of "Wed, 3 Aug 2005 10:18:42 -0700") References: <000301c597c7$1551dde0$92b2958d@oemcomputer> Message-ID: <2mfytqzslk.fsf@starship.python.net> Guido van Rossum writes: > So here's a radical proposal (hear the scratching of the finglernail > on the blackboard? :-). > > Start with Brett's latest proposal. Goal: keep bare "except:" but > change it to catch only the part of the hierarchy rooted at > StandardError. > > - Call the root of the hierarchy Raisable. > - Rename CriticalException to CriticalError > (this should happen anyway). > - Rename ControlFlowException to ControlFlowRaisable > (anything except Error or Exception). 
> - Rename StandardError to Exception. > - Make Warning a subclass of Exception. > > I'd want the latter point even if the rest of this idea is rejected; > when a Warning is raised (as opposed to just printing a message or > being suppressed altogether) it should be treated just like any other > normal exception, i.e. StandardError. In the above you need to ensure that all raised exceptions inherit from Raisable, because sometimes you really do want to catch almost anything (e.g. code.py). Has anyone thought about the C side of this? There are a few slightly-careless calls to PyErr_Clear() in the codebase, and they can cause just as much (more!) heartache as bare except: clauses. I'll note in passing that I'm not sure that any reorganization of the exception hierachy will make this kind of catching-too-much bug go away. The issue is just thorny, and each case is different. I'm also still not convinced that the backwards compatibility breaking Python 3.0 will ever actually happen, but I guess that's a different consideration... Cheers, mwh -- Haha! You had a *really* weak argument! -- Moshe Zadka, comp.lang.python From bcannon at gmail.com Wed Aug 3 22:23:21 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 3 Aug 2005 13:23:21 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: <2mfytqzslk.fsf@starship.python.net> References: <000301c597c7$1551dde0$92b2958d@oemcomputer> <2mfytqzslk.fsf@starship.python.net> Message-ID: On 8/3/05, Michael Hudson wrote: > Guido van Rossum writes: > > > So here's a radical proposal (hear the scratching of the finglernail > > on the blackboard? :-). > > > > Start with Brett's latest proposal. Goal: keep bare "except:" but > > change it to catch only the part of the hierarchy rooted at > > StandardError. > > > > - Call the root of the hierarchy Raisable. > > - Rename CriticalException to CriticalError > > (this should happen anyway). 
> > - Rename ControlFlowException to ControlFlowRaisable > > (anything except Error or Exception). > > - Rename StandardError to Exception. > > - Make Warning a subclass of Exception. > > > > I'd want the latter point even if the rest of this idea is rejected; > > when a Warning is raised (as opposed to just printing a message or > > being suppressed altogether) it should be treated just like any other > > normal exception, i.e. StandardError. > > In the above you need to ensure that all raised exceptions inherit > from Raisable, because sometimes you really do want to catch almost > anything (e.g. code.py). > That's part of the PEP. > Has anyone thought about the C side of this? I have thought about it somewhat, but I have not dived in to try to write a patch. > There are a few > slightly-careless calls to PyErr_Clear() in the codebase, and they can > cause just as much (more!) heartache as bare except: clauses. > I fail to see how clearing the exception state has any effect on the implementation for the PEP. > I'll note in passing that I'm not sure that any reorganization of the > exception hierachy will make this kind of catching-too-much bug go > away. The issue is just thorny, and each case is different. > It will never go away as long as catching exceptions based on inheritance exists. Don't know if any language has ever solved it. Best we can do is try to minimize it. > I'm also still not convinced that the backwards compatibility breaking > Python 3.0 will ever actually happen, but I guess that's a different > consideration... Perhaps not. Might end up doing so much of a slow transition that it will just be a bigger codebase change from 2.9 (or whatever the end of the 2.x branch is) to 3.0 . > -- > Haha! You had a *really* weak argument! > -- Moshe Zadka, comp.lang.python Hopefully I don't. =) -Brett From rowen at cesmail.net Wed Aug 3 22:35:24 2005 From: rowen at cesmail.net (Russell E. 
Owen) Date: Wed, 03 Aug 2005 13:35:24 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 References: Message-ID: In article , Brett Cannon wrote: > New Hierarchy > ============= > > Exception > +-- CriticalException (new) > +-- KeyboardInterrupt > +-- MemoryError > +-- SystemError > +-- ControlFlowException (new) > +-- StopIteration > +-- GeneratorExit > +-- SystemExit > +-- StandardError > +-- AssertionError > +-- SyntaxError > +-- IndentationError > +-- TabError > +-- UserException (rename of RuntimeError) > +-- ArithmeticError > +-- FloatingPointError > +-- DivideByZeroError > +-- OverflowError > +-- UnicodeError > +-- UnicodeDecodeError > +-- UnicodeEncodeError > +-- UnicodeTranslateError > +-- LookupError > +-- IndexError > +-- KeyError > +-- TypeError > +-- AttributeError > +-- EnvironmentError > +-- OSError > +-- IOError > +-- EOFError (new inheritance) > +-- ImportError > +-- NotImplementedError (new inheritance) > +-- NamespaceError (rename of NameError) > +-- UnboundGlobalError (new) > +-- UnboundLocalError > +-- UnboundFreeError (new) > +-- WeakReferenceError (rename of ReferenceError) > +-- ValueError > +-- Warning > +-- UserWarning > +-- AnyDeprecationWarning (new) > +-- PendingDeprecationWarning > +-- DeprecationWarning > +-- SyntaxWarning > +-- SemanticsWarning (rename of RuntimeWarning) > +-- FutureWarning I am wondering why OSError and IOError are not under StandardError? This seems a serious misfeature to me (perhaps the posting was just misformatted?). Having one class for "normal" errors (not exceptions whose sole purpose is to halt the program and not so critical that any continuation is hopeless) sure would make it easier to write code that output a traceback and tried to continue. I'd love it. 
-- Russell From gvanrossum at gmail.com Wed Aug 3 22:47:27 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 3 Aug 2005 13:47:27 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: References: <000301c597c7$1551dde0$92b2958d@oemcomputer> Message-ID: > > Because of EIBTI? > > Don't know the acronym (and neither does acronymfinder.com). Sorry. Explicit is Better than Implicit. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Wed Aug 3 23:12:09 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 3 Aug 2005 14:12:09 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: References: Message-ID: On 8/3/05, Russell E. Owen wrote: > In article , > Brett Cannon wrote: > > > New Hierarchy > > ============= > > > > Exception [SNIP] > > +-- StandardError [SNIP] > > +-- EnvironmentError > > +-- OSError > > +-- IOError > > +-- EOFError (new inheritance) [SNIP] > > I am wondering why OSError and IOError are not under StandardError? This > seems a serious misfeature to me (perhaps the posting was just > misformatted?). > Look again; they are with an inheritance for both of (OS|IO)Error <- EnvironmentError <- StandardError <- Exception. > Having one class for "normal" errors (not exceptions whose sole purpose > is to halt the program and not so critical that any continuation is > hopeless) sure would make it easier to write code that output a > traceback and tried to continue. I'd love it. > That is what StandardError is for. -Brett From foom at fuhm.net Thu Aug 4 00:26:12 2005 From: foom at fuhm.net (James Y Knight) Date: Wed, 3 Aug 2005 18:26:12 -0400 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: References: <000301c597c7$1551dde0$92b2958d@oemcomputer> Message-ID: On Aug 3, 2005, at 3:00 PM, Guido van Rossum wrote: > [...brain hums...] > > OK, I'm changing my mind again about the names again. 
> > Exception as the root and StandardError can stay; the only new > proposal would then be to make bare 'except:' call StandardError. I don't see how that can work. Any solution that is expected to result in a usable hierarchy this century must preserve "Exception" as the object that user exceptions should derive from (and therefore that users should generally catch, as well). There is way too much momentum behind that to change it. James From bcannon at gmail.com Thu Aug 4 00:47:54 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 3 Aug 2005 15:47:54 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: References: <000301c597c7$1551dde0$92b2958d@oemcomputer> Message-ID: On 8/3/05, James Y Knight wrote: > On Aug 3, 2005, at 3:00 PM, Guido van Rossum wrote: > > [...brain hums...] > > > > OK, I'm changing my mind again about the names again. > > > > Exception as the root and StandardError can stay; the only new > > proposal would then be to make bare 'except:' call StandardError. > > I don't see how that can work. Any solution that is expected to > result in a usable hierarchy this century must preserve "Exception" > as the object that user exceptions should derive from (and therefore > that users should generally catch, as well). There is way too much > momentum behind that to change it. > Oh, I bet Guido can make them change. =) Look at it this way; going with the Raisable/Exception change and having bare 'except's catch Exception will still lead to a semantic change since CriticalError and ControlFlowException will not be caught. Breakage is going to happen, so why not just do a more thorough change that leads to more breakage? Obviously you are saying to minimize it while Guido is saying to go for a more thorough change. So how much more code is going to crap out with this change? Everything under our control will be fine since we can change it. 
User-defined exceptions might need to be changed if they inherit directly from Exception instead of StandardError, which is probably the common case, but changing a superclass is not hard. That kind of breakage is not bad since you can easily systematically change superclasses of exceptions from Exception to StandardError without much effort thanks to regexes. I honestly think the requirement of inheriting from a specific superclass will lead to more breakage since you can't grep for exceptions that don't at least inherit from *some* exception universally. -Brett From gvanrossum at gmail.com Thu Aug 4 01:19:18 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 3 Aug 2005 16:19:18 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: References: <000301c597c7$1551dde0$92b2958d@oemcomputer> Message-ID: [Guido van Rossum] > > OK, I'm changing my mind again about the names again. > > > > Exception as the root and StandardError can stay; the only new > > proposal would then be to make bare 'except:' call StandardError. [James Y Knight] > I don't see how that can work. Any solution that is expected to > result in a usable hierarchy this century must preserve "Exception" > as the object that user exceptions should derive from (and therefore > that users should generally catch, as well). There is way too much > momentum behind that to change it. This is actually a good point, and what I was thinking when I first responded to Brett. Sorry for the waivering -- being at OSCON always is a serious attack on my system. I'm still searching for a solution that lets us call everything in the hierarchy "exception" and *yet* has Exception at the mid-point in that hierarchy where Brett has StandardException. The problem with Raisable is that it doesn't contain the word exception; perhaps we can call it BaseException? We've got a few more things called base-such-and-such, e.g. basestring (not that I like that one all that much). 
BTW I just noticed UserException -- shouldn't this be UserError? Also... We should have a guideline for when to use "...Exception" and when to use "...Error". Maybe we can use ...Exception for the first two levels of the hierarchy, ...Error for errors, and other endings for things that aren't errors (like SystemExit)? Then the top of the tree would look like this: BaseException (or RootException?) +-- CriticalException +-- ControlFlowException +-- Exception +-- (all regular exceptions start here) +-- Warning All common errors and warnings derive from Exception; bare 'except:' would be the same as 'except Exception:'. (I like that particularly because I've been writing that in lots of code already. :-) A refinement might be to introduce something called Error, which would change the last part of the above hierarchy as follows: (first three lines same as above) +-- Exception +-- Error +-- (all regular ...Error exceptions start here) +-- Warning +-- (all warnings start here) This has a nice symmetry between Error and Warning. Downside is that this "breaks" all user code that currently tries to be correct by declaring exceptions as deriving from Exception, which is pretty common; they would have to derive from Error to be politically correct. I don't immediately see what's best -- maybe Exception and Error should be two names for the same object??? But that's ugly too as a long-term solution. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Thu Aug 4 01:34:52 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 03 Aug 2005 19:34:52 -0400 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: Message-ID: <000601c59883$f35fb280$8421cb97@oemcomputer> > The problem with Raisable > is that it doesn't contain the word exception; perhaps we can call it > BaseException? 
+1 > A refinement might be to introduce something called Error, which would > change the last part of the above hierarchy as follows: . . . > This has a nice symmetry between Error and Warning. > > Downside is that this "breaks" all user code that currently tries to > be correct by declaring exceptions as deriving from Exception, which > is pretty common; they would have to derive from Error to be > politically correct. > > I don't immediately see what's best -- maybe Exception and Error > should be two names for the same object??? But that's ugly too as a > long-term solution. -1 Who really cares about the distinction? Besides, the correct choice may depend on your point of view or specific application (e.g. a case could be made that NotImplementedError is sometimes just a regular exception that can be expected to arise and be handled in the normal course of business). Unless we can point to real problems that people are having today, these kinds of changes are likely unwarranted. Raymond From bcannon at gmail.com Thu Aug 4 01:40:36 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 3 Aug 2005 16:40:36 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: References: <000301c597c7$1551dde0$92b2958d@oemcomputer> Message-ID: On 8/3/05, Guido van Rossum wrote: > [Guido van Rossum] > > > OK, I'm changing my mind again about the names again. > > > > > > Exception as the root and StandardError can stay; the only new > > > proposal would then be to make bare 'except:' call StandardError. > > [James Y Knight] > > I don't see how that can work. Any solution that is expected to > > result in a usable hierarchy this century must preserve "Exception" > > as the object that user exceptions should derive from (and therefore > > that users should generally catch, as well). There is way too much > > momentum behind that to change it. > > This is actually a good point, and what I was thinking when I first > responded to Brett. 
> > Sorry for the waivering -- being at OSCON always is a serious attack > on my system. > As long as you don't change your mind again on bare 'except's I won't feel like strangling you. =) > I'm still searching for a solution that lets us call everything in the > hierarchy "exception" and *yet* has Exception at the mid-point in that > hierarchy where Brett has StandardException. The problem with Raisable > is that it doesn't contain the word exception; perhaps we can call it > BaseException? We've got a few more things called base-such-and-such, > e.g. basestring (not that I like that one all that much). > BaseException is what comes to mind initially. You also mention RootException below. PureException seems too cutesy. SuperclassException might work. SuperException doesn't sound right. Co-worker suggested UhOh, but I don't think that will work either. =) > BTW I just noticed UserException -- shouldn't this be UserError? > Yep, and I already changed it in my personal copy. > Also... We should have a guideline for when to use "...Exception" and > when to use "...Error". Maybe we can use ...Exception for the first > two levels of the hierarchy, ...Error for errors, and other endings > for things that aren't errors (like SystemExit)? Then the top of the > tree would look like this: > That makes the most sense. Error for actual errors, exception when another suffix (e.g., Exit, Iteration) does not fit. > BaseException (or RootException?) > +-- CriticalException > +-- ControlFlowException > +-- Exception > +-- (all regular exceptions start here) > +-- Warning > > All common errors and warnings derive from Exception; bare 'except:' > would be the same as 'except Exception:'. (I like that particularly > because I've been writing that in lots of code already. 
:-) > > A refinement might be to introduce something called Error, which would > change the last part of the above hierarchy as follows: > > (first three lines same as above) > +-- Exception > +-- Error > +-- (all regular ...Error exceptions start here) > +-- Warning > +-- (all warnings start here) > > This has a nice symmetry between Error and Warning. > > Downside is that this "breaks" all user code that currently tries to > be correct by declaring exceptions as deriving from Exception, which > is pretty common; they would have to derive from Error to be > politically correct. > > I don't immediately see what's best -- maybe Exception and Error > should be two names for the same object??? But that's ugly too as a > long-term solution. Yuck. I say introduce Error (or StandardError or BaseError) and just live with the fact that older code will not necessarily follow the proper naming scheme. We can provide a script that rewrites source directly so that any class inheriting from Exception inherits from some other class, namely Error, instead. -Brett From aahz at pythoncraft.com Thu Aug 4 02:26:18 2005 From: aahz at pythoncraft.com (Aahz) Date: Wed, 3 Aug 2005 17:26:18 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <66d0a6e105080312181e25fa08@mail.gmail.com> References: <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> Message-ID: <20050804002618.GA2779@panix.com> On Wed, Aug 03, 2005, Nicholas Bastin wrote: > > I'd put $20 on the fact that cvs2svn will *not* work out of the box > for converting the python repository. Just call it a hunch. In any > case, the Perforce-supplied cvs2p4 should work at least as well. Maybe. 
OTOH, I went to a CVS->SVN talk today at OSCON, and I'd be suspicious of claims that Python's repository is more difficult to convert than others that have successfully made the switch (such as KDE). I'd rather not rely on licensing of a closed-source system; one of the points made during the talk was that the Linux project had to scramble when they lost their Bitkeeper license (but they didn't switch to SVN because they wanted a distributed model -- one of things I appreciated about this talk was the lack of One True Way-ism). -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. From stephen at xemacs.org Thu Aug 4 05:36:55 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 04 Aug 2005 12:36:55 +0900 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <20050804002618.GA2779@panix.com> (aahz@pythoncraft.com's message of "Wed, 3 Aug 2005 17:26:18 -0700") References: <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <20050804002618.GA2779@panix.com> Message-ID: <87ek9afk4o.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "aahz" == aahz writes: aahz> I'd rather not rely on licensing of a closed-source system; aahz> one of the points made during the talk was that the Linux aahz> project had to scramble when they lost their Bitkeeper aahz> license Python is unlikely to throw away its license in the same way, I should think. For additional security, you could try to negotiate a perpetual license on a particular version, or a license that required substantial notice (say, six months) for termination. 
I would imagine you could get them; the only reason for the vendor not to give them would be spite. The problem with both of those options is the one that Martin already pointed out: negotiation takes effort. There are several good open source alternatives, one of which (svn) is well-established and gets excellent reviews for those goals it sets itself, which happen to be solving the problems (as opposed to missing features) of CVS. Why spend effort on negotiating licenses and preparing for potential vendor relationship problems, unless there's acknowledged need for features svn doesn't provide? -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From bcannon at gmail.com Thu Aug 4 05:43:41 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 3 Aug 2005 20:43:41 -0700 Subject: [Python-Dev] Exception Reorg PEP checked in Message-ID: OK, once the cron job comes around and is run, http://www.python.org/peps/pep-0348.html will not be a 404 but be the latest version of the PEP. Differences since my last public version is that it has BaseException/Exception as the naming hierarchy, Warning inherits from Exception, UserException is UserError, and StandardError inherits from Exception. I also added better annotations on the tree for noticing where inheritance changed and whether it become broader (and thus had a new exception in its MRO) or more restrictive (and thus lost an exception). Basically everything that Guido has brought up today (08-03). I may have made some mistakes changing over to BaseException/Exception thanks to their names being so similar and tossing back in StandardError so if people catch what seems like odd sentences that is why (obviously let me know of the mistake). 
-Brett From stephen at xemacs.org Thu Aug 4 06:17:50 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 04 Aug 2005 13:17:50 +0900 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F11476.9000507@egenix.com> (M.'s message of "Wed, 03 Aug 2005 21:01:10 +0200") References: <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> Message-ID: <87acjyfi8h.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "M" == "M.-A. Lemburg" writes: M> Other non-commercial alternatives are Berlios and Savannah, but M> I'm not sure whether they'd offer Subversion support. Savannah doesn't offer great reliability or support, at least to judge by the frequency with which the GNU Emacs and GNU Arch projects have been unable to access various services on Savannah, including mailing lists and CVS. I also wonder if Savannah poses security risks. They've been successfully cracked (ISTR more than once) in the last couple of years, and took 6-10 weeks to get back to normal. This makes them reluctant to make minor variations in their established procedures for the convenience of projects. For example, it took a couple of months for GNU Arch to arrange sftp access so that they could host the Arch project in an Arch repository (Arch can use sftp but not plain ssh as a transport). SunSITE.dk does provide reliable service and timely support. XEmacs has been very happy with it. But Martin v. Loewis apparently hasn't had the same good experience with negotiating with them, and at least some negotiation and relationship maintenance is necessary---it's a closer, more personal relationship than with SF or Savannah. In particular for Subversion support (I was told they allow it on a case by case basis, and once success is demonstrated they plan to offer it in general). 
As I say, we've been happy with SunSITE, but the amount of effort is basically the same as if we ran our own repository, just directed more toward "vendor relations" and away from "sys admin" (which suits us). FWIW, XEmacs has moved or reorganized CVS repositories five times since 1999. Although it's not all in the PEP, if you add the discussion on this list Martin has covered the important issues we encountered or worried about. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From martin at v.loewis.de Thu Aug 4 07:42:54 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 04 Aug 2005 07:42:54 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <66d0a6e105080312181e25fa08@mail.gmail.com> References: <42E93940.6080708@v.loewis.de> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> Message-ID: <42F1AADE.50908@v.loewis.de> Nicholas Bastin wrote: >>No. The PEP is only about Subversion. Why should we be looking at Per >>Force? Only because Python is Open Source? > > > Perforce is a commercial product, but it can be had for free for > verified Open Source projects, which Python shouldn't have any problem > with. There are other problems, like you have to renew the agreement > every year, but it might be worth considering, given the fact that > it's an excellent system. So we should consider it because it is an excellent system... 
I don't know what that means, in precise, day-to-day usage terms (i.e. what precisely would it do for us that, say, Subversion can't do). >>I think anything but Subversion is ruled out because: >>- there is no offer to host that anywhere (for subversion, there is >> already svn.python.org) > > > We could host a Perforce repository just as easily, I would think. Interesting offer. I'll add this to the PEP - who is "we" in this context? >>- there is no support for converting a CVS repository (for subversion, >> there is cvs2svn) > > > I'd put $20 on the fact that cvs2svn will *not* work out of the box > for converting the python repository. Just call it a hunch. You could have read the PEP before losing that money :-) It did work out of the box. Regards, Martin From mal at egenix.com Thu Aug 4 10:51:56 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 04 Aug 2005 10:51:56 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F11962.2070107@v.loewis.de> References: <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de> Message-ID: <42F1D72C.8070202@egenix.com> Martin v. L?wis wrote: > M.-A. Lemburg wrote: > >>>I haven't received any offers to make a qualified statement. I only >>>know that I would oppose an approach to ask somebody but our >>>volunteers to do it for free, and I also know that I don't want to >>>spend my time researching commercial alternatives (although I >>>wouldn't mind if you spent your time). >> >> >>I don't quite understand what you meant here: are you opposing >>spending PSF money on a hosting company if and only if volunteers >>who take on the job don't get paid ? > > No. 
I'm opposed to approaching somebody to do it for free, except > the somebody are the pydotorg volunteers (IOW, I won't take gifts > from anybody else in this matter). Ok. >>I've done a bit of research on the subject and so far only found >>CollabNet and VA offering commercial services in this area. VA hosts >>SourceForge so that's a non-option, I guess :-) > > > It's not that I dislike VA - I personally think they are doing a > great job with SourceForge, and I like SourceForge a lot. There > are just some issues with it (like that they offer no Subversion). > > The question would be: what precisely is the commercial offering from > VA: does it provide subversion? how is the user management done? > etc. I guess this was a misunderstanding on my part: VA doesn't offer their commercial solution in an ASP-like way. Their product, called SourceForge Enterprise, is a J2EE application which we'd have to install and run. They do mention Subversion as being supported by the Enterprise edition. >>I know that Greg Stein worked for CollabNet, so thought it might be a >>good idea to ask him about the idea to move things to CollabNet. >>Of course, before taking this route, I wanted to get a feeling >>for the general attitude towards a commercial approach, which >>is why I tossed in the idea. > > Ok - I expect that the project might be *done* before we even have > a single commercial offer, with a precise service description, > and a precise price tag. That makes commercial offers so difficult: > that it is so time expensive to use them, that you might spend > less time doing it yourself. For (more or less) simple things like setting up SVN, I'd agree, but for hosting a complete development system, I have my doubts - things start to get rather complicated and integration of various different tools tends to be very time consuming. Sysadmin tasks like doing backups, emergency recovery, etc. 
also get more complicated once you have to deal with many different ways of data storage deployed by such tools, e.g. many of them require use of special tools to do hot backups. >>Other non-commercial alternatives are Berlios and Savannah, but >>I'm not sure whether they'd offer Subversion support. > > > For me, they fall into the "I won't take gifts" category. Ok, I'll drop the idea. >>BTW, have you considered using Trac as issue tracker on >>svn.python.org ? > > > You mean, me personally? I quite like the Subversion tracker, > and don't want to trade it for anything else. I know Guido > wants to use Roundup (which is also written in Python), > and obviously so does Richard Jones. > > The main questions are the same as with this PEP: how to do > the migration from SF (without losing data), and how to > do the ongoing maintenance. It's just that finding answers > to these questions is so much harder, therefore, this PEP > is *only* about CVS. Ok. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 04 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From ncoghlan at gmail.com Thu Aug 4 12:07:40 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 04 Aug 2005 20:07:40 +1000 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: References: Message-ID: <42F1E8EC.9080506@gmail.com> Brett Cannon wrote: > OK, once the cron job comes around and is run, > http://www.python.org/peps/pep-0348.html will not be a 404 but be the > latest version of the PEP. 
> > Differences since my last public version is that it has > BaseException/Exception as the naming hierarchy, Warning inherits from > Exception, UserException is UserError, and StandardError inherits from > Exception. I also added better annotations on the tree for noticing > where inheritance changed and whether it become broader (and thus had > a new exception in its MRO) or more restrictive (and thus lost an > exception). Basically everything that Guido has brought up today > (08-03). > If/when you add a "Getting there from here" section, it would be worth noting that there are a few basic strategies to be applied: - for new exceptions: - just add them in release 2.x - for name changes: - add the new name as an alias in release 2.x - deprecate the old name in release 2.x - delete the old name in release 2.(x+1) - to switch inheritance to a new exception type: - add the inheritance to the new parent in release 2.x - delete the inheritance from the old parent in release 3.0 - to switch inheritance to an existing exception type: - add the inheritance to the new parent in release 3.0 - delete the inheritance from the old parent in release 3.0 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Thu Aug 4 12:47:12 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 04 Aug 2005 20:47:12 +1000 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: <42F1E8EC.9080506@gmail.com> References: <42F1E8EC.9080506@gmail.com> Message-ID: <42F1F230.5000505@gmail.com> Nick Coghlan wrote: > If/when you add a "Getting there from here" section, it would be worth noting > that there are a few basic strategies to be applied: Eh, never mind. It's already there ;) Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Thu Aug 4 13:03:05 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 04 Aug 2005 21:03:05 +1000 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: References: Message-ID: <42F1F5E9.8050904@gmail.com> Brett Cannon wrote (in the PEP): > KeyboardInterrupt inheriting from ControlFlowException > > KeyboardInterrupt has been a contentious point within this hierarchy. Some > view the exception as more control flow being caused by the user. But with > its asynchronous cause thanks to the user being able to trigger the > exception at any point in code it has a more proper place inheriting from > CriticalException. It also keeps the name of the exception from being > "CriticalError". I think this argues against your own hierarchy, since you _did_ call the parent exception CriticalError. By your argument above, that suggests KeyboardInterrupt doesn't belong there ;) In practice, whether KeyboardInterrupt inherits from ControlFlowException or CriticalError shouldn't be a big deal - the important thing is to get it out from under Exception and StandardError. At which point, the naming issue is enough to incline me towards christening it a ControlFlowException. It gets all the 'oddly named' exceptions into one place. Additionally, consider that a hypothetical ThreadExit exception (used to terminate a thread semi-gracefully) would also clearly belong under ControlFlowException. That is, just because something is asynchronous with respect to the currently executing code doesn't necessarily make it an error (yes, I know I argued the opposite point the other day. . .). Cheers, Nick. 
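
A minimal sketch of the catch semantics argued about above may help. It
assumes an interpreter in which BaseException is the root of the tree
(the name the PEP draft settled on). ControlFlowException and ThreadExit
are only the names used in this thread; neither is a builtin, so both are
defined locally as stand-ins.

```python
# Stand-in classes: neither name exists as a builtin. They mirror the
# proposal discussed in this thread and are assumptions of this sketch.
class ControlFlowException(BaseException):
    """Hypothetical parent for raisable things that are not errors."""


class ThreadExit(ControlFlowException):
    """Hypothetical request to terminate a thread semi-gracefully."""


def classify(exc):
    """Raise exc and report which handler ends up catching it."""
    try:
        raise exc
    except Exception:
        return "caught as Exception"
    except ControlFlowException:
        return "caught as ControlFlowException"


print(classify(ValueError("bad value")))
print(classify(ThreadExit()))
```

The point of the placement is that a handler written as "except
Exception:" lets anything under ControlFlowException pass through
untouched, which is the behaviour wanted for KeyboardInterrupt whichever
of the two parents it ends up with.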
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com Thu Aug 4 13:27:40 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 04 Aug 2005 21:27:40 +1000
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To:
References:
Message-ID: <42F1FBAC.6020404@gmail.com>

Since I forgot to mention it in the last couple of messages - this version
looks very good. The transition strategy section makes it a lot more
meaningful.

Brett Cannon wrote (in the PEP):
> Renamed Exceptions
>
> Renamed exceptions will directly subclass the new names. When the old
> exceptions are instantiated (which occurs when an exception is caught,
> either by a try statement or by propagating to the top of the execution
> stack), a PendingDeprecationWarning will be raised.

Nice trick with figuring out how to raise the deprecation warning :)
(That line was going to read 'Why not just create an alias?', but then I
worked out what you were doing, and why you were doing it)

One case that this doesn't completely address is NameError, as it is the
only renamed exception which currently has a subclass. In this case, I
think that during the transition phase, all three of the 'Unbound*Error'
exceptions should inherit from NameError, with NameError inheriting from
NamespaceError. I believe it should still be possible to get the
deprecation warning to work correctly in this case (by not raising the
warning when a subclass is instantiated).

In the 'just a typo' category, WeakReferenceError should still be under
StandardError in the hierarchy.

Cheers,
Nick.
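
One way the "warn on instantiation, but not for subclasses" idea could
look is sketched below. This is an illustration under stated assumptions,
not the PEP's implementation: all three classes are local stand-ins
(OldNameError plays the role of NameError, UnboundLocalErrorSketch the
role of an Unbound*Error subclass), and the helper warnings_from exists
only to make the result visible.

```python
import warnings


class NamespaceError(Exception):
    """Stand-in for the new name proposed in the PEP draft."""


class OldNameError(NamespaceError):
    """Stand-in for the old name (NameError) during the transition."""

    def __init__(self, *args):
        # Warn only when the old class itself is instantiated, so that
        # subclasses which still inherit from it stay silent.
        if type(self) is OldNameError:
            warnings.warn(
                "OldNameError is deprecated; use NamespaceError",
                PendingDeprecationWarning,
                stacklevel=2,
            )
        super().__init__(*args)


class UnboundLocalErrorSketch(OldNameError):
    """Stand-in for a subclass that keeps the old parent for now."""


def warnings_from(factory):
    """Call factory() and return the names of the warnings it triggered."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        factory()
    return [w.category.__name__ for w in caught]


print(warnings_from(OldNameError))             # one PendingDeprecationWarning
print(warnings_from(UnboundLocalErrorSketch))  # no warning
```

Both classes are still caught by a handler for NamespaceError, so code
that has already switched to the new name keeps working while the old
name is phased out.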
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From foom at fuhm.net Thu Aug 4 15:01:47 2005 From: foom at fuhm.net (James Y Knight) Date: Thu, 4 Aug 2005 09:01:47 -0400 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: <42F1F5E9.8050904@gmail.com> References: <42F1F5E9.8050904@gmail.com> Message-ID: <5148626E-877E-4B6E-881C-0A56824FDE66@fuhm.net> On Aug 4, 2005, at 7:03 AM, Nick Coghlan wrote: > Additionally, consider that a hypothetical ThreadExit exception > (used to > terminate a thread semi-gracefully) would also clearly belong under > ControlFlowException. That is, just because something is > asynchronous with > respect to the currently executing code doesn't necessarily make it > an error > (yes, I know I argued the opposite point the other day. . .). No. Just because something gets asynchronously raised out from under you *does* make it critical (or maybe "critically fatal"). See my reply to Philip Eby on Aug 2, msgid <9EDA49FB-1E9B-4558-9441-90A65ECC5A52 at fuhm.net>. James From metawilm at gmail.com Thu Aug 4 15:37:28 2005 From: metawilm at gmail.com (Willem Broekema) Date: Thu, 4 Aug 2005 15:37:28 +0200 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: References: Message-ID: On 8/4/05, Brett Cannon wrote: > OK, once the cron job comes around and is run, > http://www.python.org/peps/pep-0348.html will not be a 404 but be the > latest version of the PEP. Currently, when the "recursion limit" is reached, a RuntimeError is raised. RuntimeError is in the PEP renamed to UserError. UserError is in the new hierarchy located below StandardError, below Exception. I think that in the new hierarchy this error should be in the same "critical" category as MemoryError. (MemoryError includes general stack overflow.) 
- Willem From foom at fuhm.net Thu Aug 4 17:06:00 2005 From: foom at fuhm.net (James Y Knight) Date: Thu, 4 Aug 2005 11:06:00 -0400 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: References: Message-ID: > +-- NamespaceError (rename of NameError) > +-- UnboundFreeError (new) > +-- UnboundGlobalError (new) > +-- UnboundLocalError > What are these new exceptions for? Under what circumstances are they raised? Why is this necessary or an improvement? > Renamed Exceptions > > Renamed exceptions will directly subclass the new names. When the > old exceptions are instantiated (which occurs when an exception is > caught, either by a try statement or by propagating to the top of > the execution stack), a PendingDeprecationWarning will be raised. > > This should properly preserve backwards-compatibility as old usage > won't change and the new names can be used to also catch exceptions > using the old name. The warning of the deprecation is also kept > simple. This will cause problems when a library raises the exception under the new name and an app tries to catch the old name. So the standard lib (or any other lib) cannot raise the new names. Because the stdlib must raise the old names, people will see the old names, continue catching the old names, and the new names will never catch on. Perhaps it'd work out better to have the new names subclass the old names. Then you have to continue catching the old name as long as anyone is raising it, but at least you can raise the new name with impunity. I expect not much code actually raises ReferenceError or NameError besides that internal to python. Thus it would be relatively safe to change all code to catch the new names for those immediately. Lots of code raises RuntimeError, but I bet not very much code explicitly catches it. Oh, but if the stdlib starts raising under the new names, that'll break any code that checks the exact type of the exception against the old name. Boo. 
It'd be better to somehow raise a DeprecationWarning upon access, yet
still result in the same object. Unfortunately I don't think there's
any way to do that in python. This lack of ability to deprecate module
attributes has bit me several times in other projects as well. Matt
Goodall wrote the hack attached at the end in order to move some whole
modules around in Nevow. Amazingly it actually seemed to work. :)
Something like that won't work for __builtins__, of course, since
that's accessed directly with PyDict_Get.

All in all I don't really see a real need for these renamings and I
don't see a way to do them compatibly so I'm -1 to the whole idea of
renaming exceptions.

> Removal of Bare except Clauses
>
> A SemanticsWarning will be raised for all bare except clauses.

Does this mean that bare except clauses change meaning to "except
Exception" immediately? Or (I hope) did you mean that in Py2.5 they
continue doing as they do now, but print a warning to tell you they
will be changing in the future?

James

> import sys
> import types
> import warnings
>
> from twisted.python import reflect
>
> class ModuleWithDeprecations(types.ModuleType):
>
>     def __init__(self, original, deprecatedNames):
>         self.original = original
>         self.deprecatedNames = deprecatedNames
>
>     def __getattr__(self, name):
>         newName = self.deprecatedNames.get(name, None)
>         if newName is not None:
>             warnings.warn("nevow.%s is deprecated, please import %s "
>                           "instead!" % (name, newName),
>                           DeprecationWarning, 2)
>             return reflect.namedAny(newName)
>         return getattr(self.original, name)
>
> # Evil hack? What evil hack!
> sys.modules['nevow'] = ModuleWithDeprecations(
>     sys.modules['nevow'],
>     {'formless': 'formless',
>      'freeform': 'formless.webform'
>     }
> )

From gvanrossum at gmail.com Thu Aug 4 17:23:32 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 4 Aug 2005 08:23:32 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To:
References:
Message-ID:

In general the PEP looks really good now!

On 8/4/05, Willem Broekema wrote:
> On 8/4/05, Brett Cannon wrote:
> > OK, once the cron job comes around and is run,
> > http://www.python.org/peps/pep-0348.html will not be a 404 but be the
> > latest version of the PEP.
>
> Currently, when the "recursion limit" is reached, a RuntimeError is
> raised. RuntimeError is in the PEP renamed to UserError. UserError is
> in the new hierarchy located below StandardError, below Exception.
>
> I think that in the new hierarchy this error should be in the same
> "critical" category as MemoryError. (MemoryError includes general
> stack overflow.)

No. Usually, a recursion error is a simple bug in the code, no
different from a TypeError or NameError etc.

This does contradict my earlier claim that Python itself doesn't use
RuntimeError; I think I'd be happier if it remained RuntimeError. (I
think there are a few more uses of it inside Python itself; I don't
think it's worth inventing new exceptions for all these.)
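
For readers who want to see the behaviour in question, here is a small
sketch of hitting the recursion limit and catching the result as
RuntimeError. The function names are made up for the illustration. The
exact class raised depends on the interpreter: at the time of this
thread it was RuntimeError itself, while later Python 3 releases raise
RecursionError, which is a subclass of RuntimeError, so the same handler
catches it either way.

```python
def recurse(depth=0):
    """Call itself with no base case so the recursion limit is reached."""
    return recurse(depth + 1)


def hits_recursion_limit():
    """Return the name of the exception class raised at the limit."""
    try:
        recurse()
    except RuntimeError as exc:
        # An ordinary handler is enough: the program recovers and
        # carries on, as with any other bug-induced exception.
        return type(exc).__name__
    return None


print(hits_recursion_limit())
```

That the handler recovers cleanly is consistent with treating runaway
recursion as an ordinary bug rather than placing it in the "critical"
category.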
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Thu Aug 4 19:57:16 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 04 Aug 2005 19:57:16 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F1D72C.8070202@egenix.com> References: <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de> <42F1D72C.8070202@egenix.com> Message-ID: <42F256FC.7050606@v.loewis.de> M.-A. Lemburg wrote: > I guess this was a misunderstanding on my part: VA doesn't offer > their commercial solution in an ASP-like way. Their product, > called SourceForge Enterprise, is a J2EE application which we'd > have to install and run. They do mention Subversion as being > supported by the Enterprise edition. Ah, ok. I don't think I want to operate such a software (and, strictly speaking, this is out of the scope of the PEP). I had the "pleasure" once of having to maintain a SourceForge installation (before SourceForge became closed source), and it was a nightmare to operate. > For (more or less) simple things like setting up SVN, I'd agree, > but for hosting a complete development system, I have my doubts - > things start to get rather complicated and integration of various > different tools tends to be very time consuming. I guess Python's development process is very simple then. We use mailing lists, CVS, newsgroups, web servers, and bug trackers, but these don't have to integrate. Many of these services are already on pydotorg, and I propose to add an additional one (revision control). > Sysadmin tasks like doing backups, emergency recovery, etc. also > get more complicated once you have to deal with many different ways > of data storage deployed by such tools, e.g. 
many of them > require use of special tools to do hot backups. We are doing quite well here. XS4ALL kindly does disk backup for us, and, in the specific case of Subversion's fsfs, this is all that is needed. For Postgres, we backup to disk, which then gets picked up by the disk backup. Regards, Martin From raymond.hettinger at verizon.net Thu Aug 4 19:56:50 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 04 Aug 2005 13:56:50 -0400 Subject: [Python-Dev] PEP 342 Implementation Message-ID: <000001c5991d$e40bb140$12b62c81@oemcomputer> Could someone please make an independent check to verify an issue with the 342 checkin. The test suite passes but when I run IDLE and open a new window (using Control-N), it crashes and burns. The problem does not occur just before the checkin: cvs up -D "2005-08-01 18:00" But emerges immediately after: cvs up -D "2005-08-01 21:00" Raymond From mal at egenix.com Thu Aug 4 20:28:06 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 04 Aug 2005 20:28:06 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F256FC.7050606@v.loewis.de> References: <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de> <42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de> Message-ID: <42F25E36.5060103@egenix.com> Martin v. L?wis wrote: > M.-A. Lemburg wrote: > >>I guess this was a misunderstanding on my part: VA doesn't offer >>their commercial solution in an ASP-like way. Their product, >>called SourceForge Enterprise, is a J2EE application which we'd >>have to install and run. They do mention Subversion as being >>supported by the Enterprise edition. > > > Ah, ok. I don't think I want to operate such a software (and, > strictly speaking, this is out of the scope of the PEP). 
I had the > "pleasure" once of having to maintain a SourceForge installation > (before SourceForge became closed source), and it was a nightmare > to operate. With J2EE I doubt that things got any easier to maintain... (assuming that you had to run the version of the software which is used on SF.net). >>For (more or less) simple things like setting up SVN, I'd agree, >>but for hosting a complete development system, I have my doubts - >>things start to get rather complicated and integration of various >>different tools tends to be very time consuming. > > > I guess Python's development process is very simple then. We use > mailing lists, CVS, newsgroups, web servers, and bug trackers, > but these don't have to integrate. Many of these services are > already on pydotorg, and I propose to add an additional one > (revision control). > > >>Sysadmin tasks like doing backups, emergency recovery, etc. also >>get more complicated once you have to deal with many different ways >>of data storage deployed by such tools, e.g. many of them >>require use of special tools to do hot backups. > > > We are doing quite well here. XS4ALL kindly does disk backup for > us, and, in the specific case of Subversion's fsfs, this is all > that is needed. For Postgres, we backup to disk, which then > gets picked up by the disk backup. Sounds like you have everything under control, which is good :-) BTW, in one of your replies I read that you had a problem with how cvs2svn handles trunk, branches and tags. In reality, this is no problem at all, since Subversion is very good at handling moves within the repository: you can easily change the repository layout after the import to whatevery layout you see fit - without losing any of the version history. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 04 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2005-07-18: Released mxODBC.Zope.DA for Zope 2.8 ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From pje at telecommunity.com Thu Aug 4 20:37:04 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 04 Aug 2005 14:37:04 -0400 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F25E36.5060103@egenix.com> References: <42F256FC.7050606@v.loewis.de> <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de> <42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de> Message-ID: <5.1.1.6.0.20050804143230.025c4138@mail.telecommunity.com> At 08:28 PM 8/4/2005 +0200, M.-A. Lemburg wrote: >BTW, in one of your replies I read that you had a problem with >how cvs2svn handles trunk, branches and tags. In reality, this >is no problem at all, since Subversion is very good at handling >moves within the repository: you can easily change the repository >layout after the import to whatevery layout you see fit - without >losing any of the version history. Yeah, in my use of SVN I find that this is more theoretical than actual for certain use cases. You can see the history of a file including the history of any file it was copied from. However, if you want to try to look at the whole layout, you can't easily get to the old locations. This can be a royal pain, whereas at least in CVS you can use viewcvs to show you the "attic". Subversion doesn't have an attic, which makes looking at structural history very difficult. 
That having been said, I generally like Subversion, I just know that when I moved my projects to it I felt it was worth taking extra care to convert them in a way that didn't require me to reorganize the repository immediately thereafter, because I didn't want a sudden discontinuity, beyond which history would be difficult to follow. Therefore, I'm saying that taking some care with the conversion process to get things the way we like them would be a good idea. From mal at egenix.com Thu Aug 4 21:29:41 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 04 Aug 2005 21:29:41 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <5.1.1.6.0.20050804143230.025c4138@mail.telecommunity.com> References: <42F256FC.7050606@v.loewis.de> <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de> <42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de> <5.1.1.6.0.20050804143230.025c4138@mail.telecommunity.com> Message-ID: <42F26CA5.6010009@egenix.com> Phillip J. Eby wrote: > At 08:28 PM 8/4/2005 +0200, M.-A. Lemburg wrote: > >> BTW, in one of your replies I read that you had a problem with >> how cvs2svn handles trunk, branches and tags. In reality, this >> is no problem at all, since Subversion is very good at handling >> moves within the repository: you can easily change the repository >> layout after the import to whatevery layout you see fit - without >> losing any of the version history. > > > Yeah, in my use of SVN I find that this is more theoretical than actual > for certain use cases. You can see the history of a file including the > history of any file it was copied from. However, if you want to try to > look at the whole layout, you can't easily get to the old locations. 
> This can be a royal pain, whereas at least in CVS you can use viewcvs to > show you the "attic". Subversion doesn't have an attic, which makes > looking at structural history very difficult. Hmm, I usually create a tag before doing such changes in our Subversion repo. This makes it very easy to look at layouts before a restructuring. And because Subversion doesn't really care whether you do a tag, branch, or some other form of diverting versions into different namespaces (it's all just copying data), you can easily create a directory called "attic" for just this purpose and copy your structural change tags in there :-) > That having been said, I generally like Subversion, I just know that > when I moved my projects to it I felt it was worth taking extra care to > convert them in a way that didn't require me to reorganize the > repository immediately thereafter, because I didn't want a sudden > discontinuity, beyond which history would be difficult to follow. > > Therefore, I'm saying that taking some care with the conversion process > to get things the way we like them would be a good idea. Still very true indeed. The fact that cvs2svn is written in Python should make this even easier. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 04 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2005-07-18: Released mxODBC.Zope.DA for Zope 2.8 ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From edcjones at comcast.net Thu Aug 4 21:32:25 2005 From: edcjones at comcast.net (Edward C. 
Jones) Date: Thu, 04 Aug 2005 15:32:25 -0400 Subject: [Python-Dev] String exceptions in Python source Message-ID: <42F26D49.6010408@comcast.net>
/usr/local/src/Python-2.4.1/Lib/SimpleXMLRPCServer.py: raise 'bad method'
/usr/local/src/Python-2.4.1/Demo/classes/bitvec.py: raise 'FATAL', '(param, l) = %r' % ((param, l),)
/usr/local/src/Python-2.4.1/Lib/plat-mac/FrameWork.py: raise 'Unsupported in MachoPython'
/usr/local/src/Python-2.4.1/Lib/plat-mac/FrameWork.py: raise 'Can only delete last item of a menu'
/usr/local/src/Python-2.4.1/Lib/plat-mac/MiniAEFrame.py: raise 'Cannot happen: AE callback without handler', (_class, _type)
/usr/local/src/Python-2.4.1/Lib/plat-mac/PixMapWrapper.py: raise 'UseErr', "don't assign to .baseAddr -- assign to .data instead"
/usr/local/src/Python-2.4.1/Lib/plat-mac/argvemulator.py: raise 'Cannot happen: AE callback without handler', (_class, _type)
/usr/local/src/Python-2.4.1/Mac/Modules/waste/wastescan.py: raise 'Error: not found: %s', WASTEDIR
/usr/local/src/Python-2.4.1/Mac/Tools/IDE/PyDebugger.py: raise 'spam' (3 times)
/usr/local/src/Python-2.4.1/Mac/Tools/macfreeze/macfreeze.py: raise 'unknown gentype', gentype
/usr/local/src/Python-2.4.1/Mac/Tools/macfreeze/macfreezegui.py: raise 'Error in gentype', gentype
From tanzer at swing.co.at Fri Aug 5 10:12:25 2005 From: tanzer at swing.co.at (tanzer@swing.co.at) Date: Fri, 05 Aug 2005 10:12:25 +0200 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: Your message of "Wed, 03 Aug 2005 18:26:12 EDT." Message-ID: James Y Knight wrote: > > OK, I'm changing my mind again about the names again. > > > > Exception as the root and StandardError can stay; the only new > > proposal would then be to make bare 'except:' call StandardError. > > I don't see how that can work. 
Any solution that is expected to > result in a usable hierarchy this century must preserve "Exception" > as the object that user exceptions should derive from (and therefore > that users should generally catch, as well). There is way too much > momentum behind that to change it. Well, in the last few years I always derived my own exceptions from StandardError and used `except StandardError` instead of `except Exception`. And I'd love to get rid of the `except KeyboardInterrupt: raise` clause I currently have to write before any `except StandardError`. -- Christian Tanzer http://www.c-tanzer.at/ From dooms at info.ucl.ac.be Fri Aug 5 11:18:47 2005 From: dooms at info.ucl.ac.be (Grégoire Dooms) Date: Fri, 05 Aug 2005 11:18:47 +0200 Subject: [Python-Dev] [ python-Patches-790710 ] breakpoint command lists in pdb Message-ID: <42F32EF7.6050208@info.ucl.ac.be> Hello, This patch is about to celebrate its second birthday :-) https://sourceforge.net/tracker/?func=detail&atid=305470&aid=790710&group_id=5470 It seems from the comments that the feature is nice but the implementation was not OK. I redid the implementation according to the comments. What should I do to get it reviewed further? (perhaps just this: posting to python-dev :-) Best, -- Grégoire From tjreedy at udel.edu Fri Aug 5 15:50:52 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 5 Aug 2005 09:50:52 -0400 Subject: [Python-Dev] [ python-Patches-790710 ] breakpoint command lists in pdb References: <42F32EF7.6050208@info.ucl.ac.be> Message-ID: "Grégoire Dooms" wrote in message news:42F32EF7.6050208 at info.ucl.ac.be... >This patch is about to celebrate its second birthday :-) >What should I do to get it reviewed further ? The guaranteed-by-a-couple-of-developers way is to review 5 other patches, post a summary here, and name this as the one you want reviewed in exchange. 
TJR From gvanrossum at gmail.com Fri Aug 5 17:53:00 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri, 5 Aug 2005 08:53:00 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: References: Message-ID: One more thing. Is renaming NameError to NamespaceError really worth it? I'd say that NameError is just as clear. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Fri Aug 5 18:34:43 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 05 Aug 2005 12:34:43 -0400 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: Message-ID: <001301c599db$9681be60$cbae2c81@oemcomputer> [Guido] > One more thing. Is renaming NameError to NamespaceError really worth > it? I'd say that NameError is just as clear. +1 on NameError -- it's clear, easy to type, isn't a gratuitous change, and doesn't make you think twice about NamespaceError vs NameSpaceError. Raymond From raymond.hettinger at verizon.net Fri Aug 5 20:01:26 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 05 Aug 2005 14:01:26 -0400 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: Message-ID: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer> Also strong -1 on renaming RuntimeWarning to SemanticsWarning. Besides being another unnecessary change (trying to solve a non-existent problem), this isn't an improvement. The phrase RuntimeWarning is sufficiently generic to allow it to be used for a number of purposes. In contrast, SemanticsWarning is less flexible. Worse, it is not at all clear what a Semantics Warning would mean -- it suggests something much more ominous and complicated than it should. Another risk from gratuitous changes is the risk of unexpectedly introducing new problems. In this case, I find myself remembering the name as SemanticWarning instead of SemanticsWarning. 
These kinds of changes suck -- they fail to take advantage of 15 years of field testing and risk introducing hard-to-change usability problems. Likewise, I am a strong -1 on renaming RuntimeError to UserError. The latter name has some virtues but it is also misread as the User doing something wrong -- that is definitely not the intended meaning. While RuntimeError is a less than perfect name, it should not be changed unless we have both 1) demonstrated that real world problems have occurred with the current name and 2) that we have a clearly superior alternative name (a test which UserError fails). The only virtue to the name, UserError, is its symmetry with UserWarning. -0 on renaming ReferenceError to WeakReferenceError. The new name does better suggest the cause. OTOH, the context of the traceback would also make that perfectly clear. I'm not aware of a single user having had a problem with the current name. In general, we've avoided long names in favor of the short and pithy -- the theory was that only a mnemonic is needed. Before adopting this one, there should be some discussion of 1) whether the current name is really that unclear, 2) whether shorter alternatives would serve (i.e. WeakrefError), and 3) whether the name suffers from capitalization ambiguity (WeakreferenceError vs WeakReferenceError). Summary: Most of the proposed name changes are unnecessary, the new names are not necessarily better, and there is a high risk of introducing new usability problems. Raymond From raymond.hettinger at verizon.net Fri Aug 5 20:46:41 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 05 Aug 2005 14:46:41 -0400 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer> Message-ID: <002301c599ee$09dd6e60$cbae2c81@oemcomputer> The PEP moves StopIteration out from under Exception so that it cannot be caught by a bare except or an explicit "except Exception". 
IMO, this is a mistake. In either form, a programmer is stating that they want to catch and handle just about anything. There is a reasonable argument that SystemExit is special and should float to the top, but that is not the case with StopIteration. When a user creates their own exception for exiting multiple levels of loops or frames, should they inherit from ControlFlowException on the theory that it is no different in intent from StopIteration or should they inherit from UserError on the theory that it is a custom exception? Do you really want routine control-flow exceptions to bypass "except Exception"? I suspect that will lead to coding errors that are very difficult to spot (it sure looks like it should catch a StopIteration). Be careful with these proposals. While well intentioned, they have ramifications that aren't instantly apparent. Each one needs some deep thought, user discussion, usability testing, and a darned good reason for changing what we already have in the field. Raymond From bcannon at gmail.com Fri Aug 5 21:02:46 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 5 Aug 2005 12:02:46 -0700 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <002301c599ee$09dd6e60$cbae2c81@oemcomputer> References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer> <002301c599ee$09dd6e60$cbae2c81@oemcomputer> Message-ID: On 8/5/05, Raymond Hettinger wrote: > The PEP moves StopIteration out from under Exception so that it cannot > be caught by a bare except or an explicit "except Exception". > > IMO, this is a mistake. In either form, a programmer is stating that > they want to catch and handle just about anything. There is a > reasonable argument that SystemExit is special and should float to the top, > but that is not the case with StopIteration. 
> > When a user creates their own exception for exiting multiple levels of > loops or frames, should they inherit from ControlFlowException on the > theory that it is no different in intent from StopIteration or should they > inherit from UserError on the theory that it is a custom exception? I say ControlFlowException. UserError is meant for quick-and-dirty exception usage and not as a base for user error exceptions. If the name is confusing it can be changed to SimpleError. > Do > you really want routine control-flow exceptions to bypass "except > Exception"? Yes. > I suspect that will lead to coding errors that are very > difficult to spot (it sure looks like it should catch a StopIteration). > I honestly don't think it will. People who are going to care about catching StopIteration are writing custom iterators, not something a newbie will probably be doing and thus should know to be specific about what exceptions they are catching when they have a specific thing in mind. > Be careful with these proposals. While well intentioned, they have > ramifications that aren't instantly apparent. Each one needs some deep > thought, user discussion, usability testing, and a darned good reason > for changing what we already have in the field. > Right, which is why this is all in a PEP, so the discussion can happen and the kinks can be worked out. As for the testing, that can happen with __future__ statements, people trying out a patch, or maybe even some testing branch of Python for possible 3000 features. 
-Brett From bcannon at gmail.com Fri Aug 5 21:07:08 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 5 Aug 2005 12:07:08 -0700 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: <42F1F5E9.8050904@gmail.com> References: <42F1F5E9.8050904@gmail.com> Message-ID: On 8/4/05, Nick Coghlan wrote: > Brett Cannon wrote (in the PEP): > > KeyboardInterrupt inheriting from ControlFlowException > > > > KeyboardInterrupt has been a contentious point within this hierarchy. Some > > view the exception as more control flow being caused by the user. But with > > its asynchronous cause thanks to the user being able to trigger the > > exception at any point in code it has a more proper place inheriting from > > CriticalException. It also keeps the name of the exception from being > > "CriticalError". > > I think this argues against your own hierarchy, since you _did_ call the > parent exception CriticalError. By your argument above, that suggests > KeyboardInterrupt doesn't belong there ;) > =) Drawback of having names swapped in and out so many times. > In practice, whether KeyboardInterrupt inherits from ControlFlowException or > CriticalError shouldn't be a big deal - the important thing is to get it out > from under Exception and StandardError. > In general, probably. > At which point, the naming issue is enough to incline me towards christening > it a ControlFlowException. It gets all the 'oddly named' exceptions into one > place. > Good point. I think I would like to see Guido's preference for this since it feels like it should be under CriticalError. > Additionally, consider that a hypothetical ThreadExit exception (used to > terminate a thread semi-gracefully) would also clearly belong under > ControlFlowException. That is, just because something is asynchronous with > respect to the currently executing code doesn't necessarily make it an error > (yes, I know I argued the opposite point the other day. . .). > Another good point. 
I am leaning towards moving it now, but I still would like to hear Guido's preference, if he has one. -Brett From pje at telecommunity.com Fri Aug 5 21:14:48 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 05 Aug 2005 15:14:48 -0400 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <002301c599ee$09dd6e60$cbae2c81@oemcomputer> References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer> Message-ID: <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com> At 02:46 PM 8/5/2005 -0400, Raymond Hettinger wrote: >The PEP moves StopIteration out from under Exception so that it cannot >be caught by a bare except or an explicit "except Exception". > >IMO, this is a mistake. In either form, a programmer is stating that >they want to catch and handle just about anything. There is a >reasonable argument that SystemExit special and should float to the top, >but that is not the case with StopIteration. While I agree with most of your -1's on gratuitous changes, this particular problem isn't gratuitous. A StopIteration that reaches a regular exception handler is a programming error; allowing StopIteration and other control-flow exceptions to be caught other than explicitly *masks* programming errors. Under normal circumstances, StopIteration is caught by for loops or by explicit catches of StopIteration. If it doesn't get caught, *that's* an error, and it would be hidden if caught by a generic "except" clause. So, any code that is "broken" by the move was in fact *already* broken, it's just that one bug (a too-general except: clause) is masking the other bug (the escaping control-flow exception). >When a user creates their own exception for exiting multiple levels of >loops or frames, should they inherit from ControlFlowException on the >theory that it no different in intent from StopIteration or should they >inherit from UserError on the theory that it is a custom exception? 
Do >you really want routine control-flow exceptions to bypass "except >Exception". Yes, definitely. A control flow exception that isn't explicitly caught somewhere is itself an error, but it's not detectable if it's swallowed by an over-eager except: clause. > I suspect that will lead to coding errors that are very >difficult to spot (it sure looks like it should catch a StopIteration). Actually, no, it makes them *easy* to spot because nothing will catch them, and therefore you will be able to see that there's no handler in place. If they *are* caught, that is what leads to difficult-to-spot errors -- i.e. the situation we have now. >Be careful with these proposals. While well intentioned, they have >ramifications that aren't instantly apparent. Each one needs some deep >thought, user discussion, usability testing, and a darned good reason >for changing what we already have in the field. There is a darned good reason for this one; critical exceptions and control flow exceptions are pretty much the motivating reason for doing any changes to the exception hierarchy at all. From raymond.hettinger at verizon.net Fri Aug 5 21:23:13 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 05 Aug 2005 15:23:13 -0400 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: Message-ID: <002901c599f3$24bd1000$cbae2c81@oemcomputer> > > When a user creates their own exception for exiting multiple levels of > > loops or frames, should they inherit from ControlFlowException on the > > theory that it no different in intent from StopIteration or should they > > inherit from UserError on the theory that it is a custom exception? > > I say ControlFlowException. UserError is meant for quick-and-dirty > exception usage and not as a base for user error exceptions. If the > name is confusing it can be changed to SimpleError. Gads. It sounds like you're just making this up on the fly. 
The process should be disciplined, grounded in use cases, and aimed at known, real problems with the current hierarchy. The above question was rhetorical. It didn't have a right answer. "Quick-and-dirty" is not a useful category and cannot be reliably placed in one part of the tree versus another. A common and basic use case for quick and dirty exceptions is to break out of nested loops and functions. That is control flow as well as quick-and-dirty. Raymond From foom at fuhm.net Fri Aug 5 21:42:16 2005 From: foom at fuhm.net (James Y Knight) Date: Fri, 5 Aug 2005 15:42:16 -0400 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <002301c599ee$09dd6e60$cbae2c81@oemcomputer> References: <002301c599ee$09dd6e60$cbae2c81@oemcomputer> Message-ID: <214E1AB0-A01E-4C56-A9AB-356A367AAE25@fuhm.net> On Aug 5, 2005, at 2:46 PM, Raymond Hettinger wrote: > The PEP moves StopIteration out from under Exception so that it cannot > be caught by a bare except or an explicit "except Exception". > > IMO, this is a mistake. In either form, a programmer is stating that > they want to catch and handle just about anything. There is a > reasonable argument that SystemExit special and should float to the > top, > but that is not the case with StopIteration. I'm glad you brought that up. I had wondered from the beginning why ControlFlowException was moved out, but thought there must be a good reason I was just not seeing, and promptly forgot about it. So now that I've been reminded, can someone explain to me why StopIteration and GeneratorExit should not be caught by an "except:" or "except Exception:" clause? 
James From raymond.hettinger at verizon.net Fri Aug 5 21:57:44 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 05 Aug 2005 15:57:44 -0400 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com> Message-ID: <002a01c599f7$f29d63e0$cbae2c81@oemcomputer> [Raymond Hettinger wrote] > >The PEP moves StopIteration out from under Exception so that it cannot > >be caught by a bare except or an explicit "except Exception". > > > >IMO, this is a mistake. In either form, a programmer is stating that > >they want to catch and handle just about anything. There is a > >reasonable argument that SystemExit special and should float to the top, > >but that is not the case with StopIteration. [Phillip J. Eby] > While I agree with most of your -1's on gratuitous changes, this > particular > problem isn't gratuitous. A StopIteration that reaches a regular > exception > handler is a programming error; allowing StopIteration and other > control-flow exceptions to be caught other than explicitly *masks* > programming errors. Thanks for clearly articulating the rationale behind moving control-flow exceptions out from under Exception. The idea is not entirely without merit but I believe it is both misguided and has serious negative consequences. Two things are both true. Writers of bare excepts sometimes catch more than they intended and mask errors in their programs. It is also true that there are valid use cases for wanting to trap and log all recoverable errors in long running programs (i.e. not crashing your whole air traffic control system if submodule fails to trap a control flow exception). I favor the current setup for several reasons: 1. Writing a bare except is its own warning to a programmer. It is a Python basic to be careful with it and to focus attention on whether it is really intended. PyChecker flags it and it stands out during code review. 
IOW, it is a documented, well-understood hazard that should surprise no one. 2. There is a lesson to be taken from a story in the ACM risks forum where a massive phone outage was traced to a single line of C code that ran a "break" to get out of a nested if-statement. The interesting part is that this was known to be mission critical code yet the error survived multiple, independent code reviews. The problem was that the code created an optical illusion. We risk the same thing when an "except Exception" doesn't catch ControlFlowExceptions. The recovery/logging handler will look like it ought to catch everything, but it won't. That is a disaster for fault-tolerant coding and for keeping your sales demo from exploding in front of customers. 3. As noted above, there ARE valid use cases for bare excepts. Also consider that Python rarely documents or even can document all the recoverable exceptions that can be raised by a method call. A broad based handler is sometimes the programmer's only defense. 4. As noted in another email, user defined control flow exceptions are a key use case. I believe that was the typical use for string exceptions. The idea is that user-defined control flow exceptions are one of the key means for exiting multiple layers of loops or function calls. If you have a bunch of these, it is reasonable to expect that "except Exception" will catch them. Summary: It is a noble thought to save someone from shooting themselves in the foot with a bare except. However, bare excepts are clearly a we-are-all-adults construct. It has valid current use cases and its current meaning is likely the intended meaning. Making the change will break some existing code and produce dubious benefits. The change is at odds with fundamental use cases for user defined control flow exceptions. The change introduces the serious risk of a hard-to-spot optical illusion error where an "except Exception" doesn't catch exceptions that were intended to be caught. 
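To make the use case in point 4 concrete, here is a minimal sketch of a user-defined control-flow exception that exits two loops at once. The names `Found`, `find_first` and `find_first_broad` are hypothetical and appear nowhere in the thread. The class derives from `Exception`, as user exceptions do in the current hierarchy, and the `except ... as` spelling is an assumption of the sketch (Python 2.4 would write `except Found, exc`).

```python
class Found(Exception):
    """Hypothetical control-flow exception carrying a result out of nested loops."""

    def __init__(self, position):
        Exception.__init__(self, position)
        self.position = position


def find_first(grid, target):
    """Return (row, col) of the first cell equal to target, or None."""
    try:
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                if value == target:
                    raise Found((r, c))  # leaves both loops in one step
    except Found as exc:
        return exc.position
    return None


def find_first_broad(grid, target):
    """Same search, but relying on a broad handler instead of naming Found."""
    try:
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                if value == target:
                    raise Found((r, c))
    except Exception as exc:
        # Catches Found only because Found derives from Exception; if the
        # class sat outside Exception, this handler would silently miss it.
        return getattr(exc, "position", None)
    return None
```

The second function is the case the summary worries about: whether the broad handler sees `Found` depends entirely on where the class sits in the hierarchy, and nothing at the `except` line shows it.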
Nice try, but don't do anything this radical without validating that it solves significant problems without introducing worse, unintended effects. Don't break existing code unless there is a darned good reason. Check with the Zope and Twisted people to see if this would improve their lives or make things worse. There are user constituencies that are not being well represented in these discussions. Raymond From bcannon at gmail.com Fri Aug 5 22:00:18 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 5 Aug 2005 13:00:18 -0700 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: <42F1FBAC.6020404@gmail.com> References: <42F1FBAC.6020404@gmail.com> Message-ID: On 8/4/05, Nick Coghlan wrote: > Since I forgot to mention it in the last couple of messages - this version > looks very good. The transition strategy section makes it a lot more meaningful. > Great to hear! > Brett Cannon wrote (in the PEP): > > Renamed Exceptions > > > > Renamed exceptions will directly subclass the new names. When the old > > exceptions are instantiated (which occurs when an exception is caught, > > either by a try statement or by propagating to the top of the execution > > stack), a PendingDeprecationWarning will be raised. > > Nice trick with figuring out how to raise the deprecation warning :) > (That line was going to read 'Why not just create an alias?', but then I > worked out what you were doing, and why you were doing it) > Thanks. > One case that this doesn't completely address is NameError, as it is the only > renamed exception which currently has a subclass. In this case, I think that > during the transition phase, all three of the 'Unbound*Error' exceptions > should inherit from NameError, with NameError inheriting from NamespaceError. > > I believe it should still be possible to get the deprecation warning to work > correctly in this case (by not raising the warning when a subclass is > instantiated). > Ah, didn't think about that issue. 
Yeah, as long as you don't call a superclass' __init__ it should still work. > In the 'just a type' category, WeakReferenceError should still be under > StandardError in the hierarchy. > Yeah, that is an error from trying to add StandardError back in. -Brett From bcannon at gmail.com Fri Aug 5 22:13:21 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 5 Aug 2005 13:13:21 -0700 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: References: Message-ID: On 8/4/05, James Y Knight wrote: > > +-- NamespaceError (rename of NameError) > > +-- UnboundFreeError (new) > > +-- UnboundGlobalError (new) > > +-- UnboundLocalError > > > > What are these new exceptions for? Under what circumstances are they > raised? Why is this necessary or an improvement? > Exceptions relating to when a name is not found in a specific namespace (directly related to bytecode). So UnboundFreeError is raised when the interpreter cannot find a variable that is a free variable. UnboundLocalError already exists. UnboundGlobalError is to prevent NameError from being overloaded. UnboundFreeError is to prevent UnboundLocalError from being overloaded. > > Renamed Exceptions > > > > Renamed exceptions will directly subclass the new names. When the > > old exceptions are instantiated (which occurs when an exception is > > caught, either by a try statement or by propagating to the top of > > the execution stack), a PendingDeprecationWarning will be raised. > > > > This should properly preserve backwards-compatibility as old usage > > won't change and the new names can be used to also catch exceptions > > using the old name. The warning of the deprecation is also kept > > simple. > > This will cause problems when a library raises the exception under > the new name and an app tries to catch the old name. So the standard > lib (or any other lib) cannot raise the new names. 
Because the stdlib > must raise the old names, people will see the old names, continue > catching the old names, and the new names will never catch on. > Crap, you're right. Going to have to think about this more. > Perhaps it'd work out better to have the new names subclass the old > names. Then you have to continue catching the old name as long as > anyone is raising it, but at least you can raise the new name with > impunity. I expect not much code actually raises ReferenceError or > NameError besides that internal to python. Thus it would be > relatively safe to change all code to catch the new names for those > immediately. Lots of code raises RuntimeError, but I bet not very > much code explicitly catches it. > > Oh, but if the stdlib starts raising under the new names, that'll > break any code that checks the exact type of the exception against > the old name. Boo. > > It'd be better to somehow raise a DeprecationWarning upon access, yet > still result in the same object. Unfortunately I don't think there's > any way to do that in python. This lack of ability to deprecate > module attributes has bit me several times in other projects as well. > Matt Goodall wrote the hack attached at the end in order to move some > whole modules around in Nevow. Amazingly it actually seemed to > work. :) Something like that won't work for __builtins__, of course, > since that's accessed directly with PyDict_Get. > > All in all I don't really see a real need for these renamings and I > don't see a way to do them compatibly so I'm -1 to the whole idea of > renaming exceptions. > Well, the new names can go into 2.x, with the old names not removed until 3.0. And there is always a solution. We do control the implementation so something as evil as hacking the exception system to do class-specific checks could work. > > Removal of Bare except Clauses > > > > A SemanticsWarning will be raised for all bare except clauses. 
> > Does this mean that bare except clauses change meaning to "except > Exception" immediately? Or (I hope) did you mean that in Py2.5 they > continue doing as they do now, but print a warning to tell you they > will be changing in the future? They would have a warning for a version, and then change. And this will not necessarily go into 2.5. -Brett From bcannon at gmail.com Fri Aug 5 22:15:14 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 5 Aug 2005 13:15:14 -0700 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: References: Message-ID: On 8/4/05, Guido van Rossum wrote: > In general the PEP looks really good now! > Glad you like it. > On 8/4/05, Willem Broekema wrote: > > On 8/4/05, Brett Cannon wrote: > > > OK, once the cron job comes around and is run, > > > http://www.python.org/peps/pep-0348.html will not be a 404 but be the > > > latest version of the PEP. > > > > Currently, when the "recursion limit" is reached, a RuntimeError is > > raised. RuntimeError is in the PEP renamed to UserError. UserError is > > in the new hierarchy located below StandardError, below Exception. > > > > I think that in the new hierarchy this error should be in the same > > "critical" category as MemoryError. (MemoryError includes general > > stack overflow.) > > No. Usually, a recursion error is a simple bug in the code, no > different from a TypeError or NameError etc. > > This does contradict my earlier claim that Python itself doesn't use > RuntimeError; I think I'd be happier if it remained RuntimeError. (I > think there are a few more uses of it inside Python itself; I don't > think it's worth inventing new exceptions for all these.) > OK, I will not propose renaming RuntimeError. 
-Brett From reinhold-birkenfeld-nospam at wolke7.net Fri Aug 5 22:12:31 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Fri, 05 Aug 2005 22:12:31 +0200 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <002a01c599f7$f29d63e0$cbae2c81@oemcomputer> References: <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com> <002a01c599f7$f29d63e0$cbae2c81@oemcomputer> Message-ID: Raymond Hettinger wrote: > 2. There is a lesson to be taken from a story in the ACM risks forum > where a massive phone outage was traced to a single line of C code that > ran a "break" to get out of a nested if-statement. The interesting part > is that this was known to be mission critical code yet the error > survived multiple, independent code reviews. The problem was that the > code created an optical illusion. We risk the same thing when an > "except Exception" doesn't catch ControlFlowExceptions. The > recovery/logging handler will look like it ought to catch everything, > but it won't. That is a disaster for fault-tolerant coding and for > keeping your sales demo from exploding in front of customers. I think that ControlFlowException should inherit from Exception, because it is an exception. As Raymond says, it's hard to spot this when in a hurry. But looking at the current PEP 348, why not rename BaseException to Exception and Exception to Error? That way, you could say "except Error:" instead of most of today's bare "except:" and it's clear that StopIteration or GeneratorExit won't be caught because they are not errors. Reinhold -- Mail address is perfectly valid! 
From gvanrossum at gmail.com Fri Aug 5 22:16:31 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri, 5 Aug 2005 13:16:31 -0700 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com> References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer> <002301c599ee$09dd6e60$cbae2c81@oemcomputer> <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com> Message-ID: On 8/5/05, Phillip J. Eby wrote: > While I agree with most of your -1's on gratuitous changes, this particular > problem isn't gratuitous. A StopIteration that reaches a regular exception > handler is a programming error; allowing StopIteration and other > control-flow exceptions to be caught other than explicitly *masks* > programming errors. And your point is? If that was the reasoning behind this PEP, it should move TypeError, NameError, AttributeError and a whole bunch of others (even LookupError) out of the StandardError hierarchy too! Those are all clear symptoms of programming errors and are frequently masked by bare 'except:'. The point is not to avoid bare 'except:' from hiding programming errors. There's no hope to obtain that goal. The point is to make *legitimate* uses of bare 'except:' easier -- the typical use case is an application that has some kind of main loop which uses bare 'except:' to catch gross programming errors in other parts of the app, or in code received from an imperfect source (like an end-user script) and recovers by logging the error and continuing. (I was going to say "or clean up and exit", but that use case is handled by 'finally:'.) 
Those legitimate uses often need to make a special case of KeyboardInterrupt and SystemExit -- KeyboardInterrupt because it's not a bug in the code but a request from the user who is *running* the app (and the appropriate default response is to exit with a stack trace); SystemExit because it's not a bug but a deliberate attempt to exit the program -- logging an error would be a mistake.

I think the use cases for moving other exceptions out of the way are weak; MemoryError and SystemError are exceedingly rare and I've never felt the need to exclude them; when GeneratorExit or StopIteration reach the outer level of an app, it's a bug like all the others that bare 'except:' WANTS to catch.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com Fri Aug 5 22:25:45 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 13:25:45 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To:
References:
Message-ID:

On 8/4/05, Guido van Rossum wrote:
> This does contradict my earlier claim that Python itself doesn't use
> RuntimeError; I think I'd be happier if it remained RuntimeError. (I
> think there are a few more uses of it inside Python itself; I don't
> think it's worth inventing new exceptions for all these.)
>

I just realized that keeping RuntimeError still does not resolve the issue that the name kind of sucks for realizing intrinsically that it is for quick-and-dirty exceptions (or am I the only one who thinks this?). Should we toss in a subclass called SimpleError?
-Brett From gvanrossum at gmail.com Fri Aug 5 22:28:36 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri, 5 Aug 2005 13:28:36 -0700 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: References: Message-ID: On 8/5/05, Brett Cannon wrote: > On 8/4/05, Guido van Rossum wrote: > > > This does contradict my earlier claim that Python itself doesn't use > > RuntimeError; I think I'd be happier if it remained RuntimeError. (I > > think there are a few more uses of it inside Python itself; I don't > > think it's worth inventing new exceptions for all these.) > > > > I just realized that keeping RuntimeError still does not resolve the > issue that the name kind of sucks for realizing intrinsically that it > is for quick-and-dirty exceptions (or am I the only one who thinks > this?). Should we toss in a subclass called SimpleError? I don't think so. People should feel free to use whatever pre-existing exception they like, even Exception. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Fri Aug 5 22:31:52 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 5 Aug 2005 13:31:52 -0700 Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0 In-Reply-To: <001301c599db$9681be60$cbae2c81@oemcomputer> References: <001301c599db$9681be60$cbae2c81@oemcomputer> Message-ID: On 8/5/05, Raymond Hettinger wrote: > [ Guido] > > One more thing. Is renaming NameError to NamespaceError really worth > > it? I'd say that NameError is just as clear. > > +1 on NameError -- it's clear, easy to type, isn't a gratuitous change, > and doesn't make you think twice about NamespaceError vs NameSpaceError. > OK, I will remove the name change proposal. 
-Brett

From python at discworld.dyndns.org Fri Aug 5 22:39:04 2005
From: python at discworld.dyndns.org (Charles Cazabon)
Date: Fri, 5 Aug 2005 14:39:04 -0600
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To:
References:
Message-ID: <20050805203904.GA30701@discworld.dyndns.org>

Brett Cannon wrote:
> On 8/4/05, Guido van Rossum wrote:
>
> I just realized that keeping RuntimeError still does not resolve the
> issue that the name kind of sucks for realizing intrinsically that it
> is for quick-and-dirty exceptions (or am I the only one who thinks
> this?). Should we toss in a subclass called SimpleError?

Much Python code I've looked at uses ValueError for this purpose. Would adding a special exception add much utility?

Charles
--
-----------------------------------------------------------------------
Charles Cazabon
GPL'ed software available at: http://pyropus.ca/software/
-----------------------------------------------------------------------

From bcannon at gmail.com Fri Aug 5 23:05:00 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 14:05:00 -0700
Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0
In-Reply-To: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer>
References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer>
Message-ID:

On 8/5/05, Raymond Hettinger wrote:
> Also strong -1 on renaming RuntimeWarning to SemanticsWarning.
>
> Besides being another unnecessary change (trying to solve a non-existent
> problem), this isn't an improvement. The phrase RuntimeWarning is
> sufficiently generic to allow it to be used for a number of purposes.
> In contrast, SemanticsWarning is less flexible. Worse, it is not at all
> clear what a Semantics Warning would mean -- it suggests something much
> more ominous and complicated than it should.
>

But the docs don't say that RuntimeWarning is meant as a generic warning but for dubious runtime behavior being changed. If it is truly meant to be generic (I think of UserWarning for that), then fine, I can let go of the name change. But it just took a friend of mine with no exposure to the warning system to understand what it meant.

> Another risk from gratuitous changes is the risk of unexpectedly
> introducing new problems. In this case, I find myself remembering the
> name as SemanticWarning instead of SemanticsWarning. These kind of
> changes suck -- they fail to take advantage of 15 years of field testing
> and risk introducing hard-to-change usability problems.
>

OK, I can see the typos from that, but I still think RuntimeWarning and RuntimeError, for use as a generic exception, suck as names.

> Likewise, am a strong -1 on renaming RuntimeError to UserError. The
> latter name has some virtues but it is also misread as the User doing
> something wrong -- that is definitely not the intended meaning. While
> RuntimeError is a less than perfect name, it should not be changed
> unless we have both 1) demonstrated that real world problems have
> occurred with the current name and 2) that we have a clearly superior
> alternative name (a test which UserError fails). The only virtue to the
> name, UserError, is its symmetry with UserWarning.
>

SimpleError?

> -0 on renaming ReferenceError to WeakReferenceError. The new name does
> better suggest the cause. OTOH, the context of the traceback would also
> make that perfectly clear. I'm not aware of a single user having had a
> problem with the current name. In general, we've avoided long names in
> favor of the short and pithy -- the theory was that only a mnemonic
> is needed. Before adopting this one, there should be some discussion of
> 1) whether the current name is really that unclear, 2) whether shorter
> alternatives would serve (i.e. WeakrefError), and 3) whether the name
> suffers from capitalization ambiguity (WeakreferenceError vs
> WeakReferenceError).
>

Well, I didn't know what the exception was for until I read the docs.
Granted this was just from looking at ``import exceptions; dir(exceptions)``, but why shouldn't the names be that obvious? And I don't see a capitalization ambiguity; if it was WeakrefError, sure. But not when the entire phrase is used. > Summary: Most of the proposed name changes are unnecessary, the new > names are not necessarily better, and there is a high risk of > introducing new usability problems. > I still think RuntimeError (and RuntimeWarning if that is what it is meant for) sucks as a name for a generic exception. I didn't know that was its use until I read the docs and Guido pointed out during the discussion of this thread. I am willing to compromise with a new exception that inherits RuntimeError named SimpleError (or the inheritance can be flipped). -Brett From bcannon at gmail.com Fri Aug 5 23:09:30 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 5 Aug 2005 14:09:30 -0700 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <002901c599f3$24bd1000$cbae2c81@oemcomputer> References: <002901c599f3$24bd1000$cbae2c81@oemcomputer> Message-ID: On 8/5/05, Raymond Hettinger wrote: > > > When a user creates their own exception for exiting multiple levels > of > > > loops or frames, should they inherit from ControlFlowException on > the > > > theory that it no different in intent from StopIteration or should > they > > > inherit from UserError on the theory that it is a custom exception? > > > > I say ControlFlowException. UserError is meant for quick-and-dirty > > exception usage and not as a base for user error exceptions. If the > > name is confusing it can be changed to SimpleError. > > Gads. It sounds like you're just making this up on the fly. The > process should be disciplined, grounded in use cases, and aimed at > known, real problems with the current hierarchy. > It is based on a real use case; my own. 
As I said in another email I just sent, I had no clue that RuntimeError was meant to be used as a generic exception until Guido pointed it out. -Brett From mdehoon at c2b2.columbia.edu Fri Aug 5 23:18:46 2005 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Fri, 5 Aug 2005 17:18:46 -0400 Subject: [Python-Dev] [ python-Patches-790710 ] breakpoint command listsinpdb Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC28B@cgcmail.cgc.cpmc.columbia.edu> > "Gr?goire Dooms" wrote in message > news:42F32EF7.6050208 at info.ucl.ac.be... > >This patch is about to celebrate its second birthday :-) > >What should I do to get it reviewed further ? > The guaranteed-by-a-couple-of-developers way is to review 5 other patches, > post a summary here, and name this as the one you want reviewed in > exchange. > TJR Speaking of the five-patch-review-rule, about two months ago I reviewed five patches and posted a summary here in order to push patch #1049855. This patch is still waiting for a verdict (this is also my own fault, since I needed several iterations to get this patch straightened out; my apologies for that). Is there anything else I can do for this patch? --Michiel. 
From bcannon at gmail.com Fri Aug 5 23:20:34 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 5 Aug 2005 14:20:34 -0700 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer> <002301c599ee$09dd6e60$cbae2c81@oemcomputer> <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com> Message-ID: On 8/5/05, Guido van Rossum wrote: [SNIP] > Those legitimate uses often need to make a special case of > Keyboardinterrupt and SystemExit -- KeyboardInterrupt because it's not > a bug in the code but a request from the user who is *running* the app > (and the appropriate default response is to exit with a stack trace); > SystemExit because it's not a bug but a deliberate attempt to exit the > program -- logging an error would be a mistake. > > I think the use cases for moving other exceptions out of the way are > weak; MemoryError and SystemError are exceedingly rare and I've never > felt the need to exclude them; when GeneratorExit or StopIteration > reach the outer level of an app, it's a bug like all the others that > bare 'except:' WANTS to catch. > So are you saying you would rather ditch all reorganization suggestions and just have SystemExit and KeyboardInterrupt inherit directly from BaseException, and keep the bare 'except' change and required superclass inheritance suggestions? Would this appease everyone else? If this is what people want, fine. But I am still going to suggest CriticalError stay since they are not caused by programmer error directly (I am ignoring C extension module screw-ups that devour memory). 
-Brett From bcannon at gmail.com Fri Aug 5 23:21:51 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 5 Aug 2005 14:21:51 -0700 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: References: Message-ID: On 8/5/05, Guido van Rossum wrote: > On 8/5/05, Brett Cannon wrote: > > On 8/4/05, Guido van Rossum wrote: > > > > > This does contradict my earlier claim that Python itself doesn't use > > > RuntimeError; I think I'd be happier if it remained RuntimeError. (I > > > think there are a few more uses of it inside Python itself; I don't > > > think it's worth inventing new exceptions for all these.) > > > > > > > I just realized that keeping RuntimeError still does not resolve the > > issue that the name kind of sucks for realizing intrinsically that it > > is for quick-and-dirty exceptions (or am I the only one who thinks > > this?). Should we toss in a subclass called SimpleError? > > I don't think so. People should feel free to use whatever pre-existing > exception they like, even Exception. > Fine, the idea is pulled. -Brett From bcannon at gmail.com Fri Aug 5 23:23:56 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 5 Aug 2005 14:23:56 -0700 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: <003201c599fb$1e8c9d60$cbae2c81@oemcomputer> References: <003201c599fb$1e8c9d60$cbae2c81@oemcomputer> Message-ID: On 8/5/05, Raymond Hettinger wrote: > > On 8/4/05, James Y Knight wrote: > > > > +-- NamespaceError (rename of NameError) > > > > +-- UnboundFreeError (new) > > > > +-- UnboundGlobalError (new) > > > > +-- UnboundLocalError > > > > > > > > > > What are these new exceptions for? Under what circumstances are they > > > raised? Why is this necessary or an improvement? > > > > > > > Exceptions relating to when a name is not found in a specific > > namespace (directly related to bytecode). So UnboundFreeError is > > raised when the interpreter cannot find a variable that is a free > > variable. 
UnboundLocalError already exists. UnboundGlobalError is to > > prevent NameError from being overloaded. UnboundFreeError is to > > prevent UnboundLocalError from being overloaded > > Do we have any use cases for making the distinctions. I have NEVER had > a reason to write a different handler for the various types of > NameError. > > Also, everyone knows what a Global is. Can the same be said for Free? > I had thought that to be a implementation detail rather than part of the > language spec. > Perhaps then we should just ditch UnboundLocalError? If we just make sure we have good messages to go with the exceptions the reasons for the exception should be obvious. -Brett From raymond.hettinger at verizon.net Sat Aug 6 01:18:09 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 05 Aug 2005 19:18:09 -0400 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: Message-ID: <005f01c59a13$f1d8caa0$cbae2c81@oemcomputer> > > > > > +-- NamespaceError (rename of NameError) > > > > > +-- UnboundFreeError (new) > > > > > +-- UnboundGlobalError (new) > > > > > +-- UnboundLocalError > > > > > > > > > > > > > What are these new exceptions for? Under what circumstances are they > > > > raised? Why is this necessary or an improvement? [James Y Knight] > > > Exceptions relating to when a name is not found in a specific > > > namespace (directly related to bytecode). So UnboundFreeError is > > > raised when the interpreter cannot find a variable that is a free > > > variable. UnboundLocalError already exists. UnboundGlobalError is to > > > prevent NameError from being overloaded. UnboundFreeError is to > > > prevent UnboundLocalError from being overloaded [Raymond] > > Do we have any use cases for making the distinctions. I have NEVER had > > a reason to write a different handler for the various types of > > NameError. > > > > Also, everyone knows what a Global is. Can the same be said for Free? 
> > I had thought that to be an implementation detail rather than part of the
> > language spec.

[Brett]
> Perhaps then we should just ditch UnboundLocalError?

Perhaps the hierarchy should be left unchanged unless there is shown to be something wrong with it. "just ditching" something is not a rationale that warrants a language change. What problem is being solved by making additions or deletions to subclasses of NameError?

> If we just make
> sure we have good messages to go with the exceptions the reasons for
> the exception should be obvious.

+1

Raymond

From ncoghlan at gmail.com Sat Aug 6 11:33:45 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 06 Aug 2005 19:33:45 +1000
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To:
References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer> <002301c599ee$09dd6e60$cbae2c81@oemcomputer> <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com>
Message-ID: <42F483F9.2080503@gmail.com>

Guido van Rossum wrote:
> The point is not to avoid bare 'except:' from hiding programming
> errors. There's no hope to obtain that goal.
>
> The point is to make *legitimate* uses of bare 'except:' easier -- the
> typical use case is an application that has some kind of main loop
> which uses bare 'except:' to catch gross programming errors in other
> parts of the app, or in code received from an imperfect source (like
> an end-user script) and recovers by logging the error and continuing.
> (I was going to say "or clean up and exit", but that use case is
> handled by 'finally:'.)
>
> Those legitimate uses often need to make a special case of
> KeyboardInterrupt and SystemExit -- KeyboardInterrupt because it's not
> a bug in the code but a request from the user who is *running* the app
> (and the appropriate default response is to exit with a stack trace);
> SystemExit because it's not a bug but a deliberate attempt to exit the
> program -- logging an error would be a mistake.
>
> I think the use cases for moving other exceptions out of the way are
> weak; MemoryError and SystemError are exceedingly rare and I've never
> felt the need to exclude them; when GeneratorExit or StopIteration
> reach the outer level of an app, it's a bug like all the others that
> bare 'except:' WANTS to catch.

To try to turn this idea into a concrete example, the idea would be to make the following code work correctly:

   for job in joblist:
       try:
           job.exec()
       except: # or "except Exception:"
           failed_jobs.append((job, sys.exc_info()))

Currently, this code will make a user swear, as Ctrl-C will cause the program to move on to the next job, instead of exiting as you would expect (I have found Python scripts not exiting when I press Ctrl-C to be an all-too-common problem).

Additionally calling sys.exit() inside a job will fail. This may be deliberate (to prevent a job from exiting the whole application), but given only the code above, it looks like a bug.

The program will attempt to continue in the face of a MemoryError. This is actually reasonable, as memory may have been freed as the stack unwound to the level of the job execution loop, or the request that failed may have been for a ridiculously large amount of memory.

The program will also attempt to continue in the face of a SystemError. This is reasonable too, as SystemError is only used when the VM thinks the current operation needs to be aborted due to an internal problem in the VM, but the VM itself is still safe to use. If the VM thinks something is seriously wrong with the internal data structures, it will kill the process with Py_FatalError (to ensure that no further Python code is executed), rather than raise SystemError.

As others have pointed out, GeneratorExit and StopIteration should never reach the job execution loop - if they do, there's a bug in the job, and they should be caught and logged.
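
A minimal sketch of the behaviour described above, with jobs modelled as plain callables rather than objects with a method. The names `run_jobs`, `interrupted`, `exits` and `buggy` are hypothetical and exist only for this illustration; the point is simply that a bare 'except:' records a simulated Ctrl-C and a sys.exit() call as ordinary job failures and carries on.

```python
import sys


def run_jobs(joblist):
    """Run every job and record failures, mirroring the loop above."""
    failed_jobs = []
    for job in joblist:
        try:
            job()
        except:  # bare except: also traps KeyboardInterrupt and SystemExit
            failed_jobs.append((job, sys.exc_info()[0]))
    return failed_jobs


def interrupted():
    # Stands in for the user pressing Ctrl-C while the job runs.
    raise KeyboardInterrupt


def exits():
    # A job that deliberately tries to end the whole program.
    sys.exit(1)


def buggy():
    # An ordinary programming error inside a job.
    raise ValueError("bad input")


failed = run_jobs([interrupted, exits, buggy])
print([exc_type.__name__ for _, exc_type in failed])
```

All three jobs end up in the failure list and the loop runs to completion, which is exactly the "Ctrl-C moves on to the next job" complaint.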
That covers the six exceptions that have been proposed to be moved out from under "Exception", and, as I see it, only two of them end up making the grade - SystemExit and KeyboardInterrupt, for exactly the reasons Guido gives in his message above.

This suggests a Py3k exception hierarchy that looks like:

   BaseException
   +-- CriticalException
       +-- SystemExit
       +-- KeyboardInterrupt
   +-- Exception
       +-- GeneratorExit
       +-- (Remainder as for Python 2.4, other than KeyboardInterrupt)

With a transitional 2.x hierarchy that looks like:

   BaseException
   +-- CriticalException
       +-- SystemExit
       +-- KeyboardInterrupt
   +-- Exception
       +-- GeneratorExit
       +-- (Remainder exactly as for Python 2.4)

The reason for the CriticalException parent is that Python 2.x code can be made 'correct' by doing:

   try:
       # whatever
   except CriticalException:
       raise
   except: # or 'except Exception'
       # Handle everything non-critical

And, the hypothetical job execution loop above can be updated to:

   for job in joblist:
       try:
           job.exec()
       except CriticalException:
           failed_jobs.append((job, sys.exc_info()))
           job_idx = joblist.index(job)
           skipped_jobs.extend(joblist[job_idx+1:])
           raise
       except: # or "except Exception:"
           failed_jobs.append((job, sys.exc_info()))

To tell the truth, if bare except is kept around for Py3k, I would prefer to see it catch BaseException rather than Exception. Failing that, I would prefer to see it removed. Having it catch something other than the root of the exception hierarchy would be just plain confusing.

Moving SystemExit and KeyboardInterrupt is the only change we've considered which seems to have a genuine motivating use case. The rest of the changes suggested don't seem to be solving an actual problem (or are solving a problem that is minor enough to be not worth any backward compatibility pain).

Cheers,
Nick.
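
A runnable sketch of the updated job loop, under stated assumptions: CriticalException is only a proposal in this thread, so the tuple `CRITICAL` below stands in for it, and `run_jobs`, `buggy`, `interrupted` and `never_run` are hypothetical names. Jobs are again modelled as plain callables.

```python
import sys

# Stand-in for the proposed CriticalException parent class. This is an
# assumption for illustration; no built-in of that name exists.
CRITICAL = (SystemExit, KeyboardInterrupt)


def run_jobs(joblist, failed_jobs, skipped_jobs):
    """Run jobs, logging ordinary failures and stopping on critical ones."""
    for idx, job in enumerate(joblist):
        try:
            job()
        except CRITICAL:
            # Record the job, note what was never attempted, then let the
            # critical exception propagate to the caller.
            failed_jobs.append((job, sys.exc_info()[0]))
            skipped_jobs.extend(joblist[idx + 1:])
            raise
        except Exception:
            # Everything non-critical is logged and the loop continues.
            failed_jobs.append((job, sys.exc_info()[0]))


def buggy():
    raise ValueError("bad input")


def interrupted():
    # Stands in for the user pressing Ctrl-C while the job runs.
    raise KeyboardInterrupt


def never_run():
    pass


failed, skipped = [], []
try:
    run_jobs([buggy, interrupted, never_run], failed, skipped)
except KeyboardInterrupt:
    stopped_early = True
else:
    stopped_early = False

print([t.__name__ for _, t in failed], skipped == [never_run], stopped_early)
```

Because the critical clause is listed first, the sketch behaves the same whether or not KeyboardInterrupt and SystemExit happen to inherit from Exception in the interpreter running it.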
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From nas at arctrix.com Sat Aug 6 12:23:42 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Sat, 6 Aug 2005 04:23:42 -0600 Subject: [Python-Dev] PEP: Generalised String Coercion Message-ID: <20050806102342.GA11309@mems-exchange.org> The title is perhaps a little too grandiose but it's the best I could think of. The change is really not large. Personally, I would be happy enough if only %s was changed and the built-in was not added. Please comment. Neil PEP: 349 Title: Generalised String Coercion Version: $Revision: 1.2 $ Last-Modified: $Date: 2005/08/06 04:05:48 $ Author: Neil Schemenauer Status: Draft Type: Standards Track Content-Type: text/plain Created: 02-Aug-2005 Post-History: 06-Aug-2005 Python-Version: 2.5 Abstract This PEP proposes the introduction of a new built-in function, text(), that provides a way of generating a string representation of an object without forcing the result to be a particular string type. In addition, the behavior %s format specifier would be changed to call text() on the argument. These two changes would make it easier to write library code that can be used by applications that use only the str type and by others that also use the unicode type. Rationale Python has had a Unicode string type for some time now but use of it is not yet widespread. There is a large amount of Python code that assumes that string data is represented as str instances. The long term plan for Python is to phase out the str type and use unicode for all string data. Clearly, a smooth migration path must be provided. We need to upgrade existing libraries, written for str instances, to be made capable of operating in an all-unicode string world. We can't change to an all-unicode world until all essential libraries are made capable for it. 
Upgrading the libraries in one shot does not seem feasible. A more realistic strategy is to individually make the libraries capable of operating on unicode strings while preserving their current all-str environment behaviour. First, we need to be able to write code that can accept unicode instances without attempting to coerce them to str instances. Let us label such code as Unicode-safe. Unicode-safe libraries can be used in an all-unicode world. Second, we need to be able to write code that, when provided only str instances, will not create unicode results. Let us label such code as str-stable. Libraries that are str-stable can be used by libraries and applications that are not yet Unicode-safe. Sometimes it is simple to write code that is both str-stable and Unicode-safe. For example, the following function just works: def appendx(s): return s + 'x' That's not too surprising since the unicode type is designed to make the task easier. The principle is that when str and unicode instances meet, the result is a unicode instance. One notable difficulty arises when code requires a string representation of an object; an operation traditionally accomplished by using the str() built-in function. Using str() makes the code not Unicode-safe. Replacing a str() call with a unicode() call makes the code not str-stable. Using a string format almost accomplishes the goal but not quite. Consider the following code: def text(obj): return '%s' % obj It behaves as desired except if 'obj' is not a basestring instance and needs to return a Unicode representation of itself. In that case, the string format will attempt to coerce the result of __str__ to a str instance. Defining a __unicode__ method does not help since it will only be called if the right-hand operand is a unicode instance. Using a unicode instance for the right-hand operand does not work because the function is no longer str-stable (i.e. it will coerce everything to unicode). 
Specification

    A Python implementation of the text() built-in follows:

        def text(s):
            """Return a nice string representation of the object.  The
            return value is a basestring instance.
            """
            if isinstance(s, basestring):
                return s
            r = s.__str__()
            if not isinstance(r, basestring):
                raise TypeError('__str__ returned non-string')
            return r

    Note that it is currently possible, although not very useful, to write __str__ methods that return unicode instances.

    The %s format specifier for str objects would be changed to call text() on the argument. Currently it calls str() unless the argument is a unicode instance (in which case the object is substituted as is and the % operation returns a unicode instance).

    The following function would be added to the C API and would be the equivalent of the text() function:

        PyObject *PyObject_Text(PyObject *o);

    A reference implementation is available on Sourceforge [1] as a patch.

Backwards Compatibility

    The change to the %s format specifier would result in some % operations returning a unicode instance rather than raising a UnicodeDecodeError exception. It seems unlikely that the change would break currently working code.

Alternative Solutions

    Rather than adding the text() built-in, if PEP 246 were implemented then adapt(s, basestring) could be equivalent to text(s). The advantage would be one less built-in function. The problem is that PEP 246 is not implemented.

    Fredrik Lundh has suggested [2] that perhaps a new slot should be added (e.g. __text__), that could return any kind of string that's compatible with Python's text model. That seems like an attractive idea but many details would still need to be worked out.

    Instead of providing the text() built-in, the %s format specifier could be changed and a string format could be used instead of calling text(). However, it seems like the operation is important enough to justify a built-in.

    Instead of providing the text() built-in, the basestring type could be changed to provide the same functionality.
That would possibly be confusing behaviour for an abstract base type. Some people have suggested [3] that an easier migration path would be to change the default encoding to be UTF-8. Code that is not Unicode safe would then encode Unicode strings as UTF-8 and operate on them as str instances, rather than raising a UnicodeDecodeError exception. Other code would assume that str instances were encoded using UTF-8 and decode them if necessary. While that solution may work for some applications, it seems unsuitable as a general solution. For example, some applications get string data from many different sources and assuming that all str instances were encoded using UTF-8 could easily introduce subtle bugs. References [1] http://www.python.org/sf/1159501 [2] http://mail.python.org/pipermail/python-dev/2004-September/048755.html [3] http://blog.ianbicking.org/illusive-setdefaultencoding.html Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: From amk at amk.ca Sat Aug 6 14:10:01 2005 From: amk at amk.ca (A.M. Kuchling) Date: Sat, 6 Aug 2005 08:10:01 -0400 Subject: [Python-Dev] PEP 8: exception style Message-ID: <20050806121001.GC16042@rogue.amk.ca> PEP 8 doesn't express any preference between the two forms of raise statements: raise ValueError, 'blah' raise ValueError("blah") I like the second form better, because if the exception arguments are long or include string formatting, you don't need to use line continuation characters because of the containing parens. Grepping through the library code, the first form is in the majority, used roughly 60% of the time. Should PEP 8 take a position on this? If yes, which one? 
--amk From raymond.hettinger at verizon.net Sat Aug 6 18:12:35 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 06 Aug 2005 12:12:35 -0400 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <42F483F9.2080503@gmail.com> Message-ID: <001201c59aa1$cb1fd3c0$441dc797@oemcomputer> [Nick Coghlan] > As others have pointed out, GeneratorExit and StopIteration should never > reach > the job execution loop - if they do, there's a bug in the job, and they > should > be caught and logged. Please read my other, detailed post on this (8/5/2005 4:05pm). It is a mistake to bypass control flow exceptions like GeneratorExit and StopIteration. Those need to remain under Exception. Focus on your core use case of eliminating the common idiom: try: block() except KeyboardInterrupt: raise except: pass # or some handler/logger In real code, I've never seen the above idiom used with StopIteration. Read Guido's note and my note. There are plenty of valid use cases for a bare except intending to catch almost everything including programming errors from NameError to StopIteration. It is a consenting-adults construct. Your proposal breaks a major use case for it (preventing your sales demos from crashing in front of your customers, writing fault-tolerant programs, etc.) Raymond From fumanchu at amor.org Sat Aug 6 18:33:05 2005 From: fumanchu at amor.org (Robert Brewer) Date: Sat, 6 Aug 2005 09:33:05 -0700 Subject: [Python-Dev] PEP 8: exception style Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3772711@exchange.hqamor.amorhq.net> A.M. Kuchling wrote: > PEP 8 doesn't express any preference between the > two forms of raise statements: > raise ValueError, 'blah' > raise ValueError("blah") > > I like the second form better, because if the exception arguments are > long or include string formatting, you don't need to use line > continuation characters because of the containing parens. 
Grepping > through the library code, the first form is in the majority, used > roughly 60% of the time. > > Should PEP 8 take a position on this? If yes, which one? I like the second form better, because even intermediate Pythonistas sometimes make a mistake between: raise ValueError, A and raise (ValueError, A) I'd like to see the first form removed in Python 3k, to help reduce the ambiguity. But PEP 8 taking a stand on it would be a good start for now. Robert Brewer System Architect Amor Ministries fumanchu at amor.org From gvanrossum at gmail.com Sat Aug 6 19:10:54 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 6 Aug 2005 10:10:54 -0700 Subject: [Python-Dev] PEP 8: exception style In-Reply-To: <20050806121001.GC16042@rogue.amk.ca> References: <20050806121001.GC16042@rogue.amk.ca> Message-ID: On 8/6/05, A.M. Kuchling wrote: > PEP 8 doesn't express any preference between the > two forms of raise statements: > raise ValueError, 'blah' > raise ValueError("blah") > > I like the second form better, because if the exception arguments are > long or include string formatting, you don't need to use line > continuation characters because of the containing parens. Grepping > through the library code, the first form is in the majority, used > roughly 60% of the time. > > Should PEP 8 take a position on this? If yes, which one? Definitely ValueError('blah'). The other form will go away in Python 3000. Please update the PEP. 
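
A minimal sketch of the point about parentheses made in this thread, written for a present-day Python 3 interpreter (where only the call form is accepted). The helper name check_port and its message are hypothetical, chosen only for illustration:

```python
def check_port(port):
    """Validate a TCP port number (hypothetical helper for illustration)."""
    if not 0 < port < 65536:
        # The parentheses of the constructor call let a long, formatted
        # message span several lines without backslash continuations.
        raise ValueError(
            "port must be between 1 and 65535, got %r"
            % (port,)
        )
    return port
```

The exception is built by an ordinary constructor call, so the same expression works unchanged outside a raise statement.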
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Sat Aug 6 19:10:09 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 06 Aug 2005 13:10:09 -0400 Subject: [Python-Dev] FW: PEP 8: exception style Message-ID: <001601c59aa9$b38deaa0$441dc797@oemcomputer> > PEP 8 doesn't express any preference between the > two forms of raise statements: > raise ValueError, 'blah' > raise ValueError("blah") > > I like the second form better, because if the exception arguments are > long or include string formatting, you don't need to use line > continuation characters because of the containing parens. Grepping > through the library code, the first form is in the majority, used > roughly 60% of the time. > > Should PEP 8 take a position on this? If yes, which one? I we had to pick one, I would also choose the second form. But why bother inflicting our preference on others, both forms are readable so we won't gain anything by dictating a style. Raymond From tim.peters at gmail.com Sat Aug 6 19:37:04 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 6 Aug 2005 13:37:04 -0400 Subject: [Python-Dev] FW: PEP 8: exception style In-Reply-To: <001601c59aa9$b38deaa0$441dc797@oemcomputer> References: <001601c59aa9$b38deaa0$441dc797@oemcomputer> Message-ID: <1f7befae0508061037179151c0@mail.gmail.com> [AMK] >> PEP 8 doesn't express any preference between the >> two forms of raise statements: >> raise ValueError, 'blah' >> raise ValueError("blah") >> >> I like the second form better, because if the exception arguments are >> long or include string formatting, you don't need to use line >> continuation characters because of the containing parens. Grepping >> through the library code, the first form is in the majority, used >> roughly 60% of the time. >> >> Should PEP 8 take a position on this? If yes, which one? [Raymond Hettinger] > I we had to pick one, I would also choose the second form. 
But why > bother inflicting our preference on others, both forms are readable so > we won't gain anything by dictating a style. Ongoing cruft reduction -- TOOWTDI. The first form was necessary at Python's start because exceptions were strings, and strings aren't callable, and there needed to be _some_ way to spell "and here's the detail associated with the exception". "raise" grew special syntax to support that need. In a Python without string exceptions, that syntax isn't needed, and becomes (over time) an increasingly obscure way to invoke an ordinary constructor -- ValueError("blah") does exactly the same thing in a raise statement as it does in any other context, and transforming `ValueError, 'blah'` into the former becomes a wart unique to raise statements. From tjreedy at udel.edu Sat Aug 6 22:28:56 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 6 Aug 2005 16:28:56 -0400 Subject: [Python-Dev] PEP 8: exception style References: <20050806121001.GC16042@rogue.amk.ca> Message-ID: "Guido van Rossum" wrote in message news:ca471dc205080610104fb870ac at mail.gmail.com... > On 8/6/05, A.M. Kuchling wrote: >> PEP 8 doesn't express any preference between the >> two forms of raise statements: >> raise ValueError, 'blah' >> raise ValueError("blah") >> >> I like the second form better, because if the exception arguments are >> long or include string formatting, you don't need to use line >> continuation characters because of the containing parens. Grepping >> through the library code, the first form is in the majority, used >> roughly 60% of the time. >> >> Should PEP 8 take a position on this? If yes, which one? > > Definitely ValueError('blah'). The other form will go away in Python > 3000. Please update the PEP. Great. PEP 3000 could also be updated to add the line The raise Error,'blah' syntax: use raise Error('blah') instead [14] in the To be removed section after the line on string exceptions and [14] under references. Terry J. 
Reedy From tjreedy at udel.edu Sat Aug 6 22:31:21 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 6 Aug 2005 16:31:21 -0400 Subject: [Python-Dev] Generalised String Coercion References: <20050806102342.GA11309@mems-exchange.org> Message-ID: > PEP: 349 > Title: Generalised String Coercion ... > Rationale > Python has had a Unicode string type for some time now but use of > it is not yet widespread. There is a large amount of Python code > that assumes that string data is represented as str instances. > The long term plan for Python is to phase out the str type and use > unicode for all string data. This PEP strikes me as premature, as putting the toy wagon before the horse, since it is premised on a major change to Python, possibly the most disruptive and controversial ever, being a done deal. However, there is, as far as I could find, no PEP on Making Strings be Unicode, let alone a discussed, debated, and finalized PEP on the subject. > Clearly, a smooth migration path must be provided. Of course. But the path depends on the detailed final target, which has not, as far as I know, been finalized, and certainly not in the needed PEP. Your proposal might be part of the transition section of such a PEP or of a separate migration path PEP. Terry J. Reedy From bcannon at gmail.com Sun Aug 7 01:14:32 2005 From: bcannon at gmail.com (Brett Cannon) Date: Sat, 6 Aug 2005 16:14:32 -0700 Subject: [Python-Dev] PEP 8: exception style In-Reply-To: References: <20050806121001.GC16042@rogue.amk.ca> Message-ID: On 8/6/05, Guido van Rossum wrote: > On 8/6/05, A.M. Kuchling wrote: > > PEP 8 doesn't express any preference between the > > two forms of raise statements: > > raise ValueError, 'blah' > > raise ValueError("blah") > > > > I like the second form better, because if the exception arguments are > > long or include string formatting, you don't need to use line > > continuation characters because of the containing parens.
Grepping > > through the library code, the first form is in the majority, used > > roughly 60% of the time. > > > > Should PEP 8 take a position on this? If yes, which one? > > Definitely ValueError('blah'). The other form will go away in Python > 3000. Please update the PEP. > Done. rev. 1.18 . -Brett From ncoghlan at gmail.com Sun Aug 7 03:52:01 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 07 Aug 2005 11:52:01 +1000 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <001201c59aa1$cb1fd3c0$441dc797@oemcomputer> References: <001201c59aa1$cb1fd3c0$441dc797@oemcomputer> Message-ID: <42F56941.401@gmail.com> Raymond Hettinger wrote: > Please read my other, detailed post on this (8/5/2005 4:05pm). It is a > mistake to bypass control flow exceptions like GeneratorExit and > StopIteration. Those need to remain under Exception. This is the paragraph after the one you replied to above: [Nick Coghlan] >> That covers the six exceptions that have been proposed to be moved out >> from under "Exception", and, as I see it, only two of them end up making >> the grade - SystemExit and KeyboardInterrupt, for exactly the reasons >> Guido gives in his message above. The remainder of my message then goes on to describe a hierarchy just as you suggest - SystemError, MemoryError, StopIteration and GeneratorExit are all still caught by "except Exception:". The only two exceptions which are no longer caught by "except Exception:" are KeyboardInterrupt and SystemExit. Cheers, Nick. 
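
The hierarchy described in this message can be sketched as follows, assuming a present-day Python 3 interpreter, in which KeyboardInterrupt and SystemExit do not derive from Exception. The names run_job and job are hypothetical stand-ins for the "job execution loop" mentioned earlier in the thread:

```python
def run_job(job):
    """Run one job, logging ordinary failures instead of crashing."""
    try:
        job()
    except Exception as exc:
        # StopIteration, MemoryError, SystemError and programming errors
        # all land here; KeyboardInterrupt and SystemExit pass through.
        return "logged %s" % type(exc).__name__
    return "ok"
```

With this shape, the "except KeyboardInterrupt: raise" clause from the older idiom is no longer needed in front of the general handler.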
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From gvanrossum at gmail.com Sun Aug 7 03:56:39 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 6 Aug 2005 18:56:39 -0700 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> Message-ID: [Removed python-list CC] On 8/6/05, Terry Reedy wrote: > > PEP: 349 > > Title: Generalised String Coercion > ... > > Rationale > > Python has had a Unicode string type for some time now but use of > > it is not yet widespread. There is a large amount of Python code > > that assumes that string data is represented as str instances. > > The long term plan for Python is to phase out the str type and use > > unicode for all string data. > > This PEP strikes me as premature, as putting the toy wagon before the > horse, since it is premised on a major change to Python, possibly the most > disruptive and controversial ever, being a done deal. However there is, as > far as I could find no PEP on Making Strings be Unicode, let alone a > discussed, debated, and finalized PEP on the subject. True. OTOH, Jython and IronPython already have this, and it is my definite plan to make all strings Unicode in Python 3000. The rest (such as a bytes datatype) is details, as they say. :-) My first response to the PEP, however, is that instead of a new built-in function, I'd rather relax the requirement that str() return an 8-bit string -- after all, int() is allowed to return a long, so why couldn't str() be allowed to return a Unicode string? The main problem for a smooth Unicode transition remains I/O, in my opinion; I'd like to see a PEP describing a way to attach an encoding to text files, and a way to decide on a default encoding for stdin, stdout, stderr.
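
For readers on a present-day interpreter, one way the idea of attaching an encoding to a text file looks in Python 3 is sketched below. This is not something that existed when the message was written; the helper name roundtrip is hypothetical, and the encoding argument of the built-in open() is the Python 3 spelling:

```python
import os
import tempfile


def roundtrip(text, encoding):
    """Write text with an explicit encoding, then return (raw bytes, decoded text)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # The encoding is attached to the file object when it is opened.
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, "rb") as f:
            raw = f.read()
        with open(path, "r", encoding=encoding) as f:
            return raw, f.read()
    finally:
        os.remove(path)
```

The bytes on disk depend on the chosen encoding, while the program only ever handles text.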
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Sun Aug 7 05:06:45 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 06 Aug 2005 23:06:45 -0400 Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0 In-Reply-To: <42F56941.401@gmail.com> Message-ID: <000601c59afd$0b98e4e0$4a05a044@oemcomputer> > The remainder of my message then goes on to describe a hierarchy just as > you > suggest - SystemError, MemoryError, StopIteration and GeneratorExit are > all > still caught by "except Exception:". The only two exceptions which are no > longer caught by "except Exception:" are KeyboardInterrupt and SystemExit. Ah, I was too quick on the draw. It now appears that you were already converted :-) Now, if only the PEP would get updated ... BTW, why did you exclude MemoryError? Raymond From bcannon at gmail.com Sun Aug 7 06:26:28 2005 From: bcannon at gmail.com (Brett Cannon) Date: Sat, 6 Aug 2005 21:26:28 -0700 Subject: [Python-Dev] Major revision of PEP 348 committed Message-ID: Version 1.5 of PEP 348 (http://www.python.org/peps/pep-0348.html) just got checked in. This one is a *big* change compared to the previous version:

* Renamings removed
* SystemExit and KeyboardInterrupt are the only exceptions *not* inheriting from Exception
    + CriticalException has been renamed TerminalException so it is more in line with the idea that the exceptions are meant to terminate the interpreter, not that they are more critical than other exceptions
* Removed ControlFlowException
    + StopIteration and GeneratorExit inherit from Exception directly
* Added VMError which inherits Exception
    + SystemError and MemoryError subclass VMError
* Removed Unbound(Global|Free)Error
* other stuff I don't remember

This version addresses everyone's worries about backwards-compatibility or changes that were not substantive enough to break code.
The things I did on my own without thorough discussion are removing ControlFlowException and introducing VMError. The former seemed reasonable since catching control flow exceptions as a group is probably rare and with StopIteration and GeneratorExit not falling outside of Exception, ControlFlowException lost its usefulness. VMError was introduced to allow the grouping of MemoryError and SystemError since they are both exceptions relating to the VM. The name can be changed to InterpreterError, but VMError is shorter while still getting the idea across. Plus I just like VMError more. =) OK, guys, have at it. -Brett From reinhold-birkenfeld-nospam at wolke7.net Sun Aug 7 09:46:31 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sun, 07 Aug 2005 09:46:31 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> Message-ID: Guido van Rossum wrote: > The main problem for a smooth Unicode transition remains I/O, in my > opinion; I'd like to see a PEP describing a way to attach an encoding > to text files, and a way to decide on a default encoding for stdin, > stdout, stderr. FWIW, I've already drafted a patch for the former. It lets you write to file.encoding and honors this when writing Unicode strings to it. http://www.python.org/sf/1214889 Reinhold -- Mail address is perfectly valid! From raymond.hettinger at verizon.net Sun Aug 7 11:54:28 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 07 Aug 2005 05:54:28 -0400 Subject: [Python-Dev] Major revision of PEP 348 committed In-Reply-To: Message-ID: <000401c59b36$01226de0$e410c797@oemcomputer> VMError -- This is a new intermediate grouping so it won't break anything and it does bring together two exceptions relating them by source. However, I recommend against introducing this new group. Besides adding yet another thing to remember, it violates Flat-Is-Better-Than-Nested (see FIBTN below).
Also, the new group is short on use cases with MemoryErrors sometimes being recoverable and SystemErrors generally not. In the library, only cookielib catches these and it does so along with KeyboardInterrupt in order to re-raise. In general, you don't want to introduce a new grouping unless there is some recurring need to catch that group. EOFError -- I recommend leaving this one alone. IOError is generally for real errors while EOF occurs in the normal course of reading a file or filelike source. The former is hard to recover and the latter is normal. The PEP's justification of "Since an EOF comes from I/O it only makes sense that it be considered an I/O error" is somewhat shallow and doesn't reflect thought about how those exceptions are actually used. That information is readily attainable by scanning the standard library with 57 instances of EOFError and 150 instances of IOError. There are a few cases of overlap where an except clause catches both; however, the two are mostly used independent from one another. The review of the library gives a good indication of how much code would be broken by this change. Also, see the FIBTN comment below. AnyDeprecationWarning -- This grouping makes some sense intuitively but do we have much real code that has had occasion to catch both at the same time? If not, then we don't need this. FIBTN (flat-is-better-than-nested) -- This bit of Zen carries extra significance for the exception hierarchy. The core issue is that exceptions are NOT inherently tree-structured. Each may ultimately carry its own set of meaningful attributes and those tend to not neatly roll-up into a parent/subclass relationships without Liskov violations. Likewise, it is a mistake to introduce nesting as a means of categorization. The problem is that many conflicting, though meaningful groupings are possible. (i.e. 
grouped by source (vm, user, data, system), grouped by recoverability or transience, grouped by module/container type (dictionary errors, weakref errors, net errors, warnings module, xml module, email errors), etc.) The ONLY useful nestings are those for a cluster of exceptions that are typically all handled together. IOW, any new nesting needs to be justified by a long list of real code examples that currently catch all those exceptions at the same time. Ideally, searching for that list would also turn-up no competing instances where other, orthogonal groupings are being used. Vocabulary size -- At one time, python-dev exhibited a strong reluctance to introduce any new builtins. No matter how sensible the idea, there was typically an immediate effort to jam the proposed function into some other namespace. It should be remembered that each of PEP 348's proposed new exception groupings ARE new builtins. Therefore, the bar for admission should be relatively high (i.e. I would prefer Fredrik's join() proposal to any of the above new proposals). Every new word in the vocabulary makes the language a little more complex, a little less likely to fit in your brain, and a little harder to learn. Nestings make this more acute since learning the new word also entails remembering how it fits in the structure (yet another good reason for FIBTN). Once again, my advice is not introduce change unless it is solving a specific, real problem in existing code. The groupings listed above feel like random ideas searching for a justification rather than the product of an effort to solve known issues. If the PEP can't resist the urge to create new intermediate groupings, then start by grepping through tons of Python code to find-out which exceptions are typically caught on the same line. That would be a worthwhile empirical study and may lead to useful insights. Try to avoid reversing the process, staring at the existing tree, and letting your mind arbitrarily impose patterns on it. 
Raymond From ncoghlan at gmail.com Sun Aug 7 14:17:16 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 07 Aug 2005 22:17:16 +1000 Subject: [Python-Dev] Major revision of PEP 348 committed In-Reply-To: <000401c59b36$01226de0$e410c797@oemcomputer> References: <000401c59b36$01226de0$e410c797@oemcomputer> Message-ID: <42F5FBCC.80701@gmail.com> Raymond Hettinger wrote: > FIBTN (flat-is-better-than-nested) -- This bit of Zen carries extra > significance for the exception hierarchy. The core issue is that > exceptions are NOT inherently tree-structured. Each may ultimately > carry its own set of meaningful attributes and those tend to not neatly > roll-up into a parent/subclass relationships without Liskov violations. I think this is a key point, because a Python except clause makes it easy to create an on-the-fly exception grouping, but it is more awkward to get rid of inheritance that is incorrect (you have to catch and reraise the ones you don't want handled before the real handler). I think Raymond gives a good suggestion - new groupings should only be introduced for exceptions where we have reasonable evidence that they are already frequently caught together. TerminalException is a good example of this. "except (KeyboardInterrupt, SystemExit): raise" is something that should be written often - there is a definite use case for catching them together. Those two are also examples of inappropriate inheritance causing obvious usability problems. Cheers, Nick. P.S. Are there any other hardware control people around to understand what I mean when I say that python-dev discussions sometimes remind me of a poorly tuned PID loop? Particularly the with statement discussion and this one. . . 
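
The on-the-fly grouping mentioned in this message can be sketched as below, assuming present-day Python 3. The names TERMINATING and call_quietly are hypothetical and are not part of any proposal in this thread:

```python
# An ad hoc grouping: a plain tuple, no new built-in class required.
TERMINATING = (KeyboardInterrupt, SystemExit)


def call_quietly(func, default=None):
    """Call func(), swallowing every failure except the terminating ones."""
    try:
        return func()
    except TERMINATING:
        # Catch and re-raise the ones we do not want the handler below to see.
        raise
    except BaseException:
        return default
```

The first clause is the catch-and-reraise step described above as the awkward part; it is only needed because the general handler would otherwise swallow those two exceptions as well.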
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Sun Aug 7 14:24:21 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 07 Aug 2005 22:24:21 +1000 Subject: [Python-Dev] Major revision of PEP 348 committed In-Reply-To: References: Message-ID: <42F5FD75.50903@gmail.com> Brett Cannon wrote: > * SystemExit are the KeyboardInterrupt are the only exceptions *not* > inheriting from Exception > + CriticalException has been renamed TerminalException so it is > more inline with the idea that the exceptions are meant to terminate > the interpreter, not that they are more critical than other exceptions I like TerminalException, although TerminatingException may be less ambiguous. ("There's nothing wrong with my terminal, you moronic machine!") > This version addresses everyone's worries about > backwards-compatibility or changes that were not substantive enough to > break code. Well, I think you said from the start that the forces of backwards-compatibility would get you eventually ;) > The things I did on my own without thorough discussion is remove > ControlFlowException and introduce VMError. +1 on the former. -1 on the latter. Same reasons as Raymond, basically. These exceptions are builtins, so let's not add new ones without a strong use case. Anyway, this is starting to look pretty good (but then, I thought that a few days ago, too). Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From martin at v.loewis.de Sun Aug 7 14:53:21 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 07 Aug 2005 14:53:21 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> Message-ID: <42F60441.8000007@v.loewis.de> Guido van Rossum wrote: > The main problem for a smooth Unicode transition remains I/O, in my > opinion; I'd like to see a PEP describing a way to attach an encoding > to text files, and a way to decide on a default encoding for stdin, > stdout, stderr. If stdin, stdout and stderr go to a terminal, there already is a default encoding (actually, there always is a default encoding on these, as it falls back to the system encoding if it's not a terminal, or if the terminal's encoding is not supported or cannot be determined). Regards, Martin From martin at v.loewis.de Sun Aug 7 15:06:27 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 07 Aug 2005 15:06:27 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> Message-ID: <42F60753.8030309@v.loewis.de> Reinhold Birkenfeld wrote: > FWIW, I've already drafted a patch for the former. It lets you write to > file.encoding and honors this when writing Unicode strings to it. I don't like that approach. You shouldn't be allowed to change the encoding mid-stream (except perhaps under very specific circumstances). As I see it, the buffer of an encoded file becomes split, at least for input: there are bytes which have been read and not yet decoded, and there are characters which have been decoded but not yet consumed.
If you change the encoding mid-stream, you would have to undo decoding that was already done, resetting the stream to the real "current" position. For output, the situation is similar: before changing to a new encoding, or before changing from unicode output to byte output, you have to flush the codec first: it may be that the codec has buffered some state which needs to be completely processed first before a new codec can be applied to the stream. Another issue is seeking: given the many different kinds of buffers, seeking becomes fairly complex. Ideally, seeking should apply to application-level positions, i.e. when you tell the current position, it should be in terms of data already consumed by the application. Perhaps seeking in an encoded stream should not be supported at all. Finally, you also have to consider Universal Newlines: you can apply them either on the byte stream, or on the character stream. I think conceptually right would be to do universal newlines on the character stream. Regards, Martin From mal at egenix.com Sun Aug 7 15:35:49 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 07 Aug 2005 15:35:49 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> Message-ID: <42F60E35.9080809@egenix.com> Guido van Rossum wrote: > My first response to the PEP, however, is that instead of a new > built-in function, I'd rather relax the requirement that str() return > an 8-bit string -- after all, int() is allowed to return a long, so > why couldn't str() be allowed to return a Unicode string? The problem here is that strings and Unicode are used in different ways, whereas integers and longs are very similar. Strings are used for both arbitrary data and text data, Unicode can only be used for text data. The new text() built-in would help make a clear distinction between "convert this object to a string of bytes" and "please convert this to a text representation".
We need to start making the separation somewhere and I think this is a good non-invasive start.

Furthermore, the text() built-in could be used to only allow 8-bit strings with ASCII content to pass through and require that all non-ASCII content be returned as Unicode. We wouldn't be able to enforce this in str().

I'm +1 on adding text().

I would also like to suggest a new formatting marker '%t' to have the same semantics as text() - instead of changing the semantics of %s as Neil suggests in the PEP. Again, the reason is to make the difference between text and arbitrary data explicit and visible in the code.

> The main problem for a smooth Unicode transition remains I/O, in my
> opinion; I'd like to see a PEP describing a way to attach an encoding
> to text files, and a way to decide on a default encoding for stdin,
> stdout, stderr.

Hmm, not sure why you need PEPs for this:

Open an encoded file:
---------------------

Use codecs.open() instead of open() or file().

Set the external encoding for stdin, stdout, stderr:
----------------------------------------------------
(also an example for adding encoding support to an existing file object):

    import codecs
    import sys

    def set_sys_std_encoding(encoding):
        # Load encoding support
        (encode, decode, streamreader, streamwriter) = codecs.lookup(encoding)
        # Wrap using stream writers and readers
        sys.stdin = streamreader(sys.stdin)
        sys.stdout = streamwriter(sys.stdout)
        sys.stderr = streamwriter(sys.stderr)
        # Add .encoding attribute for introspection
        sys.stdin.encoding = encoding
        sys.stdout.encoding = encoding
        sys.stderr.encoding = encoding

    set_sys_std_encoding('rot-13')

Example session:

    >>> print 'hello'
    uryyb
    >>> raw_input()
    hello
    h'hello'
    >>> 1/0
    Genpronpx (zbfg erprag pnyy ynfg):
      Svyr "<fgqva>", yvar 1, va ?
MrebQvivfvbaReebe: vagrtre qvivfvba be zbqhyb ol mreb Note that the interactive session bypasses the sys.stdin redirection, which is why you can still enter Python commands in ASCII - not sure whether there's a reason for this, or whether it's just a missing feature. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 07 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Sun Aug 7 15:47:49 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 07 Aug 2005 15:47:49 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F25E36.5060103@egenix.com> References: <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de> <42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de> <42F25E36.5060103@egenix.com> Message-ID: <42F61105.1070806@v.loewis.de> M.-A. Lemburg wrote: > BTW, in one of your replies I read that you had a problem with > how cvs2svn handles trunk, branches and tags. In reality, this > is no problem at all, since Subversion is very good at handling > moves within the repository: you can easily change the repository > layout after the import to whatever layout you see fit - without > losing any of the version history.
Yes, however, I recall that some clients have problems with displaying history across renames (in particular, I believe viewcvs has this problem); also, it becomes difficult to refer to an old version by path name, since the old versions had all different path names. Jim Fulton has suggested a different approach: cvs2svn can create a dump file, and svnadmin load accepts a parent directory. Then, no renames are necessary. Regards, Martin From martin at v.loewis.de Sun Aug 7 15:55:05 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 07 Aug 2005 15:55:05 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <5.1.1.6.0.20050804143230.025c4138@mail.telecommunity.com> References: <42F256FC.7050606@v.loewis.de> <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de> <42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de> <5.1.1.6.0.20050804143230.025c4138@mail.telecommunity.com> Message-ID: <42F612B9.105@v.loewis.de> Phillip J. Eby wrote: > Yeah, in my use of SVN I find that this is more theoretical than actual > for certain use cases. You can see the history of a file including the > history of any file it was copied from. However, if you want to try to > look at the whole layout, you can't easily get to the old locations. > This can be a royal pain, whereas at least in CVS you can use viewcvs to > show you the "attic". Subversion doesn't have an attic, which makes > looking at structural history very difficult. I guess this is a client issue also; in websvn, you can browse an older revision to see what the structure looked like at that point. If you made tags, you can also browse the tags through the standard HTTP interface.
I don't know a client, off-hand, which would answer the question "which files have been moved since tag xyz?".

Regards,
Martin

From martin at v.loewis.de Sun Aug 7 16:07:41 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 16:07:41 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <200507281956.03788.jeff@taupro.com>
References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com>
Message-ID: <42F615AD.7010008@v.loewis.de>

Jeff Rush wrote:
> BTW, re SSH access on python.org, using Apache's SSL support re https would
> provide as good of security without the risk of giving out shell accounts.
> SSL would encrypt the link and require a password or permit cert auth
> instead, same as SSH. Cert admin needn't be hard if only a single server
> cert is used, with client passwords, instead of client certs.

That is the currently-proposed setup. However, with the current subversion clients, you will have to save your password to disk, or type it in every time. This is the real security risk: if somebody attacks the client machine, they get access to the python source repository.

Regards,
Martin

From martin at v.loewis.de Sun Aug 7 16:10:29 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 07 Aug 2005 16:10:29 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To:
References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com>
Message-ID: <42F61655.3070101@v.loewis.de>

Fernando Perez wrote:
> I know Joe was in contact with the SVN devs to work on this, so perhaps he's
> using a patched version of cvs2svn, I simply don't know. But I mention it in
> case it proves useful to the python.org conversion.

Thanks for the pointer. It turns out that I could resolve all my conversion problems myself (following Jim Fulton's suggestion of creating dump files).
I found that somebody created a patch to support different structures in cvs2svn directly, but these patches have not been integrated yet. Regards, Martin From martin at v.loewis.de Sun Aug 7 16:34:43 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 07 Aug 2005 16:34:43 +0200 Subject: [Python-Dev] PEP 347: Migration to Subversion Message-ID: <42F61C03.6050703@v.loewis.de> I have placed a new version of the PEP on http://www.python.org/peps/pep-0347.html Changes to the previous version include: - add more rationale for using svn (atomic changesets, fast tags and branches) - changed conversion procedure to a single repository, with some reorganization. See http://www.dcl.hpi.uni-potsdam.de/pysvn/ My proposal is that the repository is called http://svn.python.org/projects - add discussion section (Nick Bastin's proposal of hosting a Perforce repository, single vs. multiple repositories, user authentication, admin overhead and alternative hosters) - require python-cvsroot to be preserved forever. Please let me know what else I should change in the PEP. Regards, Martin From martin at v.loewis.de Sun Aug 7 16:39:21 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 07 Aug 2005 16:39:21 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <1122918723.9680.33.camel@warna.corp.google.com> References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> Message-ID: <42F61D19.6090806@v.loewis.de> Donovan Baarda wrote: > Yeah. IMHO the sadest thing about SVN is it doesn't do branch/merge > properly. 
All the other cool stuff like renames etc is kinda undone by > that. For a definition of properly, see; > > http://prcs.sourceforge.net/merge.html Can you please elaborate? I read the page, and it seems to me that subversion's merge command works exactly the way described on the page. Regards, Martin From mal at egenix.com Sun Aug 7 16:48:13 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 07 Aug 2005 16:48:13 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F61105.1070806@v.loewis.de> References: <42E93940.6080708@v.loewis.de> <1122676547.10752.61.camel@geddy.wooz.org> <42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de> <42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de> <42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de> <42F25E36.5060103@egenix.com> <42F61105.1070806@v.loewis.de> Message-ID: <42F61F2D.7080604@egenix.com> Martin v. L?wis wrote: > M.-A. Lemburg wrote: > >>BTW, in one of your replies I read that you had a problem with >>how cvs2svn handles trunk, branches and tags. In reality, this >>is no problem at all, since Subversion is very good at handling >>moves within the repository: you can easily change the repository >>layout after the import to whatevery layout you see fit - without >>losing any of the version history. > > > Yes, however, I recall that some clients have problems with displaying > history across renames (in particular, I believe viewcvs has this > problem); also, it becomes difficult to refer to an old version by > path name, since the old versions had all different path names. Since I only use trac to view the source code (which doesn't have this problem), I can't comment on this. > Jim Fulton has suggested a different approach: cvs2svn can create > a dump file, and svnadmin load accepts a parent directory. Then, > no renames are necessary. Good idea. 
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 07 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mwh at python.net Sun Aug 7 20:05:03 2005 From: mwh at python.net (Michael Hudson) Date: Sun, 07 Aug 2005 19:05:03 +0100 Subject: [Python-Dev] PEP 8: exception style In-Reply-To: (Guido van Rossum's message of "Sat, 6 Aug 2005 10:10:54 -0700") References: <20050806121001.GC16042@rogue.amk.ca> Message-ID: <2mwtmxy65s.fsf@starship.python.net> Guido van Rossum writes: > On 8/6/05, A.M. Kuchling wrote: >> PEP 8 doesn't express any preference between the >> two forms of raise statements: >> raise ValueError, 'blah' >> raise ValueError("blah") >> >> I like the second form better, because if the exception arguments are >> long or include string formatting, you don't need to use line >> continuation characters because of the containing parens. Grepping >> through the library code, the first form is in the majority, used >> roughly 60% of the time. >> >> Should PEP 8 take a position on this? If yes, which one? > > Definitely ValueError('blah'). The other form will go away in Python > 3000. Please update the PEP. How do you then supply a traceback to the raise statement? Cheers, mwh -- please realize that the Common Lisp community is more than 40 years old. collectively, the community has already been where every clueless newbie will be going for the next three years. so relax, please. 
-- Erik Naggum, comp.lang.lisp From gvanrossum at gmail.com Sun Aug 7 20:15:05 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 7 Aug 2005 11:15:05 -0700 Subject: [Python-Dev] PEP 8: exception style In-Reply-To: <2mwtmxy65s.fsf@starship.python.net> References: <20050806121001.GC16042@rogue.amk.ca> <2mwtmxy65s.fsf@starship.python.net> Message-ID: > How do you then supply a traceback to the raise statement? raise ValueError, ValueError("blah"), tb Maybe in Py3K this could become raise ValueError("bloop"), tb -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sun Aug 7 20:27:20 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 07 Aug 2005 20:27:20 +0200 Subject: [Python-Dev] [ python-Patches-790710 ] breakpoint command listsinpdb In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE7AC28B@cgcmail.cgc.cpmc.columbia.edu> References: <6CA15ADD82E5724F88CB53D50E61C9AE7AC28B@cgcmail.cgc.cpmc.columbia.edu> Message-ID: <42F65288.8040901@v.loewis.de> Michiel De Hoon wrote: > Speaking of the five-patch-review-rule, about two months ago I reviewed five > patches and posted a summary here in order to push patch #1049855. This patch > is still waiting for a verdict (this is also my own fault, since I needed > several iterations to get this patch straightened out; my apologies for > that). Is there anything else I can do for this patch? Sorry, I missed that message. I now reviewed the patch, but at the moment, I see little chance that the suggested feature is implementable, except by using code that is specific to each stdio implementation. Regards, Martin From raymond.hettinger at verizon.net Sun Aug 7 20:25:17 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 07 Aug 2005 14:25:17 -0400 Subject: [Python-Dev] PEP 8: exception style In-Reply-To: Message-ID: <000401c59b7d$6d730940$05fecc97@oemcomputer> > > How do you then supply a traceback to the raise statement? 
> > raise ValueError, ValueError("blah"), tb > > Maybe in Py3K this could become > > raise ValueError("bloop"), tb The instantiation and bindings need to be done in one step without mixing two syntaxes. Treat this case the same as everything else: raise ValueError("blip", traceback=tb) Raymond From ilya at bluefir.net Sun Aug 7 22:38:00 2005 From: ilya at bluefir.net (Ilya Sandler) Date: Sun, 7 Aug 2005 13:38:00 -0700 (PDT) Subject: [Python-Dev] pdb: should next command be extended? Message-ID: Problem: When the code contains list comprehensions (or for that matter any other looping construct), the only way to get quickly through this code in pdb is to set a temporary breakpoint on the line after the loop, which is inconvenient.. There is a SF bug report #1248119 about this behavior. Solution: Should pdb's next command accept an optional numeric argument? It would specify how many actual lines of code (not "line events") should be skipped in the current frame before stopping, i.e "next 5" would mean stop when line>=line_where_next_N_happened+5 is reached. This would allow to easily get over/out of loops in the debugger What do you think? Ilya From martin at v.loewis.de Sun Aug 7 23:11:56 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 07 Aug 2005 23:11:56 +0200 Subject: [Python-Dev] [C++-sig] GCC version compatibility In-Reply-To: <200507172321.31665.anthony@interlink.com.au> References: <200507171601.23780.anthony@interlink.com.au> <20050717100609.GB3581@lap200.cdc.informatik.tu-darmstadt.de> <200507172321.31665.anthony@interlink.com.au> Message-ID: <42F6791C.3030602@v.loewis.de> Anthony Baxter wrote: > I should probably add that I'm not flagging that I think there's a problem > here. I'm mostly urging caution - I hate having to cut brown-paper-bag > releases . If possible, can the folks on c++-sig try this patch > out and put their results in the patch discussion? 
If you're keen, you > could try jumping onto HP's testdrive systems (http://www.testdrive.hp.com/). >>From what I recall, they have a bunch of systems with non-gcc C++ compilers, > including the DEC^WDigital^Compaq^WHP one on the alphas, and the HP C++ > compiler on the HP/UX boxes[1]. I've looked at the patch, and it looks fairly safe, so I committed it. Regards, Martin From martin at v.loewis.de Sun Aug 7 23:15:08 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 07 Aug 2005 23:15:08 +0200 Subject: [Python-Dev] pdb: should next command be extended? In-Reply-To: References: Message-ID: <42F679DC.6030705@v.loewis.de> Ilya Sandler wrote: > Should pdb's next command accept an optional numeric argument? It would > specify how many actual lines of code (not "line events") > should be skipped in the current frame before stopping, [...] > What do you think? That would differ from gdb's "next ", which does "next" n times. It would be confusing if pdb accepted the same command, but it meant something different. Plus, there is always a chance that +n is never reached, which would also be confusing. So I'm -1 here. Regards, Martin From abo at minkirri.apana.org.au Mon Aug 8 00:12:36 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Sun, 07 Aug 2005 15:12:36 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F61D19.6090806@v.loewis.de> References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42F61D19.6090806@v.loewis.de> Message-ID: <42F68754.6090400@minkirri.apana.org.au> Martin v. L?wis wrote: > Donovan Baarda wrote: > >>Yeah. 
IMHO the sadest thing about SVN is it doesn't do branch/merge
>>properly. All the other cool stuff like renames etc is kinda undone by
>>that. For a definition of properly, see;
>>
>>http://prcs.sourceforge.net/merge.html
>
>
> Can you please elaborate? I read the page, and it seems to me that
> subversion's merge command works exactly the way described on the
> page.

Maybe it's changed since I last looked at it, but last time I looked SVN didn't track merge histories. From the svnbook;

"Unfortunately, Subversion is not such a system. Like CVS, Subversion 1.0 does not yet record any information about merge operations. When you commit local modifications, the repository has no idea whether those changes came from running svn merge, or from just hand-editing the files."

What this means is SVN has no way of automatically identifying the common version. An svn merge requires you to manually identify and specify the "last common point" where the branch was created or last merged. PRCS automatically finds the common version from the branch/merge history, and even remembers the merge/replace/nothing/delete decision you make for each file as the default to use for future merges.

You can see this in the command line differences. For subversion;

    # create and checkout branch my-calc-branch
    $ svn copy http://svn.example.com/repos/calc/trunk \
        http://svn.example.com/repos/calc/branches/my-calc-branch \
        -m "Creating a private branch of /calc/trunk."
    $ svn checkout http://svn.example.com/repos/calc/branches/my-calc-branch

    # merge and commit changes from trunk
    $ svn merge -r 341:HEAD http://svn.example.com/repos/calc/trunk
    $ svn commit -m "Merged trunk changes to my-calc-branch."

    # merge and commit more changes from trunk
    $ svn merge -r 345:HEAD http://svn.example.com/repos/calc/trunk
    $ svn commit -m "Merged trunk changes to my-calc-branch."

Note that 341 and 345 are "magic" version numbers which correspond to the trunk version at the time of branch and first merge respectively.
It is up to the user to figure out these versions using either meticulous use of tags or svn logs.

In PRCS;

    # create and checkout branch my-calc-branch
    $ prcs checkout calc -r 0
    $ prcs checkin -r my-calc-branch -m "Creating my-calc-branch"

    # merge and commit changes from trunk
    $ prcs merge -r 0
    $ prcs checkin -m " merged changes from trunk"

    # merge and commit more changes from trunk
    $ prcs merge -r 0
    $ prcs checkin -m " merged changes from trunk"

Note that "-r 0" means "HEAD of trunk branch", and "-r my-calc-branch" means "HEAD of my-calc-branch". There is no need to figure out what versions of those branches to use as the "changes from" point, because PRCS figures it out for you. Not only that, but if you chose to ignore changes in certain files during the first merge, the second merge will remember that as its default action.

--
Donovan Baarda

From ilya at bluefir.net Mon Aug 8 00:12:20 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Sun, 7 Aug 2005 15:12:20 -0700 (PDT)
Subject: [Python-Dev] pdb: should next command be extended?
In-Reply-To: <42F679DC.6030705@v.loewis.de>
References: <42F679DC.6030705@v.loewis.de>
Message-ID:

On Sun, 7 Aug 2005, [ISO-8859-1] "Martin v. Löwis" wrote:

> Ilya Sandler wrote:
> > Should pdb's next command accept an optional numeric argument? It would
> > specify how many actual lines of code (not "line events")
> > should be skipped in the current frame before stopping,
> [...]
> > What do you think?
>
> That would differ from gdb's "next ", which does "next" n times.
> It would be confusing if pdb accepted the same command, but it
> meant something different.

But as far as I can tell, pdb's next is already different from gdb's next! gdb's next seems to always go to a different source line, while pdb's next may stay on the current line.

The problem with "next " meaning "repeat next n times" is that it seems to be less useful than the original suggestion.
Any alternative suggestions to allow to step over list comprehensions and such? (SF 1248119) > Plus, there is always a chance that > +n is never reached, which would also be confusing. That should not be a problem, returning from the current frame should be treated as a stopping condition (similarly to the current "next" behaviour)... Ilya > So I'm -1 here. > > Regards, > Martin > From martin at v.loewis.de Mon Aug 8 00:33:26 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 08 Aug 2005 00:33:26 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F68754.6090400@minkirri.apana.org.au> References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42F61D19.6090806@v.loewis.de> <42F68754.6090400@minkirri.apana.org.au> Message-ID: <42F68C36.4090208@v.loewis.de> Donovan Baarda wrote: > What this means is SVN has no way of automatically identifying the > common version. Ah, ok. That's true. It doesn't mean you can't do proper merging with subversion - it only means that it is harder, as you need to figure out the revision range that you want to merge. If this is too painful, you can probably use subversion to store the relevant information. For example, you could define a custom property on the directory, last_merge_from_trunk, which you would always update after you have done a merge operation. Then you don't have to look through history to find out when you last merged. 
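The property idea above could be wrapped in a small helper along these lines. This is only a sketch under stated assumptions: the function names are made up for illustration, the property name is the one suggested in the message, and the helper merely builds the `svn merge`, `svn propset` and `svn propget` command lines rather than running them.

```python
# Hypothetical helper: builds svn command lines for the
# "remember the last merged revision in a property" workflow.
# It does not execute anything; pass the lists to subprocess if desired.

PROPERTY = "last_merge_from_trunk"  # property name suggested in the message


def read_last_merge(working_dir="."):
    """Command that prints the revision recorded by the previous merge."""
    return ["svn", "propget", PROPERTY, working_dir]


def merge_since_last(last_merged_rev, trunk_url):
    """Command that merges everything on trunk after last_merged_rev."""
    return ["svn", "merge", "-r", "%d:HEAD" % last_merged_rev, trunk_url]


def record_merge(merged_up_to_rev, working_dir="."):
    """Command that stores the revision just merged, for next time."""
    return ["svn", "propset", PROPERTY, str(merged_up_to_rev), working_dir]


if __name__ == "__main__":
    trunk = "http://svn.example.com/repos/calc/trunk"
    for argv in (read_last_merge(),
                 merge_since_last(341, trunk),
                 record_merge(345)):
        print(" ".join(argv))
```

The property change is committed together with the merge result, so the recorded revision and the merged content cannot drift apart.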
Regards,
Martin

From gvanrossum at gmail.com Mon Aug 8 01:58:42 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 7 Aug 2005 16:58:42 -0700
Subject: [Python-Dev] PEP 8: exception style
In-Reply-To: <000401c59b7d$6d730940$05fecc97@oemcomputer>
References: <000401c59b7d$6d730940$05fecc97@oemcomputer>
Message-ID:

> > Maybe in Py3K this could become
> >
> > raise ValueError("bloop"), tb
>
> The instantiation and bindings need to be done in one step without
> mixing two syntaxes. Treat this case the same as everything else:
>
> raise ValueError("blip", traceback=tb)

That requires PEP 344. I have some vague feeling that the way we build up the traceback by linking backwards, this may not necessarily work right. I guess somebody has to try to implement PEP 344 in order to find out. (In fact, I think trying to implement PEP 344 would be an *excellent* way to validate it.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com Mon Aug 8 02:03:02 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 7 Aug 2005 17:03:02 -0700
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F60441.8000007@v.loewis.de>
References: <20050806102342.GA11309@mems-exchange.org> <42F60441.8000007@v.loewis.de>
Message-ID:

[me]
> > a way to decide on a default encoding for stdin,
> > stdout, stderr.

[Martin]
> If stdin, stdout and stderr go to a terminal, there already is a
> default encoding (actually, there always is a default encoding on
> these, as it falls back to the system encoding if its not a terminal,
> or if the terminal's encoding is not supported or cannot be determined).

So there is. Wow! I never knew this. How does it work? Can we use this for writing to files too?
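One way to look at the behaviour Martin describes is to inspect the `encoding` attribute of a stream. The sketch below is illustrative only: the helper name is made up, and the fallback to `locale.getpreferredencoding()` is an assumption chosen for the example, not necessarily the exact rule the interpreter applies.

```python
import io
import locale
import sys


def stream_encoding(stream):
    """Return the stream's own encoding, or a locale-based guess.

    The fallback is an assumption made for this sketch; the interpreter's
    real fallback rules may differ between versions and platforms.
    """
    encoding = getattr(stream, "encoding", None)
    return encoding or locale.getpreferredencoding(False)


if __name__ == "__main__":
    print("stdout:", stream_encoding(sys.stdout))
    # A text stream opened with an explicit encoding reports that encoding.
    wrapped = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
    print("wrapped:", stream_encoding(wrapped))
```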
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Mon Aug 8 02:07:49 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 7 Aug 2005 17:07:49 -0700 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <42F60753.8030309@v.loewis.de> References: <20050806102342.GA11309@mems-exchange.org> <42F60753.8030309@v.loewis.de> Message-ID: [Reinhold Birkenfeld] > > FWIW, I've already drafted a patch for the former. It lets you write to > > file.encoding and honors this when writing Unicode strings to it. [Martin v L] > I don't like that approach. You shouldn't be allowed to change the > encoding mid-stream (except perhaps under very specific circumstances). Right. IMO the encoding is something you specify when opening the file, just like buffer size and text mode. > Another issue is seeking: given the many different kinds of buffers, > seeking becomes fairly complex. Ideally, seeking should apply to > application-level positions, ie. if when you tell the current position, > it should be in terms of data already consumed by the application. > Perhaps seeking in an encoded stream should not be supported at all. I'm not sure if it works for all encodings, but if possible I'd like to extend the seeking semantics on text files: seek positions are byte counts, and the application should consider them as "magic cookies". > Finally, you also have to consider Universal Newlines: you can apply > them either on the byte stream, or on the character stream. I think > conceptually right would be to do universal newlines on the character > stream. Is there any reason not to do Universal Newline processing on *all* text files? I can't think of a use case where you'd like text file processing but you want to see the bare \r characters. 
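To make the universal-newlines question concrete, here is a small sketch. It assumes Python 3, which postdates this thread, where the built-in open() applies universal newline translation to text files by default (`newline=None`); the helper name and sample data are illustrative.

```python
import os
import tempfile


def read_lines_universal(raw):
    """Write raw bytes to a temporary file and read them back as text.

    With newline=None (the Python 3 default), "\r\n" and bare "\r"
    are both translated to "\n" while reading.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(raw)
        with open(path, "r", encoding="ascii", newline=None) as handle:
            return handle.read().split("\n")
    finally:
        os.remove(path)


if __name__ == "__main__":
    # Unix, Windows and old Mac line endings mixed in one file.
    print(read_lines_universal(b"unix\nwindows\r\nold mac\rend"))
```

Passing `newline=""` instead leaves the line endings untranslated, which covers the rare case where the bare `\r` characters do matter.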
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Mon Aug 8 02:24:34 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 7 Aug 2005 17:24:34 -0700 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <42F60E35.9080809@egenix.com> References: <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> Message-ID: [Guido] > > My first response to the PEP, however, is that instead of a new > > built-in function, I'd rather relax the requirement that str() return > > an 8-bit string -- after all, int() is allowed to return a long, so > > why couldn't str() be allowed to return a Unicode string? [MAL] > The problem here is that strings and Unicode are used in different > ways, whereas integers and longs are very similar. Strings are used > for both arbitrary data and text data, Unicode can only be used > for text data. Yes, that is the case in Python 2.x. In Python 3.x, I'd like to use a separate "bytes" array type for non-text and for encoded text data, just like Java; strings should always be considered text data. We might be able to get there halfway in Python 2.x: we could introduce the bytes type now, and provide separate APIs to read and write them. (In fact, the array module and the f.readinto() method make this possible today, but it's too klunky so nobody uses it. Perhaps a better API would be a new file-open mode ("B"?) to indicate that a file's read* operations should return bytes instead of strings. The bytes type could just be a very thin wrapper around array('b'). > The new text() built-in would help make a clear distinction > between "convert this object to a string of bytes" and > "please convert this to a text representation". We need to > start making the separation somewhere and I think this is > a good non-invasive start. 
I agree with the latter, but I would prefer that any new APIs use a 'bytes' data type to represent non-text data, rather than having two different sets of APIs to differentiate between the use of 8-bit strings as text vs. data -- while we *currently* use 8-bit strings for both text and data, in Python 3.0 we won't, so then the interim APIs would have to change again. I'd rather introduce a new data type and new APIs that work with it.

> Furthermore, the text() built-in could be used to only
> allow 8-bit strings with ASCII content to pass through
> and require that all non-ASCII content be returned as
> Unicode.
>
> We wouldn't be able to enforce this in str().
>
> I'm +1 on adding text().

I'm still -1.

> I would also like to suggest a new formatting marker '%t'
> to have the same semantics as text() - instead of changing
> the semantics of %s as the Neil suggests in the PEP. Again,
> the reason is to make the difference between text and
> arbitrary data explicit and visible in the code.

Hm. What would be the use case for using %s with binary, non-text data?

> > The main problem for a smooth Unicode transition remains I/O, in my
> > opinion; I'd like to see a PEP describing a way to attach an encoding
> > to text files, and a way to decide on a default encoding for stdin,
> > stdout, stderr.
>
> Hmm, not sure why you need PEPs for this:

I'd forgotten how far we've come. I'm still unsure how the default encoding on stdin/stdout works.

But it still needs to be simpler; IMO the built-in open() function should have an encoding keyword. (But it could return something whose type is not 'file' -- once again making a distinction between open and file.) Do these files support universal newlines? IMO they should.
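For illustration, this is roughly what an open() with an encoding keyword looks like. The sketch assumes Python 3, where the built-in open() did gain such a keyword and returns an `io.TextIOWrapper` in text mode; none of that existed when this message was written, and the helper name is hypothetical.

```python
import io
import os
import tempfile


def roundtrip(text, encoding):
    """Write text with the given encoding, then read it back.

    Returns the raw bytes on disk, the decoded text, and the type of
    the object that open() returned in text mode.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding, newline="") as handle:
            handle.write(text)
            handle_type = type(handle)
        with open(path, "rb") as handle:
            raw = handle.read()
        with open(path, "r", encoding=encoding) as handle:
            decoded = handle.read()
        return raw, decoded, handle_type
    finally:
        os.remove(path)


if __name__ == "__main__":
    for enc in ("latin-1", "utf-8"):
        raw, decoded, handle_type = roundtrip("L\u00f6wis", enc)
        print(enc, raw, decoded == "L\u00f6wis", handle_type.__name__)
```

The encoding is fixed when the file is opened, which matches the position taken earlier in the thread that it should not change mid-stream.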
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From benji at benjiyork.com Mon Aug 8 02:51:53 2005 From: benji at benjiyork.com (Benji York) Date: Sun, 07 Aug 2005 20:51:53 -0400 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F68C36.4090208@v.loewis.de> References: <42E93940.6080708@v.loewis.de> <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42F61D19.6090806@v.loewis.de> <42F68754.6090400@minkirri.apana.org.au> <42F68C36.4090208@v.loewis.de> Message-ID: <42F6ACA9.6030306@benjiyork.com> Martin v. L?wis wrote: > Donovan Baarda wrote: >>What this means is SVN has no way of automatically identifying the >>common version. > If this is too painful, you can probably use subversion to store > the relevant information. For example, you could define a custom > property on the directory A script named "svnmerge" that does just that is included in the contrib directory of the Subversion tar. We (ZC) have just started using it to track two-way merge operations, but I don't have much experience with it personally yet. 
-- Benji York From nick.bastin at gmail.com Mon Aug 8 03:52:47 2005 From: nick.bastin at gmail.com (Nicholas Bastin) Date: Sun, 7 Aug 2005 21:52:47 -0400 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F1AADE.50908@v.loewis.de> References: <42E93940.6080708@v.loewis.de> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> Message-ID: <66d0a6e105080718527939aa81@mail.gmail.com> On 8/4/05, "Martin v. L?wis" wrote: > Nicholas Bastin wrote: > > Perforce is a commercial product, but it can be had for free for > > verified Open Source projects, which Python shouldn't have any problem > > with. There are other problems, like you have to renew the agreement > > every year, but it might be worth considering, given the fact that > > it's an excellent system. > > So we should consider it because it is an excellent system... I don't > know what that means, in precise, day-to-day usage terms (i.e. what > precisely would it do for us that, say, Subversion can't do). It's a mature product. I would hope that that would count for something. I've had enough corrupted subversion repositories that I'm not crazy about the thought of using it in a production system. I know I'm not the only person with this experience. Sure, you can keep backups, and not really lose any work, but we're moving over because we have uptime and availability problems, so lets not just create them again. > >>I think anything but Subversion is ruled out because: > >>- there is no offer to host that anywhere (for subversion, there is > >> already svn.python.org) > > > > > > We could host a Perforce repository just as easily, I would think. > > Interesting offer. 
I'll add this to the PEP - who is "we" in this > context? Uh, the Python community. Which is currently hosting a subversion repository, so it doesn't seem like a stretch to imagine that p4.python.org could exist just as easily. > >>- there is no support for converting a CVS repository (for subversion, > >> there is cvs2svn) > > > > > > I'd put $20 on the fact that cvs2svn will *not* work out of the box > > for converting the python repository. Just call it a hunch. > > You could have read the PEP before losing that money :-) It did work > out of the box. Pardon me if I don't feel that I'd like to see a system in production for a few weeks before we declare victory. The problems with this kind of conversion can be very subtle, and very painful. I'm not saying we shouldn't do this, I'm just saying that we should take an appropriate measure of how much greener the grass really is on the other side, and how much work we're willing to put in to make it that way. -- Nick From bcannon at gmail.com Mon Aug 8 03:57:15 2005 From: bcannon at gmail.com (Brett Cannon) Date: Sun, 7 Aug 2005 18:57:15 -0700 Subject: [Python-Dev] Major revision of PEP 348 committed In-Reply-To: <000401c59b36$01226de0$e410c797@oemcomputer> References: <000401c59b36$01226de0$e410c797@oemcomputer> Message-ID: On 8/7/05, Raymond Hettinger wrote: > VMError -- This is a new intermediate grouping so it won't break > anything and it does bring together two exceptions relating them by > source. However, I recommend against introducing this new group. > Besides added yet another thing to remember, it violates > Flat-Is-Better-Than-Nested (see FIBTN below). Also, the new group is > short on use cases with MemoryErrors sometimes being recoverable and > SystemErrors generally not. In the library, only cookielib catches > these and it does so along with KeyboardInterrupt in order to re-raise. > In general, you don't want to introduce a new grouping unless there is > some recurring need to catch that group. 
> And Nick didn't like it either. Unless someone speaks up Monday, you can consider it removed. > EOFError -- I recommend leaving this one alone. IOError is generally > for real errors while EOF occurs in the normal course of reading a file > or filelike source. The former is hard to recover and the latter is > normal. The PEP's justification of "Since an EOF comes from I/O it only > makes sense that it be considered an I/O error" is somewhat shallow and > doesn't reflect thought about how those exceptions are actually used. > That information is readily attainable by scanning the standard library > with 57 instances of EOFError and 150 instances of IOError. There are a > few cases of overlap where an except clause catches both; however, the > two are mostly used independent from one another. The review of the > library gives a good indication of how much code would be broken by this > change. Also, see the FIBTN comment below. > Basically you are arguing that EOFError is practically not an error and more of an exception signaling an event, like StopIteration for file reading. That makes sense, although it does suggest the name breaks the naming scheme Guido suggested. But I am not crazy enough to try to suggest a name change at this point. =) > AnyDeprecationWarning -- This grouping makes some sense intuitively but > do we have much real code that has had occasion to catch both at the > same time? If not, then we don't need this. > Well, PendingDeprecationWarning is barely used in Lib/ it seems. That would suggest the grouping isn't worth it just because the need to catch it will be miniscule. That also kills the argument that it would simplify warnings filters by cutting down on needing another registration since the chance of that happening seems to be microscopic. > FIBTN (flat-is-better-than-nested) -- This bit of Zen carries extra > significance for the exception hierarchy. The core issue is that > exceptions are NOT inherently tree-structured. 
Each may ultimately > carry its own set of meaningful attributes and those tend to not neatly > roll-up into a parent/subclass relationships without Liskov violations. > [SNIP] > > Vocabulary size -- At one time, python-dev exhibited a strong reluctance > to introduce any new builtins. No matter how sensible the idea, there > was typically an immediate effort to jam the proposed function into some > other namespace. It should be remembered that each of PEP 348's > proposed new exception groupings ARE new builtins. Therefore, the bar > for admission should be relatively high (i.e. I would prefer Fredrik's > join() proposal to any of the above new proposals). Every new word in > the vocabulary makes the language a little more complex, a little less > likely to fit in your brain, and a little harder to learn. Nestings > make this more acute since learning the new word also entails > remembering how it fits in the structure (yet another good reason for > FIBTN). > Now those are two arguments I can go with. OK, so your points make sense. I will wait until Monday evening after work to make any changes to give people a chance to argue against them, but VMError and AnyDeprecationWarning can be considered removed and EOFError will be moved to inherit from EnvironmentError again. Luckily you didn't say you hated TerminalException. 
=) -Brett From bcannon at gmail.com Mon Aug 8 04:02:30 2005 From: bcannon at gmail.com (Brett Cannon) Date: Sun, 7 Aug 2005 19:02:30 -0700 Subject: [Python-Dev] Major revision of PEP 348 committed In-Reply-To: <42F5FD75.50903@gmail.com> References: <42F5FD75.50903@gmail.com> Message-ID: On 8/7/05, Nick Coghlan wrote: > Brett Cannon wrote: > > * SystemExit and KeyboardInterrupt are the only exceptions *not* > > inheriting from Exception > > + CriticalException has been renamed TerminalException so it is > > more in line with the idea that the exceptions are meant to terminate > > the interpreter, not that they are more critical than other exceptions > > I like TerminalException, although TerminatingException may be less ambiguous. > ("There's nothing wrong with my terminal, you moronic machine!") > Maybe. But the interpreter is not terminating quite yet; state is still fine since the exceptions have not reached the top of the stack if you caught it. But then "terminal" sounds destined to die, which is not true either since that only occurs if you catch the exceptions; "terminating" portrays that the plan is the termination but that it is not definite. OK, TerminatingException it is. > > This version addresses everyone's worries about > > backwards-compatibility or changes that were not substantive enough to > > break code. > > Well, I think you said from the start that the forces of > backwards-compatibility would get you eventually ;) > =) I should become a pundit for being able to tell what is going to happen. > > The things I did on my own without thorough discussion is remove > > ControlFlowException and introduce VMError. > > +1 on the former. > -1 on the latter. > > Same reasons as Raymond, basically. These exceptions are builtins, so let's > not add new ones without a strong use case. > > Anyway, this is starting to look pretty good (but then, I thought that a few > days ago, too). > Yeah, and so did everyone else basically.
While Guido has his "let's get all excited about a crazy idea, but then scale it back" mentality, I guess I have the "let's change everything for the better, but then realize other people actually use this language too". =) -Brett From edcjones at comcast.net Mon Aug 8 04:06:44 2005 From: edcjones at comcast.net (Edward C. Jones) Date: Sun, 07 Aug 2005 22:06:44 -0400 Subject: [Python-Dev] PyTuple_Pack added references undocumented Message-ID: <42F6BE34.3040104@comcast.net> According to the source code, PyTuple_Pack returns a new reference (it calls PyTuple_New). It also Py_INCREF's all the objects in the new tuple. Is this unusual behavior? None of these added references are documented in the API Reference Manual. From bcannon at gmail.com Mon Aug 8 04:10:47 2005 From: bcannon at gmail.com (Brett Cannon) Date: Sun, 7 Aug 2005 19:10:47 -0700 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <42F61C03.6050703@v.loewis.de> References: <42F61C03.6050703@v.loewis.de> Message-ID: On 8/7/05, "Martin v. Löwis" wrote: > I have placed a new version of the PEP on > > http://www.python.org/peps/pep-0347.html > > Changes to the previous version include: > > - add more rationale for using svn (atomic changesets, > fast tags and branches) > > - changed conversion procedure to a single repository, with > some reorganization. See > > http://www.dcl.hpi.uni-potsdam.de/pysvn/ > What is going in under python/ ? If it is what is currently /dist/src/, then great and the renaming of the repository works. But if that is what src/ is going to be used for, then what is python/ for and it would be nice to have a repository name that more directly reflects that it is the Python source tree. And I assume you are going to list the directory structure in the PEP at some point.
-Brett From gvanrossum at gmail.com Mon Aug 8 04:14:43 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 7 Aug 2005 19:14:43 -0700 Subject: [Python-Dev] PyTuple_Pack added references undocumented In-Reply-To: <42F6BE34.3040104@comcast.net> References: <42F6BE34.3040104@comcast.net> Message-ID: On 8/7/05, Edward C. Jones wrote: > According to the source code, PyTuple_Pack returns a new reference (it > calls PyTuple_New). It also Py_INCREF's all the objects in the new > tuple. Is this unusual behavior? None of these added references are > documented in the API Reference Manual. This seems the only sensible behavior given what it does. I think the INCREFs don't need to be documented because you don't have to worry about them -- they follow the normal pattern of reference counts: if you owned an object before passing it to PyTuple_Pack(), you still own it afterwards. The docs say that it returns a new object, so that's in order too. It's not listed in refcounts.dat; that seems an omission (or perhaps the function's varargs signature doesn't fit in the pattern?). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Mon Aug 8 04:49:41 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 07 Aug 2005 22:49:41 -0400 Subject: [Python-Dev] PyTuple_Pack added references undocumented In-Reply-To: <42F6BE34.3040104@comcast.net> Message-ID: <000301c59bc3$d3ff9e80$05fecc97@oemcomputer> > According to the source code, PyTuple_Pack returns a new reference (it > calls PyTuple_New). It also Py_INCREF's all the objects in the new > tuple. Is this unusual behavior? No. That is how containers work. Look at Py_BuildValue() for comparison. > None of these added references are documented in the API Reference Manual. The docs seem clear to me. If the docs don't meet your needs, submit a patch.
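[Editorial sketch, not part of the original thread.] The ownership rule described above (a container takes its own reference to each item, and the caller's reference is unaffected) can be observed from pure Python, assuming a CPython interpreter where sys.getrefcount() reports reference counts. The names Payload and refcount_delta_when_packed are made up for this illustration; the sketch shows the general container behaviour and does not call PyTuple_Pack() itself.

```python
import sys


class Payload:
    """Throwaway object whose reference count we can observe."""


def refcount_delta_when_packed():
    item = Payload()
    before = sys.getrefcount(item)

    # Building a tuple makes the tuple hold its own reference to item;
    # the local name `item` keeps the reference it already had.
    packed = (item,)
    after = sys.getrefcount(item)

    # Destroying the tuple releases only the tuple's reference.
    del packed
    restored = sys.getrefcount(item)

    return after - before, restored - before


delta, leftover = refcount_delta_when_packed()
print("extra references while packed:", delta)
print("extra references after the tuple is gone:", leftover)
```

Only the differences between the counts are meaningful here; the absolute numbers returned by sys.getrefcount() include temporary references and vary between interpreter versions.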
From nas at arctrix.com Mon Aug 8 05:47:56 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Sun, 7 Aug 2005 21:47:56 -0600 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> Message-ID: <20050808034756.GA16756@mems-exchange.org> On Sat, Aug 06, 2005 at 06:56:39PM -0700, Guido van Rossum wrote: > My first response to the PEP, however, is that instead of a new > built-in function, I'd rather relax the requirement that str() return > an 8-bit string Do you have any thoughts on what the C API would be? It seems to me that PyObject_Str cannot start returning a unicode object without a lot of code breakage. I suppose we could introduce a function called something like PyObject_String. Neil From pje at telecommunity.com Mon Aug 8 05:49:25 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun, 07 Aug 2005 23:49:25 -0400 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <42F60E35.9080809@egenix.com> <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> Message-ID: <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com> At 05:24 PM 8/7/2005 -0700, Guido van Rossum wrote: >Hm. What would be the use case for using %s with binary, non-text data? Well, I could see using it to write things like netstrings, i.e. sock.send("%d:%s," % (len(data),data)) seems like the One Obvious Way to write a netstring in today's Python at least. But perhaps there's a subtlety I've missed here. 
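[Editorial sketch, not part of the original thread.] The netstring idiom quoted above mixes a decimal length with raw payload bytes. As a hedged illustration of what that looks like once text and bytes are separate types, the sketch below assumes a modern Python 3 (3.5 or later, where %-formatting on bytes is available); neither the bytes type nor this formatting existed when the message was written, and the helper names encode_netstring and decode_netstring are hypothetical.

```python
def encode_netstring(payload: bytes) -> bytes:
    """Frame payload as b"<decimal length>:<payload>,"."""
    # On bytes, %d renders the length as ASCII digits and %s copies the
    # payload through unchanged, so no text encoding is involved.
    return b"%d:%s," % (len(payload), payload)


def decode_netstring(frame: bytes) -> bytes:
    """Inverse of encode_netstring for exactly one complete frame."""
    length_text, separator, rest = frame.partition(b":")
    if not separator:
        raise ValueError("missing ':' after the length prefix")
    length = int(length_text.decode("ascii"))
    if len(rest) != length + 1 or rest[length:] != b",":
        raise ValueError("payload length or trailing ',' does not match")
    return rest[:length]


frame = encode_netstring(b"hello world!")
print(frame)
print(decode_netstring(frame))
```

The length prefix counts bytes, not characters, which is why the payload is taken as bytes: a text payload would first have to be encoded, and its encoded length is what belongs in the prefix.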
From barry at python.org Mon Aug 8 05:51:49 2005 From: barry at python.org (Barry Warsaw) Date: Sun, 07 Aug 2005 23:51:49 -0400 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <66d0a6e105080718527939aa81@mail.gmail.com> References: <42E93940.6080708@v.loewis.de> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> Message-ID: <1123473109.20293.35.camel@geddy.wooz.org> On Sun, 2005-08-07 at 21:52, Nicholas Bastin wrote: > I've had enough corrupted subversion repositories that I'm > not crazy about the thought of using it in a production system. I > know I'm not the only person with this experience. Sure, you can keep > backups, and not really lose any work, but we're moving over because > we have uptime and availability problems, so lets not just create them > again. Has anyone experienced svn corruptions with the fsfs backend? I haven't, across quite a few repositories. > Uh, the Python community. Which is currently hosting a subversion > repository, so it doesn't seem like a stretch to imagine that > p4.python.org could exist just as easily. Unfortunately, I don't think "we" (meaning specifically the collective python.org admins) have much if any operational experience with Perforce. We do with Subversion though and that's a big plus. If "we" were to host a Perforce repository, we'd need significant commitments from several somebodies to get things set up, keep it running, and help socialize long-term institutional knowledge amongst the other admins. We'd also have to teach the current crop of developers how to use the client tools effectively. 
I think it's fairly simple to teach a CVS user how to use Subversion, but have no idea if translating CVS experience to Perforce is as straightforward. -Barry From fdrake at acm.org Mon Aug 8 07:07:51 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon, 8 Aug 2005 01:07:51 -0400 Subject: [Python-Dev] PyTuple_Pack added references undocumented In-Reply-To: References: <42F6BE34.3040104@comcast.net> Message-ID: <200508080107.51852.fdrake@acm.org> On Sunday 07 August 2005 22:14, Guido van Rossum wrote: > I think the INCREFs don't need to be documented because you don't have > to worry about them -- they follow the normal pattern of reference > counts: if you owned an object before passing it to PyTuple_Pack(), > you still own it afterwards. That's right; the function doesn't affect the references you hold in any way, so there's no need to deal with them. > It's not listed in refcounts.dat; that seems an omission (or perhaps > the function's varargs signature doesn't fit in the pattern?). It should and can be listed. refcounts.dat won't deal with the varargs portion of the signature, but it can deal with the return value and normal arguments without worrying about varargs portions of the signature for any function. -Fred -- Fred L. Drake, Jr.
From martin at v.loewis.de Mon Aug 8 07:37:27 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 08 Aug 2005 07:37:27 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> <42F60441.8000007@v.loewis.de> Message-ID: <42F6EF97.50609@v.loewis.de> Guido van Rossum wrote: >>If stdin, stdout and stderr go to a terminal, there already is a >>default encoding (actually, there always is a default encoding on >>these, as it falls back to the system encoding if it's not a terminal, >>or if the terminal's encoding is not supported or cannot be determined). > > > So there is. Wow! I never knew this. How does it work? Can we use this > for writing to files too? On Unix, it uses nl_langinfo(CHARSET), which in turn looks at the environment variables. On Windows, it uses GetConsoleCP()/GetConsoleOutputCP(). On Mac, I'm still searching for a way to determine the encoding of Terminal.app. In IDLE, it uses locale.getpreferredencoding(). So no, this cannot easily be used for file output. Most likely, people would use locale.getpreferredencoding() for file output. For socket output, there should not be a standard way to encode Unicode. Regards, Martin From bob at redivi.com Mon Aug 8 07:58:51 2005 From: bob at redivi.com (Bob Ippolito) Date: Sun, 7 Aug 2005 19:58:51 -1000 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <42F6EF97.50609@v.loewis.de> References: <20050806102342.GA11309@mems-exchange.org> <42F60441.8000007@v.loewis.de> <42F6EF97.50609@v.loewis.de> Message-ID: <79BBBAB6-6630-4AF0-A74B-0D712186A054@redivi.com> On Aug 7, 2005, at 7:37 PM, Martin v.
Löwis wrote: > Guido van Rossum wrote: > >>> If stdin, stdout and stderr go to a terminal, there already is a >>> default encoding (actually, there always is a default encoding on >>> these, as it falls back to the system encoding if it's not a >>> terminal, >>> or if the terminal's encoding is not supported or cannot be >>> determined). >>> >> >> >> So there is. Wow! I never knew this. How does it work? Can we use this >> for writing to files too? >> > > On Unix, it uses nl_langinfo(CHARSET), which in turn looks at the > environment variables. > > On Windows, it uses GetConsoleCP()/GetConsoleOutputCP(). > > On Mac, I'm still searching for a way to determine the encoding of > Terminal.app. It's UTF-8 by default, I highly doubt many people bother to change it. -bob From martin at v.loewis.de Mon Aug 8 07:59:02 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 08 Aug 2005 07:59:02 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> <42F60753.8030309@v.loewis.de> Message-ID: <42F6F4A6.8060605@v.loewis.de> Guido van Rossum wrote: > I'm not sure if it works for all encodings, but if possible I'd like > to extend the seeking semantics on text files: seek positions are byte > counts, and the application should consider them as "magic cookies". If the seek position is merely a number, it won't work for all encodings. For the ISO 2022 ones (iso-2022-jp etc), you need to know the shift state: you can switch to a different encoding in the stream using standard escape codes, and then the same bytes are interpreted differently. For example, iso-2022-jp supports these escape codes:

ESC ( B      ASCII
ESC $ @      JIS X 0208-1978
ESC $ B      JIS X 0208-1983
ESC ( J      JIS X 0201-Roman
ESC $ A      GB2312-1980
ESC $ ( C    KSC5601-1987
ESC $ ( D    JIS X 0212-1990
ESC . A      ISO8859-1
ESC . F      ISO8859-7

So at a certain position in the stream, the same bytes could mean different characters, depending on which "shift state" you are in. That's why ISO C introduced fgetpos/fsetpos in addition to ftell/fseek: an fpos_t is a truly opaque structure that can also incorporate codec state. If you follow this approach, you can get back most of seek; you will lose the "whence" parameter, i.e. you cannot seek forth and back, and you cannot position at the end of the file (actually, iso-2022-jp still supports appending to a file, since it requires that all data "shift out" back to ASCII at the end of each line, and at the end of the file. So "correct" ISO 2022 files can still be concatenated). > Is there any reason not to do Universal Newline processing on *all* > text files? Correct. However, this still might result in a full rewrite of the universal newlines code: the code currently operates on byte streams, when it "should" operate on character streams. In some encodings, CRLF simply isn't represented by \x0d\x0a (e.g. UTF-16-LE: \x0d\x00\x0a\x00) Regards, Martin From martin at v.loewis.de Mon Aug 8 08:05:15 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 08 Aug 2005 08:05:15 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <66d0a6e105080718527939aa81@mail.gmail.com> References: <42E93940.6080708@v.loewis.de> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> Message-ID: <42F6F61B.1080505@v.loewis.de> Nicholas Bastin wrote: > It's a mature product. I would hope that that would count for > something. Sure. But so is subversion.
> I've had enough corrupted subversion repositories that I'm > not crazy about the thought of using it in a production system. I had the last corrupted repository with subversion 0.23. It has matured since then. Even then, invoking db_recover would restore the operation, without losing data (i.e. I did not need to go to backup). >>Interesting offer. I'll add this to the PEP - who is "we" in this >>context? > > > Uh, the Python community. Which is currently hosting a subversion > repository, so it doesn't seem like a stretch to imagine that > p4.python.org could exist just as easily. Ah. But these people have no expertise with Perforce, and there is no Debian Perforce package, so it *is* a stretch assuming that they could also host a perforce directory. So I should then remove your offer to host a perforce installation, as you never made such an offer, right? > Pardon me if I don't feel that I'd like to see a system in production > for a few weeks before we declare victory. The problems with this > kind of conversion can be very subtle, and very painful. I'm not > saying we shouldn't do this, I'm just saying that we should take an > appropriate measure of how much greener the grass really is on the > other side, and how much work we're willing to put in to make it that > way. Yes. That's what this PEP is for. So I guess you are -1 on the PEP. Regards, Martin From martin at v.loewis.de Mon Aug 8 08:08:34 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 08 Aug 2005 08:08:34 +0200 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: References: <42F61C03.6050703@v.loewis.de> Message-ID: <42F6F6E2.7020007@v.loewis.de> Brett Cannon wrote: > What is going in under python/ ? If it is what is currently > /dist/src/, then great and the renaming of the repository works. Just have a look yourself :-) Yes, this is dist/src. > But if that is what src/ is going to be used for This is nondist/src.
Perhaps I should just move nondist/src/Compiler, and drop nondist/src. > And I assume you are going to list the directory structure in the PEP > at some point. Please take a look at the PEP. Regards, Martin From cludwig at cdc.informatik.tu-darmstadt.de Mon Aug 8 09:07:13 2005 From: cludwig at cdc.informatik.tu-darmstadt.de (Christoph Ludwig) Date: Mon, 8 Aug 2005 09:07:13 +0200 Subject: [Python-Dev] [C++-sig] GCC version compatibility In-Reply-To: <42F6791C.3030602@v.loewis.de> References: <200507171601.23780.anthony@interlink.com.au> <20050717100609.GB3581@lap200.cdc.informatik.tu-darmstadt.de> <200507172321.31665.anthony@interlink.com.au> <42F6791C.3030602@v.loewis.de> Message-ID: <20050808070712.GB3570@lap200.cdc.informatik.tu-darmstadt.de> On Sun, Aug 07, 2005 at 11:11:56PM +0200, "Martin v. Löwis" wrote: > I've looked at the patch, and it looks fairly safe, so I committed it. Thanks. I did not forget my promise to look into a more comprehensive approach to the C++ build issues. But I first need to better understand the potential impact on distutils. And, foremost, I need to finish my thesis whence my spare time projects go very slowly. Regards Christoph -- http://www.informatik.tu-darmstadt.de/TI/Mitarbeiter/cludwig.html LiDIA: http://www.informatik.tu-darmstadt.de/TI/LiDIA/Welcome.html From martin at v.loewis.de Mon Aug 8 09:21:41 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 08 Aug 2005 09:21:41 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <79BBBAB6-6630-4AF0-A74B-0D712186A054@redivi.com> References: <20050806102342.GA11309@mems-exchange.org> <42F60441.8000007@v.loewis.de> <42F6EF97.50609@v.loewis.de> <79BBBAB6-6630-4AF0-A74B-0D712186A054@redivi.com> Message-ID: <42F70805.2070107@v.loewis.de> Bob Ippolito wrote: > It's UTF-8 by default, I highly doubt many people bother to change it. I think your doubts are unfounded.
Many Japanese people change it to EUC-JP (I believe), as UTF-8 support doesn't work well for them (or at least didn't use to). Regards, Martin From martin at v.loewis.de Mon Aug 8 10:00:00 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 08 Aug 2005 10:00:00 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> Message-ID: <42F71100.7000401@v.loewis.de> Guido van Rossum wrote: > We might be able to get there halfway in Python 2.x: we could > introduce the bytes type now, and provide separate APIs to read and > write them. (In fact, the array module and the f.readinto() method > make this possible today, but it's too klunky so nobody uses it. > Perhaps a better API would be a new file-open mode ("B"?) to indicate > that a file's read* operations should return bytes instead of strings. > The bytes type could just be a very thin wrapper around array('b'). That answers an important question: so you want the bytes type to be mutable (and, consequently, unsuitable as a dictionary key). Regards, Martin From martin at v.loewis.de Mon Aug 8 10:07:37 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 08 Aug 2005 10:07:37 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com> References: <42F60E35.9080809@egenix.com> <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com> Message-ID: <42F712C9.9040000@v.loewis.de> Phillip J. Eby wrote: >>Hm. What would be the use case for using %s with binary, non-text data? > > > Well, I could see using it to write things like netstrings, > i.e. sock.send("%d:%s," % (len(data),data)) seems like the One Obvious Way > to write a netstring in today's Python at least. But perhaps there's a > subtlety I've missed here.
As written, this would stop working when strings become Unicode. It's pretty clear what '%d' means (format the number in decimal, using "\N{DIGIT ZERO}" .. "\N{DIGIT NINE}" as the digits). It's not all that clear what %s means: how do you get a sequence of characters out of data, when data is a byte string? Perhaps there could be byte string literals, so that you would write sock.send(b"%d:%s," % (len(data),data)) but this would raise different questions:

- what does %d mean for byte string formatting? str(len(data)) returns a character string, how do you get a byte string? In the specific case of %d, encoding as ASCII would work, though.
- if byte strings are mutable, what about byte string literals? I.e. if I do

  x = b"%d:%s,"
  x[1] = b'f'

  and run through the code the second time, will the literal have changed? Perhaps these would be displays, not literals (although I never understood why Guido calls these displays)

Regards, Martin From stephen at xemacs.org Mon Aug 8 10:10:07 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 08 Aug 2005 17:10:07 +0900 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <42F70805.2070107@v.loewis.de> (Martin v. Löwis's message of "Mon, 08 Aug 2005 09:21:41 +0200") References: <20050806102342.GA11309@mems-exchange.org> <42F60441.8000007@v.loewis.de> <42F6EF97.50609@v.loewis.de> <79BBBAB6-6630-4AF0-A74B-0D712186A054@redivi.com> <42F70805.2070107@v.loewis.de> Message-ID: <87oe8897ds.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Martin" == Martin v Löwis writes: Martin> I think your doubts are unfounded. Many Japanese people Martin> change it to EUC-JP (I believe), as UTF-8 support doesn't Martin> work well for them (or at least didn't use to). If you mean the UTF-8 support in Terminal, it's no better or worse than the EUC-JP support. The problem is that most Japanese Unix systems continue to default to EUC-JP, and many Windows hosts (including Samba file systems) default to Shift JIS.
So people using Terminal tend to set it to match the default remote environment (few of them use shells on the Mac). All that is certainly true of my organization, for one example. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From martin at v.loewis.de Mon Aug 8 10:26:37 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 08 Aug 2005 10:26:37 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <87oe8897ds.fsf@tleepslib.sk.tsukuba.ac.jp> References: <20050806102342.GA11309@mems-exchange.org> <42F60441.8000007@v.loewis.de> <42F6EF97.50609@v.loewis.de> <79BBBAB6-6630-4AF0-A74B-0D712186A054@redivi.com> <42F70805.2070107@v.loewis.de> <87oe8897ds.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <42F7173D.4000204@v.loewis.de> Stephen J. Turnbull wrote: > If you mean the UTF-8 support in Terminal, it's no better or worse > than the EUC-JP support. The problem is that most Japanese Unix > systems continue to default to EUC-JP, and many Windows hosts > (including Samba file systems) default to Shift JIS. So people using > Terminal tend to set it to match the default remote environment (few > of them use shells on the Mac). Right: that might be the biggest problem. ls(1) would not display the file names of the remote servers in any readable way. Thanks for the confirmation. Regards, Martin From arigo at tunes.org Mon Aug 8 10:31:06 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon, 8 Aug 2005 10:31:06 +0200 Subject: [Python-Dev] __traceback__ and reference cycles Message-ID: <20050808083106.GA15924@code1.codespeak.net> Hi all, There are various proposals to add an attribute on exception instances to store the traceback (see PEP 344).
A detail not discussed, which I thought of historical interest only, is that today's exceptions try very hard to avoid reference cycles, in particular the cycle 'frame -> local variable -> traceback object -> frame' which was important for pre-GC versions of Python. A clause 'except Exception, e' would not create a local reference to the traceback, only to the exception instance. If the latter grows a __traceback__ attribute, it is no longer true, and every such except clause typically creates a cycle. Of course, we don't care, we have a GC -- do we? Well, there are cases where we do: see the attached program... In my opinion it should be considered a bug of today's Python that this program leaks memory very fast and takes longer and longer to run each loop (each loop takes half a second longer than the previous one!). (I don't know how this bug could be fixed, though.) Spoiling the fun of figuring out what is going on, the reason is that 'e_tb' creates a reference cycle involving the frame of __del__, which keeps a reference to 'self' alive. Python thinks 'self' was resurrected. The next time the GC runs, the cycle disappears, and the refcount of 'self' drops to zero again, calling __del__ again -- which gets resurrected again by a new cycle. Etc... Note that no cycle actually contains 'self'; they just point to 'self'. In summary, no X instance gets ever freed, but they all have their destructors called over and over again. Attaching a __traceback__ will only make this "bug" show up more often, as the 'except Exception, e' line in a __del__() method would be enough to trigger it. Not sure what to do about it. I just thought I should share these thoughts (I stumbled over almost this problem in PyPy). 
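[Editorial sketch, not part of the original message.] The cycle 'frame -> local variable -> traceback object -> frame' discussed above can be observed on a present-day CPython 3, where exception instances do carry a __traceback__ attribute. The sketch assumes CPython's reference counting plus cyclic GC; the names Marker, keep_exception_in_local and observe_cycle are invented for the illustration, and it deliberately avoids __del__, so it shows only the cycle itself and not the repeated-finalizer behaviour described in the message.

```python
import gc
import weakref


class Marker:
    """Stand-in for any object that happens to be a local of the frame."""


def keep_exception_in_local():
    marker = Marker()
    probe = weakref.ref(marker)      # lets us see when marker is freed
    try:
        raise ValueError("boom")
    except ValueError as exc:
        # Storing the exception in a local closes the cycle:
        # frame -> saved -> exc.__traceback__ -> tb_frame -> frame
        saved = exc
    return probe


def observe_cycle():
    gc.collect()                     # start from a clean slate
    gc.disable()                     # keep the cyclic GC from running early
    try:
        probe = keep_exception_in_local()
        # The function has returned, yet its frame (and so marker) is
        # still reachable through the cycle.
        alive_before_gc = probe() is not None
        gc.collect()                 # only the cyclic GC can break it
        alive_after_gc = probe() is not None
    finally:
        gc.enable()
    return alive_before_gc, alive_after_gc


print(observe_cycle())
```

This is also why Python 3 implicitly deletes the name bound by 'except ... as exc' at the end of the clause: without the extra assignment to saved, no cycle would remain after the handler finishes.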
A bientot, Armin From arigo at tunes.org Mon Aug 8 10:38:12 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon, 8 Aug 2005 10:38:12 +0200 Subject: [Python-Dev] __traceback__ and reference cycles In-Reply-To: <20050808083106.GA15924@code1.codespeak.net> References: <20050808083106.GA15924@code1.codespeak.net> Message-ID: <20050808083812.GA16341@code1.codespeak.net> Hi, On Mon, Aug 08, 2005 at 10:31:06AM +0200, Armin Rigo wrote: > see the attached program... Oups. Here it is... Armin -------------- next part --------------

import sys, time

def log(typ, val, tb):
    pass

class X:
    def __del__(self):
        try:
            typo    # undefined name: raises NameError on purpose
        except Exception, e:
            # e_tb refers to a traceback whose frame is this __del__ frame,
            # which closes the cycle frame -> e_tb -> traceback -> frame
            e_type, e_value, e_tb = sys.exc_info()
            log(e_type, e_value, e_tb)

t = time.time()
while True:
    lst = [X() for i in range(1000)]
    t1 = time.time()
    print t1 - t    # time taken by each loop
    t = t1

From phd at mail2.phd.pp.ru Mon Aug 8 10:47:56 2005 From: phd at mail2.phd.pp.ru (Oleg Broytmann) Date: Mon, 8 Aug 2005 12:47:56 +0400 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <1123473109.20293.35.camel@geddy.wooz.org> References: <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> <1123473109.20293.35.camel@geddy.wooz.org> Message-ID: <20050808084756.GB28977@phd.pp.ru> On Sun, Aug 07, 2005 at 11:51:49PM -0400, Barry Warsaw wrote: > Has anyone experienced svn corruptions with the fsfs backend? I > haven't, across quite a few repositories. I haven't. But I must admit that the repositories I'm working with aren't big. The biggest is at svn.colorstudy.com (I am working on SQLObject) and since the time Ian switched from dbfs to fsfs I don't remember any problems with the repo. Speaking of merging: SVN relieved much of the pain that CVS had given me.
With CVS I had a lot of conflicts - if the code to be merged was already there (had been merged from another branch) one got a conflict. If the code contained CVS keywords (__version__ = "$Id$") cvs merge always produced conflicts. SVN fixed both problems so now I see only real conflicts. SVN just ignores the code to be merged if it has already been merged; and SVN converts keywords internally to their default form ($Id$ instead of $Id: python.c 42 phd $) before merging. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From mwh at python.net Mon Aug 8 11:29:50 2005 From: mwh at python.net (Michael Hudson) Date: Mon, 08 Aug 2005 10:29:50 +0100 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <42F60E35.9080809@egenix.com> (M.'s message of "Sun, 07 Aug 2005 15:35:49 +0200") References: <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> Message-ID: <2mll3cydwx.fsf@starship.python.net> "M.-A. Lemburg" writes: > Set the external encoding for stdin, stdout, stderr: > ---------------------------------------------------- > (also an example for adding encoding support to an > existing file object): > > def set_sys_std_encoding(encoding): > # Load encoding support > (encode, decode, streamreader, streamwriter) = codecs.lookup(encoding) > # Wrap using stream writers and readers > sys.stdin = streamreader(sys.stdin) > sys.stdout = streamwriter(sys.stdout) > sys.stderr = streamwriter(sys.stderr) > # Add .encoding attribute for introspection > sys.stdin.encoding = encoding > sys.stdout.encoding = encoding > sys.stderr.encoding = encoding > > set_sys_std_encoding('rot-13') > > Example session: >>>> print 'hello' > uryyb >>>> raw_input() > hello > h'hello' >>>> 1/0 > Genpronpx (zbfg erprag pnyy ynfg): > Svyr "", yvar 1, va ?
> MrebQvivfvbaReebe: vagrtre qvivfvba be zbqhyb ol mreb > > Note that the interactive session bypasses the sys.stdin > redirection, which is why you can still enter Python > commands in ASCII - not sure whether there's a reason > for this, or whether it's just a missing feature. Um, I'm not quite sure how this would be implemented. Interactive input comes via PyOS_Readline which deals in FILE*s... this area of the code always confuses me :( Cheers, mwh -- As it seems to me, in Perl you have to be an expert to correctly make a nested data structure like, say, a list of hashes of instances. In Python, you have to be an idiot not to be able to do it, because you just write it down. -- Peter Norvig, comp.lang.functional From mwh at python.net Mon Aug 8 11:49:15 2005 From: mwh at python.net (Michael Hudson) Date: Mon, 08 Aug 2005 10:49:15 +0100 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <1123473109.20293.35.camel@geddy.wooz.org> (Barry Warsaw's message of "Sun, 07 Aug 2005 23:51:49 -0400") References: <42E93940.6080708@v.loewis.de> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> <1123473109.20293.35.camel@geddy.wooz.org> Message-ID: <2mhde0yd0k.fsf@starship.python.net> Barry Warsaw writes: > Unfortunately, I don't think "we" (meaning specifically the collective > python.org admins) have much if any operational experience with > Perforce. Also (from someone who is on the fringes of the pydotorg admin set): I don't know that much about subversion administration. But, if it proves necessary, as it's an open source project and all, I'm willing to put some time into learning about it. 
I'm *much* less likely to do this for a closed source package unless someone is paying me to do it. Maybe I'm the only person who thinks this way, but if not, it's something to think about. Cheers, mwh -- 42. You can measure a programmer's perspective by noting his attitude on the continuing vitality of FORTRAN. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From ncoghlan at gmail.com Mon Aug 8 12:24:24 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 08 Aug 2005 20:24:24 +1000 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <42F71100.7000401@v.loewis.de> References: <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> <42F71100.7000401@v.loewis.de> Message-ID: <42F732D8.9080505@gmail.com> Martin v. L?wis wrote: > Guido van Rossum wrote: >>The bytes type could just be a very thin wrapper around array('b'). > > That answers an important question: so you want the bytes type to be > mutable (and, consequently, unsuitable as a dictionary key). I would suggest a bytes/frozenbytes pair, similar to set/frozenset and list/tuple. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From mal at egenix.com Mon Aug 8 12:42:26 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 08 Aug 2005 12:42:26 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> Message-ID: <42F73712.9040904@egenix.com> Guido van Rossum wrote: > [Guido] > >>>My first response to the PEP, however, is that instead of a new >>>built-in function, I'd rather relax the requirement that str() return >>>an 8-bit string -- after all, int() is allowed to return a long, so >>>why couldn't str() be allowed to return a Unicode string? 
> > > [MAL] > >>The problem here is that strings and Unicode are used in different >>ways, whereas integers and longs are very similar. Strings are used >>for both arbitrary data and text data, Unicode can only be used >>for text data. > > Yes, that is the case in Python 2.x. In Python 3.x, I'd like to use a > separate "bytes" array type for non-text and for encoded text data, > just like Java; strings should always be considered text data. > > We might be able to get there halfway in Python 2.x: we could > introduce the bytes type now, and provide separate APIs to read and > write them. > > (In fact, the array module and the f.readinto() method > make this possible today, but it's too klunky so nobody uses it. > Perhaps a better API would be a new file-open mode ("B"?) to indicate > that a file's read* operations should return bytes instead of strings. > The bytes type could just be a very thin wrapper around array('b'). I'd prefer to keep such bytes type immutable (arrays are mutable), otherwise, as Martin already mentioned, they wouldn't be usable as dictionary keys and the transition from the current string implementation would be made more difficult than necessary. Since we won't have any use for the string type in Py3k, why not simply strip it down to a plain bytes type ? (I wouldn't want to lose or have to reinvent all the optimizations that went into its implementation and which are missing in the array implementation.) About the file-type idea: We already have text mode and binary mode - with their implementation being platform dependent. I don't think that this is particularly good area to add new functionality. If you use codecs.open() to open a file, you could easily write a codec which implements what you have in mind. >>The new text() built-in would help make a clear distinction >>between "convert this object to a string of bytes" and >>"please convert this to a text representation". 
We need to >>start making the separation somewhere and I think this is >>a good non-invasive start. > > > I agree with the latter, but I would prefer that any new APIs we use > use a 'bytes' data type to represent non-text data, rather than having > two different sets of APIs to differentiate between the use of 8-bit > strings as text vs. data -- while we *currently* use 8-bit strings for > both text and data, in Python 3.0 we won't, so then the interim APIs > would have to change again. I'd rather intrduce a new data type and > new APIs that work with it. Well, let's put it this way: it all really depends on what str() should mean in Py3k. Given that str() is used for mixed content data strings, simply aliasing str() to unicode() in Py3k would cause a lot of breakage, due to changed semantics. Aliasing str() to bytes() would also cause breakage, due to the fact that bytes types wouldn't have string method like e.g. .lower(), .upper(), etc. Perhaps str() in Py3k should become a helper that converts bytes() to Unicode, provided the content is ASCII-only. In any case, Py3k would only have unicode() for text and bytes() for data, so there's no real need to continue using str(). If we add the text() API in Py2k and with the above meaning, then we could rename unicode() to text() in Py3k - only a cosmetical change, but one that I would find useful: text() and bytes() are more intuitive to understand than unicode() and bytes(). >>Furthermore, the text() built-in could be used to only >>allow 8-bit strings with ASCII content to pass through >>and require that all non-ASCII content be returned as >>Unicode. >> >>We wouldn't be able to enforce this in str(). >> >>I'm +1 on adding text(). > > > I'm still -1. > > >>I would also like to suggest a new formatting marker '%t' >>to have the same semantics as text() - instead of changing >>the semantics of %s as the Neil suggests in the PEP. 
Again, >>the reason is to make the difference between text and >>arbitrary data explicit and visible in the code. > > > Hm. What would be the use case for using %s with binary, non-text data? I guess we'd only keep it for backwards compatibility and map it to the str() helper. >>>The main problem for a smooth Unicode transition remains I/O, in my >>>opinion; I'd like to see a PEP describing a way to attach an encoding >>>to text files, and a way to decide on a default encoding for stdin, >>>stdout, stderr. >> >>Hmm, not sure why you need PEPs for this: > > > I'd forgotten how far we've come. I'm still unsure how the default > encoding on stdin/stdout works. Codecs in general work like this: they take an existing file-like object and wrap it with new versions of .read(), .write(), .readline(), etc. which filter the data through encoding and/or decoding functions. Once a file is wrapped with a codec StreamWriter/Reader, you can continue using it as if it were a standard file-like object. > But it still needs to be simpler; IMO the built-in open() function > should have an encoding keyword. (But it could return something whose > type is not 'file' -- once again making a distinction between open and > file.) Right, because it would then return a wrapped file object. > Do these files support universal newlines? IMO they should. Since the codecs wrap the underlying file object which does support universal newlines, this should be the case. However, you should be aware of the fact that Unicode defines a lot more line break characters than just \r, \r\n, \n. The codecs use the .splitlines() methods of strings and Unicode - which support all of them transparently, so you don't need to enable universal newlines support at all - it's sort-of enabled per default. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 08 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mal at egenix.com Mon Aug 8 13:06:31 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 08 Aug 2005 13:06:31 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <2mll3cydwx.fsf@starship.python.net> References: <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> <2mll3cydwx.fsf@starship.python.net> Message-ID: <42F73CB7.2090007@egenix.com> Michael Hudson wrote: > "M.-A. Lemburg" writes: > > >>Set the external encoding for stdin, stdout, stderr: >>---------------------------------------------------- >>(also an example for adding encoding support to an >>existing file object): >> >>def set_sys_std_encoding(encoding): >> # Load encoding support >> (encode, decode, streamreader, streamwriter) = codecs.lookup(encoding) >> # Wrap using stream writers and readers >> sys.stdin = streamreader(sys.stdin) >> sys.stdout = streamwriter(sys.stdout) >> sys.stderr = streamwriter(sys.stderr) >> # Add .encoding attribute for introspection >> sys.stdin.encoding = encoding >> sys.stdout.encoding = encoding >> sys.stderr.encoding = encoding >> >>set_sys_std_encoding('rot-13') >> >>Example session: >> >>>>>print 'hello' >> >>uryyb >> >>>>>raw_input() >> >>hello >>h'hello' >> >>>>>1/0 >> >>Genpronpx (zbfg erprag pnyy ynfg): >> Svyr "", yvar 1, va ? >>MrebQvivfvbaReebe: vagrtre qvivfvba be zbqhyb ol mreb >> >>Note that the interactive session bypasses the sys.stdin >>redirection, which is why you can still enter Python >>commands in ASCII - not sure whether there's a reason >>for this, or whether it's just a missing feature. > > > Um, I'm not quite sure how this would be implemented. Interactive > input comes via PyOS_Readline which deals in FILE*s... this area of > the code always confuses me :( Me too. 
It appears that this part of the Python code has undergone so many iterations and patches, that the structure has suffered a lot, e.g. the main() functions calls PyRun_AnyFileFlags(stdin, "", &cf), but the fp argument stdin is then subsequently ignored if the tok_nextc() function finds that a prompt is set. Anyway, hacking along the same lines, I think the above can be had by changing tok_stdin_decode() to use a possibly available sys.stdin.decode() method for the decoding of the data read by PyOS_Readline(). This would then return Unicode which tok_stdin_decode() could then encode to UTF-8 which is the encoding that the tokenizer can work on. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 08 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From pje at telecommunity.com Mon Aug 8 15:54:20 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 08 Aug 2005 09:54:20 -0400 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <42F712C9.9040000@v.loewis.de> References: <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com> <42F60E35.9080809@egenix.com> <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050808094640.025b8d98@mail.telecommunity.com> At 10:07 AM 8/8/2005 +0200, Martin v. L?wis wrote: >Phillip J. Eby wrote: > >>Hm. What would be the use case for using %s with binary, non-text data? > > > > > > Well, I could see using it to write things like netstrings, > > i.e. sock.send("%d:%s," % (len(data),data)) seems like the One Obvious > Way > > to write a netstring in today's Python at least. 
But perhaps there's a > > subtlety I've missed here. > >As written, this would stop working when strings become Unicode. It's >pretty clear what '%d' means (format the number in decimal numbers, >using "\N{DIGIT ZERO}" .. "\N{DIGIT NINE}" as the digits). It's not >all that clear what %s means: how do you get a sequence of characters >out of data, when data is a byte string? > >Perhaps there could be byte string literals, so that you would write > > sock.send(b"%d:%s," % (len(data),data)) Actually, thinking about it some more, it seems to me it's actually more like this: sock.send( ("%d:%s," % (len(data),data.decode('latin1'))).encode('latin1') ) That is, if all we have is unicode and bytes, and 'data' is bytes, then encoding and decoding from latin1 is the right way to do a netstring. It's a bit more painful, but still doable. >but this would raise different questions: >- what does %d mean for a byte string formatting? str(len(data)) > returns a character string, how do you get a byte string? > In the specific case of %d, encoding as ASCII would work, though. >- if byte strings are mutable, what about byte string literals? > I.e. if I do > > x = b"%d:%s," > x[1] = b'f' > > and run through the code the second time, will the literal have > changed? Perhaps these would be displays, not literals (although > I never understood why Guido calls these displays) I'm thinking that bytes.decode and unicode.encode are the correct way to convert between the two, and there's no such thing as a bytes literal. We can always optimize "constant.encode(constant)" to a bytes display internally if necessary, although it will be a pain for programs that have lots of bytestring constants. OTOH, we've previously discussed having a 'bytes()' constructor, and perhaps it should use latin1 as its default encoding. 
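The latin1 round trip described above can be sketched as follows. This is a minimal illustration written against the separate bytes/text types that later shipped in Python 3, which is an assumption relative to this 2005 thread; the helper name `netstring` is hypothetical.

```python
def netstring(data: bytes) -> bytes:
    """Frame data as a netstring: b"<length>:<payload>,"."""
    # latin1 maps byte values 0..255 one-to-one onto code points 0..255,
    # so decoding and re-encoding is lossless for arbitrary binary data.
    text = "%d:%s," % (len(data), data.decode("latin1"))
    return text.encode("latin1")


print(netstring(b"hello"))
print(netstring(b"\xff\x00"))
```

The length prefix counts bytes, not characters, which is why `len(data)` is taken before decoding.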
From aahz at pythoncraft.com Mon Aug 8 17:41:57 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon, 8 Aug 2005 08:41:57 -0700 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <20050808034756.GA16756@mems-exchange.org> References: <20050806102342.GA11309@mems-exchange.org> <20050808034756.GA16756@mems-exchange.org> Message-ID: <20050808154157.GA28005@panix.com> On Sun, Aug 07, 2005, Neil Schemenauer wrote: > On Sat, Aug 06, 2005 at 06:56:39PM -0700, Guido van Rossum wrote: >> >> My first response to the PEP, however, is that instead of a new >> built-in function, I'd rather relax the requirement that str() return >> an 8-bit string > > Do you have any thoughts on what the C API would be? It seems to me > that PyObject_Str cannot start returning a unicode object without a > lot of code breakage. I suppose we could introduce a function > called something like PyObject_String. OTOH, should Guido change his -1 on text(), that leads to the obvious PyObject_Text. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. From aahz at pythoncraft.com Mon Aug 8 17:45:03 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon, 8 Aug 2005 08:45:03 -0700 Subject: [Python-Dev] pdb: should next command be extended? In-Reply-To: References: Message-ID: <20050808154503.GB28005@panix.com> On Sun, Aug 07, 2005, Ilya Sandler wrote: > > Solution: > > Should pdb's next command accept an optional numeric argument? It would > specify how many actual lines of code (not "line events") > should be skipped in the current frame before stopping, At OSCON, Anthony Baxter made the point that pdb is currently one of the more unPythonic modules. If you're feeling a lot of energy about this, rewriting pdb might be more productive. 
-- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. From gvanrossum at gmail.com Mon Aug 8 18:14:18 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon, 8 Aug 2005 09:14:18 -0700 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <20050808154157.GA28005@panix.com> References: <20050806102342.GA11309@mems-exchange.org> <20050808034756.GA16756@mems-exchange.org> <20050808154157.GA28005@panix.com> Message-ID: Ouch. Too much discussion to respond to it all. Please remember that in Jython and IronPython, str and unicode are already synonyms. That's how Python 3.0 will do it, except unicode will disappear as being redundant. I like the bytes/frozenbytes pair idea. Streams could grow a getpos()/setpos() API pair that can be used for stateful encodings (although it sounds like seek()/tell() would be okay to use in most cases as long as you read in units of whole lines). For sockets, send()/recv() would deal in bytes, and makefile() would get an encoding parameter. I'm not going to change my mind on text() unless someone explains what's so attractive about it. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Mon Aug 8 18:16:22 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon, 8 Aug 2005 09:16:22 -0700 Subject: [Python-Dev] __traceback__ and reference cycles In-Reply-To: <20050808083106.GA15924@code1.codespeak.net> References: <20050808083106.GA15924@code1.codespeak.net> Message-ID: On 8/8/05, Armin Rigo wrote: > Attaching a __traceback__ will only make this "bug" show up more often, > as the 'except Exception, e' line in a __del__() method would be enough > to trigger it. Hmm... We can blame this on __del__ as much as on __traceback__, of course... But it is definitely of concern.
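The cycle under discussion can be sketched as below. This assumes a later Python in which exceptions do carry a `__traceback__` attribute (only a proposal at the time of this thread) and relies on CPython's `sys._getframe()`; the function name is hypothetical.

```python
import sys


def frame_cycle_exists():
    try:
        raise ValueError("boom")
    except ValueError as exc:
        saved = exc  # keep the exception alive in a local variable
    # The traceback refers to this frame, and this frame's locals refer
    # back to the exception: frame -> saved -> traceback -> frame.
    return saved.__traceback__.tb_frame is sys._getframe()


print(frame_cycle_exists())
```

Such a cycle is only reclaimed by the cyclic garbage collector, which is why objects with `__del__` methods caught up in it were a concern.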
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Mon Aug 8 18:33:39 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 08 Aug 2005 12:33:39 -0400 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050808154157.GA28005@panix.com> <20050806102342.GA11309@mems-exchange.org> <20050808034756.GA16756@mems-exchange.org> <20050808154157.GA28005@panix.com> Message-ID: <5.1.1.6.0.20050808122851.025d3e90@mail.telecommunity.com> At 09:14 AM 8/8/2005 -0700, Guido van Rossum wrote: >I'm not going to change my mind on text() unless >someone explains what's so attractive about it. 1. It's obvious to non-programmers what it's for (str and unicode aren't) 2. It's more obvious to programmers that it's a *text* string rather than a string of bytes 3. It's easier to type than "unicode", but less opaque than "str" 4. Switching to 'text' and 'bytes' allows for a clean break from any mental baggage now associated with 'unicode' and 'str'. Of course, the flip side to these arguments is that in today's Python, one rarely has use for the string type names, except for coercion and some occasional type checking. On the other hand, if we end up with type declarations, then these issues become a bit more important. From mal at egenix.com Mon Aug 8 18:51:47 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 08 Aug 2005 18:51:47 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> <20050808034756.GA16756@mems-exchange.org> <20050808154157.GA28005@panix.com> Message-ID: <42F78DA3.80605@egenix.com> Guido van Rossum wrote: > Ouch. Too much discussion to respond to it all. Please remember that > in Jython and IronPython, str and unicode are already synonyms. I know, but don't understand that argument: aren't we talking about Python in general, not some particular implementation ?
Why should CPython applications break just to permit Jython and IronPython applications not to break ? > That's how Python 3.0 will do it, except unicode will disappear as being > redundant. I like the bytes/frozenbytes pair idea. Streams could grow > a getpos()/setpos() API pair that can be used for stateful encodings > (although it sounds like seek()/tell() would be okay to use in most > cases as long as you read in units of whole lines). Please don't confuse the raw bytes position in a file or stream with e.g. an index into the possibly decoded data. Those are two different pairs of shoes. Since the position into decoded data depends on what type of encoding you're using and how you decode, the "position" would not be defined across streams, but would depend on the features of a particular stream. > For sockets, send()/recv() would deal in bytes, and makefile() would get an > encoding parameter. I'm not going to change my mind on text() unless > someone explains what's so attractive about it. Please read my reply for some reasoning and also Phillip's answer to your posting. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 08 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From aahz at pythoncraft.com Mon Aug 8 18:56:20 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon, 8 Aug 2005 09:56:20 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <1123473109.20293.35.camel@geddy.wooz.org> References: <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> <1123473109.20293.35.camel@geddy.wooz.org> Message-ID: <20050808165620.GA8064@panix.com> On Sun, Aug 07, 2005, Barry Warsaw wrote: > > We'd also have to teach the current crop of developers how to use the > client tools effectively. I think it's fairly simple to teach a CVS > user how to use Subversion, but have no idea if translating CVS > experience to Perforce is as straightforward. The impression I got from Alex Martelli is that it's not particularly straightforward. (Google apparently uses Perforce.) -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. From bcannon at gmail.com Mon Aug 8 19:14:30 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 8 Aug 2005 10:14:30 -0700 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <42F6F6E2.7020007@v.loewis.de> References: <42F61C03.6050703@v.loewis.de> <42F6F6E2.7020007@v.loewis.de> Message-ID: On 8/7/05, "Martin v. Löwis" wrote: > Brett Cannon wrote: > > What is going in under python/ ? If it is what is currently > > /dist/src/, then great and the renaming of the repository works. > > Just have a look yourself :-) Yes, this is dist/src. > Ah, OK. I didn't drill far enough down. Not enough experience with svn to realize that the directory was not just filled with default directories.
> > But if that is what src/ is going to be used for > > This is nondist/src. Perhaps I should just move nondist/src/Compiler, > and drop nondist/src. > Wouldn't hurt. Since svn allows directory deletion there doesn't seem to be a huge need to worry about the projects/ directory getting too large. > > And I assume you are going to list the directory structure in the PEP > > at some point. > > Please take a look at the PEP. > OK, now I see it. I scanned the PEP initially but I didn't see it; guess I was expecting a more literal directory list than a paragraph on it. -Brett From trentm at ActiveState.com Mon Aug 8 20:51:00 2005 From: trentm at ActiveState.com (Trent Mick) Date: Mon, 8 Aug 2005 11:51:00 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42EF2794.1000209@v.loewis.de> References: <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> Message-ID: <20050808185100.GJ16963@ActiveState.com> > > Since Python is Open Source are you looking at Per Force which you can > > use for free and seems to be a happy medium between something like CVS > > and something horrific like Clear Case? > > No. The PEP is only about Subversion. Why should we be looking at Per > Force? Only because Python is Open Source? Perforce offers free licensing to open source projects. > I think anything but Subversion is ruled out because: > - there is no offer to host that anywhere (for subversion, there is > already svn.python.org) > - there is no support for converting a CVS repository (for subversion, > there is cvs2svn) There *is* support for converting a CVS repository to Perforce [1].
Perforce is very good, stable, solid, reliable, good tools, etc. etc. but I'd tend to support SVN over Perforce for Python development. Perforce usage is quite different than CVS (would be a painful re-learning for old CVS-hands) and SVN tends to better support highly distributed development: most operations don't need to talk to the server, with Perforce (aka p4), almost *all* operations talk to the server. This can be somewhat mitigated with "p4proxy" (a tool that Perforce also provides) but people would be happier with SVN, I'd bet. [1] It is a project called VCP. Some details here (I didn't look too hard): http://www.cpan.org/modules/by-module/LWP/AUTRIJUS/VCP-autrijus-snapshot-0.9-20041020.readme Trent -- Trent Mick TrentM at ActiveState.com From trentm at ActiveState.com Mon Aug 8 20:58:06 2005 From: trentm at ActiveState.com (Trent Mick) Date: Mon, 8 Aug 2005 11:58:06 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <20050808165620.GA8064@panix.com> References: <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> <1123473109.20293.35.camel@geddy.wooz.org> <20050808165620.GA8064@panix.com> Message-ID: <20050808185806.GK16963@ActiveState.com> [Aahz wrote] > On Sun, Aug 07, 2005, Barry Warsaw wrote: > > > > We'd also have to teach the current crop of developers how to use the > > client tools effectively. I think it's fairly simple to teach a CVS > > user how to use Subversion, but have no idea if translating CVS > > experience to Perforce is as straightforward. > > The impression I got from Alex Martelli is that it's not particularly > straightforward. Agreed. > (Google apparently uses Perforce.) We do at ActiveState as well. 
*The* Perl source code repository is a Perforce one (hosted separately here at ActiveState as well). Microsoft licenses the Perforce code and uses it (with some slight modifications I hear) internally. Trent -- Trent Mick TrentM at ActiveState.com From falcon at intercable.ru Thu Aug 4 09:09:33 2005 From: falcon at intercable.ru (falcon) Date: Thu, 4 Aug 2005 11:09:33 +0400 Subject: [Python-Dev] PEP-343 - Context Managment variant Message-ID: <313379598.20050804110933@intercable.ru> I know I came after the battle. And I have just another sight on context managment.
Simple context management may look like this in Python 2.4.1: Synchronized example: def Synhronised(lock,func): lock.acquire() try: func() finally: lock.release() .... lock=Lock() def Some(): local_var1=x local_var2=y local_var3=Z def Work(): global local_var3 local_var3=Here_I_work(local_var1,local_var2,local_var3) Synhronised(lock,Work) return asd(local_var3) FileOpenClose Example: def FileOpenClose(name,mode,func): f=file(name,mode) try: func(f) finally: f.close() .... def Another(name): local_var1=x local_var2=y local_var3=None def Work(f): global local_var3 local_var3=[[x,y(i)] for i in f] FileOpenClose(name,'r',Work) return local_var3 This is complicated because: 1. We must declare the closure (local function) before using it. 2. We must declare, in a "global" statement, the local variables we wish to assign to. 3. We cannot create a variable local to the outer function from inside the closure, so we must create it beforehand and declare it in "global". So one can say: "that is because there is no block lambda". (I may be wrong.) I think it is possible to introduce a block object by analogy with the lambda object (and function object). Its difference from a function: it has no local variables truly its own; it takes all global (free) and local variables from the outer scope. And all local variables introduced in the block are really introduced in the outer scope (I think, at parse time). So its global and local dicts are really the corresponding dicts of the outer scope. (Excuse my English.) So maybe it would be faster than a function call. I don't know.
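For comparison, the two helpers above can be sketched with the `with` statement that PEP 343 proposes, written here in later Python syntax (an assumption relative to this thread). The names `opened`, `some` and `another` are hypothetical stand-ins for FileOpenClose, Some and Another, and the arithmetic replaces the undefined Here_I_work.

```python
import threading
from contextlib import contextmanager


@contextmanager
def opened(name, mode):
    """Hypothetical counterpart of FileOpenClose: open, yield, always close."""
    f = open(name, mode)
    try:
        yield f
    finally:
        f.close()


lock = threading.Lock()


def some(x, y, z):
    # The body of the with block runs in the enclosing function's scope,
    # so it can rebind z directly; no closure or "global" is needed.
    with lock:
        z = x + y + z
    return z


def another(name):
    with opened(name, "r") as f:
        lines = [line.rstrip("\n") for line in f]
    return lines
```

Because the block is not a separate function, the three complications listed above (declaring the closure first, declaring assigned names, pre-creating outer variables) do not arise in this form.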
The syntax could be as follows:

    lock=Lock()
    def Some():
        local_var1=x
        local_var2=y
        local_var3=Z
        Synhronised(lock,block)
            local_var3=Here_I_work(local_var1,local_var2,local_var3)
        return asd(local_var3)

    def Another(name):
        local_var1=x
        local_var2=y
        FileOpenClose(name,'r',block{f})
            local_var3=[[x,y(i)] for i in f]
        return local_var3

And here is a sample of returning a value:

    def Foo(context,proc):
        context.enter()
        try:
            res=proc(context.x,context.y)*2
        except Exception,Err:
            context.throw(Err)
        finally:
            context.exit()
        return res
    ...
    context=MyContext()
    ...
    def Bar():
        result=Foo(context,block{x,y})
            continue x+y
        return result

The idea was taken from Ruby. But I think a good idea is good wherever it comes from. It may not be Pythonic.

From aahz at pythoncraft.com Mon Aug 8 22:09:28 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 8 Aug 2005 13:09:28 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050808185100.GJ16963@ActiveState.com>
References: <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com>
Message-ID: <20050808200928.GA24381@panix.com>

On Mon, Aug 08, 2005, Trent Mick wrote:
>Martin:
>>
>> No. The PEP is only about Subversion. Why should we be looking at Per
>> Force? Only because Python is Open Source?
>
> Perforce offers free licensing to open source projects.

So did BitKeeper. Linux got bitten by that. We'd need a strong incentive to consider Perforce over Subversion just because of that issue.
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

The way to build large Python applications is to componentize and loosely-couple the hell out of everything.
From martin at v.loewis.de Mon Aug 8 23:37:48 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 08 Aug 2005 23:37:48 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050808185100.GJ16963@ActiveState.com>
References: <1f7befae050728172161d4a9e8@mail.gmail.com> <200507281956.03788.jeff@taupro.com> <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com>
Message-ID: <42F7D0AC.5020003@v.loewis.de>

Trent Mick wrote:
>>No. The PEP is only about Subversion. Why should we be looking at Per
>>Force? Only because Python is Open Source?
>
>
> Perforce offers free licensing to open source projects.

Ok, so I now got "it's mature", and "it would be without charges". Given that it is now running against Subversion, I would be still interested in advantages that it offers compared to svn.

The biggest disadvantage, to me, is that few people know how to use it (myself included). I don't trust tools I've never used. So for me, as the author of this PEP, usage of the revision control software is non-negotiable (selection of hoster, to a limited degree, is).

If you want to see Perforce used for the Python development, you should write a counter-PEP, so we could let the BDFL decide.
[This is a theoretical "you" here, since you then explain that you would still prefer to use subversion] Regards, Martin From martin at v.loewis.de Mon Aug 8 23:42:01 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 08 Aug 2005 23:42:01 +0200 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <5.1.1.6.0.20050808094640.025b8d98@mail.telecommunity.com> References: <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com> <42F60E35.9080809@egenix.com> <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com> <5.1.1.6.0.20050808094640.025b8d98@mail.telecommunity.com> Message-ID: <42F7D1A9.4000909@v.loewis.de> Phillip J. Eby wrote: > Actually, thinking about it some more, it seems to me it's actually more > like this: > > sock.send( ("%d:%s," % > (len(data),data.decode('latin1'))).encode('latin1') ) While this would work, it would still feel wrong: the binary data are *not* latin1 (most likely), so declaring them to be latin1 would be confusing. Perhaps a synonym '8bit' for latin1 could be introduced. Regards, Martin From trentm at ActiveState.com Tue Aug 9 00:49:12 2005 From: trentm at ActiveState.com (Trent Mick) Date: Mon, 8 Aug 2005 15:49:12 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F7D0AC.5020003@v.loewis.de> References: <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de> Message-ID: <20050808224912.GA11584@ActiveState.com> One feature I like in Perforce (which Subversion doesn't have) is the ability to have pending changesets. 
A changeset is, as with subversion, something you check-in atomically. Pending changesets in Perforce allow you to (1) group related files in a source tree where you might be working on multiple things at once to ensure and (2) to build a change description as you go. In a large source tree this can be useful for separating chunks of work.

There are other little things, like not being able to trim the check-in filelist when editing the check-in message. For example, say you have 10 files checked out scattered around the Python source tree and you want to check 9 of those in. Currently with svn you have to manually specify those 9 to be sure to not include the remaining one. With p4 you just say to check-in the whole tree and then remove that one from the list give you in your editor with entering the check-in message. Not that big of a deal.

[Martin v. Löwis on Perforce]
> The biggest disadvantage, to me, is that few people know how
> to use it (myself included).

Granted. For that reason and for a couple of others I mentioned (SVN will probably work better for offline and distributed developers) I think Subversion wins over Perforce. That is presuming, of course, that we find Subversion to be acceptably stable/robust/manageable.

Trent
--
Trent Mick TrentM at ActiveState.com

From nas at arctrix.com Tue Aug 9 00:51:52 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 8 Aug 2005 16:51:52 -0600
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To:
References: <20050806102342.GA11309@mems-exchange.org>
Message-ID: <20050808225151.GA19090@mems-exchange.org>

On Sat, Aug 06, 2005 at 06:56:39PM -0700, Guido van Rossum wrote:
> My first response to the PEP, however, is that instead of a new
> built-in function, I'd rather relax the requirement that str() return
> an 8-bit string -- after all, int() is allowed to return a long, so
> why couldn't str() be allowed to return a Unicode string?

I've played with this idea a bit and it seems viable.
I modified my original patch to have string_new call PyObject_Text instead of PyObject_Str. That change breaks only two tests, both in test_email. The tracebacks are attached. Both problems seem relatively shallow. Do you think such a change could go into 2.5?

Neil

Traceback (most recent call last):
  File "/home/nas/Python/py_cvs/Lib/email/test/test_email.py", line 2844, in test_encoded_adjacent_nonencoded
    h = make_header(decode_header(s))
  File "/home/nas/Python/py_cvs/Lib/email/Header.py", line 123, in make_header
    charset = Charset(charset)
  File "/home/nas/Python/py_cvs/Lib/email/Charset.py", line 190, in __init__
    input_charset = unicode(input_charset, 'ascii').lower()
TypeError: decoding Unicode is not supported

Traceback (most recent call last):
  File "/home/nas/Python/py_cvs/Lib/email/test/test_email.py", line 2750, in test_multilingual
    eq(decode_header(enc),
  File "/home/nas/Python/py_cvs/Lib/email/Header.py", line 85, in decode_header
    dec = email.quopriMIME.header_decode(encoded)
  File "/home/nas/Python/py_cvs/Lib/email/quopriMIME.py", line 319, in header_decode
    return re.sub(r'=\w{2}', _unquote_match, s)
  File "/home/nas/Python/py_cvs/Lib/sre.py", line 142, in sub
    return _compile(pattern, 0).sub(repl, string, count)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 0: ordinal not in range(128)

From tim.peters at gmail.com Tue Aug 9 01:29:07 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 8 Aug 2005 19:29:07 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050808224912.GA11584@ActiveState.com>
References: <1f7befae05072819142c36e610@mail.gmail.com> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de> <20050808224912.GA11584@ActiveState.com>
Message-ID:
<1f7befae05080816294bbc1100@mail.gmail.com> [Trent Mick] > ... > There are other little things, like not being able to trim the check-in > filelist when editing the check-in message. For example, say you have > 10 files checked out scattered around the Python source tree and you > want to check 9 of those in. This seems dubious, since you're not checking in the state you actually have locally, and you were careful to run the full Python test suite with your full local state ;-) > Currently with svn you have to manually specify those 9 to be sure to not > include the remaining one. With p4 you just say to check-in the whole tree > and then remove that one from the list give you in your editor with entering > the check-in message. Not that big of a deal. As a purely theoretical exercise , the last time I faced that under SVN, I opened the single file I didn't want to check-in in my editor, did "svn revert" on it from the cmdline, checked in the whole tree, and then hit the editor's "save" button. This doesn't scale well to skipping 25 of 50, but it's effective enough for 1 or 2. From pinard at iro.umontreal.ca Tue Aug 9 01:41:58 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Mon, 8 Aug 2005 19:41:58 -0400 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <5.1.1.6.0.20050808122851.025d3e90@mail.telecommunity.com> References: <20050808154157.GA28005@panix.com> <20050806102342.GA11309@mems-exchange.org> <20050808034756.GA16756@mems-exchange.org> <20050808154157.GA28005@panix.com> <5.1.1.6.0.20050808122851.025d3e90@mail.telecommunity.com> Message-ID: <20050808234158.GA19081@alcyon.progiciels-bpi.ca> [Phillip J. Eby] > At 09:14 AM 8/8/2005 -0700, Guido van Rossum wrote: > > I'm not going to change my mind on text() unless someone explains > > what's so attractive about it. > 2. 
It's more obvious to programmers that it's a *text* string rather > than a string of bytes I've no opinion on the proposal on itself, except maybe that "text", that precise word or name, is a pretty bad choice. It is far too likely that people already use or want to use that precise identifier. There once was a suggestion for naming "text" the module now known as "textwrap", under the premise that it could be later extended for holding many other various text-related functions. Happily enough, this idea was not retained. "textwrap" is much more reasonable as a name. I found Python 1.5.2's "string" to be especially prone to clashing. I still find "socket" obtrusive in that respect. Consider "len" as an example of a clever choice, while "length" would not have been. "str" is also a good choice. "object" is a bit more annoying theoretically, yet we almost never need it in practice. "type" is annoying as a name (yet very nice as a concept), as if it was free to use, it would often serve to label our own things. The fact is we often need the built-in. Python should not choose common English words for its built-ins, without very careful thought, and be reluctant to any compulsion in this area. 
-- François Pinard http://pinard.progiciels-bpi.ca

From abo at minkirri.apana.org.au Tue Aug 9 02:32:16 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 08 Aug 2005 17:32:16 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050808224912.GA11584@ActiveState.com>
References: <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de> <20050808224912.GA11584@ActiveState.com>
Message-ID: <1123547536.3700.109.camel@warna.corp.google.com>

On Mon, 2005-08-08 at 15:49, Trent Mick wrote:
> One feature I like in Perforce (which Subversion doesn't have) is the
> ability to have pending changesets. A changeset is, as with subversion,
> something you check-in atomically. Pending changesets in Perforce allow
> you to (1) group related files in a source tree where you might be
> working on multiple things at once to ensure and (2) to build a change
> description as you go. In a large source tree this can be useful for
> separating chunks of work.

This seems like a poor workaround for crappy branch/merge support.

I'm new to perforce, but the pending changesets seem dodgey to me... you are accumulating changes gradually without recording any history during the process... ie, no checkins until the end.

Even worse, perforce seems to treat clients like "unversioned branches", allowing you to review and test pending changesets in other clients. This supposedly allows people to review/test each others changes before they are committed. The problem is, since these changes are not committed, there is no firm history of what what was reviewed/tested vs what gets committed... ie they could be different.
Having multiple different pending changesets in one large source tree also feels like a workaround for high client overheads. Trying to develop and test a mixture of different changes in one source tree is asking for trouble... they can interact. Maybe I just haven't grokked perforce yet... which might be considered a black mark against it's learning curve :-) For me, the logical way to group a collection of changes is in a branch. This allows you to commit and track history of the collection of changes. You check out each branch into different directories and develop/test them independantly. The branch can then be reviewed and merged when it is complete. -- Donovan Baarda From trentm at ActiveState.com Tue Aug 9 02:33:45 2005 From: trentm at ActiveState.com (Trent Mick) Date: Mon, 8 Aug 2005 17:33:45 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <1f7befae05080816294bbc1100@mail.gmail.com> References: <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de> <20050808224912.GA11584@ActiveState.com> <1f7befae05080816294bbc1100@mail.gmail.com> Message-ID: <20050809003345.GB23158@ActiveState.com> [Tim Peters wrote] > [Trent Mick] > > ... > > There are other little things, like not being able to trim the check-in > > filelist when editing the check-in message. For example, say you have > > 10 files checked out scattered around the Python source tree and you > > want to check 9 of those in. > > This seems dubious, since you're not checking in the state you > actually have locally, Say that 10th file is a documentation fix for a module unrelated to the other 9 files. > and you were careful to run the full Python > test suite with your full local state ;-) Absolutely. Er. Always. 
:) Trent -- Trent Mick TrentM at ActiveState.com From trentm at ActiveState.com Tue Aug 9 02:51:23 2005 From: trentm at ActiveState.com (Trent Mick) Date: Mon, 8 Aug 2005 17:51:23 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <1123547536.3700.109.camel@warna.corp.google.com> References: <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de> <20050808224912.GA11584@ActiveState.com> <1123547536.3700.109.camel@warna.corp.google.com> Message-ID: <20050809005123.GC23158@ActiveState.com> Who made me the Perforce-bitch? Here I am screaming "Subversion! Subversion!" and y'all think I just using that as cover for a p4 lover affair. :) [Donovan Baarda wrote] > On Mon, 2005-08-08 at 15:49, Trent Mick wrote: > > One feature I like in Perforce (which Subversion doesn't have) is the > > ability to have pending changesets. A changeset is, as with subversion, > > something you check-in atomically. Pending changesets in Perforce allow > > you to (1) group related files in a source tree where you might be > > working on multiple things at once to ensure and (2) to build a change > > description as you go. In a large source tree this can be useful for > > separating chunks of work. > > This seems like a poor workaround for crappy branch/merge support. More like a pretty nice independent self-organizing feature that was necessitated as a workaround for a crappy solution (clientspecs) for huge data trees. > I'm new to perforce, but the pending changesets seem dodgey to me... you > are accumulating changes gradually without recording any history during > the process... ie, no checkins until the end. You want to do checkins of code in a consisten state. Some large changes take a couple of days to write. 
During which one may have to do a couple minor things in unrelated sections of a project. Having some mechanism to capture some thoughts and be able to say "these are the relevant source files for this work" is handy. Creating a branch for something that takes a couple of days is overkill. Perforce branching is pretty good in my experience. For very long projects one can easily create a branch. > Even worse, perforce seems to treat clients like "unversioned branches", > allowing you to review and test pending changesets in other clients. I'm not sure what you are talking about here. Yes, client information is stored on the server, but the *changes* (i.e. the diffs) on the client aren't so you must be talking about some other tool. Actually, if there *were* such a feature that would be quite handy. I'd love to be able to easily transfer my diffs developed on my Windows box to my Linux or Mac OS X box to quickly test changes there before checking in. > This supposedly allows people to review/test each others changes before > they are committed. The problem is, since these changes are not > committed, there is no firm history of what what was reviewed/tested vs > what gets committed... ie they could be different. The alternative being either that you have separate branches for everything (can be a pain) or just check-in for review (possibly breaking the build or functionality for other developers until the review is done). Actually the Perl guys working on PureMessage downstairs have two branches going in Perforce: one for checking into right away and then a cleaner tree to which only reviewed check-ins from the first are integrated. I'm not saying I am awash in pending changelists here. Nor that they should be used for what is better handled with branching. It is a tool (and a minor one). > Trying to develop and test a mixture of different changes in one > source tree is asking for trouble... they can interact. ...within reason. 
Trent -- Trent Mick TrentM at ActiveState.com From foom at fuhm.net Tue Aug 9 02:56:51 2005 From: foom at fuhm.net (James Y Knight) Date: Mon, 8 Aug 2005 20:56:51 -0400 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: References: <20050806102342.GA11309@mems-exchange.org> <20050808034756.GA16756@mems-exchange.org> <20050808154157.GA28005@panix.com> Message-ID: <2B24D218-9919-4CF7-AEF2-7335B8360878@fuhm.net> On Aug 8, 2005, at 12:14 PM, Guido van Rossum wrote: > Ouch. Too much discussion to respond to it all. Please remember that > in Jythin and IronPython, str and unicode are already synonyms. That's > how Python 3.0 will do it, except unicode will disappear as being > redundant. I like the bytes/frozenbytes pair idea. Streams could grow > a getpos()/setpos() API pair that can be used for stateful encodings > (although it sounds like seek()/tell() would be okay to use in most > cases as long as you read in units of whole lines). For sockets, > send()/recv() would deal in bytes, and makefile() would get an > encoding parameter. I'm not going to change my mind on text() unless > someone explains what's so attractive about it. Files no more have an encoding than sockets do. Reading/writing them should ideally (by default) result in bytes. codecs.open and codecs.StreamReaderWriter provide the character-converting wrapper around file-like objects. I agree that getpos/setpos may be a useful addition to the API, but only because it would allow StreamReaderWriter to override it to do something useful. For normal files it could simply be an alias for tell/seek. Of course, someone would have to actually implement the ability to save and restore state for every codec... Hum, actually, it somewhat makes sense for the "open" builtin to become what is now "codecs.open", for convenience's sake, although it does blur the distinction between a byte stream and a character stream somewhat. 
If that happens, I suppose it does actually make sense to give "makefile" the same signature. James From tim.peters at gmail.com Tue Aug 9 03:12:49 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 8 Aug 2005 21:12:49 -0400 Subject: [Python-Dev] __traceback__ and reference cycles In-Reply-To: <20050808083106.GA15924@code1.codespeak.net> References: <20050808083106.GA15924@code1.codespeak.net> Message-ID: <1f7befae05080818127ad30e63@mail.gmail.com> [Armin Rigo] > There are various proposals to add an attribute on exception instances > to store the traceback (see PEP 344). A detail not discussed, which I > thought of historical interest only, is that today's exceptions try very > hard to avoid reference cycles, in particular the cycle > > 'frame -> local variable -> traceback object -> frame' > > which was important for pre-GC versions of Python. A clause 'except > Exception, e' would not create a local reference to the traceback, only > to the exception instance. If the latter grows a __traceback__ > attribute, it is no longer true, and every such except clause typically > creates a cycle. > > Of course, we don't care, we have a GC -- do we? Well, there are cases > where we do: see the attached program... In my opinion it should be > considered a bug of today's Python that this program leaks memory very > fast and takes longer and longer to run each loop (each loop takes half > a second longer than the previous one!). (I don't know how this bug > could be fixed, though.) > > Spoiling the fun of figuring out what is going on, the reason is that > 'e_tb' creates a reference cycle involving the frame of __del__, which > keeps a reference to 'self' alive. Python thinks 'self' was > resurrected. The next time the GC runs, the cycle disappears, and the > refcount of 'self' drops to zero again, calling __del__ again -- which > gets resurrected again by a new cycle. Etc... Note that no cycle > actually contains 'self'; they just point to 'self'. 
In summary, no X > instance gets ever freed, but they all have their destructors called > over and over again. > > Attaching a __traceback__ will only make this "bug" show up more often, > as the 'except Exception, e' line in a __del__() method would be enough > to trigger it. > > Not sure what to do about it. I just thought I should share these > thoughts (I stumbled over almost this problem in PyPy). I can't think of a Python feature with a higher aggregate braincell_burned / benefit ratio than __del__ methods. If P3K retains them-- or maybe even before --we should consider taking "the Java dodge" on this one. That is, decree that henceforth a __del__ method will get invoked by magic at most once on any given object O, no matter how often O is resurrected. It's been mentioned before, but it's at least theoretically backward-incompatible, so "it's scary". I can guarantee I don't have any code that would care, including all the ZODB code I watch over these days. For ZODB it's especially easy to be sure of this: the only __del__ method in the whole thing appears in the test suite, verifying that ZODB's object cache no longer gets into an infinite loop when a user-defined persistent object has a brain-dead __del__ method that reloads self from the database. (Interestingly enough, if Python guaranteed to call __del__ at most once, the infinite loop in ZODB's object cache never would have appeared in this case.) 
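
[Editorial illustration] The "at most once" rule Tim describes above can be emulated by hand with a flag on the instance. What follows is only a minimal sketch of that idea, not code from the thread: the names Tracked, survivors and demo are hypothetical, and the sketch assumes CPython-style reference counting so that the finalizer runs as soon as the last reference is dropped.

```python
import gc

# Hypothetical module-level list; appending to it from __del__ resurrects the object.
survivors = []


class Tracked:
    """Toy object whose finalizer resurrects it, guarded by a run-once flag."""

    cleanup_runs = 0  # class-wide count of how often the real cleanup executed

    def __init__(self):
        self.finalized = False  # becomes True once the cleanup has run

    def __del__(self):
        if self.finalized:
            return  # already cleaned up on an earlier death: do nothing
        self.finalized = True
        type(self).cleanup_runs += 1
        survivors.append(self)  # a new reference keeps self alive


def demo():
    """Let one object die twice and report how often its cleanup ran."""
    before = Tracked.cleanup_runs
    obj = Tracked()
    del obj               # first death: cleanup runs, object is resurrected
    gc.collect()
    after_first = Tracked.cleanup_runs - before
    survivors.clear()     # second death: the guard (or the interpreter) skips cleanup
    gc.collect()
    after_second = Tracked.cleanup_runs - before
    return after_first, after_second, len(survivors)


if __name__ == "__main__":
    print(demo())
```

The flag only protects the cleanup body; whether the interpreter itself calls __del__ a second time depends on the Python version, which is exactly the behaviour the proposal would pin down.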
From abo at minkirri.apana.org.au Tue Aug 9 03:13:10 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Mon, 08 Aug 2005 18:13:10 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <20050809005123.GC23158@ActiveState.com> References: <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de> <20050808224912.GA11584@ActiveState.com> <1123547536.3700.109.camel@warna.corp.google.com> <20050809005123.GC23158@ActiveState.com> Message-ID: <1123549990.3695.119.camel@warna.corp.google.com> On Mon, 2005-08-08 at 17:51, Trent Mick wrote: [...] > [Donovan Baarda wrote] > > On Mon, 2005-08-08 at 15:49, Trent Mick wrote: [...] > You want to do checkins of code in a consisten state. Some large changes > take a couple of days to write. During which one may have to do a couple > minor things in unrelated sections of a project. Having some mechanism > to capture some thoughts and be able to say "these are the relevant I don't need to checkin in a consitent state if I'm working on a seperate branch. I can checkin any time I want to record a development checkpoint... I can capture the thoughts in the version history of the branch. > source files for this work" is handy. Creating a branch for something > that takes a couple of days is overkill. [...] > The alternative being either that you have separate branches for > everything (can be a pain) or just check-in for review (possibly It all comes down to how painless branch/merge is. Many esoteric "features" of version control systems feel like they are there to workaround the absence of proper branch/merge histories. Note: SVN doesn't have branch/merge histories either. 
-- Donovan Baarda From gvanrossum at gmail.com Tue Aug 9 03:18:06 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon, 8 Aug 2005 18:18:06 -0700 Subject: [Python-Dev] __traceback__ and reference cycles In-Reply-To: <1f7befae05080818127ad30e63@mail.gmail.com> References: <20050808083106.GA15924@code1.codespeak.net> <1f7befae05080818127ad30e63@mail.gmail.com> Message-ID: On 8/8/05, Tim Peters wrote: > I can't think of a Python feature with a higher aggregate > braincell_burned / benefit ratio than __del__ methods. If P3K retains > them-- or maybe even before --we should consider taking "the Java > dodge" on this one. That is, decree that henceforth a __del__ method > will get invoked by magic at most once on any given object O, no > matter how often O is resurrected. I'm sympathetic to this one. Care to write a PEP? It could be really short and sweet as long as it provides enough information to implement the feature. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ilya at bluefir.net Tue Aug 9 03:15:26 2005 From: ilya at bluefir.net (Ilya Sandler) Date: Mon, 8 Aug 2005 18:15:26 -0700 (PDT) Subject: [Python-Dev] an alternative suggestion, Re: pdb: should next command be extended? In-Reply-To: References: <42F679DC.6030705@v.loewis.de> Message-ID: > > Should pdb's next command accept an optional numeric argument? It would > > specify how many actual lines of code (not "line events") > > should be skipped in the current frame before stopping, > That would differ from gdb's "next ", which does "next" n times. > It would be confusing if pdb accepted the same command, but it > meant something different. So, would implementing gdb's "until" command instead of "next N" be a better idea? In its simplest form (without arguments) "until" advances to the next (textually) source line... This would solve the original problem of getting over list comprehensions... 
There is a bit of a problem with abbreviation of "until": gdb abbreviates it as "u", while in pdb "u" means "up"...It might be a good idea to have the same abbreviations Ilya Problem: When the code contains list comprehensions (or for that matter any other looping construct), the only way to get quickly through this code in pdb is to set a temporary breakpoint on the line after the loop, which is inconvenient.. There is a SF bug report #1248119 about this behavior. On Sun, 7 Aug 2005, Ilya Sandler wrote: > > > On Sun, 7 Aug 2005, [ISO-8859-1] "Martin v. Löwis" wrote: > > > Ilya Sandler wrote: > > > Should pdb's next command accept an optional numeric argument? It would > > > specify how many actual lines of code (not "line events") > > > should be skipped in the current frame before stopping, > > [...] > > > What do you think? > > > > That would differ from gdb's "next ", which does "next" n times. > > It would be confusing if pdb accepted the same command, but it > > meant something different. > > But as far as I can tell, pdb's next is > already different from gdb's next! gdb's next seem to always go to the > different source line, while pdb's next may stay on the current line. > > The problem with "next " meaning "repeat next n times" is that it > seems to be less useful that the original suggestion. > > Any alternative suggestions to allow to step over list comprehensions and > such? (SF 1248119) > > > Plus, there is always a chance that > > +n is never reached, which would also be confusing. > > That should not be a problem, returning from the current frame should be > treated as a stopping condition (similarly to the current "next" > behaviour)... > > Ilya > > > > > So I'm -1 here.
> > > > Regards, > > Martin > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ilya%40bluefir.net > From pje at telecommunity.com Tue Aug 9 03:45:27 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 08 Aug 2005 21:45:27 -0400 Subject: [Python-Dev] __traceback__ and reference cycles In-Reply-To: <1f7befae05080818127ad30e63@mail.gmail.com> References: <20050808083106.GA15924@code1.codespeak.net> <20050808083106.GA15924@code1.codespeak.net> Message-ID: <5.1.1.6.0.20050808214339.02611248@mail.telecommunity.com> At 09:12 PM 8/8/2005 -0400, Tim Peters wrote: >I can't think of a Python feature with a higher aggregate >braincell_burned / benefit ratio than __del__ methods. If P3K retains >them-- or maybe even before --we should consider taking "the Java >dodge" on this one. That is, decree that henceforth a __del__ method >will get invoked by magic at most once on any given object O, no >matter how often O is resurrected. How does that help? Doesn't it mean that we'll have to have some way of keeping track of which items' __del__ methods were called? From bcannon at gmail.com Tue Aug 9 04:02:40 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 8 Aug 2005 19:02:40 -0700 Subject: [Python-Dev] __traceback__ and reference cycles In-Reply-To: <1f7befae05080818127ad30e63@mail.gmail.com> References: <20050808083106.GA15924@code1.codespeak.net> <1f7befae05080818127ad30e63@mail.gmail.com> Message-ID: On 8/8/05, Tim Peters wrote: > I can't think of a Python feature with a higher aggregate > braincell_burned / benefit ratio than __del__ methods. If P3K retains > them-- or maybe even before --we should consider taking "the Java > dodge" on this one. 
That is, decree that henceforth a __del__ method > will get invoked by magic at most once on any given object O, no > matter how often O is resurrected. > Wasn't there talk of getting rid of __del__ a little while ago and instead use weakrefs to functions to handle cleaning up? Is that still feasible? And if so, would this alleviate the problem? -Brett From tim.peters at gmail.com Tue Aug 9 04:37:56 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 8 Aug 2005 22:37:56 -0400 Subject: [Python-Dev] __traceback__ and reference cycles In-Reply-To: <5.1.1.6.0.20050808214339.02611248@mail.telecommunity.com> References: <20050808083106.GA15924@code1.codespeak.net> <1f7befae05080818127ad30e63@mail.gmail.com> <5.1.1.6.0.20050808214339.02611248@mail.telecommunity.com> Message-ID: <1f7befae05080819373910d1c2@mail.gmail.com> [Tim Peters] >> If P3K retains them [__del__]-- or maybe even before --we should >> consider taking "the Java dodge" on this one. That is, decree that >> henceforth a __del__ method will get invoked by magic at most >> once on any given object O, no matter how often O is resurrected. [Phillip J. Eby] > How does that help? You have to dig into Armin's example (or read his explanation): every time __del__ is called on one of his X objects, it creates a cycle by binding sys.exc_info()[2] to the local vrbl `e_tb`. `self` is reachable from that cycle, so self's refcount does not fall to 0 when __del__ returns. The object is resurrected. When cyclic gc next runs, it determines that the cycle is trash, and runs around decref'ing the objects in the cycle. That eventually makes the refcount on the X object fall to 0 again too, but then its __del__ method also runs again, and creates an isomorphic cycle, resurrecting `self` again. Etc. Armin didn't point this out explicitly, but it's important to realize that gc.garbage remains empty the entire time you let his program run.
The object with the __del__ method isn't _in_ a cycle here, it's hanging _off_ a cycle, which __del__ keeps recreating. Cyclic gc isn't inhibited by a __del__ on an object hanging off a trash cycle (but not in a trash cycle itself), but in this case it's ineffective anyway. If __del__ were invoked only the first time cyclic gc ran, the original cycle would go away during the next cyclic gc run, and a new cycle would not take its place. > Doesn't it mean that we'll have to have some way of keeping track of > which items' __del__ methods were called? Yes, by hook or by crook; and yup too, that may be unattractive. From tim.peters at gmail.com Tue Aug 9 05:02:59 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 8 Aug 2005 23:02:59 -0400 Subject: [Python-Dev] __traceback__ and reference cycles In-Reply-To: References: <20050808083106.GA15924@code1.codespeak.net> <1f7befae05080818127ad30e63@mail.gmail.com> Message-ID: <1f7befae050808200272961928@mail.gmail.com> [Brett Cannon] > Wasn't there talk of getting rid of __del__ a little while ago and > instead use weakrefs to functions to handle cleaning up? There was from me, yes, with an eye toward P3K. > Is that still feasible? It never was, really. The combination of finalizers, cycles and resurrection is a freakin' mess, "even in theory". The way things are right now, Python's weakref gc endcase behavior is even more mystically implementation-driven than its __del__ gc endcase behavior, and nobody has had time to try to dream up a cleaner approach. > And if so, would this alleviate the problem? Absolutely . The underlying reason for optimism is that weakrefs in Python are designed to, at worst, let *other* objects learn that a given object has died, via a callback function. The weakly referenced object itself is not passed to the callback, and the presumption is that the weakly referenced object is unreachable trash at the time the callback is invoked. 
IOW, resurrection was "obviously" impossible, making endcase life very much simpler. That paragraph is from Modules/gc_weakref.txt, and you can read there all about why optimism hasn't worked yet ;-) From ilya at bluefir.net Tue Aug 9 05:13:41 2005 From: ilya at bluefir.net (Ilya Sandler) Date: Mon, 8 Aug 2005 20:13:41 -0700 (PDT) Subject: [Python-Dev] pdb: should next command be extended? In-Reply-To: <20050808154503.GB28005@panix.com> References: <20050808154503.GB28005@panix.com> Message-ID: > At OSCON, Anthony Baxter made the point that pdb is currently one of the > more unPythonic modules. What is unpythonic about pdb? Is this part of Anthony's presentation online? (Google found a summary and slides from the presentation but they don't say anything about pdb's deficiencies) Ilya On Mon, 8 Aug 2005, Aahz wrote: > On Sun, Aug 07, 2005, Ilya Sandler wrote: > > > > Solution: > > > > Should pdb's next command accept an optional numeric argument? It would > > specify how many actual lines of code (not "line events") > > should be skipped in the current frame before stopping, > > At OSCON, Anthony Baxter made the point that pdb is currently one of the > more unPythonic modules. If you're feeling a lot of energy about this, > rewriting pdb might be more productive. > -- > Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ > > The way to build large Python applications is to componentize and > loosely-couple the hell out of everything. > From bcannon at gmail.com Tue Aug 9 06:32:49 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 8 Aug 2005 21:32:49 -0700 Subject: [Python-Dev] Exception Reorg PEP revised yet again Message-ID: version 1.7 scales the proposal back once more (http://www.python.org/peps/pep-0348.html). At this point the only changes to the hierarchy are the addition of BaseException and TerminatingException, and the change of inheritance for KeyboardInterrupt, SystemExit, and NotImplementedError.
At this point I don't think MAL or Raymond will have any major complaints. =) Assuming no one throws a fit over this version, discussing transition is the next step. I think the transition plan is fine, but if anyone has any specific input that would be great. I could probably stand to do a more specific timeline in terms of 2.x, 2.x+1, 3.0-1, etc., but that will have to wait for another day this week. And once that is settled I guess it is either time for pronouncement or it just sits there until Python 3.0 actually starts to come upon us. -Brett From stephen at xemacs.org Tue Aug 9 07:15:48 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 09 Aug 2005 14:15:48 +0900 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <1123549990.3695.119.camel@warna.corp.google.com> (Donovan Baarda's message of "Mon, 08 Aug 2005 18:13:10 -0700") References: <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de> <20050808224912.GA11584@ActiveState.com> <1123547536.3700.109.camel@warna.corp.google.com> <20050809005123.GC23158@ActiveState.com> <1123549990.3695.119.camel@warna.corp.google.com> Message-ID: <87k6iv7ksb.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Donovan" == Donovan Baarda writes: Donovan> It all comes down to how painless branch/merge is. Many Donovan> esoteric "features" of version control systems feel like Donovan> they are there to workaround the absence of proper Donovan> branch/merge histories. It's not that simple. I've followed both the Arch and the darcs lists---they handle a lot more branch/merge scenarios than Subversion does, but you still can't get away with zero discipline. 
On the other hand, for the purpose of the main repository for a well-factored project with disciplined workflow like Python, it's not obvious to me that the middle-complexity scenarios are that important. Furthermore, taking good advantage of the more complex branch/merge scenarios will require a change to Python workflow (for example, push-to-tracker will no longer be a convenient way to submit patches for most developers); that will be costly. More important, since none of the core Python people have spoken up strongly in favor of an advanced system, I would guess there's little experience to support planning a new workflow. (Cf. the Linux case, where Linus opted to roll his own.) I know many people in the Emacs communities who are successfully using CVS for the main repositories and various advanced systems (prcs, darcs, arch at least) for local branching and small group project communication. It seems fairly easy to automate that (much easier than extracting changeset information from CVS!) I think that as developers find they have need for such capabilities, the practice will grow in Python too, and then there may be a case to be built for moving the main repository to such a system. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From stephen at xemacs.org Tue Aug 9 07:28:08 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 09 Aug 2005 14:28:08 +0900 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <42F7D1A9.4000909@v.loewis.de> (Martin v.
Löwis's message of "Mon, 08 Aug 2005 23:42:01 +0200") References: <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com> <42F60E35.9080809@egenix.com> <20050806102342.GA11309@mems-exchange.org> <42F60E35.9080809@egenix.com> <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com> <5.1.1.6.0.20050808094640.025b8d98@mail.telecommunity.com> <42F7D1A9.4000909@v.loewis.de> Message-ID: <87fytj7k7r.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Martin" == Martin v Löwis writes: Martin> While this would work, it would still feel wrong: the Martin> binary data are *not* latin1 (most likely), so declaring Martin> them to be latin1 would be confusing. Perhaps a synonym Martin> '8bit' for latin1 could be introduced. Be careful. This alias has caused Emacs some amount of pain, as binary data escapes into contexts (such as Universal Newline processing) where it gets interpreted as character data. We've also had some problems in codec implementation, because latin1 and (eg) latin9 have some differences in semantics other than changing the coded character set for the GR register---controls are treated differently, for example, because they _are_ binary (alias latin1) octets, but not in the range of the latin9 code. I won't go so far as to say it won't work, but it will require careful design. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software.
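The hazard described above can be sketched in present-day Python 3, which is an assumption here since the thread predates it. Decoding with latin-1 maps every octet to the code point of the same value, so the round trip is lossless; the trouble starts once the result is handled as character data and universal-newline translation rewrites the \r octets. The payload and variable names are made up for illustration.

```python
import io

# Hypothetical binary payload containing octets that look like line endings.
payload = b"\x00\r\n\xff\rZ"

# latin-1 maps each byte 0..255 to the code point with the same value,
# so decoding and re-encoding is a lossless round trip.
as_text = payload.decode("latin-1")
roundtrip_ok = as_text.encode("latin-1") == payload

# Reading the same bytes through a text layer with universal newlines
# (newline=None) translates "\r\n" and a lone "\r" into "\n".
reader = io.TextIOWrapper(io.BytesIO(payload), encoding="latin-1", newline=None)
mangled = reader.read().encode("latin-1")

print(roundtrip_ok)        # the plain codec round trip is intact
print(mangled == payload)  # the text-layer read has altered the binary data
```

The codec alone is harmless; the damage comes from the data escaping into a context that applies text semantics, which seems to be the point of the warning.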
From python at rcn.com Tue Aug 9 07:43:32 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue, 9 Aug 2005 01:43:32 -0400 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: Message-ID: <000a01c59ca5$47627960$803dc797@oemcomputer> [Brett Cannon] > At this point the only > changes to the hierarchy are the addition of BaseException and > TerminatingException, and the change of inheritance for > KeyboardInterrupt, SystemExit, and NotImplementedError. TerminatingException -------------------- The rationale for adding TerminatingException needs to be developed or reconsidered. AFAICT, there hasn't been an exploration of existing code bases to determine that there is going to be even minimal use of "except TerminatingException". Are KeyboardInterrupt and SystemExit often caught together on the same line and handled in the same way? If so, isn't "except TerminatingException" less explicit, clear, and flexible than "except (KeyboardInterrupt, SystemExit)"? Do we need a second way to do it? Doesn't the new meaning of Exception already offer a better idiom: try: suite() except Exception: log_or_recover() except: handle_terminating_exceptions() else: Are there any benefits sufficient to warrant yet another new built-in? Does it also warrant violating FIBTN by introducing more structure? While I'm clear on why KeyboardInterrupt and SystemExit were moved from under Exception, it is not at all clear what problem is being solved by adding a new intermediate grouping. The PEP needs to address all of the above. Right now, it contains a definition rather than justification, research, and analysis. WindowsError ------------ This should be kept. Unlike module specific exceptions, this exception occurs in multiple places and diverse applications. It is appropriate to list as a builtin. "Too O/S specific" is not a reason for eliminating this. Looking at the codebase there does not appear to be a good substitute.
Eliminating this one would break code, decrease clarity, and cause modules to grow competing variants. After the change, nothing would be better and many things would be worse. NotImplementedError ------------------- Moving this is fine. Removing unnecessary nesting is a step forward. The PEP should list that as a justification. Bare excepts defaulting to Exception ------------------------------------ After further thought, I'm not as sure about this one and whether it is workable. The code fragment above highlights the issue. In a series of except clauses, each line only matches what was not caught by a previous clause. This is a useful and basic part of the syntax. It leaves a bare except to have the role of a final catchall (much like a default in C's switch-case). If one line uses "except Exception", then a subsequence bare except should probably catch KeyboardInterrupt and SystemExit. Otherwise, there is a risk of creating optical illusion errors (code that looks like it should work but is actually broken). I'm not certain on this one, but the PEP does need to fully explore the implications and think-out the consequent usability issues. > And once that is settled I guess it is either time for pronouncement > or it just sits there until Python 3.0 actually starts to come upon > us. What happened to "don't take this too seriously, I'm just trying to get the ball rolling"? This PEP or any Py3.0 PEP needs to sit a good while before pronouncement. Because 3.0 is not an active project, the PEP is unlikely to be a high priority review item by many of Python's best minds. It should not be stamped as accepted until they've had a chance to think it through. Because 3.0 is still somewhat ethereal, it is not reasonable to expect them to push aside their other work to look at this right now. The PEP needs to be kicked around on the newsgroup (naming and grouping discussions are easy and everyone will have an opinion). 
Also the folks with PyPy, BitTorrent, Zope, Twisted, IronPython, Jython, and such need to have a chance to have their say. Because of Py3.0's low visibility, these PEPs could easily slide through prematurely. Were the project imminent, it is likely that this PEP would have had significantly more discussion. Try not to get frustrated at these reviews. Because there was no research into existing code, working to solve known problems, evaluation of alternatives, or usability analysis, it is no surprise Sturgeon's Law would apply. Since Python has been around so long, it is also no surprise that what we have now is pretty good and that improvements won't be trivially easy to come by. Raymond From steven.bethard at gmail.com Tue Aug 9 08:28:08 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Tue, 9 Aug 2005 00:28:08 -0600 Subject: [Python-Dev] Major revision of PEP 348 committed In-Reply-To: <000401c59b36$01226de0$e410c797@oemcomputer> References: <000401c59b36$01226de0$e410c797@oemcomputer> Message-ID: Raymond Hettinger wrote: > If the PEP can't resist the urge to create new intermediate groupings, > then start by grepping through tons of Python code to find-out which > exceptions are typically caught on the same line. That would be a > worthwhile empirical study and may lead to useful insights. I was curious, so I did a little grepping (ok, os.walking and re.findalling) ;-) through the Python source. The only exceptions that were caught together more than 5 times were: AttributeError and TypeError (23 instances) in code.py doctest.py linecache.py mailbox.py idlelib/rpc.py lib-old/newdir.py lib-tk/Tkinter.py test/test_descr.py test/test_file.py test/test_funcattrs.py test/test_os.py Though these occur in a few different contexts, one relatively common one was when the code tried to set a possibly read-only attribute. 
ImportError and AttributeError (9 instances), in getpass.py locale.py pydoc.py tarfile.py xmlrpclib.py lib-tk/tkFileDialog.py test/test_largefile.py test/test_tarfile.py This seemed to be used when an incompatible module might be present. (Attributes were tested to make sure the module was the right one.) Also used when code tried to use "private" module attributes (e.g. _getdefaultlocale()). OverflowError and ValueError (9 instances), in csv.py ftplib.py mhlib.py mimify.py warnings.py test/test_resource.py These were generally around a call to int(x). I assume they're generally unnecessary now that int() silently converts to longs. IOError and OSError (6 instances), in pty.py tempfile.py whichdb.py distutils/dir_util.py idlelib/configHandler.py test/test_complex.py These were all around file/directory handling that I didn't study in too much detail. With the current hierarchy, there's no reason these couldn't just be catching EnvironmentError anyway. As you can see, even for the most common pairs of exceptions, the total number of times these pairs were caught was pretty small. Even ignoring the low counts, we see that the last two pairs of exceptions aren't really necessary, thanks to int/long unification and the existence of EnvironmentError, and the former two pairs argue *against* added nesting as it is unclear whether to group AttributeError with ImportError or TypeError. So it doesn't really look like the stdlib's going to provide much of a case for adding nesting to the exception hierarchy. Anyway, I know PEP 348's been scaled back at this point anyway, but I figured I might as well post my findings in case anyone was curious. STeVe -- You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy From martin at v.loewis.de Tue Aug 9 08:44:43 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 09 Aug 2005 08:44:43 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <20050808224912.GA11584@ActiveState.com> References: <1f7befae05072819142c36e610@mail.gmail.com> <1122605323.9670.11.camel@geddy.wooz.org> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de> <20050808224912.GA11584@ActiveState.com> Message-ID: <42F850DB.8000406@v.loewis.de> Trent Mick wrote: > One feature I like in Perforce (which Subversion doesn't have) is the > ability to have pending changesets. That sounds useful. > Currently with svn you have to manually > specify those 9 to be sure to not include the remaining one. With p4 you > just say to check-in the whole tree and then remove that one from the > list give you in your editor with entering the check-in message. Not > that big of a deal. Depends on the client also. With Tortoise SVN, you do get a checkbox list where you can exclude files from the checkin. Regards, Martin From martin at v.loewis.de Tue Aug 9 08:52:01 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 09 Aug 2005 08:52:01 +0200 Subject: [Python-Dev] an alternative suggestion, Re: pdb: should next command be extended? In-Reply-To: References: <42F679DC.6030705@v.loewis.de> Message-ID: <42F85291.9070605@v.loewis.de> Ilya Sandler wrote: > So, would implementing gdb's "until" command instead of "next N" be a > better idea? In its simplest form (without arguments) "until" advances to > the next (textually) source line... This would solve the original problem of > getting over list comprehensions... I like that idea.
> There is a bit of a problem with abbreviation of "until": gdb abbreviates > it as "u", while in pdb "u" means "up"...It might be a good idea to have the > same abbreviations Indeed. I don't know much about pdb internals, but I think "u" should become unbound, and "up" and "unt" should become the shortest abbreviations. Regards, Martin From ncoghlan at gmail.com Tue Aug 9 11:28:01 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 09 Aug 2005 19:28:01 +1000 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <2B24D218-9919-4CF7-AEF2-7335B8360878@fuhm.net> References: <20050806102342.GA11309@mems-exchange.org> <20050808034756.GA16756@mems-exchange.org> <20050808154157.GA28005@panix.com> <2B24D218-9919-4CF7-AEF2-7335B8360878@fuhm.net> Message-ID: <42F87721.5020603@gmail.com> James Y Knight wrote: > Hum, actually, it somewhat makes sense for the "open" builtin to > become what is now "codecs.open", for convenience's sake, although it > does blur the distinction between a byte stream and a character > stream somewhat. If that happens, I suppose it does actually make > sense to give "makefile" the same signature. We could always give the text mode/binary mode distinction in "open" a real meaning - text mode deals with character sequences, binary mode deals with byte sequences. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From mwh at python.net Tue Aug 9 11:50:08 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 09 Aug 2005 10:50:08 +0100 Subject: [Python-Dev] Major revision of PEP 348 committed In-Reply-To: (Steven Bethard's message of "Tue, 9 Aug 2005 00:28:08 -0600") References: <000401c59b36$01226de0$e410c797@oemcomputer> Message-ID: <2my87bwib3.fsf@starship.python.net> Steven Bethard writes: > Raymond Hettinger wrote: >> If the PEP can't resist the urge to create new intermediate groupings, >> then start by grepping through tons of Python code to find-out which >> exceptions are typically caught on the same line. That would be a >> worthwhile empirical study and may lead to useful insights. > > I was curious, so I did a little grepping (ok, os.walking and > re.findalling) ;-) through the Python source. The only exceptions > that were caught together more than 5 times were: > > AttributeError and TypeError (23 instances) in > code.py > doctest.py > linecache.py > mailbox.py > idlelib/rpc.py > lib-old/newdir.py > lib-tk/Tkinter.py > test/test_descr.py > test/test_file.py > test/test_funcattrs.py > test/test_os.py > Though these occur in a few different contexts, one relatively common > one was when the code tried to set a possibly read-only attribute. This TypeError/AttributeError one is long known, and a bit of a mess, really. Finding an attribute usually fails because the object is not of the expected type, after all. > ImportError and AttributeError (9 instances), in > getpass.py > locale.py > pydoc.py > tarfile.py > xmlrpclib.py > lib-tk/tkFileDialog.py > test/test_largefile.py > test/test_tarfile.py > This seemed to be used when an incompatible module might be present. > (Attributes were tested to make sure the module was the right one.) > Also used when code tried to use "private" module attributes (e.g. 
> _getdefaultlocale()). This seems like ever-so-faintly lazy programming to me, but maybe that's overly purist. > OverflowError and ValueError (9 instances), in > csv.py > ftplib.py > mhlib.py > mimify.py > warnings.py > test/test_resource.py > These were generally around a call to int(x). I assume they're > generally unnecessary now that int() silently converts to longs. Yes, I think so. > IOError and OSError (6 instances), in > pty.py > tempfile.py > whichdb.py > distutils/dir_util.py > idlelib/configHandler.py > test/test_complex.py > These were all around file/directory handling that I didn't study in > too much detail. With the current hierarchy, there's no reason these > couldn't just be catching EnvironmentError anyway. Heh. I'd have to admit that I rarely know which of IOError or OSError I should be expecting in a given situation, nor that EnvironmentError is a common subclass that I could catch instead... [...] > Anyway, I know PEP 348's been scaled back at this point anyway, but I > figured I might as well post my findings in case anyone was curious. Was interesting, thanks! Cheers, mwh -- On a scale of One to AWESOME, twisted.web is PRETTY ABSTRACT!!!! -- from Twisted.Quotes From ncoghlan at gmail.com Tue Aug 9 12:30:03 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 09 Aug 2005 20:30:03 +1000 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <000a01c59ca5$47627960$803dc797@oemcomputer> References: <000a01c59ca5$47627960$803dc797@oemcomputer> Message-ID: <42F885AB.8040104@gmail.com> Raymond Hettinger wrote: > TerminatingException > -------------------- > > The rationale for adding TerminatingException needs to be developed or > reconsidered. AFAICT, there hasn't been an exploration of existing code > bases to determine that there is going to be even minimal use of "except > TerminatingException". > > Are KeyboardInterrupt and SystemExit often caught together on the same > line and handled in the same way? 
Yes, to avoid the current overbroad inheritance of "except Exception:" by intercepting and reraising these two terminating exceptions. > If so, isn't "except TerminatingException" less explicit, clear, and > flexible than "except (KeyboardInterrupt, SystemExit)"? No, TerminatingException makes it explicit to the reader what is going on - special handling is being applied to any exceptions that indicate the interpreter is expected to exit as a result of the exception. Using "except (KeyboardInterrupt, SystemExit):" is less explicit, as it relies on the reader knowing that these two exceptions share the common characteristic that they are generally meant to terminate the Python interpreter. > Are there any benefits sufficient to warrant yet another new built-in? > Does it also warrant violating FIBTN by introducing more structure? > While I'm clear on why KeyboardInterrupt and SystemExit were moved from > under Exception, it is not at all clear what problem is being solved by > adding a new intermediate grouping. The main benefits of TerminatingException lie in easing the transition to Py3k. After transition, "except Exception:" will already do the right thing. However, TerminatingException will still serve a useful documentational purpose, as it sums up in two words the key characteristic that caused KeyboardInterrupt and SystemExit to be moved out from underneath Exception. > Bare excepts defaulting to Exception > ------------------------------------ > > After further thought, I'm not as sure about this one and whether it is > workable. The code fragment above highlights the issue. In a series of > except clauses, each line only matches what was not caught by a previous > clause. This is a useful and basic part of the syntax. It leaves a > bare except to have the role of a final catchall (much like a default in > C's switch-case). If one line uses "except Exception", then a > subsequence bare except should probably catch KeyboardInterrupt and > SystemExit. 
Otherwise, there is a risk of creating optical illusion > errors (code that looks like it should work but is actually broken). > I'm not certain on this one, but the PEP does need to fully explore the > implications and think-out the consequent usability issues. I'm also concerned about this one. IMO, bare excepts in Python 3k should either not be allowed at all (use "except BaseException:" instead), or they should be synonyms for "except BaseException:". Having a bare except that doesn't actually catch everything just seems wrong - and we already have style guides that say "except Exception:" is to be generally preferred over a bare except. Consenting adults and all that. . . Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From pedronis at strakt.com Tue Aug 9 13:37:17 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Tue, 09 Aug 2005 13:37:17 +0200 Subject: [Python-Dev] __traceback__ and reference cycles In-Reply-To: <1f7befae05080818127ad30e63@mail.gmail.com> References: <20050808083106.GA15924@code1.codespeak.net> <1f7befae05080818127ad30e63@mail.gmail.com> Message-ID: <42F8956D.9060702@strakt.com> Tim Peters wrote: > > I can't think of a Python feature with a higher aggregate > braincell_burned / benefit ratio than __del__ methods. If P3K retains > them-- or maybe even before --we should consider taking "the Java > dodge" on this one. That is, decree that henceforth a __del__ method > will get invoked by magic at most once on any given object O, no > matter how often O is resurrected. > Jython __del__ support is already layered on Java finalize, so that's what one gets.
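For readers following the "at most once" proposal, here is a minimal sketch of what that behaviour looks like, assuming CPython 3.4 or later, where object finalization works this way. The language reference describes a second __del__ call on a resurrected object as implementation-dependent, so other interpreters may differ. The class name Phoenix and its graveyard list are hypothetical.

```python
class Phoenix:
    """Hypothetical class whose finalizer resurrects the instance."""

    calls = 0        # how many times __del__ has run
    graveyard = []   # a reachable place to stash `self`

    def __del__(self):
        Phoenix.calls += 1
        # Storing self somewhere reachable resurrects the object.
        Phoenix.graveyard.append(self)


p = Phoenix()
del p                        # refcount reaches 0: __del__ runs, object resurrected
calls_after_first_death = Phoenix.calls

Phoenix.graveyard.clear()    # refcount reaches 0 a second time
calls_after_second_death = Phoenix.calls

print(calls_after_first_death, calls_after_second_death)
```

Under the assumption above, both counters are expected to read 1: the finalizer is not re-run when the resurrected object dies again, so the interpreter has to remember per object that __del__ was already called, which is the bookkeeping Phillip asked about.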
From barry at python.org Tue Aug 9 14:38:14 2005 From: barry at python.org (Barry Warsaw) Date: Tue, 09 Aug 2005 08:38:14 -0400 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <1f7befae05080816294bbc1100@mail.gmail.com> References: <1f7befae05072819142c36e610@mail.gmail.com> <1f7befae0507281959abc2a7c@mail.gmail.com> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de> <20050808224912.GA11584@ActiveState.com> <1f7befae05080816294bbc1100@mail.gmail.com> Message-ID: <1123591094.11608.431.camel@presto.wooz.org> On Mon, 2005-08-08 at 19:29, Tim Peters wrote: > > Currently with svn you have to manually specify those 9 to be sure to not > > include the remaining one. With p4 you just say to check-in the whole tree > > and then remove that one from the list give you in your editor with entering > > the check-in message. Not that big of a deal. > > As a purely theoretical exercise , the last time I faced that > under SVN, I opened the single file I didn't want to check-in in my > editor, did "svn revert" on it from the cmdline, checked in the whole > tree, and then hit the editor's "save" button. This doesn't scale > well to skipping 25 of 50, but it's effective enough for 1 or 2. Or one could use a decent client, like say psvn under XEmacs which presents you a list of all modified files and lets you select which ones you want to commit. The one thing I dislike about svn (in my day-to-day use of it) is that it can take a VERY long time to do updates at the roots of very large trees. I once tried to check out the root of our dev tree, which contains all branches and tags. Of course the initial checkout took forever. But an update at the root made this approach unusable. 
svn would sit there, seemingly idle for 30-45 minutes and then take another 30-45 minutes updating the changes, which typically consisted of maybe 50 files out of thousands. And this on a gig LAN with fast h/w all around (and for Tim's sake I won't even complain about how some operating systems appear to perform much worse than others :). The smaller you can keep your working copies, the better. -Barry From gvanrossum at gmail.com Tue Aug 9 17:03:12 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue, 9 Aug 2005 08:03:12 -0700 Subject: [Python-Dev] Generalised String Coercion In-Reply-To: <42F87721.5020603@gmail.com> References: <20050806102342.GA11309@mems-exchange.org> <20050808034756.GA16756@mems-exchange.org> <20050808154157.GA28005@panix.com> <2B24D218-9919-4CF7-AEF2-7335B8360878@fuhm.net> <42F87721.5020603@gmail.com> Message-ID: On 8/9/05, Nick Coghlan wrote: > We could always give the text mode/binary mode distinction in "open" a real > meaning - text mode deals with character sequences, binary mode deals with > byte sequences. I thought that's what I proposed before. I'm still for it.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Tue Aug 9 19:17:42 2005 From: bcannon at gmail.com (Brett Cannon) Date: Tue, 9 Aug 2005 10:17:42 -0700 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <000a01c59ca5$47627960$803dc797@oemcomputer> References: <000a01c59ca5$47627960$803dc797@oemcomputer> Message-ID: On 8/8/05, Raymond Hettinger wrote: > [Brett Cannon] > > At this point the only > > changes to the hierarchy are the addition of BaseException and > > TerminatingException, and the change of inheritance for > > KeyboardInterrupt, SystemExit, and NotImplementedError. > > TerminatingException > -------------------- > > The rationale for adding TerminatingException needs to be developed or > reconsidered. AFAICT, there hasn't been an exploration of existing code > bases to determine that there is going to be even minimal use of "except > TerminatingException". > > Are KeyboardInterrupt and SystemExit often caught together on the same > line and handled in the same way? > The problem with existing code checking for this situation is that the situation itself is not the same as it will be if bare 'except's change:: try: ... except: ... except TerminatingException: ... has never really been possible before, but will be if the PEP goes forward. > If so, isn't "except TerminatingException" less explicit, clear, and > flexible than "except (KeyboardInterrupt, SystemExit)"? Do we need a > second way to do it? > But what if we add other exceptions that don't inherit from Exception that we want to typically propagate up? Having a catch-all for exceptions that a bare 'except' will skip that is more explicit than ``except BaseException`` seems reasonable to me.
As Nick said in another email, it provides a more obvious self-documentation point to catch TerminatingException than ``(KeyboardInterrupt, SystemExit)``, plus you get some future-proofing on top of it in case we add more exceptions that are not caught by a bare 'except'. > Doesn't the new meaning of Exception already offer a better idiom: > > try: > suite() > except Exception: > log_or_recover() > except: > handle_terminating_exceptions() > else: > > Are there any benefits sufficient to warrant yet another new built-in? > Does it also warrant violating FIBTN by introducing more structure? > While I'm clear on why KeyboardInterrupt and SystemExit were moved from > under Exception, it is not at all clear what problem is being solved by > adding a new intermediate grouping. > > The PEP needs to address all of the above. Right now, it contains a > definition rather than justification, research, and analysis. > > > > WindowsError > ------------ > > This should be kept. Unlike module specific exceptions, this exception > occurs in multiple places and diverse applications. It is appropriate > to list as a builtin. > > "Too O/S specific" is not a reason for eliminating this. Looking at the > codebase there does not appear to be a good substitute. Eliminating > this one would break code, decrease clarity, and cause modules to grow > competing variants. > I unfortunately forgot to add that the exception would be moved under os, so it would be more of a renaming than a removal. The reason I pulled it was that Guido said UnixError and MacError didn't belong, so why should WindowsError stay? Obviously there are backwards-compatibility issues with removing it, but why should we have this platform-specific thing in the built-in namespace? Nothing else is platform-specific in the language until you go into the stdlib. The language itself is supposed to be platform-agnostic, and yet here is this exception that is not meant to be used by anyone but by a specific OS.
Seems like a contradiction to me. > After the change, nothing would be better and many things would be > worse. > > > > NotImplementedError > ------------------- > Moving this is fine. Removing unnecessary nesting is a step forward. > The PEP should list that as a justification. > Yay, something uncontroversial! =) > > > Bare excepts defaulting to Exception > ------------------------------------ > > After further thought, I'm not as sure about this one and whether it is > workable. The code fragment above highlights the issue. In a series of > except clauses, each line only matches what was not caught by a previous > clause. This is a useful and basic part of the syntax. It leaves a > bare except to have the role of a final catchall (much like a default in > C's switch-case). If one line uses "except Exception", then a > subsequent bare except should probably catch KeyboardInterrupt and > SystemExit. Otherwise, there is a risk of creating optical illusion > errors (code that looks like it should work but is actually broken). > I'm not certain on this one, but the PEP does need to fully explore the > implications and think-out the consequent usability issues. > This is Guido's thing. You will have to convince him of the change. I can flesh out the PEP to argue for whichever result he wants, but that part of the proposal is in there because Guido wanted it. I am just a PEP lackey in this case. =) > > > And once that is settled I guess it is either time for pronouncement > > or it just sits there until Python 3.0 actually starts to come upon > > us. > > What happened to "don't take this too seriously, I'm just trying to get > the ball rolling"? > Nothing, it's called writing the email when I was tired and while I was trying to fall asleep realizing what I had done. =) It still needs to go out to c.l.py and will probably sit for a long while unpronounced.
That's the reason I was saying that the transition plan needs to be fleshed out with 2.x, 2.x+1 version numbers instead of concrete ones like 2.5. -Brett From jack at performancedrivers.com Tue Aug 9 19:53:38 2005 From: jack at performancedrivers.com (Jack Diederich) Date: Tue, 9 Aug 2005 13:53:38 -0400 Subject: [Python-Dev] Major revision of PEP 348 committed In-Reply-To: References: <000401c59b36$01226de0$e410c797@oemcomputer> Message-ID: <20050809175338.GB1365@performancedrivers.com> On Tue, Aug 09, 2005 at 12:28:08AM -0600, Steven Bethard wrote: > Raymond Hettinger wrote: > > If the PEP can't resist the urge to create new intermediate groupings, > > then start by grepping through tons of Python code to find-out which > > exceptions are typically caught on the same line. That would be a > > worthwhile empirical study and may lead to useful insights. > > I was curious, so I did a little grepping (ok, os.walking and > re.findalling) ;-) through the Python source. The only exceptions > that were caught together more than 5 times were: > > AttributeError and TypeError (23 instances) > ImportError and AttributeError (9 instances) > OverflowError and ValueError (9 instances) > IOError and OSError (6 instances) I grepped my own source (ok, find, xargs, and grep'd ;) and here is what I found. 40 KLOCs, it is a web app so I mainly catch multiple exceptions when interpreting URLs and doing type conversions. Unexpected quacks from inside the app are allowed to rise to the top because at that point all the input should be in a good state. All of these arise because more than one operation is happening in the try/except each of which could raise an exception (even if it is a one-liner). ValueError, TypeError (6 instances) Around calls to int() like foo = int(cgi_dict.get('foo', None)) This is pretty domain specific, cgi variables are in a dict-alike object that returns None for missing keys. If it was a proper dict instead this pairing would be (ValueError, KeyError).
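A minimal sketch of why those two exceptions pair up around int(), assuming a plain dict stands in for the cgi dict-alike; parse_int_field is a hypothetical helper for illustration, not code from the app described above:

```python
def parse_int_field(cgi_dict, key):
    """Return the field as an int, or None if it is missing or malformed."""
    try:
        # Two operations in one line: the lookup, then the conversion.
        return int(cgi_dict.get(key, None))
    except (ValueError, TypeError):
        # ValueError: the value is present but not numeric, e.g. int("abc").
        # TypeError: the key is missing, so this is int(None).
        return None


print(parse_int_field({"foo": "42"}, "foo"))
print(parse_int_field({"foo": "abc"}, "foo"))
print(parse_int_field({}, "foo"))
```

With a subscript lookup (cgi_dict[key]) on a real dict instead of .get(), the missing-key case would raise KeyError rather than TypeError, which is the (ValueError, KeyError) pairing mentioned above.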
The rest are a variation on the above where the result is used in the same couple lines to do some kind of a lookup in a dict, list, or namespace. client_id = int(cgi_dict.get('foo', None)) client_name = names[client_id] ValueError, TypeError, AttributeError (2 instances) ValueError, TypeError, KeyError (3 instances) ValueError, TypeError, IndexError (3 instances) And finally this one because bsddb can say "Failed" in more than one way. IOError, bsddb.error (2 instances) btree = bsddb.btopen(self.filename, open_type) -Jack From nas at arctrix.com Tue Aug 9 21:19:13 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 9 Aug 2005 13:19:13 -0600 Subject: [Python-Dev] Sourceforge CVS down? Message-ID: <20050809191913.GB22038@mems-exchange.org> I've been getting: ssh: connect to host cvs.sourceforge.net port 22: Connection refused for the past few hours. Their "Site News" doesn't say anything about downtime. Neil From martin at v.loewis.de Tue Aug 9 22:05:58 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 09 Aug 2005 22:05:58 +0200 Subject: [Python-Dev] Sourceforge CVS down? In-Reply-To: <20050809191913.GB22038@mems-exchange.org> References: <20050809191913.GB22038@mems-exchange.org> Message-ID: <42F90CA6.5090704@v.loewis.de> Neil Schemenauer wrote: > I've been getting: > > ssh: connect to host cvs.sourceforge.net port 22: Connection refused > > for the past few hours. Their "Site News" doesn't say anything > about downtime. I'm seeing the same. Martin From tim.peters at gmail.com Tue Aug 9 22:06:45 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 9 Aug 2005 16:06:45 -0400 Subject: [Python-Dev] Sourceforge CVS down?
In-Reply-To: <20050809191913.GB22038@mems-exchange.org> References: <20050809191913.GB22038@mems-exchange.org> Message-ID: <1f7befae050809130638398dbe@mail.gmail.com> [Neil Schemenauer[ > I've been getting: > > ssh: connect to host cvs.sourceforge.net port 22: Connection refused > > for the past few hours. Their "Site News" doesn't say anything > about downtime. A cvs update doesn't work for me either now. I did finish one sometime before noon (EDT) today, though. From eric.nieuwland at xs4all.nl Tue Aug 9 22:32:50 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Tue, 9 Aug 2005 22:32:50 +0200 Subject: [Python-Dev] PEP 348 and ControlFlow Message-ID: Dear all, Sorry to bring this up again, but I think there is an inconsistency in PEP 348 in its current formulation. From PEP: "In Python 2.4, a bare except clause will catch any and all exceptions. Typically, though, this is not what is truly desired. More often than not one wants to catch all error exceptions that do not signify a "bad" interpreter state. In the new exception hierarchy this is condition is embodied by Exception. Thus bare except clauses will catch only exceptions inheriting from Exception." So, bare except will catch anything that is an Exception. This includes GeneratorExit and StopIteration, which contradicts: "It has been suggested that ControlFlowException should inherit from Exception. This idea has been rejected based on the thinking that control flow exceptions typically should not be caught by bare except clauses, whereas Exception subclasses should be." To me this means GeneratorExit and StopIteration are to be taken out of the Exception subtree. It seems to me rather awkward to put them at the same level as Exception and TerminatingException. So there comes the old (yeah, I know REJECTED) idea of a ControlFlowException class, right next to Exception and TerminatingException: BaseException +TerminatingException + ... + Exception + ... 
+ ControlFlowException + GeneratorExit + StopIteration Is my logic flawed (again ;-)? --eric Eric Nieuwland From gvwilson at cs.utoronto.ca Tue Aug 9 15:05:14 2005 From: gvwilson at cs.utoronto.ca (Greg Wilson) Date: Tue, 9 Aug 2005 09:05:14 -0400 (EDT) Subject: [Python-Dev] PSF grant / contacts Message-ID: Hi, I'm working with support from the Python Software Foundation to develop an open source course on basic software development skills for people with backgrounds in science and engineering. I have a beta version of the course notes ready for review, and would like to pull in Python-friendly people in sci&eng to look it over and give me feedback. If you know people who fit this bill (particularly people who might be interested in following along with a trial run of the course this fall), I'd be grateful for pointers. Thanks, Greg Wilson From raymond.hettinger at verizon.net Wed Aug 10 01:15:17 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 09 Aug 2005 19:15:17 -0400 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: Message-ID: <001701c59d38$3517aa80$302ac797@oemcomputer> [Brett] > The problem with existing code checking for this situation is that the > situation itself is not the same as it will be if bare 'except's > change:: > > try: > ... > except: > ... > except TerminatingException: > ... > > has never really been possible before, but will be if the PEP goes > forward. That's not an improvement. The above code fragment should trigger a gag reflex indicating that something is wrong with the proposed default for a bare except. > Having a catch-all for > exceptions that a bare 'except' will skip that is more explicit than > ``except BaseException`` seems reasonable to me. The data gathered by Jack and Steven's research indicate that the number of cases where TerminatingException would be useful is ZERO. Try not to introduce a new builtin that no one will ever use. 
Try not to add a new word whose only function is to replace a two-word tuple (TOOWTDI). Try not to unnecessarily nest the tree (FIBTN). Try not to propose solutions to problems that don't exist (PBP). Raymond From stephan.richter at tufts.edu Wed Aug 10 00:24:44 2005 From: stephan.richter at tufts.edu (Stephan Richter) Date: Tue, 9 Aug 2005 18:24:44 -0400 Subject: [Python-Dev] PSF grant / contacts In-Reply-To: References: Message-ID: <200508091824.45192.stephan.richter@tufts.edu> On Tuesday 09 August 2005 09:05, Greg Wilson wrote: > I'm working with support from the Python Software Foundation to develop an > open source course on basic software development skills for people with > backgrounds in science and engineering. I have a beta version of the > course notes ready for review, and would like to pull in Python-friendly > people in sci&eng to look it over and give me feedback. If you know > people who fit this bill (particularly people who might be interested in > following along with a trial run of the course this fall), I'd be grateful > for pointers. Yeah, I would be interested. I taught my fellow grad students Python last semester, but the docs out there were not that good for teaching scientific data analysis. I am planning to repeat the course with Physics undergrad students this Fall. If you could send me the material, I would appreciate it. Regards, Stephan -- Stephan Richter CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student) Web2k - Web Software Design, Development and Training From theller at python.net Wed Aug 10 09:22:32 2005 From: theller at python.net (Thomas Heller) Date: Wed, 10 Aug 2005 09:22:32 +0200 Subject: [Python-Dev] Sourceforge CVS down?
References: <20050809191913.GB22038@mems-exchange.org> <1f7befae050809130638398dbe@mail.gmail.com> Message-ID: <7jeu6ytj.fsf@python.net> Tim Peters writes: > [Neil Schemenauer[ >> I've been getting: >> >> ssh: connect to host cvs.sourceforge.net port 22: Connection refused >> >> for the past few hours. Their "Site News" doesn't say anything >> about downtime. > > A cvs update doesn't work for me either now. I did finish one > sometime before noon (EDT) today, though. They've been upgrading the CVS server hardware. See the 'site status' page http://sourceforge.net/docman/display_doc.php?group_id=1&docid=2352 Thomas From fredrik at pythonware.com Wed Aug 10 12:53:28 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 10 Aug 2005 12:53:28 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion References: <42E93940.6080708@v.loewis.de><1122605323.9670.11.camel@geddy.wooz.org><1f7befae0507281959abc2a7c@mail.gmail.com><1122607673.9665.38.camel@geddy.wooz.org><87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp><1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de><66d0a6e105080312181e25fa08@mail.gmail.com><42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> Message-ID: Nicholas Bastin wrote: > It's a mature product. I would hope that that would count for > something. I've had enough corrupted subversion repositories that I'm > not crazy about the thought of using it in a production system. I > know I'm not the only person with this experience. compared to Perforce, SVN is extremely fragile. I've used both for years, and I've never had Perforce repository break down on me. our SVN repositories are relatively stable these days, but the clients are still buggy as hell (mostly along the "I don't feel like doing this today, despite the fact that it worked yesterday, and I don't feel like telling you what's wrong either" lines. having to nuke workspaces from time to time gets boring, quickly.) 
in contrast, Perforce just runs and runs and runs. the clients always do what you tell them. and server maintenance is trivial; just make sure that the server starts when the host computer boots, and if you have enough disk, just leave it running. if you're tight on disk space, trim away some log files now and then. that's it. but despite this, if all you need is a better CVS, I'd say SVN is good enough for today's python-dev. I'd still think that a more distributed, mail-driven system (built on top of Mercurial, Bazaar-NG, or some such (*)) would speed up both development and patch processing, and also make it a lot easier for "casual contributors" and "drive-by developers" to help develop Python, but that's another story. *) being able to ship a fully working Python-powered SCM with the Python source code would be an extra coolness bonus, of course. From gvanrossum at gmail.com Wed Aug 10 16:32:27 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 10 Aug 2005 07:32:27 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: References: <42E93940.6080708@v.loewis.de> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> Message-ID: On 8/10/05, Fredrik Lundh wrote: > in contrast, Perforce just runs and runs and runs. the clients always > do what you tell them. and server maintenance is trivial; just make sure > that the server starts when the host computer boots, and if you have > enough disk, just leave it running. if you're tight on disk space, trim > away some log files now and then. that's it. We've used P4 at Elemental for two years now; I mostly agree with this assessment, although occasionally the server becomes unbearably slow and a sysadmin does some painful magic to rescue it. 
Maybe that's just because the box is underpowered. More troublesome is that I've seen a few client repositories getting out of sync; one developer spent a lot of time tracking down mysterious compilation errors that went away after forced resync'ing. We never figured out the cause, but (since he swears he didn't touch the affected files) most likely hitting ^C during a previous sync could've broken some things. Another problem with P4 is that local operation is lousy -- if you can't reach the server, you can't do *anything* -- while svn always lets you edit and diff. Also, P4 has *no* command to tell you which files you've created without adding them to the repository yet -- so the most frequent build breakage is caused by missing new files. Finally, while I hear that P4's branching support is superior to SVN's, I find it has a steep learning curve -- almost every developer needs some serious hand-holding before they understand P4 branches correctly. I'm intrigued by Linus Torvalds's preference for extremely distributed source control, but I have no experience and it seems a bit, um, experimental. Someone should contact Steve Alexander, who I believe is excited about Bazaar-NG.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at discworld.dyndns.org Wed Aug 10 17:03:14 2005 From: python at discworld.dyndns.org (Charles Cazabon) Date: Wed, 10 Aug 2005 09:03:14 -0600 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: References: <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> Message-ID: <20050810150313.GA7757@discworld.dyndns.org> Guido van Rossum wrote: > > I'm intrigued by Linus Torvald's preference for extremely distributed > source control, but I have no experience and it seems a bit, um, > experimental. "git", which is Linus' home-grown replacement for BitKeeper, quickly attracted a development community and has grown into a reasonably full-featured distributed RCS. It is apparently already stable enough for serious use. If I was trying to pick an RCS for a large, distributed project, I would at least investigate it as a possibility. 
Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://pyropus.ca/software/ ----------------------------------------------------------------------- From trentm at ActiveState.com Wed Aug 10 20:47:40 2005 From: trentm at ActiveState.com (Trent Mick) Date: Wed, 10 Aug 2005 11:47:40 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: References: <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> Message-ID: <20050810184740.GK15991@ActiveState.com> [Guido van Rossum wrote] > Also, P4 has *no* command to tell you which > files you've created without adding them to the repository yet -- so > the most frequent build breakage is caused by missing new files. This one is a frequent complaint from CVS-heads here at ActiveState. I have a p4 wrapper called "px" that extends some p4 commands (and adds a couple). One of the commands that it extends is "diff" to add a "-sn" (new) option similar to the "-se" (edit), "-sd" (delete). $ px help diff ...the usual 'p4 help diff'... new px options: [-sn -c changelist#] Px adds another -s option: -sn Local files not in the p4 client. Px also adds the --skip option (which only makes sense together with -sn) to specify that regularly skipped file (CVS control files, *~) should be skipped. The '-c' option can be used to limit diff'ing to files in the given changelist. '-c' cannot be used with any of the '-s' options. 'px' should grow a "px status" a la "svn|cvs status" to give a quick summary of local differences. Other additions: $ px help px 'px' entensions to 'p4': px --help Add px-specific help output to the usual 'p4 -h' and 'p4 -?'. See 'px help usage'. 
px -V, --version Print px-specific version information in addition to the usage 'p4 -V' output. See 'px help usage'. px -g ... Format input/output as *un*marshalled Python objects. Compare to the usual 'p4 -G ...'. See 'px help usage'. px annotate ... Identify last change to each line in given file, like 'cvs annotate' or 'p4pr.pl'. See 'px help annotate'. px backout ... Provide all the general steps for rolling back a perforce change as described in Perforce technote 14. See 'px help backout'. px changes -d ... Print the full 'p4 describe -du' output for each listed change. See 'px help changes'. px diff -sn --skip ... List local files not in the p4 depot. Useful for importing new files into a depot via 'px diff -sn --skip ./... | px -x - add'. See 'px help diff'. px diff -c ... Limit diffing to files opened in the given pending change. See 'px help diff'. px genpatch [] Generate a patch (usable by the GNU 'patch' program) from a pending or submitted chagelist. See 'px help genpatch'. Available here: http://starship.python.net/~tmick/#px Pure python. Works on Python >=2.2. Windows, Linux, Mac OS X, Unix. Trent -- Trent Mick TrentM at ActiveState.com From joseh.martins at gmail.com Wed Aug 10 20:51:36 2005 From: joseh.martins at gmail.com (Joseh Martins) Date: Wed, 10 Aug 2005 15:51:36 -0300 Subject: [Python-Dev] Python + Ping Message-ID: Hello Everybody, I?m a beginner in python dev.. Well, i need to implement a external ping command and get the results to view the output. How can I do that? Per example, i need to ping and IP address and need to know if the host is down or up. Tka a lot? From trentm at ActiveState.com Wed Aug 10 21:00:31 2005 From: trentm at ActiveState.com (Trent Mick) Date: Wed, 10 Aug 2005 12:00:31 -0700 Subject: [Python-Dev] pdb: should next command be extended? 
In-Reply-To: References: <20050808154503.GB28005@panix.com> Message-ID: <20050810190031.GM15991@ActiveState.com> [Ilya Sandler wrote] > > > At OSCON, Anthony Baxter made the point that pdb is currently one of the > > more unPythonic modules. > > What is unpythonic about pdb? Is this part of Anthony's presentation > online? (Google found a summary and slides from presentation but they > don't say anything about pdb's deficiencies) Kevin Altis was policing him to 5 minutes for his lightning talk so he didn't have a lot of time to elaborate. :) His slides were more of the Lawrence Lessig, quick and pithy style rather than lots of explanatory text. I think overridability, i.e. being about to subclass the Pdb stuff to do useful things, or lack of it was the main beef. Mostly Anthony was echoing comments from others' experiences with trying to work with the Pdb code.
Trent -- Trent Mick TrentM at ActiveState.com From tdelaney at avaya.com Wed Aug 10 22:16:49 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Thu, 11 Aug 2005 06:16:49 +1000 Subject: [Python-Dev] Python + Ping Message-ID: <2773CAC687FD5F4689F526998C7E4E5F05CC00@au3010avexu1.global.avaya.com> Joseh Martins wrote: > I?m a beginner in python dev.. > > Well, i need to implement a external ping command and get the results > to view the output. How can I do that? > > Per example, i need to ping and IP address and need to know if the > host is down or up. python-dev is for discussion of the development *of* python, not development *with* python. This question should be posted to the python-list at python.org discussion list (or comp.lang.python newsgroup - they're the same thing) or possibly even the tutor at python.org mailing list. Tim Delaney From jcarlson at uci.edu Wed Aug 10 21:06:01 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 10 Aug 2005 12:06:01 -0700 Subject: [Python-Dev] Python + Ping In-Reply-To: References: Message-ID: <20050810120458.7812.JCARLSON@uci.edu> Your email is off-topic for python-dev, which is for the development OF Python. Repost your question on python-list. - Josiah Joseh Martins wrote: > > Hello Everybody, > > I?m a beginner in python dev.. > > Well, i need to implement a external ping command and get the results > to view the output. How can I do that? > > Per example, i need to ping and IP address and need to know if the > host is down or up. > > Tka a lot? 
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jcarlson%40uci.edu From raymond.hettinger at verizon.net Wed Aug 10 23:27:47 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 10 Aug 2005 17:27:47 -0400 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: Message-ID: <000001c59df2$5bc96960$60b9958d@oemcomputer> > > WindowsError > > ------------ > > > > This should be kept. Unlike module specific exceptions, this exception > > occurs in multiple places and diverse applications. It is appropriate > > to list as a builtin. > > > > "Too O/S specific" is not a reason for eliminating this. Looking at the > > codebase there does not appear to be a good substitute. Eliminating > > this one would break code, decrease clarity, and cause modules to grow > > competing variants. [Brett] > I unfortunately forgot to add that the exception would be moved under > os, so it would be more of a renaming than a removal. Isn't OSError already used for another purpose (non-platform dependent exceptions raised by the os module)? Raymond From bcannon at gmail.com Wed Aug 10 23:34:00 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 10 Aug 2005 14:34:00 -0700 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <000001c59df2$5bc96960$60b9958d@oemcomputer> References: <000001c59df2$5bc96960$60b9958d@oemcomputer> Message-ID: On 8/10/05, Raymond Hettinger wrote: > > > WindowsError > > > ------------ > > > > > > This should be kept. Unlike module specific exceptions, this > exception > > > occurs in multiple places and diverse applications. It is > appropriate > > > to list as a builtin. > > > > > > "Too O/S specific" is not a reason for eliminating this. Looking at > the > > > codebase there does not appear to be a good substitute. 
Eliminating > > > this one would break code, decrease clarity, and cause modules to > grow > > > competing variants. > > [Brett] > > I unfortunately forgot to add that the exception would be moved under > > os, so it would be more of a renaming than a removal. > > Isn't OSError already used for another purpose (non-platform dependent > exceptions raised by the os module)? > Don't quite follow what that has to do with making WindowsError become os.WindowsError. Yes, OSError is meant for platform-agnostic OS errors by the os module, but how does that affect the proposed move of WindowsError? -Brett From raymond.hettinger at verizon.net Thu Aug 11 01:45:40 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 10 Aug 2005 19:45:40 -0400 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: Message-ID: <000801c59e05$9e054de0$6f14c797@oemcomputer> > > Then I don't follow what you mean by "moved under os". > > In other words, to get the exception, do ``from os import > WindowsError``. Unfortunately we don't have a generic win module to > put it under. Maybe in the platform module instead? -1 on either. The WindowsError exception needs to in the main exception tree. It occurs in too many different modules and applications. That is a good reason for being in the main tree. If the name bugs you, I would support renaming it to PlatformError or somesuch. That would make it free for use with Mac errors and Linux errors. Also, it wouldn't tie a language feature to the name of an MS product. Raymond From bcannon at gmail.com Thu Aug 11 02:06:22 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 10 Aug 2005 17:06:22 -0700 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <000801c59e05$9e054de0$6f14c797@oemcomputer> References: <000801c59e05$9e054de0$6f14c797@oemcomputer> Message-ID: On 8/10/05, Raymond Hettinger wrote: > > > Then I don't follow what you mean by "moved under os". 
> > > > In other words, to get the exception, do ``from os import > > WindowsError``. Unfortunately we don't have a generic win module to > > put it under. Maybe in the platform module instead? > > -1 on either. The WindowsError exception needs to in the main exception > tree. It occurs in too many different modules and applications. That > is a good reason for being in the main tree. > Where is it used so much? In the stdlib, grepping for WindowsError recursively in Lib in 2.4 turns up only one module raising it (subprocess) and only two modules with a total of three places of catching it (ntpath once, urllib twice). In Module, there are no hits. > If the name bugs you, I would support renaming it to PlatformError or > somesuch. That would make it free for use with Mac errors and Linux > errors. Also, it wouldn't tie a language feature to the name of an MS > product. > I can compromise to this if others prefer this alternative. Anybody else have an opinion? -Brett From aahz at pythoncraft.com Thu Aug 11 02:16:06 2005 From: aahz at pythoncraft.com (Aahz) Date: Wed, 10 Aug 2005 17:16:06 -0700 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: References: <000801c59e05$9e054de0$6f14c797@oemcomputer> Message-ID: <20050811001606.GA14208@panix.com> On Wed, Aug 10, 2005, Brett Cannon wrote: > On 8/10/05, Raymond Hettinger wrote: >> >> If the name bugs you, I would support renaming it to PlatformError or >> somesuch. That would make it free for use with Mac errors and Linux >> errors. Also, it wouldn't tie a language feature to the name of an MS >> product. > > I can compromise to this if others prefer this alternative. Anybody > else have an opinion? Googling for "windowserror python" produces 800 hits. So yes, it does seem to be widely used. I'm -0 on renaming; +1 on leaving things as-is. 
-- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. From trentm at ActiveState.com Thu Aug 11 02:25:38 2005 From: trentm at ActiveState.com (Trent Mick) Date: Wed, 10 Aug 2005 17:25:38 -0700 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: References: <000801c59e05$9e054de0$6f14c797@oemcomputer> Message-ID: <20050811002538.GA27433@ActiveState.com> [Brett Cannon wrote] > Where is it used so much? In the stdlib, grepping for WindowsError > recursively in Lib in 2.4 turns up only one module raising it > (subprocess) and only two modules with a total of three places of > catching it (ntpath once, urllib twice). In Module, there are no > hits. Just a data point (not really following this thread): The PyWin32 sources raise WindowsError twice (one of them is win32\Demos\winprocess.py which is probably where subprocess got it from) and catch it in 11 places. Trent -- Trent Mick TrentM at ActiveState.com From bcannon at gmail.com Thu Aug 11 02:47:39 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 10 Aug 2005 17:47:39 -0700 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <20050811001606.GA14208@panix.com> References: <000801c59e05$9e054de0$6f14c797@oemcomputer> <20050811001606.GA14208@panix.com> Message-ID: On 8/10/05, Aahz wrote: > On Wed, Aug 10, 2005, Brett Cannon wrote: > > On 8/10/05, Raymond Hettinger wrote: > >> > >> If the name bugs you, I would support renaming it to PlatformError or > >> somesuch. That would make it free for use with Mac errors and Linux > >> errors. Also, it wouldn't tie a language feature to the name of an MS > >> product. > > > > I can compromise to this if others prefer this alternative. Anybody > > else have an opinion? > > Googling for "windowserror python" produces 800 hits. So yes, it does > seem to be widely used.
I'm -0 on renaming; +1 on leaving things as-is. But Googling for "attributeerror python" turns up 94,700, a factor of over 118. OSError turns up 20,300 hits; a factor of 25. Even EnvironmentError turns up more at 5,610 and I would expect most people don't use this class directly that often. While 800 might seem large, it's puny compared to other exceptions. Plus, if you look at the first 10 hits, 4 are from PEP 348, one of which is the top hit. =) -Brett From kbk at shore.net Thu Aug 11 03:34:43 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Wed, 10 Aug 2005 21:34:43 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200508110134.j7B1YhHG024463@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 352 open ( -2) / 2896 closed ( +8) / 3248 total ( +6) Bugs : 913 open ( +4) / 5162 closed (+10) / 6075 total (+14) RFE : 191 open ( +0) / 178 closed ( +0) / 369 total ( +0) New / Reopened Patches ______________________ compiler package: "global a; a=5" (2005-08-04) http://python.org/sf/1251748 opened by Armin Rigo Simplying Tkinter's event loop (2005-08-05) http://python.org/sf/1252236 opened by Michiel de Hoon modulefinder misses modules (2005-08-05) http://python.org/sf/1252550 opened by Thomas Heller poplib list() docstring fix (2005-08-05) CLOSED http://python.org/sf/1252706 opened by Steve Greenland QuickTime API needs corrected object types (2005-08-09) http://python.org/sf/1254695 opened by Christopher K Davis GCC detection for runtime_library_dirs when ccache is used (2005-08-09) http://python.org/sf/1254718 opened by Seo Sanghyeon Patches Closed ______________ Faster commonprefix in macpath, ntpath, etc. 
(2005-01-20) http://python.org/sf/1105730 closed by birkenfeld poplib list() docstring fix (2005-08-05) http://python.org/sf/1252706 closed by birkenfeld absolute paths cause problems for MSVC (2003-10-21) http://python.org/sf/827386 closed by loewis Fix LINKCC (Bug #1189330) (2005-07-15) http://python.org/sf/1239112 closed by loewis file.encoding support for file.write and file.writelines (2005-06-04) http://python.org/sf/1214889 closed by birkenfeld st_gen and st_birthtime support for FreeBSD (2005-04-11) http://python.org/sf/1180695 closed by loewis Add unicode for sys.argv, os.environ, os.system (2005-07-02) http://python.org/sf/1231336 closed by loewis Refactoring Python/import.c (2004-12-30) http://python.org/sf/1093253 closed by theller New / Reopened Bugs ___________________ cgitb gives wrong lineno inside try:..finally: (2005-08-03) http://python.org/sf/1251026 opened by Rob W.W. Hooft Decoding with unicode_internal segfaults on UCS-4 builds (2005-08-03) http://python.org/sf/1251300 opened by nhaldimann smtplib and email.py (2005-08-03) http://python.org/sf/1251528 opened by Cosmin Nicolaescu Python 2.4.1 crashes when importing the attached script (2005-08-04) http://python.org/sf/1251631 opened by Viktor Ferenczi Fail codecs.lookup() on 'mbcs' and 'tactis' (2005-08-04) http://python.org/sf/1251921 reopened by lemburg Fail codecs.lookup() on 'mbcs' and 'tactis' (2005-08-04) http://python.org/sf/1251921 opened by liturgist Issue with telnetlib read_until not timing out (2005-08-04) http://python.org/sf/1252001 opened by padded IOError after normal write (2005-08-04) http://python.org/sf/1252149 opened by Patrick Gerken os.system on win32 can't handle pathnames with spaces (2005-08-05) CLOSED http://python.org/sf/1252733 opened by Ori Avtalion non-admin install may fail (win xp pro) (2005-07-05) CLOSED http://python.org/sf/1232947 reopened by loewis raw_input() displays wrong unicode prompt (2005-01-10) http://python.org/sf/1099364 reopened by prikryl Python 
interpreter unnecessarily linked against c++ runtime (2005-08-08) http://python.org/sf/1254125 opened by Zak Kipling parser fails on long non-ascii lines if coding declared (2005-08-08) CLOSED http://python.org/sf/1254248 opened by Oleg Noga Docs for list.extend() are incorrect (2005-08-08) CLOSED http://python.org/sf/1254362 opened by Kent Johnson "appropriately decorated" is undefined in MultiFile.push doc (2005-08-09) http://python.org/sf/1255218 opened by Alan float('-inf') (2005-08-10) http://python.org/sf/1255395 opened by Steven Bird bug in use of __getattribute__ ? (2005-08-10) CLOSED http://python.org/sf/1256010 opened by sylvain ferriol Bugs Closed ___________ numarray in debian python 2.4.1 (2005-08-02) http://python.org/sf/1249903 closed by birkenfeld incorrect description of range function (2005-08-02) http://python.org/sf/1250306 closed by birkenfeld isinstance() fails depending on how modules imported (2005-08-01) http://python.org/sf/1249615 closed by hgibson50 set of pdb breakpoint fails (2005-07-30) http://python.org/sf/1248127 closed by birkenfeld Fail codecs.lookup() on 'mbcs' and 'tactis' (2005-08-04) http://python.org/sf/1251921 closed by loewis os.system on win32 can't handle pathnames with spaces (2005-08-05) http://python.org/sf/1252733 closed by salty-horse distutils: MetroWerks support can go (2005-07-17) http://python.org/sf/1239948 closed by jackjansen non-admin install may fail (win xp pro) (2005-07-05) http://python.org/sf/1232947 closed by loewis segfault in os module (2005-06-24) http://python.org/sf/1226969 closed by loewis LINKCC incorrect (2005-04-25) http://python.org/sf/1189330 closed by loewis parser fails on long non-ascii lines if coding declared (2005-08-08) http://python.org/sf/1254248 closed by doerwalter Docs for list.extend() are incorrect (2005-08-08) http://python.org/sf/1254362 closed by birkenfeld bug in use of __getattribute__ ? 
(2005-08-10) http://python.org/sf/1256010 closed by birkenfeld From raymond.hettinger at verizon.net Thu Aug 11 03:44:05 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 10 Aug 2005 21:44:05 -0400 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <20050811001606.GA14208@panix.com> Message-ID: <000301c59e16$28d453c0$8b33c797@oemcomputer> [Brett] > I can compromise to this if others prefer this alternative. Anybody > else have an opinion? We're not opinion shopping -- we're looking for analysis. Py3.0 is not supposed to be just a Python variant -- it is supposed to be better. It is not about making compromises -- it is about only making changes that are clear improvements. First, do no harm. It is an abuse of the PEP process to toss up one random idea after another with whimsical justifications, zero research, zero analysis of the implications, no respect for existing code, no recognition that the current design is somewhat successful, and contravention of basic design principles (Zen of Python). The only thing worse is wasting everyone's time by sticking to the proposals like glue when others take the time to think it through and offer sound reasons why the proposal is not a good idea. [Aahz] > Googling for "windowserror python" produces 800 hits. So yes, it does > seem to be widely used. I'm -0 on renaming; +1 on leaving things as-is. Well said. Squirreling WindowsError away in another namespace harms existing code, reduces clarity, and offers no offsetting gains. It is simply crummy design to take a multi-module, multi-application exception and push it down into a module namespace. +0 on renaming; +1 on leaving as-is.
Raymond From raymond.hettinger at verizon.net Thu Aug 11 05:32:57 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 10 Aug 2005 23:32:57 -0400 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: Message-ID: <000901c59e25$5e823fa0$7c06a044@oemcomputer> > There > is a reason you listed writing a PEP on your own on the "School of > Hard Knocks" list; it isn't easy. I am trying my best here. Hang in there. Do what you can to make sure we get a result we can live with. -- R From foom at fuhm.net Thu Aug 11 16:57:21 2005 From: foom at fuhm.net (James Y Knight) Date: Thu, 11 Aug 2005 10:57:21 -0400 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <001701c59d38$3517aa80$302ac797@oemcomputer> References: <001701c59d38$3517aa80$302ac797@oemcomputer> Message-ID: <97674DB3-344F-4271-8857-8F798FA29D50@fuhm.net> On Aug 9, 2005, at 7:15 PM, Raymond Hettinger wrote: > The data gathered by Jack and Steven's research indicate that the > number > of cases where TerminatingException would be useful is ZERO. Try > not to > introduce a new builtin that no one will ever use. Try not to add > a new > word whose only function is to replace a two-word tuple (TOOWTDI). > Try > not to unnecessarily nest the tree (FITBN). Try not to propose > solutions to problems that don't exist (PBP). I disagree. TerminatingException is useful. For the immediate future, I'd like to be able to write code like this (I'm assuming that "except:" means what it means now, because changing that for Py2.5 would be insane): try: TerminatingException except NameError: # compatibility with python < 2.5 TerminatingException = (KeyboardInterrupt, SystemExit) try: foo.... 
except TerminatingException: raise except: print "error message" What this gets me: 1) easy backwards compatibility with earlier pythons which still have KeyboardInterrupt and SystemExit under Exception and don't provide TerminatingException 2) I still catch string exceptions, in case anyone raises one 3) Forward compatibility with pythons that add more types of terminating exceptions. James From foom at fuhm.net Thu Aug 11 19:10:28 2005 From: foom at fuhm.net (James Y Knight) Date: Thu, 11 Aug 2005 13:10:28 -0400 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <000801c59e05$9e054de0$6f14c797@oemcomputer> References: <000801c59e05$9e054de0$6f14c797@oemcomputer> Message-ID: On Aug 10, 2005, at 7:45 PM, Raymond Hettinger wrote: >>> Then I don't follow what you mean by "moved under os". >>> >> >> In other words, to get the exception, do ``from os import >> WindowsError``. Unfortunately we don't have a generic win module to >> put it under. Maybe in the platform module instead? >> > > -1 on either. The WindowsError exception needs to be in the main > exception > tree. It occurs in too many different modules and applications. That > is a good reason for being in the main tree. > > If the name bugs you, I would support renaming it to PlatformError or > somesuch. That would make it free for use with Mac errors and Linux > errors. Also, it wouldn't tie a language feature to the name of an MS > product. WindowsError is an important distinction because its error codes are to be interpreted as being from Microsoft's windows error code list. That is a useful meaning. PlatformError is completely meaningless. Whether or not Python should really be raising errors with error numbers from the MS error number list instead of translating them to standard error codes is another issue...but as long as it does so, it should do so using WindowsError.
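The compatibility idiom from the TerminatingException message earlier in this thread can be sketched in runnable form as follows. This is only a minimal sketch, assuming an interpreter in which `TerminatingException` is not a builtin (it was a PEP proposal), so the `NameError` fallback is the path actually taken; `run_guarded` is a hypothetical helper name used for illustration.

```python
import sys

# Assumption: TerminatingException is not defined by the interpreter,
# so the NameError branch supplies the two-exception tuple instead.
try:
    TerminatingException
except NameError:
    TerminatingException = (KeyboardInterrupt, SystemExit)


def run_guarded(func):
    """Call func(); let terminating exceptions through, report anything else."""
    try:
        return func()
    except TerminatingException:
        # Interrupts and interpreter exit must never be swallowed.
        raise
    except:  # deliberately bare, mirroring the idiom in the message
        print("error message:", sys.exc_info()[1])
        return None
```

The order of the two `except` clauses is what makes the idiom work: the terminating exceptions are re-raised before the catch-all clause gets a chance to see them.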
James From foom at fuhm.net Thu Aug 11 23:19:29 2005 From: foom at fuhm.net (James Y Knight) Date: Thu, 11 Aug 2005 17:19:29 -0400 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <20050811114121.781B.JCARLSON@uci.edu> References: <001701c59d38$3517aa80$302ac797@oemcomputer> <97674DB3-344F-4271-8857-8F798FA29D50@fuhm.net> <20050811114121.781B.JCARLSON@uci.edu> Message-ID: <7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net> On Aug 11, 2005, at 2:41 PM, Josiah Carlson wrote: > Remember, the Exception reorganization is for Python 3.0/3k/whatever, > not for 2.5. Huh, I could *swear* we were talking about fixing things for 2.5...but I see at least the current version of the PEP says it's talking about 3.0. If that's true, this is hardly worth discussing as 3.0 is never going to happen anyways. And here I was hoping this was an actual proposal. Ah well, then. James From jimjjewett at gmail.com Thu Aug 11 23:21:19 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 11 Aug 2005 17:21:19 -0400 Subject: [Python-Dev] [Python-checkins] python/dist/src/Objects setobject.c, 1.45, 1.46 In-Reply-To: <20050811075856.1C4B31E4004@bag.python.org> References: <20050811075856.1C4B31E4004@bag.python.org> Message-ID: (1) Is there a reason that you never shrink sets for discard/remove/pop? (set difference will do a one-time shrink, if there are enough dummy entries, but even then, it doesn't look at the %filled, so a merge-related overallocation will stick around) I note that you do the same with dicts, but I think sets are a more natural candidate for "this is the set of things I still have to process, in any order". (I suppose enforcing an order with deque may be faster -- unless I'm worried about duplicates.) (2) When adding an element, you check that if (!(so->used > n_used && so->fill*3 >= (so->mask+1)*2)) Is there any reason to use that +1?
Without it, resizes will happen element sooner, but probably not much more often -- and you could avoid an add on every insert. (I suppose dictionaries have the same question.) (3) In set_merge, when finding the new size, you use (so->fill + other->used) Why so->fill? If it is different from so->used, then the extras are dummy entries that it would be good to replace. (I note that dictobject does use ->used.) -jJ From bcannon at gmail.com Thu Aug 11 23:28:01 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 11 Aug 2005 14:28:01 -0700 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net> References: <001701c59d38$3517aa80$302ac797@oemcomputer> <97674DB3-344F-4271-8857-8F798FA29D50@fuhm.net> <20050811114121.781B.JCARLSON@uci.edu> <7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net> Message-ID: On 8/11/05, James Y Knight wrote: > On Aug 11, 2005, at 2:41 PM, Josiah Carlson wrote: > > Remember, the Exception reorganization is for Python 3.0/3k/whatever, > > not for 2.5 . > > Huh, I could *swear* we were talking about fixing things for > 2.5...but I see at least the current version of the PEP says it's > talking about 3.0. If that's true, this is hardly worth discussing as > 3.0 is never going to happen anyways. > And why do you think it will never happen? Guido has already said publicly multiple times that the 2.x branch will not go past 2.9, so unless Python goes stale there will be a 3.0 release. Python 3.0 might not be around the corner, but will come eventually and this stuff needs to get done at some point. 
-Brett From gvanrossum at gmail.com Fri Aug 12 00:10:24 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu, 11 Aug 2005 15:10:24 -0700 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net> References: <001701c59d38$3517aa80$302ac797@oemcomputer> <97674DB3-344F-4271-8857-8F798FA29D50@fuhm.net> <20050811114121.781B.JCARLSON@uci.edu> <7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net> Message-ID: On 8/11/05, James Y Knight wrote: > If that's true, this is hardly worth discussing as > 3.0 is never going to happen anyways. You are wrong. So wrong. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Fri Aug 12 01:36:20 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 11 Aug 2005 19:36:20 -0400 Subject: [Python-Dev] [Python-checkins] python/dist/src/Objectssetobject.c, 1.45, 1.46 In-Reply-To: Message-ID: <000801c59ecd$7afe5620$a838c797@oemcomputer> [Jim Jewett] > (1) Is there a reason that you never shrink sets for discard/remove/pop? Yes, to avoid adding an O(n) step to what would otherwise be an O(1) operation. These tight, granular methods are so fast that even checking for potential resizes would impact their performance. Also, I was keeping the dict philosophy of shrinking only when an item is added. That approach prevents thrashing in the face of a series of alternating add/pop operations. OTOH, practicality beats purity. Set differencing demands some downsizing code (see below). > (set difference will do a one-time shrink, if there are enough dummy > entries, but even then, it doesn't look at the %filled, so a > merge-related overallocation will stick around) > > I note the you do the same with dicts, but I think sets are a more > natural candidate for "this is the set of things I still have to > process, in any order". 
(I suppose enforcing an order with deque may > be faster -- unless I'm worried about duplicates.) It is all about balancing trade-offs. Dummies have very little impact on iteration speed, it is the used/(mask+1) sparseness ratio that matters. Also, they have very little impact on lookup time unless the table is nearly full (and it affects not-found searches more than successful searches). Resizing is not a cheap operation. The right balance is very likely application dependent. For now, my goal is to deviate from dict code only for clear improvements (i.e. lookups based on entry rather than just the key). > (2) When adding an element, you check that > > if (!(so->used > n_used && so->fill*3 >= (so->mask+1)*2)) > > Is there any reason to use that +1? Without it, resizes will happen > element sooner, but probably not much more often -- and you could > avoid an add on every insert. > (I suppose dictionaries have the same question.) Without the +1, small dicts and sets could only hold four entries instead of five (which has shown itself to be a better cutoff point). Even if this didn't apply to sets, I still aspire to keep true to dictobject.c. That code has been thoroughly tested and tuned. By starting with mature code, I've saved years of evolution. Also, there is a maintenance benefit -- developers familiar with dictobject.c will find setobject.c to be instantly recognizable. There is only one new trick, set_swap_bodies(), and that is thoroughly commented. > (3) In set_merge, when finding the new size, you use (so->fill + other- > >used) > > Why so->fill? If it is different from so->used, then the extras are > dummy entries that it would be good to replace. > (I note that dictobject does use ->used.) The cop-out answer is that this is what is done in PyDict_Merge(). I believe the reasoning behind that design was to provide the best guess as to the maximum amount of space that could be consumed by the impending insertions. 
If they will all fit, then resizing is skipped. The approach reflects a design that values avoiding resizes more than it values eliminating dummy entries. AFAICT, dummy elimination is a by-product of resizing rather than its goal. With sets, I followed that design except for set differencing. In dictionaries, there is no parallel operation of mass deletion. I had to put in some control so that s-=t wouldn't leave a giant set with only a handful of non-dummy entries. This reflects the space saving goal for the Py2.5 updates. There was also a goal to eliminate redundant calls to PyObject_Hash(). The nice performance improvement was an unexpected bonus. Your questions are good. Thanks for reading the code and thinking about it. Hope you enjoy the new implementation which for the first time can outperform dicts in terms of both space and speed. Raymond From raymond.hettinger at verizon.net Fri Aug 12 01:44:54 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 11 Aug 2005 19:44:54 -0400 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: <7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net> Message-ID: <000901c59ece$acdddd40$a838c797@oemcomputer> [James Y Knight] > Huh, I could *swear* we were talking about fixing things for > 2.5...but I see at least the current version of the PEP says it's > talking about 3.0. If that's true, this is hardly worth discussing as > 3.0 is never going to happen anyways. > > And here I was hoping this was an actual proposal. Ah well, then. Whenever a 3.0 aimpoint is agreed upon, as much as possible will be introduced before then (pretty much everything that doesn't break code). Ideally, the final step to 3.0 will consist primarily of dropping obsolete things that had been kept only for backwards compatibility.
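For the setobject.c thread above, the "+1" question can be checked with a little arithmetic. The following is a rough Python sketch of only the fill-ratio half of the quoted C test (`so->fill*3 >= (so->mask+1)*2`); it ignores the `so->used > n_used` half, and it assumes the smallest table has 8 slots (mask of 7). The function names are hypothetical and are not part of CPython.

```python
def needs_resize(fill, mask, plus_one=True):
    """Fill-ratio test: fill*3 >= (mask+1)*2, or fill*3 >= mask*2 without the +1."""
    slots = mask + 1 if plus_one else mask
    return fill * 3 >= slots * 2


def entries_before_resize(mask, plus_one=True):
    """Largest fill count for which the fill-ratio test is still false."""
    fill = 0
    while not needs_resize(fill + 1, mask, plus_one):
        fill += 1
    return fill


# Assumed smallest table: 8 slots, so mask == 7.
with_plus_one = entries_before_resize(7, plus_one=True)
without_plus_one = entries_before_resize(7, plus_one=False)
print(with_plus_one, without_plus_one)
```

Under that 8-slot assumption the two results are five and four, which lines up with the explanation that dropping the +1 would let small dicts and sets hold only four entries instead of five.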
Raymond From anthony at interlink.com.au Fri Aug 12 02:51:55 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 11 Aug 2005 17:51:55 -0700 Subject: [Python-Dev] plans for 2.4.2 and 2.5a1 Message-ID: <200508111751.56899.anthony@interlink.com.au> So I'm currently planning for a 2.4.2 sometime around mid September. I figure we cut a release candidate either on the 7th or 14th, and a final a week later. In addition, I'd like to suggest we think about a first alpha of 2.5 sometime during March 2006, with a final release sometime around May-June. This would mean (assuming people are happy with this) we need to make a list of what's still outstanding for 2.5. There's a bunch of accepted PEPs that are waiting for code. Once that's done, there will be a final 2.4.3 sometime after or close to the 2.5 final release. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From anthony at interlink.com.au Fri Aug 12 03:02:42 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 11 Aug 2005 18:02:42 -0700 Subject: [Python-Dev] pdb: should next command be extended? In-Reply-To: References: <20050808154503.GB28005@panix.com> Message-ID: <200508111802.44357.anthony@interlink.com.au> On Monday 08 August 2005 20:13, Ilya Sandler wrote: > > At OSCON, Anthony Baxter made the point that pdb is currently one of the > > more unPythonic modules. > > What is unpythonic about pdb? Is this part of Anthony's presentation > online? (Google found a summary and slides from presentation but they > don't say anything about pdb's deficiencies) It was a lightning talk, I'll put the slides up somewhere at some point. My experience with pdb is that it's more or less impossible to extend or subclass it in any way, and the code is pretty nasty. In addition, pretty much everyone I asked "which modules in the std lib need to be seriously fixed" listed pdb first (and sometimes first, second and third). 
Anthony -- Anthony Baxter It's never too late to have a happy childhood. From anthony at interlink.com.au Fri Aug 12 03:14:10 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 11 Aug 2005 18:14:10 -0700 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <42F68C36.4090208@v.loewis.de> References: <42E93940.6080708@v.loewis.de> <42F68754.6090400@minkirri.apana.org.au> <42F68C36.4090208@v.loewis.de> Message-ID: <200508111814.13072.anthony@interlink.com.au> On Sunday 07 August 2005 15:33, Martin v. Löwis wrote: > Ah, ok. That's true. It doesn't mean you can't do proper merging > with subversion - it only means that it is harder, as you need to > figure out the revision range that you want to merge. > > If this is too painful, you can probably use subversion to store > the relevant information. For example, you could define a custom > property on the directory, last_merge_from_trunk, which you > would always update after you have done a merge operation. Then > you don't have to look through history to find out when you > last merged. This is what I do with shtoom - I have properties branchURI and branchRev on the root of the branch. I can then use these when landing the branch. It seems to work well enough for me. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From bob at redivi.com Fri Aug 12 03:18:30 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu, 11 Aug 2005 15:18:30 -1000 Subject: [Python-Dev] pdb: should next command be extended? In-Reply-To: <200508111802.44357.anthony@interlink.com.au> References: <20050808154503.GB28005@panix.com> <200508111802.44357.anthony@interlink.com.au> Message-ID: <24EEDE5B-4511-40D4-9C16-8A33C4ACE1C8@redivi.com> On Aug 11, 2005, at 3:02 PM, Anthony Baxter wrote: > On Monday 08 August 2005 20:13, Ilya Sandler wrote: > >>> At OSCON, Anthony Baxter made the point that pdb is currently one >>> of the >>> more unPythonic modules.
>>> >> >> What is unpythonic about pdb? Is this part of Anthony's presentation >> online? (Google found a summary and slides from presentation but they >> don't say anything about pdb's deficiencies) >> > > It was a lightning talk, I'll put the slides up somewhere at some > point. > My experience with pdb is that it's more or less impossible to > extend or > subclass it in any way, and the code is pretty nasty. In addition, > pretty > much everyone I asked "which modules in the std lib need to be > seriously > fixed" listed pdb first (and sometimes first, second and third). One thing PDB needs is a mode that runs as a background thread and opens up a socket so that another Python process can talk to it, for embedded/remote/GUI debugging. This is what IDLE, Wing, and WinPDB (haven't tried it yet ) do. Unfortunately, most of the other Python IDE's run interpreters and debuggers in-process, so it makes them unsuitable for developing GUI and embedded apps and opens you up for crashing the IDE as well as whatever code you're trying to fix. -bob From theller at python.net Fri Aug 12 10:42:09 2005 From: theller at python.net (Thomas Heller) Date: Fri, 12 Aug 2005 10:42:09 +0200 Subject: [Python-Dev] Exception Reorg PEP revised yet again References: <000801c59e05$9e054de0$6f14c797@oemcomputer> Message-ID: Brett Cannon writes: > On 8/10/05, Raymond Hettinger wrote: >> > > Then I don't follow what you mean by "moved under os". >> > >> > In other words, to get the exception, do ``from os import >> > WindowsError``. Unfortunately we don't have a generic win module to >> > put it under. Maybe in the platform module instead? >> >> -1 on either. The WindowsError exception needs to in the main exception >> tree. It occurs in too many different modules and applications. That >> is a good reason for being in the main tree. >> > > Where is it used so much? 
In the stdlib, grepping for WindowsError > recursively in Lib in 2.4 turns up only one module raising it > (subprocess) and only two modules with a total of three places of > catching it (ntpath once, urllib twice). In Module, there are no > hits. > I don't know how you've been grepping, but the Python api functions to raise WindowsErrors are named like PyErr_SetFromWindowsErr() or so. Typically, WindowsErrors are raised when Win32 API functions fail. In the core extension modules, I find at least mmapmodule.c, posixmodule.c, _subprocess.c, and _winreg.c raising them. It may be a bit hidden, because the docs for _winreg mention only EnvironmentError, but they are wrong: C:\>py Python 2.5a0 (#60, Jul 4 2005, 19:53:27) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import _winreg >>> _winreg.OpenKey(_winreg.HKEY_CLASSES_ROOT, "blah") Traceback (most recent call last): File "", line 1, in ? WindowsError: [Errno 2] Das System kann die angegebene Datei nicht finden >>> >> If the name bugs you, I would support renaming it to PlatformError or >> somesuch. That would make it free for use with Mac errors and Linux >> errors. Also, it wouldn't tie a language feature to the name of an >> MS product. >> > > I can compromise to this if others prefer this alternative. Anybody > else have an opinion? Win32 has the FormatError() api to convert error codes into descriptions - these descriptions are very useful, as are the error codes when you catch errors in client code. I would say as long as the Python core contains win32 specific modules like _winreg WindowsError should stay. For the name, I have no preference but I see no need to change it. Thomas PS: For ctypes, it doesn't matter if WindowsError stays or not. No problem to invent my own WindowsError if it goes away. 
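The point about error codes being useful to client code can be illustrated with a small sketch. It assumes a modern Python in which `WindowsError`, where it exists, is an alias of `OSError` and carries a `winerror` attribute; on other platforms the name is missing, so the sketch falls back to `OSError`. `WinError` and `describe_failure` are hypothetical names introduced only for this example.

```python
import errno
import os

# Assumption: on non-Windows platforms the WindowsError name is undefined,
# so OSError stands in for it in this sketch.
try:
    WinError = WindowsError
except NameError:
    WinError = OSError


def describe_failure(path):
    """Try to open path; on failure return (errno, winerror or None, message)."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except WinError as exc:
        # winerror is only present for errors that came from the Win32 API.
        return exc.errno, getattr(exc, "winerror", None), exc.strerror
    os.close(fd)
    return None


result = describe_failure(os.path.join(os.getcwd(), "no-such-file-for-sketch"))
print(result)
```

Client code can then branch on the numeric code rather than parsing the localized message text, which matters when the description comes back in the system language as in the session quoted above.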
From tzot at mediconsa.com Fri Aug 12 10:44:19 2005 From: tzot at mediconsa.com (Christos Georgiou) Date: Fri, 12 Aug 2005 11:44:19 +0300 Subject: [Python-Dev] Terminology for PEP 343 References: <000d01c57dbc$71df2420$1330cb97@oemcomputer> <2macl7xxpa.fsf@starship.python.net> Message-ID: "Michael Hudson" wrote in message news:2macl7xxpa.fsf at starship.python.net... > > Guard? Monitor? Don't really like either of these. > I know I am late, but since guard means something else, 'sentinel' (in the line of __enter__ and __exit__ interpretation) could be an alternative. Tongue in cheek. From dw at botanicus.net Fri Aug 12 11:05:46 2005 From: dw at botanicus.net (David Wilson) Date: Fri, 12 Aug 2005 10:05:46 +0100 Subject: [Python-Dev] dev listinfo page (was: Re: Python + Ping) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F05CC00@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F05CC00@au3010avexu1.global.avaya.com> Message-ID: <42FC666A.90206@botanicus.net> Hello, Would it perhaps be an idea, given the number of users posting to the dev list, to put a rather obvious warning on the listinfo page: http://mail.python.org/mailman/listinfo/python-dev Something like
Do not post general Python questions to this list! For help with Python please see the Python help page!
David. From mwh at python.net Fri Aug 12 13:29:49 2005 From: mwh at python.net (Michael Hudson) Date: Fri, 12 Aug 2005 12:29:49 +0100 Subject: [Python-Dev] plans for 2.4.2 and 2.5a1 In-Reply-To: <200508111751.56899.anthony@interlink.com.au> (Anthony Baxter's message of "Thu, 11 Aug 2005 17:51:55 -0700") References: <200508111751.56899.anthony@interlink.com.au> Message-ID: <2m8xz7wfyq.fsf@starship.python.net> Anthony Baxter writes: > So I'm currently planning for a 2.4.2 sometime around mid September. I figure > we cut a release candidate either on the 7th or 14th, and a final a week > later. Cool. I'm not sure how many outstanding bugs should be fixed before 2.4.2. Some stuff to do with files with PEP 263 style declarations? (Walter? I've lost track of these). I think I should probably just check my fix for "PyXxx_Check(x) trusts x->ob_type->tp_mro" (http://python.org/sf/1153075) in to both branches, unless someone can think of a good reason not to (Armin?). (The whole area could do with some work, really, but that's another story). > In addition, I'd like to suggest we think about a first alpha of 2.5 sometime > during March 2006, with a final release sometime around May-June. This would > mean (assuming people are happy with this) we need to make a list of what's > still outstanding for 2.5. There's a bunch of accepted PEPs that are waiting > for code. Once that's done, there will be a final 2.4.3 sometime after or > close to the 2.5 final release. I have some outstanding patches: 1) My PEP 343 implementation (http://python.org/sf/1235943). Needs reviewing, but docs are in another patch. I also recently realized that my patch is incomplete, we should accept stuff like this: with cm as (a,b,c): ... where cm.__enter__ returns a 3-sequence. My patch just allows a NAME after the 'as' pseudo-keyword (if anyone else wants to fix this, be my guest :) 2) The new-style exceptions patch (http://python.org/sf/1104669). 
This mostly needs documentation, but could also do with review/testing. 3) "explicit sign variable for longs" (http://python.org/sf/1177779). This is a user-invisible patch, really, so I'm not so concerned about it (I'd like to follow it up by emitting DeprecationWarnings on ob_size abuse in 2.6 and disallowing it in 2.7 -- or maybe we could even emit DeprecationWarnings in 2.5 already). 4) "__slots__ for subclasses of variable length types" (http://python.org/sf/1173475) -- this is very pie-in-the-sky and in fact the attached patch is completely broken, but I think work in this area would still be a good thing. Review the others before looking at this one, please :) ... then there's the ast-branch, of course ... Is there a 2.5 release PEP yet? Cheers, mwh -- If i don't understand lisp, it would be wise to not bray about how lisp is stupid or otherwise criticize, because my stupidity would be archived and open for all in the know to see. -- Xah, comp.lang.lisp From walter at livinglogic.de Fri Aug 12 15:34:35 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Fri, 12 Aug 2005 15:34:35 +0200 Subject: [Python-Dev] plans for 2.4.2 and 2.5a1 In-Reply-To: <2m8xz7wfyq.fsf@starship.python.net> References: <200508111751.56899.anthony@interlink.com.au> <2m8xz7wfyq.fsf@starship.python.net> Message-ID: <42FCA56B.5060604@livinglogic.de> Michael Hudson wrote: > Anthony Baxter writes: > >>So I'm currently planning for a 2.4.2 sometime around mid September. I figure >>we cut a release candidate either on the 7th or 14th, and a final a week >>later. > > Cool. I'm not sure how many outstanding bugs should be fixed before > 2.4.2. Some stuff to do with files with PEP 263 style declarations? > (Walter? I've lost track of these). 
True, there's a whole bunch of them (mostly duplicates):
Bug #1076985: Incorrect behaviour of StreamReader.readline leads to crash (fixed)
Bug #1089395: segfault/assert in tokenizer (fixed)
Bug #1098990: codec readline() splits lines apart (fixed)
Bug #1163244: Syntax error on large file with MBCS encoding (open)
Bug #1175396: codecs.readline sometimes removes newline chars (open)
Bug #1178484: Erroneous line number error in Py2.4.1 (open)
Bug #1200686: SyntaxError raised on win32 for correct files (open, probably duplicate)
Bug #1211639: parser tells invalid syntax with correct code (duplicate)
Bug #1218930: Parser chokes on big files (duplicate)
Bug #1225059: Line endings problem with Python 2.4.1 cause imports to fail (duplicate)
Bug #1241507: StreamReader broken for byte string to byte string codecs (fixed)
Bug #1251631: Python 2.4.1 crashes when importing the attached script (open, probably duplicate)
Patch #1101726: Patch for potential buffer overrun in tokenizer.c (applied)
Most of them are fixed. #1178484 is waiting for a final OK. Bye, Walter Dörwald From barry at python.org Fri Aug 12 15:40:22 2005 From: barry at python.org (Barry Warsaw) Date: Fri, 12 Aug 2005 09:40:22 -0400 Subject: [Python-Dev] plans for 2.4.2 and 2.5a1 In-Reply-To: <200508111751.56899.anthony@interlink.com.au> References: <200508111751.56899.anthony@interlink.com.au> Message-ID: <1123854022.10627.2.camel@geddy.wooz.org> On Thu, 2005-08-11 at 20:51, Anthony Baxter wrote: > So I'm currently planning for a 2.4.2 sometime around mid September. I figure > we cut a release candidate either on the 7th or 14th, and a final a week > later. Cool. I'd like to commit the patches in this bug report: https://sourceforge.net/tracker/index.php?func=detail&aid=900092&group_id=5470&atid=105470 which fixes a long standing hotshot bug. Any objections? -Barry -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050812/33f20adc/attachment.pgp From paolo_veronelli at libero.it Fri Aug 12 17:14:24 2005 From: paolo_veronelli at libero.it (Paolino) Date: Fri, 12 Aug 2005 17:14:24 +0200 Subject: [Python-Dev] set.remove feature/bug Message-ID: <42FCBCD0.1000406@libero.it> I can't contact the sourceforge bug tracker, sorry. set.remove is trying to freeze sets when they are used as keys, no matter whether a __hash__ method is defined. This is inconsistent with Set.remove and dict.__delitem__ & co. If this is a feature, then I ask strongly to keep the sets module in the stdlib forever. Or if there is a workaround, please tell me here because python-list didn't help.

    class H(set):
        def __hash__(self):
            return id(self)

    s = H()
    f = set()
    f.add(s)
    f.remove(s)  # this fails

Regards Paolino From raymond.hettinger at verizon.net Fri Aug 12 17:44:38 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 12 Aug 2005 11:44:38 -0400 Subject: [Python-Dev] set.remove feature/bug In-Reply-To: <42FCBCD0.1000406@libero.it> Message-ID: <002701c59f54$c0074ba0$dd1ecb97@oemcomputer> [Paolino] > I can't contact sourceforge bug tracker sorry. I've added a bug report for you: www.python.org/sf/1257731 > set.remove is trying to freeze sets when they are used as keys. No matter > if an __hash__ method is defined. Will fix. Feel free to email me off-list with any questions. Raymond From tzot at mediconsa.com Fri Aug 12 18:21:57 2005 From: tzot at mediconsa.com (Christos Georgiou) Date: Fri, 12 Aug 2005 19:21:57 +0300 Subject: [Python-Dev] plans for 2.4.2 and 2.5a1 References: <200508111751.56899.anthony@interlink.com.au> <2m8xz7wfyq.fsf@starship.python.net> Message-ID: "Michael Hudson" wrote in message news:2m8xz7wfyq.fsf at starship.python.net...
> Anthony Baxter writes: > >> So I'm currently planning for a 2.4.2 sometime around mid September. I >> figure >> we cut a release candidate either on the 7th or 14th, and a final a week >> later. > > Cool. I'm not sure how many outstanding bugs should be fixed before > 2.4.2. Some stuff to do with files with PEP 263 style declarations? > (Walter? I've lost track of these). This is a serious issue (spurious syntax errors). One bug about files with encoding declarations is www.python.org/sf/1163244 . So far, it seems that source files having a size of f*n+x (for some small indeterminate value of x, and f is a power of 2 like 512 or 1024) occasionally fail to compile with spurious syntax errors. (I once had a file show up the line with the "syntax error", and the reported line was comprised half from the failing line and half from the line above --unfortunately I kept the file for examination in a USB key that some colleague formatted). The syntax errors disappear if the coding declaration is removed or if some blank lines are inserted before the failing line. I think this occurs only on Windows, so it should be something to do with line endings and buffering. At the moment I'm trying to create a minimal file that when imported fails with 2.4.1 . I'll update the case as soon as I have one, but I wanted to draw some attention in python-dev in case it rings a bell. From tzot at mediconsa.com Fri Aug 12 18:32:16 2005 From: tzot at mediconsa.com (Christos Georgiou) Date: Fri, 12 Aug 2005 19:32:16 +0300 Subject: [Python-Dev] plans for 2.4.2 and 2.5a1 References: <200508111751.56899.anthony@interlink.com.au><2m8xz7wfyq.fsf@starship.python.net> Message-ID: > At the moment I'm trying to create a minimal file that when imported fails > with 2.4.1 . I'll update the case as soon as I have one, but I wanted to > draw some attention in python-dev in case it rings a bell. 
Please ignore my previous message --through gmane I saw only mwh's message, and after sending my reply, I got Walter's message. From jcarlson at uci.edu Fri Aug 12 18:44:09 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 12 Aug 2005 09:44:09 -0700 Subject: [Python-Dev] plans for 2.4.2 and 2.5a1 In-Reply-To: <200508111751.56899.anthony@interlink.com.au> References: <200508111751.56899.anthony@interlink.com.au> Message-ID: <20050812091538.7836.JCARLSON@uci.edu> For 2.5a1... Some exposure of _PyLong_AsByteArray() and _PyLong_FromByteArray() to Python. There was a discussion about this almost a year ago (http://python.org/sf/1023290), and no mechanism (struct format code addition, binascii.tolong/fromlong, long.tostring/fromstring, ...) actually made it into Python 2.4 . At this point, I'd be happy to get /any/ mechanism, with a preference to struct and/or binascii (I'd put them in both, if only because different groups of people people may look for them in both places, and people who use one tend to like to use that one for as much as possible, and because the code additions in both are minor). - Josiah Anthony Baxter wrote: > > So I'm currently planning for a 2.4.2 sometime around mid September. I figure > we cut a release candidate either on the 7th or 14th, and a final a week > later. > > In addition, I'd like to suggest we think about a first alpha of 2.5 sometime > during March 2006, with a final release sometime around May-June. This would > mean (assuming people are happy with this) we need to make a list of what's > still outstanding for 2.5. There's a bunch of accepted PEPs that are waiting > for code. Once that's done, there will be a final 2.4.3 sometime after or > close to the 2.5 final release. > > Anthony > -- > Anthony Baxter > It's never too late to have a happy childhood. 
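For readers wondering what Josiah is asking for, the following is a minimal pure-Python sketch of the conversion. It is not the proposed struct or binascii addition itself, whose eventual spelling the thread leaves open. It round-trips a non-negative integer through a big-endian byte string using binascii.hexlify and binascii.unhexlify; the function names long_to_bytes and bytes_to_long are invented for this example, and the code is written for a current Python 3.

```python
import binascii


def long_to_bytes(n):
    """Big-endian byte string for a non-negative integer (hypothetical helper)."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    hex_digits = "%x" % n
    if len(hex_digits) % 2:
        # unhexlify needs an even number of hex digits
        hex_digits = "0" + hex_digits
    return binascii.unhexlify(hex_digits)


def bytes_to_long(data):
    """Inverse of long_to_bytes; an empty byte string maps to 0."""
    return int(binascii.hexlify(data), 16) if data else 0


value = 2 ** 70 + 5
assert bytes_to_long(long_to_bytes(value)) == value
```

Going through a hex string like this copies the data more than once, which is part of why a direct hook onto the C-level conversion functions was being requested.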
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jcarlson%40uci.edu From bcannon at gmail.com Fri Aug 12 19:00:38 2005 From: bcannon at gmail.com (Brett Cannon) Date: Fri, 12 Aug 2005 10:00:38 -0700 Subject: [Python-Dev] Exception Reorg PEP revised yet again In-Reply-To: References: <000801c59e05$9e054de0$6f14c797@oemcomputer> Message-ID: On 8/12/05, Thomas Heller wrote: > Brett Cannon writes: > > > On 8/10/05, Raymond Hettinger wrote: > >> > > Then I don't follow what you mean by "moved under os". > >> > > >> > In other words, to get the exception, do ``from os import > >> > WindowsError``. Unfortunately we don't have a generic win module to > >> > put it under. Maybe in the platform module instead? > >> > >> -1 on either. The WindowsError exception needs to in the main exception > >> tree. It occurs in too many different modules and applications. That > >> is a good reason for being in the main tree. > >> > > > > Where is it used so much? In the stdlib, grepping for WindowsError > > recursively in Lib in 2.4 turns up only one module raising it > > (subprocess) and only two modules with a total of three places of > > catching it (ntpath once, urllib twice). In Module, there are no > > hits. > > > > I don't know how you've been grepping, but the Python api functions to > raise WindowsErrors are named like PyErr_SetFromWindowsErr() or so. > Forgot to add that to the grep statement after I discovered that. > Typically, WindowsErrors are raised when Win32 API functions fail. > In the core extension modules, I find at least mmapmodule.c, > posixmodule.c, _subprocess.c, and _winreg.c raising them. 
It may be a > bit hidden, because the docs for _winreg mention only EnvironmentError, > but they are wrong: > > C:\>py > Python 2.5a0 (#60, Jul 4 2005, 19:53:27) [MSC v.1310 32 bit (Intel)] on win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> import _winreg > >>> _winreg.OpenKey(_winreg.HKEY_CLASSES_ROOT, "blah") > Traceback (most recent call last): > File "", line 1, in ? > WindowsError: [Errno 2] Das System kann die angegebene Datei nicht finden > >>> > > >> If the name bugs you, I would support renaming it to PlatformError or > >> somesuch. That would make it free for use with Mac errors and Linux > >> errors. Also, it wouldn't tie a language feature to the name of an > >> MS product. > >> > > > > I can compromise to this if others prefer this alternative. Anybody > > else have an opinion? > > Win32 has the FormatError() api to convert error codes into descriptions > - these descriptions are very useful, as are the error codes when you > catch errors in client code. > > I would say as long as the Python core contains win32 specific modules > like _winreg WindowsError should stay. For the name, I have no > preference but I see no need to change it. > OK, then it will just stay as-is. People can expect an updated PEP sometime this weekend. -Brett From python at rcn.com Fri Aug 12 19:14:11 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri, 12 Aug 2005 13:14:11 -0400 Subject: [Python-Dev] plans for 2.4.2 and 2.5a1 Message-ID: <001001c59f61$43a72a00$9023a044@oemcomputer> [Josiah] > At this point, I'd be happy to get > /any/ mechanism, with a preference to struct and/or binascii Assign 1023290 to me and I'll get it done in the next month or so. 
Raymond From martin at v.loewis.de Fri Aug 12 23:51:35 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 12 Aug 2005 23:51:35 +0200 Subject: [Python-Dev] plans for 2.4.2 and 2.5a1 In-Reply-To: <200508111751.56899.anthony@interlink.com.au> References: <200508111751.56899.anthony@interlink.com.au> Message-ID: <42FD19E7.6060108@v.loewis.de> Anthony Baxter wrote: > So I'm currently planning for a 2.4.2 sometime around mid September. I figure > we cut a release candidate either on the 7th or 14th, and a final a week > later. I'm returning only on Sep 12 from vacation, so the Windows binaries of a release candidate would have to wait until that Monday; the 14th would suit me better. Unfortunately, I'm likely also travelling on 19..23, so the final release would have to wait until Sep 24 or so. Regards, Martin From skip at pobox.com Sat Aug 13 02:51:38 2005 From: skip at pobox.com (skip@pobox.com) Date: Fri, 12 Aug 2005 19:51:38 -0500 Subject: [Python-Dev] Hosting svn.python.org In-Reply-To: <20050812231416.638DB1E4005@bag.python.org> References: <20050812231416.638DB1E4005@bag.python.org> Message-ID: <17149.17434.157348.230440@montanaro.dyndns.org> martin> Log Message: martin> Add wush.net hosting. ... martin> + * Greg Stein suggested http://www.wush.net/subversion.php. ... I will enthusiastically cast my vote for tummy.com, Sean Reifschneider's company. Mojam leases a server there (Mojam & Musi-Cal websites running CentOS 4, Apache+mod_perl, Python, Mason, MySQLdb, Mailman, etc). Their service has been absolutely awesome. Sean is one of the python.org webmasters to boot, so he knows our culture very well already. Skip From goodger at python.org Sat Aug 13 03:57:46 2005 From: goodger at python.org (David Goodger) Date: Fri, 12 Aug 2005 21:57:46 -0400 Subject: [Python-Dev] new PEP type: Process Message-ID: <42FD539A.1060407@python.org> Barry Warsaw and I, the PEP editors, have been discussing the need for a new PEP type lately. 
Martin von Löwis' PEP 347 was a prime example of a PEP that didn't fit into the existing Standards Track and Informational categories. We agreed upon a new "Process" PEP type. For more information, please see PEP 1 (http://www.python.org/peps/pep-0001.html) -- the type of which has also been changed to Process. Other good examples of Process PEPs are the release schedule PEPs, and I understand there may be a new one soon. (Please cc: any PEP-related mail to peps at python.org) -- David Goodger From kbenoit at opersys.com Thu Aug 11 04:59:12 2005 From: kbenoit at opersys.com (Kristian Benoit) Date: Wed, 10 Aug 2005 22:59:12 -0400 Subject: [Python-Dev] xml.parsers.expat no userdata in callback functions Message-ID: <1123729152.23755.3.camel@localhost> In the C version of expat, handlers receive a void *userdata, but this is not the case in the Python version. This means one can't parse multiple files at the same time using the same handlers. You can't pass the current context to the handler; you must base your code on global variables, which is not so nice. Thanks Please leave the cc in the mail header as I'm not subscribed to the list. Kristian From senko.rasic at gmail.com Fri Aug 12 01:40:03 2005 From: senko.rasic at gmail.com (Senko Rasic) Date: Fri, 12 Aug 2005 01:40:03 +0200 Subject: [Python-Dev] Extension to dl module to allow passing strings from native function Message-ID: <48bbc5810508111640a6bd03e@mail.gmail.com> Hi all, recently I've tried to use the dl module to interface with an external C function. (It was a quick hack so I didn't want to write wrapper code).
To my dismay I learned that the call method doesn't allow passing data by reference (since strings are immutable in Python) - but passing pointers around and modifying the caller's data is done all the time in C, so that makes dl practically useless. I've hacked the method to allow mutable data, by allocating temporary buffers for all string arguments passed to it, calling the C function, and then constructing new strings from the data in those buffers and returning them in a tuple together with the function's return code. Combined with pack/unpack from the struct module, this allows passing any structure to and from the external C function, so, imho, it's a useful thing to have. To my knowledge, this functionality can't be achieved in pure Python programs, and there's no alternative dynamic loader module that can do it. More info with examples: http://ptlo.blogspot.com/2005/08/pyinvoke.html Source: http://software.senko.net/pub/python-dl2.tgz (the tarball contains setup.py and my dlmodule.c version, for experimenting without patching the official module, and a patch made against a (fairly recent) cvs version of dlmodule.c) Thoughts, comments? Could this be put in the standard module, does it make sense, etc? Regards, Senko -- Senko Rasic From aahz at pythoncraft.com Sat Aug 13 06:51:26 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri, 12 Aug 2005 21:51:26 -0700 Subject: [Python-Dev] new PEP type: Process In-Reply-To: <42FD539A.1060407@python.org> References: <42FD539A.1060407@python.org> Message-ID: <20050813045125.GA1985@panix.com> On Fri, Aug 12, 2005, David Goodger wrote: > > Barry Warsaw and I, the PEP editors, have been discussing the > need for a new PEP type lately. Martin von Löwis' PEP 347 was > a prime example of a PEP that didn't fit into the existing > Standards Track and Informational categories. We agreed upon a > new "Process" PEP type.
For more information, please see PEP 1 > (http://www.python.org/peps/pep-0001.html) -- the type of which has > also been changed to Process. Go ahead and make PEP 6 a Process PEP. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. From raymond.hettinger at verizon.net Sat Aug 13 13:34:34 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 13 Aug 2005 07:34:34 -0400 Subject: [Python-Dev] Discussion draft: Proposed Py2.5 C API for set and frozenset objects Message-ID: <000701c59ffa$fb54b480$772dc797@oemcomputer>

The object and types
--------------------
PySetObject         subclass of object used for both sets and frozensets
PySet_Type          a basetype
PyFrozenSet_Type    a basetype

The type check macros
---------------------
PyFrozenSet_CheckExact(ob)    a frozenset
PyAnySet_CheckExact(ob)       a set or frozenset
PyAnySet_Check(ob)            a set, frozenset, or subclass of either

The constructors
----------------
obj PySet_New(it)          takes an iterable or NULL; returns new ref
obj PyFrozenSet_New(it)    takes an iterable or NULL; returns new ref

The fine grained methods
------------------------
int PySet_Size(so)
int PySet_Contains(so, key)    1 for yes; 0 for no; -1 for error
                               raises TypeError for unhashable key
                               does not automatically convert to frozenset
int PySet_Add(so, key)         0 for success; -1 for error
                               raises TypeError for unhashable key
                               raises MemoryError if no room to grow
obj PySet_Pop(so)              return new ref or NULL on failure
                               raises KeyError if set is empty
int PySet_Discard(so, key)     1 if found and removed
                               0 if not found (does not raise KeyError)
                               -1 on error
                               raises TypeError for unhashable key
                               does not automatically convert to frozenset

Coarse grained methods left for access through PyObject_CallMethod()
--------------------------------------------------------------------
copy, clear, union, intersection, difference, symmetric_difference, update,
intersection_update, difference_update, symmetric_difference_update,
issubset, issuperset, __reduce__

Other functions left for access through the existing abstract API
-----------------------------------------------------------------
PyObject_RichCompareBool()
PyObject_Hash()
PyObject_Repr()
PyObject_IsTrue()
PyObject_Print()
PyObject_GetIter()

From martin at v.loewis.de Sat Aug 13 13:47:10 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 13 Aug 2005 13:47:10 +0200 Subject: [Python-Dev] Hosting svn.python.org In-Reply-To: <17149.17434.157348.230440@montanaro.dyndns.org> References: <20050812231416.638DB1E4005@bag.python.org> <17149.17434.157348.230440@montanaro.dyndns.org> Message-ID: <42FDDDBE.3040903@v.loewis.de> skip at pobox.com wrote: > I will enthusiastically cast my vote for tummy.com, Sean Reifschneider's > company. Mojam leases a server there (Mojam & Musi-Cal websites running > CentOS 4, Apache+mod_perl, Python, Mason, MySQLdb, Mailman, etc). Their > service has been absolutely awesome. But we don't want to lease a server - we are looking for a Subversion hoster. If we *just* wanted a server, there would be no reason to drop (*) the current svn.python.org. So what precisely is the Subversion offer of tummy.com ($ per month for what disk limit, monthly download limit, number of developers limit, backup service, email notification, ability for offsite download of the repository tarball, what access method (is svn+ssh supported, anonymous WebDAV))? In case this isn't clear yet: several people are concerned that running the Python svn repository by volunteers will risk service outage, and unnecessarily consume volunteer resources. So just replacing the machine we get for free now with a machine we have to pay for won't do any good. I understand that I could now go to tummy.com, contact them, and research all details myself.
But I'm not willing to: everybody who wants to suggest a different service should find out all the details of that service, and report them so I can include them into the PEP. Regards, Martin (*) This PEP is actually not at all about svn.python.org, and the pydotorg SVN repository. Those are in the realms of the infrastructure committee, and they do a great job. The PEP is *only* about migrating the Python source code proper from CVS (along with the other code snippets that are in that CVS). From martin at v.loewis.de Sat Aug 13 13:50:06 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 13 Aug 2005 13:50:06 +0200 Subject: [Python-Dev] xml.parsers.expat no userdata in callback functions In-Reply-To: <1123729152.23755.3.camel@localhost> References: <1123729152.23755.3.camel@localhost> Message-ID: <42FDDE6E.2050309@v.loewis.de> Kristian Benoit wrote: > This means one cant parse multiple files at the same time using the same > handlers. You cant pass the context current context to the handler, you must > base your code on global variables which is not so nice. This is not true. You can create multiple parsers, and then can make the callback functions bound methods, using self to store parse-specific data. There is no need to have extra callback data. Regards, Martin From martin at v.loewis.de Sat Aug 13 13:56:48 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 13 Aug 2005 13:56:48 +0200 Subject: [Python-Dev] Extension to dl module to allow passing strings from native function In-Reply-To: <48bbc5810508111640a6bd03e@mail.gmail.com> References: <48bbc5810508111640a6bd03e@mail.gmail.com> Message-ID: <42FDE000.9080508@v.loewis.de> Senko Rasic wrote: > Thoughts, comments? Could this be put in standard module, does it make > sense, etc? Are you aware of the ctypes module? 
http://starship.python.net/crew/theller/ctypes/ Regards, Martin From goodger at python.org Sat Aug 13 14:38:36 2005 From: goodger at python.org (David Goodger) Date: Sat, 13 Aug 2005 08:38:36 -0400 Subject: [Python-Dev] new PEP type: Process In-Reply-To: <20050813045125.GA1985@panix.com> References: <42FD539A.1060407@python.org> <20050813045125.GA1985@panix.com> Message-ID: <42FDE9CC.8030008@python.org> [Aahz] > Go ahead and make PEP 6 a Process PEP. Done! -- David Goodger -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 253 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/python-dev/attachments/20050813/c4db8c5f/signature.pgp From wsanchez at wsanchez.net Sat Aug 13 18:02:00 2005 From: wsanchez at wsanchez.net (=?ISO-8859-1?Q?Wilfredo_S=E1nchez_Vega?=) Date: Sat, 13 Aug 2005 09:02:00 -0700 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: References: Message-ID: <6AA5C3D1-6AEC-4CE4-AEB9-84FBDA10EFA9@wsanchez.net> I'm curious about why Python lacks FileNotFoundError, PermissionError and the like as subclasses of IOError. Catching IOError and looking at errno to figure out what went wrong seems pretty unpythonic, and I've often wished for built-in subclasses of IOError. I sometimes subclass them myself, but a lot of the time, I'm catching such exceptions as thrown by the standard library. -wsv From gvanrossum at gmail.com Sat Aug 13 23:02:54 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 13 Aug 2005 14:02:54 -0700 Subject: [Python-Dev] xml.parsers.expat no userdata in callback functions In-Reply-To: <42FDDE6E.2050309@v.loewis.de> References: <1123729152.23755.3.camel@localhost> <42FDDE6E.2050309@v.loewis.de> Message-ID: > Kristian Benoit wrote: > > This means one cant parse multiple files at the same time using the same > > handlers. 
You can't pass the current context to the handler, you must > > base your code on global variables which is not so nice. > "Martin v. Löwis" replied: > This is not true. You can create multiple parsers, and then can make the > callback functions bound methods, using self to store parse-specific > data. There is no need to have extra callback data. What he said. Kristian's complaint is probably a common misconception about Python -- not too many languages have unified the concepts of "bound methods" and "callables" so completely as Python. Every callable is in a sense a closure (or can be). Nested functions are other examples. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Sat Aug 13 23:27:22 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 13 Aug 2005 14:27:22 -0700 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: <42FBA376.5030605@canonical.com> References: <42FBA376.5030605@canonical.com> Message-ID: With permission, I'm forwarding an email from Mark Shuttleworth about Bazaar-2 (aka Bazaar-NG), a distributed source control system (not entirely unlike bitkeeper, I presume) written in Python and in use by the Ubuntu system. What do people think of using this for Python? Is it the right model? Do we want to encourage widespread experimentation with the Python source code? --Guido van Rossum (home page: http://www.python.org/~guido/) ---------- Forwarded message ---------- From: Mark Shuttleworth Date: Aug 11, 2005 12:13 PM Subject: Distributed RCS To: Guido van Rossum Cc: Steve Alexander , Martin Pool , Fredrik Lundh Hi Guido, Steve forwarded your mail to me regarding distributed revision control, so I thought I'd follow up with some thoughts on why I agree with Fredrik Lundh that it's important, and where we are going with the Bazaar project. First, distributed RCS systems reduce the barrier to participation.
Anybody can create their own branches, and begin work on features, with full revision control support, without having to coordinate with the core RCS server sysadmin. So, for example, if someone gets an idea to work on PEP X, they can simply create a branch, and start hacking on it locally, with full RCS functionality like commit and undo, and logs of their changes over time. They can easily merge continually from the trunk, to keep their branch up to date. And they can publish their branch using only a web server. With Bazaar, these branches can be personal or shared group branches. The net effect of this is to make branching a core part of the development process. Each feature gets developed on a branch, and then merged when it's ready. Instead of passing patches around in email, you find yourself passing out branch references, which are much easier to deal with since they are always "up to date". In Launchpad, we have evolved to work around this branch-per-feature approach, and built a review process so that each branch gets a review before the code is merged to the trunk. It also has a positive social impact, because groups that are interested in a feature can begin to collaborate on it immediately rather than waiting to get consensus from everybody else; they just start their branch and get more testing when it is reaching a reasonable state of maturity - then the project decides whether or not it lands. That results in less argument about whether or not a feature is a good idea before anybody really knows what it's going to look like. Those who are interested participate, and those who aren't reserve judgement till it's done. As for Bazaar, we have just wrapped up our latest sprint, where we decided that bazaar-ng (bzr), which is being written in Python by Martin Pool, will become Bazaar 2.x, in the first quarter of 2006.
The current 1.x line of development has served us well, but the ideas we developed and which have been implemented as a working bazaar-ng reference by Martin are now proven enough that I'm committing the project (Ubuntu, and all of Launchpad) to it. Martin will continue to work on it full time, and will be joined by the current Bazaar 1.x team, Robert Collins, David Allouche and James Blackwell. That makes for a substantial chunk of resources but I think it's worth it because we need a truly superb free revision control system when dealing with something as large and complex as an entire distribution. The whole of Ubuntu will be in Bazaar in due course. Currently, we have about 500 upstreams published in the Bazaar 1.x format (see http://bazaar.ubuntu.com/ for the list), all of those will be converted to Bazaar 2.x and in addition we will continue to publish more and more upstreams in the 2.x archive format. We actively convert CVS and SVN upstreams and publish them in the Bazaar format to allow us to use a single, distributed revision control system across all of those packages. So there's a lot of real-world data and real-world coding going on with Bazaar as the RCS holding it all together. Perhaps more importantly, we are integrating Bazaar tightly with the other Launchpad applications, Rosetta and Malone. This means that bug tracking and translation will be "branch aware". You will be able to close a bug by noting that a commit in one of your branches fixes the bug, then merging it into the relevant mainline branch, and have the launchpad bug tracker automatically mark the bug as closed, if you wish. Similarly you will be able to get the latest translations just by merging from the branch published by Rosetta that has the latest translations in it for your application. The combination of distributed revision control, and ultimately integrated bug tracking and translations, will I think be a very efficient platform for collaborative development. 
Bazaar is free, and the use of Launchpad is free though we have not yet released the code to the web services for bug tracking and translation. I hope that puts bazaar into perspective for you. Give it a spin - the development 2.x codebase is robust enough now to handle a line of development and do basic merging, we are switching our own development to the pre-release 2.x line in October, and we will switch over all the public archives we maintain in around March next year. Cheers, Mark From skip at pobox.com Sun Aug 14 01:00:37 2005 From: skip at pobox.com (skip@pobox.com) Date: Sat, 13 Aug 2005 18:00:37 -0500 Subject: [Python-Dev] cvs to bzr? Message-ID: <17150.31637.180169.877441@montanaro.dyndns.org> Based on the message Guido forwarded, I installed bazaar-ng. From Mark's note it seems they convert cvs repositories to bzr repositories, but I didn't see any mention in the bzr docs of any sort of cvs2bzr tool. Likewise, Google didn't turn up anything obvious. Anyone know of something? Thx, Skip From nas at arctrix.com Sun Aug 14 02:02:40 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Sat, 13 Aug 2005 18:02:40 -0600 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: References: <42FBA376.5030605@canonical.com> Message-ID: <20050814000240.GA5470@mems-exchange.org> On Sat, Aug 13, 2005 at 02:27:22PM -0700, Guido van Rossum wrote: > What do people think of using this for Python? I think it deserves consideration. One idea would be to have a Bazaar-NG repository that tracks the CVS SF repository. I haven't tried it yet but there is a tool called Tailor[1] that automates the task. That would give people a chance to experiment with Bazaar-NG (and still work when SF is down) without committing to it. > Is it the right model? Do we want to encourage widespread > experimentation with the Python source code? I think Python works fairly well with the centralized model. However, I expect it's hard to know what we are missing. Neil 1.
http://darcs.net/DarcsWiki/Tailor From nas at arctrix.com Sun Aug 14 02:03:46 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Sat, 13 Aug 2005 18:03:46 -0600 Subject: [Python-Dev] cvs to bzr? In-Reply-To: <17150.31637.180169.877441@montanaro.dyndns.org> References: <17150.31637.180169.877441@montanaro.dyndns.org> Message-ID: <20050814000346.GB5470@mems-exchange.org> On Sat, Aug 13, 2005 at 06:00:37PM -0500, skip at pobox.com wrote: > Based on the message Guido forwarded, I installed bazaar-ng. From Mark's > note it seems they convert cvs repositories to bzr repositories, but I > didn't see any mention in the bzr docs of any sort of cvs2bzr tool. Haven't tried it but should work: http://darcs.net/DarcsWiki/Tailor From gvanrossum at gmail.com Sun Aug 14 02:11:05 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 13 Aug 2005 17:11:05 -0700 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: <42FD2A13.8000900@canonical.com> References: <42FBA376.5030605@canonical.com> <42FD2A13.8000900@canonical.com> Message-ID: Another fwd, describing how Steve Alexander's group user bazaar. --Guido van Rossum (home page: http://www.python.org/~guido/) ---------- Forwarded message ---------- From: Steve Alexander Date: Aug 12, 2005 4:00 PM Subject: Re: Distributed RCS To: Guido van Rossum Cc: Mark Shuttleworth , Martin Pool , Fredrik Lundh Hi Guido, I'm not going to post to python-dev just now, because I'm leaving on 1.5 weeks vacation tomorrow, and I'd rather be absent than unable to answer questions promptly. Martin Pool will be around next week, and will be able to take part in discussions on the list. Feel free to post all or part of Mark's or my emails to the python lists. Mark wrote: > > I hope that puts bazaar into perspective for you. 
Give it a spin - the > development 2.x codebase is robust enough now to handle a line of > development and do basic merging, we are switching our own development > to the pre-release 2.x line in October, and we will switch over all the > public archives we maintain in around March next year. A large part of the internal development at Canonical is the Launchpad system. This is about 30-40 kloc of Python code, including various Twisted services, cron scripts, a Zope 3 web application, database tools, ... It's being worked on by 20 software developers. Everyone uses bazaar 1.4 or 1.5, and around October, we'll be switching to use bazaar 2.x. I'll describe how we work on Launchpad using Bazaar. This is all from the Bazaar 1.x perspective, and some things will become simpler when we change to using Bazaar 2.x. I've left the description quite long, as I hope it will give you some of the flavour of working with a distributed RCS. == Two modes of working: shared branches and PQM == Bazaar supports two different modes of working for a group like the Launchpad team. 1. There's a shared read/write place that all the developers have access to. This is contains the branches we release from, and represents the "trunk" of the codebase. 2. A "virtual person" called the "patch queue manager" (PQM) has exclusive write access to a collection of branches. PQM takes instructions as GPG signed emails from launchpad developers, to merge their code into PQM's branches. We use the latter mode because we have PQM configured not only to accept requests to merge code into PQM's codebase, but to run all the tests first and refuse to merge if any test fails. == The typical flow of work on Launchpad == Say I want to work on some new feature for Launchpad. What do I do? 1. I use 'baz switch' to change my working tree from whatever I was working on last, and make it become PQM's latest code. 
baz switch rocketfuel at canonical.com/launchpad--devel--0 "rocketfuel" is the code-name for the branches we release our code from. PQM manages the rocketfuel branches. In Bazaar 1.x, collections of branches are called "archives" and are identified by an email address plus some other optional information. So, "rocketfuel at canonical.com" is PQM's email address. "launchpad--devel--0" is simply the name of the main launchpad branch. The format of branch names is very strict in Bazaar 1.x. It is much more open in Bazaar 2.x. 2. I use 'baz branch' to create my own branch of this code that I can commit changes to. baz branch steve.alexander at canonical.com/launchpad--ImproveLogins--0 My archive is called "steve.alexander at canonical.com". The branch will be used to work on the login functionality of Launchpad, so I have named the branch "launchpad--ImproveLogins--0". 3. I hack on the code, and from time to time commit my changes. I need to 'baz add' new files and directories, and 'baz rm' to remove files, and 'baz mv' to move files around. # hack hack hack baz commit -s "Refactored the whatever.py module." # hack hack hack baz del whatever_deprecated.py baz commit -s "Removed deprecated whatevers." # hack hack hack 4. Let's say I hacked on some stuff, but I didn't commit it. I don't like what I did, and I want to start again. # hack hack hack baz undo 'baz undo' puts the source code back into the state it was in after the last commit, and puts the changes somewhere. If I change my mind again, I can say 'baz redo', and get my changes back. 5. All this hacking and committing has been happening on my own workstation, without a connection to the internet. Perhaps I've been on a plane or at a cafe. When I have a connection again, I can make my work available for others to see by mirroring my code to a central location. Each Launchpad developer has a mirror of the archive they use for Launchpad work on a central machine at the Canonical data centre. 
In our case, the mirror command uses sftp to copy the latest changes I have made into the mirror on this central server. baz archive-mirror 6. Because we have a strict code review process for Launchpad development, I can't (or rather, shouldn't) submit my changes to PQM yet. I should get it reviewed. But, let's say Andrew wants to do some work that depends on my work, before my work has made its way into PQM's rocketfuel "Trunk". He can simply merge from me. # in Andrew's working tree, on his workstation. baz merge steve.alexander at canonical.com/launchpad--ImproveLogins--0 baz commit -s "Merged steve's ImproveLogins work." When Andrew eventually gets his work reviewed, and sends it on to PQM to be merged into Rocketfuel, the Bazaar merging algorithms will work out that Andrew merged from me, and will sort things out. Of course, there can be conflicts when people have worked in divergent ways on the same code. These are resolved in a similar way to CVS or SVN. 7. I want to get my code reviewed by a member of the review team. I add the details of my branch to the PendingReviews page on the launchpad development wiki. This wiki is publicly readable. https://wiki.launchpad.canonical.com/PendingReviews There is a script that periodically reads the PendingReviews page, attempts to merge the branches listed there into rocketfuel (just as PQM would do), and produces a diff for use by the review team. The diff represents what changes would be made to the rocketfuel Trunk were the branch in question to be sent to PQM. This diff is often enough for the reviewers to work with. If they need to see more context, they can simply check out the branch in question using 'baz get branchname'. The script also highlights whether there were any conflicts that would prevent a merge, and gives an indication of the size of the change. The script's output is accessible only to Launchpad developers. However, I've made a couple of screenshots to give you some idea of what it looks like.
This is the summary page, that uses information taken from the PendingReviews wiki page. http://people.ubuntu.com/~stevea/branch-summary.png This is a typical diff representing what is to be merged. http://people.ubuntu.com/~stevea/branch-diff.png The reviewer sends an email to the author of the code, cc the launchpad-reviews mailing list. The review email typically has sections of code included, each line prefixed with '> ', with comments, questions and requests for improvement beneath each section of code. The reviewer will either approve the code for merging, approve the code providing certain remedial actions are taken, or reject the code, requiring a new review later. 8. My code has been successfully reviewed by JamesH, so I send a signed mail to PQM asking to merge my work into rocketfuel. submit-merge "r=JamesH, Improvements to logging in." pqm at pqm.ubuntu.com PQM checks that each merge request has r=someone in the message, as a reminder that launchpad developers need to have their code reviewed. The submit-merge script takes the archive name, the branch name, and the "patch level" that the branch is at, composes an email saying "pqm, please merge steve.alexander at canonical.com/launchpad--ImproveLogins--0--patch-18 into rocketfuel at canonical.com/launchpad--devel--0", signs it with my gpg key, and mails it. Some time later, once PQM has merged the code and successfully run all the launchpad tests, an email will go out to me, and to a pqm-commits mailing list, saying that the merge was successful. If it was unsuccessful, I get an email with the error output. An irc robot listens to the pqm-commits mailing list, and announces new landings to the rocketfuel Trunk on irc. == Naming branches == The Launchpad team is distributed around the world.
To cope with this, and also to get our community of users involved in the development of the software, Launchpad development emphasises writing specifications and proposals, and implementing features based on these proposals. You can read all the launchpad proposals on the launchpad development wiki. https://wiki.launchpad.canonical.com/ So, we usually name branches after the specification that is being implemented on that branch. The branch is named near the top of the specification, so someone reading the specification who has access to the source code can see what's happening with the implementation. Branches are also often named after bugs. For example, launchpad--bug123--0. The use of '--' in branch names, and the '--0' thing at the end is occasionally useful, but more of a hangover from the 'tla' system that bazaar is based on. This strict branch naming format is not being carried over into bazaar 2. == External contributors == The source code to Launchpad is not available at this time. We intend to make it open source at some point in the future, but I'm not sure when that will be. Let's consider what would happen if we decided to make the Launchpad code fully open source tomorrow. Someone from outside of Canonical could get a copy of the main launchpad "rocketfuel" branch, make their own branch by branching from the rocketfuel branch, do a bunch of work, mirror it to their own website, and email a Canonical launchpad developer to ask that it be reviewed, or merged into that launchpad developer's branch. This way, even though an outside contributor doesn't have rights with PQM, they could still make fine-grained commits, merge from a variety of places, and participate at the same level as someone employed by Canonical.
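The PQM gate described earlier in this message (check for r=someone, merge tentatively, run all the tests, refuse to land if anything fails) can be sketched roughly as below. This is only an illustration of the idea, not how PQM is actually implemented: the names Trunk, gated_merge and run_tests are hypothetical, and the "merge" is just appending a branch name to a list.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Trunk:
    """Hypothetical stand-in for the rocketfuel trunk: a list of landed branches."""
    merged: List[str] = field(default_factory=list)


def gated_merge(trunk: Trunk, branch: str, reviewer: str,
                run_tests: Callable[[List[str]], bool]) -> str:
    """Land `branch` only if a reviewer is named and the tests pass afterwards."""
    if not reviewer:
        # Mirrors the r=someone reminder: no reviewer, no merge.
        return "rejected: no r=<reviewer> in request"
    # Tentative merge on a copy, so the real trunk is untouched until tests pass.
    candidate = trunk.merged + [branch]
    if not run_tests(candidate):
        return "rejected: tests failed"
    trunk.merged = candidate
    return "merged"


if __name__ == "__main__":
    trunk = Trunk()
    print(gated_merge(trunk, "launchpad--ImproveLogins--0", "JamesH",
                      lambda tree: True))
    print(gated_merge(trunk, "launchpad--bug123--0", "JamesH",
                      lambda tree: False))
    print(trunk.merged)
```

The point of merging into a copy first is that a failing test run never leaves the trunk in a broken state, which is the property the "refuse to merge if any test fails" rule is after.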
-- Steve Alexander From tjreedy at udel.edu Sun Aug 14 05:07:49 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 13 Aug 2005 23:07:49 -0400 Subject: [Python-Dev] Distributed RCS References: <42FBA376.5030605@canonical.com> <42FD2A13.8000900@canonical.com> Message-ID: > Another fwd, describing how Steve Alexander's group user bazaar. I found this rather clear and easy to understand even without having directly used CVS (other than to browse). Some of the automation features seem useful but I don't know whether they are specific to bazaar. Anyway, my thoughts. It seems to me that auto testing of the tentatively updated trunk before final commitment would avoid the 'who checked in test-breaking code' messages that appear here occasionally. But it requires that the update + test-suite time be less than the average inter-update interval. I understand the main difference between baz and cvs (and similar) to be that checked-out-to-developers copies remain 'within' the distributed system and accessible to the master system rather than becoming external (and lost track of) copies. In consequence (again if I understand correctly), pre- and post-review diffs and merges are done directly between the developers branch and the current system trunk rather than (for diffs) with a possibly out-of-date master on the developer's machine, leading to trunk updates with a possibly out-of-date diff. If so, this would eliminate reviewers having to make requests such as 'please run a new diff against the current CVS head' that I remember sometimes seeing on the SF tracker. The current bottleneck in Python development appears to be patch reviews. So merely making submission and commitment easier will not help much. An alternative to more reviewers is more automation to make more effective use of existing reviewers. (And this might also encourage more reviewers.) The Launchpad group seems to be ahead in this regard, but I don't know how much this is due to using bazaar. 
In any case, ease of improving the review process might be a criterion for choosing a source code system. But I leave this to ML. *Other things being equal*, using a state-of-the-art development system written in Python to develop Python would be a marketing plus. Terry J. Reedy From anthony at interlink.com.au Sun Aug 14 06:36:33 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Sun, 14 Aug 2005 14:36:33 +1000 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: References: <42FBA376.5030605@canonical.com> Message-ID: <200508141436.36913.anthony@interlink.com.au> I have great hopes for baz-ng, but I don't know that it's really ready for production use just yet. I don't know that we want to be right out on the bleeding edge of revision control systems for Python. The current bazaar, last time I looked (a few months ago) did not work on Windows. This is a complete deal-breaker for us, unless we can agree to dump that Windows support (who needs it, really? ) I *hope* that baz-ng will work fine on Windows - I haven't looked too closely at that side of it. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From skip at pobox.com Sun Aug 14 13:44:25 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 14 Aug 2005 06:44:25 -0500 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: <200508141436.36913.anthony@interlink.com.au> References: <42FBA376.5030605@canonical.com> <200508141436.36913.anthony@interlink.com.au> Message-ID: <17151.11929.958753.192768@montanaro.dyndns.org> Anthony> The current bazaar, last time I looked (a few months ago) did Anthony> not work on Windows. This is a complete deal-breaker for us, I assume it would be a deal breaker for many people. According to the Bazaar-NG website it works on "Linux, Windows and Mac OS X, or any system with a Python interpreter". If it's that platform-independent, perhaps it will work on some systems that don't support CVS. 
It does require Python 2.4, though I doubt that would be a great hardship for many people interested in Python development. Skip From martin at v.loewis.de Sun Aug 14 14:01:46 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 14 Aug 2005 14:01:46 +0200 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: References: <42FBA376.5030605@canonical.com> Message-ID: <42FF32AA.7040506@v.loewis.de> Guido van Rossum wrote: > With permission, I'm forwarding an email from Mark Shuttleworth about > Bazaar-2 (aka Bazaar-NG), a distributed source control system (not > entirely unlike bitkeeper, I presume) written in Python and in use by > the Ubuntu system. What do people think of using this for Python? Is > it the right model? Like Skip, I tried experimenting with it. While that may be the right model, I don't think it is the right software. In bazaar-ng 0.0.5 (which is what Debian unstable currently has), bzr commit would not open a text editor, but require the commit message on the command line; selective commit of only some of the changed files is also not supported. bzr diff cannot show the changes between two revisions, and cannot show revisions across branches. So I assume that using bazaar-ng right now would cause problems in day-to-day usage. Regards, Martin From skip at pobox.com Sun Aug 14 14:37:55 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 14 Aug 2005 07:37:55 -0500 Subject: [Python-Dev] cvs to bzr? In-Reply-To: <20050814000346.GB5470@mems-exchange.org> References: <17150.31637.180169.877441@montanaro.dyndns.org> <20050814000346.GB5470@mems-exchange.org> Message-ID: <17151.15139.655967.250675@montanaro.dyndns.org> >> ... I didn't see any mention in the bzr docs of any sort of cvs2bzr >> tool. Neil> Haven't tried it but should work: Neil> http://darcs.net/DarcsWiki/Tailor Thanks Neil. I downloaded it last night and played around a bit. 
What follows is a description that will hopefully keep others from stepping in the same booby traps I did. It doesn't appear to work at this point, both based on attempts to use it and hints on the above page. After some reading, I was able to pull the latest version with wget --exclude-directories=/~lele/projects/tailor/_darcs --mirror \ --no-parent --no-host-directories --cut-dirs=3 -e robots=off \ http://nautilus.homeip.net/~lele/projects/tailor/ (The wget example at the top of the wiki page points to an older version.) Unfortunately, it (like the older version) is missing if __name__ == "__main__": main() in tailor.py and has no call to its main() function anywhere in the source. This seemed very odd to me, so I added one, as well as a #! line. I eventually noticed that there is a vcpx package and in the directory above, a number of other files, tailor, a bunch of index.html files and a README. It was reminiscent of expanding a tar file in the bad old days (or unzipping a zip file nowadays) before everybody got the idea that it would be a good idea to create a top-level directory... Anyway, I'm still struggling with it. If I get further I'll post my results. If others have gotten further, tips would be appreciated. Skip From skip at pobox.com Sun Aug 14 14:46:16 2005 From: skip at pobox.com (skip@pobox.com) Date: Sun, 14 Aug 2005 07:46:16 -0500 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: <42FF32AA.7040506@v.loewis.de> References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> Message-ID: <17151.15640.173982.961359@montanaro.dyndns.org> Martin> Like Skip, I tried experimenting with it. While that may be the Martin> right model, I don't think it is the right software. [problems Martin> elided] Martin> So I assume that using bazaar-ng right now would cause problems Martin> in day-to-day usage. Granted. What is the cost of waiting a bit longer to see if it (or something else) gets more usable and would hit the mark better than svn?
I presume that once we switch away from cvs to something else, it's unlikely we would switch again unless some huge roadblock appeared that made the initial change the wrong one. I was amazed at the number of different version control systems out there now. CVS, while enormously successful from a practical standpoint, clearly has its detractors. That there are so many alternatives suggests that it's not clear yet what the "correct" feature set for a version control system is. Skip From martin at v.loewis.de Sun Aug 14 18:16:11 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 14 Aug 2005 18:16:11 +0200 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: <17151.15640.173982.961359@montanaro.dyndns.org> References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> Message-ID: <42FF6E4B.4000206@v.loewis.de> skip at pobox.com wrote: > Granted. What is the cost of waiting a bit longer to see if it (or > something else) gets more usable and would hit the mark better than svn? I > presume that once we switch away from cvs to something else, it's unlikely > we would switch again unless some huge roadblock appeared that made the > initial change the wrong one. It depends on what "a bit" is. Waiting a month would be fine; waiting two years might be pointless. So I think I will personally pursue PEP 347 (switching to SVN); it will be then an issue of BDFL pronouncement. Regards, Martin From nas at arctrix.com Sun Aug 14 19:12:59 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Sun, 14 Aug 2005 11:12:59 -0600 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: <42FF6E4B.4000206@v.loewis.de> References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> <42FF6E4B.4000206@v.loewis.de> Message-ID: <20050814171259.GA8200@mems-exchange.org> On Sun, Aug 14, 2005 at 06:16:11PM +0200, "Martin v. 
L?wis" wrote: > It depends on what "a bit" is. Waiting a month would be fine; waiting > two years might be pointless. It looks like the process of converting a CVS repository to Bazaar-NG does not yet work well (to be kind). The path CVS->SVN->bzr would probably work better. I suspect cvs2svn has been used on quite a few CVS repositories already. I don't think going to SVN first would lose any information. My vote is to continue with the migration to SVN. We can re-evaluate Bazaar-NG at a later time. Neil From ronaldoussoren at mac.com Sun Aug 14 19:17:00 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 14 Aug 2005 19:17:00 +0200 Subject: [Python-Dev] build problems on macosx (CVS HEAD) Message-ID: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com> Hi, I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a checkout that is less than two hours old. I'm building a standard unix tree (no framework install): $ ./configure --prefix=/opt/python/2.5 ... $ make ... ar cr libpython2.5.a Modules/config.o Modules/getpath.o Modules/ main.o Modules/gcmodule.o ar cr libpython2.5.a Modules/threadmodule.o Modules/signalmodule.o Modules/posixmodule.o Modules/errnomodule.o Modules/_sre.o Modules/ _codecsmodule.o Modules/zipimport.o Modules/symtablemodule.o Modules/xxsubtype.o ranlib libpython2.5.a c++ -u _PyMac_Error -o python.exe \ Modules/python.o \ libpython2.5.a -ldl case $MAKEFLAGS in \ *-s*) CC='gcc' LDSHARED='gcc -bundle -undefined dynamic_lookup' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python.exe -E ./ setup.py -q build;; \ *) CC='gcc' LDSHARED='gcc -bundle -undefined dynamic_lookup' OPT='- DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python.exe -E ./setup.py build;; \ esac make: *** [sharedmods] Error 139 This is a segmentation fault when trying to build extensions: $ ./python.exe Python 2.5a0 (#5, Aug 14 2005, 18:20:08) [GCC 4.0.0 (Apple Computer, Inc. 
build 5026)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import setup Segmentation fault The minimal import that causes a crash is 'import distutils.sysconfig'. I've rebuilt using --enable-debug and --with-pydebug to check if gdb could tell me more. The start of the stacktrace: (gdb) r -c 'import distutils.sysconfig' Starting program: /Volumes/Data/Users/ronald/Python/python-HEAD/dist/src/python.exe -c 'import distutils.sysconfig' Reading symbols for shared libraries . done Program received signal EXC_BAD_ACCESS, Could not access memory. Reason: KERN_INVALID_ADDRESS at address: 0xcbcbcbd3 0x001500b0 in structseq_dealloc (obj=0x3c3878) at Objects/structseq.c:47 47 Py_XDECREF(obj->ob_item[i]); (gdb) where #0 0x001500b0 in structseq_dealloc (obj=0x3c3878) at Objects/structseq.c:47 #1 0x0002fdb0 in _Py_Dealloc (op=0x3c3878) at Objects/object.c:1883 #2 0x000eedb4 in frame_dealloc (f=0x1816c18) at Objects/frameobject.c:394 #3 0x0002fdb0 in _Py_Dealloc (op=0x1816c18) at Objects/object.c:1883 #4 0x000dd1d0 in fast_function (func=0x390038, pp_stack=0xbfffd788, n=1, na=1, nk=0) at Python/ceval.c:3654 #5 0x000dcdc8 in call_function (pp_stack=0xbfffd788, oparg=1) at Python/ceval.c:3590 #6 0x000d6aa4 in PyEval_EvalFrameEx (f=0x610358, throw=0) at Python/ceval.c:2181 #7 0x000d98d0 in PyEval_EvalCodeEx (co=0x5180d8, globals=0x5146c8, locals=0x5146c8, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2748 #8 0x000ce270 in PyEval_EvalCode (co=0x5180d8, globals=0x5146c8, locals=0x5146c8) at Python/ceval.c:490 #9 0x0001643c in PyImport_ExecCodeModuleEx (name=0xbfffe808 "distutils.sysconfig", co=0x5180d8, pathname=0xbfffde4c "/Volumes/Data/Users/ronald/Python/python-HEAD/dist/src/Lib/distutils/sysconfig.pyc") at Python/import.c:620 At the DECREF, i == 17, size == 18 and obj->ob_item[i] == 0xcbcbcbcb, and obj is a posix.stat_result.
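A note for anyone following the trace: as far as I know, the debug allocator in CPython of this era fills freshly allocated memory with 0xCB bytes, so 0xcbcbcbcb in the last slot would be consistent with a field that was allocated but never initialised. A stat_result carries more fields than the 10 visible through indexing, and the hidden ones are platform-dependent. The sketch below is only an illustration on a current Python (not part of the original report); it uses the n_sequence_fields and n_fields attributes that struct sequence types expose to show the gap between visible and total fields.

```python
import os

# Stat any path that exists; the current directory is a safe choice.
st = os.stat(".")
cls = type(st)

# Fields reachable by indexing or len() (the classic 10-tuple view).
visible = cls.n_sequence_fields
# All fields the struct sequence stores, including attribute-only extras.
total = cls.n_fields

print("visible fields:", visible)
print("total fields:  ", total)
print("hidden fields: ", total - visible)
```

If the C code fills in fewer hidden fields than the type declares on a given platform, the leftover slot is exactly the kind of thing structseq_dealloc would trip over when it walks all of them.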
From nas at arctrix.com Sun Aug 14 19:21:07 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Sun, 14 Aug 2005 11:21:07 -0600 Subject: [Python-Dev] cvs to bzr? In-Reply-To: <20050814000346.GB5470@mems-exchange.org> References: <17150.31637.180169.877441@montanaro.dyndns.org> <20050814000346.GB5470@mems-exchange.org> Message-ID: <20050814172107.GB8200@mems-exchange.org> On Sat, Aug 13, 2005 at 06:03:46PM -0600, Neil Schemenauer wrote: > Haven't tried it but should work: > > http://darcs.net/DarcsWiki/Tailor After applying the attached patch, this command seemed to work for converting the initial revision: ~/src/cvsync/tailor.py --source-kind cvs --target-kind bzr \ --bootstrap --repository ~/Python/python-cvsroot -m python \ --revision r16b1 py_bzr After, I think this command is supposed to bring the bzr repostiory up-to-date: cd py_bzr; ~/src/cvsync/tailor.py -v It does not seem to work for me (it only updates one file and then quits). cvs2svn seems to be much more mature. Neil -------------- next part -------------- diff -rN -u old-cvsync/vcpx/bzr.py new-cvsync/vcpx/bzr.py --- old-cvsync/vcpx/bzr.py 2005-08-14 09:43:15.000000000 -0600 +++ new-cvsync/vcpx/bzr.py 2005-08-14 10:38:02.000000000 -0600 @@ -29,14 +29,23 @@ ## SyncronizableTargetWorkingDir - def _addEntries(self, root, entries): - """ - Add a sequence of entries. 
- """ + def _addPathnames(self, root, entries): + c = SystemCommand(working_dir=root, command="bzr add --no-recurse" + " %(entries)s") + c(entries=' '.join([shrepr(e) for e in entries])) + def _addSubtree(self, root, *entries): c = SystemCommand(working_dir=root, command="bzr add %(entries)s") - c(entries=' '.join([shrepr(e.name) for e in entries])) + c(entries=' '.join([shrepr(e) for e in entries])) + def _removePathnames(self, root, names): + pass # bzr handles this itself + + def _renamePathname(self, root, oldname, newname): + c = SystemCommand(working_dir=root, + command="bzr mv %(old)s %(new)s") + c(old=shrepr(oldname), new=shrepr(newname)) + def _commit(self,root, date, author, remark, changelog=None, entries=None): """ Commit the changeset. @@ -112,7 +121,7 @@ # Create the .bzrignore file, that contains a glob per line, # with all known VCs metadirs to be skipped. - ignore = open(join(root, '.hgignore'), 'w') + ignore = open(join(root, '.bzrignore'), 'w') ignore.write('\n'.join(['(^|/)%s($|/)' % md for md in IGNORED_METADIRS])) ignore.write('\ntailor.log\ntailor.info\n') From gvanrossum at gmail.com Sun Aug 14 20:11:46 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 14 Aug 2005 11:11:46 -0700 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: <20050814171259.GA8200@mems-exchange.org> References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> <42FF6E4B.4000206@v.loewis.de> <20050814171259.GA8200@mems-exchange.org> Message-ID: On 8/14/05, Neil Schemenauer wrote: > It looks like the process of converting a CVS repository to > Bazaar-NG does not yet work well (to be kind). The path > CVS->SVN->bzr would probably work better. I suspect cvs2svn has > been used on quite a few CVS repositories already. I don't think > going to SVN first would lose any information. > > My vote is to continue with the migration to SVN. We can > re-evaluate Bazaar-NG at a later time. 
That looks like a fair assessment -- although it means twice the conversion pain for developers. It sounds as if bazaar-NG can use a bit of its own medicine -- I hope everybody who found a bug in their tools submitted a patch! :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Sun Aug 14 20:13:29 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 14 Aug 2005 11:13:29 -0700 Subject: [Python-Dev] Fwd: PEP: Migrating the Python CVS to Subversion In-Reply-To: <1123986783.21455.35.camel@linux.site> References: <1123986783.21455.35.camel@linux.site> Message-ID: Here's another POV. (Why does everybody keep emailing me personally?) --Guido van Rossum (home page: http://www.python.org/~guido/) ---------- Forwarded message ---------- From: Daniel Berlin Date: Aug 13, 2005 7:33 PM Subject: Re: [Python-Dev] PEP: Migrating the Python CVS to Subversion To: gvanrossum at gmail.com (Sorry for the lack of proper References: headers, this is a reply to the email archive). It's been a couple years since i've been in the world of python-dev, but apparently i'm rejoining the mailing list at just the right time. Take all of this for what it is worth: I'm currently responsible for GCC's bugzilla, wiki, in addition to maintaining several optimization areas of the compiler :P. In addition, i'm responsible for pushing GCC (my main love and world :P) towards Subversion. I should note my bias at this point. I now have full commit access to Subversion. However, I've also submitted patches to monotone, etc. We had a long thread about the various alternatives (arch, bzr, etc), and besides "freeness" constraints on what we can run on gcc.gnu.org as an FSF project, it wouldn't have mattered anyway. This has been in the planning for about a year now (mainly waiting for new hardware).
Originally, we were hoping to move GCC to monotone, but it didn't mature fast enough (it's way too slow), and we couldn't make it centralized enough for our tastes (more later). The rest of the free tools other than subversion (arch, monotone, git, darcs, etc) simply couldn't handle our repository with reasonable speed/hardware. GCC has project history dating back to 1987. It's a 4 gig CVS repo containing > 1000 tags, and > 300 branches. The changelog alone has 30k trunk revisions. Those distributed systems that carry full history often can't deal with this fast enough or in any space efficient way. arch was an example of this. It had a retarded mechanism that forced users to care about caching certain revisions to speed it up, instead of doing it on its own. I've never tried converting this repo to bazaar-ng, it wasn't far enough along when i started. It also had no cvs2bzr type program, and we aren't about to lose all our history. Except for monotone (builtin cvs_import) and subversion (cvs2svn), none of the cvs2* programs i've run across either run in reasonable time (those that don't actually understand how to extract rcs revisions would take weeks to convert our repo, literally), or could handle all the complexities a repository with our history presents (branches of branches, etc). Most simply crash in weird ways or run out of memory :). Anyway: Monotone took 45 minutes just to diff two old revisions that are one revision away from each other. CVS takes about 2 minutes for the same operation. SVN on fsfs takes 4 seconds. The converted SVN repo has > 100000 revisions, and is only ~15% bigger than the cvs repo (mostly due to stupid copies it has to do to handle some tag fuckery people were doing in some cases. If it had been subversion from the start, it would have been smaller). We have cvs2svn speedup patches that were done with the KDE folks that make cvs2svn io bound again instead of cpu bound (it was O(N^2) in extracting cvs revision texts before).
It takes 17 hours to convert the gcc repository now, only 45 minutes of cpu time :). It used to take 52 hours. I've also talked with Linus about version control before. He believes extreme distributed development is the way to go. I believe heavily that in most cases where you have a mix of corporations and free developers, it ends up causing people to "hide the ball" more than they should. This is particularly prevalent in GCC. We don't want design and development done and then sent as mega-patches presented as fait accompli, then watch these people whine as their designs get torn apart. We'd rather have the discussion on the mailing list and the work done in a visible place (i.e. a CVS branch stored in some central place) rather than getting patch bombs. As a result (and there are many other reasons, i'm just presenting one of them), we actually don't *want* to move from a centralized model, in order to help control the social and political problems we'd have to face if we went fully distributed. Python may not face any of these problems to the degree that GCC does (i doubt many projects do actually. GCC is a very weird and tense political situation :P), because of size, etc, in which case, a distributed model may make more sense. However, you need to be careful to make sure that people understand that it hasn't actually changed your real development process (PEPs, etc), only the workflow used to implement it. From bcannon at gmail.com Sun Aug 14 20:37:43 2005 From: bcannon at gmail.com (Brett Cannon) Date: Sun, 14 Aug 2005 11:37:43 -0700 Subject: [Python-Dev] build problems on macosx (CVS HEAD) In-Reply-To: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com> References: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com> Message-ID: On 8/14/05, Ronald Oussoren wrote: > Hi, > > I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a > checkout that is less than two hours old.
I'm building a standard > unix tree (no framework install): > > $ ./configure --prefix=/opt/python/2.5 > ... > $ make > ... > ar cr libpython2.5.a Modules/config.o Modules/getpath.o Modules/ > main.o Modules/gcmodule.o > ar cr libpython2.5.a Modules/threadmodule.o Modules/signalmodule.o > Modules/posixmodule.o Modules/errnomodule.o Modules/_sre.o Modules/ > _codecsmodule.o Modules/zipimport.o Modules/symtablemodule.o > Modules/xxsubtype.o > ranlib libpython2.5.a > c++ -u _PyMac_Error -o python.exe \ > Modules/python.o \ > libpython2.5.a -ldl > case $MAKEFLAGS in \ > *-s*) CC='gcc' LDSHARED='gcc -bundle -undefined dynamic_lookup' > OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python.exe -E ./ > setup.py -q build;; \ > *) CC='gcc' LDSHARED='gcc -bundle -undefined dynamic_lookup' OPT='- > DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python.exe -E ./setup.py > build;; \ > esac > make: *** [sharedmods] Error 139 > > This is a segmentation fault when trying to build extensions: > I can verify the breakage. I did a ``make distclean``, updated, built, and got the same 139 error. I am short on time today, so I don't think I will be able to dive into this right away. -Brett From mwh at python.net Sun Aug 14 21:09:21 2005 From: mwh at python.net (Michael Hudson) Date: Sun, 14 Aug 2005 20:09:21 +0100 Subject: [Python-Dev] build problems on macosx (CVS HEAD) In-Reply-To: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com> (Ronald Oussoren's message of "Sun, 14 Aug 2005 19:17:00 +0200") References: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com> Message-ID: <2mslxcuyhq.fsf@starship.python.net> Ronald Oussoren writes: > Hi, > > I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a > checkout that is less than two hours old. I'm building a standard > unix tree (no framework install): It seems very likely that it was this change: http://fisheye.cenqua.com/changelog/~br=MAIN/python?cs=MAIN:loewis:20050809145951 Refcounting, posixmodule.c, aiee! 
You are in a maze of twisty #ifdefs, all alike. I'll probably find the problem fairly soon, we'll see... :) Cheers, mwh -- All obscurity will buy you is time enough to contract venereal diseases. -- Tim Peters, python-dev From ndbecker2 at gmail.com Sun Aug 14 20:14:22 2005 From: ndbecker2 at gmail.com (Neal Becker) Date: Sun, 14 Aug 2005 14:14:22 -0400 Subject: [Python-Dev] Fwd: Distributed RCS References: <42FBA376.5030605@canonical.com> Message-ID: I encourage everyone to look at mercurial. It is also written in Python. I am using it daily. From mwh at python.net Sun Aug 14 21:21:31 2005 From: mwh at python.net (Michael Hudson) Date: Sun, 14 Aug 2005 20:21:31 +0100 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: <6AA5C3D1-6AEC-4CE4-AEB9-84FBDA10EFA9@wsanchez.net> ( =?iso-8859-1?q?Wilfredo_S=E1nchez_Vega's_message_of?= "Sat, 13 Aug 2005 09:02:00 -0700") References: <6AA5C3D1-6AEC-4CE4-AEB9-84FBDA10EFA9@wsanchez.net> Message-ID: <2moe80uxxg.fsf@starship.python.net> Wilfredo Sánchez Vega writes: > I'm curious about why Python lacks FileNotFoundError, > PermissionError and the like as subclasses of IOError. Good question. Lack of effort/inertia? > Catching IOError and looking at errno to figure out what went > wrong seems pretty unpythonic, and I've often wished for built-in > subclasses of IOError. The py library does this (http://codespeak.net/py). > I sometimes subclass them myself, but a lot of the time, I'm > catching such exceptions as thrown by the standard library. Well, indeed. OTOH, functions like os.open aren't really *meant* to be pythonic. I don't think this is something I can get interested enough in to work on myself. Cheers, mwh -- As far as I'm concerned, the meat pie is the ultimate unit of currency.
-- from Twisted.Quotes From gvanrossum at gmail.com Sun Aug 14 21:27:18 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 14 Aug 2005 12:27:18 -0700 Subject: [Python-Dev] Exception Reorg PEP checked in In-Reply-To: <2moe80uxxg.fsf@starship.python.net> References: <6AA5C3D1-6AEC-4CE4-AEB9-84FBDA10EFA9@wsanchez.net> <2moe80uxxg.fsf@starship.python.net> Message-ID: On 8/14/05, Michael Hudson wrote: > Wilfredo Sánchez Vega writes: > > > I'm curious about why Python lacks FileNotFoundError, > > PermissionError and the like as subclasses of IOError. > > Good question. Lack of effort/inertia? Well, I wonder how often it's needed. My typical use is this: try: f = open(filename) except IOError, err: print "Can't open %s: %s" % (filename, err) return and the error printed contains all the necessary details (in fact it even repeats the filename, so I could probably just say "print err"). Why do you need to know the exact reason for the failure? If you simply want to know whether the file exists, I'd use os.path.exists() or isfile(). (Never mind that this is the sometimes-frowned-upon look-before-you-leap; I think it's often fine.) Also note that providing the right detail can be very OS specific. Python doesn't just run on Unix and Windows. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh at python.net Sun Aug 14 21:35:00 2005 From: mwh at python.net (Michael Hudson) Date: Sun, 14 Aug 2005 20:35:00 +0100 Subject: [Python-Dev] Distributed RCS In-Reply-To: (Terry Reedy's message of "Sat, 13 Aug 2005 23:07:49 -0400") References: <42FBA376.5030605@canonical.com> <42FD2A13.8000900@canonical.com> Message-ID: <2mk6iouxaz.fsf@starship.python.net> "Terry Reedy" writes: > It seems to me that auto testing of the tentatively updated trunk before > final commitment would avoid the 'who checked in test-breaking code' > messages that appear here occasionally.
I don't think there's any fundamental impossibility in setting up such a system for CVS, and am pretty certain there's not for SVN. > But it requires that the update + test-suite time be less than the > average inter-update interval. Indeed. > The current bottleneck in Python development appears to be patch reviews. And acting on those reviews... > So merely making submission and commitment easier will not help much. I'm not sure, I think it could help quite a bit. > An alternative to more reviewers is more automation to make more > effective use of existing reviewers. (And this might also encourage > more reviewers.) The Launchpad group seems to be ahead in this > regard, but I don't know how much this is due to using bazaar. In > any case, ease of improving the review process might be a criterion > for choosing a source code system. But I leave this to ML. > > *Other things being equal*, using a state-of-the-art development system > written in Python to develop Python would be a marketing plus. I think the words "stable" and "reliable" should be in there somewhere :) I don't get the impression bazaar-ng is there yet. Cheers, mwh -- Unfortunately, nigh the whole world is now duped into thinking that silly fill-in forms on web pages is the way to do user interfaces. -- Erik Naggum, comp.lang.lisp From gustavo at niemeyer.net Sun Aug 14 23:21:40 2005 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Sun, 14 Aug 2005 18:21:40 -0300 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: <42FF32AA.7040506@v.loewis.de> References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> Message-ID: <20050814212140.GA11278@burma.localdomain> > Like Skip, I tried experimenting with it. While that may be the right > model, I don't think it is the right software. 
In bazaar-ng 0.0.5 (which > is what Debian unstable currently has), bzr commit would not open > a text editor, but require the commit message on the command line; > selective commit of only some of the changed files is also not > supported. bzr diff cannot show the changes between two revisions, The development version has all of those features implemented already. > and cannot show revisions across branches. I'm not sure about this one, though. -- Gustavo Niemeyer http://niemeyer.net From martin at v.loewis.de Sun Aug 14 23:43:54 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 14 Aug 2005 23:43:54 +0200 Subject: [Python-Dev] build problems on macosx (CVS HEAD) In-Reply-To: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com> References: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com> Message-ID: <42FFBB1A.8060206@v.loewis.de> Ronald Oussoren wrote: > I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a > checkout that is less than two hours old. I'm building a standard > unix tree (no framework install): I just committed what I think is a bugfix for the recent st_gen support. Unfortunately, I can't try the code, since I don't have access to BSD/OSX at the moment. So please report whether there is any change in behaviour. Regards, Martin From martin at v.loewis.de Sun Aug 14 23:47:27 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 14 Aug 2005 23:47:27 +0200 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> <42FF6E4B.4000206@v.loewis.de> <20050814171259.GA8200@mems-exchange.org> Message-ID: <42FFBBEF.5060202@v.loewis.de> Guido van Rossum wrote: > It sounds as if bazaar-NG can use a bit of its own medicine -- I hope > everybody who found a bug in their tools submitted a patch! 
:-) I had problems finding the place where the bazaar-NG source code repository is stored - is there public access to the HEAD version? There also doesn't appear to be a bug tracker - but I found a mention that bug reports should go to the mailing list. Regards, Martin From martin at v.loewis.de Sun Aug 14 23:58:46 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 14 Aug 2005 23:58:46 +0200 Subject: [Python-Dev] Fwd: PEP: Migrating the Python CVS to Subversion In-Reply-To: References: <1123986783.21455.35.camel@linux.site> Message-ID: <42FFBE96.7040000@v.loewis.de> Guido van Rossum wrote: > Here's another POV. I think I agree with Daniel's view, in particular wrt. performance. Whatever the replacement tool, it should perform as well as or better than CVS currently does; it also shouldn't perform much worse than subversion. I've been using git (or, rather, cogito) to keep up-to-date with the Linux kernel. While performance of git is really good, storage requirements are *quite* high, and initial "checkout" takes a long time - even though the Linux kernel repository stores virtually no history (there was a strict cut when converting the bitkeeper HEAD). So these distributed tools would cause quite some disk consumption on client machines. bazaar-ng apparently supports only-remote repositories as well, so that might be no concern.
Regards, Martin From dberlin at dberlin.org Mon Aug 15 00:07:39 2005 From: dberlin at dberlin.org (Daniel Berlin) Date: Sun, 14 Aug 2005 18:07:39 -0400 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: <20050814171259.GA8200@mems-exchange.org> References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> <42FF6E4B.4000206@v.loewis.de> <20050814171259.GA8200@mems-exchange.org> Message-ID: <1124057259.25267.70.camel@linux.site> On Sun, 2005-08-14 at 11:12 -0600, Neil Schemenauer wrote: > On Sun, Aug 14, 2005 at 06:16:11PM +0200, "Martin v. Löwis" wrote: > > It depends on what "a bit" is. Waiting a month would be fine; waiting > > two years might be pointless. > > It looks like the process of converting a CVS repository to > Bazaar-NG does not yet work well (to be kind). The path > CVS->SVN->bzr would probably work better. I suspect cvs2svn has > been used on quite a few CVS repositories already. I don't think > going to SVN first would lose any information. It doesn't. As a data point, CVS2SVN can handle gcc's massive cvs repository, which has merged rcs file information in it dating back to 1987, >1000 tags, and > 300 branches. Besides monotone's cvs_import, it's actually the only properly designed cvs converter I've seen in a while (Properly designed in that it actually uses the necessary and correct algorithms to get all the weirdities of cvs branches and tags right). I'm not sure how big python's repo is, but you probably want to use the attached patch to speed up cvs2svn. It changes it to reconstruct the revisions on its own instead of calling cvs or rcs. For GCC, and KDE, this makes a significant difference (17 hours for our 4 gig cvs repo conversion instead of 52 hours), because it was spawning cvs/rcs 50 billion times, and the milliseconds add up :) > My vote is to continue with the migration to SVN. We can > re-evaluate Bazaar-NG at a later time.
GCC is moving to SVN (very soon now, within 2 months), and this has been my viewpoint as well. It's much easier to go from something that has changesets and global revisions, to a distributed system, if you want to, than it is to try to reconstruct that info from CVS on your own :). Subversion also has excellent language bindings, including the python bindings. That's how i've hooked it up to gcc's bugzilla. You could easily write something to transform *from* subversion to another system using the bindings. Things like viewcvs use the python bindings to deal with the svn repository entirely. --Dan -------------- next part -------------- Index: cvs2svn =================================================================== --- cvs2svn (revision 1423) +++ cvs2svn (working copy) @@ -166,6 +166,10 @@ # grouping. See design-notes.txt for details. DATAFILE = 'cvs2svn-data' +REVISIONS_DB = 'cvs2svn-cvsrepo.db' + +CHECKOUT_DB = 'cvs2svn-cvsco.db' + # This file contains a marshalled copy of all the statistics that we # gather throughout the various runs of cvs2svn. The data stored as a # marshalled dictionary. @@ -355,40 +359,7 @@ " cvsroot\n" % (error_prefix, cvsroot, fname)) sys.exit(1) -def get_co_pipe(c_rev, extra_arguments=None): - """Return a command string, and the pipe created using that string. - C_REV is a CVSRevision, and EXTRA_ARGUMENTS is used to add extra - arguments. 
The pipe returns the text of that CVS Revision.""" - ctx = Ctx() - if extra_arguments is None: - extra_arguments = [] - if ctx.use_cvs: - pipe_cmd = [ 'cvs' ] + ctx.cvs_global_arguments + \ - [ 'co', '-r' + c_rev.rev, '-p' ] + extra_arguments + \ - [ ctx.cvs_module + c_rev.cvs_path ]; - else: - pipe_cmd = [ 'co', '-q', '-x,v', '-p' + c_rev.rev ] + extra_arguments + \ - [ c_rev.rcs_path() ] - pipe = SimplePopen(pipe_cmd, True) - pipe.stdin.close() - return pipe_cmd, pipe - -def generate_ignores(c_rev): - # Read in props - pipe_cmd, pipe = get_co_pipe(c_rev) - buf = pipe.stdout.read(PIPE_READ_SIZE) - raw_ignore_val = "" - while buf: - raw_ignore_val = raw_ignore_val + buf - buf = pipe.stdout.read(PIPE_READ_SIZE) - pipe.stdout.close() - error_output = pipe.stderr.read() - exit_status = pipe.wait() - if exit_status: - sys.exit("%s: The command '%s' failed with exit status: %s\n" - "and the following output:\n" - "%s" % (error_prefix, pipe_cmd, exit_status, error_output)) - +def generate_ignores(raw_ignore_val): # Tweak props: First, convert any spaces to newlines... raw_ignore_val = '\n'.join(raw_ignore_val.split()) raw_ignores = raw_ignore_val.split('\n') @@ -614,9 +585,7 @@ DB_OPEN_READ = 'r' DB_OPEN_NEW = 'n' -# A wrapper for anydbm that uses the marshal module to store items as -# strings. -class Database: +class SDatabase: def __init__(self, filename, mode): # pybsddb3 has a bug which prevents it from working with # Berkeley DB 4.2 if you open the db with 'n' ("new"). This @@ -635,22 +604,24 @@ self.db = anydbm.open(filename, mode) - def has_key(self, key): - return self.db.has_key(key) + def __getattr__(self, name): + return getattr(self.db, name) +# A wrapper for anydbm that uses the marshal module to store items as +# strings. 
+class Database(SDatabase): + def __getitem__(self, key): return marshal.loads(self.db[key]) def __setitem__(self, key, value): self.db[key] = marshal.dumps(value) - def __delitem__(self, key): - del self.db[key] - def get(self, key, default): - if self.has_key(key): - return self.__getitem__(key) - return default + try: + return marshal.loads(self.db[key]) + except KeyError: + return default class StatsKeeper: @@ -841,6 +812,192 @@ Cleanup().register(temp(TAGS_DB), pass8) +def msplit(stri): + re = [ i + "\n" for i in stri.split("\n") ] + re[-1] = re[-1][:-1] + if not re[-1]: + del re[-1] + return re + + +class RCSStream: + ad_command = re.compile('^([ad])(\d+)\\s(\\d+)') + a_command = re.compile('^a(\d+)\\s(\\d+)') + + def __init__(self): + self.texts = [] + + def copy(self): + ret = RCSStream() + ret.texts = self.texts[:] + return ret + + def setText(self, text): + self.texts = msplit(text) + + def getText(self): + return "".join(self.texts) + + def applyDiff(self, diff): + diffs = msplit(diff) + adjust = 0 + i = 0 + while i < len(diffs): + admatch = self.ad_command.match(diffs[i]) + i += 1 + try: + cn = int(admatch.group(3)) + except: + print diffs + raise RuntimeError, 'Error parsing diff commands' + if admatch.group(1) == 'd': # "d" - Delete command + sl = int(admatch.group(2)) - 1 + adjust + del self.texts[sl:sl + cn] + adjust -= cn + else: # "a" - Add command + sl = int(admatch.group(2)) + adjust + self.texts[sl:sl] = diffs[i:i + cn] + adjust += cn + i += cn + + def invertDiff(self, diff): + diffs = msplit(diff) + ndiffs = [] + adjust = 0 + i = 0 + while i < len(diffs): + admatch = self.ad_command.match(diffs[i]) + i += 1 + try: + cn = int(admatch.group(3)) + except: + raise RuntimeError, 'Error parsing diff commands' + if admatch.group(1) == 'd': # "d" - Delete command + sl = int(admatch.group(2)) - 1 + adjust + # handle substitution explicitly, as add must come after del + # (last add may have incomplete line) + if i < len(diffs): + amatch = 
self.a_command.match(diffs[i]) + else: + amatch = None + if amatch and int(amatch.group(1)) + adjust == sl + cn: + cn2 = int(amatch.group(2)) + i += 1 + ndiffs += ["d%d %d\na%d %d\n" % (sl + 1, cn2, sl + cn2, cn)] + \ + self.texts[sl:sl + cn] + self.texts[sl:sl + cn] = diffs[i:i + cn2] + adjust += cn2 - cn + i += cn2 + else: + ndiffs += ["a%d %d\n" % (sl, cn)] + self.texts[sl:sl + cn] + del self.texts[sl:sl + cn] + adjust -= cn + else: # "a" - Add command + sl = int(admatch.group(2)) + adjust + ndiffs += ["d%d %d\n" % (sl + 1, cn)] + self.texts[sl:sl] = diffs[i:i + cn] + adjust += cn + i += cn + return "".join(ndiffs) + + def zeroDiff(self): + if not self.texts: + return "" + return "a0 " + str(len(self.texts)) + "\n" + "".join(self.texts) + + +class CVSCheckout: + + class Rev: pass + + __shared_state = { } + def __init__(self): + self.__dict__ = self.__shared_state + + def init(self): + self.co_db = SDatabase(temp(CHECKOUT_DB), DB_OPEN_NEW) + Cleanup().register(temp(CHECKOUT_DB), pass8) + self.rev_db = SDatabase(temp(REVISIONS_DB), DB_OPEN_READ) + self.files = { } + + def done(self): + print "leftover revisions:" + for file in self.files: + print file + ':', + for r in self.files[file]: + print r, + print + self.co_db.close() + self.rev_db.close() + + def init_file(self, fname): + revs = { } + for line in self.rev_db[fname].split('\n'): + prv = None + for r in line.split(): + try: + rev = revs[r] + except KeyError: + rev = CVSCheckout.Rev() + rev.ref = 0 + rev.prev = None + revs[r] = rev + if prv: + revs[prv].prev = r + rev.ref += 1 + prv = r + return revs + + def checkout_i(self, fname, revs, r, co, ref): + rev = revs[r] + if rev.prev: + prev = revs[rev.prev] + try: + key = fname + '/' + rev.prev + co.setText(self.co_db[key]) + prev.ref -= 1 + if not prev.ref: +# print "used saved", fname, rev.prev, "- deleted" + del revs[rev.prev] + del self.co_db[key] +# else: +# print "used saved", fname, rev.prev, "- keeping. 
ref is now", prev.ref + except KeyError: + self.checkout_i(fname, revs, rev.prev, co, 1) + try: + co.applyDiff(self.rev_db[fname + '/' + r]) + except KeyError: + pass + rev.ref -= ref + if rev.ref: +# print "checked out", fname, r, "- saving. ref is", rev.ref + self.co_db[fname + '/' + r] = co.getText() + else: +# print "checked out", fname, r, "- not saving" + del revs[r] + + def checkout_ii(self, fname, revs, r, cvtnl=None): + co = RCSStream() + self.checkout_i(fname, revs, r, co, 0) + rv = co.getText() + if cvtnl: + rv = rv.replace('\r\n', '\n').replace('\r', '\n') + return rv + + def checkout(self, c_rev, cvtnl=None): + try: + revs = self.files[c_rev.fname] + rv = self.checkout_ii(c_rev.fname, revs, c_rev.rev, cvtnl) + if not revs: + del self.files[c_rev.fname] + except KeyError: + revs = self.init_file(c_rev.fname) + rv = self.checkout_ii(c_rev.fname, revs, c_rev.rev, cvtnl) + if revs: + self.files[c_rev.fname] = revs + return rv + + class CVSRevision: def __init__(self, ctx, *args): """Initialize a new CVSRevision with Ctx object CTX, and ARGS. @@ -848,7 +1005,6 @@ If CTX is None, the following members and methods of the instantiated CVSRevision class object will be unavailable (or simply will not work correctly, if at all): - cvs_path svn_path svn_trunk_path is_default_branch_revision() @@ -870,7 +1026,6 @@ prev_rev --> (string or None) previous CVS rev, e.g., "1.2" rev --> (string) this CVS rev, e.g., "1.3" next_rev --> (string or None) next CVS rev, e.g., "1.4" - file_in_attic --> (char or None) true if RCS file is in Attic file_executable --> (char or None) true if RCS file has exec bit set. 
file_size --> (int) size of the RCS file deltatext_code --> (char) 'N' if non-empty deltatext, else 'E' @@ -883,16 +1038,16 @@ The two forms of initialization are equivalent.""" self._ctx = ctx - if len(args) == 16: + if len(args) == 15: (self.timestamp, self.digest, self.prev_timestamp, self.op, - self.prev_rev, self.rev, self.next_rev, self.file_in_attic, + self.prev_rev, self.rev, self.next_rev, self.file_executable, self.file_size, self.deltatext_code, self.fname, self.mode, self.branch_name, self.tags, self.branches) = args elif len(args) == 1: - data = args[0].split(' ', 14) + data = args[0].split(' ', 13) (self.timestamp, self.digest, self.prev_timestamp, self.op, - self.prev_rev, self.rev, self.next_rev, self.file_in_attic, + self.prev_rev, self.rev, self.next_rev, self.file_executable, self.file_size, self.deltatext_code, self.mode, self.branch_name, numtags, remainder) = data # Patch up data items which are not simple strings @@ -905,8 +1060,6 @@ self.prev_rev = None if self.next_rev == "*": self.next_rev = None - if self.file_in_attic == "*": - self.file_in_attic = None if self.file_executable == "*": self.file_executable = None self.file_size = int(self.file_size) @@ -923,12 +1076,11 @@ self.branches = branches_and_fname[:-1] self.fname = branches_and_fname[-1] else: - raise TypeError, 'CVSRevision() takes 2 or 16 arguments (%d given)' % \ + raise TypeError, 'CVSRevision() takes 2 or 15 arguments (%d given)' % \ (len(args) + 1) - if ctx is not None: - self.cvs_path = relative_name(self._ctx.cvsroot, self.fname[:-2]) - self.svn_path = self._make_path(self.cvs_path, self.branch_name) - self.svn_trunk_path = self._make_path(self.cvs_path) + if ctx is not None: # strictly speaking this check is now superfluous + self.svn_path = self._make_path(self.fname, self.branch_name) + self.svn_trunk_path = self._make_path(self.fname) # The 'primary key' of a CVS Revision is the revision number + the # filename. 
To provide a unique key (say, for a dict), we just glom @@ -941,10 +1093,10 @@ return revnum + "/" + self.fname def __str__(self): - return ('%08lx %s %s %s %s %s %s %s %s %d %s %s %s %d%s%s %d%s%s %s' % ( + return ('%08lx %s %s %s %s %s %s %s %d %s %s %s %d%s%s %d%s%s %s' % ( self.timestamp, self.digest, self.prev_timestamp or "*", self.op, (self.prev_rev or "*"), self.rev, (self.next_rev or "*"), - (self.file_in_attic or "*"), (self.file_executable or "*"), + (self.file_executable or "*"), self.file_size, self.deltatext_code, (self.mode or "*"), (self.branch_name or "*"), len(self.tags), self.tags and " " or "", " ".join(self.tags), @@ -967,11 +1119,11 @@ return 0 def is_default_branch_revision(self): - """Return 1 if SELF.rev of SELF.cvs_path is a default branch + """Return 1 if SELF.rev of SELF.fname is a default branch revision according to DEFAULT_BRANCHES_DB (see the conditions documented there), else return None.""" - if self._ctx._default_branches_db.has_key(self.cvs_path): - val = self._ctx._default_branches_db[self.cvs_path] + if self._ctx._default_branches_db.has_key(self.fname): + val = self._ctx._default_branches_db[self.fname] val_last_dot = val.rindex(".") our_last_dot = self.rev.rindex(".") default_branch = val[:val_last_dot] @@ -1031,19 +1183,6 @@ else: return self._ctx.trunk_base + '/' + path - def rcs_path(self): - """Returns the actual filesystem path to the RCS file of this - CVSRevision.""" - if self.file_in_attic is None: - return self.fname - else: - basepath, filename = os.path.split(self.fname) - return os.path.join(basepath, 'Attic', filename) - - def filename(self): - "Return the last path component of self.fname, minus the ',v'" - return os.path.split(self.fname)[-1][:-2] - class SymbolDatabase: """This database records information on all symbols in the RCS files. 
It is created in pass 1 and it is used in pass 2.""" @@ -1177,6 +1316,8 @@ def __init__(self): self.revs = open(temp(DATAFILE + REVS_SUFFIX), 'w') Cleanup().register(temp(DATAFILE + REVS_SUFFIX), pass2) + self.revisions_db = SDatabase(temp(REVISIONS_DB), DB_OPEN_NEW) + Cleanup().register(temp(REVISIONS_DB), pass8) self.resync = open(temp(DATAFILE + RESYNC_SUFFIX), 'w') Cleanup().register(temp(DATAFILE + RESYNC_SUFFIX), pass2) self.default_branches_db = Database(temp(DEFAULT_BRANCHES_DB), DB_OPEN_NEW) @@ -1211,6 +1352,8 @@ if not canonical_name == filename: self.file_in_attic = 1 + self.stream = RCSStream() + file_stat = os.stat(filename) # The size of our file in bytes self.file_size = file_stat[stat.ST_SIZE] @@ -1247,6 +1390,8 @@ # distinguish between an add and a change. self.rev_state = { } + self.empty_1111 = None + # Hash mapping branch numbers, like '1.7.2', to branch names, # like 'Release_1_0_dev'. self.branch_names = { } @@ -1505,6 +1650,10 @@ # finished the for-loop (no resyncing was performed) return + def writeout(self, r, tx): + if tx: + self.revisions_db[self.rel_name + '/' + r] = tx + def set_revision_info(self, revision, log, text): timestamp, author, old_ts = self.rev_data[revision] digest = sha.new(log + '\0' + author).hexdigest() @@ -1552,13 +1701,15 @@ deltatext_code = DELTATEXT_NONEMPTY else: deltatext_code = DELTATEXT_EMPTY + if revision == '1.1.1.1': + self.empty_1111 = 1 c_rev = CVSRevision(Ctx(), timestamp, digest, prev_timestamp, op, self.prev_rev[revision], revision, self.next_rev.get(revision), - self.file_in_attic, self.file_executable, + self.file_executable, self.file_size, - deltatext_code, self.fname, + deltatext_code, self.rel_name, self.mode, self.rev_to_branch_name(revision), self.taglist.get(revision, []), self.branchlist.get(revision, [])) @@ -1568,6 +1719,16 @@ if not self.metadata_db.has_key(digest): self.metadata_db[digest] = (author, log) + if trunk_rev.match(revision): + if revision not in self.next_rev: + 
self.stream.setText(text) + else: + self.writeout(self.next_rev[revision], self.stream.invertDiff(text)) + if not self.prev_rev[revision]: + self.writeout(revision, self.stream.zeroDiff()) + else: + self.writeout(revision, text) + def parse_completed(self): # Walk through all branches and tags and register them with # their parent branch in the symbol database. @@ -1579,8 +1740,33 @@ self.num_files = self.num_files + 1 + tree = [ ] + for r in self.prev_rev: + if r not in self.next_rev and not (r == "1.1.1.1" and self.empty_1111): + while self.rev_state[r] == 'dead': + pr = self.prev_rev[r] + if not pr: + break + if self.next_rev.get(pr) != r: + break + r = pr + else: + rvs = [ ] + while 1: + rvs.append(r) + pr = self.prev_rev[r] + if not pr: + break + if self.next_rev.get(pr) != r: + rvs.append(pr) + break + r = pr + tree.append(" ".join(rvs)) + self.revisions_db[self.rel_name] = "\n".join(tree) + def write_symbol_db(self): self.symbol_db.write() + self.revisions_db.close() class SymbolingsLogger: """Manage the file that contains lines for symbol openings and @@ -2038,7 +2224,7 @@ if not c_rev.branches: continue cvs_generated_msg = ('file %s was initially added on branch %s.\n' - % (c_rev.filename(), + % (os.path.split(c_rev.fname)[-1], c_rev.branches[0])) author, log_msg = \ Ctx()._persistence_manager.svn_commit_metadata[c_rev.digest] @@ -3389,7 +3575,7 @@ keywords = None if self.mime_mapper: - mime_type = self.mime_mapper.get_type_from_filename(c_rev.cvs_path) + mime_type = self.mime_mapper.get_type_from_filename(c_rev.fname) if not c_rev.mode == 'b': if not self.no_default_eol: @@ -3684,10 +3870,12 @@ if props_len: props_header = 'Prop-content-length: %d\n' % props_len + co = CVSCheckout().checkout(c_rev, s_item.needs_eol_filter) + # treat .cvsignore as a directory property dir_path, basename = os.path.split(c_rev.svn_path) if basename == ".cvsignore": - ignore_vals = generate_ignores(c_rev) + ignore_vals = generate_ignores(co) ignore_contents = 
'\n'.join(ignore_vals) ignore_contents = ('K 10\nsvn:ignore\nV %d\n%s\n' % \ (len(ignore_contents), ignore_contents)) @@ -3705,73 +3893,24 @@ % (self._utf8_path(dir_path), ignore_len, ignore_len, ignore_contents)) - # If the file has keywords, we must use -kk to prevent CVS/RCS from - # expanding the keywords because they must be unexpanded in the - # repository, or Subversion will get confused. - if s_item.has_keywords: - pipe_cmd, pipe = get_co_pipe(c_rev, [ '-kk' ]) - else: - pipe_cmd, pipe = get_co_pipe(c_rev) + checksum = md5.new() + checksum.update(co) self.dumpfile.write('Node-path: %s\n' 'Node-kind: file\n' 'Node-action: %s\n' '%s' # no property header if no props - 'Text-content-length: ' + 'Text-content-length: %d\n' + 'Text-content-md5: %s\n' + 'Content-length: %d\n' + '\n' % (self._utf8_path(c_rev.svn_path), - action, props_header)) - - pos = self.dumpfile.tell() - - self.dumpfile.write('0000000000000000\n' - 'Text-content-md5: 00000000000000000000000000000000\n' - 'Content-length: 0000000000000000\n' - '\n') - + action, props_header, + len(co), checksum.hexdigest(), + len(co) + props_len)) if prop_contents: self.dumpfile.write(prop_contents) - - # Insert a filter to convert all EOLs to LFs if neccessary - if s_item.needs_eol_filter: - data_reader = LF_EOL_Filter(pipe.stdout) - else: - data_reader = pipe.stdout - - # Insert the rev contents, calculating length and checksum as we go. 
- checksum = md5.new() - length = 0 - while True: - buf = data_reader.read(PIPE_READ_SIZE) - if buf == '': - break - checksum.update(buf) - length = length + len(buf) - self.dumpfile.write(buf) - - pipe.stdout.close() - error_output = pipe.stderr.read() - exit_status = pipe.wait() - if exit_status: - sys.exit("%s: The command '%s' failed with exit status: %s\n" - "and the following output:\n" - "%s" % (error_prefix, pipe_cmd, exit_status, error_output)) - - # Go back to patch up the length and checksum headers: - self.dumpfile.seek(pos, 0) - # We left 16 zeros for the text length; replace them with the real - # length, padded on the left with spaces: - self.dumpfile.write('%16d' % length) - # 16... + 1 newline + len('Text-content-md5: ') == 35 - self.dumpfile.seek(pos + 35, 0) - self.dumpfile.write(checksum.hexdigest()) - # 35... + 32 bytes of checksum + 1 newline + len('Content-length: ') == 84 - self.dumpfile.seek(pos + 84, 0) - # The content length is the length of property data, text data, - # and any metadata around/inside around them. - self.dumpfile.write('%16d' % (length + props_len)) - # Jump back to the end of the stream - self.dumpfile.seek(0, 2) - + self.dumpfile.write(co) # This record is done (write two newlines -- one to terminate # contents that weren't themselves newline-termination, one to # provide a blank line for readability. @@ -4208,7 +4347,7 @@ warning_prefix) msg = "RESYNC: '%s' (%s): old time='%s' delta=%ds" \ - % (c_rev.cvs_path, c_rev.rev, time.ctime(c_rev.timestamp), + % (c_rev.fname, c_rev.rev, time.ctime(c_rev.timestamp), record[2] - c_rev.timestamp) Log().write(LOG_VERBOSE, msg) @@ -4322,6 +4461,9 @@ Log().write(LOG_QUIET, "Done.") def pass8(): + checkout = CVSCheckout() + checkout.init() + svncounter = 2 # Repository initialization is 1. 
repos = SVNRepositoryMirror() persistence_manager = PersistenceManager(DB_OPEN_READ) @@ -4346,6 +4488,8 @@ repos.finish() + checkout.done() + _passes = [ pass1, pass2, @@ -4389,7 +4533,6 @@ self.no_default_eol = 0 self.eol_from_mime_type = 0 self.keywords_off = 0 - self.use_cvs = None self.svnadmin = "svnadmin" self.username = None self.print_help = 0 @@ -4492,8 +4635,6 @@ print ' --profile profile with \'hotshot\' (into file cvs2svn.hotshot)' print ' --dry-run do not create a repository or a dumpfile;' print ' just print what would happen.' - print ' --use-cvs use CVS instead of RCS \'co\' to extract data' - print ' (only use this if having problems with RCS)' print ' --svnadmin=PATH path to the svnadmin program' print ' --trunk-only convert only trunk commits, not tags nor branches' print ' --trunk=PATH path for trunk (default: %s)' \ @@ -4538,7 +4679,7 @@ "username=", "existing-svnrepos", "branches=", "tags=", "encoding=", "force-branch=", "force-tag=", "exclude=", - "use-cvs", "mime-types=", + "mime-types=", "eol-from-mime-type", "no-default-eol", "trunk-only", "no-prune", "dry-run", "dump-only", "dumpfile=", "tmpdir=", @@ -4588,8 +4729,6 @@ ctx.dumpfile = value elif opt == '--tmpdir': ctx.tmpdir = value - elif opt == '--use-cvs': - ctx.use_cvs = 1 elif opt == '--svnadmin': ctx.svnadmin = value elif opt == '--trunk-only': @@ -4673,30 +4812,6 @@ "existing directory.\n" % ctx.cvsroot) sys.exit(1) - if ctx.use_cvs: - # Ascend above the specified root if necessary, to find the cvs_repository - # (a directory containing a CVSROOT directory) and the cvs_module (the - # path of the conversion root within the cvs repository) - # NB: cvs_module must be seperated by '/' *not* by os.sep . 
- ctx.cvs_repository = os.path.abspath(ctx.cvsroot) - prev_cvs_repository = None - ctx.cvs_module = "" - while prev_cvs_repository != ctx.cvs_repository: - if os.path.isdir(os.path.join(ctx.cvs_repository, 'CVSROOT')): - break - prev_cvs_repository = ctx.cvs_repository - ctx.cvs_repository, module_component = os.path.split(ctx.cvs_repository) - ctx.cvs_module = module_component + "/" + ctx.cvs_module - else: - # Hit the root (of the drive, on Windows) without finding a CVSROOT dir. - sys.stderr.write(error_prefix + - ": the path '%s' is not a CVS repository, nor a path " \ - "within a CVS repository. A CVS repository contains " \ - "a CVSROOT directory within its root directory.\n" \ - % ctx.cvsroot) - sys.exit(1) - os.environ['CVSROOT'] = ctx.cvs_repository - if (not ctx.target) and (not ctx.dump_only) and (not ctx.dry_run): sys.stderr.write(error_prefix + ": must pass one of '-s' or '--dump-only'.\n") @@ -4772,28 +4887,6 @@ % ctx.tmpdir) sys.exit(1) - if ctx.use_cvs: - def cvs_ok(): - pipe = SimplePopen([ 'cvs' ] + Ctx().cvs_global_arguments + \ - [ '--version' ], True) - pipe.stdin.close() - pipe.stdout.read() - errmsg = pipe.stderr.read() - status = pipe.wait() - ok = len(errmsg) == 0 and status == 0 - return (ok, status, errmsg) - - ctx.cvs_global_arguments = [ "-q", "-R" ] - ok, cvs_exitstatus, cvs_errmsg = cvs_ok() - if not ok: - ctx.cvs_global_arguments = [ "-q" ] - ok, cvs_exitstatus, cvs_errmsg = cvs_ok() - - if not ok: - sys.stderr.write(error_prefix + - ": error executing CVS: status %s, error output:\n" \ - % (cvs_exitstatus) + cvs_errmsg) - # But do lock the tmpdir, to avoid process clash. 
try:
  os.mkdir(os.path.join(ctx.tmpdir, 'cvs2svn.lock'))

From dberlin at dberlin.org Mon Aug 15 00:09:04 2005
From: dberlin at dberlin.org (Daniel Berlin)
Date: Sun, 14 Aug 2005 18:09:04 -0400
Subject: [Python-Dev] Fwd: PEP: Migrating the Python CVS to Subversion
In-Reply-To:
References: <1123986783.21455.35.camel@linux.site>
Message-ID: <1124057345.25267.72.camel@linux.site>

On Sun, 2005-08-14 at 11:13 -0700, Guido van Rossum wrote:
> Here's another POV. (Why does everybody keep emailing me personally?)
>

Because we love you, and I forgot to cc python-dev.

From gustavo at niemeyer.net Mon Aug 15 00:14:39 2005
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Sun, 14 Aug 2005 19:14:39 -0300
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FFBBEF.5060202@v.loewis.de>
References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> <42FF6E4B.4000206@v.loewis.de> <20050814171259.GA8200@mems-exchange.org> <42FFBBEF.5060202@v.loewis.de>
Message-ID: <20050814221439.GB11278@burma.localdomain>

> I had problems finding the place where the bazaar-NG source code
> repository is stored - is there a public access to the HEAD version?

You may use rsync:

  rsync -av --delete bazaar-ng.org::bazaar-ng/bzr/bzr.dev .

Or bzr itself:

  bzr branch http://bazaar-ng.org/bzr/bzr.dev

Regards,

--
Gustavo Niemeyer
http://niemeyer.net

From martin at v.loewis.de Mon Aug 15 00:15:30 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 15 Aug 2005 00:15:30 +0200
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <1124057259.25267.70.camel@linux.site>
References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> <42FF6E4B.4000206@v.loewis.de> <20050814171259.GA8200@mems-exchange.org> <1124057259.25267.70.camel@linux.site>
Message-ID: <42FFC282.9060307@v.loewis.de>

Daniel Berlin wrote:
> I'm not sure how big python's repo is, but you probably want to use the
> attached patch to speed up cvs2svn. It changes it to reconstruct the
> revisions on its own instead of calling cvs or rcs.

Thanks for the patch, but cvs2svn works fairly well for us as is (in
the version that was released with Debian sarge); see

http://www.python.org/peps/pep-0347.html

for the conversion procedure. On the machine where I originally did
the conversion, the script required 7h; on my current machine, it is
done in 1:40 or so, which is acceptable.

Out of curiosity: do you use the --cvs-revnums parameter? Should we?

Regards,
Martin

From dberlin at dberlin.org Mon Aug 15 00:25:02 2005
From: dberlin at dberlin.org (Daniel Berlin)
Date: Sun, 14 Aug 2005 18:25:02 -0400
Subject: [Python-Dev] Fwd: PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42FFBE96.7040000@v.loewis.de>
References: <1123986783.21455.35.camel@linux.site> <42FFBE96.7040000@v.loewis.de>
Message-ID: <1124058303.25267.88.camel@linux.site>

On Sun, 2005-08-14 at 23:58 +0200, "Martin v. Löwis" wrote:
> Guido van Rossum wrote:
> > Here's another POV.
>
> I think I agree with Daniel's view, in particular wrt. performance.
> Whatever the replacement tool, it should perform as well or better
> than CVS currently does; it also shouldn't perform much worse than
> subversion.

Then, in fairness, I should note that annotate is slower on subversion (and monotone, and anything using binary deltas) than CVS. This is because you can't generate the line diffs that annotate wants from binary copy + add diffs. You have to reconstruct the actual revisions and then line diff them. Thus, CVS is O(N) here, and SVN and other binary delta users are O(N^2).

You wouldn't really notice the speed difference when you are annotating a file with 100 revisions. You would if you annotate the 800k changelog which has 30k trunk revisions. CVS takes 4 seconds, svn takes ~5 minutes, the whole time being spent in doing diffs of those revisions.

I rewrote the blame algorithm recently so that it will only take about 2 minutes on changelog, but it cheats because it knows it can stop early because it's blamed all the revisions (since our changelog rotates).

For those curious, you also can't directly generate "always-correct" byte-level differences from the diffs, since their goal is to find the most space efficient way to transform rev old into rev new, *not* record actual byte-level changes that occurred between old and new. It may turn out that doing an add of 2 bytes is cheaper than specifying the opcode for copy(start,len).

Actual diffs are produced by reproducing the texts and line diffing them. Such is the cost of efficient storage :).

>
> I've been using git (or, rather, cogito) to keep up-to-date with the
> Linux kernel. While performance of git is really good, storage
> requirements are *quite* high, and initial "checkout" takes a long
> time - even though the Linux kernel repository stores virtually no
> history (there was a strict cut when converting the bitkeeper HEAD).
> So these distributed tools would cause quite some disk consumption
> on client machines. bazaar-ng apparently supports only-remote
> repositories as well, so that might be no concern.

The argument "network and disk is cheap" doesn't work for us when you are talking 5-10 gigabytes of initial transfer :). However, I doubt it's more than a hundred meg or so for python, if that.

You may run into these problems in 10 years :)

From mwh at python.net Mon Aug 15 00:31:33 2005
From: mwh at python.net (Michael Hudson)
Date: Sun, 14 Aug 2005 23:31:33 +0100
Subject: [Python-Dev] build problems on macosx (CVS HEAD)
In-Reply-To: <42FFBB1A.8060206@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Sun, 14 Aug 2005 23:43:54 +0200")
References: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com> <42FFBB1A.8060206@v.loewis.de>
Message-ID: <2mfytcup4q.fsf@starship.python.net>

"Martin v. Löwis" writes:

> Ronald Oussoren wrote:
>> I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a
>> checkout that is less than two hours old. I'm building a standard
>> unix tree (no framework install):
>
> I just committed what I think is a bugfix for the recent st_gen support.
> Unfortunately, I can't try the code, since I don't have access to
> BSD/OSX at the moment.
>
> So please report whether there is any change in behaviour.

Seems to have done the trick, thanks.

Cheers,
mwh

--
  I just had a very odd phone call from a researcher with the french TV station "TF1" asking about inflatable football referees
  -- from Twisted.Quotes

From dberlin at dberlin.org Mon Aug 15 00:32:31 2005
From: dberlin at dberlin.org (Daniel Berlin)
Date: Sun, 14 Aug 2005 18:32:31 -0400
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FFC282.9060307@v.loewis.de>
References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> <42FF6E4B.4000206@v.loewis.de> <20050814171259.GA8200@mems-exchange.org> <1124057259.25267.70.camel@linux.site> <42FFC282.9060307@v.loewis.de>
Message-ID: <1124058751.25267.94.camel@linux.site>

On Mon, 2005-08-15 at 00:15 +0200, "Martin v. Löwis" wrote:
> Daniel Berlin wrote:
> > I'm not sure how big python's repo is, but you probably want to use the
> > attached patch to speed up cvs2svn. It changes it to reconstruct the
> > revisions on its own instead of calling cvs or rcs.
>
> Thanks for the patch, but cvs2svn works fairly well for us as is (in
> the version that was released with Debian sarge); see
>
> http://www.python.org/peps/pep-0347.html
>
> for the conversion procedure. On the machine where I originally did
> the conversion, the script required 7h; on my current machine, it is
> done in 1:40 or so, which is acceptable.
>
> Out of curiosity: do you use the --cvs-revnums parameter? Should we?

No. In our case, it doesn't buy us anything. In the name of continuity, we have to make the old cvsweb urls work with new viewcvs urls anyway (they appear in bug reports, etc). We also don't want to destroy the ability for people to diff existing cvs working copies. I might have been able to hack something around with cvs-revnums, but not easily.

Thus, we are just going to keep a readonly version of the repo around, and a readonly cvsweb, with a warning at the top of the page that the current source is stored in subversion.
>
> Regards,
> Martin

From martin at v.loewis.de Mon Aug 15 00:33:04 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 15 Aug 2005 00:33:04 +0200
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <20050814221439.GB11278@burma.localdomain>
References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> <42FF6E4B.4000206@v.loewis.de> <20050814171259.GA8200@mems-exchange.org> <42FFBBEF.5060202@v.loewis.de> <20050814221439.GB11278@burma.localdomain>
Message-ID: <42FFC6A0.9080509@v.loewis.de>

Gustavo Niemeyer wrote:
> You may use rsync:
>
> rsync -av --delete bazaar-ng.org::bazaar-ng/bzr/bzr.dev .
>
> Or bzr itself:
>
> bzr branch http://bazaar-ng.org/bzr/bzr.dev

Ah, thanks. Fetching it with rsync is so much faster than fetching it with bzr, though...

Regards,
Martin

From martin at v.loewis.de Mon Aug 15 00:37:31 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 15 Aug 2005 00:37:31 +0200
Subject: [Python-Dev] Fwd: PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1124058303.25267.88.camel@linux.site>
References: <1123986783.21455.35.camel@linux.site> <42FFBE96.7040000@v.loewis.de> <1124058303.25267.88.camel@linux.site>
Message-ID: <42FFC7AB.6000201@v.loewis.de>

Daniel Berlin wrote:
> The argument "network and disk is cheap" doesn't work for us when you
> are talking 5-10 gigabytes of initial transfer :). However, I doubt
> it's more than a hundred meg or so for python, if that.
>
> You may run into these problems in 10 years :)

I don't know how bazaar-ng would perform - but the converted fsfs svn repository is 718MiB. Of course, in 10 years, 5-10GiB of network transfer will be cheap :-)

Regards,
Martin

From lalo at exoweb.net Mon Aug 15 03:58:58 2005
From: lalo at exoweb.net (Lalo Martins)
Date: Mon, 15 Aug 2005 09:58:58 +0800
Subject: [Python-Dev] cvs to bzr?
In-Reply-To: <17150.31637.180169.877441@montanaro.dyndns.org>
References: <17150.31637.180169.877441@montanaro.dyndns.org>
Message-ID:

And so says skip at pobox.com on 14/08/05 07:00...
> Based on the message Guido forwarded, I installed bazaar-ng. From Mark's
> note it seems they convert cvs repositories to bzr repositories, but I
> didn't see any mention in the bzr docs of any sort of cvs2bzr tool.
> Likewise, Google didn't turn up anything obvious. Anyone know of something?

Just for the sake of fairness - Mark's email states that they convert cvs repositories to baz (Bazaar 1.x), not to bzr (Bazaar-NG, soon-to-be Bazaar 2.x). The tools to convert to bzr are not yet mature, as bzr itself just recently started to solidify. (The pace of development is one of my favorite "features" about bzr; it's a testament to python and to bzr itself.)

You can, however, convert from CVS to baz (arch), and from there to bzr.

best,
Lalo Martins
--
So many of our dreams at first seem impossible, then they seem improbable, and then, when we summon the will, they soon become inevitable.
--
http://www.exoweb.net/ mailto:lalo at exoweb.net
GNU: never give up freedom http://www.gnu.org/

From skip at pobox.com Mon Aug 15 04:39:41 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 14 Aug 2005 21:39:41 -0500
Subject: [Python-Dev] cvs to bzr?
In-Reply-To:
References: <17150.31637.180169.877441@montanaro.dyndns.org>
Message-ID: <17152.109.883835.190683@montanaro.dyndns.org>

    Lalo> You can, however, convert from CVS to baz (arch), and from there
    Lalo> to bzr.

Would this be with cscvs? According to the cscvs wiki page at

    http://wiki.gnuarch.org/cscvs

cscvs is currently unmaintained and can't handle repositories with branches. In addition, it appears that to do a one-time conversion from cvs to bzr I will need to also install arch and baz as well as any other packages they depend on.

Skip

From greg at electricrain.com Mon Aug 15 05:34:49 2005
From: greg at electricrain.com (Gregory P. Smith)
Date: Sun, 14 Aug 2005 20:34:49 -0700
Subject: [Python-Dev] request for code review - hashlib - patch #1121611
Message-ID: <20050815033449.GW16043@zot.electricrain.com>

https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&group_id=5470&atid=305470

This is the hashlib module that speeds up python's md5 and sha1 support by using openssl (when available) as well as adding sha224/256 + sha384/512 support (plus anything openssl provides).

I believe it is complete and ready to commit (hashlib-009.patch), any objections?

compiled docs in html are here for easy perusal:

 http://electricrain.com/greg/hashlib-py25-doc/

thanks,
greg

From t-meyer at ihug.co.nz Mon Aug 15 05:46:19 2005
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Mon, 15 Aug 2005 15:46:19 +1200
Subject: [Python-Dev] python-dev Summary for 2005-07-16 through 2005-07-31 [draft]
Message-ID:

Here's July Part Two. As usual, if anyone can spare the time to proofread this (it's fairly short this fortnight!), that would be great! Please send any corrections or suggestions to Tim (tlesher at gmail.com), Steve (steven.bethard at gmail.com) and/or me, rather than cluttering the list. Ta!

=============
Announcements
=============

-------------------------------------------------
PyPy Sprint in Heidelberg 22nd - 29th August 2005
-------------------------------------------------

Heidelberg University in Germany will host a PyPy_ sprint from 22nd August to 29th August. The sprint will push towards the 0.7 release of PyPy_ which hopes to reach Python 2.4.1 compliance and to have full, direct translation into a low level language, instead of reinterpretation through CPython. If you'd like to help out, this is a great place to start!

For more information, see PyPy's `Heidelberg sprint`_ page.

.. _PyPy: http://codespeak.net/pypy
.. _Heidelberg sprint: http://codespeak.net/pypy/index.cgi?extradoc/sprintinfo/Heidelberg-sprint.html

Contributing thread:

- `Next PyPy sprint: Heidelberg (Germany), 22nd-29th of August `__

--------------------------------
zlib 1.2.3 in Python 2.4 and 2.5
--------------------------------

Trent Mick supplied a patch for updating Python from zlib 1.2.1 to zlib 1.2.3, which eliminates some potential security vulnerabilities. Python will move to this new version of zlib in both the maintenance 2.4 branch and the main (2.5) branch.

Contributing thread:

- `zlib 1.2.3 is just out `__

=========
Summaries
=========

-------------------------------
Moving Python CVS to Subversion
-------------------------------

Martin v. Löwis submitted `PEP 347`_, covering changing from CVS to SVN for source code revision control of the Python repository, and moving from hosting the repository on sourceforge.net to python.org.

Moving to SVN from CVS met with general favour from most people, although most were undecided about moving from sourceforge.net to python.org. The additional administration requirements of the move were the primary concern, and moving to an alternative host was suggested. Martin is open to including suggestions for alternative hosts in the PEP, but is not interested in carrying out such research himself; as such, if alternative hosts are to be included, someone needs to volunteer to collect all the required information and submit it to Martin.

Discussion about the conversion and the move is continuing in August.

.. _PEP 347: http://www.python.org/peps/pep-0347.html

Contributing thread:

- `PEP: Migrating the Python CVS to Subversion `__

---------------------------------
Exception Hierarchy in Python 3.0
---------------------------------

Brett Cannon posted the first draft of `PEP 348`_, covering reorganisation of exceptions in Python 3.0.
The initial draft included major changes to the hierarchy, requiring any object raised to inherit from a certain superclass, and changing bare 'except' clauses to catch a specific superclass. The latter two proposals didn't generate much comment (although Guido vacillated between removing bare 'except' clauses and not), but the proposed hierarchy organisation and renaming was hotly discussed.

Nick Coghlan countered each revision of Brett's maximum-changes PEP with a minimum-changes PEP, each evolving through python-dev discussion, and gradually moving to an acceptable middle ground. At present, it seems that the changes will be much more minor than the original proposal.

The thread branched off into comments about `Python 3.0`_ changes in general. The consensus was generally that although backwards compatibility isn't required in Python 3.0, it should only be broken when there is a clear reason for it, and that, as much as possible, Python 3.0 should be Python 2.9 without a lot of backwards compatibility code. A number of people indicated that they were reasonably content with the existing exception hierarchy, and didn't feel that major changes were required.

Guido suggested that a good principle for determining the ideal exception hierarchy is whether there's a use case for catching the common base class. Marc-Andre Lemburg pointed out that when migrating code, changes in exception names are reasonably easy to automate, but changes in the inheritance tree are much more difficult.

Many exceptions were discussed at length (e.g. WindowsError, RuntimeError), with debate about whether they should continue to exist in Python 3.0, be renamed, or be removed. The PEP contains the current status for each of these exceptions.

The PEP evolution and discussion are still continuing in August, and since this is for Python 3.0, are likely to be considered open for some time yet.

.. _Python 3.0: http://www.python.org/peps/pep-3000.html
.. _PEP 348: http://www.python.org/peps/pep-0348.html

Contributing thread:

- `Pre-PEP: Exception Reorganization for Python 3.0 `__

-----------------------------------------
Docstrings and the Official Documentation
-----------------------------------------

A new `bug report`_ pointed out that the docstring help for cgi.escape was not as detailed as that in the full documentation, prompting Skip Montanaro to ask whether this should be the case or not. Several reasons were outlined why docstrings should be more of a "quick reference card" than a "textbook" (i.e. maintain the status quo). Tim Peters suggested that tools to extract text from the full documentation would be a more sensible method of making the "textbook" available from help()/pydoc; if anyone is interested, then this would probably be the best way to start implementing this.

.. _bug report: http://python.org/sf/1243553

Contributing thread:

- `should doc string content == documentation content? `__

---------------------------
Syntax suggestion: "while:"
---------------------------

Martin Blais suggested "while:" as a syntactic shortcut for "while True:". The suggestion was shot down pretty quickly; not only is "while:" less explicit than "while True:", but it introduces readability problems for the apparently large number of people who, when reading "while:", immediately think "while what?"

Contributing thread:

- `while: `__

------------------
Sets in Python 2.5
------------------

In Python 2.4, there is no C API for the built-in set type; you must use PyObject_Call(), etc. as you would in accessing other Python objects. However, in Python 2.5, Raymond Hettinger plans to introduce a C API along with a new implementation of the set type that uses its own data structure instead of forwarding everything to dicts.

Contributing thread:

- `C api for built-in type set? `__

===============
Skipped Threads
===============

- `Some RFE for review `__
- `python/dist/src/Doc/lib emailutil.tex,1.11,1.12 `__
- `read only files `__
- `builtin filter function `__
- `Weekly Python Patch/Bug Summary `__
- `Information request; Keywords: compiler compiler, EBNF, python, ISO 14977 `__
- `installation of python on a Zaurus `__
- `python-dev summary for 2005-07-01 to 2005-07-15 [draft] `__
- `math.fabs redundant? `__

=================================================
Skipped Threads (covered in the previous summary)
=================================================

- `'With' context documentation draft (was Re: Terminology for PEP 343 `__
- `Adding the 'path' module (was Re: Some RFE for review) `__
- `[C++-sig] GCC version compatibility `__

From bcannon at gmail.com Mon Aug 15 06:35:02 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sun, 14 Aug 2005 21:35:02 -0700
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
Message-ID:

I am sure people mainly care about the big changes introduced by revision 1.8 of the PEP (http://www.python.org/peps/pep-0348.html). So, first is that WindowsError is staying. Enough people want it to stay and have a legitimate use that I removed the proposal to ditch it.

Second, I changed the bare 'except' proposal again to recommend its removal. I had been feeling they should just go for about a week, but I solidified my thinking when I was talking with Alex and Anna Martelli and managed to convince them bare 'except's should go after Alex initially thought they should be changed to be ``except Exception``. This obviously goes against what Guido last said he wanted, but I hope I can convince him to get rid of bare 'except's.

Minor stuff is fleshing out the arguments for TerminatingException (I am sure Raymond loves that I am leaving this part in =) and adding a Roadmap for the transition.

-Brett

From miha at mpi-magdeburg.mpg.de Mon Aug 15 08:22:10 2005
From: miha at mpi-magdeburg.mpg.de (Michael Krasnyk)
Date: Mon, 15 Aug 2005 08:22:10 +0200
Subject: [Python-Dev] SWIG and rlcompleter
Message-ID: <43003492.8060904@mpi-magdeburg.mpg.de>

Hello all,

Recently I've found that rlcompleter does not work correctly with SWIG generated classes. In some cases dir(object) contains not only strings, but also the type of the object, something like . And the condition "word[:n] == attr" throws an exception. Is it possible to solve this problem with the following patch?

--- cut ---
--- rlcompleter.py.org 2005-08-14 13:02:02.000000000 +0200
+++ rlcompleter.py 2005-08-14 13:18:59.000000000 +0200
@@ -136,8 +136,11 @@
         matches = []
         n = len(attr)
         for word in words:
-            if word[:n] == attr and word != "__builtins__":
-                matches.append("%s.%s" % (expr, word))
+            try:
+                if word[:n] == attr and word != "__builtins__":
+                    matches.append("%s.%s" % (expr, word))
+            except:
+                pass
         return matches

 def get_class_members(klass):
--- cut ---

Thanks in advance,
Michael Krasnyk

From stephen.thorne at gmail.com Mon Aug 15 08:40:22 2005
From: stephen.thorne at gmail.com (Stephen Thorne)
Date: Mon, 15 Aug 2005 16:40:22 +1000
Subject: [Python-Dev] string_join overrides TypeError exception thrown in generator
Message-ID: <3e8ca5c8050814234020735adf@mail.gmail.com>

Hi,

An interesting problem was pointed out to me, which I have distilled to this testcase:

def gen():
    raise TypeError, "I am a TypeError"
    yield 1

def one():
    return ''.join( x for x in gen() )

def two():
    return ''.join([x for x in gen()])

for x in one, two:
    try:
        x()
    except TypeError, e:
        print e

Expected output is:
"""
I am a TypeError
I am a TypeError
"""

Actual output is:
"""
sequence expected, generator found
I am a TypeError
"""

Upon looking at the implementation of 'string_join' in stringobject.c[1], it's quite obvious what's gone wrong: an exception has been triggered in PySequence_Fast, and string_join overrides that exception, assuming that the
only TypeErrors thrown by PySequence_Fast are caused by 'orig' being a value that was an invalid sequence type, ignoring the possibility that a TypeError could be thrown by exhausting a generator.

    seq = PySequence_Fast(orig, "");
    if (seq == NULL) {
        if (PyErr_ExceptionMatches(PyExc_TypeError))
            PyErr_Format(PyExc_TypeError,
                         "sequence expected, %.80s found",
                         orig->ob_type->tp_name);
        return NULL;
    }

I can't see an obvious solution, but perhaps generators should get special treatment regardless. Reading over this code it looks like the generator is exhausted all at once, instead of incrementally..

--
Stephen Thorne
Development Engineer

[1] http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Objects/stringobject.c?rev=2.231&view=markup

From benji at benjiyork.com Mon Aug 15 13:30:36 2005
From: benji at benjiyork.com (Benji York)
Date: Mon, 15 Aug 2005 07:30:36 -0400
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FF6E4B.4000206@v.loewis.de>
References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> <42FF6E4B.4000206@v.loewis.de>
Message-ID: <43007CDC.6060200@benjiyork.com>

Martin v. Löwis wrote:
> skip at pobox.com wrote:
>>Granted. What is the cost of waiting a bit longer to see if it (or
>>something else) gets more usable and would hit the mark better than svn?
>
> It depends on what "a bit" is. Waiting a month would be fine; waiting
> two years might be pointless.

This might be too convoluted to consider, but I thought I might throw it out there. We use svn for our repositories, but I've taken to also using bzr so I can do local commits and reversions (within a particular svn reversion). I can imagine expanding that usage to sharing branches and such via bzr (or mercurial, which looks great), but keeping the trunk in svn.
-- Benji York From ncoghlan at gmail.com Mon Aug 15 14:28:09 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 15 Aug 2005 22:28:09 +1000 Subject: [Python-Dev] string_join overrides TypeError exception thrown in generator In-Reply-To: <3e8ca5c8050814234020735adf@mail.gmail.com> References: <3e8ca5c8050814234020735adf@mail.gmail.com> Message-ID: <43008A59.2070306@gmail.com> Stephen Thorne wrote: > I can't see an obvious solution, but perhaps generators should get > special treatment regardless. Reading over this code it looks like the > generator is exhausted all at once, instead of incrementally.. Indeed - str.join uses a multipass approach to build the final string, so it needs to ensure it has a reiterable to play with. PySequence_Fast achieves that, at the cost of dumping a generator into a sequence rather than building a string from it directly. Unicode.join uses PySequence_Fast too, and has the same problem with masking the TypeError from the generator. The calling code simply can't tell if the NULL return was set directly by PySequence_Fast, or was relayed by PySequence_List (which got it from _PyList_Extend, which got it from listextend, which got it from iternext, etc). 
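The masking described above can be imitated in pure Python. The following is only a rough sketch, written for a current Python 3 interpreter rather than the 2.4-era one under discussion; `fast_sequence` and `join_like` are hypothetical stand-ins for PySequence_Fast and string_join, not real APIs.

```python
def fast_sequence(obj):
    """Toy stand-in for PySequence_Fast: materialise obj as a list."""
    # A TypeError raised *inside* a generator escapes from this call too,
    # and looks just like the TypeError for a non-iterable argument.
    return list(obj)


def join_like(obj):
    """Toy stand-in for the error handling in string_join."""
    try:
        seq = fast_sequence(obj)
    except TypeError:
        # Mirrors the C code: every TypeError is assumed to mean
        # "the argument was not a sequence", so the message is replaced.
        raise TypeError("sequence expected, %.80s found" % type(obj).__name__)
    return "".join(seq)


def gen():
    raise TypeError("I am a TypeError")
    yield 1


try:
    join_like(gen())
except TypeError as exc:
    message = str(exc)
    # Python 3 records the exception that was being handled when the
    # new one was raised; the replacement message itself still hides it.
    original = exc.__context__
```

Here `message` holds the generic "sequence expected" text, and the generator's own error survives only as the chained `original`; a caller looking at the message alone cannot tell the two TypeErrors apart.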
This is the kind of problem that PEP 344 is designed to solve :)

This also shows that argument validation is one of the cases where using an iterable instead of a generator is a good thing, since errors get raised where the generator is created, instead of where it is first used:

class gen(object):
    def __init__(self):
        raise TypeError, "I am a TypeError"
    def __iter__(self):
        yield 1

def one():
    return ''.join( x for x in gen() )

def two():
    return ''.join([x for x in gen()])

for x in one, two:
    try:
        x()
    except TypeError, e:
        print e

Hmm, makes me think of a neat little decorator:

def step_on_creation(gen):
    def start_gen(*args, **kwds):
        g = gen(*args, **kwds)
        g.next()
        return g
    start_gen.__name__ = gen.__name__
    start_gen.__doc__ = gen.__doc__
    start_gen.__dict__ = gen.__dict__
    return start_gen

@step_on_creation
def gen():
    # Setup executed at creation time
    raise TypeError, "I am a TypeError"
    yield None
    # The actual iteration steps
    yield 1

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.blogspot.com

From raymond.hettinger at verizon.net Mon Aug 15 15:16:47 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon, 15 Aug 2005 09:16:47 -0400
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
In-Reply-To:
Message-ID: <001601c5a19b$9b2abf80$af26c797@oemcomputer>

[Brett]
> This obviously goes against what Guido last said he
> wanted, but I hope I can convince him to get rid of bare 'except's.

-1 on eliminating bare excepts. This unnecessarily breaks tons of code without offering ANY compensating benefits. There are valid use cases for this construct. It is completely Pythonic to have bare keywords apply a useful default as an aid to readability and ease of coding.

+1 on the new BaseException

+1 on moving NotImplementedError, SystemExit, and KeyboardInterrupt.
-1 on replacing "except (KeyboardInterrupt, SystemExit)" with "except TerminatingException". 1) Grepping existing code bases shows that these two are almost never caught together so it is a bit silly to introduce a second way to do it. 2) Efforts to keep the builtin namespace compact argue against adding a new builtin that will almost never be used. 3) The change unnecessarily sacrifices flatness, making the language more difficult to learn. 4) The "self-documenting" rationale is weak -- if needed, a two-word comment would suffice. Existing code almost never has had to comment on catching multiple exceptions -- the exception tuple itself has been sufficiently obvious and explicit. This rationale assumes that code readers aren't smart enough to infer that SystemExit has something to do with termination. Raymond From phd at mail2.phd.pp.ru Mon Aug 15 15:33:33 2005 From: phd at mail2.phd.pp.ru (Oleg Broytmann) Date: Mon, 15 Aug 2005 17:33:33 +0400 Subject: [Python-Dev] PEP 348 (exception reorg) revised again In-Reply-To: <001601c5a19b$9b2abf80$af26c797@oemcomputer> References: <001601c5a19b$9b2abf80$af26c797@oemcomputer> Message-ID: <20050815133333.GA17966@phd.pp.ru> On Mon, Aug 15, 2005 at 09:16:47AM -0400, Raymond Hettinger wrote: > It is completely Pythonic to have bare keywords > apply a useful default as an aid to readability and ease of coding. Bare "while:" was rejected because of "while WHAT?!". Bare "except:" does not cause "except WHAT?!" reaction. Isn't it funny?! (-: Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
From raymond.hettinger at verizon.net Mon Aug 15 15:47:54 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon, 15 Aug 2005 09:47:54 -0400 Subject: [Python-Dev] PEP 348 (exception reorg) revised again In-Reply-To: <20050815133333.GA17966@phd.pp.ru> Message-ID: <001801c5a19f$f43ae6a0$af26c797@oemcomputer> > > It is completely Pythonic to have bare keywords > > apply a useful default as an aid to readability and ease of coding. [Oleg] > Bare "while:" was rejected because of "while WHAT?!". Bare "except:" > does not cause "except WHAT?!" reaction. Isn't it funny?! (-: It's both funny and interesting. It raises the question of what makes the two different -- why is one instantly recognizable and why does the other trigger a gag reflex. My thought is that bare excepts occur in a context that makes their meaning clear: try: block() except SpecificException: se_handler() except: handle_everything_else() The pattern of use is similar to a "default" in a switch-case construct. Viewed out-of-context, one would ask "default WHAT". Viewed after a series of case statements, the meaning is vividly clear. Raymond From tdickenson at devmail.geminidataloggers.co.uk Mon Aug 15 16:48:11 2005 From: tdickenson at devmail.geminidataloggers.co.uk (Toby Dickenson) Date: Mon, 15 Aug 2005 15:48:11 +0100 Subject: [Python-Dev] PEP 348 (exception reorg) revised again In-Reply-To: <001601c5a19b$9b2abf80$af26c797@oemcomputer> References: <001601c5a19b$9b2abf80$af26c797@oemcomputer> Message-ID: <200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk> On Monday 15 August 2005 14:16, Raymond Hettinger wrote: > -1 on replacing "except (KeyboardInterrupt, SystemExit)" with "except > TerminatingException". The rationale for including TerminatingException in the PEP would also be satisfied by having a TerminatingExceptions tuple (in the exceptions module?). 
It makes sense to express the classification of exceptions that are intended to terminate the interpreter, but we don't need to express that classification as inheritance.

--
Toby Dickenson

From nick.bastin at gmail.com Mon Aug 15 18:27:36 2005
From: nick.bastin at gmail.com (Nicholas Bastin)
Date: Mon, 15 Aug 2005 12:27:36 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F6F61B.1080505@v.loewis.de>
References: <42E93940.6080708@v.loewis.de> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> <42F6F61B.1080505@v.loewis.de>
Message-ID: <66d0a6e1050815092760a2dab3@mail.gmail.com>

On 8/8/05, "Martin v. Löwis" wrote:
> Nicholas Bastin wrote:
> > It's a mature product. I would hope that that would count for
> > something.
>
> Sure. But so is subversion.

I will then assume that you and I have different ideas of what 'mature' means.

> So I should then remove your offer to host a perforce installation,
> as you never made such an offer, right?

Correct. .

> Yes. That's what this PEP is for. So I guess you are -1 on the
> PEP.

Not completely. More like -0 at the moment. We need a better system, but I think we shouldn't just pick a system because it's the one the PEP writer preferred - there should be some sort of effort to test a few systems (including bug trackers). I know this is work, but this isn't just something we can change easily again later.
-- Nick From ronaldoussoren at mac.com Mon Aug 15 09:15:59 2005 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Mon, 15 Aug 2005 09:15:59 +0200 Subject: [Python-Dev] build problems on macosx (CVS HEAD) In-Reply-To: <42FFBB1A.8060206@v.loewis.de> References: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com> <42FFBB1A.8060206@v.loewis.de> Message-ID: <1A6CE290-CD7F-4AEC-B649-39CB4ED57E8B@mac.com> On 14-aug-2005, at 23:43, Martin v. L?wis wrote: > Ronald Oussoren wrote: > >> I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a >> checkout that is less than two hours old. I'm building a standard >> unix tree (no framework install): >> > > I just committed what I think is a bugfix for the recent st_gen > support. > Unfortunately, I can't try the code, since I don't have access to > BSD/OSX at the moment. > > So please report whether there is any change in behaviour. Your change has fixed this issue. Thanks for the quick response, Ronald > > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ > ronaldoussoren%40mac.com > From gvanrossum at gmail.com Mon Aug 15 19:08:02 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon, 15 Aug 2005 10:08:02 -0700 Subject: [Python-Dev] PEP 348 (exception reorg) revised again In-Reply-To: <001601c5a19b$9b2abf80$af26c797@oemcomputer> References: <001601c5a19b$9b2abf80$af26c797@oemcomputer> Message-ID: I'm with Raymond here. On 8/15/05, Raymond Hettinger wrote: > [Brett] > > This obviously goes against what Guido last said he > > wanted, but I hope I can convince him to get rid of bare 'except's. > > -1 on eliminating bare excepts. This unnecessarily breaks tons of code > without offering ANY compensating benefits. There are valid use cases > for this construct. 
It is completely Pythonic to have bare keywords > apply a useful default as an aid to readability and ease of coding. > > +1 on the new BaseException > > +1 on moving NotImplementedError, SystemExit, and KeyboardInterrupt. > > -1 on replacing "except (KeyboardInterrupt, SystemExit)" with "except > TerminatingException". 1) Grepping existing code bases shows that these > two are almost never caught together so it is a bit silly to introduce a > second way to do it. 2) Efforts to keep the builtin namespace compact > argue against adding a new builtin that will almost never be used. 3) > The change unnecessarily sacrifices flatness, making the language more > difficult to learn. 4) The "self-documenting" rationale is weak -- if > needed, a two-word comment would suffice. Existing code almost never > has had to comment on catching multiple exceptions -- the exception > tuple itself has been sufficiently obvious and explicit. This rationale > assumes that code readers aren't smart enough to infer that SystemExit > has something to do with termination. > > > > Raymond > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Mon Aug 15 19:11:24 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon, 15 Aug 2005 10:11:24 -0700 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: <43003492.8060904@mpi-magdeburg.mpg.de> References: <43003492.8060904@mpi-magdeburg.mpg.de> Message-ID: (1) Please use the SF patch manager. (2) Please don't propose adding more bare "except:" clauses to the standard library. (3) I think a better patch is to use str(word)[:n] instead of word[:n]. 
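For illustration, suggestion (3) might look roughly like the sketch below. It is modelled on the loop in the quoted patch, not taken from rlcompleter itself; the function name `attr_matches_sketch` and its standalone signature are assumptions made for the example, which is written for a current Python 3 interpreter.

```python
def attr_matches_sketch(expr, words, attr):
    """Return completions of `attr` among `words`, tolerating non-strings.

    Hypothetical standalone version of the loop in the quoted patch;
    the real code lives inside an rlcompleter method.
    """
    matches = []
    n = len(attr)
    for word in words:
        # str(word) lets a stray non-string entry (e.g. a type object
        # returned by dir() on a SWIG class) be compared instead of
        # raising, so no try/except is needed around the test.
        if str(word)[:n] == attr and word != "__builtins__":
            matches.append("%s.%s" % (expr, word))
    return matches


# A non-string entry (here the type `int`) is simply skipped because
# its string form does not start with the requested prefix.
result = attr_matches_sketch("obj", ["spam", "spin", int, "__builtins__"], "sp")
```

Compared with wrapping the comparison in a bare `except:`, this keeps genuine errors visible while still coping with the odd entries.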
On 8/14/05, Michael Krasnyk wrote: > Hello all, > > Recently I've found that rlcompleter does not work correctly with SWIG > generated classes. > In some cases dir(object) containes not only strings, but also type of > the object, smth like . > And condition "word[:n] == attr" throws an exception. > Is it possible to solve this problem with following path? > > --- cut --- > --- rlcompleter.py.org 2005-08-14 13:02:02.000000000 +0200 > +++ rlcompleter.py 2005-08-14 13:18:59.000000000 +0200 > @@ -136,8 +136,11 @@ > matches = [] > n = len(attr) > for word in words: > - if word[:n] == attr and word != "__builtins__": > - matches.append("%s.%s" % (expr, word)) > + try: > + if word[:n] == attr and word != "__builtins__": > + matches.append("%s.%s" % (expr, word)) > + except: > + pass > return matches > > def get_class_members(klass): > --- cut --- > > Thanks in advance, > Michael Krasnyk > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Mon Aug 15 19:36:46 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 15 Aug 2005 10:36:46 -0700 Subject: [Python-Dev] PEP 348 (exception reorg) revised again In-Reply-To: References: <001601c5a19b$9b2abf80$af26c797@oemcomputer> Message-ID: OK, I will take this as BDFL pronouncement that ditching bare 'except's is just not going to happen. Had to try. =) And I will strip out the TerminatingException proposal. -Brett On 8/15/05, Guido van Rossum wrote: > I'm with Raymond here. > > On 8/15/05, Raymond Hettinger wrote: > > [Brett] > > > This obviously goes against what Guido last said he > > > wanted, but I hope I can convince him to get rid of bare 'except's. > > > > -1 on eliminating bare excepts. 
This unnecessarily breaks tons of code > > without offering ANY compensating benefits. There are valid use cases > > for this construct. It is completely Pythonic to have bare keywords > > apply a useful default as an aid to readability and ease of coding. > > > > +1 on the new BaseException > > > > +1 on moving NotImplementedError, SystemExit, and KeyboardInterrupt. > > > > -1 on replacing "except (KeyboardInterrupt, SystemExit)" with "except > > TerminatingException". 1) Grepping existing code bases shows that these > > two are almost never caught together so it is a bit silly to introduce a > > second way to do it. 2) Efforts to keep the builtin namespace compact > > argue against adding a new builtin that will almost never be used. 3) > > The change unnecessarily sacrifices flatness, making the language more > > difficult to learn. 4) The "self-documenting" rationale is weak -- if > > needed, a two-word comment would suffice. Existing code almost never > > has had to comment on catching multiple exceptions -- the exception > > tuple itself has been sufficiently obvious and explicit. This rationale > > assumes that code readers aren't smart enough to infer that SystemExit > > has something to do with termination. 
> > > > > > > > Raymond > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From bcannon at gmail.com Mon Aug 15 19:44:12 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 15 Aug 2005 10:44:12 -0700 Subject: [Python-Dev] PEP 348 (exception reorg) revised again In-Reply-To: <200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk> References: <001601c5a19b$9b2abf80$af26c797@oemcomputer> <200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk> Message-ID: On 8/15/05, Toby Dickenson wrote: > On Monday 15 August 2005 14:16, Raymond Hettinger wrote: > > > -1 on replacing "except (KeyboardInterrupt, SystemExit)" with "except > > TerminatingException". > > The rationale for including TerminatingException in the PEP would also be > satisfied by having a TerminatingExceptions tuple (in the exceptions > module?). It makes sense to express the classification of exceptions that are > intended to terminate the interpreter, but we dont need to express that > classification as inheritence. > While the idea is fine, I just know that the point is going to be brought up that the addition should not be done until experience with the new hierarchy is had. I will add a comment that tuples can be added to the module after enough experience is had, but I am not going to try pushing for this right now. Of course I could be surprised and everyone could support the idea. 
=) -Brett

From dberlin at dberlin.org Mon Aug 15 21:06:03 2005
From: dberlin at dberlin.org (Daniel Berlin)
Date: Mon, 15 Aug 2005 15:06:03 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <66d0a6e1050815092760a2dab3@mail.gmail.com>
References: <42E93940.6080708@v.loewis.de> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> <42F6F61B.1080505@v.loewis.de> <66d0a6e1050815092760a2dab3@mail.gmail.com>
Message-ID: <1124132763.3143.10.camel@MAMBA>

On Mon, 2005-08-15 at 12:27 -0400, Nicholas Bastin wrote:
> On 8/8/05, "Martin v. Löwis" wrote:
> > Nicholas Bastin wrote:
> > > It's a mature product. I would hope that that would count for
> > > something.
> >
> > Sure. But so is subversion.
>
> I will then assume that you and I have different ideas of what 'mature' means.

Bigger projects than Python use it and consider it mature for real use (All the Apache projects, all of KDE, GNOME is planning on switching soon, etc).

I've never seen a corrupted FSFS repo, only corrupted BDB repos, and I will happily grant that using BDB ended up being a big mistake for Subversion. Not one that could have easily been foreseen at the time, but such is life. But this is why FSFS is the default for 1.2+.

I've never seen you post about a corrupted repository to svn-users or svn-dev or file a bug, so I can't say why you see corrupted repositories if they are FSFS ones.
--Dan From Scott.Daniels at Acm.Org Mon Aug 15 21:27:57 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Mon, 15 Aug 2005 12:27:57 -0700 Subject: [Python-Dev] PEP 348 (exception reorg) revised again In-Reply-To: <200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk> References: <001601c5a19b$9b2abf80$af26c797@oemcomputer> <200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk> Message-ID: Toby Dickenson wrote: > On Monday 15 August 2005 14:16, Raymond Hettinger wrote: > > The rationale for including TerminatingException in the PEP would also be > satisfied by having a TerminatingExceptions tuple (in the exceptions > module?). It makes sense to express the classification of exceptions that are > intended to terminate the interpreter, but we dont need to express that > classification as inheritence. > An argument _for_ TerminatingException as a class is that I can define my own subclasses of TerminatingException without forcing it to being a subclass of KeyboardInterrupt or SystemExit. -- Scott David Daniels Scott.Daniels at Acm.Org From gvanrossum at gmail.com Mon Aug 15 21:36:29 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon, 15 Aug 2005 12:36:29 -0700 Subject: [Python-Dev] PEP 348 (exception reorg) revised again In-Reply-To: References: <001601c5a19b$9b2abf80$af26c797@oemcomputer> <200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk> Message-ID: On 8/15/05, Scott David Daniels wrote: > An argument _for_ TerminatingException as a class is that I can > define my own subclasses of TerminatingException without forcing > it to being a subclass of KeyboardInterrupt or SystemExit. And how would that help you? Would your own exceptions be more like SystemExit or more like KeyboardInterrupt, or neither? If you mean them to be excluded by base "except:", you can always subclass BaseException, which exists for this purpose. 
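A small sketch of the point about subclassing BaseException follows. It assumes the hierarchy as it exists in current Python, where it is `except Exception:` that skips such classes (a bare `except:` still catches everything); `ShutdownRequested` and `run` are hypothetical names used only for illustration.

```python
class ShutdownRequested(BaseException):
    """Hypothetical application-defined 'terminating' exception.

    It derives from BaseException directly, so it is a sibling of
    KeyboardInterrupt and SystemExit rather than a subclass of either.
    """


def run(task):
    """Run task(), swallowing ordinary errors but not terminating ones."""
    try:
        task()
    except Exception:
        # Ordinary errors are handled here; anything that derives only
        # from BaseException propagates to the caller.
        return "handled"
    return "finished"


def ordinary_failure():
    raise ValueError("bad input")


def shutdown():
    raise ShutdownRequested()


handled = run(ordinary_failure)

try:
    run(shutdown)
    escaped = False
except ShutdownRequested:
    escaped = True
```

The custom class escapes the generic handler without needing a shared TerminatingException base.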
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Mon Aug 15 22:49:48 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 15 Aug 2005 22:49:48 +0200 Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion In-Reply-To: <66d0a6e1050815092760a2dab3@mail.gmail.com> References: <42E93940.6080708@v.loewis.de> <1122607673.9665.38.camel@geddy.wooz.org> <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp> <1122918723.9680.33.camel@warna.corp.google.com> <42EF2794.1000209@v.loewis.de> <66d0a6e105080312181e25fa08@mail.gmail.com> <42F1AADE.50908@v.loewis.de> <66d0a6e105080718527939aa81@mail.gmail.com> <42F6F61B.1080505@v.loewis.de> <66d0a6e1050815092760a2dab3@mail.gmail.com> Message-ID: <4300FFEC.3090001@v.loewis.de> Nicholas Bastin wrote: > Not completely. More like -0 at the moment. We need a better system, > but I think we shouldn't just pick a system because it's the one the > PEP writer preferred - there should be some sort of effort to test a > few systems (including bug trackers). But that's how the PEP process works: the PEP author is supposed to collect feedback from the community in a fair way, but he is not required to implement every suggestion that the community makes. People who strongly disagree that the entire approach should be taken should write an alternative ("counter") PEP, proposing their strategy. In the end, the BDFL will pronounce which approach (if any) should be implemented. In the specific case, I'm personally not willing to discuss every SCM system out there. If somebody manages to make me curious (as Guido did with the bazaar posts), I will try it out, if I can find an easy way to do so. Your comments about (what was the name again) did not make me curious. As for bug trackers: this PEP is specifically *not* about bug trackers at all. If you think the SourceForge bugtracker should be replaced with something else, write a PEP. 
I really don't see a reasonable alternative to the SF bugtracker. > I know this is work, but this > isn't just something we can change easily again later. I don't bother asking who "we" is, here: apparently not you. Regards, Martin From fperez.net at gmail.com Mon Aug 15 22:48:48 2005 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 15 Aug 2005 14:48:48 -0600 Subject: [Python-Dev] SWIG and rlcompleter References: <43003492.8060904@mpi-magdeburg.mpg.de> Message-ID: Guido van Rossum wrote: > (1) Please use the SF patch manager. > > (2) Please don't propose adding more bare "except:" clauses to the > standard library. > > (3) I think a better patch is to use str(word)[:n] instead of word[:n]. Sorry to jump in, but this same patch was proposed for ipython, and my reply was that it appeared to me as a SWIG bug. From: http://www.python.org/doc/2.4.1/lib/built-in-funcs.html the docs for dir() seem to suggest that dir() should only return strings (I am inferring that from things like 'The resulting list is sorted alphabetically'). The docs are not fully explicit on this, though. Am I interpreting the docs correctly, case in which this should be considered a SWIG bug? Or is it OK for objects to stuff non-strings in __dict__, case in which SWIG is OK and then rlcompleter (and the corresponding system in ipython) do need to protect against this situation. I'd appreciate a clarification here, so I can close my own ipython bug report as well. Thanks, f From bos at serpentine.com Mon Aug 15 23:04:53 2005 From: bos at serpentine.com (Bryan O'Sullivan) Date: Mon, 15 Aug 2005 14:04:53 -0700 Subject: [Python-Dev] On distributed vs centralised SCM for Python Message-ID: <1124139893.20124.29.camel@localhost.localdomain> Pardon me for coming a little late to the SCM discussion, but I thought I would throw a few comments in. A little background: I've used Perforce, CVS, Subversion and BitKeeper for a number of years. Currently, I hack on Mercurial . 
However, I'm not here to try and specifically push Mercurial, but rather to bring up a few points that I haven't seen made in the earlier discussions. The biggest distinguishing factor between centralised and decentralised SCMs is the kinds of interaction they permit between the core developer community and outsiders. The centralised SCM tools all create a wall between core developers (i.e. people with commit access to the central repository) and people who are on the fringes. Outsiders may be able to get anonymous read-only access, but they are left up to their own devices if they want to make changes that they would like to contribute back to the project. With centralised tools, any revision control that outsiders do must be ad-hoc in nature, and they cannot share their changes in a natural way (i.e. preserving revision history) with anyone else. I do not follow Python development closely, so I have no idea how open Python typically is to contributions from people outside the core CVS committers. However, it's worth pointing out that with a distributed SCM - it doesn't really matter which one you use - it is simple to put together a workflow that operates in the same way as a centralised SCM. You lose nothing in the translation. What you gain is several-fold: * Outsiders get to work according to the same terms, and with the same tools, as core developers. * Everyone can perform whatever work they want (branch, commit, diff, undo, etc) without being connected to the main repository in any way. * Peer-level sharing of changes, for testing or evaluation, is easy and doesn't clutter up the central server with short-lived branches. * Speculative branching: it is cheap to create a local private branch that contains some half-baked changes. If they work out, fold them back and commit them to the main repository. If not, blow the branch away and forget about it. 
Regardless of what you may think of the Linux development model, it is telling that there have been about 80 people able to commit changes to Python since 1990 (I just checked the cvsroot tarball), whereas my estimate is that about ten times as many used BitKeeper to contribute changes to the Linux kernel just since the 2.5 tree began in 2002. (The total number of users who contributed changes was about 1600, 1300 of whom used BK, while the remainder emailed plain old patches that someone applied.)

It is, of course, not possible for me to tell which CVS commits were really patches that originated with someone else, but my intent is to show how the choice of tools affects the ability of people to contribute in "natural" ways.

How much of the difference in numbers is due to the respective popularity or accessibility of the projects is anyone's guess.

With any luck, there's some food for thought above.

Regards,

From foom at fuhm.net Mon Aug 15 23:20:15 2005
From: foom at fuhm.net (James Y Knight)
Date: Mon, 15 Aug 2005 17:20:15 -0400
Subject: [Python-Dev] On distributed vs centralised SCM for Python
In-Reply-To: <1124139893.20124.29.camel@localhost.localdomain>
References: <1124139893.20124.29.camel@localhost.localdomain>
Message-ID: <6208AA5C-3E27-40EE-BD7B-CB6E1CA3D764@fuhm.net>

On Aug 15, 2005, at 5:04 PM, Bryan O'Sullivan wrote:
> The centralised SCM tools all create a wall between core developers
> (i.e. people with commit access to the central repository) and people
> who are on the fringes. Outsiders may be able to get anonymous
> read-only access, but they are left up to their own devices if they
> want
> to make changes that they would like to contribute back to the
> project.

But, if python is using svn, outside developers can seamlessly use svk (http://svk.elixus.org/) to do their own branches if they wish, no? Sure, that is "their own devices", but it seems a fairly workable solution to me as the two are so closely related.
Now, I've never tried this, so I'm just judging from the "marketing material" on the svk website. James From martin at v.loewis.de Mon Aug 15 23:29:23 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 15 Aug 2005 23:29:23 +0200 Subject: [Python-Dev] On distributed vs centralised SCM for Python In-Reply-To: <1124139893.20124.29.camel@localhost.localdomain> References: <1124139893.20124.29.camel@localhost.localdomain> Message-ID: <43010933.3050303@v.loewis.de> Bryan O'Sullivan wrote: > However, it's worth pointing out that with a distributed SCM - it > doesn't really matter which one you use - it is simple to put together a > workflow that operates in the same way as a centralised SCM. You lose > nothing in the translation. What you gain is several-fold: That may be off-topic for python-dev, but can you please explain how this works? > * Outsiders get to work according to the same terms, and with the > same tools, as core developers. I'm using git on the kernel level. In what way am I at the same level as the core developers? They can write to the kernel.org repository, I cannot. They use commit, I send diffs. > * Everyone can perform whatever work they want (branch, commit, > diff, undo, etc) without being connected to the main repository > in any way. So what? If I want to branch, I create a new sandbox. I have to do that anyway, since independent projects should not influence each other. I can also easily diff, whether I have write access or not (in svn, even simpler so than in CVS). There is no easy way to undo parts of the changes, that's true. > * Peer-level sharing of changes, for testing or evaluation, is > easy and doesn't clutter up the central server with short-lived > branches. So how does that work? If I commit the changes to my local version of the repository, how do they get peer-level-shared? I turn off my machine when I leave the house, and I don't have a permanent IP, anyway, to host a web server or some such. 
> * Speculative branching: it is cheap to create a local private > branch that contains some half-baked changes. If they work out, > fold them back and commit them to the main repository. If not, > blow the branch away and forget about it. I do that with separate sandboxes right now. cp -a py2.5 py-64bit gives me a new sandbox, in which I can do my speculative project. > Regardless of what you may think of the Linux development model, it is > telling that there have been about 80 people able to commit changes to > Python since 1990 (I just checked the cvsroot tarball), whereas my > estimate is that about ten times as many used BitKeeper to contribute > changes to the Linux kernel just since the 2.5 tree began in 2002. (The > total number of users who contributed changes was about 1600, 1300 of > whom used BK, while the remainder emailed plain old patches that someone > applied.) Hmm. The changes of these 800 people had to be approved by some core developers, or perhaps even all approved by Linus Torvalds, right? This is really the same for Python: A partial list of contributors is in Misc/ACKS (663 lines at the moment), and this doesn't list all the people who contributed trivial changes. So I guess Python has the same number of contributors per line as the Linux kernel. > It is, of course, not possible for me to tell which CVS commits were > really patches that originated with someone else, but my intent is to > show how the choice of tools affects the ability of people to contribute > in "natural" ways. I hear that, but I have a hard time believing it. People find the "cvs diff -u, send diff file for discussion to patches tracker" cycle quite natural.
Regards, Martin From bos at serpentine.com Tue Aug 16 00:19:40 2005 From: bos at serpentine.com (Bryan O'Sullivan) Date: Mon, 15 Aug 2005 15:19:40 -0700 Subject: [Python-Dev] On distributed vs centralised SCM for Python In-Reply-To: <43010933.3050303@v.loewis.de> References: <1124139893.20124.29.camel@localhost.localdomain> <43010933.3050303@v.loewis.de> Message-ID: <1124144380.20124.44.camel@localhost.localdomain> On Mon, 2005-08-15 at 23:29 +0200, "Martin v. L?wis" wrote: > That may be off-topic for python-dev, but can you please explain how > this works? It's simple enough. In place of a central server that hosts a set of repositories and a number of branches, and to which only a few people have access, you use a central server that hosts a number of repositories, and you get the idea. But the difference lies in the way you use it. In the centralised model, there's only one server, and only one repository, anywhere. In the distributed model, each developer has one or more repositories that they keep in sync with the central ones they are interested in, pulling and pushing changes as necessary. The difference is that they get to share changes horizontally if they wish, without going through the central server. > I'm using git on the kernel level. In what way am I at the same level > as the core developers? You can use the same tools to do the same things they can. You can communicate with them in terms of commits. You may each have access to different sets of servers from which other people can pull changes, but if they want to take changes from you, you have the option of giving them complete history of all the edits and merges you've done, with no information loss. > So how does that work? If I commit the changes to my local version of > the repository, how do they get peer-level-shared? You have to do something to share them, but it's a lot simpler than sending diffs to a mailing list, or attaching them to a bug tracking system note. > Hmm. 
The changes of these 800 people had to be approved by some core > developers, or perhaps even all approved by Linus Torvalds, right? True. > I hear that, but I have a hard time believing it. People find the > "cvs diff -u, send diff file for discussion to patches tracker" > cycle quite natural. People will find doing the same of anything, over and over for fifteen years, quite natural :-) From raymond.hettinger at verizon.net Tue Aug 16 02:21:45 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon, 15 Aug 2005 20:21:45 -0400 Subject: [Python-Dev] On distributed vs centralised SCM for Python In-Reply-To: <6208AA5C-3E27-40EE-BD7B-CB6E1CA3D764@fuhm.net> Message-ID: <006201c5a1f8$7c9f4060$af26c797@oemcomputer> [Bryan O'Sullivan] > > The centralised SCM tools all create a wall between core developers > > (i.e. people with commit access to the central repository) and people > > who are on the fringes. Outsiders may be able to get anonymous > > read-only access, but they are left up to their own devices if they > > want > > to make changes that they would like to contribute back to the > > project. [James Y Knight] > But, if python is using svn, outside developers can seamlessly use > svk (http://svk.elixus.org/) to do their own branches if they wish, > no? Sure, that is "their own devices", but it seems a fairly workable > solution to me as the two are so closely related. +1 This seems to be the most flexible and sensible idea so far. The svn system has had many accolades; Martin knows how to convert it; and it presents only a small learning curve to cvs users. Optionally adding svk to the mix allows us to get the benefits of a distributed system without any additional migration or support issues. Very nice. 
Raymond From tim.peters at gmail.com Tue Aug 16 03:07:51 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 15 Aug 2005 21:07:51 -0400 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <42F61C03.6050703@v.loewis.de> References: <42F61C03.6050703@v.loewis.de> Message-ID: <1f7befae05081518073433da62@mail.gmail.com> [Martin v. Löwis] > I have placed a new version of the PEP on > > http://www.python.org/peps/pep-0347.html ... +1 from me. But, I don't think my vote should count much, and (sorry) Guido's even less: what do the people who frequently check in want? That means people like you (Martin), Michael, Raymond, Walter, Fred. ... plus the release manager(s). BTW, a stumbling block in Zope's conversion to SVN was that the conversion script initially never set svn:eol-style on any file. This caused weeks of problems, as people on Windows got Linux line ends, and people checking in from Windows forced Windows line ends on Linuxheads (CVS defaults to assuming files are text; SVN binary). The peculiar workaround at Zope is that we're all encouraged to add something like this to our SVN config file:

"""
[auto-props]
# Setting eol-style to native on all files is a trick: if svn
# believes a new file is binary, it won't honor the eol-style
# auto-prop. However, svn considers the request to set eol-style
# to be an error then, and if adding multiple files with one
# svn "add" cmd, svn will stop adding files after the first
# such error. A future release of svn will probably consider
# this to be a warning instead (and continue adding files).
* = svn:eol-style=native
"""

It would be best if svn:eol-style were set to native during initial conversion from CVS, on all files not marked binary in CVS.
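The rule suggested above (set svn:eol-style to native on every file that is not marked binary) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `eol_style_commands` and the (path, is_binary) input shape are made up for the sketch, and how the binary flag would be read out of CVS is not covered. The only real command used is `svn propset PROPNAME PROPVAL PATH`.

```python
def eol_style_commands(files):
    """Build `svn propset` argument lists for every non-binary file.

    `files` is an iterable of (path, is_binary) pairs. Where the binary
    flag comes from (for instance the CVS keyword mode) is an assumption
    left outside this sketch.
    """
    commands = []
    for path, is_binary in files:
        if is_binary:
            # Binary files must keep their bytes untouched, so skip them.
            continue
        commands.append(["svn", "propset", "svn:eol-style", "native", path])
    return commands


if __name__ == "__main__":
    # Hypothetical file list, purely for demonstration.
    sample = [("Lib/os.py", False), ("Doc/logo.png", True)]
    for cmd in eol_style_commands(sample):
        print(" ".join(cmd))
```

Each returned list could be handed to `subprocess.run` inside a working copy; the sketch only builds the commands so that it stays runnable without a repository.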
From jepler at unpythonic.net Tue Aug 16 04:16:18 2005 From: jepler at unpythonic.net (jepler@unpythonic.net) Date: Mon, 15 Aug 2005 21:16:18 -0500 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: References: <43003492.8060904@mpi-magdeburg.mpg.de> Message-ID: <20050816021614.GA23688@unpythonic.net> You don't need something like a buggy SWIG to put non-strings in dir().

>>> class C: pass
...
>>> C.__dict__[3] = "bad wolf"
>>> dir(C)
[3, '__doc__', '__module__']

This is likely to happen "legitimately", for instance in a class that allows x.y and x['y'] to mean the same thing. (if the user assigns to x[3]) Jeff -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20050815/0ce34ebf/attachment.pgp From mbp at sourcefrog.net Mon Aug 15 07:12:34 2005 From: mbp at sourcefrog.net (Martin Pool) Date: Mon, 15 Aug 2005 15:12:34 +1000 Subject: [Python-Dev] cvs to bzr? References: <17150.31637.180169.877441@montanaro.dyndns.org> <17152.109.883835.190683@montanaro.dyndns.org> Message-ID: On Sun, 14 Aug 2005 21:39:41 -0500, skip wrote: > Lalo> You can, however, convert from CVS to baz (arch), and from there > Lalo> to bzr. > > Would this be with cscvs? According to the cscvs wiki page at > > http://wiki.gnuarch.org/cscvs > > cscvs is currently unmaintained and can't handle repositories with branches. > In addition, it appears that to do a one-time conversion from cvs to bzr I > will need to also install arch and baz as well as any other packages they > depend on. Canonical has had an ongoing project to pull many cvs trees into baz, for the benefit of our Ubuntu distribution people amongst other things. There are some people working on imports using (I think) a hacked version of cscvs, and I have asked them to get Python in as a high priority.
Apparently there is something in the cvs history which makes a precise import hard. The cvs->baz->bzr process is unfortunate. As Mark said, we're going to be moving away from the Arch-based code and so trying to make that process simpler. -- Martin From abo at minkirri.apana.org.au Tue Aug 16 06:51:37 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Mon, 15 Aug 2005 21:51:37 -0700 Subject: [Python-Dev] Fwd: Distributed RCS In-Reply-To: <43007CDC.6060200@benjiyork.com> References: <42FBA376.5030605@canonical.com> <42FF32AA.7040506@v.loewis.de> <17151.15640.173982.961359@montanaro.dyndns.org> <42FF6E4B.4000206@v.loewis.de> <43007CDC.6060200@benjiyork.com> Message-ID: <1124167897.351.50.camel@warna.corp.google.com> On Mon, 2005-08-15 at 04:30, Benji York wrote: > Martin v. Löwis wrote: > > skip at pobox.com wrote: > >>Granted. What is the cost of waiting a bit longer to see if it (or > >>something else) gets more usable and would hit the mark better than svn? > > > > It depends on what "a bit" is. Waiting a month would be fine; waiting > > two years might be pointless. > > This might be too convoluted to consider, but I thought I might throw it > out there. We use svn for our repositories, but I've taken to also > using bzr so I can do local commits and reversions (within a particular > svn revision). I can imagine expanding that usage to sharing branches > and such via bzr (or mercurial, which looks great), but keeping the > trunk in svn. Not too convoluted at all; I already do exactly this with many upstream CVS and SVN repositories, using a local PRCS for my own branches. I'm considering switching to a distributed RCS for my own branches because it would make it easier for others to share them. I think this probably is the best solution; it gives a reliable(?) centralised RCS for the trunk, but allows distributed development.
-- Donovan Baarda From bcannon at gmail.com Tue Aug 16 08:25:15 2005 From: bcannon at gmail.com (Brett Cannon) Date: Mon, 15 Aug 2005 23:25:15 -0700 Subject: [Python-Dev] rev. 1.9 of PEP 348: Raymond tested, Guido approved Message-ID: OK, TerminatingException and the removal of bare 'except' clauses are now out. I also stripped out the transition plan to basically just add BaseException in Python 2.5, tweak docs to recommend future-proof practices, and then change everything in Python 3.0 . This will prevent any nasty performance hit from what was being previously suggested to try to make it all backwards-compatible. -Brett From martin at v.loewis.de Tue Aug 16 08:52:43 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 16 Aug 2005 08:52:43 +0200 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <1f7befae05081518073433da62@mail.gmail.com> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> Message-ID: <43018D3B.9040404@v.loewis.de> Tim Peters wrote: > It would be best if svn:eol-style were set to native during initial > conversion from CVS, on all files not marked binary in CVS. Ok, I'll add that to the PEP. Not sure how to implement it, yet... Regards, Martin From senko.rasic at gmail.com Tue Aug 16 09:17:42 2005 From: senko.rasic at gmail.com (Senko Rasic) Date: Tue, 16 Aug 2005 09:17:42 +0200 Subject: [Python-Dev] Extension to dl module to allow passing strings from native function In-Reply-To: <42FDE000.9080508@v.loewis.de> References: <48bbc5810508111640a6bd03e@mail.gmail.com> <42FDE000.9080508@v.loewis.de> Message-ID: <48bbc58105081600174fe3570d@mail.gmail.com> On 8/13/05, "Martin v. L?wis" wrote: > Are you aware of the ctypes module? > > http://starship.python.net/crew/theller/ctypes/ I didn't know about ctypes, thanks for the pointer. 
It definitely has much more functionality (although it's more complex and a whole new module) than my little hack ;-) Regards, Senko -- Senko Rasic From mwh at python.net Tue Aug 16 13:35:43 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 16 Aug 2005 12:35:43 +0100 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: <20050816021614.GA23688@unpythonic.net> (jepler@unpythonic.net's message of "Mon, 15 Aug 2005 21:16:18 -0500") References: <43003492.8060904@mpi-magdeburg.mpg.de> <20050816021614.GA23688@unpythonic.net> Message-ID: <2m4q9qunao.fsf@starship.python.net> jepler at unpythonic.net writes: > You don't need something like a buggy SWIG to put non-strings in dir(). > >>>> class C: pass > ... >>>> C.__dict__[3] = "bad wolf" >>>> dir(C) > [3, '__doc__', '__module__'] > > This is likely to happen "legitimately", for instance in a class that allows > x.y and x['y'] to mean the same thing. (if the user assigns to x[3]) I wonder if dir() should strip non-strings? Cheers, mwh -- A VoIP server "powered entirely by stabbing, that I made out of this gun I had" -- from Twisted.Quotes From mwh at python.net Tue Aug 16 13:42:32 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 16 Aug 2005 12:42:32 +0100 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <1f7befae05081518073433da62@mail.gmail.com> (Tim Peters's message of "Mon, 15 Aug 2005 21:07:51 -0400") References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> Message-ID: <2mzmrit8ev.fsf@starship.python.net> Tim Peters writes: > [Martin v. L?wis] >> I have placed a new version of the PEP on >> >> http://www.python.org/peps/pep-0347.html > > ... > > +1 from me. But, I don't think my vote should count much, and (sorry) > Guido's even less: what do the people who frequently check in want? > That means people like you (Martin), Michael, Raymond, Walter, Fred. > ... plus the release manager(s). I want svn, I think. 
I'm open to more sophisticated approaches but am not sure that any of them are really mature enough yet. Probably will be soon, but not soon enough to void the effort of moving to svn (IMHO). I'm not really a release manager these days, but if I was, I'd want svn for that reason too. The third set of people who count are pydotorg admins. I'm not really one of those either at the moment. While SF's CVS setup has its problems (occasional outages; it's only CVS) it's hard to beat what it costs us in sysadmin time: zero. > It would be best if svn:eol-style were set to native during initial > conversion from CVS, on all files not marked binary in CVS. Yes. Cheers, mwh -- I recompiled XFree 4.2 with gcc 3.2-beta-from-cvs with -O42 and -march-pentium4-800Mhz and I am sure that the MOUSE CURSOR is moving 5 % FASTER! -- from Twisted.Quotes From anthony at interlink.com.au Tue Aug 16 14:08:26 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue, 16 Aug 2005 22:08:26 +1000 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <2mzmrit8ev.fsf@starship.python.net> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> Message-ID: <200508162208.28862.anthony@interlink.com.au> On Tuesday 16 August 2005 21:42, Michael Hudson wrote: > I want svn, I think. I'm open to more sophisticated approaches but am > not sure that any of them are really mature enough yet. Probably will > be soon, but not soon enough to void the effort of moving to svn > (IMHO). > > I'm not really a release manager these days, but if I was, I'd want > svn for that reason too. I _am_ a release manager these days, and I'm in favour of svn. I really want to be off CVS, and I would love to be able to go with something more sophisticated than svn. Unfortunately, I really don't think any of the alternatives are appropriate.
While Perforce is definitely capable, the Bitkeeper disaster strongly influences me against relying on the generosity of a commercial software vendor who could change their mind at any time. The more radical (and powerful) tools such as baz/bzr, darcs, monotone and the like really aren't there yet. I have no doubt that they will get there, but right now, I want something better than CVS, and I don't want to have to fight bugs or limitations in the revision control system. By the way - if you're intending on suggesting alternates to svn, please don't just post a link saying "check out this system". Post an explanation of _why_ we should look at this particular system. What are its strengths? Why should we invest the time to download it and play with it? Speaking for myself, I don't have the time or energy to spend trying the countless numbers of revision control systems that are out there. Thanks, Anthony -- Anthony Baxter It's never too late to have a happy childhood. From barry at python.org Tue Aug 16 14:42:59 2005 From: barry at python.org (Barry Warsaw) Date: Tue, 16 Aug 2005 08:42:59 -0400 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <2mzmrit8ev.fsf@starship.python.net> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> Message-ID: <1124196179.9673.12.camel@geddy.wooz.org> On Tue, 2005-08-16 at 07:42, Michael Hudson wrote: > The third set of people who count are pydotorg admins. I'm not really > one of those either at the moment. While SF's CVS setup has its > problems (occasional outages; it's only CVS) it's hard to beat what it > costs us in sysadmin time: zero. True, although because of the peculiarities of cvs, there have definitely been times I wish we had direct access to the repository. svn should make most of those reasons moot.
As for sysadmin time with the changes proposed by the pep -- clearly they won't be zero, but I think the overhead for svn itself will be nearly so. With the fsfs backend, there's almost no continuous care and feeding needed, including for backups (which XS4ALL takes care of). The overhead for the admins will be in user management. I really don't think it will be that much more effort for new developers to badger the admins into adding them to some config file than it currently is to get one of us to click a few links to add you to the SF project. ;) (Okay, yeah we'll have to manage credentials now.) The alternatives to svn all sound very enticing, however my own feeling is that while the workflows they make possible might be good for Python in the long run, it's not clear how all that will evolve. We know that we can treat svn as "a better cvs" and the current workflow seems to serve us well enough. I'd be happy to switch to svn now, while continuing to experiment and follow the better scm systems for the future. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050816/b66b0382/attachment.pgp From mwh at python.net Tue Aug 16 14:52:13 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 16 Aug 2005 13:52:13 +0100 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <1124196179.9673.12.camel@geddy.wooz.org> (Barry Warsaw's message of "Tue, 16 Aug 2005 08:42:59 -0400") References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> <1124196179.9673.12.camel@geddy.wooz.org> Message-ID: <2mvf26t56q.fsf@starship.python.net> Barry Warsaw writes: > On Tue, 2005-08-16 at 07:42, Michael Hudson wrote: > >> The third set of people who count are pydotorg admins. 
I'm not really >> one of those either at the moment. While SF's CVS setup has it's >> problems (occasional outages; it's only CVS) it's hard to beat what it >> costs us in sysadmin time: zero. > > True, although because of the peculiarities of cvs, there have > definitely been times I wish we had direct access to the repository. > svn should make most of those reasons moot. > > As for sysadmin time with the changes proposed by the pep -- clearly > they won't be zero, but I think the overhead for svn itself will be > nearly so. OK, that's more or less what I thought. [...] > I'd be happy to switch to svn now, while continuing to experiment > and follow the better scm systems for the future. I suppose another question is: when? Between 2.4.2 and 2.5a1 seems like a good opportunity. I guess the biggest job is collection of keys and associated admin? Cheers, mwh -- well, take it from an old hand: the only reason it would be easier to program in C is that you can't easily express complex problems in C, so you don't. -- Erik Naggum, comp.lang.lisp From jack at performancedrivers.com Tue Aug 16 15:00:46 2005 From: jack at performancedrivers.com (Jack Diederich) Date: Tue, 16 Aug 2005 09:00:46 -0400 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <200508162208.28862.anthony@interlink.com.au> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> <200508162208.28862.anthony@interlink.com.au> Message-ID: <20050816130045.GA10364@performancedrivers.com> On Tue, Aug 16, 2005 at 10:08:26PM +1000, Anthony Baxter wrote: > On Tuesday 16 August 2005 21:42, Michael Hudson wrote: > > I want svn, I think. I'm open to more sophisticated approaches but am > > not sure that any of them are really mature enough yet. Probably will > > be soon, but not soon enough to void the effort of moving to svn > > (IMHO). 
> > > > I'm not really a release manager these days, but if I was, I'd want > > svn for that reason too. > > I _am_ a release manager these days, and I'm in favour of svn. I really > want to be off CVS, and I would love to be able to go with something > more sophisticated than svn. Unfortunately, I really don't think any of > the alternatives are appropriate. As a non-committer I can say _anything_ is preferable to the current situation and svn is good enough. bzr might make it even easier but svn is familiar and it will work right now. I haven't submitted a patch in ages partly because using anonymous SF cvs plain doesn't work. aside, at work we switched from cvs to svn and the transition was easy for developers, svn lives up to its billing as a fixed cvs. -jack From e.a.m.brouwer at alumnus.utwente.nl Mon Aug 15 00:48:00 2005 From: e.a.m.brouwer at alumnus.utwente.nl (Martijn Brouwer) Date: Sun, 14 Aug 2005 22:48:00 +0000 Subject: [Python-Dev] implementation of copy standard lib Message-ID: <1124059680.11612.9.camel@localhost.localdomain> Hi, After profiling a small python script I found that approximately 50% of the runtime of my script was consumed by one line: "import copy". Another 15% was the startup of the interpreter, but that is OK for an interpreted language. The copy library is used by another library I am using for my scripts. Importing copy takes 5-10 times more time than importing os, string and re together! I noticed that this lib is implemented in python, not in C. As I can imagine that *a lot* of libs/scripts use the copy library, I think it worthwhile to implement this lib in C. Unfortunately I cannot do this myself: I am relatively inexperienced with python and do not know C. What are your opinions? Martijn Brouwer -- __________________________________________________ I have a new e-mail address.
If you are still using e.a.m.brouwer at tnw.utwente.nl, please change to e.a.m.brouwer at alumnus.utwente.nl __________________________________________________ From simon.brunning at gmail.com Tue Aug 16 16:28:53 2005 From: simon.brunning at gmail.com (Simon Brunning) Date: Tue, 16 Aug 2005 15:28:53 +0100 Subject: [Python-Dev] implementation of copy standard lib In-Reply-To: <1124059680.11612.9.camel@localhost.localdomain> References: <1124059680.11612.9.camel@localhost.localdomain> Message-ID: <8c7f10c605081607281f8c1e38@mail.gmail.com> On 8/14/05, Martijn Brouwer wrote: > I noticed that this lib is implemented in python, not in C. As I can > imagine that *a lot* of libs/scripts use the copy library, I think it > worthwhile to implement this lib in C. > Unfortunately I cannot do this myself: I am relatively inexperienced > with python and do not know C. > > What are your opinions? I'll reply to this over on c.l.py, where it belongs. -- Cheers, Simon B, simon at brunningonline.net, http://www.brunningonline.net/simon/blog/ From foom at fuhm.net Tue Aug 16 16:34:31 2005 From: foom at fuhm.net (James Y Knight) Date: Tue, 16 Aug 2005 10:34:31 -0400 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <43018D3B.9040404@v.loewis.de> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <43018D3B.9040404@v.loewis.de> Message-ID: <4B72B610-80CA-40BC-9B9D-EB50F8077436@fuhm.net> On Aug 16, 2005, at 2:52 AM, Martin v. L?wis wrote: > Tim Peters wrote: > >> It would be best if svn:eol-style were set to native during initial >> conversion from CVS, on all files not marked binary in CVS. >> > > Ok, I'll add that to the PEP. Not sure how to implement it, yet... cvs2svn does that by default (now). James From fdrake at acm.org Tue Aug 16 18:41:10 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Tue, 16 Aug 2005 12:41:10 -0400 Subject: [Python-Dev] dev listinfo page (was: Re: Python + Ping) In-Reply-To: <42FC666A.90206@botanicus.net> References: <2773CAC687FD5F4689F526998C7E4E5F05CC00@au3010avexu1.global.avaya.com> <42FC666A.90206@botanicus.net> Message-ID: <200508161241.10908.fdrake@acm.org> On Friday 12 August 2005 05:05, David Wilson wrote: > Would it perhaps be an idea, given the number of users posting to the > dev list, to put a rather obvious warning on the listinfo page: Well, not exactly the style you suggested, but I've made it fairly close. It's certainly more noticeable now. :-) -Fred -- Fred L. Drake, Jr. From fperez.net at gmail.com Tue Aug 16 19:08:59 2005 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 16 Aug 2005 11:08:59 -0600 Subject: [Python-Dev] SWIG and rlcompleter References: <43003492.8060904@mpi-magdeburg.mpg.de> <20050816021614.GA23688@unpythonic.net> <2m4q9qunao.fsf@starship.python.net> Message-ID: Michael Hudson wrote: > jepler at unpythonic.net writes: > >> You don't need something like a buggy SWIG to put non-strings in dir(). >> >>>>> class C: pass >> ... >>>>> C.__dict__[3] = "bad wolf" >>>>> dir(C) >> [3, '__doc__', '__module__'] >> >> This is likely to happen "legitimately", for instance in a class that allows >> x.y and x['y'] to mean the same thing. (if the user assigns to x[3]) > > I wonder if dir() should strip non-strings? Me too. And it would be a good idea, I think, to specify explicitly in the dir() docs this behavior. Right now at least rlcompleter and ipython's completer can break due to this, there may be other tools out there with similar problems. If it's a stated design goal that dir() can return non-strings, that's fine. I can filter them out in my completion code. I'd just like to know what the official stance on dir()'s return values is.
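The filtering mentioned above might look roughly like the sketch below. It is written for current Python, where `str` already covers unicode text, and it works on a plain list of names rather than calling dir() itself. The function name `completion_candidates` is hypothetical and is not part of rlcompleter or ipython.

```python
def completion_candidates(names, prefix):
    """Return the string entries of `names` that start with `prefix`.

    Non-string entries (such as the integer key in the example quoted
    above) are dropped instead of being sliced or compared.
    """
    return [n for n in names if isinstance(n, str) and n.startswith(prefix)]


if __name__ == "__main__":
    # A list shaped like the dir(C) output quoted earlier in this thread.
    names = [3, '__doc__', '__module__']
    print(completion_candidates(names, '__d'))
```

Filtering before matching keeps the completer from ever touching a non-string, whatever dir() is eventually documented to return.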
Cheers, f From fperez.net at gmail.com Tue Aug 16 19:17:04 2005 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 16 Aug 2005 11:17:04 -0600 Subject: [Python-Dev] SWIG and rlcompleter References: <43003492.8060904@mpi-magdeburg.mpg.de> Message-ID: Guido van Rossum wrote: > (3) I think a better patch is to use str(word)[:n] instead of word[:n]. Mmh, I'm not so sure that's a good idea, as it leads to this:

In [1]: class f: pass
   ...:

In [2]: a=f()

In [3]: a.__dict__[1] = 8

In [4]: a.x = 0

In [5]: a.
a.1 a.x

In [5]: a.1
------------------------------------------------------------
   File "", line 1
     a.1
     ^
SyntaxError: invalid syntax

In general, foo.x named attribute access is only valid for strings to begin with (what about unicode in there?). Instead, this is what I've actually implemented in ipython:

words = [w for w in dir(object) if isinstance(w, basestring)]

That does allow unicode, I'm not sure if that's a good thing to do. Cheers, f From martin at v.loewis.de Tue Aug 16 20:19:33 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 16 Aug 2005 20:19:33 +0200 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <4B72B610-80CA-40BC-9B9D-EB50F8077436@fuhm.net> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <43018D3B.9040404@v.loewis.de> <4B72B610-80CA-40BC-9B9D-EB50F8077436@fuhm.net> Message-ID: <43022E35.1070207@v.loewis.de> James Y Knight wrote: > cvs2svn does that by default (now). Ah, ok.
Martin From martin at v.loewis.de Tue Aug 16 20:31:20 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 16 Aug 2005 20:31:20 +0200 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <2mvf26t56q.fsf@starship.python.net> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> <1124196179.9673.12.camel@geddy.wooz.org> <2mvf26t56q.fsf@starship.python.net> Message-ID: <430230F8.3020405@v.loewis.de> Michael Hudson wrote: > I suppose another question is: when? Between 2.4.2 and 2.5a1 seems > like a good opportunity. I guess the biggest job is collection of > keys and associated admin? I would agree. However, there still is the debate of hosting the repository elsewhere. Some people (Anthony, Guido, Tim) would prefer to pay for it, instead of hosting it on svn.python.org. Regards, Martin From nas at arctrix.com Tue Aug 16 21:18:35 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 16 Aug 2005 13:18:35 -0600 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <430230F8.3020405@v.loewis.de> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> <1124196179.9673.12.camel@geddy.wooz.org> <2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de> Message-ID: <20050816191835.GA18968@mems-exchange.org> On Tue, Aug 16, 2005 at 08:31:20PM +0200, "Martin v. Löwis" wrote: > I would agree. However, there still is the debate of hosting the > repository elsewhere. Some people (Anthony, Guido, Tim) would prefer > to pay for it, instead of hosting it on svn.python.org. Another option would be to pay someone to maintain the SVN setup on python.org. Unfortunately, I guess that would require someone else to first create a detailed description of the maintenance work required and to process bids.
Neil From barry at python.org Tue Aug 16 21:28:35 2005 From: barry at python.org (Barry Warsaw) Date: Tue, 16 Aug 2005 15:28:35 -0400 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <20050816191835.GA18968@mems-exchange.org> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> <1124196179.9673.12.camel@geddy.wooz.org> <2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de> <20050816191835.GA18968@mems-exchange.org> Message-ID: <1124220515.5254.7.camel@geddy.wooz.org> On Tue, 2005-08-16 at 15:18, Neil Schemenauer wrote: > Another option would be to pay someone to maintain the SVN setup on > python.org. Unfortunately, I guess that would require someone else > to first create a detailed description of the maintenance work > required and to process bids. Again, it's not clear to me that there's much more we need to have done that we either don't want to do ourselves or that XS4ALL isn't doing for us. IOW, we get backups for free and mostly the repo just swims along nicely. We have to do user management, but I think we want to do that ourselves anyway. There may be occasional infrastructural work that needs to happen (e.g. we still owe Martin a login for tunneling), but those tasks seem to me to be better handled either by volunteers or by short-term paid piece work. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
From tim.peters at gmail.com Tue Aug 16 21:49:39 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 16 Aug 2005 15:49:39 -0400 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <430230F8.3020405@v.loewis.de> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> <1124196179.9673.12.camel@geddy.wooz.org> <2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de> Message-ID: <1f7befae050816124932f5c66@mail.gmail.com> [Michael Hudson] >> I suppose another question is: when? Between 2.4.2 and 2.5a1 seems >> like a good opportunity. I guess the biggest job is collection of >> keys and associated admin? [Martin v. Löwis] > I would agree. However, there still is the debate of hosting the > repository elsewhere. Some people (Anthony, Guido, Tim) would prefer > to pay for it, instead of hosting it on svn.python.org. Not this Tim. I _asked_ whether we had sufficient volunteer resource to host it on python.org, because I didn't know. Barry has since made sufficiently reassuring gurgles on that point, in particular that ongoing maintenance (after initial conversion) for filesystem-flavor SVN is likely in-the-noise level work. 
From martin at v.loewis.de Tue Aug 16 22:25:38 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 16 Aug 2005 22:25:38 +0200 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <1f7befae050816124932f5c66@mail.gmail.com> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> <1124196179.9673.12.camel@geddy.wooz.org> <2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de> <1f7befae050816124932f5c66@mail.gmail.com> Message-ID: <43024BC2.2010505@v.loewis.de> Tim Peters wrote: > Not this Tim. I _asked_ whether we had sufficient volunteer resource > to host it on python.org, because I didn't know. Barry has since made > sufficiently reassuring gurgles on that point, in particular that > ongoing maintenance (after initial conversion) for filesystem-flavor > SVN is likely in-the-noise level work. Ah, ok. Of course, Barry can only speak about the current availability of volunteers, which is quite good (especially since amk took over coordinating them), nobody can predict the future (the time machine apparently only works one-way). So I guess the concern stays, and, more objectively, this is a risk for the project (but so is any specific commercial offering). Regards, Martin From raymond.hettinger at verizon.net Tue Aug 16 22:24:47 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 16 Aug 2005 16:24:47 -0400 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <1f7befae05081518073433da62@mail.gmail.com> Message-ID: <011901c5a2a0$8c4f4700$af26c797@oemcomputer> [Tim] > +1 from me. But, I don't think my vote should count much, and (sorry) > Guido's even less: what do the people who frequently check in want? > That means people like you (Martin), Michael, Raymond, Walter, Fred. > ... plus the release manager(s). +1 from me. 
CVS is meeting my needs but I would definitely benefit from fast diffs and atomic commits. My experiences with SVN to-date have all been positive and it was easy to learn. Also, I think it is a nice plus that our choosing SVN means that others can choose SVK and get the benefits of a distributed rcs without us having to do anything extra to support it. James Knight's thoughts on the subject seem on target. Raymond From martin at v.loewis.de Tue Aug 16 22:33:27 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 16 Aug 2005 22:33:27 +0200 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <20050816191835.GA18968@mems-exchange.org> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> <1124196179.9673.12.camel@geddy.wooz.org> <2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de> <20050816191835.GA18968@mems-exchange.org> Message-ID: <43024D97.5050306@v.loewis.de> Neil Schemenauer wrote: > Another option would be to pay someone to maintain the SVN setup on > python.org. Unfortunately, I guess that would require someone else > to first create a detailed description of the maintenance work > required and to process bids. I think this would be difficult. I could imagine services like tummy.com, where you can hire somebody on an hours-per-week basis; these people maintain multiple servers, and just need to do the proper accounting. However, they also (naturally) tend to desire an organization that meets their needs also, e.g. by providing the machine and network (this is apparently how tummy.com operates). 
If you are suggesting that the PSF hires a specific individual for that maintenance, the risk of getting somebody inexperienced/uncooperative would be much higher: if we were unhappy with the tummy.com guy looking after our hardware, we could complain to his boss; if that is the boss, we would just take our data and cancel the contract. Also, hiring somebody would be somewhat unfair to people who do similar tasks as volunteers, and I guess the board might not agree to such expenses. Regards, Martin From tim.peters at gmail.com Tue Aug 16 22:52:09 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 16 Aug 2005 16:52:09 -0400 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <43024BC2.2010505@v.loewis.de> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> <1124196179.9673.12.camel@geddy.wooz.org> <2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de> <1f7befae050816124932f5c66@mail.gmail.com> <43024BC2.2010505@v.loewis.de> Message-ID: <1f7befae050816135244a6b72f@mail.gmail.com> [Martin v. Löwis] > Ah, ok. Of course, Barry can only speak about the current availability > of volunteers, which is quite good (especially since amk took over > coordinating them), nobody can predict the future (the time machine > apparently only works one-way). So I guess the concern stays, and, > more objectively, this is a risk for the project (but so is any > specific commercial offering). I'm not really worried about it. Sounds like ongoing pain is pretty much limited to keeping committer accounts/credentials up to date, and that normal good backup procedures will deal with filesystem-SVN state as a matter of course. If there's one thing sysadmins love to do, it's fiddling with user accounts and credentials -- if _anyone_ volunteers to work on python.org, they'll be eager to lord this power over us. If not, that's fine too. 
The PSF has the funds and the mission to pay for infrastructure support; I'd just _rather_ spend PSF funds on "more glamorous" stuff (like grants and conferences). From tim.peters at gmail.com Tue Aug 16 23:00:43 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 16 Aug 2005 17:00:43 -0400 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <011901c5a2a0$8c4f4700$af26c797@oemcomputer> References: <1f7befae05081518073433da62@mail.gmail.com> <011901c5a2a0$8c4f4700$af26c797@oemcomputer> Message-ID: <1f7befae050816140063dcda4@mail.gmail.com> [Raymond Hettinger] > +1 from me. CVS is meeting my needs but I would definitely benefit from > fast diffs and atomic commits. My experiences with SVN to-date have all > been positive and it was easy to learn. Good! That was my experience too, BTW -- SVN was a genuine improvement over CVS, and I was productive with it the first hour. There are "tricks" you'll learn too (or already have); for example, if you make a bunch of changes in a local checkout, and have to drop it for a while, it's easy and fast to create an SVN branch with those changes despite that you didn't plan on it from the start (create a new branch in the repository; `svn switch` to it locally, which leaves your local changes alone; then commit). > Also, I think it is a nice plus that our choosing SVN means that others > can choose SVK and get the benefits of a distributed rcs without us > having to do anything extra to support it. James Knight's thoughts on > the subject seem on target. 
Too new-fashioned for me, although I can see how it might appeal to kids ;-) From skip at pobox.com Tue Aug 16 23:07:09 2005 From: skip at pobox.com (skip@pobox.com) Date: Tue, 16 Aug 2005 16:07:09 -0500 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <43024BC2.2010505@v.loewis.de> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> <2mzmrit8ev.fsf@starship.python.net> <1124196179.9673.12.camel@geddy.wooz.org> <2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de> <1f7befae050816124932f5c66@mail.gmail.com> <43024BC2.2010505@v.loewis.de> Message-ID: <17154.21885.668810.977155@montanaro.dyndns.org> Martin> Of course, Barry can only speak about the current availability Martin> of volunteers, which is quite good (especially since amk took Martin> over coordinating them) .... I don't know why, but the first image that popped into my mind was of amk beating a bunch of Hunchback of Notre Dame types (maybe more the Marty Feldman (*) hunchback types) into submission with a whip while one of them cried, "We'll do anything you ask, master. Just don't beat us again." The-beatings-will-continue-until-morale-improves-ly, y'rs, Skip (*) http://en.wikipedia.org/wiki/Marty_Feldman From walter at livinglogic.de Tue Aug 16 23:06:37 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue, 16 Aug 2005 23:06:37 +0200 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <1f7befae05081518073433da62@mail.gmail.com> References: <42F61C03.6050703@v.loewis.de> <1f7befae05081518073433da62@mail.gmail.com> Message-ID: <1A846489-39FF-49D1-8AF0-5BA61F9277DF@livinglogic.de> Tim Peters wrote: > [Martin v. L?wis] > >> I have placed a new version of the PEP on >> >> http://www.python.org/peps/pep-0347.html >> > > ... > > +1 from me. But, I don't think my vote should count much, and (sorry) > Guido's even less: what do the people who frequently check in want? 
> That means people like you (Martin), Michael, Raymond, Walter, Fred. > ... plus the release manager(s). +1 from me for various reasons: * Subversion seems to be stable enough, and it's better than CVS, which is enough for me. * The python.org machines can probably handle the load of *one* repository better than the SF machines that of several thousands. * Connectivity to python.org is much better than to cvs.sf.net (at least from here). * Our company repository might move to svn in the near future, so a Python svn repository would be a perfect playground to learn svn. ;) Bye, Walter Dörwald From tdelaney at avaya.com Wed Aug 17 01:53:20 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Wed, 17 Aug 2005 09:53:20 +1000 Subject: [Python-Dev] PEP 347: Migration to Subversion Message-ID: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com> Tim Peters wrote: > [Martin v. Löwis] >> I would agree. However, there still is the debate of hosting the >> repository elsewhere. Some people (Anthony, Guido, Tim) would prefer >> to pay for it, instead of hosting it on svn.python.org. > > Not this Tim. Not this one either. I haven't actually used any of the various systems that much (work is ClearCase) so I have no opinions whatsoever. It's interesting reading though. Tim Delaney From gvanrossum at gmail.com Wed Aug 17 01:58:01 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue, 16 Aug 2005 16:58:01 -0700 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com> Message-ID: Nor this Guido, FWIW (I think we shouldn't rule it out as an option, but I don't have any preferences). On 8/16/05, Delaney, Timothy (Tim) wrote: > Tim Peters wrote: > > > [Martin v. Löwis] > >> I would agree. However, there still is the debate of hosting the > >> repository elsewhere. 
Some people (Anthony, Guido, Tim) would prefer > >> to pay for it, instead of hosting it on svn.python.org. > > > > Not this Tim. > > Not this one either. I haven't actually used any of the various systems that much (work is ClearCase) so I have no opinions whatsoever. It's interesting reading though. > > Tim Delaney > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Wed Aug 17 02:55:26 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 16 Aug 2005 20:55:26 -0400 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: <2m4q9qunao.fsf@starship.python.net> Message-ID: <000801c5a2c6$5ca70080$af26c797@oemcomputer> [Michael Hudson] > I wonder if dir() should strip non-strings? -0 The behavior of dir() is already a bit magical. Python is much simpler to comprehend if we have direct relationships like dir() and vars() corresponding as closely as possible to the object's dictionary. If someone injects non-strings into an attribute dictionary, why should dir() hide that fact? Likewise, we would have been better off if ceval.c didn't pre-process data before handing it off to API functions (so that negative indices get handled the same way in operator module functions and in user-defined methods, etc). Both Io and Lua have made a design principle out of keeping these relationships as direct as possible (i.e. a[b] always corresponds to the call a.__getitem__(b) with no intervening magic, etc.). The auto-exposure on my camera takes in nine data points and guesses whether the subject is backlit, whether there is a mix of light and dark, whether it is more important to avoid blown highlights or to miss shadow detail, etc. The good news is that it often makes a decent guess. 
The bad news is that I've completely lost the ability to predict whether I've gotten a good shot based on the light conditions and camera settings. IOW, if you make the tools too smart, they become harder to use. Leica had it right all along. Raymond From ilya at bluefir.net Wed Aug 17 06:34:10 2005 From: ilya at bluefir.net (Ilya Sandler) Date: Tue, 16 Aug 2005 21:34:10 -0700 (PDT) Subject: [Python-Dev] remote debugging with pdb In-Reply-To: <24EEDE5B-4511-40D4-9C16-8A33C4ACE1C8@redivi.com> References: <20050808154503.GB28005@panix.com> <200508111802.44357.anthony@interlink.com.au> <24EEDE5B-4511-40D4-9C16-8A33C4ACE1C8@redivi.com> Message-ID: > One thing PDB needs is a mode that runs as a background thread and > opens up a socket so that another Python process can talk to it, for > embedded/remote/GUI debugging. There is a patch on SourceForge python.org/sf/721464 which allows pdb to read/write from/to arbitrary file objects. Would it answer some of your concerns (eg remote debugging)? The patch probably will not apply to the current code, but I guess, I could revive it if anyone thinks that it's worthwhile... What do you think? Ilya From kiko at async.com.br Wed Aug 17 16:02:18 2005 From: kiko at async.com.br (Christian Robottom Reis) Date: Wed, 17 Aug 2005 11:02:18 -0300 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) Message-ID: <20050817140217.GQ3389@www.async.com.br> In Launchpad (mainly because SQLObject is used) we end up with quite a few locals named id. Apart from the fact that naturally clobbering builtins is a bad idea, we get quite a few warnings when linting throughout the codebase. I've fixed these as I've found them, but today Andrew pointed out to me that this is noted in: http://www.python.org/doc/essays/ppt/regrets/PythonRegrets.ppt I wonder: is moving id() to sys doable in the 2.5 cycle, with a deprecation warning being raised for people using the builtin? We'd then phase it out in one of the latter 2.x versions. 
I've done some searching through my code and id() isn't the most-used builtin, so from my perspective the impact would be limited, but of course others might think otherwise. Is it worth writing a PEP for this, or is it crack? Take care, -- Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 3376 0125 From raymond.hettinger at verizon.net Wed Aug 17 17:48:57 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 17 Aug 2005 11:48:57 -0400 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <20050817140217.GQ3389@www.async.com.br> Message-ID: <000a01c5a343$2f4deea0$5e01a044@oemcomputer> [Christian Robottom Reis] > I've done some searching through my code and id() isn't the most-used > builtin, so from my perspective the impact would be limited, but of > course others might think otherwise. > > Is it worth writing a PEP for this, or is it crack? FWIW, I use id() all the time and like having it as a builtin. Raymond From firemoth at gmail.com Wed Aug 17 18:32:42 2005 From: firemoth at gmail.com (Timothy Fitz) Date: Wed, 17 Aug 2005 12:32:42 -0400 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <20050817140217.GQ3389@www.async.com.br> References: <20050817140217.GQ3389@www.async.com.br> Message-ID: <972ec5bd05081709327fed099@mail.gmail.com> On 8/17/05, Christian Robottom Reis wrote: > I've done some searching through my code and id() isn't the most-used > builtin, so from my perspective the impact would be limited, but of > course others might think otherwise. All of my primary uses of id would not show up in such a search. id is handy when debugging, when using the interactive interpreter and temporarily in scripts (print id(something), something for when repr(something) doesn't show the id). In my experience teaching python, id at the interactive interpreter is invaluable, which is why any proposal to move it would get a -1. 
The fundamental issue is that I want to explain reference semantics well before I talk about packages and the associated import call. From jeremy at alum.mit.edu Wed Aug 17 18:37:42 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed, 17 Aug 2005 12:37:42 -0400 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <972ec5bd05081709327fed099@mail.gmail.com> References: <20050817140217.GQ3389@www.async.com.br> <972ec5bd05081709327fed099@mail.gmail.com> Message-ID: I'd like to see the builtin id() removed so that I can use it as a local variable name without clashing with the builtin name. I certainly use the id() function, but not as often as I have a local variable I'd like to name id. The sys module seems like a natural place to put id(), since it is exposing something about the implementation of Python rather than something about the language; the language offers the is operator to check ids. Jeremy From reinhold-birkenfeld-nospam at wolke7.net Wed Aug 17 18:37:11 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Wed, 17 Aug 2005 18:37:11 +0200 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <20050817140217.GQ3389@www.async.com.br> References: <20050817140217.GQ3389@www.async.com.br> Message-ID: Christian Robottom Reis wrote: > In Launchpad (mainly because SQLObject is used) we end up with quite a > few locals named id. Apart from the fact that naturally clobbering > builtins is a bad idea, we get quite a few warnings when linting > throughout the codebase. I've fixed these as I've found them, but today > Andrew pointed out to me that this is noted in: > > http://www.python.org/doc/essays/ppt/regrets/PythonRegrets.ppt > > I wonder: is moving id() to sys doable in the 2.5 cycle, with a > deprecation warning being raised for people using the builtin? We'd then > phase it out in one of the latter 2.x versions. 
> > I've done some searching through my code and id() isn't the most-used > builtin, so from my perspective the impact would be limited, but of > course others might think otherwise. > > Is it worth writing a PEP for this, or is it crack? As far as I can see, this is not going to happen before Py3k, as it would completely break backwards compatibility. As such, a PEP would be unnecessary. Reinhold -- Mail address is perfectly valid! From pedronis at strakt.com Wed Aug 17 18:50:29 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Wed, 17 Aug 2005 18:50:29 +0200 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: References: <20050817140217.GQ3389@www.async.com.br> <972ec5bd05081709327fed099@mail.gmail.com> Message-ID: <43036AD5.6010902@strakt.com> Jeremy Hylton wrote: > I'd like to see the builtin id() removed so that I can use it as a > local variable name without clashing with the builtin name. I > certainly use the id() function, but not as often as I have a local > variable I'd like to name id. The sys module seems like a natural > place to put id(), since it is exposing something about the > implementation of Python rather than something about the language; the > language offers the is operator to check ids. > It is worth remembering that id() functionality is not cheap for Python implementations using moving GCs. Identity mappings would be less taxing. 
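The two points quoted above can be sketched as follows; this is an illustration only, and record_for and the variable names are hypothetical, not taken from the thread. A parameter named id shadows the builtin inside that one function, and the is operator compares identity without calling id() at all.

```python
def record_for(id):
    # Inside this function the name id refers to the argument,
    # so the builtin id() is not reachable by that name here.
    return {"id": id}


a = []
b = a        # a second name for the same list object
c = []       # equal to a, but a distinct object

same_object = a is b                # identity test, no id() call needed
same_by_id = id(a) == id(b)         # equivalent while both objects are alive
distinct = a is not c and a == c    # equal values, different identities

print(record_for(42), same_object, same_by_id, distinct)
```

Shadowing is scoped: outside record_for the name id still refers to the builtin, which is why the clash mostly shows up as lint warnings rather than as broken code.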
> Jeremy > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/pedronis%40strakt.com From barry at python.org Wed Aug 17 19:17:59 2005 From: barry at python.org (Barry Warsaw) Date: Wed, 17 Aug 2005 13:17:59 -0400 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: References: <20050817140217.GQ3389@www.async.com.br> <972ec5bd05081709327fed099@mail.gmail.com> Message-ID: <1124299079.23024.17.camel@geddy.wooz.org> On Wed, 2005-08-17 at 12:37, Jeremy Hylton wrote: > I'd like to see the builtin id() removed so that I can use it as a > local variable name without clashing with the builtin name. I > certainly use the id() function, but not as often as I have a local > variable I'd like to name id. Same here. > The sys module seems like a natural > place to put id(), since it is exposing something about the > implementation of Python rather than something about the language; the > language offers the is operator to check ids. +1 -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
From bcannon at gmail.com Wed Aug 17 19:21:59 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 17 Aug 2005 10:21:59 -0700 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <1124299079.23024.17.camel@geddy.wooz.org> References: <20050817140217.GQ3389@www.async.com.br> <972ec5bd05081709327fed099@mail.gmail.com> <1124299079.23024.17.camel@geddy.wooz.org> Message-ID: On 8/17/05, Barry Warsaw wrote: > On Wed, 2005-08-17 at 12:37, Jeremy Hylton wrote: > > I'd like to see the builtin id() removed so that I can use it as a > > local variable name without clashing with the builtin name. I > > certainly use the id() function, but not as often as I have a local > > variable I'd like to name id. > > Same here. > > > The sys module seems like a natural > > place to put id(), since it is exposing something about the > > implementation of Python rather than something about the language; the > > language offers the is operator to check ids. > > +1 > -Barry +1 -Brett From nas at arctrix.com Wed Aug 17 19:40:32 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 17 Aug 2005 11:40:32 -0600 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: References: <20050817140217.GQ3389@www.async.com.br> Message-ID: <20050817174031.GA22541@mems-exchange.org> On Wed, Aug 17, 2005 at 06:37:11PM +0200, Reinhold Birkenfeld wrote: > As far as I can see, this is not going to happen before Py3k, as it would completely > break backwards compatibility. As such, a PEP would be unnecessary. We could add sys.id for 2.5 and remove __builtin__.id at some later time (e.g. for 3.0). 
Neil From facundobatista at gmail.com Wed Aug 17 19:49:25 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Wed, 17 Aug 2005 14:49:25 -0300 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <20050817174031.GA22541@mems-exchange.org> References: <20050817140217.GQ3389@www.async.com.br> <20050817174031.GA22541@mems-exchange.org> Message-ID: On 8/17/05, Neil Schemenauer wrote: > On Wed, Aug 17, 2005 at 06:37:11PM +0200, Reinhold Birkenfeld wrote: > > As I can see, this is not going to happen before Py3k, as it is completely > > breaking backwards compatibility. As such, a PEP would be unnecessary. > > We could add sys.id for 2.5 and remove __builtin__.id a some later > time (e.g. for 3.0). +1 for adding it to sys in 2.5, removing the builtin one in 3.0. . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From firemoth at gmail.com Wed Aug 17 20:55:30 2005 From: firemoth at gmail.com (Timothy Fitz) Date: Wed, 17 Aug 2005 14:55:30 -0400 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: <000801c5a2c6$5ca70080$af26c797@oemcomputer> References: <2m4q9qunao.fsf@starship.python.net> <000801c5a2c6$5ca70080$af26c797@oemcomputer> Message-ID: <972ec5bd05081711555e9ad129@mail.gmail.com> On 8/16/05, Raymond Hettinger wrote: > -0 The behavior of dir() already a bit magical. Python is much simpler > to comprehend if we have direct relationships like dir() and vars() > corresponding as closely as possible to the object's dictionary. If > someone injects non-strings into an attribute dictionary, why should > dir() hide that fact? Indeed, there seem to be two camps, those who want dir to reflect __dict__ and those who want dir to reflect attributes of an object. It seems to me that those who want dir to reflect __dict__ should just use __dict__ in the first place. 
However, in the case of dir handling non-strings, should dir handle non-valid identifiers as well, that is to say that while foo.__dict__[2] = ... is an obvious case what about foo.__dict__["1"] ? Right now the documentation says that it returns "attributes", and I would not consider non-strings to be attributes, so either the documentation or the implementation should rectify this disagreement. From gvanrossum at gmail.com Wed Aug 17 21:10:15 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 17 Aug 2005 12:10:15 -0700 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: <972ec5bd05081711555e9ad129@mail.gmail.com> References: <2m4q9qunao.fsf@starship.python.net> <000801c5a2c6$5ca70080$af26c797@oemcomputer> <972ec5bd05081711555e9ad129@mail.gmail.com> Message-ID: On 8/17/05, Timothy Fitz wrote: > On 8/16/05, Raymond Hettinger wrote: > > -0 The behavior of dir() already a bit magical. Python is much simpler > > to comprehend if we have direct relationships like dir() and vars() > > corresponding as closely as possible to the object's dictionary. If > > someone injects non-strings into an attribute dictionary, why should > > dir() hide that fact? > > Indeed, there seem to be two camps, those who want dir to reflect __dict__ > and those who want dir to reflect attributes of an object. It seems to > me that those who want dir to reflect __dict__ should just use > __dict__ in the first place. Right. > However, in the case of dir handling non-strings, should dir handle > non-valid identifiers as well, that is to say that while > foo.__dict__[2] = ... is an obvious case what about foo.__dict__["1"] > ? See below. > Right now the documentation says that it returns "attributes", and I > would not consider non-strings to be attributes, so either the > documentation or the implementation should rectify this disagreement. 
I think that dir() should hide non-strings; these aren't attributes if you believe the definition that an attribute name is something acceptable to getattr() or setattr(). Following this definition, the string "1" is a valid attribute name (even though it's not a valid identifier), but the number 1 is not. Try it. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Wed Aug 17 21:21:22 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 17 Aug 2005 15:21:22 -0400 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: <972ec5bd05081711555e9ad129@mail.gmail.com> Message-ID: <001101c5a360$db2d63a0$3031c797@oemcomputer> [Timothy Fitz] > It seems to > me that those who want dir to reflect __dict__ should just use > __dict__ in the first place. The dir() builtin does quite a bit more than obj.__dict__.keys(). >>> class A(list): x = 1 >>> dir(A) ['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__', '__dict__', '__doc__', '__eq__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__module__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__str__', '__weakref__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort', 'x'] >>> A.__dict__.keys() ['__dict__', 'x', '__module__', '__weakref__', '__doc__'] Raymond From gvanrossum at gmail.com Wed Aug 17 21:30:33 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 17 Aug 2005 12:30:33 -0700 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: <001101c5a360$db2d63a0$3031c797@oemcomputer> References: <972ec5bd05081711555e9ad129@mail.gmail.com> <001101c5a360$db2d63a0$3031c797@oemcomputer> Message-ID: > [Timothy Fitz] > > It seems to > > me that those who want dir to 
reflect __dict__ should just use > > __dict__ in the first place. [Raymond] > The dir() builtin does quite a bit more than obj.__dict__.keys(). Well that's the whole point, right? We shouldn't conflate the two. I don't see this as an argument why it would be bad to delete non-string-keys found in __dict__ from dir()'s return value. I don't think that the equation set(x.__dict__) <= set(dir(x)) provides enough value to try and keep it. A more useful relationship is name in dir(x) <==> getattr(x, name) is valid -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Wed Aug 17 22:21:16 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 17 Aug 2005 16:21:16 -0400 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: Message-ID: <001701c5a369$3911f960$3031c797@oemcomputer> > > [Timothy Fitz] > > > It seems to > > > me that those who want dir to reflect __dict__ should just use > > > __dict__ in the first place. > > [Raymond] > > The dir() builtin does quite a bit more than obj.__dict__.keys(). > > Well that's the whole point, right? Perhaps. I wasn't taking a position. Just noting that Timothy's comment over-simplified the relationship. > A more useful relationship is > > name in dir(x) <==> getattr(x, name) is valid That would be a useful invariant. Raymond From gvanrossum at gmail.com Wed Aug 17 22:46:27 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 17 Aug 2005 13:46:27 -0700 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: <001701c5a369$3911f960$3031c797@oemcomputer> References: <001701c5a369$3911f960$3031c797@oemcomputer> Message-ID: [me] > > A more useful relationship is > > > > name in dir(x) <==> getattr(x, name) is valid [Raymond] > That would be a useful invariant. 
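A minimal sketch of the distinction under discussion, assuming a throwaway class Foo and a helper is_attribute_name (both hypothetical names, not from the thread): a key counts as an attribute name only if getattr() accepts it. The sketch deliberately avoids calling dir() on the object, since how dir() treats non-string keys is exactly the behavior being debated.

```python
class Foo(object):
    """Empty class used only for illustration."""


def is_attribute_name(obj, name):
    """Return True if getattr() accepts name and finds it on obj."""
    try:
        getattr(obj, name)
    except (TypeError, AttributeError):
        return False
    return True


foo = Foo()
setattr(foo, "1", "string key")      # not an identifier, but accepted by setattr()
foo.__dict__[2] = "non-string key"   # bypasses setattr(); getattr() rejects it

print(is_attribute_name(foo, "1"))   # True
print(is_attribute_name(foo, 2))     # False

# Keys that would remain if non-strings were filtered out of the instance dict.
string_keys = [key for key in foo.__dict__ if isinstance(key, str)]
print(string_keys)                   # ['1']
```

Under this definition the string "1" qualifies even though it could never be written as foo.1, while the integer 2 does not, which is the filtering rule proposed for dir().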
Well, the <== part can't really be guaranteed due to the existence of __getattr__ overriding (and all bets are off if __getattribute__ is overridden!), but apart from those, stripping non-strings in dir() would be a big help towards making the invariant true. So I'm +1 on that. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From caustin at spikesource.com Thu Aug 18 00:57:33 2005 From: caustin at spikesource.com (Calvin Austin) Date: Wed, 17 Aug 2005 15:57:33 -0700 Subject: [Python-Dev] A testing challenge Message-ID: <4303C0DD.5090903@spikesource.com> When was the last time someone thanked you for writing a test? I tried to think of the last time it happened to me and I can't remember. Well at Spikesource we want to thank you not just for helping the Python community but for your testing efforts too and we are running a participatory testing contest. This is a competition where there are no losers, every project gains if new tests are written. For more details see below, it is open worldwide. feel free to send questions to me. thanks calvin *_Open Testing Contest with Over $20,000 in Prizes_* Committers! SpikeSource is sponsoring a contest to help increase the participatory testing of open source software. Awards will be given to open source projects that have the greatest increase in code coverage from September 15 through December 31, 2005. Project sign-up is due by August 31^st and the contest begins on September 15^th . Visit http://www.spikesource.com/contest/ for complete details and to register your project. 
From eric at enthought.com Thu Aug 18 01:05:11 2005 From: eric at enthought.com (eric jones) Date: Wed, 17 Aug 2005 18:05:11 -0500 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <1124299079.23024.17.camel@geddy.wooz.org> References: <20050817140217.GQ3389@www.async.com.br> <972ec5bd05081709327fed099@mail.gmail.com> <1124299079.23024.17.camel@geddy.wooz.org> Message-ID: <4303C2A7.2030608@enthought.com> Barry Warsaw wrote: >On Wed, 2005-08-17 at 12:37, Jeremy Hylton wrote: > > >>I'd like to see the builtin id() removed so that I can use it as a >>local variable name without clashing with the builtin name. I >>certainly use the id() function, but not as often as I have a local >>variable I'd like to name id. >> >> > >Same here. > > > >>The sys module seems like a natural >>place to put id(), since it is exposing something about the >>implementation of Python rather than something about the language; the >>language offers the is operator to check ids. >> >> > >+1 >-Barry > > +1 eric From foom at fuhm.net Thu Aug 18 01:23:47 2005 From: foom at fuhm.net (James Y Knight) Date: Wed, 17 Aug 2005 19:23:47 -0400 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: <972ec5bd05081711555e9ad129@mail.gmail.com> References: <2m4q9qunao.fsf@starship.python.net> <000801c5a2c6$5ca70080$af26c797@oemcomputer> <972ec5bd05081711555e9ad129@mail.gmail.com> Message-ID: <7F4E4259-CE92-4AED-823F-5E06BECAED6C@fuhm.net> On Aug 17, 2005, at 2:55 PM, Timothy Fitz wrote: > On 8/16/05, Raymond Hettinger wrote: > >> -0 The behavior of dir() already a bit magical. Python is much >> simpler >> to comprehend if we have direct relationships like dir() and vars() >> corresponding as closely as possible to the object's dictionary. If >> someone injects non-strings into an attribute dictionary, why should >> dir() hide that fact? 
>> > > Indeed, there seem to be two camps, those who want dir to reflect > __dict__ > and those who want dir to reflect attributes of an object. It seems to > me that those who want dir to reflect __dict__ should just use > __dict__ in the first place. > > However, in the case of dir handling non-strings, should dir handle > non-valid identifiers as well, that is to say that while > foo.__dict__[2] = ... is an obvious case what about foo.__dict__["1"] > ? > > Right now the documentation says that it returns "attributes", and I > would not consider non-strings to be attributes, so either the > documentation or the implementation should rectify this disagreement. > I initially was going to say no, there's no reason to restrict your idea of "attributes" to be purely strings, because surely you could use non-strings as attributes if you wished to. But Python proves me wrong: >>> class X: pass >>> X.__dict__[1] = 5 >>> dir(X) [1, '__doc__', '__module__'] >>> getattr(X, 1) TypeError: getattr(): attribute name must be string If dir() is supposed to return the list of attributes, it does seem logical that it should be possible to pass those names into getattr. I think I'd actually call that a defect in getattr() that it doesn't allow non-string attributes, not a defect in dir(). Ooh...even more annoying, it doesn't even allow unicode attributes that use characters outside the default encoding (ASCII). But either way, there's absolutely no reason to worry about the attribute string being a valid identifier. That's pretty much only a concern for tab-completion in python shells. James From paul at pfdubois.com Thu Aug 18 05:05:32 2005 From: paul at pfdubois.com (Paul F. Dubois) Date: Wed, 17 Aug 2005 20:05:32 -0700 Subject: [Python-Dev] Deprecating builtin id (and moving it to, sys()) Message-ID: <4303FAFC.3070204@pfdubois.com> -1 for this proposal from me. I use id some and therefore the change would break some of my code. 
Breaking existing code without some overwhelming reason is a very bad idea, in my opinion. The reason cited here, that the name is so natural that one is tempted to use it, applies to many builtins. Ever written dict = {} and then said to yourself, gee, that isn't a very good idea? I have. Besides that, the fact that an object has an identity, behaviors, and data is primary. For teaching beginners id() is important. Paul From anthony at interlink.com.au Thu Aug 18 06:09:16 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu, 18 Aug 2005 14:09:16 +1000 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <20050817140217.GQ3389@www.async.com.br> References: <20050817140217.GQ3389@www.async.com.br> Message-ID: <200508181409.17431.anthony@interlink.com.au> On Thursday 18 August 2005 00:02, Christian Robottom Reis wrote: > I wonder: is moving id() to sys doable in the 2.5 cycle, with a > deprecation warning being raised for people using the builtin? We'd then > phase it out in one of the latter 2.x versions. I'm neutral on putting id() also into sys. I'm -1 on either issuing a deprecation warning or, worse yet, removing the id() builtin. The warnings system is expensive to call, and I know from a brief look at a bunch of code that I use id() inside some tight inner loops. Removing it entirely is gratuitous breakage, for a not very high payoff. If you _really_ want to call a local variable 'id' you can (but shouldn't). You also can't/shouldn't call a variable 'class', 'def', or 'len' -- but I don't see any movement to allow these... Anthony -- Anthony Baxter It's never too late to have a happy childhood. From mal at egenix.com Thu Aug 18 09:36:14 2005 From: mal at egenix.com (M.-A. 
Lemburg) Date: Thu, 18 Aug 2005 09:36:14 +0200 Subject: [Python-Dev] SWIG and rlcompleter In-Reply-To: <7F4E4259-CE92-4AED-823F-5E06BECAED6C@fuhm.net> References: <2m4q9qunao.fsf@starship.python.net> <000801c5a2c6$5ca70080$af26c797@oemcomputer> <972ec5bd05081711555e9ad129@mail.gmail.com> <7F4E4259-CE92-4AED-823F-5E06BECAED6C@fuhm.net> Message-ID: <43043A6E.5020109@egenix.com> James Y Knight wrote: > On Aug 17, 2005, at 2:55 PM, Timothy Fitz wrote: > > >>On 8/16/05, Raymond Hettinger wrote: >> >> >>>-0 The behavior of dir() already a bit magical. Python is much >>>simpler >>>to comprehend if we have direct relationships like dir() and vars() >>>corresponding as closely as possible to the object's dictionary. If >>>someone injects non-strings into an attribute dictionary, why should >>>dir() hide that fact? >>> >> >>Indeed, there seem to be two camps, those who want dir to reflect >>__dict__ >>and those who want dir to reflect attributes of an object. It seems to >>me that those who want dir to reflect __dict__ should just use >>__dict__ in the first place. >> >>However, in the case of dir handling non-strings, should dir handle >>non-valid identifiers as well, that is to say that while >>foo.__dict__[2] = ... is an obvious case what about foo.__dict__["1"] >>? >> >>Right now the documentation says that it returns "attributes", and I >>would not consider non-strings to be attributes, so either the >>documentation or the implementation should rectify this disagreement. >> > > > I initially was going to say no, there's no reason to restrict your > idea of "attributes" to be purely strings, because surely you could > use non-strings as attributes if you wished to. 
But Python proves me > wrong: > >>> class X: pass > >>> X.__dict__[1] = 5 > >>> dir(X) > [1, '__doc__', '__module__'] > >>> getattr(X, 1) > TypeError: getattr(): attribute name must be string > > If dir() is supposed to return the list of attributes, it does seem > logical that it should be possible to pass those names into getattr. > I think I'd actually call that a defect in getattr() that it doesn't > allow non-string attributes, not a defect in dir(). Ooh...even more > annoying, it doesn't even allow unicode attributes that use > characters outside the default encoding (ASCII). Which is quite natural: Python doesn't allow any non-ASCII identifiers either :-) > But either way, there's absolutely no reason to worry about the > attribute string being a valid identifier. That's pretty much only a > concern for tab-completion in python shells. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 18 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Thu Aug 18 11:39:37 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 18 Aug 2005 11:39:37 +0200 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <200508181409.17431.anthony@interlink.com.au> References: <20050817140217.GQ3389@www.async.com.br> <200508181409.17431.anthony@interlink.com.au> Message-ID: <43045759.80806@v.loewis.de> Anthony Baxter wrote: > Removing it entirely is gratuitous breakage, for a not very high payoff. If > you _really_ want to call a local variable 'id' you can (but shouldn't). 
> You also can't/shouldn't call a variable 'class', 'def', or 'len' -- but I > don't see any movement to allow these... This is getting off-topic, but... In C#, you can: you write @class, @void, @return. Apparently, this is so that you can access arbitrary COM objects (which may happen to use C# keywords as method names). Of course, we would put an underscore after the name in that case. Regards, Martin From ianb at colorstudy.com Thu Aug 18 17:54:49 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 18 Aug 2005 10:54:49 -0500 Subject: [Python-Dev] PEP 309: Partial method application Message-ID: <4304AF49.6030007@colorstudy.com> I missed the discussion on this (http://www.python.org/peps/pep-0309.html), but then 2.5 isn't out yet. I think partial() misses an important use case of method getting, for instance: lst = ['A', 'b', 'C'] lst.sort(key=partialmethod('lower')) Which sorts by lower-case. Of course you can use str.lower, except you'll have unnecessarily enforced a type (and excluded Unicode). So you are left with lambda x: x.lower(). Here's an implementation: def partialmethod(method, *args, **kw): def call(obj, *more_args, **more_kw): call_kw = kw.copy() call_kw.update(more_kw) return getattr(obj, method)(*(arg+more_args), **call_kw) return call This is obviously related to partial(). Maybe this implementation should be a classmethod or function attribute, partial.method(). 
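A minimal sketch of the partialmethod() idea described above, written for a current Python 3 interpreter rather than the 2.4/2.5 of this thread. It keeps the name and signature from the message; the only change is that the stored positional arguments are spelled args consistently. The sample lists are made up for illustration.

```python
def partialmethod(method, *args, **kw):
    """Return a function that calls the named method on its first argument."""
    def call(obj, *more_args, **more_kw):
        # Keyword arguments given at call time override the stored ones.
        call_kw = dict(kw)
        call_kw.update(more_kw)
        # Look the method up on the object itself, so any type that has
        # a method of that name will work (no str-only restriction).
        return getattr(obj, method)(*(args + more_args), **call_kw)
    return call


words = ['b', 'C', 'A']
words.sort(key=partialmethod('lower'))
print(words)

# Stored arguments are passed through to the method as well.
split_on_comma = partialmethod('split', ',')
print(split_on_comma('x,y,z'))
```

Because the method is looked up by name on each object, this does not tie the key function to one type the way str.lower does, which is the use case the message argues for.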
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From raymond.hettinger at verizon.net Thu Aug 18 18:00:06 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 18 Aug 2005 12:00:06 -0400 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <4304AF49.6030007@colorstudy.com> Message-ID: <003401c5a40d$ea5694c0$8b24c797@oemcomputer> [Ian Bicking] > I think partial() misses an important use case of method getting, for > instance: > > lst = ['A', 'b', 'C'] > lst.sort(key=partialmethod('lower')) We've already got one: lst.sort(key=operator.attrgetter('lower')) Raymond From gvanrossum at gmail.com Thu Aug 18 18:22:01 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu, 18 Aug 2005 09:22:01 -0700 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <200508181409.17431.anthony@interlink.com.au> References: <20050817140217.GQ3389@www.async.com.br> <200508181409.17431.anthony@interlink.com.au> Message-ID: On 8/17/05, Anthony Baxter wrote: > If you _really_ want to call a local variable 'id' you can (but shouldn't). Disagreed. The built-in namespace is searched last for a reason -- the design is such that if you don't care for a particular built-in you don't need to know about it. > You also can't/shouldn't call a variable 'class', 'def', or 'len' -- but I > don't see any movement to allow these... Please don't propagate the confusion between reserved keywords and built-in names! 
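The distinction drawn above between reserved keywords and built-in names can be checked directly. This is a small sketch for a current Python 3 interpreter (the module was called __builtin__ in the Python 2 of this thread); the function label() is a hypothetical example, not something from the discussion.

```python
import builtins
import keyword


def label(id):
    # 'id' is an ordinary local name here. It hides the built-in only
    # inside this function, because the built-in namespace is searched last.
    return "item-%s" % id


# 'class' is a reserved keyword and can never be used as a variable name.
print(keyword.iskeyword("class"))
# 'id' and 'len' are not keywords; they are just names in the builtins module.
print(keyword.iskeyword("id"))
print(hasattr(builtins, "id"))
print(label(42))
```

Writing "class = 1" is a syntax error, whereas "id = 1" is merely a rebinding that shadows the built-in in that scope.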
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From steven.bethard at gmail.com Thu Aug 18 18:34:00 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu, 18 Aug 2005 10:34:00 -0600 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <003401c5a40d$ea5694c0$8b24c797@oemcomputer> References: <4304AF49.6030007@colorstudy.com> <003401c5a40d$ea5694c0$8b24c797@oemcomputer> Message-ID: Raymond Hettinger wrote: > [Ian Bicking] > > I think partial() misses an important use case of method getting, for > > instance: > > > > lst = ['A', 'b', 'C'] > > lst.sort(key=partialmethod('lower')) > > We've already got one: > > lst.sort(key=operator.attrgetter('lower')) Doesn't that just sort on the str.lower or unicode.lower method object? py> sorted(['A', u'b', 'C'], key=operator.attrgetter('lower')) [u'b', 'C', 'A'] py> sorted(['A', u'b', 'C'], key=partialmethod('lower')) # after fixing arg -> args bug ['A', u'b', 'C'] STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From raymond.hettinger at verizon.net Thu Aug 18 18:43:07 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 18 Aug 2005 12:43:07 -0400 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: Message-ID: <003901c5a413$e997c9e0$8b24c797@oemcomputer> > > [Ian Bicking] > > > I think partial() misses an important use case of method getting, for > > > instance: > > > > > > lst = ['A', 'b', 'C'] > > > lst.sort(key=partialmethod('lower')) > > > > We've already got one: > > > > lst.sort(key=operator.attrgetter('lower')) > > Doesn't that just sort on the str.lower or unicode.lower method object? My mistake. It sorts on the bound method rather than the results of applying that method. 
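The difference noted above can be seen in a few lines. This sketch assumes a current Python 3 interpreter; operator.methodcaller, used at the end, was added to the standard library after this thread (in Python 2.6) and is shown only for comparison.

```python
import operator

get_lower = operator.attrgetter('lower')

# attrgetter only fetches the attribute: the result is the bound method
# 'A'.lower, which has not been called yet.
bound = get_lower('A')
print(callable(bound))
print(bound())

# A key function has to return the value to sort on, so the method must
# actually be called, for example with a lambda.
print(sorted(['b', 'C', 'A'], key=lambda s: s.lower()))

# methodcaller builds a callable that performs the call itself.
call_lower = operator.methodcaller('lower')
print(sorted(['b', 'C', 'A'], key=call_lower))
```

In Python 3 the attrgetter('lower') version is not usable as a sort key at all, since bound methods do not support ordering comparisons.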
Raymond From ianb at colorstudy.com Thu Aug 18 20:05:54 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 18 Aug 2005 13:05:54 -0500 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <003901c5a413$e997c9e0$8b24c797@oemcomputer> References: <003901c5a413$e997c9e0$8b24c797@oemcomputer> Message-ID: <4304CE02.7080308@colorstudy.com> Raymond Hettinger wrote: >>>>instance: >>>> >>>> lst = ['A', 'b', 'C'] >>>> lst.sort(key=partialmethod('lower')) >>> >>>We've already got one: >>> >>> lst.sort(key=operator.attrgetter('lower')) >> >>Doesn't that just sort on the str.lower or unicode.lower method >> object? > > My mistake. It sorts on the bound method rather than the results of > applying that method. Then I thought it might be right to do partial(operator.attrgetter('lower')). This, however, accomplishes exactly nothing. I only decided this after actually trying it, though upon reflection partial(function) always accomplishes nothing. I don't have any conclusion from this, but only mention it to demonstrate that callables on top of callables are likely to confuse. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From martin at v.loewis.de Thu Aug 18 21:40:19 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 18 Aug 2005 21:40:19 +0200 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <4304AF49.6030007@colorstudy.com> References: <4304AF49.6030007@colorstudy.com> Message-ID: <4304E423.9050005@v.loewis.de> Ian Bicking wrote: > > lst = ['A', 'b', 'C'] > lst.sort(key=partialmethod('lower')) > > Which sorts by lower-case. Of course you can use str.lower, except > you'll have unnecessarily enforced a type (and excluded Unicode). So > you are left with lambda x: x.lower(). For this specific case, you can use string.lower (which is exactly what the lambda function does). 
As for the more general proposal: -1 on more places to pass strings to denote method/function/class names. These are ugly to type. What I think you want is not a partial method, instead, you want to turn a method into a standard function, and in a 'virtual' way. So I would propose the syntax lst.sort(key=virtual.lower) # where virtual is functional.virtual As for extending PEP 309: This PEP deliberately abstained from other ways of currying, and instead only introduced the functional module. If you want to see "lazy functions" in the standard library, you should write a new PEP (unless there is an easy agreement about a single right way to do this, which I don't see). Regards, Martin P.S. It's not even clear that this should be added to functional, as attrgetter and itemgetter are already in operator. But, perhaps, they should be in functional. From shane at hathawaymix.org Thu Aug 18 22:13:31 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Thu, 18 Aug 2005 14:13:31 -0600 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <4304E423.9050005@v.loewis.de> References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de> Message-ID: <4304EBEB.9000300@hathawaymix.org> Martin v. Löwis wrote: > So I would propose the syntax > > lst.sort(key=virtual.lower) # where virtual is functional.virtual Ooh, may I say that idea is interesting! It's easy to implement, too: class virtual: def __getattr__(self, name): return lambda obj: getattr(obj, name)() virtual = virtual() Shane From gvanrossum at gmail.com Thu Aug 18 22:17:16 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu, 18 Aug 2005 13:17:16 -0700 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <4304E423.9050005@v.loewis.de> References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de> Message-ID: On 8/18/05, "Martin v.
Löwis" wrote: > As for the more general proposal: -1 on more places to pass strings to > denote method/function/class names. These are ugly to type. Agreed. > What I think you want is not a partial method, instead, you want to > turn a method into a standard function, and in a 'virtual' way. > > So I would propose the syntax > > lst.sort(key=virtual.lower) # where virtual is functional.virtual I like this, but would hope for a different name -- the poor word 'virtual' has been abused enough by C++. > P.S. It's not even clear that this should be added to functional, > as attrgetter and itemgetter are already in operator. But, perhaps, > they should be in functional. They feel related to attrgetter more than to partial. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Thu Aug 18 22:46:00 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 18 Aug 2005 13:46:00 -0700 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de> Message-ID: On 8/18/05, Guido van Rossum wrote: > On 8/18/05, "Martin v. Löwis" wrote: > > As for the more general proposal: -1 on more places to pass strings to > > denote method/function/class names. These are ugly to type. > > Agreed. > > > What I think you want is not a partial method, instead, you want to > > turn a method into a standard function, and in a 'virtual' way. > > > > So I would propose the syntax > > > > lst.sort(key=virtual.lower) # where virtual is functional.virtual > > I like this, but would hope for a different name -- the poor word > 'virtual' has been abused enough by C++. > Yeah, me too. Possible name are 'delayed', 'lazyattr', or just plain 'lazy' since it reminds me of Haskell. > > P.S. It's not even clear that this should be added to functional, > > as attrgetter and itemgetter are already in operator. But, perhaps, > > they should be in functional.
> > They feel related to attrgetter more than to partial. > True, but the idea of lazy evaluation, at least for me, reminds me more of functional languages and thus the functional module. Oh, when should we think of putting reduce into functional? I remember this was discussed when it was realized reduce was the only functional built-in that is not covered by itertools or listcomps. -Brett From raymond.hettinger at verizon.net Thu Aug 18 22:52:36 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 18 Aug 2005 16:52:36 -0400 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: Message-ID: <000c01c5a436$e232daa0$8b24c797@oemcomputer> [Guido] > They feel related to attrgetter more than to partial. That suggests operator.methodcall() From ianb at colorstudy.com Thu Aug 18 22:57:38 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 18 Aug 2005 15:57:38 -0500 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de> Message-ID: <4304F642.7040508@colorstudy.com> Brett Cannon wrote: >>>What I think you want is not a partial method, instead, you want to >>>turn a method into a standard function, and in a 'virtual' way. >>> >>>So I would propose the syntax >>> >>> lst.sort(key=virtual.lower) # where virtual is functional.virtual >> >>I like this, but would hope for a different name -- the poor word >>'virtual' has been abused enough by C++. >> > > > Yeah, me too. Possible name are 'delayed', 'lazyattr', or just plain > 'lazy' since it reminds me of Haskell. I don't think there's anything particularly lazy about it. It's like a complement of attrgetter. Where attrgetter is an inversion of getattr, partialmethod is an inversion of... well, of something that currently has no name.
There's kind of an implicit operation in obj.method() -- people will generally read that as a "method call", not as the retrieval of a bound method and later invocation of that method. I think that is why it's so hard to figure out how to represent this in terms of something like attrgetter -- we try to invert something (a method call) that doesn't exist in the language. -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From steven.bethard at gmail.com Thu Aug 18 23:20:09 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu, 18 Aug 2005 15:20:09 -0600 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <4304EBEB.9000300@hathawaymix.org> References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de> <4304EBEB.9000300@hathawaymix.org> Message-ID: Martin v. Löwis wrote: > So I would propose the syntax > > lst.sort(key=virtual.lower) # where virtual is functional.virtual Shane Hathaway wrote: > class virtual: > def __getattr__(self, name): > return lambda obj: getattr(obj, name)() > virtual = virtual() I think (perhaps because of the name) that this could be confusing. I don't have any intuition that "virtual.lower" would return a function that calls the "lower" attribute instead of returning a function that simply accesses that attribute. If we're going to move away from the itemgetter() and attrgetter() style, then we should be consistent about it and provide a solution (or solutions) that answers all of these problems: obj.attr obj.attr(*args, **kwargs) obj[key] I'm not sure that there is a clean/obvious way to do this. STeVe -- You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy From ncoghlan at gmail.com Fri Aug 19 00:43:06 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 19 Aug 2005 08:43:06 +1000 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de> Message-ID: <43050EFA.6070702@gmail.com> Brett Cannon wrote: >>>What I think you want is not a partial method, instead, you want to >>>turn a method into a standard function, and in a 'virtual' way. >>> >>>So I would propose the syntax >>> >>> lst.sort(key=virtual.lower) # where virtual is functional.virtual >> >>I like this, but would hope for a different name -- the poor word >>'virtual' has been abused enough by C++. > > Yeah, me too. Possible name are 'delayed', 'lazyattr', or just plain > 'lazy' since it reminds me of Haskell. Hmm, "methodcall"? As in: lst.sort(key=methodcall.lower) Where "methodcall" is something like what Shane described: class methodcall: def __getattr__(self, name): def delayedcall(*args, **kwds): return getattr(args[0], name)(*args[1:], **kwds) return delayedcall methodcall = methodcall() > > Oh, when should we think of putting reduce into functional? I > remember this was discussed when it was realized reduce was the only > functional built-in that is not covered by itertools or listcomps. I expected functional.map, functional.filter and functional.reduce to all exist in 2.5. Cheers, Nick. 
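A sketch of the attribute-style spelling proposed above, assuming a current Python 3 interpreter. The class name _MethodCall is a hypothetical helper name chosen here; the behaviour follows the delayedcall idea from the message, with the target object taken as the first positional argument.

```python
class _MethodCall:
    """Accessing .name yields a function that calls obj.name(...)."""

    def __getattr__(self, name):
        # __getattr__ only runs for names not found normally, so every
        # ordinary attribute access on the instance ends up here.
        def delayedcall(obj, *args, **kwds):
            return getattr(obj, name)(*args, **kwds)
        return delayedcall


# A single shared instance, used like a namespace of method callers.
methodcall = _MethodCall()

print(sorted(['b', 'C', 'A'], key=methodcall.lower))
print(methodcall.split('x,y,z', ','))
```

Note that in this form any extra arguments are supplied when the returned function is called, not when it is created, so it cannot pre-bind arguments the way a string-based helper could.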
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From bcannon at gmail.com Fri Aug 19 01:05:17 2005 From: bcannon at gmail.com (Brett Cannon) Date: Thu, 18 Aug 2005 16:05:17 -0700 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <43050EFA.6070702@gmail.com> References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de> <43050EFA.6070702@gmail.com> Message-ID: On 8/18/05, Nick Coghlan wrote: > Brett Cannon wrote: > > Oh, when should we think of putting reduce into functional? I > > remember this was discussed when it was realized reduce was the only > > functional built-in that is not covered by itertools or listcomps. > > I expected functional.map, functional.filter and functional.reduce to all > exist in 2.5. > Itertools covers map, filter is covered by genexps. 'reduce' is the only one that does not have an equivalent anywhere. I guess we could cross-link itertools.map into functional.map, but I would rather just mention in the docs of one that it is located in the other module. And filter is just not worth it; that can definitely be covered in the docs of the module. -Brett From jcarlson at uci.edu Fri Aug 19 02:09:09 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 18 Aug 2005 17:09:09 -0700 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: References: <4304EBEB.9000300@hathawaymix.org> Message-ID: <20050818162112.788A.JCARLSON@uci.edu> Steven Bethard wrote: > > Martin v. Löwis wrote: > > So I would propose the syntax > > > > lst.sort(key=virtual.lower) # where virtual is functional.virtual > > Shane Hathaway wrote: > > class virtual: > > def __getattr__(self, name): > > return lambda obj: getattr(obj, name)() > > virtual = virtual() > > I think (perhaps because of the name) that this could be confusing.
I > don't have any intuition that "virtual.lower" would return a function > that calls the "lower" attribute instead of returning a function that > simply accesses that attribute. > > If we're going to move away from the itemgetter() and attrgetter() > style, then we should be consistent about it and provide a solution > (or solutions) that answers all of these problems: > obj.attr > obj.attr(*args, **kwargs) > obj[key] > I'm not sure that there is a clean/obvious way to do this. I thought that: operator.attrgetter() was for obj.attr operator.itemgetter() was for obj[integer_index] That's almost all the way there. All that remains is to have something that gets any key (not just integers) and which handles function calls. In terms of the function call semantics, what about: class methodcall: def __getattr__(self, name, *args, **kwds): def delayedcall(obj): return getattr(obj, name)(*args, **kwds) return delayedcall methodcall = methodcall() - Josiah From steven.bethard at gmail.com Fri Aug 19 07:33:36 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu, 18 Aug 2005 23:33:36 -0600 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <20050818162112.788A.JCARLSON@uci.edu> References: <4304EBEB.9000300@hathawaymix.org> <20050818162112.788A.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > Steven Bethard wrote: > > If we're going to move away from the itemgetter() and attrgetter() > > style, then we should be consistent about it and provide a solution > > (or solutions) that answers all of these problems: > > obj.attr > > obj.attr(*args, **kwargs) > > obj[key] > > I'm not sure that there is a clean/obvious way to do this. > > I thought that: > operator.attrgetter() was for obj.attr > operator.itemgetter() was for obj[integer_index] My point exactly. 
If we're sticking to the same style, I would expect that for obj.method(*args, **kwargs) we would have something like: operator.methodcaller('method', *args, **kwargs) The proposal by Martin v. Löwis is that this should instead look something like: methodcall.method(*args, **kwargs) which is a departure from the current attrgetter() and itemgetter() idiom. I'm not objecting to this approach, by the way. I think with the right name, it would probably read well. I just think that we should try to be consistent one way or the other. If we go with Martin v. Löwis's suggestion, I would then expect that the correlates to attrgetter() and itemgetter() would also be included, e.g.: attrget.attr (for obj.attr) itemget[key] (for obj[key]) STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From martin at v.loewis.de Fri Aug 19 07:59:38 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 19 Aug 2005 07:59:38 +0200 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: References: <4304EBEB.9000300@hathawaymix.org> <20050818162112.788A.JCARLSON@uci.edu> Message-ID: <4305754A.4040900@v.loewis.de> Steven Bethard wrote: >>I thought that: >> operator.attrgetter() was for obj.attr >> operator.itemgetter() was for obj[integer_index] > > > My point exactly. If we're sticking to the same style, I would expect that for > obj.method(*args, **kwargs) > we would have something like: > operator.methodcaller('method', *args, **kwargs) You might be missing one aspect of attrgetter, though. I can have f = operator.attrgetter('name', 'age') and then f(person) gives me (person.name, person.age). Likewise for itemgetter(1,2,3).
Extending this to methodcaller is not natural; you would have x=methodcaller(('open',['foo','r'],{}),('read',[100],{}), ('close',[],{})) and then x(somestorage) (I know this is not the typical open/read/close pattern, where you would normally call read on what open returns) It might be that there is no use case for a multi-call methodgetter; I just point out that a single-call methodgetter would *not* be in the same style as attrgetter and itemgetter. > attrget.attr (for obj.attr) > itemget[key] (for obj[key]) I agree that would be consistent. These also wouldn't allow to get multiple items and indices. I don't know what the common use for attrgetter is: one or more attributes? Regards, Martin From steven.bethard at gmail.com Fri Aug 19 09:14:17 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri, 19 Aug 2005 01:14:17 -0600 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <4305754A.4040900@v.loewis.de> References: <4304EBEB.9000300@hathawaymix.org> <20050818162112.788A.JCARLSON@uci.edu> <4305754A.4040900@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Steven Bethard wrote: > >>I thought that: > >> operator.attrgetter() was for obj.attr > >> operator.itemgetter() was for obj[integer_index] > > > > > > My point exactly. If we're sticking to the same style, I would expect that for > > obj.method(*args, **kwargs) > > we would have something like: > > operator.methodcaller('method', *args, **kwargs) > > You might be missing one aspect of attrgetter, though. I can have > > f = operator.attrgetter('name', 'age') > > and then f(person) gives me (person.name, person.age). Likewise for > itemgetter(1,2,3). [snip] > I don't know what the common use for > attrgetter is: one or more attributes? Well, in current Python code, I'd be willing to wager that it's one, no more, since Python 2.4 only supports a single argument to itemgetter and attrgetter.
Of course, when Python 2.5 comes out, it's certainly possible that the multi-argument forms will become commonplace. I agree that an operator.methodcaller() shouldn't try to support multiple methods. OTOH, the syntax methodcall.method(*args, **kwargs) doesn't really lend itself to multiple methods either. STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From jcarlson at uci.edu Fri Aug 19 09:37:43 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 19 Aug 2005 00:37:43 -0700 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: References: <4305754A.4040900@v.loewis.de> Message-ID: <20050819003609.789B.JCARLSON@uci.edu> Steven Bethard wrote: > I agree that an operator.methodcaller() shouldn't try to support > multiple methods. OTOH, the syntax > methodcall.method(*args, **kwargs) > doesn't really lend itself to multiple methods either. But that's OK, we don't want to be calling multiple methods anyways, do we? I'd personally like to see an example it makes sense if someone says that we do. - Josiah From raymond.hettinger at verizon.net Fri Aug 19 18:39:18 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 19 Aug 2005 12:39:18 -0400 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <20050819003609.789B.JCARLSON@uci.edu> Message-ID: <002501c5a4dc$8b521560$3f25a044@oemcomputer> [Steven Bethard] > > I agree that an operator.methodcaller() shouldn't try to support > > multiple methods. OTOH, the syntax > > methodcall.method(*args, **kwargs) > > doesn't really lend itself to multiple methods either. [Josiah Carlson] > But that's OK, we don't want to be calling multiple methods anyways, do > we? I'd personally like to see an example it makes sense if someone > says that we do. If an obvious syntax doesn't emerge, don't fret. The most obvious approach is to define a regular Python function and supply that function to the key= argument for list.sort() or sorted(). 
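A minimal sketch of that plain-function approach. The sample data and the helper name lowered are assumptions chosen for illustration, not something taken from this thread:

```python
def lowered(s):
    """Key function: call the method we care about on each element."""
    return s.lower()


words = ["banana", "Cherry", "apple"]

# sorted() calls lowered() once per element and compares the results.
by_lowercase = sorted(words, key=lowered)

# list.sort() accepts the same key= argument and sorts in place.
words.sort(key=lowered)

print(by_lowercase)
```

The key function is an ordinary def, so it can call any method with any arguments without needing new syntax.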
A virtue of the key= argument was reducing O(n log n) calls to just O(n). Further speed-ups are a false economy. So there's no need to twist syntax into knots just to get a C based method calling function. Likewise with map(), if a new function doesn't fit neatly, take that as a cue to be writing a plain for-loop. Raymond From martin at v.loewis.de Fri Aug 19 22:08:23 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 19 Aug 2005 22:08:23 +0200 Subject: [Python-Dev] PEP 309: Partial method application In-Reply-To: <20050819003609.789B.JCARLSON@uci.edu> References: <4305754A.4040900@v.loewis.de> <20050819003609.789B.JCARLSON@uci.edu> Message-ID: <43063C37.8040808@v.loewis.de> Josiah Carlson wrote: > Steven Bethard wrote: > >>I agree that an operator.methodcaller() shouldn't try to support >>multiple methods. OTOH, the syntax >> methodcall.method(*args, **kwargs) >>doesn't really lend itself to multiple methods either. > > > But that's OK, we don't want to be calling multiple methods anyways, do > we? I'd personally like to see an example it makes sense if someone > says that we do. Several people argued that the version with a string method name should be added "for consistency". I only pointed out that doing so would not be completely consistent. Regards, Martin From jeremy at alum.mit.edu Fri Aug 19 23:15:15 2005 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 19 Aug 2005 17:15:15 -0400 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: References: <20050817140217.GQ3389@www.async.com.br> <200508181409.17431.anthony@interlink.com.au> Message-ID: On 8/18/05, Guido van Rossum wrote: > On 8/17/05, Anthony Baxter wrote: > > If you _really_ want to call a local variable 'id' you can (but shouldn't). > > Disagreed. The built-in namespace is searched last for a reason -- the > design is such that if you don't care for a particular built-in you > don't need to know about it. 
In practice, it causes much confusion if you ever use a local variable that has the same name as the built-in namespace. If you intend to use id as a variable, it leads to confusing messages when a typo or editing error accidentally removes the definition, because the name will still be defined for you. It also leads to confusion when you later want to use the builtin in the same module or function (or in the debugger). If Python defines the name, I don't want to provide a redefinition. Jeremy From gvanrossum at gmail.com Sat Aug 20 06:00:17 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri, 19 Aug 2005 21:00:17 -0700 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: References: <20050817140217.GQ3389@www.async.com.br> <200508181409.17431.anthony@interlink.com.au> Message-ID: On 8/19/05, Jeremy Hylton wrote: > On 8/18/05, Guido van Rossum wrote: > > On 8/17/05, Anthony Baxter wrote: > > > If you _really_ want to call a local variable 'id' you can (but shouldn't). > > > > Disagreed. The built-in namespace is searched last for a reason -- the > > design is such that if you don't care for a particular built-in you > > don't need to know about it. > > In practice, it causes much confusion if you ever use a local variable > that has the same name as the built-in namespace. If you intend to > use id as a variable, it leads to confusing messages when a typo or > editing error accidentally removes the definition, because the name > will still be defined for you. It also leads to confusion when you > later want to use the builtin in the same module or function (or in > the debugger). If Python defines the name, I don't want to provide a > redefinition. This has startled me a few times, but never for more than 30 seconds. In correct code there sure isn't any confusion. 
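A small sketch of the failure mode Jeremy describes, under the assumption that a line assigning to a local variable named id has been deleted by accident. The function find_record and the sample data are hypothetical:

```python
def find_record(records, raw_key):
    """Look up a record; imagine the line `id = raw_key.strip()` was lost."""
    # With the assignment gone, `id` does not raise NameError here:
    # it quietly resolves to the built-in function instead.
    return records[id]


records = {"a1": "first", "b2": "second"}

try:
    find_record(records, " a1 ")
    error_key = None
except KeyError as exc:
    # The key that was not found is the built-in function, not a string.
    error_key = exc.args[0]

print(error_key)
```

The resulting KeyError mentions a built-in function rather than an undefined name, which is the kind of message that can take a moment to decode.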
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From anthony at interlink.com.au Sat Aug 20 10:48:11 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Sat, 20 Aug 2005 18:48:11 +1000 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: References: <20050817140217.GQ3389@www.async.com.br> <200508181409.17431.anthony@interlink.com.au> Message-ID: <200508201848.13750.anthony@interlink.com.au> On Friday 19 August 2005 02:22, Guido van Rossum wrote: > On 8/17/05, Anthony Baxter wrote: > > If you _really_ want to call a local variable 'id' you can (but > > shouldn't). > > Disagreed. The built-in namespace is searched last for a reason -- the > design is such that if you don't care for a particular built-in you > don't need to know about it. I'm not sure what you're disagreeing with. Are you saying you _can't_ call a variable 'id', or that it's OK to do this? > > You also can't/shouldn't call a variable 'class', 'def', or 'len' -- but > > I don't see any movement to allow these... > > Please don't propagate the confusion between reserved keywords and > built-in names! It's not a matter of 'confusion', more that there are some names you can't or shouldn't use in Python. When coding twisted, often the most obvious 'short' name for a Deferred is 'def', but of course that doesn't work. 
Anthony From gvanrossum at gmail.com Sat Aug 20 18:02:25 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 20 Aug 2005 09:02:25 -0700 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: <200508201848.13750.anthony@interlink.com.au> References: <20050817140217.GQ3389@www.async.com.br> <200508181409.17431.anthony@interlink.com.au> <200508201848.13750.anthony@interlink.com.au> Message-ID: On 8/20/05, Anthony Baxter wrote: > On Friday 19 August 2005 02:22, Guido van Rossum wrote: > > On 8/17/05, Anthony Baxter wrote: > > > If you _really_ want to call a local variable 'id' you can (but > > > shouldn't). > > > > Disagreed. The built-in namespace is searched last for a reason -- the > > design is such that if you don't care for a particular built-in you > > don't need to know about it. > > I'm not sure what you're disagreeing with. Are you saying you _can't_ call > a variable 'id', or that it's OK to do this? That it's OK. > > > You also can't/shouldn't call a variable 'class', 'def', or 'len' -- but > > > I don't see any movement to allow these... > > > > Please don't propagate the confusion between reserved keywords and > > built-in names! > > It's not a matter of 'confusion', more that there are some names you can't > or shouldn't use in Python. When coding twisted, often the most obvious > 'short' name for a Deferred is 'def', but of course that doesn't work. My point is that there are two reasons for not using such a name. With 'def', you *can't*. With 'len', you *could* (but it would be unwise). With 'id', IMO it's okay. 
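The distinction between the three names can be checked directly. This is only an illustrative sketch; the function label and its argument are made up for the example:

```python
import keyword

# 'def' is a reserved keyword; 'len' and 'id' are ordinary built-in names.
is_keyword = dict((name, keyword.iskeyword(name)) for name in ("def", "len", "id"))

# Binding a keyword is rejected by the compiler before any code runs.
try:
    compile("def = 1", "<example>", "exec")
    def_assignable = True
except SyntaxError:
    def_assignable = False


def label(id):
    """Using id as a parameter name only shadows the built-in locally."""
    return "item-%s" % id


labelled = label(7)

# Outside label(), the built-in id is untouched.
builtin_id_intact = callable(id)

print(is_keyword, def_assignable, labelled, builtin_id_intact)
```

Because the built-in namespace is searched last, the shadowing inside label() has no effect on any other scope.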
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Sat Aug 20 18:14:51 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 20 Aug 2005 09:14:51 -0700 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com> Message-ID: I'm ready to accept the general idea of moving to subversion and away from SourceForge. On the hosting issue, I'm still neutral -- I expect we'll be able to support the current developer crowd easily on svn.python.org, but if we ever find there are resource problems (either people or bandwidth etc.) I just received a recommendation for wush.net which specializes in svn hosting. $90/month for 5 Gb of disk space sounds like a good deal and easily within the PSF budget. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bob at redivi.com Sat Aug 20 18:45:05 2005 From: bob at redivi.com (Bob Ippolito) Date: Sat, 20 Aug 2005 06:45:05 -1000 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com> Message-ID: On Aug 20, 2005, at 6:14 AM, Guido van Rossum wrote: > I'm ready to accept the general idea of moving to subversion and away > from SourceForge. > > On the hosting issue, I'm still neutral -- I expect we'll be able to > support the current developer crowd easily on svn.python.org, but if > we ever find there are resource problems (either people or bandwidth > etc.) I just received a recommendation for wush.net which specializes > in svn hosting. $90/month for 5 Gb of disk space sounds like a good > deal and easily within the PSF budget. We were using wush.net's subversion and trac service for a (commercial) project from February until a little over a week ago. Their servers dropped off the internet for about three days straight earlier this month and we were unable to contact anyone.
I still don't think we've received an explanation as to what happened. When it did come up, our data was OK. Previous to that experience, it worked out OK. The subversion repository got wedged once, but that was fixed in a matter of hours after filing a ticket. We host our own subversion and trac now. We just can't afford that kind of downtime again. Setting up subversion and trac isn't a very big deal, and they don't really require any real maintenance as far as I can tell (.. and I have been dealing with subversion over apache via mod_dav_svn since pre-1.0 days). Another thing to note is that the trac installation at wush.net is a branch off the latest stable version, and the database can't be downgraded or upgraded correctly by the trac-admin tool. However, the SQL to downgrade the schema to the latest stable is trivial and I still have it lying around if anyone is interested in moving their trac repositories off of wush ;) -bob From barry at python.org Sat Aug 20 20:37:02 2005 From: barry at python.org (Barry Warsaw) Date: Sat, 20 Aug 2005 14:37:02 -0400 Subject: [Python-Dev] A testing challenge In-Reply-To: <4303C0DD.5090903@spikesource.com> References: <4303C0DD.5090903@spikesource.com> Message-ID: <1124563022.24297.35.camel@presto.wooz.org> On Wed, 2005-08-17 at 18:57, Calvin Austin wrote: > When was the last time someone thanked you for writing a test? I tried > to think of the last time it happened to me and I can't remember. Well > at Spikesource we want to thank you not just for helping the Python > community but for your testing efforts too and we are running a > participatory testing contest. This is a competition where there are no > losers, every project gains if new tests are written. For more details > see below, it is open worldwide. feel free to send questions to me. Since you posted to python-dev, you might think about adding Python to the list of languages "in which [...] the project [is] written" on the registration form. 
Currently, the only choices are C/C++, Java, and PHP. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050820/fa521b8b/attachment.pgp From paolo_veronelli at libero.it Sun Aug 21 11:35:37 2005 From: paolo_veronelli at libero.it (Paolino) Date: Sun, 21 Aug 2005 11:35:37 +0200 Subject: [Python-Dev] On decorators implementation Message-ID: <43084AE9.20900@libero.it> I noticed (via using them) that decorations are applied to methods before they become methods. This choice flattens down the implementation to not differentiating methods from functions. 1) I have to apply heuristics on the wrapped function type when I use the function as an index key. if type(observed) is types.MethodType: observed=observed.im_func things like this are inside my decorators. 2) The behavior of decorations is not definable. I imagine that a method implementation of them inside the type metaclass could be better specified by people. This probably ends up in metamethods or something I can't grasp Thanks Paolino From martin at v.loewis.de Sun Aug 21 13:18:58 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 21 Aug 2005 13:18:58 +0200 Subject: [Python-Dev] On decorators implementation In-Reply-To: <43084AE9.20900@libero.it> References: <43084AE9.20900@libero.it> Message-ID: <43086322.2030706@v.loewis.de> Paolino wrote: > I imagine that a method implementation of them inside the type metaclass > could be better specified by people. What you ask for is unimplementable. Method objects are created only when the method is accessed, not (even) when the class is created. Watch this: >>> class X: ... def foo(self): ... pass ...
>>> x=X() >>> type(x.foo) <type 'instancemethod'> >>> type(X.__dict__['foo']) <type 'function'> So even though the class has long been defined, inside X's dictionary, foo is still a function. Only when you *access* x.foo, a method object is created on the fly: >>> x.foo is x.foo False Therefore, a decorator function cannot possibly get access to the method object - it simply doesn't exist. Regards, Martin From martin at v.loewis.de Sun Aug 21 15:12:00 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 21 Aug 2005 15:12:00 +0200 Subject: [Python-Dev] Admin access using svn+ssh Message-ID: <43087DA0.702@v.loewis.de> It turns out that svn+ssh with a single account has limitations: you can only set the tunnel user when you are using a restricted key. In PEP 347, the plan is that the current SF project admins get shell access to the pythondev account, which just has been created. To resolve this, project admins need two different SSH keys: one for accessing the shell, and one for regular commit activities. I would suggest that the default key is used for regular commits, and a separate key is created for shell access. I described this a bit in the PEP, essentially, in .ssh/config, I have Host pythondev Hostname dinsdale.python.org User pythondev IdentityFile ~/.ssh/pythondev So when I do "ssh pythondev", I get the shell account; when I do "svn co svn+ssh://pythondev at svn.python.org/python/trunk/Modules", I use my default identity, which gets tunneled as "Martin v. Loewis". Regards, Martin From martin at v.loewis.de Sun Aug 21 15:34:57 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 21 Aug 2005 15:34:57 +0200 Subject: [Python-Dev] wush.net details Message-ID: <43088301.5010104@v.loewis.de> I made a service request at wush.net, asking for more details about their service. There was a first response within 6 hours, asking for more time to prepare an answer.
I said I don't need one urgently, and, with apologies, got a response one week later. I added the essence to the PEP; namely: - The machine would be a Virtuozzo Virtual Private Server (VPS), hosted at PowerVPS. - The default repository URL would be http://python.wush.net/svn/projectname/, but anything else could be arranged - we would get SSH login to the machine, with sudo capabilities. - They have a Web interface for management of the various SVN repositories that we want to host, and to manage user accounts. While svn+ssh would be supported, the user interface does not yet support it (although he said they might have something in September) - For offsite mirroring/backup, they suggest to use rsync instead of download of repository tarballs. So it seems that the "regular" administrative overhead would be roughly the same on wush.net and python.org: we would have to maintain account information ourselves; the initial setup might be easier due to the UI wizard help. I understand that the hope when using a commercial service is that its availability is higher, due to us paying somebody for the availability. Of course, Bob Ippolito's report is discouraging here. Regards, Martin From martin at v.loewis.de Sun Aug 21 15:43:59 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 21 Aug 2005 15:43:59 +0200 Subject: [Python-Dev] PEP 347: Migration to Subversion In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com> Message-ID: <4308851F.9090800@v.loewis.de> Guido van Rossum wrote: > On the hosting issue, I'm still neutral -- I expect we'll be able to > support the current developer crowd easily on svn.python.org, but if > we ever find ther are resource problems (either people or bandwidth > etc.) I just received a recommendation for wush.net which specializes > in svn hosting. $90/month for 5 Gb of disk space sounds like a good > deal and easily within the PSF budget. 
I also have wush.net in the PEP, see my separate message. I'm not sure what it really is that we get over what we get from XS4ALL for free. From the day-to-day maintenance, they seem comparable: they do backup for us, and we have to maintain accounts ourselves. Of course, wush.net has a Web GUI for maintenance activities (create repositories, create accounts, manage access control). I left out bandwidth details so far: we get 200GB/mo; after this, it is $50/200GB. Another issue might be server load. I don't know how many VPS they host on a single machine, or what their hardware is, but in either case, pythondev developer svn would be shared with something else (other VPSs for wush.net, regular pydotorg activities on python.org). Only day-to-day experience will tell whether this is acceptable. The critical issue seems to be availability: if the service goes down, when will it come back? Bob's experience is discouraging, but then, there also was a python.org outage from time to time (e.g. when MoinMoin consumed all CPU). As for the money itself: $90/month certainly is not an issue at all. So far, I haven't received any other specific referrals for SVN hosters. Regards, Martin From martin at v.loewis.de Sun Aug 21 15:53:37 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 21 Aug 2005 15:53:37 +0200 Subject: [Python-Dev] Collecting SSH keys Message-ID: <43088761.7010905@v.loewis.de> I have set up a test installation on svn.python.org, so that developers can see how this would work. So if you are currently a sf.net/projects/python developer, please send me your SSH key before August 27 or after September 12. We will use real names for commit messages, so if you have specific preferences about the spelling of your name, please indicate them. The repository will be discarded after the testing, so feel free to make any changes you want.
It's not decided yet whether the repository will eventually run on python.org, but it seems clear to me that we likely will use svn+ssh for developer access, unless testing reveals disadvantages of doing so. Please also look at the result of the conversion; if you find any issues, please report them. There is currently no anonymous WebDAV access to the repository. Regards, Martin From sjoerd at acm.org Sun Aug 21 17:28:07 2005 From: sjoerd at acm.org (Sjoerd Mullender) Date: Sun, 21 Aug 2005 17:28:07 +0200 Subject: [Python-Dev] Collecting SSH keys In-Reply-To: <43088761.7010905@v.loewis.de> References: <43088761.7010905@v.loewis.de> Message-ID: <43089D87.2060302@acm.org> Martin v. L?wis wrote: > I have setup a test installation on svn.python.org, so that > developers can see how this would work. > > So if you are currently a sf.net/projects/python developer, > please send me your SSH key before August 27 or after > September 12. We will use real names for commit messages, > so if you have specific preferences about the spelling > of your name, please indicate them. What about people with a whole host of ssh keys? I have a different key for each system I use (currently at least 6). Will this be supported? Will the different keys identify the same person? > The repository will be discarded after the testing, so > feel free to make any changes you want. > > It's not decided yet whether the repository will eventually > run on python.org, but it seems clear to me that we likely > will use svn+ssh for developer access, unless testing > reveals disadvantages of doing so. > > Please also look at the result of the conversion; if you > find any issues, please report them. > > There is currently no anonymous WebDAV access to the > repository. 
> > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/sjoerd.mullender%40cwi.nl -- Sjoerd Mullender -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 369 bytes Desc: OpenPGP digital signature Url : http://mail.python.org/pipermail/python-dev/attachments/20050821/0afb7716/signature.pgp From martin at v.loewis.de Sun Aug 21 18:19:45 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 21 Aug 2005 18:19:45 +0200 Subject: [Python-Dev] Collecting SSH keys In-Reply-To: <43089D87.2060302@acm.org> References: <43088761.7010905@v.loewis.de> <43089D87.2060302@acm.org> Message-ID: <4308A9A1.1040700@v.loewis.de> Sjoerd Mullender wrote: > What about people with a whole host of ssh keys? I have a different key > for each system I use (currently at least 6). Will this be supported? > Will the different keys identify the same person? That would be possible, yes. You should send a single file containing all of them, and, each time something changes, resend the entire file. All of your keys would identify "Sjoerd Mullender". I don't know how this scales in OpenSSH having an authorized_keys file with hundred or more keys. On the wire, this seems safe, as it apparently is the client which offers various keys, and the server which then accepts or rejects them. 
Regards, Martin From skip at pobox.com Sat Aug 20 04:32:12 2005 From: skip at pobox.com (skip@pobox.com) Date: Fri, 19 Aug 2005 21:32:12 -0500 Subject: [Python-Dev] Deprecating builtin id (and moving it to sys()) In-Reply-To: References: <20050817140217.GQ3389@www.async.com.br> <200508181409.17431.anthony@interlink.com.au> Message-ID: <17158.38444.778226.955186@montanaro.dyndns.org> Guido> The built-in namespace is searched last for a reason -- the Guido> design is such that if you don't care for a particular built-in Guido> you don't need to know about it. In my mind there are three classes of builtins from the standpoint of overriding. Pychecker complains if you override any of them, but I think that many times it does so unnecessarily. The first class includes those builtins that you will likely find in many code samples and should just never be overridden. For me these include "abs", "map", "list", "int", "range", "zip", the various exceptions, etc. The second class of builtins consists of objects or functions that are fairly special-purpose. You might not really care if they are overridden, depending on context. For me this class includes "compile", "id", "reload", "execfile", "ord", etc. Finally, there is the subset of builtins that is included almost solely as a convenience for use at the interpreter prompt. They include "quit", "exit" and "copyright". I could care less if I override them in my code, and don't think pychecker should either. Skip From barry at python.org Mon Aug 22 01:01:22 2005 From: barry at python.org (Barry Warsaw) Date: Sun, 21 Aug 2005 19:01:22 -0400 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <43087DA0.702@v.loewis.de> References: <43087DA0.702@v.loewis.de> Message-ID: <1124665281.31664.28.camel@geddy.wooz.org> On Sun, 2005-08-21 at 09:12, "Martin v. L?wis" wrote: > It turns out that svn+ssh with a single account has limitations: > you can only set the tunnel user when you are using a restricted > key. 
In PEP 347, the plan is that the current SF project admins > get shell access to the pythondev account, which just has been > created. > > To resolve this, project admins need two different SSH keys: > one for accessing the shell, and one for regular commit activities. I may be totally misunderstanding, but to get shell access wouldn't I avoid using the pythondev account and just use my own account? I'd only need the pythondev account to access the svn repository, right? (And actually, it might be possible to set up group permissions and membership so that I could access the repo with either). The number of people who need shell access should be pretty small. I'm also a little confused about the pep. What does "admin access to the pythondev account" mean? Do you mean the people who are going to be managing users that can access svn? In that case, I think the system admins (i.e. those who already have shell access to dinsdale) would be the people managing user access to svn. > I would suggest that the default key is used for regular commits, > and a separate key is created for shell access. I described this > a bit in the PEP, essentially, in .ssh/config, I have > > Host pythondev > Hostname dinsdale.python.org > User pythondev > IdentityFile ~/.ssh/pythondev > > So when I do "ssh pythondev", I get the shell account; when I do > "svn co svn+ssh://pythondev at svn.python.org/python/trunk/Modules", > I use my default identity, which gets tunneled as "Martin v. Loewis". I'm confused again; are you saying that we should have a host named pythondev.python.org? I'm not sure that's necessary. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050821/580c0311/attachment-0001.pgp From aahz at pythoncraft.com Mon Aug 22 01:55:23 2005 From: aahz at pythoncraft.com (Aahz) Date: Sun, 21 Aug 2005 16:55:23 -0700 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <1124665281.31664.28.camel@geddy.wooz.org> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> Message-ID: <20050821235522.GA16606@panix.com> On Sun, Aug 21, 2005, Barry Warsaw wrote: > On Sun, 2005-08-21 at 09:12, "Martin v. L?wis" wrote: >> >> I would suggest that the default key is used for regular commits, >> and a separate key is created for shell access. I described this >> a bit in the PEP, essentially, in .ssh/config, I have >> >> Host pythondev >> Hostname dinsdale.python.org >> User pythondev >> IdentityFile ~/.ssh/pythondev >> >> So when I do "ssh pythondev", I get the shell account; when I do >> "svn co svn+ssh://pythondev at svn.python.org/python/trunk/Modules", >> I use my default identity, which gets tunneled as "Martin v. Loewis". > > I'm confused again; are you saying that we should have a host named > pythondev.python.org? I'm not sure that's necessary. No, pythondev is simply an SSH alias for dinsdale -- the server knows nothing about it. I don't quite understand the "User pythondev" line, though -- I think that's a mistake. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. 
From martin at v.loewis.de Mon Aug 22 08:18:31 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Aug 2005 08:18:31 +0200 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <1124665281.31664.28.camel@geddy.wooz.org> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> Message-ID: <43096E37.8070708@v.loewis.de> Barry Warsaw wrote: > I may be totally misunderstanding, but to get shell access wouldn't I > avoid using the pythondev account and just use my own account? You could do that (or use the root account); I can't: I don't have a ssh account on dinsdale. And even if I had, I couldn't write to pythondev's authorized_keys2. > I'm also a little confused about the pep. What does "admin access to > the pythondev account" mean? Do you mean the people who are going to be > managing users that can access svn? Correct. > In that case, I think the system > admins (i.e. those who already have shell access to dinsdale) would be > the people managing user access to svn. Ok: to whom should I forward the ssh keys then which I'm currently collecting? >>Host pythondev >> Hostname dinsdale.python.org >> User pythondev >> IdentityFile ~/.ssh/pythondev >> >>So when I do "ssh pythondev", I get the shell account; when I do >>"svn co svn+ssh://pythondev at svn.python.org/python/trunk/Modules", >>I use my default identity, which gets tunneled as "Martin v. Loewis". > > > I'm confused again; are you saying that we should have a host named > pythondev.python.org? I'm not sure that's necessary. Not at all. This is rather an OpenSSH convenience mechanism to avoid typing hostname and user name all the time. I introduce a local alias pythondev, which means I want to access pythondev at dinsdale.python.org, using the key pythondev.pub.
Regards, Martin From martin at v.loewis.de Mon Aug 22 08:31:30 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Aug 2005 08:31:30 +0200 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <20050821235522.GA16606@panix.com> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <20050821235522.GA16606@panix.com> Message-ID: <43097142.9050905@v.loewis.de> Aahz wrote: >>>Host pythondev >>> Hostname dinsdale.python.org >>> User pythondev >>> IdentityFile ~/.ssh/pythondev >>> >>I'm confused again; are you saying that we should have a host named >>pythondev.python.org? I'm not sure that's necessary. > > > No, pythondev is simply an SSH alias for dinsdale -- the server knows > nothing about it. I don't quite understand the "User pythondev" line, > though -- I think that's a mistake. That's intentional. "ssh pythondev" now becomes equivalent to ssh -l pythondev -i ~/.ssh/pythondev dinsdale.python.org IOW, the User option is equivalent to specifying the -l option. Regards, Martin From paolo_veronelli at libero.it Mon Aug 22 10:12:46 2005 From: paolo_veronelli at libero.it (Paolino) Date: Mon, 22 Aug 2005 10:12:46 +0200 Subject: [Python-Dev] On decorators implementation In-Reply-To: <43084AE9.20900@libero.it> References: <43084AE9.20900@libero.it> Message-ID: <430988FE.2020603@libero.it> Paolino wrote: > I noticed (via using them) that decorations are applied to methods > before they become methods. > > This choice flattens down the implementation to no differentiating > methods from functions. > > > > 1) > I have to apply euristics on the wrapped function type when I use the > function as an index key. > > if type(observed) is types.MethodType: > observed=observed.im_func > > things like this are inside my decorators. > > 2) > The behavior of decorations are not definable. > I imagine that a method implementation of them inside the type metaclass > could be better specified by people. 
> This probably ends up in metamethods or something I can't grasp > A downside of decorating at function level is that it's virtually impossible to check from the decorator that the first call parameter (aka self) is an instance of the method's class. This check must be done inside the decorated function. This can really happen in normal use, as decorators are useful for registering the decorated function as a 'callback'. Whoever fires it can do so with no regard for the class the function/method belongs to, and the error raised will not be consistent with 'calling method on an incompatible instance'. Maybe it's possible to let the decorator know the method's class even if the class is still undefined. (Just like recursive functions?) This would also allow decorators to call super with the right class. A @callSuper decoration is something I really miss. Thanks Paolino From stephen at xemacs.org Mon Aug 22 09:39:03 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 22 Aug 2005 16:39:03 +0900 Subject: [Python-Dev] Collecting SSH keys In-Reply-To: <4308A9A1.1040700@v.loewis.de> (Martin v. Löwis's message of "Sun, 21 Aug 2005 18:19:45 +0200") References: <43088761.7010905@v.loewis.de> <43089D87.2060302@acm.org> <4308A9A1.1040700@v.loewis.de> Message-ID: <87y86uqv3c.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Martin" == Martin v Löwis writes: Martin> I don't know how this scales in OpenSSH having an Martin> authorized_keys file with hundred or more keys. On cvs.xemacs.org (aka SunSITE.dk) ssh+cvs access with cvs access control being handled by a Perl script scales to approximately 85 users. I don't handle key management directly, but I believe several users use multiple keys (I don't personally). I've never heard any complaints from the guys who actually do key management; they just keep authorized_keys in alphabetical order by comment (= user's real name). Nor do I notice any authorization overhead vs.
a simple ssh login when accessing the cvs server.[1] Evidently the "what keys do you have?" negotiation with the agent takes very little time (in terms of what a human can notice). If you want time(1) timings or something like that, I'd be happy to get an exact count of the number of keys and do them (but it will have to wait until I get back from travel August 28). Footnotes: [1] For testing whether keys are properly installed, the sequence "ssh xemacs at cvs.xemacs.org", then asking the server for "version" and sending EOF (^D), is what we use. So there is no overhead from a local CVS or anything like that, although of course you do have to start the remote cvs server process (via the COMMAND= in the .ssh/config file). How that compares to starting a shell I'm not sure. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From raymond.hettinger at verizon.net Mon Aug 22 14:46:27 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon, 22 Aug 2005 08:46:27 -0400 Subject: [Python-Dev] [Python-checkins] python/dist/src setup.py, 1.219, 1.220 In-Reply-To: <20050821184639.EF8711E4006@bag.python.org> Message-ID: <003101c5a717$83be4b60$3c23a044@oemcomputer> > A new hashlib module to replace the md5 and sha modules. It adds > support for additional secure hashes such as SHA-256 and SHA-512. The > hashlib module uses OpenSSL for fast platform optimized > implementations of algorithms when available. The old md5 and sha > modules still exist as wrappers around hashlib to preserve backwards > compatibility. 
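For anyone who has not tried the new module yet, a minimal sketch of the hashlib interface described in that checkin message follows. The input bytes are arbitrary, and the example assumes a Python where hashlib is available (it is written for current Python, so the message is a bytes literal):

```python
import hashlib

message = b"python-dev"

# The additional secure hashes have named constructors.
sha256_hex = hashlib.sha256(message).hexdigest()
sha512_hex = hashlib.sha512(message).hexdigest()

# hashlib.new() selects an algorithm by its string name instead.
sha512_via_new = hashlib.new("sha512", message).hexdigest()

# Hash objects can also be fed incrementally with update().
h = hashlib.sha256()
h.update(b"python-")
h.update(b"dev")
incremental_hex = h.hexdigest()

print(sha256_hex)
print(sha512_hex)
```

Feeding the data in pieces with update() gives the same digest as hashing it in one call, which is what makes the interface usable for large files.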
I'm getting compilation errors: C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad suffix on number' C:\py25\Modules\sha512module.c(146) : error C2146: syntax error : missing ')' before identifier 'L' C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad suffix on number' C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad suffix on number' C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad suffix on number' C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad suffix on number' C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad suffix on number' C:\py25\Modules\sha512module.c(146) : fatal error C1013: compiler limit : too many open parentheses Also, there should be updating entries to Misc/NEWS, PC/VC6/pythoncore.dsp, and PC/config.c. Raymond From martin at v.loewis.de Mon Aug 22 16:11:31 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Aug 2005 16:11:31 +0200 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <1124711379.31664.213.camel@geddy.wooz.org> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <1124711379.31664.213.camel@geddy.wooz.org> Message-ID: <4309DD13.4040902@v.loewis.de> Barry Warsaw wrote: >>You could do that (or use the root account); I can't: I don't have >>a ssh account on dinsdale. An even if I had, I couldn't write to >>pythondev's authorized_keys2. > > > That's easily rectified! :) We should give you an account and sudo > access. Should I just use your keys from creosote? Please do! >>Ok: to whom should I forward the ssh keys then which I'm currently >>collecting? > > > Probably here, unless once you have the above, you still want to do it > yourself. 
I would be worried that you are a single point of failure here: for sf.net/projects/python, multiple people can add new users, and I think we should continue that tradition. I would be happy with *different* people being able to manage that, but the group should be larger than two, IMO. Regards, Martin From martin at v.loewis.de Mon Aug 22 16:20:37 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Aug 2005 16:20:37 +0200 Subject: [Python-Dev] Collecting SSH keys In-Reply-To: <87y86uqv3c.fsf@tleepslib.sk.tsukuba.ac.jp> References: <43088761.7010905@v.loewis.de> <43089D87.2060302@acm.org> <4308A9A1.1040700@v.loewis.de> <87y86uqv3c.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <4309DF35.5000902@v.loewis.de> Stephen J. Turnbull wrote: > On cvs.xemacs.org (aka SunSITE.dk) ssh+cvs access with cvs access > control being handled by a Perl script scales to approximately 85 > users. I don't handle key management directly, but I believe several > users use multiple keys (I don't personally). I've never heard any > complaints from the guys who actually do key management; they just > keep authorized_keys in alphabetical order by comment (= user's real > name). Nor do I notice any authorization overhead vs. a simple ssh > login when accessing the cvs server.[1] Evidently the "what keys do > you have?" negotiation with the agent takes very little time (in > terms of what a human can notice). That's encouraging; I'm willing to proceed with that approach then. As for key management: I just designed an infrastructure where ~pythondev/keys is a directory containing files named, say "Martin v. Loewis" (with spaces, ASCII only); the contents of the files are just the public keys. I run then make_authorized_keys, which regenerates the authorized_keys2 file, adding all the command= lines. This avoids editing authorized_keys2 in a text editor. 
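A minimal sketch of what such a generator could look like is below. This is not the actual make_authorized_keys script; the svnserve path, repository root, and restriction options are assumed values used for illustration, and the temporary directory and key text exist only for the demonstration:

```python
import shlex
import tempfile
from pathlib import Path

# Assumed values for illustration; the real script may differ.
SVNSERVE = "/usr/bin/svnserve"
REPO_ROOT = "/data/repos/projects"
RESTRICTIONS = "no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty"


def authorized_keys_lines(keys_dir):
    """Yield one authorized_keys line per public key found in keys_dir.

    Each file is named after a committer; each non-empty line in it is
    one public key belonging to that committer.
    """
    for key_file in sorted(Path(keys_dir).iterdir()):
        if not key_file.is_file():
            continue
        user = key_file.name
        forced = f"{SVNSERVE} --root={REPO_ROOT} -t --tunnel-user {shlex.quote(user)}"
        if '"' in forced:
            # A double quote would end the command="..." option early.
            raise ValueError(f"unsupported character in name: {user!r}")
        for key in key_file.read_text().splitlines():
            key = key.strip()
            if key and not key.startswith("#"):
                yield f'command="{forced}",{RESTRICTIONS} {key}'


def make_authorized_keys(keys_dir, out_path):
    """Regenerate out_path from scratch and return the text written."""
    text = "".join(line + "\n" for line in authorized_keys_lines(keys_dir))
    Path(out_path).write_text(text)
    return text


with tempfile.TemporaryDirectory() as tmp:
    keys = Path(tmp, "keys")
    keys.mkdir()
    # Placeholder key material, not a real key.
    (keys / "Martin v. Loewis").write_text("ssh-rsa AAAAEXAMPLE martin@example\n")
    generated = make_authorized_keys(keys, Path(tmp, "authorized_keys2"))
    print(generated, end="")
```

Because the output file is always rebuilt from the directory, adding or removing a committer is a matter of adding or deleting one file and rerunning the script.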
Regards, Martin From skip at pobox.com Mon Aug 22 17:18:33 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 22 Aug 2005 10:18:33 -0500 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <43096E37.8070708@v.loewis.de> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> Message-ID: <17161.60617.950268.641009@montanaro.dyndns.org> Martin, I'm completely confused about what, if anything, I need to send to you. I can already access the python.org website repository via svn. Will I automatically get access to the new Python source repository or do I need to send you pub key(s)? Are dinsdale.python.org and svn.python.org the same machine with different IP addresses? If they are different machines, why would we want to host svn repositories on multiple machines? Skip From aahz at pythoncraft.com Mon Aug 22 17:25:33 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon, 22 Aug 2005 08:25:33 -0700 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <43097142.9050905@v.loewis.de> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <20050821235522.GA16606@panix.com> <43097142.9050905@v.loewis.de> Message-ID: <20050822152533.GB12281@panix.com> On Mon, Aug 22, 2005, "Martin v. Löwis" wrote: > Aahz wrote: >>Barry: >>>Martin: >>>> >>>>Host pythondev >>>> Hostname dinsdale.python.org >>>> User pythondev >>>> IdentityFile ~/.ssh/pythondev >>>> >>>I'm confused again; are you saying that we should have a host named >>>pythondev.python.org? I'm not sure that's necessary. >> >> No, pythondev is simply an SSH alias for dinsdale -- the server knows >> nothing about it. I don't quite understand the "User pythondev" line, >> though -- I think that's a mistake. > > That's intentional. "ssh pythondev" now becomes equivalent to > > ssh -l pythondev -i ~/.ssh/pythondev dinsdale.python.org > > IOW, the User option is equivalent to specifying the -l option.
Yes, I know -- but it looks like a mistake to me. Are you saying that all shell access will be done through a single account? Isn't that a huge security risk? My understanding was that it was SVN access that would be going through a single account, not shell access. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. From barry at python.org Mon Aug 22 17:32:10 2005 From: barry at python.org (Barry Warsaw) Date: Mon, 22 Aug 2005 11:32:10 -0400 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <17161.60617.950268.641009@montanaro.dyndns.org> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> Message-ID: <1124724730.17082.8.camel@geddy.wooz.org> On Mon, 2005-08-22 at 11:18, skip at pobox.com wrote: > I'm completely confused about what, if anything, I need to send to you. I > can already access the python.org website repository via svn. Will I > automatically get access to the new Python source repository or do I need to > send you pub key(s)? I think technically, the answer to that is "yes", you will automatically get access to the source repo. The question I have is whether you /should/ access the source repo that way, or use the shared pythondev account. Two unknowns for me are 1) will there be permission problems that either prevent you from doing this, or once you've committed a change, will screw pythondev-access?; 2) when we finally get email notifications worked in, will it still look like your commit is coming from the right place. I think the answer to #2 is yes, but I'm not sure about #1. > Are dinsdale.python.org and svn.python.org the same > machine with different IP addresses? If they are different machines, why They are the same machine, with different IP addresses. 
Anonymous webdav will require two Apache processes, since different user/groups are needed and to support different certs for svn.python.org and (eventually) www.python.org. -Barry From skip at pobox.com Mon Aug 22 17:45:29 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 22 Aug 2005 10:45:29 -0500 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <1124724730.17082.8.camel@geddy.wooz.org> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> Message-ID: <17161.62233.743239.89277@montanaro.dyndns.org> >> Will I automatically get access to the new Python source repository >> or do I need to send you pub key(s)? Barry> I think technically, the answer to that is "yes", you will Barry> automatically get access to the source repo. Okay... Barry> The question I have is whether you /should/ access the source Barry> repo that way, or use the shared pythondev account. More confusion here. If I use some sort of shared access how will the system ascribe changes I make to me and not, for example, Martin? I think until this experiment is over and we have really and truly migrated to svn I will simply let other people fuss with things.
Skip From foom at fuhm.net Mon Aug 22 17:57:54 2005 From: foom at fuhm.net (James Y Knight) Date: Mon, 22 Aug 2005 11:57:54 -0400 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <1124724730.17082.8.camel@geddy.wooz.org> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> Message-ID: <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net> On Aug 22, 2005, at 11:32 AM, Barry Warsaw wrote: > They are the same machine, with different IP addresses. Anonymous > webdav will require two Apache processes, since different user/groups > are needed and to support different certs for svn.python.org and > (eventually) www.python.org. > It seems a waste to use SVN's webdav support just for anon access. The svnserve method works well for anon access. The only reason to use svn webdav IMO is if you want to use that for authenticated access. But since you're talking about using svn+ssh for that.. James From martin at v.loewis.de Mon Aug 22 18:07:50 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Aug 2005 18:07:50 +0200 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <17161.60617.950268.641009@montanaro.dyndns.org> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> Message-ID: <4309F856.40506@v.loewis.de> skip at pobox.com wrote: > I'm completely confused about what, if anything, I need to send to you. I > can already access the python.org website repository via svn. Yes, but you do so using username/password, right? pythondev will be using svn+ssh. > Will I > automatically get access to the new Python source repository or do I need to > send you pub key(s)? You need to send me pubkeys. Actually, I just copied the ones from creosote (see below). 
You should now be able to checkout svn+ssh://pythondev at svn.python.org/python/trunk > Are dinsdale.python.org and svn.python.org the same > machine with different IP addresses? Correct. > If they are different machines, why > would we want to host svn repositories on multiple machines? We don't. However, we use different access methods. Actually, we *might* use different access methods. If this turns out to be too confusing to users, we are probably back to username/password. Regards, Martin P.S. The keys I installed are ssh-dss AAAAB3NzaC1kc3MAAACBAJAPN3ngdjih7H1wqkmbkaJDpfoW3fRrk9phtuuO+js43qU06BiqInbGZ/zjVZRrM7yzRbo2PGu1+ox8H/vkMlSk6IxmgMtNrrQ9SEoTRo7eyg5ku+JiC44h3RWT2IuiIALB8axHQSBsF6Oe4O9z/lgsLMO08M2l1TzRnjSjyOEZAAAAFQDGffqFFm+IoSH6cRfxnY+BiXxZ5QAAAIATuQmlscDd/QNSlk4Oy7ZMUdHplx76zQtyUHXvhRVkIu6QrduhnnCkGIFjSHQsnJOoroF4tVaJYY7oka17Ambd0LiWcSlNK+IHMdbvZ91wbVpeo9x/HBCJtCMxDX8PxG3TADuqiZjeC8nOpCdJ+cK7emQv+G4WIw3gC3IuPRINWAAAAIA5+OO9ApbKrcClwHXZ9DqtDJBe2fSox1mnei3VAajbOU/o3+j+G+5iLerOqLTCoOyIs7umvuUulIAXvhDzCzusw3mfBtt3UODQn0L3R47OFHzOiCEbihStxd36lVgCJgRBAW7UKf+2k3BzxJ5DVpp4+AZ7fS4FUVkZ8DYAog/68g== skip at montanaro.dyndns.org ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAq83rRGWRR4SdvvBUMJ/gDmMG7U7LdiC50kqUTbw+Kogum5JT7kexi1XYKgyKJ8FbRwMx1Xj9zjQERgDhYtFCJg72kSkD2muN3DkyU7vIoZQM/aNpspPNNDWRqj8pzHPzhWDUfL+tjZl78JD51mTOlGHaZUGdKnPeUOQF2XTadis= skip at montanaro.dyndns.org From martin at v.loewis.de Mon Aug 22 18:10:37 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Aug 2005 18:10:37 +0200 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <20050822152533.GB12281@panix.com> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <20050821235522.GA16606@panix.com> <43097142.9050905@v.loewis.de> <20050822152533.GB12281@panix.com> Message-ID: <4309F8FD.6080505@v.loewis.de> Aahz wrote: > Yes, I know -- but it looks like a mistake to me. 
Are you saying that > all shell access will be done through a single account? Isn't that a > huge security risk? My understanding was that it was SVN access that > would be going through a single account, not shell access. Only few selected people would have shell access; I don't see that as a huge risk. Anyway, Barry didn't like it either, so we removed shell access to the pythondev account; user keys now need to be added by the pydotorg admins. Regards, Martin From martin at v.loewis.de Mon Aug 22 18:16:24 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Aug 2005 18:16:24 +0200 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <1124724730.17082.8.camel@geddy.wooz.org> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> Message-ID: <4309FA58.4080103@v.loewis.de> Barry Warsaw wrote: > I think technically, the answer to that is "yes", you will automatically > get access to the source repo. At the moment, the answer actually is "no". For the projects repository, there is no group write permission - you must be pythondev in order to write. > The question I have is whether you > /should/ access the source repo that way, or use the shared pythondev > account. Two unknowns for me are 1) will there be permission problems > that either prevent you from doing this, or once you've committed a > change, will screw pythondev-access?; Yes to the former. The webserver has only read access to the (projects) repository. > 2) when we finally get email > notifications worked in, will it still look like your commit is coming > from the right place. Not sure what "the right place" would be: pythondev at python.org? I think the email could look any way we want it to look. > They are the same machine, with different IP addresses. 
Anonymous > webdav will require two Apache processes, since different user/groups > are needed Not necessarily. The repository could be world-readable, in which case "nobody" could access it. > and to support different certs for svn.python.org and > (eventually) www.python.org. Ah. I think anonymous read access should be on port 80. Regards, Martin From martin at v.loewis.de Mon Aug 22 18:20:42 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 22 Aug 2005 18:20:42 +0200 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <17161.62233.743239.89277@montanaro.dyndns.org> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> <17161.62233.743239.89277@montanaro.dyndns.org> Message-ID: <4309FB5A.1040201@v.loewis.de> skip at pobox.com wrote: > More confusion here. If I use some sort of shared access how will the > system ascribe changes I make to me and not, for example, Martin? In pythondev's authorized_keys2, we have a line command="/usr/bin/svnserve --root=/data/repos/projects -t --tunnel-user 'Skip Montanaro'",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-dss So the *only* command you are allowed to invoke is svnserve (actually, sshd will invoke that no matter what the ssh client requests). This will tell subversion that changes should be logged as 'Skip Montanaro'. > I think until this experiment is over and we have really and truly migrated > to svn I will simply let other people fuss with things. Well, you are not required to understand it, but you should try to use it. Just check out svn+ssh://pythondev at svn.python.org/python/trunk/Misc, and see whether this works.
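To see how the committer name travels with the key rather than with the login account, here is a small Python sketch that pulls the forced command and the --tunnel-user value back out of such an entry. The helper names and the key material are made up for the example, and the parsing assumes command= is the first option and contains no escaped double quotes:

```python
import shlex

# Example entry in the same shape as the line above; the key is a placeholder.
ENTRY = (
    'command="/usr/bin/svnserve --root=/data/repos/projects -t '
    "--tunnel-user 'Skip Montanaro'\","
    "no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty "
    "ssh-dss AAAAEXAMPLEKEY skip@example"
)


def forced_command(entry):
    """Return the text inside command="..." or None if it is absent."""
    prefix = 'command="'
    if not entry.startswith(prefix):
        return None
    end = entry.index('"', len(prefix))
    return entry[len(prefix):end]


def tunnel_user(entry):
    """Return the name svnserve would record as the commit author."""
    command = forced_command(entry)
    if command is None:
        return None
    argv = shlex.split(command)
    for i, arg in enumerate(argv):
        if arg == "--tunnel-user" and i + 1 < len(argv):
            return argv[i + 1]
        if arg.startswith("--tunnel-user="):
            return arg.split("=", 1)[1]
    return None


author = tunnel_user(ENTRY)
print(author)
```

Every key in the shared file carries its own name this way, so two committers logging in as the same Unix user are still recorded separately.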
Regards, Martin From martin at v.loewis.de Mon Aug 22 18:23:01 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 22 Aug 2005 18:23:01 +0200 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net> Message-ID: <4309FBE5.40204@v.loewis.de> James Y Knight wrote: > It seems a waste to use SVN's webdav support just for anon access. > The svnserve method works well for anon access. The only reason to > use svn webdav IMO is if you want to use that for authenticated > access. But since you're talking about using svn+ssh for that.. It has the advantage that we can easily point people to files with a web browser; they don't need an svn client. Regards, Martin From barry at python.org Mon Aug 22 18:41:11 2005 From: barry at python.org (Barry Warsaw) Date: Mon, 22 Aug 2005 12:41:11 -0400 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <4309FA58.4080103@v.loewis.de> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> <4309FA58.4080103@v.loewis.de> Message-ID: <1124728871.17084.13.camel@geddy.wooz.org> On Mon, 2005-08-22 at 12:16, "Martin v. Löwis" wrote: > Barry Warsaw wrote: > > I think technically, the answer to that is "yes", you will automatically > > get access to the source repo. > > At the moment, the answer actually is "no". For the projects repository, > there is no group write permission - you must be pythondev in order to > write. Good! I think that's a feature. :) I have a vague discomfort with allowing both types of access. I.e.
I'd rather all source committers use the same mechanism. > > 2) when we finally get email > > notifications worked in, will it still look like your commit is coming > > from the right place. > > Not sure what "the right place" would be: pythondev at python.org? > I think the email could look any way we want it to look. I think it should be an @python.org address whose local part follows the firstname.lastname (with some exceptions) scheme that we've agreed on. I actually /don't/ want all commits to look like they're coming from pythondev at python.org. > > and to support different certs for svn.python.org and > > (eventually) www.python.org. > > Ah. I think anonymous read access should be on port 80. Maybe we want to put websvn (or whatever it's called these days) on port 80 of svn.python.org? -Barry From pje at telecommunity.com Mon Aug 22 18:42:57 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 22 Aug 2005 12:42:57 -0400 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <4309FBE5.40204@v.loewis.de> References: <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net> <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net> Message-ID: <5.1.1.6.0.20050822124128.01b0bd10@mail.telecommunity.com> At 06:23 PM 8/22/2005 +0200, Martin v. Löwis wrote: >James Y Knight wrote: > > It seems a waste to use SVN's webdav support just for anon access. > > The svnserve method works well for anon access. The only reason to > > use svn webdav IMO is if you want to use that for authenticated > > access.
But since you're talking about using svn+ssh for that.. > >It has the advantage that we can easily point people to files >with a web browser; they don't need an svn client. You can do that with viewcvs, too. Viewcvs can also create tarballs for easy downloading, and has a lot of browsing and viewing options that the SVN webdav mode doesn't. From skip at pobox.com Mon Aug 22 18:43:06 2005 From: skip at pobox.com (skip@pobox.com) Date: Mon, 22 Aug 2005 11:43:06 -0500 Subject: [Python-Dev] Admin access using svn+ssh In-Reply-To: <4309FB5A.1040201@v.loewis.de> References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> <17161.62233.743239.89277@montanaro.dyndns.org> <4309FB5A.1040201@v.loewis.de> Message-ID: <17162.155.1546.1991@montanaro.dyndns.org> >> I think until this experiment is over and we have really and truly >> migrated to svn I will simply let other people fuss with things. Martin> Well, you are not required to understand it, but you should try Martin> to use it. Good point. Martin> Just check out Martin> svn+ssh://pythondev at svn.python.org/python/trunk/Misc, and see Martin> whether this works. It worked. I made a trivial change to Misc/NEWS and checked it in. I then ran "svn blame NEWS" to see what it showed. This took approximately forever. Can I assume this is one thing svn is always going to be pretty slow at? I use cvs annotate frequently. Is there a faster alternative in svn to identify who did what? I notice that you use my real name (including spaces). 
I doubt we have any code that munches on annotated listings, but it seems that for the sake of script writers' sanity it would be better to elide spaces or replace them with underscores so the annotated user is a single "word": 40555 Skip Montanaro ++++++++++++ 28675 montanaro Python News 40555 Skip Montanaro ++++++++++++ 28675 montanaro 37655 anthonybaxter (editors: check NEWS.help for information about editing NEWS using ReST.) 37654 montanaro 37838 rhettinger What's New in Python 2.5 alpha 1? 37838 rhettinger ================================= 37838 rhettinger 38611 anthonybaxter *Release date: XX-XXX-2006* 38611 anthonybaxter 37838 rhettinger Core and builtins 37838 rhettinger ----------------- 37838 rhettinger ... That way column 2 would always be the contributor. Skip From gvanrossum at gmail.com Mon Aug 22 19:43:24 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon, 22 Aug 2005 10:43:24 -0700 Subject: [Python-Dev] On decorators implementation In-Reply-To: <430988FE.2020603@libero.it> References: <43084AE9.20900@libero.it> <430988FE.2020603@libero.it> Message-ID: > Maybe it's possible to let the decorator know the method class even if > the class is still undefined.(Just like recursive functions?) > This would allow decorators to call super with the right class also. > @callSuper decoration is something I really miss. You're thinking about it all wrong. Remember that decorators can also be used to declare that something is a static method or class method etc. Try to learn Python, not to write some other language using Python syntax. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From kbk at shore.net Mon Aug 22 20:37:54 2005 From: kbk at shore.net (Kurt B. 
Kaiser) Date: Mon, 22 Aug 2005 14:37:54 -0400 (EDT) Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200508221837.j7MIbsHG031701@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 352 open ( +0) / 2898 closed ( +2) / 3250 total ( +2) Bugs : 926 open (+13) / 5177 closed (+15) / 6103 total (+28) RFE : 190 open ( -1) / 179 closed ( +1) / 369 total ( +0) New / Reopened Patches ______________________ fix smtplib when local host isn't resolvable in dns (2005-08-12) http://python.org/sf/1257988 opened by Arkadiusz Miskiewicz tarfile: fix for bug #1257255 (2005-08-17) http://python.org/sf/1262036 opened by Lars Gustäbel Patches Closed ______________ sha256 module (2004-04-14) http://python.org/sf/935454 closed by greg sha and md5 modules should use OpenSSL when possible (2005-02-12) http://python.org/sf/1121611 closed by greg New / Reopened Bugs ___________________ Significant memory leak with PyImport_ReloadModule (2005-08-11) http://python.org/sf/1256669 opened by Ben Held slice object uses -1 as exclusive end-bound (2005-08-11) http://python.org/sf/1256786 opened by Bryan G. Olson tarfile local name is local, should be abspath (2005-08-12) http://python.org/sf/1257255 opened by Martin Blais Encodings iso8859_1 and latin_1 are redundant (2005-08-12) http://python.org/sf/1257525 opened by liturgist Solaris 8 declares gethostname(). (2005-08-12) http://python.org/sf/1257687 opened by Hans Deragon error message incorrectly claims Visual C++ is required (2005-08-12) http://python.org/sf/1257728 opened by Zooko O'Whielacronx Make set.remove() behave more like Set.remove() (2005-08-12) CLOSED http://python.org/sf/1257731 opened by Raymond Hettinger tkapp read-only attributes (2005-08-12) http://python.org/sf/1257772 opened by peeb gen_send_ex: Assertion `f->f_back !
(2005-08-12) CLOSED http://python.org/sf/1257960 opened by Neil Schemenauer http auth documentation/implementation conflict (2005-08-13) http://python.org/sf/1258485 opened by Matthias Klose "it's" vs. "its" typo in Language Reference (2005-08-14) CLOSED http://python.org/sf/1258922 opened by Wolfgang Petzold Makefile ignores $CPPFLAGS (2005-08-14) http://python.org/sf/1258986 opened by Dirk Pirschel Tix CheckList 'radio' option cannot be changed (2005-08-14) http://python.org/sf/1259434 opened by Raymond Maple subprocess: more general (non-buffering) communication (2005-08-15) http://python.org/sf/1260171 opened by Ian Bicking __new__ is class method (2005-08-16) http://python.org/sf/1261229 opened by Mike Orr import dynamic library bug? (2005-08-16) http://python.org/sf/1261390 opened by broadwin Tutorial doesn't cover * and ** function calls (2005-08-16) http://python.org/sf/1261659 opened by Brett Cannon precompiled code and nameError. (2005-08-17) http://python.org/sf/1261714 opened by Vladimir Menshakov minidom.py alternate newl support is broken (2005-08-17) http://python.org/sf/1262320 opened by John Whitley fcntl.ioctl have a bit problem. (2005-08-18) http://python.org/sf/1262856 opened by Raise L. 
Sail typo on "SimpleXMLRPCServer Objects" (2005-08-18) CLOSED http://python.org/sf/1263086 opened by Chad Whitacre type() and isinstance() do not call __getattribute__ (2005-08-19) http://python.org/sf/1263635 opened by Per Vognsen IDLE on Mac (2005-08-18) http://python.org/sf/1263656 opened by Bruce Sherwood PyArg_ParseTupleAndKeywords doesn't handle I format correctl (2005-08-19) CLOSED http://python.org/sf/1264168 opened by John Finlay PEP 8 uses wrong raise syntax (2005-08-20) CLOSED http://python.org/sf/1264666 opened by Steven Bethard sequence slicing documentation incomplete (2005-08-20) http://python.org/sf/1265100 opened by Steven Bethard lexists() is not exported from os.path (2005-08-22) CLOSED http://python.org/sf/1266283 opened by Martin Blais Mistakes in decimal.Context.subtract documentation (2005-08-22) http://python.org/sf/1266296 opened by Jim Sizelove Bugs Closed ___________ smtplib and email.py (2005-08-03) http://python.org/sf/1251528 closed by rhettinger float('-inf') (2005-08-09) http://python.org/sf/1255395 closed by tjreedy Make set.remove() behave more like Set.remove() (2005-08-12) http://python.org/sf/1257731 closed by rhettinger gen_send_ex: Assertion `f->f_back ! (2005-08-12) http://python.org/sf/1257960 closed by pje IOError after normal write (2005-08-04) http://python.org/sf/1252149 closed by tim_one IOError after normal write (2005-08-04) http://python.org/sf/1252149 deleted by patrick_gerken "it's" vs. 
"its" typo in Language Reference (2005-08-14) http://python.org/sf/1258922 closed by birkenfeld hotshot.stats.load (2004-02-19) http://python.org/sf/900092 closed by bwarsaw typo on "SimpleXMLRPCServer Objects" (2005-08-18) http://python.org/sf/1263086 closed by doerwalter PyArg_ParseTupleAndKeywords doesn't handle I format correctl (2005-08-19) http://python.org/sf/1264168 closed by birkenfeld PEP 8 uses wrong raise syntax (2005-08-20) http://python.org/sf/1264666 closed by goodger list(obj) can swallow KeyboardInterrupt (2005-07-21) http://python.org/sf/1242657 closed by rhettinger container methods raise KeyError not IndexError (2005-08-01) http://python.org/sf/1249837 closed by rhettinger zip incorrectly and incompletely documented (2005-02-12) http://python.org/sf/1121416 closed by rhettinger bz2 RuntimeError when decompressing file (2005-04-27) http://python.org/sf/1191043 closed by birkenfeld lexists() is not exported from os.path (2005-08-22) http://python.org/sf/1266283 closed by birkenfeld RFE Closed __________ md5 and sha1 modules should use openssl implementation (2004-06-30) http://python.org/sf/983069 closed by greg From nas at arctrix.com Mon Aug 22 23:31:42 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Mon, 22 Aug 2005 15:31:42 -0600 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings Message-ID: <20050822213142.GA5702@mems-exchange.org> [Please mail followups to python-dev at python.org.] The PEP has been rewritten based on a suggestion by Guido to change str() rather than adding a new built-in function. Based on my testing, I believe the idea is feasible. It would be helpful if people could test the patched Python with their own applications and report any incompatibilities. 
PEP: 349
Title: Allow str() to return unicode strings
Version: $Revision: 1.3 $
Last-Modified: $Date: 2005/08/22 21:12:08 $
Author: Neil Schemenauer
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 02-Aug-2005
Post-History: 06-Aug-2005
Python-Version: 2.5


Abstract

    This PEP proposes to change the str() built-in function so that it
    can return unicode strings. This change would make it easier to
    write code that works with either string type and would also make
    some existing code handle unicode strings. The C function
    PyObject_Str() would remain unchanged and the function
    PyString_New() would be added instead.


Rationale

    Python has had a Unicode string type for some time now but use of
    it is not yet widespread. There is a large amount of Python code
    that assumes that string data is represented as str instances.
    The long term plan for Python is to phase out the str type and use
    unicode for all string data. Clearly, a smooth migration path
    must be provided.

    We need to upgrade existing libraries, written for str instances,
    to be made capable of operating in an all-unicode string world.
    We can't change to an all-unicode world until all essential
    libraries are made capable for it. Upgrading the libraries in one
    shot does not seem feasible. A more realistic strategy is to
    individually make the libraries capable of operating on unicode
    strings while preserving their current all-str environment
    behaviour.

    First, we need to be able to write code that can accept unicode
    instances without attempting to coerce them to str instances. Let
    us label such code as Unicode-safe. Unicode-safe libraries can be
    used in an all-unicode world.

    Second, we need to be able to write code that, when provided only
    str instances, will not create unicode results. Let us label such
    code as str-stable. Libraries that are str-stable can be used by
    libraries and applications that are not yet Unicode-safe.

    Sometimes it is simple to write code that is both str-stable and
    Unicode-safe.
    For example, the following function just works:

        def appendx(s):
            return s + 'x'

    That's not too surprising since the unicode type is designed to
    make the task easier. The principle is that when str and unicode
    instances meet, the result is a unicode instance. One notable
    difficulty arises when code requires a string representation of an
    object; an operation traditionally accomplished by using the str()
    built-in function.

    Using the current str() function makes the code not Unicode-safe.
    Replacing a str() call with a unicode() call makes the code not
    str-stable. Changing str() so that it could return unicode
    instances would solve this problem. As a further benefit, some
    code that is currently not Unicode-safe because it uses str()
    would become Unicode-safe.


Specification

    A Python implementation of the str() built-in follows:

        def str(s):
            """Return a nice string representation of the object. The
            return value is a str or unicode instance.
            """
            if type(s) is str or type(s) is unicode:
                return s
            r = s.__str__()
            if not isinstance(r, (str, unicode)):
                raise TypeError('__str__ returned non-string')
            return r

    The following function would be added to the C API and would be
    the equivalent of the str() built-in (ideally it would be called
    PyObject_Str, but changing that function could cause a massive
    number of compatibility problems):

        PyObject *PyString_New(PyObject *);

    A reference implementation is available on Sourceforge [1] as a
    patch.


Backwards Compatibility

    Some code may require that str() returns a str instance. In the
    standard library, only one such case has been found so far. The
    function email.header_decode() requires a str instance and the
    email.Header.decode_header() function tries to ensure this by
    calling str() on its argument. The code was fixed by changing
    the line "header = str(header)" to:

        if isinstance(header, unicode):
            header = header.encode('ascii')

    Whether this is truly a bug is questionable since decode_header()
    really operates on byte strings, not character strings.
    Code that passes it a unicode instance could itself be considered
    buggy.


Alternative Solutions

    A new built-in function could be added instead of changing str().
    Doing so would introduce virtually no backwards compatibility
    problems. However, since the compatibility problems are expected
    to be rare, changing str() seems preferable to adding a new
    built-in.

    The basestring type could be changed to have the proposed
    behaviour, rather than changing str(). However, that would be
    confusing behaviour for an abstract base type.


References

    [1] http://www.python.org/sf/1266570


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

From martin at v.loewis.de Mon Aug 22 23:47:02 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Aug 2005 23:47:02 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <5.1.1.6.0.20050822124128.01b0bd10@mail.telecommunity.com>
References: <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net> <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net> <5.1.1.6.0.20050822124128.01b0bd10@mail.telecommunity.com>
Message-ID: <430A47D6.70704@v.loewis.de>

Phillip J. Eby wrote:
> You can do that with viewcvs, too. Viewcvs can also create tarballs for
> easy downloading, and has a lot of browsing and viewing options that the
> SVN webdav mode doesn't.

True. I had some issues with viewcvs, though: you cannot provide
access control easily, as you cannot force it to slash-separated mode;
it also couldn't fetch the history across renames. These may have been
fixed meanwhile, of course.

Regards,
Martin

From martin at v.loewis.de Mon Aug 22 23:57:11 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Aug 2005 23:57:11 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <17162.155.1546.1991@montanaro.dyndns.org>
References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> <17161.62233.743239.89277@montanaro.dyndns.org> <4309FB5A.1040201@v.loewis.de> <17162.155.1546.1991@montanaro.dyndns.org>
Message-ID: <430A4A37.1060808@v.loewis.de>

skip at pobox.com wrote:
> It worked. I made a trivial change to Misc/NEWS and checked it in.
I then
> ran "svn blame NEWS" to see what it showed. This took approximately
> forever. Can I assume this is one thing svn is always going to be pretty
> slow at?

Yes. Somebody commented that this is quadratic in svn with the number of
revisions, whereas it is linear in CVS. Please try it on some other
file; Misc/NEWS is probably the worst case in the Python repository.

I don't know whether there is any better way; we should perhaps ask
on the svn users list.

> I notice that you use my real name (including spaces). I doubt we have any
> code that munches on annotated listings, but it seems that for the sake of
> script writers' sanity it would be better to elide spaces or replace them
> with underscores so the annotated user is a single "word":

That would be easy to do. For consistency, should we use
firstname.lastname (with the usual exceptions 'aahz',
'guido.van.rossum', 'martin.v.loewis')?

As for parsing these things: they also show up in 'svn log'.

Regards,
Martin

From db3l at fitlinxx.com Tue Aug 23 03:06:35 2005
From: db3l at fitlinxx.com (David Bolen)
Date: 22 Aug 2005 21:06:35 -0400
Subject: [Python-Dev] Admin access using svn+ssh
References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> <17161.62233.743239.89277@montanaro.dyndns.org> <4309FB5A.1040201@v.loewis.de> <17162.155.1546.1991@montanaro.dyndns.org> <430A4A37.1060808@v.loewis.de>
Message-ID:

"Martin v. Löwis" writes:

> skip at pobox.com wrote:
> > It worked. I made a trivial change to Misc/NEWS and checked it in. I then
> > ran "svn blame NEWS" to see what it showed. This took approximately
> > forever. Can I assume this is one thing svn is always going to be pretty
> > slow at?
>
> Yes. Somebody commented that this is quadratic in svn with the number of
> revisions, whereas it is linear in CVS.
Please try it on some other > file; Misc/NEWS is probably the worst case in the Python repository. > > I don't know whether there is any better way; we should perhaps ask > on the svn users list. One improvement, if you're looking for a fairly recent change is to bound the blame command with a revision range (I find a date up to HEAD as easiest). You'll miss annotations on lines which were last touched prior to the selected range, but it can definitely speed things up. On a file like News, even if you're generous (say take the last year) it would probably be noticeably faster than letting svn go back to revision 1. -- David From greg.ewing at canterbury.ac.nz Tue Aug 23 05:48:27 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 23 Aug 2005 15:48:27 +1200 Subject: [Python-Dev] On decorators implementation In-Reply-To: <430988FE.2020603@libero.it> References: <43084AE9.20900@libero.it> <430988FE.2020603@libero.it> Message-ID: <430A9C8B.9010704@canterbury.ac.nz> Paolino wrote: > Maybe it's possible to let the decorator know the method class even if > the class is still undefined.(Just like recursive functions?) No, it's not possible. The situation is not the same. With recursive functions, both functions are defined before either of them is called. But decorators in a class body are executed before the surrounding class even exists. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From paragate at gmx.net Tue Aug 23 10:46:36 2005 From: paragate at gmx.net (Wolfgang Lipp) Date: Tue, 23 Aug 2005 10:46:36 +0200 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings In-Reply-To: <20050822213142.GA5702@mems-exchange.org> References: <20050822213142.GA5702@mems-exchange.org> Message-ID: neil, i just intended to worry that returning a unicode object from ``str()`` would break assumptions about the way that 'type definers' like ``str()``, ``int()``, ``float()`` and so on work, but i quickly realized that e.g. ``int()`` does return a long where appropriate! since the principle works there one may surmise it will also work for ``str()`` in the long run. one point i don't seem to understand right now is why it says in the function definition:: if type(s) is str or type(s) is unicode: ... instead of using ``isinstance()``. Testing for ``type()`` means that instances of derived classes (that may or may not change nothing or almost nothing to the underlying class) when passed to a function that uses ``str()`` will behave in a different way! isn't it more realistic and commonplace to assume that derivatives of a class do fulfill the requirements of the underlying class? -- which may turn out to be wrong! but still... the code as it stands means i have to remember that *in this special case only* (when deriving from ``unicode``), i have to add a ``__str__()`` method myself that simply returns ``self``. then of course, one could change ``unicode.__str__()`` to return ``self``, itself, which should work. but then, why so complicated? i suggest to change said line to:: if isinstance( s, ( str, unicode ) ): ... any objections? 
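The practical difference between the two checks can be seen in a small sketch. It is written for current Python 3, where the single `str` type stands in for the `str`/`unicode` pair of the 2.x discussion. `Shout`, `to_text_exact` and `to_text_isinstance` are hypothetical names made up for this illustration; they are not part of the PEP or its patch.

```python
class Shout(str):
    """Hypothetical str subclass that overrides __str__."""

    def __str__(self):
        # str.upper() returns a plain str, not a Shout instance.
        return self.upper()


def to_text_exact(s):
    """Shortcut only for exact str instances, as in the PEP's version."""
    if type(s) is str:
        return s
    r = s.__str__()
    if not isinstance(r, str):
        raise TypeError('__str__ returned non-string')
    return r


def to_text_isinstance(s):
    """Shortcut for str and all of its subclasses, as proposed above."""
    if isinstance(s, str):
        return s
    r = s.__str__()
    if not isinstance(r, str):
        raise TypeError('__str__ returned non-string')
    return r


exact = to_text_exact(Shout("hello"))
loose = to_text_isinstance(Shout("hello"))

# Report which type each variant handed back.
print(type(exact).__name__, type(loose).__name__)
```

With the exact-type test the subclass's own `__str__` is still consulted, so the result is whatever that method returns. With `isinstance()` the subclass instance is handed back untouched and an overriding `__str__` is never called.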
_wolf

--
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

From martin at v.loewis.de Tue Aug 23 12:03:06 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 Aug 2005 12:03:06 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <1124728871.17084.13.camel@geddy.wooz.org>
References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> <4309FA58.4080103@v.loewis.de> <1124728871.17084.13.camel@geddy.wooz.org>
Message-ID: <430AF45A.1090506@v.loewis.de>

Barry Warsaw wrote:
>>Not sure what "the right place" would be: pythondev at python.org?
>>I think the email could look any way we want it to look.
>
>
> I think it should be @python.org where is the
> firstname.lastname (with some exceptions) scheme that we've agreed on.
> I actually /don't/ want all commits to look like they're coming from
> pythondev at python.org

Ok, I have now changed all user names for the python repository
to firstname.lastname. That should allow them to be used in From:
fields of commit email.

Regards,
Martin

From theller at python.net Tue Aug 23 12:19:11 2005
From: theller at python.net (Thomas Heller)
Date: Tue, 23 Aug 2005 12:19:11 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings
References: <20050822213142.GA5702@mems-exchange.org>
Message-ID:

Neil Schemenauer writes:

> [Please mail followups to python-dev at python.org.]
>
> The PEP has been rewritten based on a suggestion by Guido to change
> str() rather than adding a new built-in function. Based on my
> testing, I believe the idea is feasible. It would be helpful if
> people could test the patched Python with their own applications and
> report any incompatibilities.
>

I like the fact that currently unicode(x) is guaranteed to return a
unicode instance, or raises a UnicodeDecodeError.
Same for str(x), which is guaranteed to return a (byte) string
instance or raise an error.

Wouldn't also a new function make the intent clearer?

So I think I'm +1 on the text() built-in, and -0 on changing str.

Thomas

From martin at v.loewis.de Tue Aug 23 12:38:05 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 Aug 2005 12:38:05 +0200
Subject: [Python-Dev] Subversion instructions
Message-ID: <430AFC8D.6020000@v.loewis.de>

As some people have been struggling with svn+ssh, I wrote a few
instructions at

http://www.python.org/dev/svn.html

The main issues people have been struggling with are:
- you really should use an agent, or else you have to type
  the private key passphrase three times on checkout
- on windows, putty works fine, but you really should use the
  agent (pageant), or else plink might not find your key.
  Also, if you use Putty profiles, make sure to add the
  user name (pythondev) into the profile
- we need SSH2 keys; SSH1 is disabled on svn.python.org.
  Some of you had been using SSH1 keys on sf.net all these
  years; you will need to generate SSH2 keys.

Regards,
Martin

From mal at egenix.com Tue Aug 23 12:39:03 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 23 Aug 2005 12:39:03 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings
In-Reply-To:
References: <20050822213142.GA5702@mems-exchange.org>
Message-ID: <430AFCC7.9030402@egenix.com>

Thomas Heller wrote:
> Neil Schemenauer writes:
>
>
>>[Please mail followups to python-dev at python.org.]
>>
>>The PEP has been rewritten based on a suggestion by Guido to change
>>str() rather than adding a new built-in function. Based on my
>>testing, I believe the idea is feasible. It would be helpful if
>>people could test the patched Python with their own applications and
>>report any incompatibilities.
>>
>
>
> I like the fact that currently unicode(x) is guaranteed to return a
> unicode instance, or raises a UnicodeDecodeError.
Same for str(x),
> which is guaranteed to return a (byte) string instance or raise an
> error.
>
> Wouldn't also a new function make the intent clearer?
>
> So I think I'm +1 on the text() built-in, and -0 on changing str.

Same here.

A new API would also help make the transition easier from the
current mixed data/text type (strings) to data-only (bytes)
and text-only (text, renamed from unicode) in Py3.0.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Aug 23 2005)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From p.f.moore at gmail.com Tue Aug 23 12:41:11 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 23 Aug 2005 11:41:11 +0100
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <4309FBE5.40204@v.loewis.de>
References: <43087DA0.702@v.loewis.de> <1124665281.31664.28.camel@geddy.wooz.org> <43096E37.8070708@v.loewis.de> <17161.60617.950268.641009@montanaro.dyndns.org> <1124724730.17082.8.camel@geddy.wooz.org> <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net> <4309FBE5.40204@v.loewis.de>
Message-ID: <79990c6b05082303415114abaf@mail.gmail.com>

On 8/22/05, "Martin v. Löwis" wrote:
> James Y Knight wrote:
> > It seems a waste to use SVN's webdav support just for anon access.
> > The svnserve method works well for anon access. The only reason to
> > use svn webdav IMO is if you want to use that for authenticated
> > access. But since you're talking about using svn+ssh for that..
>
> It has the advantage that we can easily point people to files
> with a web browser; they don't need an svn client.

It also allows anonymous svn checkouts for people behind firewalls
that only allow HTTP through.

Paul.
From paragate at gmx.net Tue Aug 23 14:59:28 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Tue, 23 Aug 2005 14:59:28 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings
In-Reply-To: <430AFCC7.9030402@egenix.com>
References: <20050822213142.GA5702@mems-exchange.org> <430AFCC7.9030402@egenix.com>
Message-ID:

just tested the proposed implementation on a unicode-naive module
basically using

    import sys
    import __builtin__
    reload( sys ); sys.setdefaultencoding( 'utf-8' )
    __builtin__.__dict__[ 'str' ] = new_str_function

et voilà, str() calls in the module are rewritten, and
print u'düsseldorf' does work as expected(*) (even on systems
where i have no access to sitecustomize, like at my python-friendly
isp's servers).

---
* my expectation is that unicode strings do print out as utf-8,
as i can't see any better solution.

i suggest to make this option available e.g. via a module in the
standard lib to ease transition for people in case the pep doesn't
make it. it may be applied where deemed necessary and left ignored
otherwise.

if nobody thinks the reload hack is too awful and this solution
stands testing, i guess i'll post it to the aspn cookbook. after
all these countless hours of hunting down ordinal not in range,
finally i'm starting to see some light in the issue.

_wolf

On Tue, 23 Aug 2005 12:39:03 +0200, M.-A. Lemburg wrote:

> Thomas Heller wrote:
>> Neil Schemenauer writes:
>>
>>
>>> [Please mail followups to python-dev at python.org.]
>>>
>>> The PEP has been rewritten based on a suggestion by Guido to change
>>> str() rather than adding a new built-in function. Based on my
>>> testing, I believe the idea is feasible. It would be helpful if
>>> people could test the patched Python with their own applications and
>>> report any incompatibilities.
>>>
>>
>>
>> I like the fact that currently unicode(x) is guaranteed to return a
>> unicode instance, or raises a UnicodeDecodeError.
Same for str(x),
>> which is guaranteed to return a (byte) string instance or raise an
>> error.
>>
>> Wouldn't also a new function make the intent clearer?
>>
>> So I think I'm +1 on the text() built-in, and -0 on changing str.
>
> Same here.
>
> A new API would also help make the transition easier from the
> current mixed data/text type (strings) to data-only (bytes)
> and text-only (text, renamed from unicode) in Py3.0.
>

--
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

From raymond.hettinger at verizon.net Tue Aug 23 16:11:56 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 23 Aug 2005 10:11:56 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <20050821184613.A45C11E4288@bag.python.org>
Message-ID: <000601c5a7ec$9f614680$8901a044@oemcomputer>

This patch should be reverted or fixed so that the Py2.5 build works
again.

It contains a disastrous search and replace error that prevents it from
compiling. Hence, it couldn't have passed the test suite before being
checked in.

Also, all of the project and config files need to be updated for the new
modules.
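For context, the check-in under discussion adds a hashlib module. A minimal sketch of how such a module is used follows, assuming the interface that ships with current Python 3 (`hashlib.sha256`, `hashlib.sha512`, `hashlib.new`); the code in the 2005 patch may differ in its details.

```python
import hashlib

data = b"abc"

# Named constructors for the individual algorithms.
sha256_hex = hashlib.sha256(data).hexdigest()
sha512_hex = hashlib.sha512(data).hexdigest()

# The same algorithm can also be requested by name.
by_name_hex = hashlib.new("sha512", data).hexdigest()

# Incremental hashing: feeding the data in pieces gives the same digest
# as hashing it in one call.
h = hashlib.sha256()
h.update(b"a")
h.update(b"bc")
incremental_hex = h.hexdigest()

print(sha256_hex)
print(by_name_hex == sha512_hex, incremental_hex == sha256_hex)
```

The hex digest of SHA-256 is 64 characters long and that of SHA-512 is 128, which is one quick way to tell the two apart in output.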
> -----Original Message----- > From: python-checkins-bounces at python.org [mailto:python-checkins- > bounces at python.org] On Behalf Of greg at users.sourceforge.net > Sent: Sunday, August 21, 2005 2:46 PM > To: python-checkins at python.org > Subject: [Python-checkins] python/dist/src/Modules _hashopenssl.c, > NONE,2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,2.1 md5module.c, > 2.35, 2.36 shamodule.c, 2.22, 2.23 > > Update of /cvsroot/python/python/dist/src/Modules > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv32064/Modules > > Modified Files: > md5module.c shamodule.c > Added Files: > _hashopenssl.c sha256module.c sha512module.c > Log Message: > [ sf.net patch # 1121611 ] > > A new hashlib module to replace the md5 and sha modules. It adds > support for additional secure hashes such as SHA-256 and SHA-512. The > hashlib module uses OpenSSL for fast platform optimized > implementations of algorithms when available. The old md5 and sha > modules still exist as wrappers around hashlib to preserve backwards > compatibility. From mwh at python.net Tue Aug 23 16:31:55 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 23 Aug 2005 15:31:55 +0100 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: <000601c5a7ec$9f614680$8901a044@oemcomputer> (Raymond Hettinger's message of "Tue, 23 Aug 2005 10:11:56 -0400") References: <000601c5a7ec$9f614680$8901a044@oemcomputer> Message-ID: <2mvf1wsp0k.fsf@starship.python.net> "Raymond Hettinger" writes: > This patch should be reverted or fixed so that the Py2.5 build works > again. > > It contains a disasterous search and replace error that prevents it from > compiling. Hence, it couldn't have passed the test suite before being > checked in. It works for me, on OS X. Passes the test suite, even. I presume you're on Windows of some kind? 
> Also, all of the project and config files need to be updated for the new > modules. Well, yes. But if Greg is on some unix-a-like, he can only update the unix build files (which he has done; it's in setup.py). Cheers, mwh -- If you are anal, and you love to be right all the time, C++ gives you a multitude of mostly untimportant details to fret about so you can feel good about yourself for getting them "right", while missing the big picture entirely -- from Twisted.Quotes From mwh at python.net Tue Aug 23 16:33:04 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 23 Aug 2005 15:33:04 +0100 Subject: [Python-Dev] PEP 342 Implementation In-Reply-To: <000001c5991d$e40bb140$12b62c81@oemcomputer> (Raymond Hettinger's message of "Thu, 04 Aug 2005 13:56:50 -0400") References: <000001c5991d$e40bb140$12b62c81@oemcomputer> Message-ID: <2mr7cksoyn.fsf@starship.python.net> "Raymond Hettinger" writes: > Could someone please make an independent check to verify an issue with > the 342 checkin. The test suite passes but when I run IDLE and open a > new window (using Control-N), it crashes and burns. > > The problem does not occur just before the checkin: > cvs up -D "2005-08-01 18:00" > But emerges immediately after: > cvs up -D "2005-08-01 21:00" Is this still happening? I'm not seeing any unusual flakiness, but then I can't run IDLE (OS X, no Tk). It's not exactly a minimal test case :) Cheers, mwh -- A difference which makes no difference is no difference at all. -- William James (I think. Reference anyone?) From raymond.hettinger at verizon.net Tue Aug 23 17:03:12 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 23 Aug 2005 11:03:12 -0400 Subject: [Python-Dev] PEP 342 Implementation In-Reply-To: <2mr7cksoyn.fsf@starship.python.net> Message-ID: <000001c5a7f3$c8e0e2c0$8901a044@oemcomputer> [Raymond Hettinger] > > > Could someone please make an independent check to verify an issue with > > the 342 checkin. 
The test suite passes but when I run IDLE and open a > > new window (using Control-N), it crashes and burns. > > > > The problem does not occur just before the checkin: > > cvs up -D "2005-08-01 18:00" > > But emerges immediately after: > > cvs up -D "2005-08-01 21:00" > > Is this still happening? I'm not seeing any unusual flakiness, but > then I can't run IDLE (OS X, no Tk). Yes, it is still happening. No one has yet offered an independent confirmation. > It's not exactly a minimal test case :) Right ;-) Once narrowed down, the problem and solution will likely be obvious. Raymond From fredrik at pythonware.com Tue Aug 23 16:58:53 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 23 Aug 2005 16:58:53 +0200 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings References: <20050822213142.GA5702@mems-exchange.org> Message-ID: Neil Schemenauer wrote: > The PEP has been rewritten based on a suggestion by Guido to change > str() rather than adding a new built-in function. Based on my testing, I > believe the idea is feasible. note that this breaks chapter 3 of the tutorial: http://docs.python.org/tut/node5.html#SECTION005130000000000000000 where str() is first introduced. From raymond.hettinger at verizon.net Tue Aug 23 17:16:11 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 23 Aug 2005 11:16:11 -0400 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: <2mvf1wsp0k.fsf@starship.python.net> Message-ID: <000101c5a7f5$989d5100$8901a044@oemcomputer> [Raymond Hettinger] > > This patch should be reverted or fixed so that the Py2.5 build works > > again. > > > > It contains a disasterous search and replace error that prevents it from > > compiling. Hence, it couldn't have passed the test suite before being > > checked in. 
[Michael Hudson] > It works for me, on OS X. Passes the test suite, even. I presume > you're on Windows of some kind? Here's an excerpt from the check-in note for sha512module.c: RND(S[0],S[1],S[2],S[3],S[4],S[5],S[6],S[7],0,0x428a2f98d728ae22ULL); RND(S[7],S[0],S[1],S[2],S[3],S[4],S[5],S[6],1,0x7137449123ef65cdULL); RND(S[6],S[7],S[0],S[1],S[2],S[3],S[4],S[5],2,0xb5c0fbcfec4d3b2fULL); RND(S[5],S[6],S[7],S[0],S[1],S[2],S[3],S[4],3,0xe9b5dba58189dbbcULL); RND(S[4],S[5],S[6],S[7],S[0],S[1],S[2],S[3],4,0x3956c25bf348b538ULL); Perhaps OS X has some sort of Steve Jobs special constant suffix "ULL" that Mr. Gates and the ANSI C folks have yet to accept ;-) If it works for you, then it probably means that sha512module.c was left out of the build. Maybe sha512module.c wasn't supposed to be checked in? > > Also, all of the project and config files need to be updated for the new > > modules. > > Well, yes. But if Greg is on some unix-a-like, he can only update the > unix build files (which he has done; it's in setup.py). The project files are just text files and can be updated simply and directly. But yes, that is no big deal and I'll just do it for him once the code gets to a compilable state. Aside from the project files, there is still config.c and whatnot. We should put together a checklist of all the things that need to be updated when a new module is added. Raymond From nas at arctrix.com Tue Aug 23 17:21:57 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 23 Aug 2005 09:21:57 -0600 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings In-Reply-To: References: <20050822213142.GA5702@mems-exchange.org> Message-ID: <20050823152156.GA7839@mems-exchange.org> On Tue, Aug 23, 2005 at 10:46:36AM +0200, Wolfgang Lipp wrote: > one point i don't seem to understand right now is why it says in the > function definition:: > > if type(s) is str or type(s) is unicode: > ... > > instead of using ``isinstance()``. 
I don't think isinstance() would be okay. That test is meant as an optimization to avoid calling __str__ on str and unicode instances. Subclasses should still have their __str__ method called otherwise they cannot override it. > the code as it stands means i have to remember that *in this special > case only* (when deriving from ``unicode``), i have to add a > ``__str__()`` method myself that simply returns ``self``. Ah, I see that unicode.__str__ returns a str instance. > then of course, one could change ``unicode.__str__()`` to return > ``self``, itself, which should work. but then, why so complicated? I think that may be the right fix. Neil From gmccaughan at synaptics-uk.com Tue Aug 23 17:32:29 2005 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Tue, 23 Aug 2005 16:32:29 +0100 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: <000101c5a7f5$989d5100$8901a044@oemcomputer> References: <000101c5a7f5$989d5100$8901a044@oemcomputer> Message-ID: <200508231632.30175.gmccaughan@synaptics-uk.com> > Here's an excerpt from the check-in note for sha512module.c: > > > RND(S[0],S[1],S[2],S[3],S[4],S[5],S[6],S[7],0,0x428a2f98d728ae22ULL); > RND(S[7],S[0],S[1],S[2],S[3],S[4],S[5],S[6],1,0x7137449123ef65cdULL); > RND(S[6],S[7],S[0],S[1],S[2],S[3],S[4],S[5],2,0xb5c0fbcfec4d3b2fULL); > RND(S[5],S[6],S[7],S[0],S[1],S[2],S[3],S[4],3,0xe9b5dba58189dbbcULL); > RND(S[4],S[5],S[6],S[7],S[0],S[1],S[2],S[3],4,0x3956c25bf348b538ULL); > > Perhaps OS X has some sort of Steve Jobs special constant suffix "ULL" > that Mr. Gates and the ANSI C folks have yet to accept ;-) It's valid C99, meaning "this is an unsigned long long". -- g From pje at telecommunity.com Tue Aug 23 17:43:02 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 23 Aug 2005 11:43:02 -0400 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings In-Reply-To: <20050823152156.GA7839@mems-exchange.org> References: <20050822213142.GA5702@mems-exchange.org> Message-ID: <5.1.1.6.0.20050823112823.01b22d18@mail.telecommunity.com> At 09:21 AM 8/23/2005 -0600, Neil Schemenauer wrote: > > then of course, one could change ``unicode.__str__()`` to return > > ``self``, itself, which should work. but then, why so complicated? > >I think that may be the right fix. No, it isn't. Right now str(u"x") coerces the unicode object to a string, so changing this will be backwards-incompatible with any existing programs. I think the new builtin is actually the right way to go for both 2.x and 3.x Pythons. i.e., text() would be a builtin in 2.x, along with a new bytes() type, and in 3.x text() could replace the basestring, str and unicode types. I also think that the text() constructor should have a signature of 'text(ob,encoding="ascii")'. In the default case, strings can be returned by text() as long as they are pure ASCII (making the code str-stable *and* unicode-safe). In the non-default case, a unicode object should always be returned, making the code unicode-safe but not str-stable. Allowing text() to return 8-bit strings would be an obvious violation of its name: it's for text, not bytes. From paragate at gmx.net Tue Aug 23 17:45:27 2005 From: paragate at gmx.net (Wolfgang Lipp) Date: Tue, 23 Aug 2005 17:45:27 +0200 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings In-Reply-To: <20050823152156.GA7839@mems-exchange.org> References: <20050822213142.GA5702@mems-exchange.org> <20050823152156.GA7839@mems-exchange.org> Message-ID: i have to revise my last posting -- exporting the new ``str`` pure-python implementation breaks -- of course! -- as soon as ``isinstance(x,str)`` [sic] is used. 
right now it breaks because you can't have a function as the second argument of ``isinstance()``, but even if that could be avoided by canny programming, the fact remains that any object derived from e.g. a string literal will still be constructed from the underlying implementation and can't therefore be an instance of the old ``str``. also, ``str.__bases__`` is not extendable (it's a tuple) and not replaceable (it's a built-in), so there seems to be no way to get near a truly working solution except with C-level patches. On Tue, 23 Aug 2005 17:21:57 +0200, Neil Schemenauer wrote: > I don't think isinstance() would be okay. That test is meant as an > optimization to avoid calling __str__ on str and unicode instances. > Subclasses should still have their __str__ method called otherwise > they cannot override it. makes perfect sense, i'll change the line back. _wolf From mwh at python.net Tue Aug 23 17:44:56 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 23 Aug 2005 16:44:56 +0100 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: <000101c5a7f5$989d5100$8901a044@oemcomputer> (Raymond Hettinger's message of "Tue, 23 Aug 2005 11:16:11 -0400") References: <000101c5a7f5$989d5100$8901a044@oemcomputer> Message-ID: <2mmzn8slmv.fsf@starship.python.net> "Raymond Hettinger" writes: > [Raymond Hettinger] >> > This patch should be reverted or fixed so that the Py2.5 build works >> > again. >> > >> > It contains a disasterous search and replace error that prevents it > from >> > compiling. Hence, it couldn't have passed the test suite before > being >> > checked in. > > [Michael Hudson] >> It works for me, on OS X. Passes the test suite, even. I presume >> you're on Windows of some kind? 
> > > Here's an excerpt from the check-in note for sha512module.c: > > > RND(S[0],S[1],S[2],S[3],S[4],S[5],S[6],S[7],0,0x428a2f98d728ae22ULL); > > RND(S[7],S[0],S[1],S[2],S[3],S[4],S[5],S[6],1,0x7137449123ef65cdULL); > > RND(S[6],S[7],S[0],S[1],S[2],S[3],S[4],S[5],2,0xb5c0fbcfec4d3b2fULL); > > RND(S[5],S[6],S[7],S[0],S[1],S[2],S[3],S[4],3,0xe9b5dba58189dbbcULL); > > RND(S[4],S[5],S[6],S[7],S[0],S[1],S[2],S[3],4,0x3956c25bf348b538ULL); > > Perhaps OS X has some sort of Steve Jobs special constant suffix "ULL" > that Mr. Gates and the ANSI C folks have yet to accept ;-) It's an C99 unsigned long long literal, AFAICT (p70 of the PDF I found lying around somewhere...), so I think it's just Bill who's behind. However, Python doesn't require C99, so it's pretty dodgy code by our standards. Hmm. You have PY_LONG_LONG #define-d, right? Does VC++ 6 (that's what you use, right?) support any kind of long long literal? > If it works for you, then it probably means that sha512module.c was left > out of the build. Nope: [mwh at 82-33-185-193 build-debug]$ ./python.exe Python 2.5a0 (#1, Aug 23 2005, 13:24:32) [GCC 3.3 20030304 (Apple Computer, Inc. build 1671)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import _sha512 [44297 refs] > Maybe sha512module.c wasn't supposed to be checked in? I think if you have a sufficiently modern openssl it's unnecessary. > The project files are just text files and can be updated simply and > directly. But yes, that is no big deal and I'll just do it for him once > the code gets to a compilable state. > > Aside from the project files, there is still config.c and whatnot. Does anything need to be done there? Oh, PC/config.c, right? > We should put together a checklist of all the things that need to be > updated when a new module is added. Sounds like it! 
:) Cheers, mwh -- This makes it possible to pass complex object hierarchies to a C coder who thinks computer science has made no worthwhile advancements since the invention of the pointer. -- Gordon McMillan, 30 Jul 1998 From fredrik at pythonware.com Tue Aug 23 17:51:34 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 23 Aug 2005 17:51:34 +0200 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules_hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 References: <000101c5a7f5$989d5100$8901a044@oemcomputer> <200508231632.30175.gmccaughan@synaptics-uk.com> Message-ID: Gareth McCaughan wrote: > It's valid C99, meaning "this is an unsigned long long". since when does Python require C99 compilers? From mcherm at mcherm.com Tue Aug 23 18:11:02 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Tue, 23 Aug 2005 09:11:02 -0700 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicodestrings Message-ID: <20050823091102.ay5mcm8r2gco4488@login.werra.lunarpages.com> Neil Schemenauer wrote: > The PEP has been rewritten based on a suggestion by Guido to change > str() rather than adding a new built-in function. Based on my testing, I > believe the idea is feasible. Fredrik Lundh replies: > note that this breaks chapter 3 of the tutorial: > > http://docs.python.org/tut/node5.html#SECTION005130000000000000000 > > where str() is first introduced. It's hardly "introduced"... the only bit I found reads: ... When a Unicode string is printed, written to a file, or converted with str(), conversion takes place using this default encoding. >>> u"abc" u'abc' >>> str(u"abc") 'abc' >>> u"äöü" u'\xe4\xf6\xfc' >>> str(u"äöü") Traceback (most recent call last): File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128) To convert a Unicode string into an 8-bit string using a specific encoding, Unicode objects provide an encode() method that takes one argument, the name of the encoding. Lowercase names for encodings are preferred. >>> u"äöü".encode('utf-8') '\xc3\xa4\xc3\xb6\xc3\xbc' I think that if we just took out the example of str() usage and replaced it with a sentence or two that DID introduce the (revised) str() function, it ought to work. In particular, it could mention that you can call str() on any object, which isn't stated here at all. -- Michael Chermside From theller at python.net Tue Aug 23 18:08:42 2005 From: theller at python.net (Thomas Heller) Date: Tue, 23 Aug 2005 18:08:42 +0200 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 References: <000101c5a7f5$989d5100$8901a044@oemcomputer> <2mmzn8slmv.fsf@starship.python.net> Message-ID: Michael Hudson writes: > "Raymond Hettinger" writes: > >> [Raymond Hettinger] >>> > This patch should be reverted or fixed so that the Py2.5 build works >>> > again. >>> > >>> > It contains a disasterous search and replace error that prevents it >> from >>> > compiling. Hence, it couldn't have passed the test suite before >> being >>> > checked in. >> >> [Michael Hudson] >>> It works for me, on OS X. Passes the test suite, even. I presume >>> you're on Windows of some kind?
>> >> >> Here's an excerpt from the check-in note for sha512module.c: >> >> >> RND(S[0],S[1],S[2],S[3],S[4],S[5],S[6],S[7],0,0x428a2f98d728ae22ULL); >> >> RND(S[7],S[0],S[1],S[2],S[3],S[4],S[5],S[6],1,0x7137449123ef65cdULL); >> >> RND(S[6],S[7],S[0],S[1],S[2],S[3],S[4],S[5],2,0xb5c0fbcfec4d3b2fULL); >> >> RND(S[5],S[6],S[7],S[0],S[1],S[2],S[3],S[4],3,0xe9b5dba58189dbbcULL); >> >> RND(S[4],S[5],S[6],S[7],S[0],S[1],S[2],S[3],4,0x3956c25bf348b538ULL); >> >> Perhaps OS X has some sort of Steve Jobs special constant suffix "ULL" >> that Mr. Gates and the ANSI C folks have yet to accept ;-) > > It's an C99 unsigned long long literal, AFAICT (p70 of the PDF I found > lying around somewhere...), so I think it's just Bill who's behind. > However, Python doesn't require C99, so it's pretty dodgy code by our > standards. > > Hmm. You have PY_LONG_LONG #define-d, right? Does VC++ 6 (that's > what you use, right?) support any kind of long long literal? The suffix seems to be 'ui64'. From vc6 limits.h: #if _INTEGRAL_MAX_BITS >= 64 /* minimum signed 64 bit value */ #define _I64_MIN (-9223372036854775807i64 - 1) /* maximum signed 64 bit value */ #define _I64_MAX 9223372036854775807i64 /* maximum unsigned 64 bit value */ #define _UI64_MAX 0xffffffffffffffffui64 #endif Thomas From abkhd at hotmail.com Tue Aug 23 18:23:33 2005 From: abkhd at hotmail.com (A.B., Khalid) Date: Tue, 23 Aug 2005 16:23:33 +0000 Subject: [Python-Dev] Modules _hashopenssl, sha256, sha512 compile in MinGW, test_hmac.py passes Message-ID: Hello, I can also report that MinGW can compile the said modules and (after updating config.c, etc.) the resulting code passes as follows: $ python -i ../Lib/test/test_hmac.py test_md5_vectors (__main__.TestVectorsTestCase) ... ok test_sha_vectors (__main__.TestVectorsTestCase) ... ok test_normal (__main__.ConstructorTestCase) ... ok test_withmodule (__main__.ConstructorTestCase) ... ok test_withtext (__main__.ConstructorTestCase) ... 
ok test_default_is_md5 (__main__.SanityTestCase) ... ok test_exercise_all_methods (__main__.SanityTestCase) ... ok test_attributes (__main__.CopyTestCase) ... ok test_equality (__main__.CopyTestCase) ... ok test_realcopy (__main__.CopyTestCase) ... ok ---------------------------------------------------------------------- Ran 10 tests in 0.050s OK >>> Are these modules going to be built into the core? Regards Khalid From gmccaughan at synaptics-uk.com Tue Aug 23 18:38:20 2005 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Tue, 23 Aug 2005 17:38:20 +0100 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules_hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: References: <000101c5a7f5$989d5100$8901a044@oemcomputer> <200508231632.30175.gmccaughan@synaptics-uk.com> Message-ID: <200508231738.20961.gmccaughan@synaptics-uk.com> On Tuesday 2005-08-23 16:51, Fredrik Lundh wrote: > Gareth McCaughan wrote: > > > It's valid C99, meaning "this is an unsigned long long". > > since when does Python require C99 compilers? > > It doesn't, of course, and I hope it won't for a good while. I was just responding to this: | Perhaps OS X has some sort of Steve Jobs special constant suffix "ULL" | that Mr. Gates and the ANSI C folks have yet to accept since in fact Mr Gates and the ANSI C folks (and the gcc folks, and probably plenty of others I can't check so easily) *have* accepted it.
-- g From raymond.hettinger at verizon.net Tue Aug 23 18:46:58 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 23 Aug 2005 12:46:58 -0400 Subject: [Python-Dev] [Python-checkins]python/dist/src/Modules_hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: Message-ID: <000401c5a802$478c38a0$8901a044@oemcomputer> [Gareth] > > It's valid C99, meaning "this is an unsigned long long". > since when does Python require C99 compilers? Excerpt from PEP 7: "Use ANSI/ISO standard C (the 1989 version of the standard)." From mwh at python.net Tue Aug 23 18:51:05 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 23 Aug 2005 17:51:05 +0100 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules_hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: (Fredrik Lundh's message of "Tue, 23 Aug 2005 17:51:34 +0200") References: <000101c5a7f5$989d5100$8901a044@oemcomputer> <200508231632.30175.gmccaughan@synaptics-uk.com> Message-ID: <2mirxwsikm.fsf@starship.python.net> "Fredrik Lundh" writes: > Gareth McCaughan wrote: > >> It's valid C99, meaning "this is an unsigned long long". > > since when does Python require C99 compilers? Well, it doesn't, but Raymond was suggesting the code was GCC specific, or something. Cheers, mwh -- Check out the comments in this source file that start with: # Oh, lord help us.
-- Mark Hammond gets to play with the Outlook object model From raymond.hettinger at verizon.net Tue Aug 23 17:47:14 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 23 Aug 2005 11:47:14 -0400 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: <2mmzn8slmv.fsf@starship.python.net> Message-ID: <000101c5a7f9$efb77480$8901a044@oemcomputer> [Michael Hudson] > It's an C99 unsigned long long literal, AFAICT (p70 of the PDF I found > lying around somewhere...), so I think it's just Bill who's behind. > However, Python doesn't require C99, so it's pretty dodgy code by our > standards. More than just dodgy. Excerpt from PEP 7: "Use ANSI/ISO standard C (the 1989 version of the standard)." Raymond From nas at arctrix.com Tue Aug 23 18:54:09 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 23 Aug 2005 10:54:09 -0600 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings In-Reply-To: <5.1.1.6.0.20050823112823.01b22d18@mail.telecommunity.com> References: <20050822213142.GA5702@mems-exchange.org> <5.1.1.6.0.20050823112823.01b22d18@mail.telecommunity.com> Message-ID: <20050823165409.GA8026@mems-exchange.org> On Tue, Aug 23, 2005 at 11:43:02AM -0400, Phillip J. Eby wrote: > At 09:21 AM 8/23/2005 -0600, Neil Schemenauer wrote: > >> then of course, one could change ``unicode.__str__()`` to return > >> ``self``, itself, which should work. but then, why so complicated? > > > >I think that may be the right fix. > > No, it isn't. Right now str(u"x") coerces the unicode object to a > string, so changing this will be backwards-incompatible with any > existing programs. I meant that for the implementation of the PEP, changing unicode.__str__ to return self seems to be the right fix. Whether you believe that str() should be allowed to return unicode instances is a different question.
> I think the new builtin is actually the right way to go for both 2.x and > 3.x Pythons. i.e., text() would be a builtin in 2.x, along with a new > bytes() type, and in 3.x text() could replace the basestring, str and > unicode types. Perhaps the critical question is what will the string type in P3k be called? If it will be 'str' then I think the PEP makes sense. If it will be something else, then there should be a corresponding type slot (e.g. __text__). What method does your proposed text() built-in call? > I also think that the text() constructor should have a signature of > 'text(ob,encoding="ascii")'. I think that's a bad idea. We want to get away from ASCII and use Unicode instead. > In the default case, strings can be returned by text() as long as > they are pure ASCII (making the code str-stable *and* > unicode-safe). I think you misunderstand the PEP. Your proposed function is neither Unicode-safe nor str-stable, the worst of both worlds. Passing it a unicode string that contains non-ASCII characters would result in an exception (not Unicode-safe). Passing it a str results in a unicode return value (not str-stable). Neil From nas at arctrix.com Tue Aug 23 19:00:06 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 23 Aug 2005 11:00:06 -0600 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings In-Reply-To: References: <20050822213142.GA5702@mems-exchange.org> <20050823152156.GA7839@mems-exchange.org> Message-ID: <20050823170003.GB8026@mems-exchange.org> On Tue, Aug 23, 2005 at 05:45:27PM +0200, Wolfgang Lipp wrote: > i have to revise my last posting -- exporting the new ``str`` > pure-python implementation breaks -- of course! -- as soon > as ``isinstance(x,str)`` [sic] is used Right. I tried to come up with a pure Python version so people could test their code. 
This was my latest attempt before giving up (from memory): # inside site.py _old_str_new = str.__new__ def _str_new(self, s): if type(self) not in (str, unicode): return _old_str_new(self, s) if type(s) not in (str, unicode): return s r = s.__str__() if not isinstance(r, (str, unicode)): raise TypeError('__str__ returned non-string') return r str.__new__ = _str_new It doesn't work though: TypeError: can't set attributes of built-in/extension type 'str' Maybe someone else has a clever solution. Neil From pje at telecommunity.com Tue Aug 23 19:14:24 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 23 Aug 2005 13:14:24 -0400 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings In-Reply-To: <20050823165409.GA8026@mems-exchange.org> References: <5.1.1.6.0.20050823112823.01b22d18@mail.telecommunity.com> <20050822213142.GA5702@mems-exchange.org> <5.1.1.6.0.20050823112823.01b22d18@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050823130921.02a43da8@mail.telecommunity.com> At 10:54 AM 8/23/2005 -0600, Neil Schemenauer wrote: >On Tue, Aug 23, 2005 at 11:43:02AM -0400, Phillip J. Eby wrote: > > At 09:21 AM 8/23/2005 -0600, Neil Schemenauer wrote: > > >> then of course, one could change ``unicode.__str__()`` to return > > >> ``self``, itself, which should work. but then, why so complicated? > > > > > >I think that may be the right fix. > > > > No, it isn't. Right now str(u"x") coerces the unicode object to a > > string, so changing this will be backwards-incompatible with any > > existing programs. > >I meant that for the implementation of the PEP, changing >unicode.__str__ to return self seems to be the right fix. Whether >you believe that str() should be allowed to return unicode instances >is a different question. > > > I think the new builtin is actually the right way to go for both 2.x and > > 3.x Pythons. 
i.e., text() would be a builtin in 2.x, along with a new > > bytes() type, and in 3.x text() could replace the basestring, str and > > unicode types. > >Perhaps the critical question is what will the string type in P3k be >called? If it will be 'str' then I think the PEP makes sense. If >it will be something else, then there should be a corresponding type >slot (e.g. __text__). What method does your proposed text() >built-in call? Heck if I know. :) I think the P3k string type should just be called 'text', though, so we can leave the whole unicode/str mess behind. > > I also think that the text() constructor should have a signature of > > 'text(ob,encoding="ascii")'. > >I think that's a bad idea. We want to get away from ASCII and use >Unicode instead. It's not str-stable if it returns unicode for a string input. > > In the default case, strings can be returned by text() as long as > > they are pure ASCII (making the code str-stable *and* > > unicode-safe). > >I think you misunderstand the PEP. Your proposed function is >neither Unicode-safe nor str-stable, the worst of both worlds. >Passing it a unicode string that contains non-ASCII characters would >result in an exception (not Unicode-safe). Passing it a str results >in a unicode return value (not str-stable). I think you misunderstand my proposal. :) I'm proposing rough semantics of: def text(ob, encoding='ascii'): if isinstance(ob,unicode): return ob ob = str(ob) # or ob.__text__, then fallback to __unicode__/__str__ if encoding=='ascii' and isinstance(ob,str): unicode(ob,encoding) # check for purity return ob # return the string if it's pure return unicode(ob, encoding) This is str-stable *and* unicode-safe. 
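The rough semantics above are written for Python 2, where str is a byte string and unicode is the text type. A minimal sketch of the same idea for a current Python 3 interpreter follows. It is an illustration under stated assumptions, not the code from the message: bytes stands in for the old str, str stands in for unicode, and the shortcut for other objects (calling str() directly) is a simplification introduced here.

```python
def text(ob, encoding="ascii"):
    """Python 3 analogue of the proposed text() built-in (illustrative only).

    Assumption: bytes plays the role of Python 2's str, and str plays the
    role of Python 2's unicode.
    """
    if isinstance(ob, str):
        # Already text ("unicode" in the original): return it untouched.
        return ob
    if not isinstance(ob, bytes):
        # Simplification made for this sketch: other objects go through str().
        return str(ob)
    # Decoding doubles as the purity check; it raises UnicodeDecodeError
    # if the bytes are not valid in the requested encoding.
    decoded = ob.decode(encoding)
    if encoding == "ascii":
        # Default case: a pure-ASCII byte string comes back unchanged.
        return ob
    # Non-default encoding: always hand back decoded text.
    return decoded


if __name__ == "__main__":
    print(text(b"abc"))
    print(text(b"\xc3\xa4", "utf-8"))
    print(text("\xe4\xf6\xfc"))
```

With the default encoding, a byte string containing non-ASCII bytes raises UnicodeDecodeError instead of being returned, which corresponds to the "check for purity" step in the message.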
From reinhold-birkenfeld-nospam at wolke7.net Tue Aug 23 19:23:25 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Tue, 23 Aug 2005 19:23:25 +0200 Subject: [Python-Dev] python/dist/src/Doc/tut tut.tex,1.276,1.277 In-Reply-To: <20050823150057.057C91E400B@bag.python.org> References: <20050823150057.057C91E400B@bag.python.org> Message-ID: rhettinger at users.sourceforge.net wrote: I'm not a native speaker, but... > @@ -114,7 +114,7 @@ > programs, or to test functions during bottom-up program development. > It is also a handy desk calculator. > > -Python allows writing very compact and readable programs. Programs > +Python enables programs to written compactly and readably. Programs > written in Python are typically much shorter than equivalent C or > \Cpp{} programs, for several reasons: > \begin{itemize} ...shouldn't it be "programs to be written compactly"? > @@ -1753,8 +1753,8 @@ > > \begin{methoddesc}[list]{pop}{\optional{i}} > Remove the item at the given position in the list, and return it. If > -no index is specified, \code{a.pop()} returns the last item in the > -list. The item is also removed from the list. (The square brackets > +no index is specified, \code{a.pop()} removes and returns the last item > +in the list. The item is also removed from the list. (The square brackets > around the \var{i} in the method signature denote that the parameter > is optional, not that you should type square brackets at that > position. You will see this notation frequently in the That says the same thing twice (removal from list). > @@ -1985,7 +1987,9 @@ > \section{The \keyword{del} statement \label{del}} > > There is a way to remove an item from a list given its index instead > -of its value: the \keyword{del} statement. This can also be used to > +of its value: the \keyword{del} statement.
Unlike the \method{pop()}) > +method which returns a value, the \keyword{del} keyword is a statement > +and can also be used to > remove slices from a list (which we did earlier by assignment of an > empty list to the slice). For example: The del keyword is a statement? > @@ -2133,8 +2137,8 @@ > keys. Tuples can be used as keys if they contain only strings, > numbers, or tuples; if a tuple contains any mutable object either > directly or indirectly, it cannot be used as a key. You can't use > -lists as keys, since lists can be modified in place using their > -\method{append()} and \method{extend()} methods, as well as slice and > +lists as keys, since lists can be modified in place using methods like > +\method{append()} and \method{extend()} or modified with slice and > indexed assignments. Is the second "modified" necessary? > @@ -5595,8 +5603,8 @@ > to round it again can't make it better: it was already as good as it > gets. > > -Another consequence is that since 0.1 is not exactly 1/10, adding 0.1 > -to itself 10 times may not yield exactly 1.0, either: > +Another consequence is that since 0.1 is not exactly 1/10, > +summing ten values of 0.1 may not yield exactly 1.0, either: > > \begin{verbatim} > >>> sum = 0.0 Is it clear from context that the "0.1 is not exactly 1/10" refers to floating point only? > @@ -5637,7 +5645,7 @@ > you can perform an exact analysis of cases like this yourself. Basic > familiarity with binary floating-point representation is assumed. > > -\dfn{Representation error} refers to that some (most, actually) > +\dfn{Representation error} refers to fact that some (most, actually) > decimal fractions cannot be represented exactly as binary (base 2) > fractions. This is the chief reason why Python (or Perl, C, \Cpp, > Java, Fortran, and many others) often won't display the exact decimal "...refers to the fact..."? Reinhold -- Mail address is perfectly valid! 
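The tutorial's claim that summing ten values of 0.1 may not yield exactly 1.0 applies to binary floating point and can be checked directly. A small sketch for a current Python 3 interpreter, with illustrative variable names; math.fsum is a standard-library function that postdates this thread:

```python
import math

values = [0.1] * 10

# Naive left-to-right summation accumulates the representation error of 0.1.
naive_total = 0.0
for v in values:
    naive_total += v

# math.fsum tracks the lost low-order bits and rounds only once at the end.
exact_total = math.fsum(values)

print(naive_total == 1.0)
print(exact_total == 1.0)
```

The first comparison comes out False and the second True on IEEE 754 double-precision platforms, which is the behaviour the tutorial paragraph describes.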
From pje at telecommunity.com Tue Aug 23 19:57:27 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 23 Aug 2005 13:57:27 -0400 Subject: [Python-Dev] python/dist/src/Doc/tut tut.tex,1.276,1.277 In-Reply-To: References: <20050823150057.057C91E400B@bag.python.org> <20050823150057.057C91E400B@bag.python.org> Message-ID: <5.1.1.6.0.20050823135140.032c5ea0@mail.telecommunity.com> At 07:23 PM 8/23/2005 +0200, Reinhold Birkenfeld wrote: >rhettinger at users.sourceforge.net wrote: > >I'm not a native speaker, but... > > > @@ -114,7 +114,7 @@ > > programs, or to test functions during bottom-up program development. > > It is also a handy desk calculator. > > > > -Python allows writing very compact and readable programs. Programs > > +Python enables programs to written compactly and readably. Programs > > written in Python are typically much shorter than equivalent C or > > \Cpp{} programs, for several reasons: > > \begin{itemize} > >...shouldn't it be "programs to be written compactly"? It looks to me like the original text here should stand; Python doesn't "enable programs to be written"; it enables people to write them. That is, the passive voice should be avoided if possible. ;-) > > @@ -1985,7 +1987,9 @@ > > \section{The \keyword{del} statement \label{del}} > > > > There is a way to remove an item from a list given its index instead > > -of its value: the \keyword{del} statement. This can also be used to > > +of its value: the \keyword{del} statement. Unlike the \method{pop()}) > > +method which returns a value, the \keyword{del} keyword is a statement > > +and can also be used to > > remove slices from a list (which we did earlier by assignment of an > > empty list to the slice). For example: > >The del keyword is a statement? The keyword certainly isn't. This section also looks like it should stand the way it was, or else say that "unlike the pop() method, the del statement can also be used to remove slices...". 
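The distinction discussed above, that pop() is a method returning the removed item while del is a statement that can also remove slices, can be sketched in a few lines. The list contents and variable names are illustrative only:

```python
items = list("abcdef")

last = items.pop()   # method call: removes the last item and returns it
del items[0]         # statement: removes the item at index 0, yields no value
del items[1:3]       # statement: removes a whole slice, which pop() cannot do

print(last, items)
```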
From greg at electricrain.com Tue Aug 23 20:59:29 2005 From: greg at electricrain.com (Gregory P. Smith) Date: Tue, 23 Aug 2005 11:59:29 -0700 Subject: [Python-Dev] [Python-checkins] python/dist/src setup.py, 1.219, 1.220 In-Reply-To: <003101c5a717$83be4b60$3c23a044@oemcomputer> References: <20050821184639.EF8711E4006@bag.python.org> <003101c5a717$83be4b60$3c23a044@oemcomputer> Message-ID: <20050823185929.GI16043@zot.electricrain.com> On Mon, Aug 22, 2005 at 08:46:27AM -0400, Raymond Hettinger wrote: > > A new hashlib module to replace the md5 and sha modules. It adds > > support for additional secure hashes such as SHA-256 and SHA-512. The > > hashlib module uses OpenSSL for fast platform optimized > > implementations of algorithms when available. The old md5 and sha > > modules still exist as wrappers around hashlib to preserve backwards > > compatibility. > > I'm getting compilation errors: > > C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad > suffix on number' > C:\py25\Modules\sha512module.c(146) : error C2146: syntax error : > missing ')' before identifier 'L' > C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad > suffix on number' > C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad > suffix on number' > C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad > suffix on number' > C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad > suffix on number' > C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad > suffix on number' > C:\py25\Modules\sha512module.c(146) : fatal error C1013: compiler limit > : too many open parentheses > > > Also, there should be updating entries to Misc/NEWS, > PC/VC6/pythoncore.dsp, and PC/config.c. > > > Raymond I don't have a win32 dev environment at the moment so i didn't see that. Sorry. If you remove the 'ULL' suffix from all of the 64bit constants in that file what happens? 
I added the ULLs to squelch the mass of warnings about constants being too large for the datatype that gcc 3.3 was spewing. -greg From greg at electricrain.com Tue Aug 23 21:04:30 2005 From: greg at electricrain.com (Gregory P. Smith) Date: Tue, 23 Aug 2005 12:04:30 -0700 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: <000601c5a7ec$9f614680$8901a044@oemcomputer> References: <20050821184613.A45C11E4288@bag.python.org> <000601c5a7ec$9f614680$8901a044@oemcomputer> Message-ID: <20050823190430.GJ16043@zot.electricrain.com> > This patch should be reverted or fixed so that the Py2.5 build works > again. > > It contains a disastrous search and replace error that prevents it from > compiling. Hence, it couldn't have passed the test suite before being > checked in. > > Also, all of the project and config files need to be updated for the new > modules. It passes fine on Linux. I don't have a Windows dev environment. Regardless, the quick way to work around the sha512 issue on Windows is to comment it out in setup.py, comment out the sha384 and sha512 tests in test_hashlib.py, and commit that until the compilation issues are worked out.
-g > > -----Original Message----- > > From: python-checkins-bounces at python.org [mailto:python-checkins- > > bounces at python.org] On Behalf Of greg at users.sourceforge.net > > Sent: Sunday, August 21, 2005 2:46 PM > > To: python-checkins at python.org > > Subject: [Python-checkins] python/dist/src/Modules _hashopenssl.c, > > NONE,2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,2.1 > md5module.c, > > 2.35, 2.36 shamodule.c, 2.22, 2.23 > > > > Update of /cvsroot/python/python/dist/src/Modules > > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv32064/Modules > > > > Modified Files: > > md5module.c shamodule.c > > Added Files: > > _hashopenssl.c sha256module.c sha512module.c > > Log Message: > > [ sf.net patch # 1121611 ] > > > > A new hashlib module to replace the md5 and sha modules. It adds > > support for additional secure hashes such as SHA-256 and SHA-512. The > > hashlib module uses OpenSSL for fast platform optimized > > implementations of algorithms when available. The old md5 and sha > > modules still exist as wrappers around hashlib to preserve backwards > > compatibility. From raymond.hettinger at verizon.net Tue Aug 23 21:09:50 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 23 Aug 2005 15:09:50 -0400 Subject: [Python-Dev] [Python-checkins] python/dist/src setup.py, 1.219, 1.220 In-Reply-To: <20050823185929.GI16043@zot.electricrain.com> Message-ID: <000201c5a816$3caacaa0$8901a044@oemcomputer> [Gregory P. Smith] > I don't have a win32 dev environment at the moment so i didn't see > that. Sorry. No big deal. But we still have to get the code back to ANSI compliance. Do you have an ANSI-strict option with your compiler? 
Raymond From barry at python.org Tue Aug 23 21:27:01 2005 From: barry at python.org (Barry Warsaw) Date: Tue, 23 Aug 2005 15:27:01 -0400 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules_hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: References: <000101c5a7f5$989d5100$8901a044@oemcomputer> <200508231632.30175.gmccaughan@synaptics-uk.com> Message-ID: <1124825221.16679.4.camel@geddy.wooz.org> On Tue, 2005-08-23 at 11:51, Fredrik Lundh wrote: > Gareth McCaughan wrote: > > > It's valid C99, meaning "this is an unsigned long long". > > since when does Python require C99 compilers? Why, since Python 3.0 of course! -Barry From keir at cs.toronto.edu Tue Aug 23 22:10:21 2005 From: keir at cs.toronto.edu (Keir Mierle) Date: Tue, 23 Aug 2005 16:10:21 -0400 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) Message-ID: <20050823201021.GE32195@cs.toronto.edu> Hi, I'm working on Argon (http://www.third-bit.com/trac/argon) with Greg Wilson this summer. We're having a very strange problem with Python's unicode parsing of source files. Basically, our CGI script was running extremely slowly on our production box (a pokey dual-Xeon 3GHz w/ 4GB RAM and 15K SCSI drives). Slow to the tune of 6-10 seconds per request. I eventually tracked this down to imports of our source tree; the actual request was completing in 300ms, the rest of the time was spent in __import__. After doing some gprof profiling, I discovered _PyUnicodeUCS2_IsLinebreak was getting called 51 million times.
Our code is 1.2 million characters, so I hardly think it makes sense to call IsLinebreak 50 times for each character; and we're not even importing our entire source tree on every invocation. Our code is a fork of Trac, and originally had these lines at the top: # -*- coding: iso8859-1 -*- This made me suspicious, so I removed all of them. The CGI execution time immediately dropped to ~1 second. gprof revealed that _PyUnicodeUCS2_IsLinebreak is not called at all anymore. Now that our code works fast enough, I don't really care about this, but I thought python-dev might want to know something weird is going on with unicode splitlines. I documented my investigation of this problem; if anyone wants further details, just email me. (I'm not on python-dev) http://www.third-bit.com/trac/argon/ticket/525 Thanks in advance, Keir From martin at v.loewis.de Tue Aug 23 23:13:42 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 23 Aug 2005 23:13:42 +0200 Subject: [Python-Dev] [Python-checkins] python/dist/src setup.py, 1.219, 1.220 In-Reply-To: <000201c5a816$3caacaa0$8901a044@oemcomputer> References: <000201c5a816$3caacaa0$8901a044@oemcomputer> Message-ID: <430B9186.3010106@v.loewis.de> Raymond Hettinger wrote: > But we still have to get the code back to ANSI compliance. > Do you have an ANSI-strict option with your compiler? Please don't call this "ANSI compliant". ANSI does many more things than writing C standards, and, in this specific case, the code *is* ANSI compliant as it stands - it just doesn't comply with C89. It complies with ISO C 99, which (I believe) is also a U.S. national (ANSI) standard. gcc does have an option to force C89 compliance, but there is a good chance that Python stops compiling with that option: on many systems, essential system headers fail to comply with C89 (in addition, activating that mode also makes many extensions unavailable).
Regards, Martin From tjreedy at udel.edu Tue Aug 23 23:23:50 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 23 Aug 2005 17:23:50 -0400 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 References: <2mmzn8slmv.fsf@starship.python.net> <000101c5a7f9$efb77480$8901a044@oemcomputer> Message-ID: "Raymond Hettinger" wrote in message news:000101c5a7f9$efb77480$8901a044 at oemcomputer... > Excerpt from PEP 7: > > "Use ANSI/ISO standard C (the 1989 version of the standard)." Just checked (P&B, Standard C): only one L allowed, not two. But with C99 compilers becoming more common, accidental usages of C99-isms in submitted code will likely become more common, especially when there is not a graceful C89 alternative. While the current policy should be followed while it remains the policy, it may need revision someday. Terry J. Reedy From greg at electricrain.com Tue Aug 23 23:32:22 2005 From: greg at electricrain.com (Gregory P. Smith) Date: Tue, 23 Aug 2005 14:32:22 -0700 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: <000101c5a7f5$989d5100$8901a044@oemcomputer> References: <2mvf1wsp0k.fsf@starship.python.net> <000101c5a7f5$989d5100$8901a044@oemcomputer> Message-ID: <20050823213222.GK16043@zot.electricrain.com> > The project files are just text files and can be updated simply and > directly. But yes, that is no big deal and I'll just do it for him once > the code gets to a compilable state. I just checked in an update removing all of the ULLs. Could you check that it compiles on Windows and passes test_hashlib.py now? It does leave gcc 3.x users with a big mess of compiler warnings to deal with, but those can be worked around once the build is actually working everywhere. Thanks.
Greg > Aside from the project files, there is still config.c and whatnot. We > should put together a checklist of all the things that need to be > updated when a new module is added. that'd be helpful. :) From martin at v.loewis.de Tue Aug 23 23:43:34 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 23 Aug 2005 23:43:34 +0200 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: References: <2mmzn8slmv.fsf@starship.python.net> <000101c5a7f9$efb77480$8901a044@oemcomputer> Message-ID: <430B9886.4060004@v.loewis.de> Terry Reedy wrote: > Just checked (P&B, Standard C): only one L allowed, not two. But with C99 > compilers becoming more common, accidental usages of C99-isms in submitted > code will likely become more common, especially when there is not a > graceful C89 alternative. While the current policy should be followed > while it remains the policy, it may need revision someday. I think Python switched to C89 in 1999 (shortly before C99 was published, IIRC). So the canonical time for switching to C99 would be in 2009, provided all interesting compilers have implemented it by then, at least to the degree that Python would typically need. Regards, Martin From raymond.hettinger at verizon.net Wed Aug 24 02:29:37 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 23 Aug 2005 20:29:37 -0400 Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23 In-Reply-To: <20050823213222.GK16043@zot.electricrain.com> Message-ID: <001001c5a842$e9ac0580$ab12c797@oemcomputer> [Gregory P. Smith] > I just checked in an update removing all of the ULLs. Could you check > that it compiles on windows and passes test_hashlib.py now? Okay, all is well.
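A quick sanity check of the kind being asked for here might look like the sketch below. It is not taken from test_hashlib.py; it assumes only the documented hashlib constructors as they exist in a current Python 3, and uses the well-known SHA-256 digest of b"abc" as its reference value.

```python
import hashlib

# Named constructors for the newly added algorithms.
sha256_hex = hashlib.sha256(b"abc").hexdigest()
sha512_digest = hashlib.sha512(b"abc").digest()

# hashlib.new() selects an algorithm by name instead.
sha384 = hashlib.new("sha384")
sha384.update(b"abc")

print(sha256_hex)
print(len(sha512_digest), sha384.digest_size)
```

Checking digest sizes as well as one known value gives a cheap signal that the 64-bit code paths in the SHA-384/SHA-512 implementation were built at all.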
Raymond From raymond.hettinger at verizon.net Wed Aug 24 03:23:32 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 23 Aug 2005 21:23:32 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 Message-ID: <001601c5a84a$715b9200$ab12c797@oemcomputer> The latest version of PEP 348 still proposes that a bare except clause will default to Exception instead of BaseException. Initially, I had thought that might be a good idea but now think it is doomed and needs to be removed from the PEP. A bare except belongs at the end of a try suite, not in the middle. This is obvious when compared to: if a: ... elif b: ... elif c: ... else: ... # The bare else goes at the end # and serves as a catchall or switch c case a: ... case b: ... default: ... # The bare default goes at the end # and serves as a catchall In contrast, Brett's 8/9 note revealed that the following would be allowable and common if the PEP is accepted in its current form: try: ... except: ... # A bare except in the middle. WTF? except (KeyboardInterrupt, SystemExit): ... The right way is, of course: try: ... except (KeyboardInterrupt, SystemExit): ... except: # Implicit or explicit match to BaseException # that serves as a catchall For those not needing a terminating exception handler, the rest of the PEP appropriately allows and encourages a simple and explicit solution that meets most needs: try: ... except Exception: ... The core issue is that the most obvious meaning of a bare except is "catchall", not "catchmost". When the latter is intended, the simple and explicit form shown in the last example is the way to go. If the former is intended, then either a bare except clause or explicit mention of BaseException will do nicely. However, under the PEP proposal, both new and existing code will suffer from having bare except clauses that look like they catch everything, are intended to catch everything, but, in fact, do not. That kind of optical illusion error must be avoided. 
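The gap between "catchall" and "catchmost" can be seen in a small sketch. It reflects the behaviour of a current Python 3, where a bare except matches BaseException; it does not show the semantics proposed in the PEP. The helper names are invented for the illustration.

```python
def caught_by_except_exception(exc):
    """Return True if 'except Exception' stops exc, False if it escapes."""
    try:
        try:
            raise exc
        except Exception:
            return True
    except BaseException:
        # Outer guard so that an escaping exception does not end the demo.
        return False


def caught_by_bare_except(exc):
    """Return True if a bare 'except:' stops exc."""
    try:
        raise exc
    except:  # deliberately bare for the demonstration
        return True


for exc in (ValueError("x"), KeyboardInterrupt(), SystemExit(1)):
    print(type(exc).__name__,
          caught_by_except_exception(exc),
          caught_by_bare_except(exc))
```

The two helpers agree on ValueError and disagree on KeyboardInterrupt and SystemExit, which is exactly the difference a reader cannot see from the bare form alone.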
There is no getting around our mind's propensity to interpret the bare form as defaulting to the top of the tree rather than the middle as proposed by the PEP. Likewise, there is no getting around the mental confusion caused by a bare except clause in the middle of a try-suite rather than at the end. We have to avoid code that looks like it does one thing but actually does something else. Raymond From gvanrossum at gmail.com Wed Aug 24 03:30:20 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue, 23 Aug 2005 18:30:20 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <001601c5a84a$715b9200$ab12c797@oemcomputer> References: <001601c5a84a$715b9200$ab12c797@oemcomputer> Message-ID: On 8/23/05, Raymond Hettinger wrote: > The latest version of PEP 348 still proposes that a bare except clause > will default to Exception instead of BaseException. Initially, I had > thought that might be a good idea but now think it is doomed and needs > to be removed from the PEP. If we syntactically enforce that the bare except, if present, must be last, would that remove your objection? I agree that a bare except in the middle is an anomaly, but that doesn't mean we can't keep bare except: as a shorthand for except Exception:. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Wed Aug 24 04:41:01 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 23 Aug 2005 22:41:01 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: Message-ID: <001d01c5a855$447a9d20$ab12c797@oemcomputer> [Guido van Rossum] > If we syntactically enforce that the bare except, if present, must be > last, would that remove your objection? I agree that a bare except in > the middle is an anomaly, but that doesn't mean we can't keep bare > except: as a shorthand for except Exception:. Hmm. Prohibiting mid-suite bare excepts is progress and eliminates the case that causes immediate indigestion.
As for the rest, I'm not as sure and it would be helpful to get thoughts from others on this one. My sense is that blocking the clause from appearing in the middle is treating the symptom and not the disease. The motivating case for most of the PEP was that folks were writing bare except clauses and trapping more than they should. Much of the concern was dealt with just by giving a choice between writing Exception and BaseException depending on the intended result. That leaves the question of the default for a bare except, with Exception being the most useful and BaseException being the most obvious. While I don't doubt that Exception is the more useful, we have already introduced a new builtin and moved two other exceptions to meet this need. Going further and altering the meaning of bare except seems like overkill for a relatively small issue. My remaining concern is about obviousness. How much code has been written or will be written that intends a bare except to mean BaseException instead of Exception? Would such code erroneously pass a code review or inspection? I suspect it would. The code looks like it does one thing but actually does something else. This may or may not be a big deal. Raymond From niko at alum.mit.edu Wed Aug 24 09:07:58 2005 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 24 Aug 2005 09:07:58 +0200 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <001d01c5a855$447a9d20$ab12c797@oemcomputer> References: <001d01c5a855$447a9d20$ab12c797@oemcomputer> Message-ID: <00236367-938E-4D75-866E-2F1A5DEABEC0@alum.mit.edu> > As for the rest, I'm not as sure and it would be helpful to get > thoughts > from others on this one. My sense is that blocking the clause from > appearing in the middle is treating the symptom and not the disease. +1 It would be better to prohibit bare except entirely (well, presumably at some point in the future with appropriate warnings at the moment) than change its semantics.
I agree that its intuitive meaning is "if anything is thrown", not, "if a non-programmer-error exception is thrown," but I'm not sure if that's even important. The point is that it has existing well defined semantics; changing them just seems unnecessary to the aims of the rewrite and confusing to existing Python programmers. I've written plenty of code with bare excepts and they all intended to catch *any* exception, usually in a user interface where I wanted to return to the main loop on programmer error not abort the entire program. I don't relish the thought of going back and changing existing code, and I imagine there are few who do. My 2 cents, Niko From ncoghlan at gmail.com Wed Aug 24 11:26:02 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 24 Aug 2005 19:26:02 +1000 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <001601c5a84a$715b9200$ab12c797@oemcomputer> References: <001601c5a84a$715b9200$ab12c797@oemcomputer> Message-ID: <430C3D2A.3070103@gmail.com> Raymond Hettinger wrote: > The latest version of PEP 348 still proposes that a bare except clause > will default to Exception instead of BaseException. Initially, I had > thought that might be a good idea but now think it is doomed and needs > to be removed from the PEP. One thing I assumed was that _if_ bare excepts were kept, they would still only be allowed as the last except clause. That is, this example: > try: ... > except: ... # A bare except in the middle. WTF? > except (KeyboardInterrupt, SystemExit): ... would still be a syntax error, even if bare excepts were allowed. I still have some qualms about the idea of a bare except that doesn't catch everything (I'd prefer to see them gone altogether), but I don't mind quite as much if the above code stays as a syntax error. Cheers, Nick. 
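That such code is rejected can be checked without running it, by handing the source to compile(). The snippet below is a sketch against current CPython, which reports a bare except that is not the last clause as a SyntaxError; the source string is made up for the test.

```python
# Hypothetical source text with a bare except placed before a specific clause.
SOURCE = (
    "try:\n"
    "    pass\n"
    "except:\n"
    "    pass\n"
    "except (KeyboardInterrupt, SystemExit):\n"
    "    pass\n"
)

try:
    compile(SOURCE, "<bare-except-demo>", "exec")
    rejected = False
except SyntaxError:
    rejected = True

print("rejected at compile time:", rejected)
```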
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From walter at livinglogic.de Wed Aug 24 11:45:33 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed, 24 Aug 2005 11:45:33 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <20050823201021.GE32195@cs.toronto.edu> References: <20050823201021.GE32195@cs.toronto.edu> Message-ID: <430C41BD.8010602@livinglogic.de> Keir Mierle wrote: > Hi, I'm working on Argon (http://www.third-bit.com/trac/argon) with Greg > Wilson this summer > > We're having a very strange problem with Python's unicode parsing of source > files. Basically, our CGI script was running extremely slowly on our production > box (a pokey dual-Xeon 3GHz w/ 4GB RAM and 15K SCSI drives). Slow to the tune > of 6-10 seconds per request. I eventually tracked this down to imports of our > source tree; the actual request was completing in 300ms, the rest of the time > was spent in __import__. This is caused by the changes to the codecs in 2.4. Basically the codecs no longer rely on C's readline() to do line splitting (which can't work for UTF-16), but do it themselves (via unicode.splitlines()). > After doing some gprof profiling, I discovered _PyUnicodeUCS2_IsLinebreak was > getting called 51 million times. Our code is 1.2 million characters, so I > hardly think it makes sense to call IsLinebreak 50 times for each character; > and we're not even importing our entire source tree on every invocation. But if you're using CGI, you're importing your source on every invocation. Switching to a different server side technology might help. Nevertheless 50 million calls seems to be a bit much. > Our code is a fork of Trac, and originally had these lines at the top: > > # -*- coding: iso8859-1 -*- > > This made me suspicious, so I removed all of them.
The CGI execution time > immediately dropped to ~1 second. gprof revealed that > _PyUnicodeUCS2_IsLinebreak is not called at all anymore. > > Now that our code works fast enough, I don't really care about this, but I > thought python-dev might want to know something weird is going on with unicode > splitlines. I wonder if we should switch back to a simple readline() implementation for those codecs that don't require the current implementation (basically every charmap codec). AFAIK source files are opened in universal newline mode, so at least we'd get proper treatment of "\n", "\r" and "\r\n" line ends, but we'd lose u"\x1c", u"\x1d", u"\x1e", u"\x85", u"\u2028" and u"\u2029" (which are line terminators according to unicode.splitlines()). > I documented my investigation of this problem; if anyone wants further details, > just email me. (I'm not on python-dev) > http://www.third-bit.com/trac/argon/ticket/525 Bye, Walter Dörwald From martin at v.loewis.de Wed Aug 24 12:16:25 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 24 Aug 2005 12:16:25 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430C41BD.8010602@livinglogic.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> Message-ID: <430C48F9.8060801@v.loewis.de> Walter Dörwald wrote: > This is caused by the changes to the codecs in 2.4. Basically the codecs > no longer rely on C's readline() to do line splitting (which can't work > for UTF-16), but do it themselves (via unicode.splitlines()). That explains why you get any calls to IsLineBreak; it doesn't explain why you get so many of them. I investigated this a bit, and one issue seems to be that StreamReader.readline performs splitlines on the entire input, only to fetch the first line. It then joins the rest for later processing. In addition, it also performs splitlines on a single line, just to strip any trailing line breaks.
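The cost of the "split everything, keep the first line, rejoin the rest" pattern described here can be sketched with a simple counting model. This is not the codecs.StreamReader code; the function names are hypothetical, and the model assumes that one splitlines() call examines every character of the buffer it is given once.

```python
def readline_by_full_split(buffer):
    """Split the whole buffer, return the first line, and rejoin the rest.

    Returns (line, rest, chars_scanned); chars_scanned models the work done
    by splitlines(), which has to look at every character in the buffer.
    """
    lines = buffer.splitlines(True)   # True keeps the line endings
    scanned = len(buffer)
    if not lines:
        return "", "", scanned
    return lines[0], "".join(lines[1:]), scanned


def count_scans(text):
    """Read text line by line and total the characters examined on the way."""
    total = 0
    rest = text
    while rest:
        _line, rest, scanned = readline_by_full_split(rest)
        total += scanned
    return total


text = "x\n" * 1000                   # 1000 lines, 2000 characters
print(len(text), count_scans(text))
```

Under this model the work grows with the square of the number of lines, so characters near the end of the buffer are examined once for every line that precedes them.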
The net effect is that, for a file with N lines, IsLineBreak is invoked up to N*N/2 times per character (at least for the last character). So I think it would be best if Unicode characters exposed a .islinebreak method (or, failing that, codecs just knew what the line break characters are in Unicode 3.2), and then codecs would split off the first line of input itself. >>After doing some gprof profiling, I discovered _PyUnicodeUCS2_IsLinebreak was >>getting called 51 million times. Our code is 1.2 million characters, so I >>hardly think it makes sense to call IsLinebreak 50 times for each character; >>and we're not even importing our entire source tree on every invocation. > > > But if you're using CGI, you're importing your source on every > invocation. Well, no. Only the CGI script needs to be parsed every time; all modules could load off bytecode files. Which suggests that Keir Mierle doesn't use bytecode files; I think he should. Regards, Martin From mal at egenix.com Wed Aug 24 12:27:45 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 24 Aug 2005 12:27:45 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430C41BD.8010602@livinglogic.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> Message-ID: <430C4BA1.5030503@egenix.com> Walter Dörwald wrote: > I wonder if we should switch back to a simple readline() implementation > for those codecs that don't require the current implementation > (basically every charmap codec). That would be my preference as well. The 2.4 .readline() approach is really only needed for codecs that have to deal with encodings that: a) use multi-byte formats, or b) support more line-end formats than just CR, CRLF, LF, or c) are stateful. This can easily be had by using a mix-in class for codecs which do need the buffered .readline() approach.
> AFAIK source files are opened in > universal newline mode, so at least we'd get proper treatment of "\n", > "\r" and "\r\n" line ends, but we'd lose u"\x1c", u"\x1d", u"\x1e", > u"\x85", u"\u2028" and u"\u2029" (which are line terminators according > to unicode.splitlines()). While the Unicode standard defines these characters as line end code points, I think their definition does not necessarily apply to data that is converted from a certain encoding to Unicode, so that's not a big loss. E.g. in ASCII or Latin-1, FILE, GROUP and RECORD SEPARATOR and NEXT LINE characters (0x1c, 0x1d, 0x1e, 0x85) are not interpreted as line end characters. Furthermore, we had no reports of anyone complaining in Python 1.6, 2.0 - 2.3 that line endings were not detected properly. All these Python versions relied on the stream's .readline() method to get the next line. The only bug reports we had were for UTF-16 which falls into the above category a) and did not support .readline() until Python 2.4. A note on the performance of _PyUnicode_IsLinebreak(): in Python 2.0 Fredrik changed this to use the two step lookup (reducing the size of the lookup tables considerably). I think it's worthwhile reconsidering this approach for character type queries that do not involve a huge number of code points.
In Python 1.6 the function looked like this (and was inlined by the compiler using its own fast lookup table):

int _PyUnicode_IsLinebreak(register const Py_UNICODE ch)
{
    switch (ch) {
    case 0x000A: /* LINE FEED */
    case 0x000D: /* CARRIAGE RETURN */
    case 0x001C: /* FILE SEPARATOR */
    case 0x001D: /* GROUP SEPARATOR */
    case 0x001E: /* RECORD SEPARATOR */
    case 0x0085: /* NEXT LINE */
    case 0x2028: /* LINE SEPARATOR */
    case 0x2029: /* PARAGRAPH SEPARATOR */
        return 1;
    default:
        return 0;
    }
}

another candidate to convert back is:

int _PyUnicode_IsWhitespace(register const Py_UNICODE ch)
{
    switch (ch) {
    case 0x0009: /* HORIZONTAL TABULATION */
    case 0x000A: /* LINE FEED */
    case 0x000B: /* VERTICAL TABULATION */
    case 0x000C: /* FORM FEED */
    case 0x000D: /* CARRIAGE RETURN */
    case 0x001C: /* FILE SEPARATOR */
    case 0x001D: /* GROUP SEPARATOR */
    case 0x001E: /* RECORD SEPARATOR */
    case 0x001F: /* UNIT SEPARATOR */
    case 0x0020: /* SPACE */
    case 0x0085: /* NEXT LINE */
    case 0x00A0: /* NO-BREAK SPACE */
    case 0x1680: /* OGHAM SPACE MARK */
    case 0x2000: /* EN QUAD */
    case 0x2001: /* EM QUAD */
    case 0x2002: /* EN SPACE */
    case 0x2003: /* EM SPACE */
    case 0x2004: /* THREE-PER-EM SPACE */
    case 0x2005: /* FOUR-PER-EM SPACE */
    case 0x2006: /* SIX-PER-EM SPACE */
    case 0x2007: /* FIGURE SPACE */
    case 0x2008: /* PUNCTUATION SPACE */
    case 0x2009: /* THIN SPACE */
    case 0x200A: /* HAIR SPACE */
    case 0x200B: /* ZERO WIDTH SPACE */
    case 0x2028: /* LINE SEPARATOR */
    case 0x2029: /* PARAGRAPH SEPARATOR */
    case 0x202F: /* NARROW NO-BREAK SPACE */
    case 0x3000: /* IDEOGRAPHIC SPACE */
        return 1;
    default:
        return 0;
    }
}

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 23 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Wed Aug 24 12:56:58 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 24 Aug 2005 12:56:58 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430C4BA1.5030503@egenix.com> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C4BA1.5030503@egenix.com> Message-ID: <430C527A.8090302@v.loewis.de> M.-A. Lemburg wrote: > I think it's worthwhile reconsidering this approach for > character type queries that do not involve a huge number > of code points. I would advise against that. I measured both versions (your version called PyUnicode_IsLinebreak2) with the following code:

volatile int result;
void unibench()
{
#define REPS 10000000000LL
  long long i;
  clock_t s1,s2,s3,s4,s5;
  s1 = clock();
  for(i=0;i<REPS;i++)
    result = _PyUnicode_IsLinebreak('(');
  s2 = clock();
  for(i=0;i<REPS;i++)
    result = PyUnicode_IsLinebreak2('(');
  s3 = clock();
  for(i=0;i<REPS;i++)
    result = _PyUnicode_IsLinebreak('\n');
  s4 = clock();
  for(i=0;i<REPS;i++)
    result = PyUnicode_IsLinebreak2('\n');
  s5 = clock();
  printf("f1, (: %d\nf2, (: %d\nf1, CR: %d\n, f2, CR: %d\n",
         (int)(s2-s1),(int)(s3-s2),(int)(s4-s3),(int)(s5-s4));
}

and got those numbers

f1, (: 13210000
f2, (: 13300000
f1, CR: 13220000
, f2, CR: 13250000

What can be seen is that the performance of the two versions is nearly identical, with the code currently used being slightly better. What can also be seen is that, on my machine, 1e10 calls to IsLinebreak take 13.2 seconds. So 51 Mio calls take about 70ms. The reported performance problem is more likely in the allocation of all these splitlines results, and the copying of the same strings over and over again.

References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> Message-ID: <430C5E6E.2040405@livinglogic.de> Martin v. Löwis wrote: > Walter Dörwald wrote: > >>This is caused by the changes to the codecs in 2.4. Basically the codecs >>no longer rely on C's readline() to do line splitting (which can't work >>for UTF-16), but do it themselves (via unicode.splitlines()). > > That explains why you get any calls to IsLineBreak; it doesn't explain > why you get so many of them. > > I investigated this a bit, and one issue seems to be that > StreamReader.readline performs splitlines on the entire input, only to > fetch the first line. It then joins the rest for later processing. > In addition, it also performs splitlines on a single line, just to > strip any trailing line breaks. This is because unicode.splitlines() is the only API available to Python that knows about unicode line feeds.
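What "knows about unicode line feeds" means in practice can be checked with a short sketch. It uses str.splitlines() from a current Python 3, the successor of the unicode.splitlines() method discussed in this thread; the sample strings are invented for the illustration.

```python
# Characters named in this thread as Unicode-only line terminators.
UNICODE_ONLY_BREAKS = ["\x1c", "\x1d", "\x1e", "\x85", "\u2028", "\u2029"]

for ch in UNICODE_ONLY_BREAKS:
    text = "a" + ch + "b"
    # str.splitlines() treats the character as a line boundary...
    str_parts = text.splitlines()
    # ...while a split on "\n" alone does not see it.
    newline_parts = text.split("\n")
    print(repr(ch), str_parts, newline_parts)

# Passing True keeps the terminators, which is what a readline() needs.
kept = "one\r\ntwo\u2028three".splitlines(True)
print(kept)
```

A reader built on a plain newline search would return each of the first six sample strings as a single line, which is the loss being weighed against the speed of the simpler implementation.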
> The net effect is that, for a file with N lines, IsLineBreak is invoked > up to N*N/2 times per character (at least for the last character). > > So I think it would be best if Unicode characters exposed a .islinebreak > method (or, failing that, codecs just knew what the line break > characters are in Unicode 3.2), and then codecs would split off > the first line of input itself. I think a maxsplit argument (just as for unicode.split()) would help too. > [...] Bye, Walter Dörwald From mal at egenix.com Wed Aug 24 14:24:42 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 24 Aug 2005 14:24:42 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430C527A.8090302@v.loewis.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C4BA1.5030503@egenix.com> <430C527A.8090302@v.loewis.de> Message-ID: <430C670A.3090408@egenix.com> Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>I think it's worthwhile reconsidering this approach for >>character type queries that do not involve a huge number >>of code points. > > > I would advise against that. I measure both versions > (your version called PyUnicode_IsLinebreak2) with the > following code

> volatile int result;
> void unibench()
> {
> #define REPS 10000000000LL
>   long long i;
>   clock_t s1,s2,s3,s4,s5;
>   s1 = clock();
>   for(i=0;i<REPS;i++)
>     result = _PyUnicode_IsLinebreak('(');
>   s2 = clock();
>   for(i=0;i<REPS;i++)
>     result = PyUnicode_IsLinebreak2('(');
>   s3 = clock();
>   for(i=0;i<REPS;i++)
>     result = _PyUnicode_IsLinebreak('\n');
>   s4 = clock();
>   for(i=0;i<REPS;i++)
>     result = PyUnicode_IsLinebreak2('\n');
>   s5 = clock();
>   printf("f1, (: %d\nf2, (: %d\nf1, CR: %d\n, f2, CR: %d\n",
>          (int)(s2-s1),(int)(s3-s2),(int)(s4-s3),(int)(s5-s4));
> }
>
> and got those numbers
>
> f1, (: 13210000
> f2, (: 13300000
> f1, CR: 13220000
> , f2, CR: 13250000
>
> What can be seen is that performance the two versions is nearly > identical, with the code currently used being slightly better.
> What can also be seen is that, on my machine, 1e10 calls to > IsLinebreak take 13.2 seconds. So 51 Mio calls take about 70ms. Your test is somewhat biased: the current solution works using type records, so it has to swap in a new record for each character you test. In your benchmark, the same character is tested over and over again and the type record is likely already stored in the CPU cache. The .splitlines() routine itself calls the above function for each and every character in the string, so quite a few of these type records have to be looked up. Here's a version that uses os.py as basis:

#include <stdio.h>
#include <time.h>
#include "Python.h"

int _PyUnicode_IsLinebreak16(register const Py_UNICODE ch)
{
    switch (ch) {
    case 0x000A: /* LINE FEED */
    case 0x000D: /* CARRIAGE RETURN */
    case 0x001C: /* FILE SEPARATOR */
    case 0x001D: /* GROUP SEPARATOR */
    case 0x001E: /* RECORD SEPARATOR */
    case 0x0085: /* NEXT LINE */
    case 0x2028: /* LINE SEPARATOR */
    case 0x2029: /* PARAGRAPH SEPARATOR */
        return 1;
    default:
        return 0;
    }
}

#define REPS 10000
#define BUFFERSIZE 30000

int main(void)
{
    long i, j;
    clock_t s1,s2,s3;
    char *buffer;
    FILE *datafile;
    long filelen;
    int result;

    datafile = fopen("os.py", "rb");
    if (datafile == NULL) {
        printf("could not find os.py\n");
        return -1;
    }
    buffer = (char *)malloc(BUFFERSIZE);
    filelen = fread(buffer, 1, BUFFERSIZE, datafile);
    printf("filelen=%li bytes\n", filelen);

    s1 = clock();
    /* Python 2.4 */
    for(i = 0; i < REPS; i++)
        for (j = 0; j < filelen; j++)
            result = _PyUnicode_IsLinebreak((Py_UNICODE)buffer[j]);
    s2 = clock();
    /* Python 1.6 */
    for(i = 0; i < REPS; i++)
        for (j = 0; j < filelen; j++)
            result = _PyUnicode_IsLinebreak16((Py_UNICODE)buffer[j]);
    s3 = clock();
    printf("2.4: %d\n"
           "1.6: %d\n",
           (int)(s2-s1),
           (int)(s3-s2));
    return 0;
}

Output, compiled with -O3:

filelen=23147 bytes
2.4: 2570000
1.6: 1230000

That's a factor 2.
> The reported performance problem is more likely in the allocation > of all these splitlines results, and the copying of the same > strings over and over again. True. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 23 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From martin at v.loewis.de Wed Aug 24 14:56:30 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 24 Aug 2005 14:56:30 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430C5E6E.2040405@livinglogic.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> Message-ID: <430C6E7E.7070106@v.loewis.de> Walter D?rwald wrote: > I think a maxsplit argument (just as for unicode.split()) would help too. Correct - that would allow to get rid of the quadratic part. We should also strive for avoiding the second copy of the line, if the user requested keepends. I wonder whether it would be worthwhile to cache the .splitlines result. An application that has just invoked .readline() will likely invoke .readline() again. If there is more than one line left, we could return the first line right away (potentially trimming the line ending if necessary). Only when a single line is left, we would attempt to read more data. In a plain .read(), we would first join the lines back. 
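The caching idea described above might look roughly like the following Python 3 sketch. It is only an illustration under stated assumptions: CachedLineReader and chunk_size are hypothetical names, the wrapped stream is assumed to be a text stream with a read() method, and this is not the code of any actual patch to codecs.py. Note that str.splitlines() in Python 3 recognizes a few more line boundaries than plain "\n".

```python
from collections import deque


class CachedLineReader:
    """Keep the result of splitlines() between readline() calls (sketch)."""

    def __init__(self, stream, chunk_size=8192):
        self._stream = stream
        self._chunk_size = chunk_size
        self._lines = deque()  # cached lines, each with its line ending

    def readline(self):
        # With more than one cached line the first is known to be complete.
        # A single cached line may be unfinished, so read more data first.
        while len(self._lines) <= 1:
            data = self._stream.read(self._chunk_size)
            if not data:
                break  # end of input: whatever is cached is final
            pending = self._lines.pop() if self._lines else ""
            self._lines.extend((pending + data).splitlines(keepends=True))
        return self._lines.popleft() if self._lines else ""

    def read(self):
        # A plain read() joins the cached lines back together first.
        data = "".join(self._lines) + self._stream.read()
        self._lines.clear()
        return data
```

Each character is split once per chunk it arrives in, and popleft() on a deque avoids shifting the remaining entries.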
Regards, Martin From mcherm at mcherm.com Wed Aug 24 15:08:56 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 24 Aug 2005 06:08:56 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 Message-ID: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com> Raymond Hettinger writes: > The latest version of PEP 348 still proposes that a bare except clause > will default to Exception instead of BaseException. Initially, I had > thought that might be a good idea but now think it is doomed and needs > to be removed from the PEP. Guido writes: > If we syntactically enforce that the bare except, if present, must be > last, would that remove your objection? I agree that a bare except in > the middle is an anomaly, but that doesn't mean we can't keep bare > except: as a shorthand for except Exception:. Explicit is better than Implicit. I think that in newly written code "except Exception:" is better (more explicit and easier to understand) than "except:" Legacy code that uses "except:" can remain unchanged *IF* the meaning of "except:" is unchanged... but I think we all agree that this is unwise because the existing meaning is a tempting trap for the unwary. So I don't see any advantage to keeping bare "except:" in the long run. What we do to ease the transition is a different question, but one more easily resolved. -- Michael Chermside From walter at livinglogic.de Wed Aug 24 16:08:12 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed, 24 Aug 2005 16:08:12 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430C6E7E.7070106@v.loewis.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de> Message-ID: <430C7F4C.9010703@livinglogic.de> Martin v. 
L?wis wrote: > Walter D?rwald wrote: > >>I think a maxsplit argument (just as for unicode.split()) would help too. > > Correct - that would allow to get rid of the quadratic part. OK, such a patch should be rather simple. I'll give it a try. > We should also strive for avoiding the second copy of the line, > if the user requested keepends. Your suggested unicode method islinebreak() would help with that. Then we could add the following to the string module: unicodelinebreaks = u"".join(unichr(c) for c in xrange(0, sys.maxunicode) if unichr(c).islinebreak()) Then if line and not keepends: line = line.splitlines(False)[0] could be if line and not keepends: line = line.rstrip(string.unicodelinebreaks) > I wonder whether it would be worthwhile to cache the .splitlines result. > An application that has just invoked .readline() will likely invoke > .readline() again. If there is more than one line left, we could return > the first line right away (potentially trimming the line ending if > necessary). Only when a single line is left, we would attempt to > read more data. In a plain .read(), we would first join the lines > back. OK, this would mean we'd have to distinguish between a direct call to read() and one done by readline() (which we do anyway through the firstline argument). Bye, Walter D?rwald From martin at v.loewis.de Wed Aug 24 16:33:50 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 24 Aug 2005 16:33:50 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430C7F4C.9010703@livinglogic.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de> Message-ID: <430C854E.1080200@v.loewis.de> Walter D?rwald wrote: > Martin v. 
Löwis wrote:
>
>> Walter Dörwald wrote:
>>
>>> I think a maxsplit argument (just as for unicode.split()) would help
>>> too.
>>
>>
>> Correct - that would allow to get rid of the quadratic part.
>
>
> OK, such a patch should be rather simple. I'll give it a try.

Actually, on second thought - it would not remove the quadratic
aspect. You would still copy the rest string completely on each
split. So on the first split, you copy N lines (one result line,
and N-1 lines into the rest string), on the second split, N-2
lines, and so on, totalling N*N/2 line copies again. The only
thing you save is the join (as the rest is already joined), and
the IsLineBreak calls (which are necessary only for the first
line).

Please see python.org/sf/1268314; it solves the problem by
keeping the splitlines result. It only invokes IsLineBreak
once per character, and also copies each character only once,
and allocates each line only once, totalling in O(N) for
these operations. It still does contain a quadratic operation:
the lines are stored in a list, and the result line is
removed from the list with del lines[0]. This copies N-1
pointers, resulting in N*N/2 pointer copies. That should still
be much faster than the current code.

> unicodelinebreaks = u"".join(unichr(c) for c in xrange(0,
> sys.maxunicode) if unichr(c).islinebreak())

That is very inefficient. I would rather add a static list
to the string module, and have a test that says

assert str.unicodelinebreaks == u"".join(ch for ch in (unichr(c) for c
in xrange(0, sys.maxunicode)) if unicodedata.bidirectional(ch)=='B' or
unicodedata.category(ch)=='Zl')

unicodelinebreaks could then be defined as

# u"\r\n\x1c\x1d\x1e\x85\u2028\u2029
'\n\r\x1c\x1d\x1e\xc2\x85\xe2\x80\xa8\xe2\x80\xa9'.decode("utf-8")

> OK, this would mean we'd have to distinguish between a direct call to
> read() and one done by readline() (which we do anyway through the
> firstline argument).

See my patch. If we have cached lines, we don't need to call .read
at all.
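The copying argument made earlier in this message can be checked with a small Python 3 sketch. It only counts how many characters end up in newly created strings under each strategy; it measures nothing about the real codecs module. The function names are hypothetical, and "\n" is assumed to be the only line terminator in the input.

```python
def chars_copied_by_peeling(text):
    """Split off one line at a time, creating a new 'rest' string each time."""
    copied = 0
    rest = text
    while rest:
        line, sep, rest = rest.partition("\n")
        # The line (with its terminator) and the whole remainder are new strings.
        copied += len(line) + len(sep) + len(rest)
    return copied


def chars_copied_by_single_split(text):
    """Split everything once and keep the resulting list of lines."""
    return sum(len(line) for line in text.splitlines(keepends=True))
```

For N lines of equal length L, the first function counts L*N*(N+1)/2 characters and the second counts L*N, which is the quadratic versus linear behaviour described above.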
Regards, Martin From foom at fuhm.net Wed Aug 24 16:34:53 2005 From: foom at fuhm.net (James Y Knight) Date: Wed, 24 Aug 2005 10:34:53 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <001d01c5a855$447a9d20$ab12c797@oemcomputer> References: <001d01c5a855$447a9d20$ab12c797@oemcomputer> Message-ID: On Aug 23, 2005, at 10:41 PM, Raymond Hettinger wrote: > [Guido van Rossum] > >> If we syntactically enforce that the bare except, if present, must be >> last, would that remove your objection? I agree that a bare except in >> the middle is an anomaly, but that doesn't mean we can't keep bare >> except: as a shorthand for except Exception:. >> > > Hmm. Prohibiting mid-suite bare excepts is progress and eliminates > the > case that causes immediate indigestion. > > As for the rest, I'm not as sure and it would be helpful to get > thoughts > from others on this one. My sense is that blocking the clause from > appearing in the middle is treating the symptom and not the disease. > I would rather see "except:" be deprecated eventually, and force the user to say either except Exception, except BaseException, or even better, except ActualExceptionIWantToCatch. James From barry at python.org Wed Aug 24 17:03:52 2005 From: barry at python.org (Barry Warsaw) Date: Wed, 24 Aug 2005 11:03:52 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: References: <001d01c5a855$447a9d20$ab12c797@oemcomputer> Message-ID: <1124895832.19291.10.camel@geddy.wooz.org> On Wed, 2005-08-24 at 10:34, James Y Knight wrote: > I would rather see "except:" be deprecated eventually, and force the > user to say either except Exception, except BaseException, or even > better, except ActualExceptionIWantToCatch. I agree about bare except, but there is a very valid use case for an except clause that catches every possible exception. We need to make sure we don't overlook this use case. 
As an example, say I'm building a
transaction-aware system, I'm going to want to write code like this:

txn = new_transaction()
try:
    txn.begin()
    rtn = do_work()
except AllPossibleExceptions:
    txn.abort()
    raise
else:
    txn.commit()
    return rtn

I'm fine with not spelling that except statement as "except:" but I
don't want there to be any exceptions that can sneak past that middle
suite, including non-errors like SystemExit or KeyboardInterrupt.

I can't remember ever writing a bare except with a suite that didn't
contain (end in?) a bare raise. Maybe we can allow bare except, but
constrain things so that the only way out of its suite is via a bare
raise.

-Barry

From gvanrossum at gmail.com Wed Aug 24 17:10:37 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 24 Aug 2005 08:10:37 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
Message-ID:

On 8/24/05, Michael Chermside wrote:
> Explicit is better than Implicit. I think that in newly written code
> "except Exception:" is better (more explicit and easier to understand)
> than "except:" Legacy code that uses "except:" can remain unchanged *IF*
> the meaning of "except:" is unchanged... but I think we all agree that
> this is unwise because the existing meaning is a tempting trap for the
> unwary. So I don't see any advantage to keeping bare "except:" in the
> long run. What we do to ease the transition is a different question,
> but one more easily resolved.

OK, I'm convinced. Let's drop bare except for Python 3.0, and
deprecate them until then, without changing the meaning.
The deprecation message (to be generated by the compiler!) should steer people in the direction of specifying one particular exception (e.g. KeyError etc.) rather than Exception. -- --Guido van Rossum (home page: http://www.python.org/~guido/ From foom at fuhm.net Wed Aug 24 17:23:52 2005 From: foom at fuhm.net (James Y Knight) Date: Wed, 24 Aug 2005 11:23:52 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com> Message-ID: On Aug 24, 2005, at 11:10 AM, Guido van Rossum wrote: > On 8/24/05, Michael Chermside wrote: > >> Explicit is better than Implicit. I think that in newly written code >> "except Exception:" is better (more explicit and easier to >> understand) >> than "except:" Legacy code that uses "except:" can remain >> unchanged *IF* >> the meaning of "except:" is unchanged... but I think we all agree >> that >> this is unwise because the existing meaning is a tempting trap for >> the >> unwary. So I don't see any advantage to keeping bare "except:" in the >> long run. What we do to ease the transition is a different question, >> but one more easily resolved. >> > > OK, I'm convinced. Let's drop bare except for Python 3.0, and > deprecate them until then, without changing the meaning. > > The deprecation message (to be generated by the compiler!) should > steer people in the direction of specifying one particular exception > (e.g. KeyError etc.) rather than Exception. I agree but there's the minor nit of non-Exception exceptions. I think it must be the case that raising an object which does not derive from an exception class must be deprecated as well in order for "except:" to be deprecated. Otherwise, there is nothing you can change "except:" to in order not to get a deprecation warning and still have your code be correct in the face of documented features of python. 
James From raymond.hettinger at verizon.net Wed Aug 24 17:27:19 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 24 Aug 2005 11:27:19 -0400 Subject: [Python-Dev] FW: Bare except clauses in PEP 348 Message-ID: <003901c5a8c0$51d79fc0$b729cb97@oemcomputer> Hey guys, don't give up your bare except clauses so easily. They are useful. And, if given the natural meaning of "catch everything" and put in a natural position at the end of a suite, their meaning is plain and obvious. Remember beauty counts. I don't think there would be similar temptation to eliminate a dangling else clause and replace it with "else Everything". Nor would a final default case in a switch statement benefit from being written as "default Everything". The thought is that it is okay to have useful defaults. My whole issue was that the PEP was choosing the wrong default. If we leave it alone, all is well. An empty except will continue to mean "catch everything", it will always appear at the end, its meaning will be obvious, and existing working code won't break :-) On the occasions where you really intended to catch everything, do you really want to go on an editing binge just to uglify the code to something like: try: ... except SomeException: ... except BaseException: ... It is more beautiful and clear as: try: ... except SomeException: ... except: ... To me, the latter is more attractive and is more obviously a catchall, just like an else-clause or a default-statement. It is a strong visual cue that at least one of the except clauses will always be triggered. In contrast, the first example makes you think twice about whether the final case really does get everything (sometimes implicit IS better than explicit). 
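The catch-everything case under discussion can be sketched as runnable code. The sketch assumes present-day Python 3, where KeyboardInterrupt and SystemExit derive from BaseException but not from Exception; at the time of this thread that split was still a proposal. FakeTransaction, run_in_transaction and run_catching_exception_only are hypothetical names used only for illustration.

```python
class FakeTransaction:
    """Hypothetical stand-in that only records what was done to it."""

    def __init__(self):
        self.log = []

    def begin(self):
        self.log.append("begin")

    def abort(self):
        self.log.append("abort")

    def commit(self):
        self.log.append("commit")


def run_in_transaction(txn, work):
    """Abort on anything that can be raised, then re-raise it unchanged."""
    try:
        txn.begin()
        result = work()
    except BaseException:
        txn.abort()
        raise
    else:
        txn.commit()
        return result


def run_catching_exception_only(txn, work):
    """Same shape, but KeyboardInterrupt and SystemExit bypass the handler."""
    try:
        txn.begin()
        result = work()
    except Exception:
        txn.abort()
        raise
    else:
        txn.commit()
        return result
```

With a work function that raises KeyboardInterrupt, the first variant aborts the transaction before the interrupt propagates, while the second leaves it neither aborted nor committed.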
Raymond From shane at hathawaymix.org Wed Aug 24 18:17:20 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Wed, 24 Aug 2005 10:17:20 -0600 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <1124895832.19291.10.camel@geddy.wooz.org> References: <001d01c5a855$447a9d20$ab12c797@oemcomputer> <1124895832.19291.10.camel@geddy.wooz.org> Message-ID: <430C9D90.6090102@hathawaymix.org> Barry Warsaw wrote: > I agree about bare except, but there is a very valid use case for an > except clause that catches every possible exception. We need to make > sure we don't overlook this use case. As an example, say I'm building a > transaction-aware system, I'm going to want to write code like this: > > txn = new_transaction() > try: > txn.begin() > rtn = do_work() > except AllPossibleExceptions: > txn.abort() > raise > else: > txn.commit() > return rtn > > I'm fine with not spelling that except statement as "except:" but I > don't want there to be any exceptions that can sneak past that middle > suite, including non-errors like SystemExit or KeyboardInterrupt. > > I can't remember ever writing a bare except with a suite that didn't > contain (end in?) a bare raise. Maybe we can allow bare except, but > constrain things so that the only way out of its suite is via a bare > raise. I also use this idiom quite frequently, but I wonder if a finally clause would be a better way to write it: txn = new_transaction() try: txn.begin() rtn = do_work() finally: if exception_occurred(): txn.abort() else: txn.commit() return rtn Since this doesn't use try/except/else, it's not affected by changes to the meaning of except clauses. However, it forces more indentation and expects a new builtin, and the name "exception_occurred" is probably too long for a builtin. Now for a weird idea. txn = new_transaction() try: txn.begin() rtn = do_work() finally except: txn.abort() finally else: txn.commit() return rtn This is what I would call qualified finally clauses. 
The interpreter chooses exactly one of the finally clauses. If a "finally except" clause is chosen, the exception is re-raised before execution continues. Most code that currently uses bare raise inside bare except could just prefix the "except" and "else" keywords with "finally". Shane From niko at alum.mit.edu Wed Aug 24 18:29:36 2005 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 24 Aug 2005 18:29:36 +0200 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <430C9D90.6090102@hathawaymix.org> References: <001d01c5a855$447a9d20$ab12c797@oemcomputer> <1124895832.19291.10.camel@geddy.wooz.org> <430C9D90.6090102@hathawaymix.org> Message-ID: <934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu> > > txn = new_transaction() > try: > txn.begin() > rtn = do_work() > finally: > if exception_occurred(): > txn.abort() > else: > txn.commit() > return rtn > Couldn't you just do: txn = new_transaction () try: complete = 0 txn.begin () rtn = do_work () complete = 1 finally: if not complete: txn.abort () else: txn.commit () and then not need new builtins or anything fancy? Niko From mcherm at mcherm.com Wed Aug 24 18:33:00 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 24 Aug 2005 09:33:00 -0700 Subject: [Python-Dev] FW: Bare except clauses in PEP 348 Message-ID: <20050824093300.uuj0o52cj9s0wksk@login.werra.lunarpages.com> Raymond writes: > Hey guys, don't give up your bare except clauses so easily. [...] Raymond: I agree that when comparing: // Code snippet A try: ... except SomeException: ... except BaseException: ... with // Code snippet B try: ... except SomeException: ... except: ... that B is nicer than A. Slightly nicer. It's a minor esthetic point. But consider these: // Code snippet C try: ... except Exception: ... // Code snippet D try: ... except: ... Esthetically I'd say that D is nicer than A for the same reasons. It's a minor esthetic point. But you see, this case is different. 
You and I would
likely never bother to compare C and D because they do different things!
(D is equivalent to catching BaseException, not Exception). But we know
that people who are not so careful or not so knowledgeable WILL make this
mistake... they make it all the time today!

Since situation C (catching an exception) is hundreds of times more
common than situation A (needing default processing for exceptions that
don't get caught, but doing it with try-except instead of try-finally
because the nothing-was-thrown case is different), I would FAR rather
protect beginners from erroneously confusing C and D than I would provide
a marginally more elegant syntax for the experts using A or B.

And that elegance is arguable... there's something to be said for
simplicity, and having only one kind of "except" clause for try statements
is clearly simpler than having both "except :" and also bare "except:".

-- Michael Chermside

From martin at v.loewis.de Wed Aug 24 18:38:17 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 Aug 2005 18:38:17 +0200
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu>
References: <001d01c5a855$447a9d20$ab12c797@oemcomputer> <1124895832.19291.10.camel@geddy.wooz.org> <430C9D90.6090102@hathawaymix.org> <934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu>
Message-ID: <430CA279.3080909@v.loewis.de>

Niko Matsakis wrote:
> Couldn't you just do:
>
> txn = new_transaction ()
> try:
>     complete = 0
>     txn.begin ()
>     rtn = do_work ()
>     complete = 1
> finally:
>     if not complete: txn.abort ()
>     else: txn.commit ()
>
> and then not need new builtins or anything fancy?

I personally dislike recording the execution path in local variables.
This is like setting a flag in a loop before the break, and testing
the flag afterwards. You can do this, but the else: clause of the
loop is just more readable.
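The loop comparison made above can be written out as a short Python sketch. Both functions are hypothetical and do the same job; the first records the execution path in a flag, the second lets the loop's else: clause express it.

```python
def find_with_flag(items, target):
    """Return the index of target, or -1, tracking the outcome in a flag."""
    found = False
    for index, item in enumerate(items):
        if item == target:
            found = True
            break
    if not found:
        index = -1
    return index


def find_with_else(items, target):
    """Return the index of target, or -1, using the loop's else: clause."""
    for index, item in enumerate(items):
        if item == target:
            break
    else:
        # Runs only when the loop finished without executing break.
        index = -1
    return index
```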
This specific fragment has also the bug that a KeyboardInterrupt before the assignment to complete will cause a NameError/UnboundLocalError; this can easily be fixed by moving the assignment before the try block. Regards, Martin From shane at hathawaymix.org Wed Aug 24 18:42:21 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Wed, 24 Aug 2005 10:42:21 -0600 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu> References: <001d01c5a855$447a9d20$ab12c797@oemcomputer> <1124895832.19291.10.camel@geddy.wooz.org> <430C9D90.6090102@hathawaymix.org> <934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu> Message-ID: <430CA36D.8000004@hathawaymix.org> Niko Matsakis wrote: >> >> txn = new_transaction() >> try: >> txn.begin() >> rtn = do_work() >> finally: >> if exception_occurred(): >> txn.abort() >> else: >> txn.commit() >> return rtn >> > > Couldn't you just do: > > txn = new_transaction () > try: > complete = 0 > txn.begin () > rtn = do_work () > complete = 1 > finally: > if not complete: txn.abort () > else: txn.commit () > > and then not need new builtins or anything fancy? That would work, though it's less readable. If I were looking over code like that written by someone else, I'd have verify that the "complete" variable is handled correctly in all cases. (As Martin noted, your code already has a bug.) The nice try/except/else idiom we have today, with a bare except and bare raise, is much easier to verify. Shane From walter at livinglogic.de Wed Aug 24 18:59:11 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed, 24 Aug 2005 18:59:11 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) 
In-Reply-To: <430C854E.1080200@v.loewis.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de> <430C854E.1080200@v.loewis.de> Message-ID: <430CA75F.7090900@livinglogic.de> Martin v. L?wis wrote: > Walter D?rwald wrote: > >>Martin v. L?wis wrote: >> >>>Walter D?rwald wrote: >>> >>>>I think a maxsplit argument (just as for unicode.split()) would help >>>>too. >>> >>>Correct - that would allow to get rid of the quadratic part. >> >>OK, such a patch should be rather simple. I'll give it a try. > > Actually, on a second thought - it would not remove the quadratic > aspect. At least it would remove the quadratic number of calls to _PyUnicodeUCS2_IsLinebreak(). For each character it would be called only once. > You would still copy the rest string completely on each > split. So on the first split, you copy N lines (one result line, > and N-1 lines into the rest string), on the second split, N-2 > lines, and so on, totalling N*N/2 line copies again. OK, that's true. We could prevent string copying if we kept the unsplit string and the position of the current line terminator, but this would require a "first position after a line terminator" method. > The only > thing you save is the join (as the rest is already joined), and > the IsLineBreak calls (which are necessary only for the first > line). > > Please see python.org/sf/1268314; The last part of the patch seems to be more related to bug #1235646. 
With the patch test_pep263 and test_codecs fail (and test_parser, but this might be unrelated): python Lib/test/test_pep263.py gives the following output: File "Lib/test/test_pep263.py", line 22 SyntaxError: list index out of range test_codecs.py has the following two complaints: File "/var/home/walter/Achtung/Python-linecache/dist/src/Lib/codecs.py", line 366, in readline self.charbuffer = lines[1] + self.charbuffer IndexError: list index out of range and File "/var/home/walter/Achtung/Python-linecache/dist/src/Lib/codecs.py", line 336, in readline line = result.splitlines(False)[0] NameError: global name 'result' is not defined > it solves the problem by > keeping the splitlines result. It only invokes IsLineBreak > once per character, and also copies each character only once, > and allocates each line only once, totalling in O(N) for > these operations. It still does contain a quadratic operation: > the lines are stored in a list, and the result list is > removed from the list with del lines[0]. This copies N-1 > pointers, result in N*N/2 pointer copies. That should still > be much faster than the current code. Using collections.deque() should get rid of this problem. >>unicodelinebreaks = u"".join(unichr(c) for c in xrange(0, >>sys.maxunicode) if unichr(c).islinebreak()) > > That is very inefficient. I would rather add a static list > to the string module, and have a test that says > > assert str.unicodelinebreaks == u"".join(ch for ch in (unichr(c) for c > in xrange(0, sys.maxunicode)) if unicodedata.bidirectional(ch)=='B' or > unicodedata.category(ch)=='Zl') You mean, in the test suite? > unicodelinebreaks could then be defined as > > # u"\r\n\x1c\x1d\x1e\x85\u2028\u2029 > '\n\r\x1c\x1d\x1e\xc2\x85\xe2\x80\xa8\xe2\x80\xa9'.decode("utf-8") That might be better, as this definition won't change very often. BTW, why the decode() call? For a Python without unicode? 
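The static definition quoted above, together with the proposed consistency check, can be sketched in present-day Python 3 (an assumption; the thread uses Python 2 spellings such as unichr and xrange). UNICODE_LINEBREAKS, linebreaks_from_unicodedata and strip_line_end are hypothetical names. One caveat: str.splitlines() in Python 3 also treats "\x0b" and "\x0c" as line boundaries, and those two are deliberately not part of this string.

```python
import sys
import unicodedata

# Static string with the eight characters named in the thread.
UNICODE_LINEBREAKS = "\n\r\x1c\x1d\x1e\x85\u2028\u2029"


def linebreaks_from_unicodedata():
    """Rebuild the string from the Unicode database, as the proposed test does."""
    return "".join(
        ch
        for ch in map(chr, range(sys.maxunicode + 1))
        if unicodedata.bidirectional(ch) == "B"
        or unicodedata.category(ch) == "Zl"
    )


def strip_line_end(line):
    """Remove trailing line terminators without calling splitlines()."""
    return line.rstrip(UNICODE_LINEBREAKS)
```

The full scan over all code points belongs in a test suite rather than at import time, which is the reason given above for preferring a static definition.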
>>OK, this would mean we'd have to distinguish between a direct call to >>read() and one done by readline() (which we do anyway through the >>firstline argument). > > See my patch. If we have cached lines, we don't need to call .read > at all. I wonder what happens, if calls to read() and readline() are mixed (e.g. if I'm reading Fortran source or anything with a fixed line header): read() would be used to read the first n character (which joins the line buffer) and readline() reads the rest (which would split it again) etc. (Of course this could be done via a single readline() call). But, I think a maxsplit argument for splitlines() woould make sense independent of this problem. Bye, Walter D?rwald From rrr at ronadam.com Wed Aug 24 19:03:13 2005 From: rrr at ronadam.com (Ron Adam) Date: Wed, 24 Aug 2005 13:03:13 -0400 Subject: [Python-Dev] FW: Bare except clauses in PEP 348 In-Reply-To: <003901c5a8c0$51d79fc0$b729cb97@oemcomputer> References: <003901c5a8c0$51d79fc0$b729cb97@oemcomputer> Message-ID: <430CA851.6060406@ronadam.com> Raymond Hettinger wrote: > Hey guys, don't give up your bare except clauses so easily. Yes, Don't give up. I often write code starting with a bare except, then after it works, stick a raise in it to determine exactly what exception I'm catching. Then use that to rewrite a more explicit except statement. Your comment earlier about treating the symptom is also accurate. This isn't just an issue with bare excepts not being allowed in the middle, it also comes up whenever we catch exceptions out of order in the tree. Ie.. catching an exception closer to the base will block a following except clause that tries to catch an exception on the same branch. So should except clauses be checked for orderliness? 
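The ordering problem raised above can be shown with two built-in exceptions. In this Python 3 sketch, KeyError is a subclass of LookupError, so listing the base class first makes the more specific clause unreachable. The function names are hypothetical.

```python
def base_class_first(exc):
    """The LookupError clause shadows the KeyError clause below it."""
    try:
        raise exc
    except LookupError:
        return "LookupError clause"
    except KeyError:
        # Never reached: every KeyError is already handled above.
        return "KeyError clause"


def specific_class_first(exc):
    """Listing the subclass first keeps both clauses reachable."""
    try:
        raise exc
    except KeyError:
        return "KeyError clause"
    except LookupError:
        return "LookupError clause"
```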
Regards,
Ron

From gvanrossum at gmail.com Wed Aug 24 19:15:47 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 24 Aug 2005 10:15:47 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To:
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
Message-ID:

On 8/24/05, James Y Knight wrote:
> I think it must be the case that raising an object which does not
> derive from an exception class must be deprecated as well in order
> for "except:" to be deprecated. Otherwise, there is nothing you can
> change "except:" to in order not to get a deprecation warning and
> still have your code be correct in the face of documented features of
> python.

I agree; isn't that already in the PEP? This surely has been the
thinking all along.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com Wed Aug 24 19:17:56 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 24 Aug 2005 10:17:56 -0700
Subject: [Python-Dev] FW: Bare except clauses in PEP 348
In-Reply-To: <003901c5a8c0$51d79fc0$b729cb97@oemcomputer>
References: <003901c5a8c0$51d79fc0$b729cb97@oemcomputer>
Message-ID:

On 8/24/05, Raymond Hettinger wrote:
> Hey guys, don't give up your bare except clauses so easily.

They are an attractive nuisance by being so much shorter to type than
the "right thing to do".

Especially if they default to something whose use cases are rather
esoteric (i.e. BaseException).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From abo at minkirri.apana.org.au Wed Aug 24 19:26:34 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Wed, 24 Aug 2005 10:26:34 -0700
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)
In-Reply-To: <430C854E.1080200@v.loewis.de>
References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de> <430C854E.1080200@v.loewis.de>
Message-ID: <1124904393.9380.29.camel@warna.corp.google.com>

On Wed, 2005-08-24 at 07:33, "Martin v. Löwis" wrote:
> Walter Dörwald wrote:
> > Martin v. Löwis wrote:
> >
> >> Walter Dörwald wrote:
[...]
> Actually, on a second thought - it would not remove the quadratic
> aspect. You would still copy the rest string completely on each
> split. So on the first split, you copy N lines (one result line,
> and N-1 lines into the rest string), on the second split, N-2
> lines, and so on, totalling N*N/2 line copies again. The only
> thing you save is the join (as the rest is already joined), and
> the IsLineBreak calls (which are necessary only for the first
> line).
[...]

In the past, I've avoided the string copy overhead inherent in split()
by using buffers...

I've always wondered why Python didn't use buffer type tricks internally
for split-type operations. I haven't looked at Python's string
implementation, but the fact that strings are immutable surely means
that you can safely and efficiently reference an implementation level
"data" object for all strings... ie all strings are "buffers".

The only problem I can see with this is huge "data" objects might hang
around just because some small fragment of it is still referenced by a
string. Surely a simple heuristic or two like "if len(string) <
len(data)/8: copy data; else: reference data" would go a long way
towards avoiding that.

In my limited playing around with manipulating strings and
benchmarking stuff, the biggest overhead is nearly always the copies.
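The buffer approach described above has a rough counterpart in present-day Python 3: slicing a memoryview of a bytes object yields views that reference the original data instead of copying it. The sketch below is only an illustration under that assumption. The function name line_views is hypothetical, it works on bytes rather than on unicode strings, and it treats b"\n" as the only terminator.

```python
def line_views(data: bytes):
    """Split data into per-line memoryview slices without copying the bytes."""
    view = memoryview(data)
    views = []
    start = 0
    while start < len(data):
        end = data.find(b"\n", start)
        # Include the terminator in the line, or run to the end of the data.
        end = len(data) if end == -1 else end + 1
        views.append(view[start:end])
        start = end
    return views
```

Every slice keeps the whole underlying object alive, which is exactly the retention problem mentioned above; calling bytes() on a view makes an independent copy when that matters.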
-- Donovan Baarda From walter at livinglogic.de Wed Aug 24 19:35:11 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed, 24 Aug 2005 19:35:11 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430C4BA1.5030503@egenix.com> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C4BA1.5030503@egenix.com> Message-ID: <430CAFCF.3040109@livinglogic.de> M.-A. Lemburg wrote: > Walter D?rwald wrote: > >>I wonder if we should switch back to a simple readline() implementation >>for those codecs that don't require the current implementation >>(basically every charmap codec). > > That would be my preference as well. The 2.4 .readline() approach > is really only needed for codecs that have to deal with encodings > that: > > a) use multi-byte formats, or > b) support more line-end formats than just CR, CRLF, LF, or > c) are stateful. > > This can easily be had by using a mix-in class for > codecs which do need the buffered .readline() approach. Should this be a mix-in or should we simply have two base classes? Which of those bases/mix-ins should be the default? >>AFAIK source files are opened in >>universal newline mode, so at least we'd get proper treatment of "\n", >>"\r" and "\r\n" line ends, but we'd loose u"\x1c", u"\x1d", u"\x1e", >>u"\x85", u"\u2028" and u"\u2029" (which are line terminators according >>to unicode.splitlines()). > > While the Unicode standard defines these characters as line > end code points, I think their definition does not necessarily > apply to data that is converted from a certain encoding to > Unicode, so that's not a big loss. > > E.g. in ASCII or Latin-1, FILE, GROUP and RECORD > SEPARATOR and NEXT LINE characters (0x1c, 0x1d, 0x1e, 0x85) > are not interpreted as line end characters. > > Furthermore, we had no reports of anyone complaining in > Python 1.6, 2.0 - 2.3 that line endings were not detected > properly. 
All these Python versions relied on the stream's > .readline() method to get the next line. The only bug reports > we had were for UTF-16 which falls into the above > category a) and did not support .readline() until Python 2.4. True. Bye, Walter Dörwald From martin at v.loewis.de Wed Aug 24 19:38:54 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 24 Aug 2005 19:38:54 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430CA75F.7090900@livinglogic.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de> <430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de> Message-ID: <430CB0AE.1040201@v.loewis.de> Walter Dörwald wrote: > At least it would remove the quadratic number of calls to > _PyUnicodeUCS2_IsLinebreak(). For each character it would be called only > once. Correct. However, I very much doubt that this is the cause of the slowdown. > The last part of the patch seems to be more related to bug #1235646. You mean the last chunk (linebuffer=None)? This is just the extension to reset. > With the patch test_pep263 and test_codecs fail (and test_parser, but > this might be unrelated): Oops, I thought I ran the test suite, but apparently with the patch removed. New version uploaded. > Using collections.deque() should get rid of this problem. Alright. There are so many types in Python I've never heard of :-) > You mean, in the test suite? Right. > BTW, why the decode() call? For a Python without unicode? Right. Not sure what people think whether this should still be supported, but I keep supporting it whenever I think of it. > I wonder what happens, if calls to read() and readline() are mixed (e.g.
> if I'm reading Fortran source or anything with a fixed line header): > read() would be used to read the first n characters (which joins the line > buffer) and readline() reads the rest (which would split it again) etc. > (Of course this could be done via a single readline() call). Then performance would drop again - it should still be correct, though. If this becomes a frequent problem, we could satisfy read requests from the split lines as well (i.e. join as many lines as you need). However, I would rather expect that callers of read() typically want the entire file, or want to read in large chunks (with no line orientation at all). > But, I think a maxsplit argument for splitlines() would make sense > independent of this problem. I'm not so sure anymore. It is good for consistency, but I doubt there are actual use cases: how often do you want only the first n lines of some string? Reading the first n lines of a file might be an application, but then, you would rather use .readline() directly. For readline, I don't think there is a clear case for splitting off only the first line (unless you want to return an index instead of the rest string): if the application eventually wants all of the data, we better split it right away into individual strings, instead of dealing with a gradually decreasing trailer. Anyway, I don't think we should go back to C's readline/fgets. This is just too messy wrt. buffering and text vs. binary mode. I wish Python would stop using stdio entirely. Regards, Martin From walter at livinglogic.de Wed Aug 24 20:16:39 2005 From: walter at livinglogic.de (Walter Dörwald) Date: Wed, 24 Aug 2005 20:16:39 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)
In-Reply-To: <430CB0AE.1040201@v.loewis.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de> <430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de> <430CB0AE.1040201@v.loewis.de> Message-ID: <430CB987.5000601@livinglogic.de> Martin v. Löwis wrote: > Walter Dörwald wrote: > >>At least it would remove the quadratic number of calls to >>_PyUnicodeUCS2_IsLinebreak(). For each character it would be called only >>once. > > Correct. However, I very much doubt that this is the cause of the > slowdown. Probably. We'd need a test with the original Argon source to really know. >>The last part of the patch seems to be more related to bug #1235646. > > You mean the last chunk (linebuffer=None)? This is just the extension > to reset. Ouch, you're right: The part of "cvs diff" was part of my checkout, not your patch. I have so many Python checkouts that I sometimes forget which is which! ;) >>With the patch test_pep263 and test_codecs fail (and test_parser, but >>this might be unrelated): > > Oops, I thought I ran the test suite, but apparently with the patch > removed. New version uploaded. Looks much better now. >>Using collections.deque() should get rid of this problem. > > Alright. There are so many types in Python I've never heard of :-) The problem is that unicode.splitlines() returns a list, so the push/pop performance advantage of collections.deque might be eaten by having to create a collections.deque object in the first place. >>You mean, in the test suite? > > Right. > >>BTW, why the decode() call? For a Python without unicode? > > Right. Not sure what people think whether this should still be > supported, but I keep supporting it whenever I think of it. OK, so should we add this for 2.4.2 or only for 2.5? Should this really be put into string.py, or should it be a class attribute of unicode?
(At least that's what was proposed for the other strings in string.py (string.whitespace etc.) too.) >>I wonder what happens, if calls to read() and readline() are mixed (e.g. >>if I'm reading Fortran source or anything with a fixed line header): >>read() would be used to read the first n characters (which joins the line >>buffer) and readline() reads the rest (which would split it again) etc. >>(Of course this could be done via a single readline() call). > > Then performance would drop again - it should still be correct, though. > > If this becomes a frequent problem, we could satisfy read requests > from the split lines as well (i.e. join as many lines as you need). > However, I would rather expect that callers of read() typically want > the entire file, or want to read in large chunks (with no line > orientation at all). Agreed! Don't fix a bug that hasn't been reported! ;) >>But, I think a maxsplit argument for splitlines() would make sense >>independent of this problem. > > I'm not so sure anymore. It is good for consistency, but I doubt there > are actual use cases: how often do you want only the first n lines > of some string? Reading the first n lines of a file might be an > application, but then, you would rather use .readline() directly. Not every unicode string is read from a StreamReader. > For readline, I don't think there is a clear case for splitting off > only the first line (unless you want to return an index instead of > the rest string): if the application eventually wants all of the > data, we better split it right away into individual strings, instead > of dealing with a gradually decreasing trailer. True, this would be best for a readline loop. Another solution would be to have a unicode.itersplitlines() and store the iterator. Then we wouldn't need a maxsplit because you simply can stop iterating once you have what you want. > Anyway, I don't think we should go back to C's readline/fgets. This > is just too messy wrt. buffering and text vs.
binary mode. I don't know about C's readline, but StreamReader.read() and StreamReader.readline() are messy enough. But at least it's something we can fix ourselves. > I wish > Python would stop using stdio entirely. So reverting to the 2.3 behaviour for simple codecs is out? Bye, Walter Dörwald From barry at python.org Wed Aug 24 20:25:21 2005 From: barry at python.org (Barry Warsaw) Date: Wed, 24 Aug 2005 14:25:21 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <430CA279.3080909@v.loewis.de> References: <001d01c5a855$447a9d20$ab12c797@oemcomputer> <1124895832.19291.10.camel@geddy.wooz.org> <430C9D90.6090102@hathawaymix.org> <934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu> <430CA279.3080909@v.loewis.de> Message-ID: <1124907921.19925.5.camel@geddy.wooz.org> On Wed, 2005-08-24 at 12:38, "Martin v. Löwis" wrote: > I personally dislike recording the execution path in > local variables. This is like setting a flag in a loop > before the break, and testing the flag afterwards. > You can do this, but the else: clause of the loop is > just more readable. Agreed! > This specific fragment has also the bug that a > KeyboardInterrupt before the assignment to complete > will cause a NameError/UnboundLocalError; this > can easily be fixed by moving the assignment before > the try block. And that begs the question whether getting rid of this common idiom is trading one common problem for another. -Barry From reinhold-birkenfeld-nospam at wolke7.net Wed Aug 24 20:33:02 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Wed, 24 Aug 2005 20:33:02 +0200 Subject: [Python-Dev] Docs/Pointer to Tools/scripts?
Message-ID: Hi, after adding Oleg Broytmann's findnocoding.py to Tools/scripts, I wonder whether the Tools directory is documented at all. There are many useful scripts there which many people will not find if they are not listed anywhere in the docs. Just a thought. Reinhold -- Mail address is perfectly valid! From phd at mail2.phd.pp.ru Wed Aug 24 20:44:11 2005 From: phd at mail2.phd.pp.ru (Oleg Broytmann) Date: Wed, 24 Aug 2005 22:44:11 +0400 Subject: [Python-Dev] Docs/Pointer to Tools/scripts? In-Reply-To: References: Message-ID: <20050824184410.GB5666@phd.pp.ru> Hello! On Wed, Aug 24, 2005 at 08:33:02PM +0200, Reinhold Birkenfeld wrote: > after adding Oleg Broytmann's findnocoding.py to Tools/scripts What's more, pysource.py is more than just a script - it's a generally useful module. Thank you for committing the code. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From martin at v.loewis.de Wed Aug 24 21:15:09 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 24 Aug 2005 21:15:09 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430CB987.5000601@livinglogic.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de> <430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de> <430CB0AE.1040201@v.loewis.de> <430CB987.5000601@livinglogic.de> Message-ID: <430CC73D.1050401@v.loewis.de> Walter Dörwald wrote: >> Right. Not sure what people think whether this should still be >> supported, but I keep supporting it whenever I think of it. > > > OK, so should we add this for 2.4.2 or only for 2.5? You mean, string.unicodelinebreaks? I think something needs to be done to fix the performance problem. In doing so, API changes might occur.
We should not add API changes in 2.4.2 unless they contribute to the bug fix, and even then, the release manager probably needs to approve them (in any case, they certainly need to be backwards compatible). > Should this really be put into string.py, or should it be a class > attribute of unicode? (At least that's what was proposed for the other > strings in string.py (string.whitespace etc.) too. If the 2.4.2 fix is based on this kind of data, I think it should go into a private attribute of codecs.py. For 2.5, I would put it into strings for tradition. There is no point in having some of these constants in strings and others as class attributes (unless we also add them as class attributes in 2.5, in which case adding unicodelinebreaks into strings would be pointless). So I think in 2.5, I would like to see # string.py ascii_letters = str.ascii_letters in which case unicode.linebreaks would be the right spelling. >> I'm not so sure anymore. It is good for consistency, but I doubt there >> are actual use cases: how often do you want only the first n lines >> of some string? Reading the first n lines of a file might be an >> application, but then, you would rather use .readline() directly. > > > Not every unicode string is read from a StreamReader. Sure: but how often do you want to fetch the first line of a Unicode string you happen to have in memory, without iterating over all lines eventually? > Another solution would be to have a unicode.itersplitlines() and store > the iterator. Then we wouldn't need a maxsplit because you simply can > stop iterating once you have what you want. That might work. I would then ask for itersplitlines to return pairs of (line, truncated) so you can easily know whether you merely ran into the end of the string, or whether you got a complete line (although it might be a bit too specific for the readlines() case). > So reverting to the 2.3 behaviour for simple codecs is out? I'm -1, at least.
It would also fix the problem at hand, for the reported case. However, it does leave some codecs in the cold, most notably UTF-8 (which, in turn, isn't an issue for PEP 262, since UTF-8 is built-in in the parser). I think the UTF-8 stream reader should support all Unicode line breaks, so it should continue to use the Python approach. However, UTF-8 is fairly common, so that reading an UTF-8-encoded file line-by-line shouldn't suck. Regards, Martin From raymond.hettinger at verizon.net Wed Aug 24 21:15:12 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 24 Aug 2005 15:15:12 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: Message-ID: <005a01c5a8e0$27480860$b729cb97@oemcomputer> [Guido van Rossum] > > OK, I'm convinced. Let's drop bare except for Python 3.0, and > > deprecate them until then, without changing the meaning. > > > > The deprecation message (to be generated by the compiler!) should > > steer people in the direction of specifying one particular exception > > (e.g. KeyError etc.) rather than Exception. [James Y Knight] > I agree but there's the minor nit of non-Exception exceptions. > > I think it must be the case that raising an object which does not > derive from an exception class must be deprecated as well in order > for "except:" to be deprecated. Otherwise, there is nothing you can > change "except:" to in order not to get a deprecation warning and > still have your code be correct in the face of documented features of > python. Hmm, that may not be a killer. I wonder if it is possible to treat BaseException as a constant (like we do with None) and teach the compiler to interpret it as catching anything that gets raised so that "except BaseException" will work like a bare except clause does now. 
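As a sketch of what is at stake in the bare-except discussion, the snippet below shows which clause catches which kind of exception. It assumes a later Python 3 interpreter, where BaseException exists and KeyboardInterrupt derives from it directly rather than from Exception; that is the arrangement the thread is still debating, not something settled at this point. The function name classify is hypothetical.

```python
def classify(exc):
    """Raise `exc` and report which except clause ends up catching it."""
    try:
        raise exc
    except KeyError:
        return "specific"        # the narrow clause the warning should steer toward
    except Exception:
        return "Exception"       # ordinary errors, but not KeyboardInterrupt
    except BaseException:
        return "BaseException"   # everything raisable, like a bare except


results = [
    classify(KeyError("k")),
    classify(ValueError("v")),
    classify(KeyboardInterrupt()),
]
```

Under those assumptions "except BaseException:" is the explicit spelling of a bare except, with no special treatment by the compiler required.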
Raymond From barry at python.org Wed Aug 24 21:21:47 2005 From: barry at python.org (Barry Warsaw) Date: Wed, 24 Aug 2005 15:21:47 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <005a01c5a8e0$27480860$b729cb97@oemcomputer> References: <005a01c5a8e0$27480860$b729cb97@oemcomputer> Message-ID: <1124911307.19921.11.camel@geddy.wooz.org> On Wed, 2005-08-24 at 15:15, Raymond Hettinger wrote: > Hmm, that may not be a killer. I wonder if it is possible to treat > BaseException as a constant (like we do with None) and teach the > compiler to interpret it as catching anything that gets raised so that > "except BaseException" will work like a bare except clause does now. Sorry Raymond, but my first reaction is "ick" :). That seems to be a big change in the semantics of exception matching. I think I'd rather keep bare except than add that! -Barry From raymond.hettinger at verizon.net Wed Aug 24 21:30:28 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 24 Aug 2005 15:30:28 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <1124911307.19921.11.camel@geddy.wooz.org> Message-ID: <005e01c5a8e2$49102fc0$b729cb97@oemcomputer> > > Hmm, that may not be a killer. I wonder if it is possible to treat > > BaseException as a constant (like we do with None) and teach the > > compiler to interpret it as catching anything that gets raised so that > > "except BaseException" will work like a bare except clause does now. > > Sorry Raymond, but my first reaction is "ick" :). That seems to be a > big change in the semantics of exception matching. I think I'd rather > keep bare except than add that!
That may be your only other option if we're waiting until 3.0 to eliminate string exceptions and class exceptions not derived from the hierarchy. Raymond From mwh at python.net Wed Aug 24 21:52:13 2005 From: mwh at python.net (Michael Hudson) Date: Wed, 24 Aug 2005 20:52:13 +0100 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <005e01c5a8e2$49102fc0$b729cb97@oemcomputer> (Raymond Hettinger's message of "Wed, 24 Aug 2005 15:30:28 -0400") References: <005e01c5a8e2$49102fc0$b729cb97@oemcomputer> Message-ID: <2mbr3nru36.fsf@starship.python.net> "Raymond Hettinger" writes: >> > Hmm, that may not be a killer. I wonder if it is possible to treat >> > BaseException as a constant (like we do with None) and teach the >> > compiler to interpret it as catching anything that gets raised so > that >> > "except BaseException" will work like a bare except clause does now. >> >> Sorry Raymond, but my first reaction is "ick" :). That seems to be a >> big change in the semantics of exception matching. I think I'd rather >> keep bare except than add that! > > That may be your only other option if we're waiting until 3.0 to > eliminate string exceptions and class exceptions not derived from the > hierarchy. I really hope string exceptions can be killed off before 3.0. They should be fully deprecated in 2.5. Cheers, mwh -- The Oxford Bottled Beer Database heartily disapproves of the excessive consumption of alcohol. No, really. -- http://www.bottledbeer.co.uk/beergames.html (now sadly gone to the big 404 in the sky) From walter at livinglogic.de Wed Aug 24 22:14:32 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed, 24 Aug 2005 22:14:32 +0200 Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) 
In-Reply-To: <430CC73D.1050401@v.loewis.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de> <430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de> <430CB0AE.1040201@v.loewis.de> <430CB987.5000601@livinglogic.de> <430CC73D.1050401@v.loewis.de> Message-ID: <671A6329-ED68-491F-84CB-1D2CF00A2F6A@livinglogic.de> On 24.08.2005 at 21:15, Martin v. Löwis wrote: > Walter Dörwald wrote: > > >>> Right. Not sure what people think whether this should still be >>> supported, but I keep supporting it whenever I think of it. >>> >> >> OK, so should we add this for 2.4.2 or only for 2.5? >> > > You mean, string.unicodelinebreaks? > Yes. > I think something needs to be > done to fix the performance problem. In doing so, API changes > might occur. We should not add API changes in 2.4.2 unless they > contribute to the bug fix, and even then, the release manager > probably needs to approve them (in any case, they certainly > need to be backwards compatible) > OK. Your version of the patch (without replacing line = line.splitlines(False)[0] with something better) might be enough for 2.4.2. >> Should this really be put into string.py, or should it be a class >> attribute of unicode? (At least that's what was proposed for the >> other >> strings in string.py (string.whitespace etc.) too. >> > > If the 2.4.2 fix is based on this kind of data, I think it should go > into a private attribute of codecs.py. > I think codecs.unicodelinebreaks has one big problem: it will not work for codecs that do str->str decoding. > For 2.5, I would put it > into strings for tradition. There is no point in having some of these > constants in strings and others as class attributes (unless we also > add them as class attributes in 2.5, in which case adding > unicodelinebreaks into strings would be pointless).
> > So I think in 2.5, I would like to see > > # string.py > ascii_letters = str.ascii_letters > > in which case unicode.linebreaks would be the right spelling. > And it would have the advantage that it could work both with str and unicode if we had both str.linebreaks and unicode.linebreaks. >>> I'm not so sure anymore. It is good for consistency, but I doubt >>> there >>> are actual use cases: how often do you want only the first n lines >>> of some string? Reading the first n lines of a file might be an >>> application, but then, you would rather use .readline() directly. >>> >> >> Not every unicode string is read from a StreamReader. >> > > Sure: but how often do you want to fetch the first line of a Unicode > string you happen to have in memory, without iterating over all lines > eventually? > I don't know. The only obvious spot in the standard library (apart from codecs.py) seems to be def shortdescription(self): return self.description().splitlines()[0] in Lib/plat-mac/pimp.py >> Another solution would be to have a unicode.itersplitlines() and >> store >> the iterator. Then we wouldn't need a maxsplit because you simply can >> stop iterating once you have what you want. >> > > That might work. I would then ask for itersplitlines to return pairs > of (line, truncated) so you can easily know whether you merely ran > into the end of the string, or whether you got a complete line > (although it might be a bit too specific for the readlines() case) > Or maybe (line, terminatorlength) which gives you the same info (terminatorlength == 0 means truncated) and makes it easy to strip the terminator. >> So reverting to the 2.3 behaviour for simple codecs is out? >> > > I'm -1, at least. It would also fix the problem at hand, for the > reported > case. However, it does leave some codecs in the cold, most notably > UTF-8 (which, in turn, isn't an issue for PEP 262, since UTF-8 is > built-in in the parser). > You meant PEP 263, right?
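The (line, terminatorlength) variant of itersplitlines() proposed above could be sketched roughly as follows, in current Python 3. Nothing like this exists in the standard library; the function is hypothetical, and the terminator set is an assumption limited to "\n", "\r", "\r\n" and the extra code points named earlier in the thread.

```python
import re

# Assumed terminator set: CRLF first so it is matched as a single terminator.
_LINEBREAK = re.compile(r"\r\n|[\n\r\x1c\x1d\x1e\x85\u2028\u2029]")


def itersplitlines(text):
    """Lazily yield (line, terminatorlength) pairs; 0 means no terminator."""
    pos = 0
    for match in _LINEBREAK.finditer(text):
        # The line keeps its terminator; its length is reported alongside.
        yield text[pos:match.end()], match.end() - match.start()
        pos = match.end()
    if pos < len(text):
        yield text[pos:], 0      # trailing fragment without a line end


pairs = list(itersplitlines("a\r\nb\nc"))
first_line, first_termlen = next(itersplitlines("header\nrest of the data"))
```

Because the generator is lazy, a caller who only wants the first line stops after one step and never scans the rest of the string, which is the reason given above for not needing a maxsplit argument.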
> I think the UTF-8 stream reader should support > all Unicode line breaks, so it should continue to use the Python > approach. > OK. > However, UTF-8 is fairly common, so that reading an > UTF-8-encoded file line-by-line shouldn't suck. > OK, so what's missing is a solution for str->str codecs (or we keep line = line.splitlines(False)[0] and test, whether this is fast enough). Bye, Walter Dörwald From tzot at mediconsa.com Wed Aug 24 22:48:28 2005 From: tzot at mediconsa.com (Christos Georgiou) Date: Wed, 24 Aug 2005 23:48:28 +0300 Subject: [Python-Dev] Docs/Pointer to Tools/scripts? References: Message-ID: "Reinhold Birkenfeld" wrote in message news:deiegu$jqf$1 at sea.gmane.org... > Hi, > > after adding Oleg Broytmann's findnocoding.py to Tools/scripts, I wonder > whether the Tools directory is documented at all. There are many useful > scripts there which many people will not find if they are not listed > anywhere in the docs. AFAIK the only documentation is the README file in said directory. From walter at livinglogic.de Wed Aug 24 23:12:38 2005 From: walter at livinglogic.de (Walter Dörwald) Date: Wed, 24 Aug 2005 23:12:38 +0200 Subject: [Python-Dev] [Argon] Re: 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de> <430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de> <430CB0AE.1040201@v.loewis.de> <430CB987.5000601@livinglogic.de> Message-ID: <8FD4A0C3-D54B-403C-9BC7-052D2FB1F0E5@livinglogic.de> On 24.08.2005 at 20:20, Greg Wilson wrote: >>> Walter Dörwald wrote: >>> >>>> At least it would remove the quadratic number of calls to >>>> _PyUnicodeUCS2_IsLinebreak(). For each character it would be >>>> called only >>>> once. > >> Martin v. Löwis wrote: >> >>> Correct.
However, I very much doubt that this is the cause of the >>> slowdown. > >> Walter Dörwald wrote: >> Probably. We'd need a test with the original Argon source to >> really know. > > We can do that. So, can you try Martin's patch? >> OK, so should we add this for 2.4.2 or only for 2.5? > > 2.4.2 please ;-) If we use the patch as is, I think it can go into 2.4.2. Bye, Walter Dörwald From martin at v.loewis.de Wed Aug 24 23:37:53 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 24 Aug 2005 23:37:53 +0200 Subject: [Python-Dev] Docs/Pointer to Tools/scripts? In-Reply-To: References: Message-ID: <430CE8B1.4030409@v.loewis.de> Reinhold Birkenfeld wrote: > after adding Oleg Broytmann's findnocoding.py to Tools/scripts, I > wonder whether the Tools directory is documented at all. There are > many useful scripts there which many people will not find if they are > not listed anywhere in the docs. Christos already mentioned it: there is a README file in both Tools and Tools/scripts; you should update it whenever you add something. Regards, Martin From gvanrossum at gmail.com Thu Aug 25 02:28:35 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 24 Aug 2005 17:28:35 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <2mbr3nru36.fsf@starship.python.net> References: <005e01c5a8e2$49102fc0$b729cb97@oemcomputer> <2mbr3nru36.fsf@starship.python.net> Message-ID: On 8/24/05, Michael Hudson wrote: > I really hope string exceptions can be killed off before 3.0. They > should be fully deprecated in 2.5. But what about class exceptions that don't inherit from Exception? That will take a while before we can deprecate that. Anyway, there have been plenty of cases where I was only interested in catching arbitrary exceptions generated by *Python* (as opposed to broken 3rd party code or even obscure Python library code) and those all inherit from Exception.
And in those cases I've written "except Exception:" and so far never regretted it. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bcannon at gmail.com Thu Aug 25 03:39:35 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 24 Aug 2005 18:39:35 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com> Message-ID: On 8/24/05, Guido van Rossum wrote: > On 8/24/05, James Y Knight wrote: > > I think it must be the case that raising an object which does not > > derive from an exception class must be deprecated as well in order > > for "except:" to be deprecated. Otherwise, there is nothing you can > > change "except:" to in order not to get a deprecation warning and > > still have your code be correct in the face of documented features of > > python. > > I agree; isn't that already in the PEP? This surely has been the > thinking all along. > Requiring inheritance of BaseException in order to pass it to 'raise' has been in the PEP since the beginning. -Brett From bcannon at gmail.com Thu Aug 25 03:43:23 2005 From: bcannon at gmail.com (Brett Cannon) Date: Wed, 24 Aug 2005 18:43:23 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com> Message-ID: On 8/24/05, Guido van Rossum wrote: > On 8/24/05, Michael Chermside wrote: > > Explicit is better than Implicit. I think that in newly written code > > "except Exception:" is better (more explicit and easier to understand) > > than "except:" Legacy code that uses "except:" can remain unchanged *IF* > > the meaning of "except:" is unchanged... but I think we all agree that > > this is unwise because the existing meaning is a tempting trap for the > > unwary. So I don't see any advantage to keeping bare "except:" in the > > long run. What we do to ease the transition is a different question, > > but one more easily resolved.
> > OK, I'm convinced. Let's drop bare except for Python 3.0, and > deprecate them until then, without changing the meaning. > Woohoo! I am currently on vacation before school starts (orientation is Sept. 1, classes start Sept. 6), so it might take me a little while to edit the PEP, but I will try to fit it into my schedule ASAP (assuming the tide doesn't turn on me before then). > The deprecation message (to be generated by the compiler!) should > steer people in the direction of specifying one particular exception > (e.g. KeyError etc.) rather than Exception. Is there any desire for a __future__ statement that makes it a syntax error? How about making 'raise' statements only work with objects that inherit from BaseException? -Brett From gvanrossum at gmail.com Thu Aug 25 04:02:00 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 24 Aug 2005 19:02:00 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com> Message-ID: On 8/24/05, Brett Cannon wrote: > Is there any desire for a __future__ statement that makes it a syntax > error? How about making 'raise' statements only work with objects > that inherit from BaseException? I doubt it. Few people are going to put a __future__ statement in to make sure that they *don't* use a particular feature: it's just as easy to grep your source code for "except:". __future__ is in general only used to enable new syntax that previously had a different meaning. Anyway, you can make it an error globally by using the -W option creatively. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From amk at amk.ca Thu Aug 25 04:33:42 2005 From: amk at amk.ca (A.M. Kuchling) Date: Wed, 24 Aug 2005 22:33:42 -0400 Subject: [Python-Dev] New mailbox module Message-ID: <20050825023342.GA20941@rogue.amk.ca> Gregory K.
Johnson, who's been working on the mailbox module in nondist/sandbox/mailbox for Google's Summer of Code, thinks his project is essentially complete. He's added the ability to modify mailboxes by adding and removing messages, added test cases for the new features, and written the corresponding documentation. So, it's time to start considering it for inclusion in the standard library. This is a big change to a non-obscure module, so I don't feel able to make this decision on my own. I believe the code quality is acceptable, but would appreciate comments on any cleanups that need to be made. I still need to read through the docs and make editing suggestions, and check that the code is still backward-compatible with the old version of the module. --amk From barry at python.org Thu Aug 25 06:22:43 2005 From: barry at python.org (Barry Warsaw) Date: Thu, 25 Aug 2005 00:22:43 -0400 Subject: [Python-Dev] New mailbox module In-Reply-To: <20050825023342.GA20941@rogue.amk.ca> References: <20050825023342.GA20941@rogue.amk.ca> Message-ID: <1124943762.10479.0.camel@geddy.wooz.org> On Wed, 2005-08-24 at 22:33, A.M. Kuchling wrote: > So, it's time to start considering it for inclusion in the standard > library. This is a big change to a non-obscure module, so I don't feel > able to make this decision on my own. > > I believe the code quality is acceptable, but would appreciate > comments on any cleanups that need to be made. I still need to read > through the docs and make editing suggestions, and check that the code > is still backward-compatible with the old version of the module. I plan to take a look at it, but won't get a chance to do so for several days. -Barry
From foom at fuhm.net Thu Aug 25 06:45:12 2005 From: foom at fuhm.net (James Y Knight) Date: Thu, 25 Aug 2005 00:45:12 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com> Message-ID: <5BB88A76-FB4E-41F7-B82D-4B7C5B6D28DD@fuhm.net> On Aug 24, 2005, at 9:39 PM, Brett Cannon wrote: > On 8/24/05, Guido van Rossum wrote: >> On 8/24/05, James Y Knight wrote: >>> I think it must be the case that raising an object which does not >>> derive from an exception class must be deprecated as well in order >>> for "except:" to be deprecated. Otherwise, there is nothing you can >>> change "except:" to in order not to get a deprecation warning and >>> still have your code be correct in the face of documented >>> features of >>> python. >>> >> >> I agree; isn't that already in the PEP? This surely has been the >> thinking all along. >> >> > > Requiring inheritance of BaseException in order to pass it to 'raise' > has been in the PEP since the beginning. Yes, it talks about that as a change that will happen in Python 3.0. I was responding to >> OK, I'm convinced. Let's drop bare except for Python 3.0, and >> deprecate them until then, without changing the meaning. which is talking about deprecating bare excepts in Python 2.5. Now maybe it's the idea that everything that's slated for removal in Python 3.0 by PEP 348 is supposed to be getting a deprecation warning in Python 2.5, but that certainly isn't stated. The transition plan section says that all that will happen in Python 2.5 is the addition of "BaseException".
James From gvanrossum at gmail.com Thu Aug 25 07:13:09 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed, 24 Aug 2005 22:13:09 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <5BB88A76-FB4E-41F7-B82D-4B7C5B6D28DD@fuhm.net> References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com> <5BB88A76-FB4E-41F7-B82D-4B7C5B6D28DD@fuhm.net> Message-ID: On 8/24/05, James Y Knight wrote: > On Aug 24, 2005, at 9:39 PM, Brett Cannon wrote: > > On 8/24/05, Guido van Rossum wrote: > >> On 8/24/05, James Y Knight wrote: > >>> I think it must be the case that raising an object which does not > >>> derive from an exception class must be deprecated as well in order > >>> for "except:" to be deprecated. Otherwise, there is nothing you can > >>> change "except:" to in order not to get a deprecation warning and > >>> still have your code be correct in the face of documented > >>> features of > >>> python. > >>> > >> > >> I agree; isn't that already in ther PEP? This surely has been the > >> thinking all along. > >> > >> > > > > Requiring inheritance of BaseException in order to pass it to 'raise' > > has been in the PEP since the beginning. > > Yes, it talks about that as a change that will happen in Python 3.0. > I was responding to > > >> OK, I'm convinced. Let's drop bare except for Python 3.0, and > >> deprecate them until then, without changing the meaning. > > which is talking about deprecating bare excepts in Python 2.5. Now > maybe it's the idea that everything that's slated for removal in > Python 3.0 by PEP 348 is supposed to be getting a deprecation warning > in Python 2.5, but that certainly isn't stated. The transition plan > section says that all that will happen in Python 2.5 is the addition > of "BaseException". Then maybe the PEP isn't perfect just yet. :-) It's never too early to start deprecating a feature we know will disappear in 3.0. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh at python.net Thu Aug 25 10:16:02 2005 From: mwh at python.net (Michael Hudson) Date: Thu, 25 Aug 2005 09:16:02 +0100 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: (Guido van Rossum's message of "Wed, 24 Aug 2005 17:28:35 -0700") References: <005e01c5a8e2$49102fc0$b729cb97@oemcomputer> <2mbr3nru36.fsf@starship.python.net> Message-ID: <2m7jeasa7x.fsf@starship.python.net> Guido van Rossum writes: > On 8/24/05, Michael Hudson wrote: >> I really hope string exceptions can be killed off before 3.0. They >> should be fully deprecated in 2.5. > > But what about class exceptions that don't inherit from Exception? > That will take a while before we can deprecate that. Oh, for sure. I didn't mean to imply anything else. Cheers, mwh -- "Sturgeon's Law (90% of everything is crap) applies to Usenet." "Nothing guarantees that the 10% isn't crap, too." -- Gene Spafford's Axiom #2 of Usenet, and a corollary From t-meyer at ihug.co.nz Thu Aug 25 10:51:18 2005 From: t-meyer at ihug.co.nz (Tony Meyer) Date: Thu, 25 Aug 2005 20:51:18 +1200 Subject: [Python-Dev] python-dev Summary for 2005-08-01 through 2005-08-15 [draft] Message-ID: Here's August Part One. As usual, if anyone can spare the time to proofread this, that would be great! Please send any corrections or suggestions to Steve (steven.bethard at gmail.com) and/or me, rather than cluttering the list. Ta! ============= Announcements ============= ---------------------------- QOTF: Quote of the Fortnight ---------------------------- Some wise words from Donovan Baarda in the PEP 347 discussions: It is true that some well designed/developed software becomes reliable very quickly. However, it still takes heavy use over time to prove that. 

Contributing thread:

- `PEP: Migrating the Python CVS to Subversion `__

[SJB]

------------
Process PEPs
------------

The PEP editors have introduced a new PEP category: "Process", for PEPs that don't fit into the "Standards Track" and "Informational" categories. More detail can be found in `PEP 1`_, which is itself a Process PEP.

.. _PEP 1: http://www.python.org/peps/pep-0001.html

Contributing thread:

- `new PEP type: Process `__

[TAM]

-----------------------------------------------
Tentative Schedule for 2.4.2 and 2.5a1 Releases
-----------------------------------------------

Python 2.4.2 is tentatively scheduled for a mid-to-late September release, and a first alpha of Python 2.5 for March 2006 (with a final release around May/June). This means that a PEP for the 2.5 release, detailing what will be included, will likely be created soon; at present there are various accepted PEPs that have not yet been implemented.

Contributing thread:

- `plans for 2.4.2 and 2.5a1 `__

[TAM]

=========
Summaries
=========

-------------------------------
Moving Python CVS to Subversion
-------------------------------

The `PEP 347`_ discussion from last fortnight continued this fortnight, with a revision of the PEP, and a lot more discussion about possible revision control software (RCS) for the Python repository, and where the repository should be hosted. Note that this is not a discussion about bug trackers, which will remain with Sourceforge (unless a separate PEP is developed for moving that).

Many revision control systems were extensively discussed, including `Subversion`_ (SVN), `Perforce`_, and `Monotone`_. Whichever system is moved to, it should be able to be hosted somewhere (if *.python.org, then it needs to be easily installable), needs to have software available to convert a repository from CVS, and ideally would be open-source; similarity to CVS is also an advantage in that it requires a smaller learning curve for existing developers.
While Martin isn't willing to discuss every system there is, he will investigate those that make him curious, and will add other people's submissions to the PEP, where appropriate.

The thread included a short discussion about the authentication mechanism that svn.python.org will use; svn+ssh seems to be a clear winner, and a test repository will be set up by Martin next fortnight.

The possibility of moving to a distributed revision control system (particularly `Bazaar-NG`_) was also brought up. Many people liked the idea of using a distributed revision control system, but it seems unlikely that Bazaar-NG is mature enough to be used for the main Python repository at the current time (a move to it at a later time is possible, but outside the scope of the PEP). Distributed RCS are meant to reduce the barrier to participation (anyone can create their own branches, for example); Bazaar-NG is also implemented in Python, which is of some benefit. James Y Knight pointed out `svk`_, which lets developers create their own branches within SVN.

In general, the python-dev crowd is in favour of moving to SVN. Initial concerns about the demands on the volunteer admins should the repository be hosted at svn.python.org were addressed by Barry Warsaw, who believes that the load will be easily managed with the existing volunteers. Various alternative hosts were discussed, and if detailed reports about any of them are created, these can be added to the PEP.

While the fate of all PEPs lies with the BDFL (Guido), it is likely that the preferences of those that frequently check in changes, the pydotorg admins, and the release managers (who have all given favourable reports so far), will have a significant effect on the pronouncement of this PEP.

.. _PEP 347: http://www.python.org/peps/pep-0347.html
.. _svk: http://svk.elixus.org/
.. _Perforce: http://www.perforce.com/
.. _Subversion: http://subversion.tigris.org/
.. _Monotone: http://venge.net/monotone/
.. _Bazaar-NG: http://www.bazaar-ng.org/

Contributing threads:

- `PEP: Migrating the Python CVS to Subversion `__
- `PEP 347: Migration to Subversion `__
- `Hosting svn.python.org `__
- `Fwd: Distributed RCS `__
- `cvs to bzr? `__
- `Distributed RCS `__
- `Fwd: PEP: Migrating the Python CVS to Subversion `__
- `On distributed vs centralised SCM for Python `__

[TAM]

------------------------------------------
PEP 348: Exception Hierarchy in Python 3.0
------------------------------------------

This fortnight mostly concluded the previous discussion about `PEP 348`_, which sets out a roadmap for changes to the exception hierarchy in Python 3.0. The proposal was heavily scaled back to retain most of the current exception hierarchy unchanged. A new exception, BaseException, will be introduced above Exception in the current hierarchy, and KeyboardInterrupt and SystemExit will become siblings of Exception. The goal here is that::

    except Exception:

will now do the right thing for most cases, that is, it will catch all the exceptions that you can generally recover from. The PEP would also move NotImplementedError out from under RuntimeError, and alter the semantics of the bare except so that::

    except:

is the equivalent of::

    except Exception:

Only BaseException will appear in Python 2.5. The remaining modifications will not occur until Python 3.0.

.. 
_PEP 348: http://www.python.org/peps/pep-0348.html Contributing threads: - `Pre-PEP: Exception Reorganization for Python 3.0 `__ - `PEP, take 2: Exception Reorganization for Python 3.0 `__ - `Exception Reorg PEP checked in `__ - `PEP 348: Exception Reorganization for Python 3.0 `__ - `Major revision of PEP 348 committed `__ - `Exception Reorg PEP revised yet again `__ - `PEP 348 and ControlFlow `__ - `PEP 348 (exception reorg) revised again `__ [SJB] ---------------------- Moving towards Unicode ---------------------- Neil Schemenauer presented `PEP 349`_, which tries to ease the transition to Python 3.0, in which there will be a bytes() type for byte data and a str() type for text data. Currently to convert an object to text, you have one of three options: * Call str(). This breaks with a UnicodeEncodeError if the object is of type unicode (or a subtype) or can only represent itself in unicode and therefore returns unicode from __str__. * Call unicode(). This can break external code that is not yet Unicode-safe and that passed a str object to your code but got a unicode object back. * Use the "%s" format specifier. This breaks with a UnicodeEncodeError if the object can only represent itself in unicode and therefore returns unicode from __str__. `PEP 349`_ attempts to address this problem by introducing a text() builtin which returns str or unicode instances unmodified, and returns the result of calling __str__() on the object otherwise. Guido preferred to instead relax the restrictions on str() to allow it to return unicode objects. Neil implemented such a patch, and found that it broke only two test cases. The discussion stopped shortly after Neil's report however, so it was unclear if any permanent changes had been agreed upon. 
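
As a rough illustration of the coercion rule described above, here is a minimal sketch of what a text() helper might look like. Only the name text() and its described behaviour come from the summary; the string_types tuple and the Greeting class are assumptions made for this example, and the actual proposal may differ in details such as error checking.

```python
# Hypothetical sketch of the text() idea; not the PEP's reference code.
try:
    string_types = (str, unicode)   # interpreters with a separate unicode type
except NameError:
    string_types = (str,)           # interpreters where str is the only text type


def text(obj):
    """Return string instances unchanged; otherwise defer to obj.__str__()."""
    if isinstance(obj, string_types):
        return obj
    return obj.__str__()


class Greeting(object):
    """Illustrative class whose __str__ supplies its text form."""
    def __str__(self):
        return "hello"
```

The point of the sketch is that a string passed in comes back as the very same object, so no encoding or decoding step is ever forced on the caller.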

Guido made a few other Python 3.0 suggestions in this thread:

* The bytes() type should be mutable with a corresponding frozenbytes() immutable type
* Opening a file in binary or text mode would cause it to return bytes() or str() objects, respectively
* The file type should grow a getpos()/setpos() pair that are identical to tell()/seek() when a file is open in binary mode, and which work like tell()/seek() but on characters instead of bytes when a file is open in text mode

However, none of these seemed to be solid commitments.

.. _PEP 349: http://www.python.org/peps/pep-0349.html

Contributing threads:

- `PEP: Generalised String Coercion `__
- `Generalised String Coercion `__

[SJB]

----------------------------
PEP 344 and reference cycles
----------------------------

Armin Rigo brought up an issue with `PEP 344`_ which proposes, among other things, adding a __traceback__ attribute to exceptions to avoid the hassle of extracting it from sys.exc_info(). Armin pointed out that if exceptions grow a __traceback__ attribute, every statement::

    except Exception, e:

will create a cycle::

    e.__traceback__.tb_frame.f_locals['e']

Despite the fact that Python has cyclic garbage collection, there are still some situations where cycles like this can cause problems. Armin showed an example of such a case::

    class X:
        def __del__(self):
            try:
                typo
            except Exception, e:
                e_type, e_value, e_tb = sys.exc_info()

Even in current Python, instances of the X class are uncollectible. When garbage collection runs and tries to collect an X object, it calls the __del__() method. This creates the cycle::

    e_tb.tb_frame.f_locals['e_tb']

The X object itself is available through this cycle (in ``f_locals['self']``), so the X object's refcount does not drop to 0 when __del__() returns, so it cannot be collected. The next time garbage collection runs, it finds that the X object has not been collected, calls its __del__() method again and repeats the process.
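
The cycle Armin describes can be observed directly. The sketch below is an illustration only: the function name cycle_exists is made up for this example, it relies on the CPython-specific sys._getframe(), and it uses sys.exc_info() so that it does not depend on the proposed __traceback__ attribute.

```python
import sys


def cycle_exists():
    """Show that a saved traceback refers back to the frame that holds it."""
    try:
        raise ValueError("boom")
    except ValueError:
        e_type, e_value, e_tb = sys.exc_info()
        # The traceback's frame is this very function call ...
        same_frame = e_tb.tb_frame is sys._getframe()
        # ... and that frame's locals contain the traceback again.
        same_tb = e_tb.tb_frame.f_locals['e_tb'] is e_tb
        return same_frame and same_tb
```

A common way to avoid keeping such a cycle alive is to delete the local that holds the traceback once it is no longer needed.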

Tim Peters suggested this problem could be solved by declaring that __del__() methods are called exactly once. This allows the above X object to be collected because on the second run of the garbage collection, __del__() is not called again. Thus, the refcount of the X object is not incremented, and so it is collected by garbage collection. However, guaranteeing that __del__() is called only once means keeping track somehow of which objects' __del__() methods were called, which seemed somewhat unattractive. There was also brief talk about removing __del__ in favor of weakrefs, but those waters seemed about as murky as the garbage collection ones.

.. _PEP 344: http://www.python.org/peps/pep-0344.html

Contributing thread:

- `__traceback__ and reference cycles `__

[SJB]

----------------------------
Style for raising exceptions
----------------------------

Guido explained that these days exceptions should always be raised as::

    raise SomeException("some argument")

instead of::

    raise SomeException, "some argument"

The second will go away in Python 3.0, and is only present now for backwards compatibility. (It was necessary when strings could be exceptions, in order to pass both the exception "type" and message.) PEPs 8_ and 3000_ were accordingly updated.

.. _8: http://www.python.org/peps/pep-0008.html
.. _3000: http://www.python.org/peps/pep-3000.html

Contributing threads:

- `PEP 8: exception style `__
- `FW: PEP 8: exception style `__

[SJB]

-----------------------------------
Skipping list comprehensions in pdb
-----------------------------------

When using pdb, the only way to skip to the end of a loop is to set a breakpoint on the line after the loop. Ilya Sandler suggested adding an optional numeric argument to pdb's "next" command to indicate how many lines of code should be skipped. Martin v. Löwis pointed out that this differs from gdb's "next n" command, which does "next" n times.
Ilya suggested implementing gdb's "until" command instead, which gained Martin's approval. It was also pointed out that pdb is one of the less Pythonic modules, particularly in terms of the ability to subclass/extend, and would be a good candidate for rewriting, if anyone had the inclination and time. Contributing threads: - `pdb: should next command be extended? `__ - `an alternative suggestion, Re: pdb: should next command be extended? `__ [TAM] ------------------ Sets in Python 2.5 ------------------ Raymond Hettinger has been checking-in the new implementation for sets in Python 2.5. The implementation is based heavily on dictobject.c, the code for Python dict() objects, and generally deviates only when there is an obvious gain in doing so. Raymond posted his new API for discussion, but there didn't appear to be any comments. Contributing threads: - `[Python-checkins] python/dist/src/Objects setobject.c, 1.45, 1.46 `__ - `Discussion draft: Proposed Py2.5 C API for set and frozenset objects `__ [SJB] ================================ Deferred Threads (for next time) ================================ - `SWIG and rlcompleter `__ =============== Skipped Threads =============== - `Extension of struct to handle non byte aligned values? `__ - `Syscall Proxying in Python `__ - `__autoinit__ (Was: Proposal: reducing self.x=x; self.y=y; self.z=z boilerplate code) `__ - `Weekly Python Patch/Bug Summary `__ - `PEP 342 Implementation `__ - `String exceptions in Python source `__ - `[ python-Patches-790710 ] breakpoint command lists in pdb `__ - `[C++-sig] GCC version compatibility `__ - `PyTuple_Pack added references undocumented `__ - `PEP-- Context Managment variant `__ - `Sourceforge CVS down? 
`__ - `PSF grant / contacts `__ - `Python + Ping `__ - `Terminology for PEP 343 `__ - `dev listinfo page (was: Re: Python + Ping) `__ - `set.remove feature/bug `__ - `Extension to dl module to allow passing strings from native function `__ - `build problems on macosx (CVS HEAD) `__ - `request for code review - hashlib - patch #1121611 `__ - `python-dev Summary for 2005-07-16 through 2005-07-31 [draft] `__ - `string_join overrides TypeError exception thrown in generator `__ - `implementation of copy standard lib `__ - `xml.parsers.expat no userdata in callback functions `__ From mal at egenix.com Thu Aug 25 11:35:59 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 25 Aug 2005 11:35:59 +0200 Subject: [Python-Dev] Style for raising exceptions (python-dev Summary for 2005-08-01 through 2005-08-15 [draft]) In-Reply-To: References: Message-ID: <430D90FF.6060206@egenix.com> I must have missed this one: > ---------------------------- > Style for raising exceptions > ---------------------------- > > Guido explained that these days exceptions should always be raised as:: > > raise SomeException("some argument") > > instead of:: > > raise SomeException, "some argument" > > The second will go away in Python 3.0, and is only present now for backwards > compatibility. (It was necessary when strings could be exceptions, in > order to pass both the exception "type" and message.) PEPs 8_ and 3000_ > were accordingly updated. AFAIR, the second form was also meant to be able to defer the instantiation of the exception class until really needed in order to reduce the overhead related to raising exceptions in Python. However, that optimization never made it into the implementation, I guess. > .. _8: http://www.python.org/peps/pep-0008.html > .. 
_3000: http://www.python.org/peps/pep-3000.html > > Contributing threads: > > - `PEP 8: exception style > `__ > - `FW: PEP 8: exception style > `__ -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 25 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From raymond.hettinger at verizon.net Thu Aug 25 11:35:31 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 25 Aug 2005 05:35:31 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: Message-ID: <000d01c5a958$566e1bc0$b729cb97@oemcomputer> > > OK, I'm convinced. Let's drop bare except for Python 3.0, and > > deprecate them until then, without changing the meaning. > > > > Woohoo That's no cause for celebration. Efforts to improve Py3.0 have spilled over into breaking Py2.x code with no compensating benefits. Bare except clauses appear in almost every Python book that has ever been written and occur at least once in most major Python applications. I had thought the plan was to introduce Py3.0 capabilities into 2.x as they become possible but not to break anything. Isn't that why string exceptions, buffer(), and repr() still live and breathe? We don't have to wreck 2.x in order to make 3.0 better. I wish the 3.0 PEPs would stop until we are actually working on the project and have some chance of making people's lives better. If people avoid 2.5 just to avert unnecessary breakage, then Py3.0 doesn't benefit at all. I propose that the transition plan be as simple as introducing BaseException. This allows people to write code that will work on both 2.x and 3.0. It doesn't break anything. 

The guidance for cross-version (2.5 to 3.0) code would be:

* To catch all but terminating exceptions, write:

    except (KeyError, SystemExit):
        raise
    except Exception:
        ...

* To catch all exceptions, write:

    except BaseException:
        ...

To make the code also run on 2.4 and prior, add transition code:

    try:
        BaseException
    except NameError:
        class BaseException(Exception):
            pass

With that minimal guidance, people can write code that works from 2.0 to 3.0 and not break anything that is currently working. No deprecations are necessary.

Remember, the ONLY benefit from the whole PEP is that in 3.0, it will no longer be necessary to write "except (KeyError, SystemExit): raise". Steven and Jack's research shows that that doesn't arise much in practice anyway.

IOW, there's nothing worth inflicting destruction on tons of 2.x code.

Raymond

From mcherm at mcherm.com Thu Aug 25 14:28:44 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Thu, 25 Aug 2005 05:28:44 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
Message-ID: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com>

Raymond writes:
> Efforts to improve Py3.0 have spilled
> over into breaking Py2.x code with no compensating benefits.
  [...]
> We don't have to wreck 2.x in order to make 3.0 better.

I think you're overstating things a bit here.

> Remember, the ONLY benefit from the whole PEP is that in 3.0, it will no
> longer be necessary to write "except (KeyError, SystemExit): raise".
> [...] IOW, there's nothing worth inflicting destruction on tons of
> 2.x code.

And now I *KNOW* you're overstating things. There are LOTS of benefits to the PEP in 3.0. My own personal favorite is that users can be guaranteed that all exceptions thrown will share a particular common ancestor and type. And no one is proposing "destruction" of 2.x code.

On the other hand, I thought these were very good points:
> Bare
> except clauses appear in almost every Python book that has ever been
> written and occur at least once in most major Python applications.
  [...]
> I had thought the plan was to introduce Py3.0 capabilities into 2.x as
> they become possible but not to break anything.
  [...]
> I propose that the transition plan be as simple as introducing
> BaseException. This allows people to write code that will work on both
> 2.x and 3.0.

I think the situation is both better than and worse than you describe. The PEP is now proposing that bare "except:" be removed in Python 3.0. If I understand Guido correctly, he is proposing that in 2.5 the use of bare "except:" generate a PendingDeprecationWarning so that conscientious developers who want to write code now that will continue to work in Python 3.0 can avoid using bare "except:". Perhaps I'm misreading him here, but I presume this was intended as a PENDINGDeprecationWarning so that it's easy to ignore.

But it's a bit worse than it might seem, because conscientious users aren't ABLE to write safe 2.5 code that will run in 3.0. The problem arises when you need to write code that calls someone else's library but then unconditionally recovers from errors in it. Correct 2.4 syntax for this reads as follows:

    try:
        my_result = call_some_library(my_data)
    except (KeyboardInterrupt, MemoryError, SystemError):
        raise
    except:
        report_error()

Correct 3.0 syntax will read like this:

    try:
        my_result = call_some_library(my_data)
    except (KeyboardInterrupt, MemoryError, SystemError):
        raise
    except BaseException:
        report_error()

But no syntax will work in BOTH 2.5 and 3.0. The 2.4 syntax is illegal in 3.0, and the 3.0 syntax fails to catch exceptions that do not inherit from BaseException. Such exceptions are deprecated (by documentation, if not by code) so our conscientious programmer will never raise them and the standard library avoids doing so. But "call_some_library()" was written by some less careful developer, and may well contain these atavisms.

The only complete solution that comes to mind immediately is for the raising of anything not extending BaseException to raise a PendingDeprecationWarning as well. Then the conscientious developer can feel confident again so long as her unit tests are reasonably exhaustive. If we cannot produce a warning for these, then I'd rather not produce the warning for the use of bare "except:". After all, as it's been pointed out, if the use of bare "except:" is all you are interested in it is quite easy to grep the code to find all uses.

-- Michael Chermside

From raymond.hettinger at verizon.net Thu Aug 25 15:03:36 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 25 Aug 2005 09:03:36 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com>
Message-ID: <001b01c5a975$68909220$b729cb97@oemcomputer>

> > Efforts to improve Py3.0 have spilled
> > over into breaking Py2.x code with no compensating benefits. [...]
> > We don't have to wreck 2.x in order to make 3.0 better.
>
> I think you're overstating things a bit here.

It's only an overstatement if Guido didn't mean what he said. If bare except clauses are deprecated in 2.x, it WILL affect tons of existing code and invalidate a portion of almost all Python books.

> > Remember, the ONLY benefit from the whole PEP is that in 3.0, it will no
> > longer be necessary to write "except (KeyError, SystemExit): raise".
> > [...] IOW, there's nothing worth inflicting destruction on tons of
> > 2.x code.
>
> And now I *KNOW* you're overstating things. There are LOTS of benefits
> to the PEP in 3.0. My own personal favorite is that users can be
> guaranteed that all exceptions thrown will share a particular common
> ancestor and type.

Right, there are a couple of parts of the PEP that were non-controversial from the start and would likely have happened even in the absence of the PEP.

My point was that a lot of machinery is being thrown at a tiny problem. To eliminate the need for "except (KeyError, SystemExit): raise", we're rearranging the tree, introducing a new builtin, banning an existing and popular form of an except clause, and introducing a non-trivial deprecation that will affect most users. This is a lot of firepower directed at a somewhat small problem.

> But no syntax will work in BOTH 2.5 and 3.0.

There's the rub. If you can't write code that will work for both, then there is no reason to force 2.x users to make any changes to their existing code, especially given that they won't see any benefit from the mass edits.

> If we cannot produce a warning for these, then I'd
> rather not produce the warning for the use of bare "except:".
> After all, as it's been pointed out, if the use of bare "except:"
> is all you are interested in it is quite easy to grep the code to
> find all uses.

Bingo. A bare except clause is well known as a consenting adults construct. If Guido feels driven to eliminate it from Py3.0, then that is the way it is. But for 2.x, why introduce unnecessary pain?

Of course, if bare except clauses weren't banned for 3.0, then we would have no problem writing code that works on all versions of Python from 2.0 to 3.0, that doesn't break existing code, and that doesn't invalidate the text in Python books. IMO, that is a nice situation.

Just how badly do you want to kill bare except clauses? I propose that we leave them alone, and be happy that in 3.0 we can write "except Exception" and get what we want without any fuss.
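
For concreteness, the cross-version pattern discussed in this thread can be sketched as below. This is only an illustration under stated assumptions: run_guarded is a made-up helper, and re-raising KeyboardInterrupt and SystemExit is a guess at which terminating exceptions are meant.

```python
import sys

# Transition shim as suggested in the thread: define BaseException if missing.
try:
    BaseException
except NameError:
    class BaseException(Exception):
        pass


def run_guarded(func, *args):
    """Call func(*args); let terminating exceptions through, report the rest."""
    try:
        return ("ok", func(*args))
    except (KeyboardInterrupt, SystemExit):
        # Assumed set of "terminating" exceptions for this sketch.
        raise
    except Exception:
        # sys.exc_info() avoids any version-specific "except ... as" syntax.
        return ("error", sys.exc_info()[1])
```

Nothing here uses a bare except clause, so the sketch sidesteps the deprecation question entirely; it also does not catch objects that fall outside the Exception hierarchy, which is the gap Michael describes.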

Raymond

From sjoerd at acm.org Thu Aug 25 16:13:40 2005
From: sjoerd at acm.org (Sjoerd Mullender)
Date: Thu, 25 Aug 2005 16:13:40 +0200
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com>
References: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com>
Message-ID: <430DD214.2050208@acm.org>

Michael Chermside wrote:
> Raymond writes:
>
>>Efforts to improve Py3.0 have spilled
>>over into breaking Py2.x code with no compensating benefits. [...]
>>We don't have to wreck 2.x in order to make 3.0 better.
>
>
> I think you're overstating things a bit here.

There is an important point, though. Recently I read complaints about the lack of backward compatibility in Python on the fedora-list (mailing list for users of Fedora Core). Somebody asked what language he should learn and people answered, don't learn Python because it changes too often in backward incompatible ways. They even suggested using that other P language because that was much more backward compatible.

Check out the thread starting at https://www.redhat.com/archives/fedora-list/2005-August/msg01682.html .

-- Sjoerd Mullender

From gvanrossum at gmail.com Thu Aug 25 17:10:52 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 25 Aug 2005 08:10:52 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <001b01c5a975$68909220$b729cb97@oemcomputer>
References: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com> <001b01c5a975$68909220$b729cb97@oemcomputer>
Message-ID:

On 8/25/05, Raymond Hettinger wrote:
> It's only an overstatement if Guido didn't mean what he said.
> If bare except clauses are deprecated in 2.x, it WILL affect tons of
> existing code and invalidate a portion of almost all Python books.

Deprecation means your code will still work. I hope every book that documents "except:" also adds "but don't use this except under very special circumstances".

I think you're overreacting (again), Raymond. 3.0 will be much more successful if we can introduce many of its features into 2.x. Many of those features are in fact improvements of the language even if they break old code. We're trying to balance between breaking old code and introducing new features; deprecation is the accepted way to do this.

Regarding the complaint that Python is changing too fast, that really sounds like FUD to me. With a new release every 18 months Python is about as stable as it gets barring dead languages. PHP is in the throes of the 4->5 conversion which breaks worse than Python 2->3 will (Rasmus is changing object assignment semantics from copying to sharing). Maybe they should be warned not to learn Perl because Larry is deconstructing it all for Perl 6? :-)

-- --Guido van Rossum (home page: http://www.python.org/~guido/)

From tlesher at gmail.com Thu Aug 25 17:16:10 2005
From: tlesher at gmail.com (Tim Lesher)
Date: Thu, 25 Aug 2005 11:16:10 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <430DD214.2050208@acm.org>
References: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com> <430DD214.2050208@acm.org>
Message-ID: <9613db6005082508162794cd5b@mail.gmail.com>

On 8/25/05, Sjoerd Mullender wrote:
> There is an important point, though. Recently I read complaints about
> the lack of backward compatibility in Python on the fedora-list (mailing
> list for users of Fedora Core). Somebody asked what language he should
> learn and people answered, don't learn Python because it changes too
> often in backward incompatible ways.
They even suggested using that > other P language because that was much more backward compatible. I think you're overstating what actually happened there. Here's the actual quote from the thread: : perl is more portable than python - programs written for perl are far : more likely to run on a new version of perl than the equivalent for : python. However, python is probably more readable and writable than perl : for a new user, and is the language most Fedora system utilities (e.g. : yum) are written in. Both perl and python run on Windows too. : : You have to be very careful about how you write your code to make it : portable to both environments. If you need a GUI, you'll need a : cross-platform GUI toolkit like Qt too. : : If it's only one language to learn, and you're a Fedora user, I'd go for : python. Yes, later there were additional posts about portability and backwards-compatibility, but they were for the most part factually incorrect (reliance on new 2.x features, not backwards-incompatibility, was the issue with CML1) and relied on "I heard that..." information. So your point is well-taken, but the problem is one of user perception. That's not a dismissal of the problem--witness the "JAVA/LISP/Python is too slow" and "all PERL code is cryptic" memes. To me, this perception problem alone raises the bar on backwards compatibility. Even if obsoleted features are seldom used, "$language breaks old code!" is a virulent meme, in both senses of the word. -- Tim Lesher From gvanrossum at gmail.com Thu Aug 25 17:17:24 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu, 25 Aug 2005 08:17:24 -0700 Subject: [Python-Dev] Style for raising exceptions (python-dev Summary for 2005-08-01 through 2005-08-15 [draft]) In-Reply-To: <430D90FF.6060206@egenix.com> References: <430D90FF.6060206@egenix.com> Message-ID: On 8/25/05, M.-A.
Lemburg wrote: > I must have missed this one: > > > ---------------------------- > > Style for raising exceptions > > ---------------------------- > > > > Guido explained that these days exceptions should always be raised as:: > > > > raise SomeException("some argument") > > > > instead of:: > > > > raise SomeException, "some argument" > > > > The second will go away in Python 3.0, and is only present now for backwards > > compatibility. (It was necessary when strings could be exceptions, in > > order to pass both the exception "type" and message.) PEPs 8_ and 3000_ > > were accordingly updated. > > AFAIR, the second form was also meant to be able to defer > the instantiation of the exception class until really > needed in order to reduce the overhead related to raising > exceptions in Python. > > However, that optimization never made it into the implementation, > I guess. Something equivalent is used internally in the C code, but that doesn't mean we'll need it in Python code. The optimization only works if the exception is also *caught* in C code, BTW (it is instantiated as soon as it is handled by a Python except clause). Originally, the second syntax was the only available syntax, because all we had were string exceptions. Now that string exceptions are dead (although not yet buried :) I really don't see why we need to keep both versions of the syntax; Python 3.0 will only have one version. (We're still debating what to do with the traceback argument; wanna revive PEP 344?) If you need to raise exceptions fast, pre-instantiate an instance. 
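A minimal sketch of the two points in the message above: the call-style raise, and raising a pre-instantiated instance. SomeException, PREBUILT, fail, fail_fast and catch are hypothetical names used only for illustration, and the sketch is written so that it runs on a current Python 3 interpreter (an assumption; the 2.x releases under discussion spelled the handler "except SomeException, exc:").

```python
class SomeException(Exception):
    """Hypothetical exception type, used only for this illustration."""


def fail():
    # Call-style raise: the form the summary recommends.
    raise SomeException("some argument")


# Built once and reused, so no new instance is created at raise time.
PREBUILT = SomeException("prebuilt")


def fail_fast():
    # Raising an existing instance skips the constructor call.
    raise PREBUILT


def catch(func):
    """Run func and return the SomeException instance it raised."""
    try:
        func()
    except SomeException as exc:  # 2.x of that era: "except SomeException, exc:"
        return exc
    return None
```

Here catch(fail) returns a fresh instance on each call, while catch(fail_fast) returns the same PREBUILT object every time. Whether the saving matters in practice would need measuring.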
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Thu Aug 25 17:58:48 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 25 Aug 2005 11:58:48 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: Message-ID: <000a01c5a98d$e1c20940$b729cb97@oemcomputer> > Deprecation means your code will still work I hope every book that > documents "except:" also adds "but don't use this except under very > special circumstances". > > I think you're overreacting (again), Raymond. 3.0 will be much more > successful if we can introduce many of its features into 2.x. Many of > those features are in fact improvements of the language even if they > break old code. We're trying to balance between breaking old code and > introducing new features; deprecation is the accepted way to do this. IMO, the proponents of 2.x deprecation are underreacting. Deprecation has a cost -- there needs to be a corresponding payoff. Deprecation is warranted if the substitute code would still run on future Pythons (Michael explained the issues here). Deprecation is only warranted if the interim substitute works -- AFAICT, there is no other way to broadly catch exceptions not derived from Exception. The effort is only warranted if it makes the code better -- but here nothing is currently broken and the new code will be much less attractive and less readable (if the changes are done correctly); only 3.0 will offer the tools to do it readably and beautifully. Also, as we learned with apply(), even if ignored, the deprecation machinery has a tremendous runtime cost. None of this will make upgrading to Py2.5 an attractive option. There is a reason that over 120 bare except clauses remain in the standard library despite a number of attempts to get rid of them. 
It won't be trivial to properly evaluate whether each should be Exception or BaseException; to catch string exceptions; to write the test cases; to follow other PEPs requiring compatibility with older Pythons; or to do this in a way that it won't have to be done again for Py3.0. If the proponents don't have time to fix the standard library, how can they in good conscience mandate change for the rest of the world? Besides, I thought Guido was opposed to efforts to roam through mountains of code, making alterations in a non-holistic way. With a change this complex, the odds of introducing errors are very high. Fredrik, please speak up. Someone should represent the users here. I've reached my limit on how much time I can devote to thinking out the implications of these proposals. Someone else needs to "overreact". From nas at arctrix.com Thu Aug 25 18:33:03 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Thu, 25 Aug 2005 10:33:03 -0600 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <000a01c5a98d$e1c20940$b729cb97@oemcomputer> References: <000a01c5a98d$e1c20940$b729cb97@oemcomputer> Message-ID: <20050825163302.GA21089@mems-exchange.org> On Thu, Aug 25, 2005 at 11:58:48AM -0400, Raymond Hettinger wrote: > Deprecation is only warranted if the interim substitute works -- > AFAICT, there is no other way to broadly catch exceptions not > derived from Exception. This seems to get to the heart of the problem. I'm no fan of bare excepts but I think we could handle them in 2.x (at least for the next few releases) by providing a workable alternative and then strongly discouraging their use (like we do for "from x import *").
Neil From dieter at handshake.de Wed Aug 24 21:11:18 2005 From: dieter at handshake.de (Dieter Maurer) Date: 24 Aug 2005 21:11:18 +0200 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings In-Reply-To: References: Message-ID: The following message is a courtesy copy of an article that has been posted to comp.lang.python as well. Neil Schemenauer writes on Mon, 22 Aug 2005 15:31:42 -0600: > ... > Some code may require that str() returns a str instance. In the > standard library, only one such case has been found so far. The > function email.header_decode() requires a str instance and the > email.Header.decode_header() function tries to ensure this by > calling str() on its argument. The code was fixed by changing > the line "header = str(header)" to: > > if isinstance(header, unicode): > header = header.encode('ascii') Note, that this is not equivalent to the old "str(header)": "str(header)" used Python's "default encoding" while the new code uses 'ascii'. The new code might be more correct than the old one has been. > ... > Alternative Solutions > > A new built-in function could be added instead of changing str(). > Doing so would introduce virtually no backwards compatibility > problems. However, since the compatibility problems are expected to > rare, changing str() seems preferable to adding a new built-in. Can we get a new builtin with the exact same behaviour as the current "str" which can be used when we do require an "str" (and cannot use a "unicode"). Dieter From gvwilson at cs.utoronto.ca Wed Aug 24 14:05:23 2005 From: gvwilson at cs.utoronto.ca (Greg Wilson) Date: Wed, 24 Aug 2005 08:05:23 -0400 (EDT) Subject: [Python-Dev] [Argon] Re: 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430C48F9.8060801@v.loewis.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> Message-ID: Hi Martin (and everyone else); thanks for your mail. 
The N*N/2 invocations would explain why we saw such a large number of invocations --- thanks for figuring it out. W.r.t. how we're invoking our script: > > But if you're using CGI, you're importing your source on every > > invocation. > > Well, no. Only the CGI script needs to be parsed every time; all modules > could load off bytecode files. > > Which suggests that Keir Mierle doesn't use bytecode files, I think he > should. Yes, mod_python and .pyc's are the obvious way to go --- once the code actually works ;-). I just wanted students to have as few moving parts as possible while debugging. Thanks again, Greg From gvwilson at cs.utoronto.ca Wed Aug 24 20:20:59 2005 From: gvwilson at cs.utoronto.ca (Greg Wilson) Date: Wed, 24 Aug 2005 14:20:59 -0400 (EDT) Subject: [Python-Dev] [Argon] Re: 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???) In-Reply-To: <430CB987.5000601@livinglogic.de> References: <20050823201021.GE32195@cs.toronto.edu> <430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de> <430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de> <430CB0AE.1040201@v.loewis.de> <430CB987.5000601@livinglogic.de> Message-ID: > > Walter Dörwald wrote: > >>At least it would remove the quadratic number of calls to > >>_PyUnicodeUCS2_IsLinebreak(). For each character it would be called only > >>once. > Martin v. Löwis wrote: > > Correct. However, I very much doubt that this is the cause of the > > slowdown. > Walter Dörwald wrote: > Probably. We'd need a test with the original Argon source to really know. We can do that. > OK, so should we add this for 2.4.2 or only for 2.5?
2.4.2 please ;-) Thanks, Greg From gvanrossum at gmail.com Thu Aug 25 19:01:33 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu, 25 Aug 2005 10:01:33 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <000a01c5a98d$e1c20940$b729cb97@oemcomputer> References: <000a01c5a98d$e1c20940$b729cb97@oemcomputer> Message-ID: On 8/25/05, Raymond Hettinger wrote: >[...] AFAICT, there is no other way to broadly > catch exceptions not derived from Exception. But there is rarely a need to do so. I bet you that 99 out of 100 bare excepts in the stdlib could be replaced by "except Exception" without breaking anything, since they only expect a wide variety of standard exceptions, and don't care about string exceptions or user exceptions. The exception is the first of the two bare except: clauses in code.py. > The effort is only > warranted if it makes the code better -- but here nothing is currently > broken and the new code will be much less attractive and less readable > (if the changes are done correctly); only 3.0 will offer the tools to do > it readably and beautifully. Please explain? If 9 out of 10 bare excepts can safely be replaced by "except Exception", what's not beautiful about that? > Also, as we learned with apply(), even if > ignored, the deprecation machinery has a tremendous runtime cost. None > of this will make upgrading to Py2.5 an attractive option. Not in this case; bare except: can be flagged by the parser so the warning happens only once per compilation. > There is a reason that over 120 bare except clauses remain in the > standard library despite a number of attempts to get rid of them. I betcha almost all of them can safely be replaced with "except Exception". > Besides, I thought Guido was opposed to efforts to roam through > mountains of code, making alterations in a non-holistic way. This is trumped by the need to keep the standard library warning-free.
But how about the following compromise: make it a silent deprecation in 2.5, and a full deprecation in 2.6. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nas at arctrix.com Thu Aug 25 19:03:32 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Thu, 25 Aug 2005 11:03:32 -0600 Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings In-Reply-To: References: Message-ID: <20050825170332.GA21225@mems-exchange.org> On Wed, Aug 24, 2005 at 09:11:18PM +0200, Dieter Maurer wrote: > Neil Schemenauer writes on Mon, 22 Aug 2005 15:31:42 -0600: > > The code was fixed by changing > > the line "header = str(header)" to: > > > > if isinstance(header, unicode): > > header = header.encode('ascii') > > Note, that this is not equivalent to the old "str(header)": > > "str(header)" used Python's "default encoding" while the > new code uses 'ascii'. It also doesn't call __str__ if the object is not a basestring instance. I have a hard time understanding the exact purpose of calling str() here. Maybe Barry can comment. > Can we get a new builtin with the exact same behaviour as > the current "str" which can be used when we do require an "str" > (and cannot use a "unicode"). The fact that no code in the standard library requires such a function (AFAIK) leads me to believe that it would not be useful enough to be made a built-in.
You would just write it yourself: def mystr(s): s = str(s) if isinstance(s, unicode): s = s.encode(sys.getdefaultencoding()) return s Cheers, Neil From reinhold-birkenfeld-nospam at wolke7.net Thu Aug 25 19:17:14 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Thu, 25 Aug 2005 19:17:14 +0200 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <000a01c5a98d$e1c20940$b729cb97@oemcomputer> References: <000a01c5a98d$e1c20940$b729cb97@oemcomputer> Message-ID: Raymond Hettinger wrote: >> Deprecation means your code will still work I hope every book that >> documents "except:" also adds "but don't use this except under very >> special circumstances". >> >> I think you're overreacting (again), Raymond. 3.0 will be much more >> successful if we can introduce many of its features into 2.x. Many of >> those features are in fact improvements of the language even if they >> break old code. We're trying to balance between breaking old code and >> introducing new features; deprecation is the accepted way to do this. > Fredrik, please speak up. Someone should represent the users here. I'm > reached my limit on how much time I can devote to thinking out the > implications of these proposals. Someone else needs to "overreact". Perhaps I may add a pragmatic POV (yes, I know that "pragmatic" is usually attributed to another language ;-). If "except:" issues a deprecation warning in 2.5, many people will come and say "woohoo, Python breaks backwards compatibility" and "I knew it, Python is unreliable, my script issues 1,233 warnings now" and such. You can see this effect looking at the discussion that broke out when Guido announced that map, filter and reduce would vanish (as builtins) in 3.0. People spoke up and said, "if that's going to be the plan, I'll stop using Python" etc. That said, I think that unless it is a new feature (like with statements) transitions to Python 3.0 shouldn't be enforced in the 2.x series. 
With 3.0, everyone expects a clear cut and a compatibility breach. Reinhold -- Mail address is perfectly valid! From raymond.hettinger at verizon.net Thu Aug 25 19:28:15 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu, 25 Aug 2005 13:28:15 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: Message-ID: <000a01c5a99a$61080360$b729cb97@oemcomputer> > > Also, as we learned with apply(), even if > > ignored, the deprecation machinery has a tremendous runtime cost. None > > of this will make upgrading to Py2.5 an attractive option. > > Not in this case; bare except: can be flagged by the parser so the > warning happens only once per compilation. That's good news. It mitigates runtime cost completely. > > There is a reason that over 120 bare except clauses remain in the > > standard library despite a number of attempts to get rid of them. > > I betcha almost all of then can safely be replaced with "except > Exception". Because the tree is not being re-arranged until 3.0, those cases should also introduce a preceding: except (KeyboardInterrupt, SystemExit): raise Anywhere that doesn't apply will need: except BaseException: . . . and also some corresponding backwards compatibility code to work with older pythons. If any are expected to work with user or third-party modules, then they cannot safely ignore string exceptions and exceptions not derived from Exception. Each of those changes needs to be accompanied by test cases so that all code paths get exercised. After the change, we should run Zope, Twisted, Gadfly, etc to make sure no major application got broken. Long running apps should verify that their recover and restart routines haven't been compromised. This is doubly true if the invariant for a bare except was being relied upon as a security measure (this may or may not be a real issue). > But how about the following compromise: make it a silent deprecation > in 2.5, and a full deprecation in 2.6. 
I'd love to compromise but it's your language. If you're going to deprecate, just do it. Pulling the band-aid off slowly doesn't lessen the total pain. My preference is of course, to leave 2.x alone and make this part of the attraction to 3.0. Remember, none of the code changes buys us anything in 2.x. It is an exercise without payoff. My even stronger preference is to leave bare excepts in for Py3.0. That buys us a happy world where old code continues to work and new code can be written that functions as intended on all pythons new and old. I'm no fan of bare exceptions, but I'm not inclined to shoot myself in the foot to be rid of them. I wish Fredrik would chime in. He would have something pithy, angry, and incisive to say about this. Raymond From Scott.Daniels at Acm.Org Thu Aug 25 19:30:22 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Thu, 25 Aug 2005 10:30:22 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <000d01c5a958$566e1bc0$b729cb97@oemcomputer> References: <000d01c5a958$566e1bc0$b729cb97@oemcomputer> Message-ID: Raymond Hettinger wrote: >... I propose that the transition plan be as simple as introducing > BaseException. This allows people to write code that will work on both > 2.x and 3.0. It doesn't break anything. > > The guidance for cross-version (2.5 to 3.0) code would be: > > * To catch all but terminating exceptions, write: > > except (KeyError, SystemExit): > raise > except Exception: > ... How about: except BaseException, error: if not isinstance(error, Exception): raise ... This would accommodate other invented exceptions such as "FoundConvergance(BaseException)", which is my pseudo-example for an exiting exception that is not properly a subclass of either KeyError or SystemExit. The idea is a relaxation stops when it doesn't move and may start generating something silly like divide-by-zero. Not the end of an App, but the end of a Phase.
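The filtering idiom suggested above can be sketched as follows. This is only an illustration under stated assumptions: run_phase is a hypothetical helper, FoundConvergance reuses the pseudo-example name from the message, and the handler uses the "as" spelling so the sketch runs on a current Python 3 interpreter (the message's "except BaseException, error:" is the 2.x form).

```python
class FoundConvergance(BaseException):
    """Pseudo-example from the message: an exiting exception that is
    deliberately not a subclass of Exception."""


def run_phase(step):
    """Hypothetical helper: run step, handling only ordinary errors."""
    try:
        step()
    except BaseException as error:  # 2.x spelling: "except BaseException, error:"
        if not isinstance(error, Exception):
            # Exiting or control-flow exceptions pass through untouched.
            raise
        return "handled %s" % type(error).__name__
    return "ok"
```

With this shape a ZeroDivisionError raised inside step is absorbed and reported, while FoundConvergance, SystemExit and KeyboardInterrupt all propagate to the caller without having to be listed one by one.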
--Scott David Daniels Scott.Daniels at Acm.Org From gvanrossum at gmail.com Thu Aug 25 19:43:45 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu, 25 Aug 2005 10:43:45 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <000a01c5a99a$61080360$b729cb97@oemcomputer> References: <000a01c5a99a$61080360$b729cb97@oemcomputer> Message-ID: On 8/25/05, Raymond Hettinger wrote: > I wish Fredrik would chime in. He would > have something pithy, angry, and incisive to say about this. Raymond, I'm sick of the abuse. Consider the PEP rejected. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mcherm at mcherm.com Thu Aug 25 19:52:19 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Thu, 25 Aug 2005 10:52:19 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 Message-ID: <20050825105219.o99mjihmuycgckos@login.werra.lunarpages.com> Guido: > But how about the following compromise: make it a silent deprecation > in 2.5, and a full deprecation in 2.6. Reinhold Birkenfeld: > That said, I think that unless it is a new feature (like with statements) > transitions to Python 3.0 shouldn't be enforced in the 2.x series. With 3.0, > everyone expects a clear cut and a compatibility breach. Raymond: > I'd love to compromise but it's your language. If you're going to > deprecate, just do it. Pulling the band-aid off slowly doesn't lessen > the total pain. There are actually THREE possible levels of deprecation available. In order of severity, they are: 1. Modifying the documentation to advise people to avoid this feature. No one gets alerted. 2. Using a PendingDeprecationWarning so people who explicitly request it can have the compiler alert them when they use it. 3. Using a DeprecationWarning so people using it are alerted unless they explicitly request NOT to be alerted. I think 3 is unwarranted in this case.
For reasons I explained in a previous posting, I would be in favor of 2 if we can *also* have a PendingDeprecationWarning for use of string exceptions and arbitrary-object exceptions (those not derived from BaseException). I am in favor of 3 in any case. Of course, that's just one person's opinion... Raymond also raised this excellent point: > There is a reason that over 120 bare except clauses remain in the > standard library despite a number of attempts to get rid of them. [...] > If the proponents don't have time to fix the standard library, how can > they in good conscience mandate change for the rest of the world. That seems like a fair criticism to me. As we've already noted, it is impossible to replace ALL uses of bare "except:" in 2.5 (particularly the From mcherm at mcherm.com Thu Aug 25 19:55:30 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Thu, 25 Aug 2005 10:55:30 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 Message-ID: <20050825105530.ev2xwy7r754wogo0@login.werra.lunarpages.com> [PLEASE IGNORE PREVIOUS EMAIL... I HIT [Send] BY MISTAKE] Guido: > But how about the following compromise: make it a silent deprecation > in 2.5, and a full deprecation in 2.6. Reinhold Birkenfeld: > That said, I think that unless it is a new feature (like with statements) > transitions to Python 3.0 shouldn't be enforced in the 2.x series. With 3.0, > everyone expects a clear cut and a compatibility breach. Raymond: > I'd love to compromise but it's your language. If you're going to > deprecate, just do it. Pulling the band-aid off slowly doesn't lessen > the total pain. There are actually THREE possible levels of deprecation available. In order of severity, they are: 1. Modifying the documentation to advise people to avoid this feature. No one gets alerted. 2. Using a PendingDeprecationWarning so people who explicitly request it can have the compiler alert them when they use it. 3. 
Using a DeprecationWarning so people using it are alerted unless they explicitly request NOT to be alerted. I think 3 is unwarranted in this case. For reasons I explained in a previous posting, I would be in favor of 2 if we can *also* have a PendingDeprecationWarning for use of string exceptions and arbitrary-object exceptions (those not derived from BaseException). I am in favor of 3 in any case. Of course, that's just one person's opinion... Raymond also raised this excellent point: > There is a reason that over 120 bare except clauses remain in the > standard library despite a number of attempts to get rid of them. [...] > If the proponents don't have time to fix the standard library, how can > they in good conscience mandate change for the rest of the world. That seems like a fair criticism to me. As we've already noted, it is impossible to replace ALL uses of bare "except:" in 2.5 (particularly the use in code.py that Guido referred to). But we ought to make an extra effort to remove unnecessary uses of bare "except:" from the standard library if we intend to deprecate it. -- Michael Chermside From steve at holdenweb.com Thu Aug 25 20:30:53 2005 From: steve at holdenweb.com (Steve Holden) Date: Thu, 25 Aug 2005 14:30:53 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: References: <000a01c5a99a$61080360$b729cb97@oemcomputer> Message-ID: <430E0E5D.8010503@holdenweb.com> Guido van Rossum wrote: > On 8/25/05, Raymond Hettinger wrote: > > >>I wish Fredrik would chime in. He would >>have something pithy, angry, and incisive to say about this. > > > Raymond, I'm sick of the abuse. Consider the PEP rejected. > Perhaps you should go for the £10 argument next door?
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ From rrr at ronadam.com Thu Aug 25 20:33:35 2005 From: rrr at ronadam.com (Ron Adam) Date: Thu, 25 Aug 2005 14:33:35 -0400 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <000a01c5a98d$e1c20940$b729cb97@oemcomputer> References: <000a01c5a98d$e1c20940$b729cb97@oemcomputer> Message-ID: <430E0EFF.1080707@ronadam.com> Raymond Hettinger wrote: >>Deprecation means your code will still work I hope every book that >>documents "except:" also adds "but don't use this except under very >>special circumstances". >> >>I think you're overreacting (again), Raymond. 3.0 will be much more >>successful if we can introduce many of its features into 2.x. Many of >>those features are in fact improvements of the language even if they >>break old code. We're trying to balance between breaking old code and >>introducing new features; deprecation is the accepted way to do this. > Fredrik, please speak up. Someone should represent the users here. I'm > reached my limit on how much time I can devote to thinking out the > implications of these proposals. Someone else needs to "overreact". How about a middle of the road (or there abouts) opinion from an average user?
Just my 2 cents anyways. I get the impression that just how much existing code will work or not work in 3.0 is still fairly up in the air. Python 3.0 is still quite a ways off from what I understand. So to me, deprecating anything at this time that's not going to be removed *before* Python 3.0 is possibly jumping the gun a bit. (IMHO) It definitely makes sense to deprecate anything that will be removed prior to Python 3.0. And to also document anything that will be changed in 3.0 (but not deprecate it yet). If/when it is decided (maybe it already has) that a smooth transition can be made between 2.x and 3.0 with a high degree of backwards compatibility, then deprecating 2.x features that will be removed from 3.0 makes sense at some point but maybe not in 2.5. If it turns out that the amount of changes in 3.0 are such as to be a "New but non backwards compatible version of Python" with a lot of really great new features, then deprecating items in 2.x that will not be removed from 2.x seems like it gives a sense of false hope. It might be better to just document the differences (but not deprecate them) and make a clean break. Or to put it another way... having a lot of deprecated items in the final 2.x version may give a message that 2.x is flawed, yet it may not be possible for many programs to move to 3.0 easily for some time if there are a lot of large changes. My opinion is... I would rather see the final version of 2.x not have any deprecated items and efforts be made to make it the best and most dependable 2.x version that will be around for a while. And then have Python 3.0 be a new beginning and an open book without the backwards compatible chains holding it back. That doesn't mean it won't be, I think it's just too soon to tell to what degree. At this time the efforts towards 3.0 seem to be towards those improvements that may be included in some future version of 2.x which is great. Is it possible the big changes have yet to be considered for Python 3.0?
Cheers, Ron From ianb at colorstudy.com Thu Aug 25 21:10:43 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 25 Aug 2005 14:10:43 -0500 Subject: [Python-Dev] PEP 342: simple example, closure alternative Message-ID: <430E17B3.3080900@colorstudy.com> I was trying to translate a pattern that uses closures in a language like Scheme (where closed values can be written to) to generators using PEP 342, but I'm not clear exactly how it works; the examples in the PEP have different motivations. Since I can't actually run these examples, perhaps someone could confirm or debug these: A closure based accumulator (using Scheme): (define (accum n) (lambda (incr) (set! n (+ n incr)) n)) (define s (accum 0)) (s 1) ; -> 1 == 0+1 (s 5) ; -> 6 == 1+5 So I thought the generator version might look like: def accum(n): while 1: incr = (yield n) or 0 n += incr >>> s = accum(0) >>> s.next() >>> s.send(1) 0 >>> s.send(5) 1 >>> s.send(1) 6 Is the order of the output correct? Is there a better way to write accum, that makes it feel more like the closure-based version? Is this for loop correct? >>> s = accum(0) >>> for i in s: ... if i >= 10: break ... print i, ... assert s.send(2) == i 0 2 4 6 8 Hmm... maybe this would make it feel more closure-like: def closure_like(func): def replacement(*args, **kw): return ClosureLike(func(*args, **kw)) return replacement class ClosureLike(object): def __init__(self, iterator): self.iterator = iterator # I think this initial .next() is required, but I'm # very confused on this point: assert self.iterator.next() is None def __call__(self, input): assert self.iterator.send(input) is None return self.iterator.next() @closure_like def accum(n): while 1: # yields should always be in pairs, the first yield is input # and the second yield is output. incr = (yield) # this line is equivalent to (lambda (incr)... n += incr # equivalent to (set! ...) 
yield n # equivalent to n; this yield always returns None >>> s = accum(0) >>> s(1) 1 >>> s(5) 6 Everything before the first (yield) is equivalent to the closed values between "(define (accum n)" and "(lambda" (for this example there's nothing there; I guess a more interesting example would have closed variables that were written to that were not function parameters). -- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From pje at telecommunity.com Thu Aug 25 21:23:10 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 25 Aug 2005 15:23:10 -0400 Subject: [Python-Dev] PEP 342: simple example, closure alternative In-Reply-To: <430E17B3.3080900@colorstudy.com> Message-ID: <5.1.1.6.0.20050825151456.028cf830@mail.telecommunity.com> At 02:10 PM 8/25/2005 -0500, Ian Bicking wrote: >I was trying to translate a pattern that uses closures in a language >like Scheme (where closed values can be written to) to generators using >PEP 342, but I'm not clear exactly how it works; the examples in the PEP >have different motivations. Since I can't actually run these examples, >perhaps someone could confirm or debug these: > >A closure based accumulator (using Scheme): > >(define (accum n) > (lambda (incr) > (set! n (+ n incr)) > n)) >(define s (accum 0)) >(s 1) ; -> 1 == 0+1 >(s 5) ; -> 6 == 1+5 > >So I thought the generator version might look like: > >def accum(n): > while 1: > incr = (yield n) or 0 > n += incr > > >>> s = accum(0) > >>> s.next() The initial next() will yield 0, not None. > >>> s.send(1) >0 1 > >>> s.send(5) >1 6 > >>> s.send(1) >6 7 >Is the order of the output correct? Is there a better way to write >accum, that makes it feel more like the closure-based version? > >Is this for loop correct? > > >>> s = accum(0) > >>> for i in s: >... if i >= 10: break >... print i, >... assert s.send(2) == i >0 2 4 6 8 The assert will fail on the first pass. s.send(2) will == i+2, e.g.: >>> s = accum(0) >>> for i in s: ... if i>=10: break ... 
...     print i,
...     assert s.send(2) == i+2
...
0 2 4 6 8

From ianb at colorstudy.com Thu Aug 25 22:12:35 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 25 Aug 2005 15:12:35 -0500
Subject: [Python-Dev] PEP 342: simple example, closure alternative
In-Reply-To: <5.1.1.6.0.20050825151456.028cf830@mail.telecommunity.com>
References: <5.1.1.6.0.20050825151456.028cf830@mail.telecommunity.com>
Message-ID: <430E2633.1020902@colorstudy.com>

Phillip J. Eby wrote:
> At 02:10 PM 8/25/2005 -0500, Ian Bicking wrote:
>
>>I was trying to translate a pattern that uses closures in a language
>>like Scheme (where closed values can be written to) to generators using
>>PEP 342, but I'm not clear exactly how it works; the examples in the PEP
>>have different motivations.  Since I can't actually run these examples,
>>perhaps someone could confirm or debug these:
>>
>>A closure based accumulator (using Scheme):
>>
>>(define (accum n)
>>  (lambda (incr)
>>    (set! n (+ n incr))
>>    n))
>>(define s (accum 0))
>>(s 1) ; -> 1 == 0+1
>>(s 5) ; -> 6 == 1+5
>>
>>So I thought the generator version might look like:
>>
>>def accum(n):
>>    while 1:
>>        incr = (yield n) or 0
>>        n += incr

Bah, I don't know why this had me so confused.  Well, I kind of know
why.  So maybe this example would be better written:

def accum(n):
    incr = yield # wait to get the first incr to be sent in
    while 1:
        n += incr
        incr = yield n # return the new value, wait for next incr

This way it is more explicit all around -- the first call to .next() is
just setup, kind of like __init__ in an object, except it has to be
explicitly invoked.
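A minimal sketch of how the revised accum might be driven, assuming the
PEP 342 send() semantics discussed in this thread.  It uses the next(s)
builtin and "while True" of later Python versions rather than the
s.next() spelling used above; that spelling is an assumption about the
reader's interpreter, not something taken from the thread.

```python
def accum(n):
    # Setup step: suspend at a bare yield until the first increment arrives.
    incr = yield
    while True:
        n += incr
        # Hand back the running total, then wait for the next increment.
        incr = yield n


s = accum(0)
next(s)            # explicit setup call; runs up to the first yield
print(s.send(1))   # 1 == 0+1
print(s.send(5))   # 6 == 1+5
```

The explicit next(s) plays the role of the "setup" call described above:
sending a non-None value into a generator that has not yet reached its
first yield raises TypeError, so the priming step cannot be skipped.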
-- Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org From skip at pobox.com Thu Aug 25 22:23:51 2005 From: skip at pobox.com (skip@pobox.com) Date: Thu, 25 Aug 2005 15:23:51 -0500 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com> <5BB88A76-FB4E-41F7-B82D-4B7C5B6D28DD@fuhm.net> Message-ID: <17166.10455.121002.213724@montanaro.dyndns.org> Guido> It's never too early to start deprecating a feature we know will Guido> disappear in 3.0. Though if it's a widely used feature the troops will be highly annoyed by all the deprecation warnings. (Or does deprecation not coincide with emitting warnings?) Skip From skip at pobox.com Thu Aug 25 22:45:00 2005 From: skip at pobox.com (skip@pobox.com) Date: Thu, 25 Aug 2005 15:45:00 -0500 Subject: [Python-Dev] Style for raising exceptions (python-dev Summary for 2005-08-01 through 2005-08-15 [draft]) In-Reply-To: <430D90FF.6060206@egenix.com> References: <430D90FF.6060206@egenix.com> Message-ID: <17166.11724.133034.374929@montanaro.dyndns.org> MAL> I must have missed this one: That's because it was brief and to the point, so the discussion lasted for maybe three messages. Also, someone told us you were on holiday so we thought we could squeak it through without you noticing. Darn those Aussies. Late on the pydev summary again! >> ---------------------------- >> Style for raising exceptions >> ---------------------------- >> >> Guido explained that these days exceptions should always be raised as:: >> >> raise SomeException("some argument") >> >> instead of:: >> >> raise SomeException, "some argument" >> >> The second will go away in Python 3.0, and is only present now for >> backwards compatibility. (It was necessary when strings could be >> exceptions, in order to pass both the exception "type" and message.) >> PEPs 8_ and 3000_ were accordingly updated. I do have a followup question on the style thing. 
(I'll leave others to answer MAL's question about optimization.) If I want to raise an exception without an argument, which of the following is the proper form? raise ValueError raise ValueError() Skip From gvanrossum at gmail.com Fri Aug 26 01:59:31 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu, 25 Aug 2005 16:59:31 -0700 Subject: [Python-Dev] Style for raising exceptions (python-dev Summary for 2005-08-01 through 2005-08-15 [draft]) In-Reply-To: <17166.11724.133034.374929@montanaro.dyndns.org> References: <430D90FF.6060206@egenix.com> <17166.11724.133034.374929@montanaro.dyndns.org> Message-ID: On 8/25/05, skip at pobox.com wrote: > I do have a followup question on the style thing. (I'll leave others to > answer MAL's question about optimization.) If I want to raise an exception > without an argument, which of the following is the proper form? > > raise ValueError > raise ValueError() The latter. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Aug 26 02:04:44 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu, 25 Aug 2005 17:04:44 -0700 Subject: [Python-Dev] Bare except clauses in PEP 348 In-Reply-To: <17166.10455.121002.213724@montanaro.dyndns.org> References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com> <5BB88A76-FB4E-41F7-B82D-4B7C5B6D28DD@fuhm.net> <17166.10455.121002.213724@montanaro.dyndns.org> Message-ID: On 8/25/05, skip at pobox.com wrote: > > Guido> It's never too early to start deprecating a feature we know will > Guido> disappear in 3.0. > > Though if it's a widely used feature the troops will be highly annoyed by > all the deprecation warnings. (Or does deprecation not coincide with > emitting warnings?) See Michael Chermside's post. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ark at acm.org Fri Aug 26 05:18:43 2005 From: ark at acm.org (Andrew Koenig) Date: Thu, 25 Aug 2005 23:18:43 -0400 Subject: [Python-Dev] PEP 342: simple example, closure alternative In-Reply-To: <430E17B3.3080900@colorstudy.com> Message-ID: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop> > A closure based accumulator (using Scheme): > > (define (accum n) > (lambda (incr) > (set! n (+ n incr)) > n)) > (define s (accum 0)) > (s 1) ; -> 1 == 0+1 > (s 5) ; -> 6 == 1+5 > > So I thought the generator version might look like: > > def accum(n): > while 1: > incr = (yield n) or 0 > n += incr Maybe I'm missing something but this example seems needlessly tricky to me. How about doing it this way? def accum(n): acc = [n] def f(incr): acc[0] += incr return acc[0] return f Here, the [0] turns "read-only" access into write access to a list element. The list itself isn't written; only its element is. From ianb at colorstudy.com Fri Aug 26 06:59:42 2005 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 25 Aug 2005 23:59:42 -0500 Subject: [Python-Dev] PEP 342: simple example, closure alternative In-Reply-To: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop> References: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop> Message-ID: <430EA1BE.9090804@colorstudy.com> Andrew Koenig wrote: >>A closure based accumulator (using Scheme): >> >>(define (accum n) >> (lambda (incr) >> (set! n (+ n incr)) >> n)) >>(define s (accum 0)) >>(s 1) ; -> 1 == 0+1 >>(s 5) ; -> 6 == 1+5 >> >>So I thought the generator version might look like: >> >>def accum(n): >> while 1: >> incr = (yield n) or 0 >> n += incr > > > Maybe I'm missing something but this example seems needlessly tricky to me. > How about doing it this way? > > def accum(n): > acc = [n] > def f(incr): > acc[0] += incr > return acc[0] > return f > > Here, the [0] turns "read-only" access into write access to a list element. > The list itself isn't written; only its element is. 
I was just exploring how it could be done with coroutines.  But also
because using lists as pointers isn't that elegant, and isn't something
I'd encourage people to do coming from other languages (where closures
are used more heavily).

More generally, I've been doing some language comparisons, and I don't
like literal but non-idiomatic translations of programming patterns.  So
I'm considering better ways to translate some of the same use cases.

--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org

From bcannon at gmail.com Fri Aug 26 08:01:33 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 25 Aug 2005 23:01:33 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To:
References: <000a01c5a99a$61080360$b729cb97@oemcomputer>
Message-ID:

The PEP has been rejected.

-Brett

On 8/25/05, Guido van Rossum wrote:
> On 8/25/05, Raymond Hettinger wrote:
>
> > I wish Fredrik would chime in.  He would
> > have something pithy, angry, and incisive to say about this.
>
> Raymond, I'm sick of the abuse. Consider the PEP rejected.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)

From mal at egenix.com Fri Aug 26 10:12:12 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 26 Aug 2005 10:12:12 +0200
Subject: [Python-Dev] Style for raising exceptions (python-dev Summary for 2005-08-01 through 2005-08-15 [draft])
In-Reply-To:
References: <430D90FF.6060206@egenix.com>
Message-ID: <430ECEDC.7040206@egenix.com>

Guido van Rossum wrote:
> On 8/25/05, M.-A.
Lemburg wrote: > >>I must have missed this one: >> >> >>>---------------------------- >>>Style for raising exceptions >>>---------------------------- >>> >>>Guido explained that these days exceptions should always be raised as:: >>> >>> raise SomeException("some argument") >>> >>>instead of:: >>> >>> raise SomeException, "some argument" >>> >>>The second will go away in Python 3.0, and is only present now for backwards >>>compatibility. (It was necessary when strings could be exceptions, in >>>order to pass both the exception "type" and message.) PEPs 8_ and 3000_ >>>were accordingly updated. >> >>AFAIR, the second form was also meant to be able to defer >>the instantiation of the exception class until really >>needed in order to reduce the overhead related to raising >>exceptions in Python. >> >>However, that optimization never made it into the implementation, >>I guess. > > > Something equivalent is used internally in the C code, but that > doesn't mean we'll need it in Python code. The optimization only works > if the exception is also *caught* in C code, BTW (it is instantiated > as soon as it is handled by a Python except clause). Ah, I knew it was in there somewhere (just couldn't find yesterday when I was looking for the optimization :-). > Originally, the second syntax was the only available syntax, because > all we had were string exceptions. Now that string exceptions are dead > (although not yet buried :) I really don't see why we need to keep > both versions of the syntax; Python 3.0 will only have one version. Actually, we do only have one version: the first syntax is just a special case of the second (with the value argument set to None). I don't see a need for two or more syntaxes either, but most code nowadays uses the second variant (I don't know of any code that uses the traceback argument), which puts up a high barrier for changes. 
This is from a comment in ceval.c:

	/* We support the following forms of raise:
	   raise <class>, <classinstance>
	   raise <class>, <argument tuple>
	   raise <class>, None
	   raise <class>, <argument>
	   raise <classinstance>, None
	   raise <string>, <object>
	   raise <string>, None

	   An omitted second argument is the same as None.

	   In addition, raise <tuple>, <anything> is the same as
	   raising the tuple's first item (and it better have one!);
	   this rule is applied recursively.

	   Finally, an optional third argument can be supplied, which
	   gives the traceback to be substituted (useful when
	   re-raising an exception after examining it).  */

That's quite a list of combinations that will all break
in Python 3.0 if we only allow "raise <classinstance>".

I guess the reason for most code using the variant
"raise <class>, <argument>" is that it simply looks a lot
like the corresponding "except <class>, errorobj" clause.

> (We're still debating what to do with the traceback argument; wanna
> revive PEP 344?)
>
> If you need to raise exceptions fast, pre-instantiate an instance.

Ideally, I'd like Python to take care of such optimizations
rather than having to explicitly code for them:

If I write "raise ValueError, 'bad format'" and then catch
the error with just "except ValueError", there would be
no need for Python to actually instantiate the exception
object.

OTOH, lazy instantiation may have unwanted side-effects
(just like any lazy evaluation), e.g. the instantiation
could result in another exception to get raised.

Can't have 'em all, I guess.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 26 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
:::: From abkhd at hotmail.com Fri Aug 26 14:35:22 2005 From: abkhd at hotmail.com (A.B., Khalid) Date: Fri, 26 Aug 2005 12:35:22 +0000 Subject: [Python-Dev] operator.c for release24-maint and test_bz2 on Python 2.4.1 Message-ID: Hello there, The release24-maint check-ins for today contained this typo: =================================================================== RCS file: /cvsroot/python/python/dist/src/Modules/operator.c,v retrieving revision 2.29 retrieving revision 2.29.4.1 diff -u -d -r2.29 -r2.29.4.1 --- operator.c 4 Dec 2003 22:17:49 -0000 2.29 +++ operator.c 26 Aug 2005 06:43:16 -0000 2.29.4.1 @@ -267,6 +267,9 @@ itemgetterobject *ig; PyObject *item; + if (!_PyArg_NoKeywords("itemgetter()", kdws)) <----- kdws should be kwds + return NULL; + if (!PyArg_UnpackTuple(args, "itemgetter", 1, 1, &item)) return NULL; Also I wish to report that testBug1191043 of test_bz2 still fails in some cases on Python 2.4.1 on both WinXP Pro and Win98. Following is the output of the said test. #----------------------------Python 2.5a0------------------------------# # In intrepreted session mode #-----------------------------------------------------------------------------# $ python -i Python 2.5a0 (#65, Aug 26 2005, 14:57:28) [GCC 3.4.4 (mingw special)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>>from test import test_bz2 as t >>>t.test_main() testBug1191043 (test.test_bz2.BZ2FileTest) ... ok testIterator (test.test_bz2.BZ2FileTest) ... ok testModeU (test.test_bz2.BZ2FileTest) ... ok testOpenDel (test.test_bz2.BZ2FileTest) ... ok testOpenNonexistent (test.test_bz2.BZ2FileTest) ... ok testRead (test.test_bz2.BZ2FileTest) ... ok testRead100 (test.test_bz2.BZ2FileTest) ... ok testReadChunk10 (test.test_bz2.BZ2FileTest) ... ok testReadLine (test.test_bz2.BZ2FileTest) ... ok testReadLines (test.test_bz2.BZ2FileTest) ... ok testSeekBackwards (test.test_bz2.BZ2FileTest) ... ok testSeekBackwardsFromEnd (test.test_bz2.BZ2FileTest) ... 
ok testSeekForward (test.test_bz2.BZ2FileTest) ... ok testSeekPostEnd (test.test_bz2.BZ2FileTest) ... ok testSeekPostEndTwice (test.test_bz2.BZ2FileTest) ... ok testSeekPreStart (test.test_bz2.BZ2FileTest) ... ok testUniversalNewlinesCRLF (test.test_bz2.BZ2FileTest) ... ok testUniversalNewlinesLF (test.test_bz2.BZ2FileTest) ... ok testWrite (test.test_bz2.BZ2FileTest) ... ok testWriteChunks10 (test.test_bz2.BZ2FileTest) ... ok testWriteLines (test.test_bz2.BZ2FileTest) ... ok testXReadLines (test.test_bz2.BZ2FileTest) ... ok testCompress (test.test_bz2.BZ2CompressorTest) ... ok testCompressChunks10 (test.test_bz2.BZ2CompressorTest) ... ok testDecompress (test.test_bz2.BZ2DecompressorTest) ... ok testDecompressChunks10 (test.test_bz2.BZ2DecompressorTest) ... ok testDecompressUnusedData (test.test_bz2.BZ2DecompressorTest) ... ok testEOFError (test.test_bz2.BZ2DecompressorTest) ... ok test_Constructor (test.test_bz2.BZ2DecompressorTest) ... ok testCompress (test.test_bz2.FuncTest) ... ok testDecompress (test.test_bz2.FuncTest) ... ok testDecompressEmpty (test.test_bz2.FuncTest) ... ok testDecompressIncomplete (test.test_bz2.FuncTest) ... ok ---------------------------------------------------------------------- Ran 33 tests in 4.730s OK #----------------------------Python 2.4.1 from CVS ----------------# # In intrepreted session mode #-----------------------------------------------------------------------------# $ python -i Python 2.4.1 (#65, Aug 26 2005, 14:38:48) [GCC 3.4.4 (mingw special)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>>from test import test_bz2 as t >>>t.test_main() testBug1191043 (test.test_bz2.BZ2FileTest) ... ok testIterator (test.test_bz2.BZ2FileTest) ... ok testModeU (test.test_bz2.BZ2FileTest) ... ok testOpenDel (test.test_bz2.BZ2FileTest) ... ok testOpenNonexistent (test.test_bz2.BZ2FileTest) ... ok testRead (test.test_bz2.BZ2FileTest) ... ok testRead100 (test.test_bz2.BZ2FileTest) ... 
ok testReadChunk10 (test.test_bz2.BZ2FileTest) ... ok testReadLine (test.test_bz2.BZ2FileTest) ... ok testReadLines (test.test_bz2.BZ2FileTest) ... ok testSeekBackwards (test.test_bz2.BZ2FileTest) ... ok testSeekBackwardsFromEnd (test.test_bz2.BZ2FileTest) ... ok testSeekForward (test.test_bz2.BZ2FileTest) ... ok testSeekPostEnd (test.test_bz2.BZ2FileTest) ... ok testSeekPostEndTwice (test.test_bz2.BZ2FileTest) ... ok testSeekPreStart (test.test_bz2.BZ2FileTest) ... ok testUniversalNewlinesCRLF (test.test_bz2.BZ2FileTest) ... ok testUniversalNewlinesLF (test.test_bz2.BZ2FileTest) ... ok testWrite (test.test_bz2.BZ2FileTest) ... ok testWriteChunks10 (test.test_bz2.BZ2FileTest) ... ok testWriteLines (test.test_bz2.BZ2FileTest) ... ok testXReadLines (test.test_bz2.BZ2FileTest) ... ok testCompress (test.test_bz2.BZ2CompressorTest) ... ok testCompressChunks10 (test.test_bz2.BZ2CompressorTest) ... ok testDecompress (test.test_bz2.BZ2DecompressorTest) ... ok testDecompressChunks10 (test.test_bz2.BZ2DecompressorTest) ... ok testDecompressUnusedData (test.test_bz2.BZ2DecompressorTest) ... ok testEOFError (test.test_bz2.BZ2DecompressorTest) ... ok test_Constructor (test.test_bz2.BZ2DecompressorTest) ... ok testCompress (test.test_bz2.FuncTest) ... ok testDecompress (test.test_bz2.FuncTest) ... ok testDecompressEmpty (test.test_bz2.FuncTest) ... ok testDecompressIncomplete (test.test_bz2.FuncTest) ... ok ---------------------------------------------------------------------- Ran 33 tests in 5.060s OK So here we have a passing test_bz2 test when invoked from inside a running Python. #-------------------------- Python 2.4.1 from CVS -----------------# # Not in intrepreted session mode #-----------------------------------------------------------------------------# However, and in Python 2.4.1 the following happens when the test is not invoked from an interpreted session: $ python ../Lib/test/test_bz2.py testBug1191043 (__main__.BZ2FileTest) ... 
ERROR ERROR testIterator (__main__.BZ2FileTest) ... ok testModeU (__main__.BZ2FileTest) ... ok testOpenDel (__main__.BZ2FileTest) ... ok testOpenNonexistent (__main__.BZ2FileTest) ... ok testRead (__main__.BZ2FileTest) ... ok testRead100 (__main__.BZ2FileTest) ... ok testReadChunk10 (__main__.BZ2FileTest) ... ok testReadLine (__main__.BZ2FileTest) ... ok testReadLines (__main__.BZ2FileTest) ... ok testSeekBackwards (__main__.BZ2FileTest) ... ok testSeekBackwardsFromEnd (__main__.BZ2FileTest) ... ok testSeekForward (__main__.BZ2FileTest) ... ok testSeekPostEnd (__main__.BZ2FileTest) ... ok testSeekPostEndTwice (__main__.BZ2FileTest) ... ok testSeekPreStart (__main__.BZ2FileTest) ... ok testUniversalNewlinesCRLF (__main__.BZ2FileTest) ... ok testUniversalNewlinesLF (__main__.BZ2FileTest) ... ok testWrite (__main__.BZ2FileTest) ... ok testWriteChunks10 (__main__.BZ2FileTest) ... ok testWriteLines (__main__.BZ2FileTest) ... ok testXReadLines (__main__.BZ2FileTest) ... ok testCompress (__main__.BZ2CompressorTest) ... ok testCompressChunks10 (__main__.BZ2CompressorTest) ... ok testDecompress (__main__.BZ2DecompressorTest) ... ok testDecompressChunks10 (__main__.BZ2DecompressorTest) ... ok testDecompressUnusedData (__main__.BZ2DecompressorTest) ... ok testEOFError (__main__.BZ2DecompressorTest) ... ok test_Constructor (__main__.BZ2DecompressorTest) ... ok testCompress (__main__.FuncTest) ... ok testDecompress (__main__.FuncTest) ... ok testDecompressEmpty (__main__.FuncTest) ... ok testDecompressIncomplete (__main__.FuncTest) ... 
ok ====================================================================== ERROR: testBug1191043 (__main__.BZ2FileTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "../Lib/test/test_bz2.py", line 255, in testBug1191043 lines = bz2f.readlines() RuntimeError: wrong sequence of bz2 library commands used ====================================================================== ERROR: testBug1191043 (__main__.BZ2FileTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "../Lib/test/test_bz2.py", line 47, in tearDown os.unlink(self.filename) OSError: [Errno 13] Permission denied: '@test' ---------------------------------------------------------------------- Ran 33 tests in 6.210s FAILED (errors=2) Traceback (most recent call last): File "../Lib/test/test_bz2.py", line 357, in ? test_main() File "../Lib/test/test_bz2.py", line 353, in test_main FuncTest File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 290, in run_unittest run_suite(suite, testclass) File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 274, in run_suite raise TestFailed(msg) test.test_support.TestFailed: errors occurred; run in verbose mode for details #-------------------------- Python 2.5a0 from CVS -----------------# # Not in intrepreted session mode #-----------------------------------------------------------------------------# That problem disappears in Python 2.5a0: $ python ../Lib/test/test_bz2.py testBug1191043 (__main__.BZ2FileTest) ... ok testIterator (__main__.BZ2FileTest) ... ok testModeU (__main__.BZ2FileTest) ... ok testOpenDel (__main__.BZ2FileTest) ... ok testOpenNonexistent (__main__.BZ2FileTest) ... ok testRead (__main__.BZ2FileTest) ... ok testRead100 (__main__.BZ2FileTest) ... ok testReadChunk10 (__main__.BZ2FileTest) ... ok testReadLine (__main__.BZ2FileTest) ... ok testReadLines (__main__.BZ2FileTest) ... 
ok testSeekBackwards (__main__.BZ2FileTest) ... ok testSeekBackwardsFromEnd (__main__.BZ2FileTest) ... ok testSeekForward (__main__.BZ2FileTest) ... ok testSeekPostEnd (__main__.BZ2FileTest) ... ok testSeekPostEndTwice (__main__.BZ2FileTest) ... ok testSeekPreStart (__main__.BZ2FileTest) ... ok testUniversalNewlinesCRLF (__main__.BZ2FileTest) ... ok testUniversalNewlinesLF (__main__.BZ2FileTest) ... ok testWrite (__main__.BZ2FileTest) ... ok testWriteChunks10 (__main__.BZ2FileTest) ... ok testWriteLines (__main__.BZ2FileTest) ... ok testXReadLines (__main__.BZ2FileTest) ... ok testCompress (__main__.BZ2CompressorTest) ... ok testCompressChunks10 (__main__.BZ2CompressorTest) ... ok testDecompress (__main__.BZ2DecompressorTest) ... ok testDecompressChunks10 (__main__.BZ2DecompressorTest) ... ok testDecompressUnusedData (__main__.BZ2DecompressorTest) ... ok testEOFError (__main__.BZ2DecompressorTest) ... ok test_Constructor (__main__.BZ2DecompressorTest) ... ok testCompress (__main__.FuncTest) ... ok testDecompress (__main__.FuncTest) ... ok testDecompressEmpty (__main__.FuncTest) ... ok testDecompressIncomplete (__main__.FuncTest) ... ok ---------------------------------------------------------------------- Ran 33 tests in 5.880s OK Regards Khalid _________________________________________________________________ Express yourself instantly with MSN Messenger! Download today it's FREE! 
http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/ From reinhold-birkenfeld-nospam at wolke7.net Fri Aug 26 15:08:57 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Fri, 26 Aug 2005 15:08:57 +0200 Subject: [Python-Dev] operator.c for release24-maint and test_bz2 on Python 2.4.1 In-Reply-To: References: Message-ID: A.B., Khalid wrote: > Hello there, > > > The release24-maint check-ins for today contained this typo: > > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Modules/operator.c,v > retrieving revision 2.29 > retrieving revision 2.29.4.1 > diff -u -d -r2.29 -r2.29.4.1 > --- operator.c 4 Dec 2003 22:17:49 -0000 2.29 > +++ operator.c 26 Aug 2005 06:43:16 -0000 2.29.4.1 > @@ -267,6 +267,9 @@ > itemgetterobject *ig; > PyObject *item; > > + if (!_PyArg_NoKeywords("itemgetter()", kdws)) <----- kdws should be kwds > + return NULL; > + > if (!PyArg_UnpackTuple(args, "itemgetter", 1, 1, &item)) > return NULL; Thank you, that is corrected now. > However, and in Python 2.4.1 the following happens when the test is not > invoked from an interpreted session: > > $ python ../Lib/test/test_bz2.py > testBug1191043 (__main__.BZ2FileTest) ... ERROR > ERROR [...] 
> ====================================================================== > ERROR: testBug1191043 (__main__.BZ2FileTest) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "../Lib/test/test_bz2.py", line 255, in testBug1191043 > lines = bz2f.readlines() > RuntimeError: wrong sequence of bz2 library commands used > > ====================================================================== > ERROR: testBug1191043 (__main__.BZ2FileTest) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "../Lib/test/test_bz2.py", line 47, in tearDown > os.unlink(self.filename) > OSError: [Errno 13] Permission denied: '@test' > > ---------------------------------------------------------------------- > Ran 33 tests in 6.210s > > FAILED (errors=2) > Traceback (most recent call last): > File "../Lib/test/test_bz2.py", line 357, in ? > test_main() > File "../Lib/test/test_bz2.py", line 353, in test_main > FuncTest > File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 290, in > run_unittest > run_suite(suite, testclass) > File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 274, in > run_suite > raise TestFailed(msg) > test.test_support.TestFailed: errors occurred; run in verbose mode for > details Are you sure that you are calling the newly-built python.exe? It is strange that the test should pass in interactive mode when it doesn't in normal mode. 
For a confirmation, can you execute this piece of code both interactively and
from a file:

import os
from bz2 import BZ2File

# A small bz2 stream that decompresses to the single line 'Test'.
data = 'BZh91AY&SY\xd9b\x89]\x00\x00\x00\x03\x80\x04\x00\x02\x00\x0c\x00 \x00!\x9ah3M\x13<]\xc9\x14\xe1BCe\x8a%t'
f = open('test.bz2', "wb")
f.write(data)
f.close()

# First pass: read everything with readlines().
bz2f = BZ2File('test.bz2')
lines = bz2f.readlines()
bz2f.close()
assert lines == ['Test']

# Second pass: read the same file through xreadlines().
bz2f = BZ2File('test.bz2')
xlines = list(bz2f.xreadlines())
bz2f.close()
assert xlines == ['Test']
os.unlink('test.bz2')

Reinhold

--
Mail address is perfectly valid!

From mcherm at mcherm.com Fri Aug 26 15:15:17 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Fri, 26 Aug 2005 06:15:17 -0700
Subject: [Python-Dev] Style for raising exceptions (python-dev Summary for 2005-08-01 through 2005-08-15 [draft])
Message-ID: <20050826061517.jxvx356u9ow0wksw@login.werra.lunarpages.com>

Marc-Andre Lemburg writes:
> This is from a comment in ceval.c:
>
>       /* We support the following forms of raise:
>          raise <class>, <classinstance>
>          raise <class>, <argument tuple>
>          raise <class>, None
>          raise <class>, <argument>
>          raise <classinstance>, None
>          raise <string>, <object>
>          raise <string>, None
>
>          An omitted second argument is the same as None.
>
>          In addition, raise <tuple>, <anything> is the same as
>          raising the tuple's first item (and it better have one!);
>          this rule is applied recursively.
>
>          Finally, an optional third argument can be supplied, which
>          gives the traceback to be substituted (useful when
>          re-raising an exception after examining it).  */
>
> That's quite a list of combinations that will all break
> in Python 3.0 if we only allow "raise <classinstance>".

Oh my GOD! Are you saying that in order to correctly read Python code
that a programmer must know all of THAT! I would be entirely
unsurprised to learn that NO ONE on this list... in fact, no one
in the whole world could have reproduced that specification from
memory accurately. I have never seen a more convincing argument for
why we should allow only limited forms in Python 3.0.

And next time that I find myself in need of an obfuscated python entry,
I've got a great trick up my sleeve.
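To illustrate the limited form argued for here, a small sketch follows.
It assumes a Python version that accepts "except ... as" (at the time of
this thread the spelling was "except ValueError, exc"), and the helper
names parse_port, raise_bare and caught are hypothetical, made up for
this example.  It shows that raising a class and raising an instance
both reach the handler as an instance.

```python
def parse_port(text):
    # Hypothetical helper: raise with an explicit instance and message.
    if not text.isdigit():
        raise ValueError("bad format: %r" % (text,))
    return int(text)


def raise_bare():
    # Raising the bare class; the interpreter instantiates it with no arguments.
    raise ValueError


def caught(func, *args):
    # Call func and return the exception instance it raises, or None.
    try:
        func(*args)
    except ValueError as exc:
        return exc
    return None


print(caught(parse_port, "8o").args)   # ("bad format: '8o'",)
print(caught(raise_bare).args)         # ()
```

Either way the handler sees a ValueError instance, which is why a single
"raise SomeException(...)" spelling is enough for ordinary code.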
-- Michael Chermside From abkhd at hotmail.com Fri Aug 26 15:45:25 2005 From: abkhd at hotmail.com (A.B., Khalid) Date: Fri, 26 Aug 2005 13:45:25 +0000 Subject: [Python-Dev] test_bz2 on Python 2.4.1 Message-ID: Reinhold Birkenfeld wrote: >Are you sure that you are calling the newly-built python.exe? It is strange >that >the test should pass in interactive mode when it doesn't in normal mode. >For a confirmation, can you execute this piece of code both interactively >and >from a file: Yes, both Python's tested are fresh from CVS. Here is the output of the test you asked I run #---------------------- # File: testbz2.py #---------------------- """ import os from bz2 import BZ2File data = 'BZh91AY&SY\xd9b\x89]\x00\x00\x00\x03\x80\x04\x00\x02\x00\x0c\x00 \x00!\x9ah3M\x13<]\xc9\x14\xe1BCe\x8a%t' f = open('test.bz2', "wb") f.write(data) f.close() bz2f = BZ2File('test.bz2') lines = bz2f.readlines() bz2f.close() assert lines == ['Test'] bz2f = BZ2File('test.bz2') xlines = list(bz2f.xreadlines()) bz2f.close() assert lines == ['Test'] os.unlink('test.bz2') """ ------------- RESULTS: ------------- #--------------------------- Python 2.5a0 from CVS -----------------# # Result: passes $ /g/projs/py25/python/dist/src/MinGW/python testbz2.py #--------------------------- Python 2.4.1 from CVS -----------------# # Result: fails $ /g/projs/py24/python/dist/src/MinGW/python testbz2.py Traceback (most recent call last): File "testbz2.py", line 9, in ? lines = bz2f.readlines() RuntimeError: wrong sequence of bz2 library commands used #--------------------------- Python 2.4.1 from CVS -----------------# # Interpreted session: testbz2 fails here as well now $ /g/projs/py24/python/dist/src/MinGW/python -i Python 2.4.1 (#65, Aug 26 2005, 14:38:48) [GCC 3.4.4 (mingw special)] on win32 Type "help", "copyright", "credits" or "license" for more information. 
>>> import os
>>> from bz2 import BZ2File
>>> data = 'BZh91AY&SY\xd9b\x89]\x00\x00\x00\x03\x80\x04\x00\x02\x00\x0c\x00 \x00!\x9ah3M\x13<]\xc9\x14\xe1BCe\x8a%t'
>>> f = open('test.bz2', "wb")
>>> f.write(data)
>>> f.close()
>>> bz2f = BZ2File('test.bz2')
>>> lines = bz2f.readlines()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
RuntimeError: wrong sequence of bz2 library commands used
>>> raise SystemExit

#--------------------------- Python 2.5a0 from CVS -----------------#
# Interpreted session: testbz2 passes

$ /g/projs/py25/python/dist/src/MinGW/python -i
Python 2.5a0 (#65, Aug 26 2005, 14:57:28)
[GCC 3.4.4 (mingw special)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> from bz2 import BZ2File
>>> data = 'BZh91AY&SY\xd9b\x89]\x00\x00\x00\x03\x80\x04\x00\x02\x00\x0c\x00 \x00!\x9ah3M\x13<]\xc9\x14\xe1BCe\x8a%t'
>>> f = open('test.bz2', "wb")
>>> f.write(data)
>>> f.close()
>>> bz2f = BZ2File('test.bz2')
>>> lines = bz2f.readlines()
>>> bz2f.close()
>>> assert lines == ['Test']
>>> bz2f = BZ2File('test.bz2')
>>> xlines = list(bz2f.xreadlines())
>>> bz2f.close()
>>> assert lines == ['Test']
>>> os.unlink('test.bz2')
>>> raise SystemExit

Regards,
Khalid

From fdrake at acm.org Fri Aug 26 16:01:37 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 26 Aug 2005 10:01:37 -0400
Subject: [Python-Dev] Style for raising exceptions (python-dev Summary for 2005-08-01 through 2005-08-15 [draft])
In-Reply-To: <20050826061517.jxvx356u9ow0wksw@login.werra.lunarpages.com>
References: <20050826061517.jxvx356u9ow0wksw@login.werra.lunarpages.com>
Message-ID: <200508261001.37893.fdrake@acm.org>

On Friday 26 August 2005 09:15, Michael Chermside wrote:
> Oh my GOD!
Are you saying that in order to correctly read Python code
> that a programmer must know all of THAT! I would be entirely
> unsurprised to learn that NO ONE on this list... in fact, no one
> in the whole world could have reproduced that specification from
> memory accurately. I have never seen a more convincing argument for
> why we should allow only limited forms in Python 3.0.

No kidding. The stuff about the tuples is particularly painful, but is
specifically there to deal with string exceptions and the idiom that an
exception could be defined as a tuple of exceptions. In fact, anydbm is
particularly egregious: it defines an error class derived from Exception,
and then adds that to a tuple with the string exceptions from the specific
modules it fronts for. The tuple handling in raise allows anydbm.error to
be raised and then caught again abstractly, in addition to allowing
anydbm.error to act as a "base" exception that catches the specific errors
raised by the backend databases.

  -Fred

-- 
Fred L. Drake, Jr.

From paragate at gmx.net Tue Aug 23 14:59:28 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Tue, 23 Aug 2005 14:59:28 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings
In-Reply-To: <430AFCC7.9030402@egenix.com>
References: <20050822213142.GA5702@mems-exchange.org> <430AFCC7.9030402@egenix.com>
Message-ID:

just tested the proposed implementation on a unicode-naive module
basically using

    import sys
    import __builtin__
    reload( sys ); sys.setdefaultencoding( 'utf-8' )
    __builtin__.__dict__[ 'str' ] = new_str_function

et voilà, str() calls in the module are rewritten, and
print u'düsseldorf' does work as expected(*) (even on systems where
i have no access to sitecustomize, like at my python-friendly isp's
servers).

---
* my expectation is that unicode strings do print out as utf-8,
as i can't see any better solution.

i suggest to make this option available e.g.
via a module in the standard lib to ease transition for people in case the pep doesn't make it. it may be applied where deemed necessary and left ignored otherwise. if nobody thinks the reload hack is too awful and this solution stands testing, i guess i'll post it to the aspn cookbook. after all these countless hours of hunting down ordinal not in range, finally i'm starting to see some light in the issue. _wolf On Tue, 23 Aug 2005 12:39:03 +0200, M.-A. Lemburg wrote: > Thomas Heller wrote: >> Neil Schemenauer writes: >> >> >>> [Please mail followups to python-dev at python.org.] >>> >>> The PEP has been rewritten based on a suggestion by Guido to change >>> str() rather than adding a new built-in function. Based on my >>> testing, I believe the idea is feasible. It would be helpful if >>> people could test the patched Python with their own applications and >>> report any incompatibilities. >>> >> >> >> I like the fact that currently unicode(x) is guarateed to return a >> unicode instance, or raises a UnicodeDecodeError. Same for str(x), >> which is guaranteed to return a (byte) string instance or raise an >> error. >> >> Wouldn't also a new function make the intent clearer? >> >> So I think I'm +1 on the text() built-in, and -0 on changing str. > > Same here. > > A new API would also help make the transition easier from the > current mixed data/text type (strings) to data-only (bytes) > and text-only (text, renamed from unicode) in Py3.0. 
>

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
-- 
http://mail.python.org/mailman/listinfo/python-list

From tim.peters at gmail.com Fri Aug 26 16:32:37 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 26 Aug 2005 10:32:37 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src/Lib/test test_bz2.py, 1.18, 1.19
In-Reply-To: <20050826132405.30B221E4003@bag.python.org>
References: <20050826132405.30B221E4003@bag.python.org>
Message-ID: <1f7befae05082607323db5ceee@mail.gmail.com>

[birkenfeld at users.sourceforge.net]
> Update of /cvsroot/python/python/dist/src/Lib/test
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv4822/Lib/test
>
> Modified Files:
>         test_bz2.py
> Log Message:
> Add list() around xreadlines()
>
>
>
> Index: test_bz2.py
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Lib/test/test_bz2.py,v
> retrieving revision 1.18
> retrieving revision 1.19
> diff -u -d -r1.18 -r1.19
> --- test_bz2.py 21 Aug 2005 14:16:04 -0000 1.18
> +++ test_bz2.py 26 Aug 2005 13:23:54 -0000 1.19
> @@ -191,7 +191,7 @@
>      def testSeekBackwardsFromEnd(self):
>          # "Test BZ2File.seek(-150, 2)"
>          self.createTempFile()
> -        bz2f = BZ2File(self.filename)
> +        )bz2f = BZ2File(self.filename)

Note that this added a right parenthesis to the start of the line.
That creates a syntax error, so this test could not have been tried
before checking in. It also causes test_compiler to fail.
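
[Archive note] A stray character such as the one described above can be caught without running the whole test suite, by byte-compiling the source first. The following is only a minimal sketch in present-day Python; the helper name `compiles_cleanly` and the two sample strings are assumptions made for illustration and are not part of test_bz2.py or the checkin under discussion. It relies only on the built-in `compile()`, which raises SyntaxError for unparsable source without executing anything.

```python
def compiles_cleanly(source, filename="<string>"):
    """Return True if `source` parses as Python, False on a SyntaxError.

    Hypothetical helper for illustration; compile() only parses the
    text, it does not run it, so undefined names are not a problem.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError:
        return False
    return True


# Sample lines modelled on the diff quoted above (illustrative only).
good_line = "bz2f = BZ2File(filename)\n"
bad_line = ")bz2f = BZ2File(filename)\n"   # stray right parenthesis

good_ok = compiles_cleanly(good_line)
bad_ok = compiles_cleanly(bad_line)
```

Under these assumptions `good_ok` is true and `bad_ok` is false, so a check of this kind would flag the checked-in line before the tests are even started.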
From reinhold-birkenfeld-nospam at wolke7.net Fri Aug 26 16:46:20 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Fri, 26 Aug 2005 16:46:20 +0200 Subject: [Python-Dev] [Python-checkins] python/dist/src/Lib/test test_bz2.py, 1.18, 1.19 In-Reply-To: <1f7befae05082607323db5ceee@mail.gmail.com> References: <20050826132405.30B221E4003@bag.python.org> <1f7befae05082607323db5ceee@mail.gmail.com> Message-ID: Tim Peters wrote: > [birkenfeld at users.sourceforge.net] >> Update of /cvsroot/python/python/dist/src/Lib/test >> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv4822/Lib/test >> >> Modified Files: >> test_bz2.py >> Log Message: >> Add list() around xreadlines() >> >> >> >> Index: test_bz2.py >> =================================================================== >> RCS file: /cvsroot/python/python/dist/src/Lib/test/test_bz2.py,v >> retrieving revision 1.18 >> retrieving revision 1.19 >> diff -u -d -r1.18 -r1.19 >> --- test_bz2.py 21 Aug 2005 14:16:04 -0000 1.18 >> +++ test_bz2.py 26 Aug 2005 13:23:54 -0000 1.19 >> @@ -191,7 +191,7 @@ >> def testSeekBackwardsFromEnd(self): >> # "Test BZ2File.seek(-150, 2)" >> self.createTempFile() >> - bz2f = BZ2File(self.filename) >> + )bz2f = BZ2File(self.filename) > > Note that this added a right parenthesis to the start of the line. > That creates a syntax error, so this test could not have been tried > before checking in. It also causes test_compiler to fail. Thank you for correcting. The parenthesis must have been accidentally slipped in while I was reviewing the change for correctness. Reinhold -- Mail address is perfectly valid! 
From gvanrossum at gmail.com Fri Aug 26 16:57:08 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri, 26 Aug 2005 07:57:08 -0700 Subject: [Python-Dev] PEP 342: simple example, closure alternative In-Reply-To: <430EA1BE.9090804@colorstudy.com> References: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop> <430EA1BE.9090804@colorstudy.com> Message-ID: On 8/25/05, Ian Bicking wrote: > More generally, I've been doing some language comparisons, and I don't > like literal but non-idiomatic translations of programming patterns. True. (But that doesn't mean I think using generators for this example is great either.) > So I'm considering better ways to translate some of the same use cases. Remember that this particuar example was invented to show the superiority of Lisp; it has no practical value when taken literally. If you substitute a method call for the "acc += incr" operation, the Python translation using nested functions is very natural. For larger examples, I'd recommend defining a class as always. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From alain.poirier at net-ng.com Fri Aug 26 18:21:58 2005 From: alain.poirier at net-ng.com (Alain Poirier) Date: Fri, 26 Aug 2005 18:21:58 +0200 Subject: [Python-Dev] PEP 342: simple example, closure alternative In-Reply-To: References: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop> <430EA1BE.9090804@colorstudy.com> Message-ID: <200508261821.58392.alain.poirier@net-ng.com> Le Vendredi 26 Ao?t 2005 16:57, Guido van Rossum a ?crit : > On 8/25/05, Ian Bicking wrote: > > More generally, I've been doing some language comparisons, and I don't > > like literal but non-idiomatic translations of programming patterns. > > True. (But that doesn't mean I think using generators for this example > is great either.) > > > So I'm considering better ways to translate some of the same use cases. 
>
> Remember that this particular example was invented to show the
> superiority of Lisp; it has no practical value when taken literally.
> If you substitute a method call for the "acc += incr" operation, the
> Python translation using nested functions is very natural. For larger
> examples, I'd recommend defining a class as always.

For example, I often use this class to help me in functional programming :

_marker = ()

class var:
    def __init__(self, v=None):
        self.v = v

    def __call__(self, v=_marker):
        if v is not _marker:
            self.v = v
        return self.v

and so the nested functions become very functional :

def accum(n):
    acc = var(n)
    return lambda incr: acc(acc()+incr)

From nas at arctrix.com Fri Aug 26 18:33:26 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Aug 2005 10:33:26 -0600
Subject: [Python-Dev] PEP 342: simple example, closure alternative
In-Reply-To: <200508261821.58392.alain.poirier@net-ng.com>
References: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop> <430EA1BE.9090804@colorstudy.com> <200508261821.58392.alain.poirier@net-ng.com>
Message-ID: <20050826163326.GA2382@mems-exchange.org>

On Fri, Aug 26, 2005 at 06:21:58PM +0200, Alain Poirier wrote:
> For example, I often use this class to help me in functional programming :
>
> _marker = ()
[...]

You should not use an immutable object here (e.g. the empty tuple is
shared). My preferred idiom is:

    _marker = object()

Cheers,

  Neil

From tjreedy at udel.edu Fri Aug 26 21:54:10 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 26 Aug 2005 15:54:10 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
Message-ID:

Can str.find be listed in PEP 3000 (under builtins) for removal?
Would anyone really object?

Reasons:

1. Str.find is essentially redundant with str.index. The only difference
is that str.index Pythonically indicates 'not found' by raising an
exception while str.find does the same by anomalously returning -1.
As best as I can remember, this is common for Unix system calls but unique
among Python builtin functions. Learning and remembering both is a
nuisance.

2. As is being discussed in a current c.l.p thread, -1 is a legal indexing
subscript. If one uses the return value as a subscript without checking,
the bug is not caught. None would be a better return value should find not
be deleted.

3. Anyone who prefers to test return values instead of catch exceptions can
write (simplified, without start,end params):

def sfind(string, target):
    try:
        return string.index(target)
    except ValueError:
        return None  # or -1 for back compatibility, but None better

This can of course be done for any function/method that indicates input
errors with exceptions instead of a special return value. I see no reason
other than history that this particular method should be doubled.

If .find is scheduled for the dustbin of history, I would be willing to
suggest doc and docstring changes. (str.index.__doc__ currently refers to
str.find.__doc__. This should be reversed.)

Terry J. Reedy

From gvanrossum at gmail.com Fri Aug 26 22:10:00 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri, 26 Aug 2005 13:10:00 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To:
References:
Message-ID:

On 8/26/05, Terry Reedy wrote:
> Can str.find be listed in PEP 3000 (under builtins) for removal?

Yes please. (Except it's not technically a builtin but a string method.)

> Would anyone really object?

Not me.

> Reasons:
>
> 1. Str.find is essentially redundant with str.index. The only difference
> is that str.index Pythonically indicates 'not found' by raising an
> exception while str.find does the same by anomalously returning -1. As
> best as I can remember, this is common for Unix system calls but unique
> among Python builtin functions. Learning and remembering both is a
> nuisance.
>
> 2. As is being discussed in a current c.l.p thread, -1 is a legal indexing
> subscript.
If one uses the return value as a subscript without checking, > the bug is not caught. None would be a better return value should find not > be deleted. > > 3. Anyone who prefers to test return values instead of catch exceptions can > write (simplified, without start,end params): > > def sfind(string, target): > try: > return string.index(target) > except ValueError: > return None # or -1 for back compatibility, but None better > > This can of course be done for any function/method that indicates input > errors with exceptions instead of a special return value. I see no reason > other than history that this particular method should be doubled. I'd like to add: 4. The no. 1 use case for str.find() used to be testing whether a substring was present or not; "if s.find(sub) >= 0" can now be written as "if sub in s". This avoids the nasty bug in "if s.find(sub)". > If .find is scheduled for the dustbin of history, I would be willing to > suggest doc and docstring changes. (str.index.__doc__ currently refers to > str.find.__doc__. This should be reversed.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Fri Aug 26 22:08:33 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Fri, 26 Aug 2005 16:08:33 -0400 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: Message-ID: <000a01c5aa79$f0414020$a8bb9d8d@oemcomputer> > Can str.find be listed in PEP 3000 (under builtins) for removal? FWIW, here is a sample code transformation (extracted from zipfile.py). 
Judge for yourself whether the index version is better:

Existing code:
--------------
    END_BLOCK = min(filesize, 1024 * 4)
    fpin.seek(filesize - END_BLOCK, 0)
    data = fpin.read()
    start = data.rfind(stringEndArchive)
    if start >= 0:     # Correct signature string was found
        endrec = struct.unpack(structEndArchive, data[start:start+22])
        endrec = list(endrec)
        comment = data[start+22:]
        if endrec[7] == len(comment):     # Comment length checks out
            # Append the archive comment and start offset
            endrec.append(comment)
            endrec.append(filesize - END_BLOCK + start)
            return endrec
    return      # Error, return None

Revised code:
-------------
    END_BLOCK = min(filesize, 1024 * 4)
    fpin.seek(filesize - END_BLOCK, 0)
    data = fpin.read()
    try:
        start = data.rindex(stringEndArchive)
    except ValueError:
        pass
    else:
        # Correct signature string was found
        endrec = struct.unpack(structEndArchive, data[start:start+22])
        endrec = list(endrec)
        comment = data[start+22:]
        if endrec[7] == len(comment):     # Comment length checks out
            # Append the archive comment and start offset
            endrec.append(comment)
            endrec.append(filesize - END_BLOCK + start)
            return endrec
    return      # Error, return None

From raymond.hettinger at verizon.net Fri Aug 26 22:34:04 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 26 Aug 2005 16:34:04 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To:
Message-ID: <000d01c5aa7d$80564ae0$a8bb9d8d@oemcomputer>

> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?
>
> Reasons: . . .

I had one further thought. In addition to your excellent list of
reasons, it would be great if these kind of requests were accompanied by
a patch that removed the offending construct from the standard library.

The most important reason for the patch is that looking at the context
diff will provide an objective look at how real code will look before
and after the change. This would make subsequent discussions
substantially more informed and less anecdotal.
The second reason is that the revised library code becomes more likely to survive the transition to 3.0. Further, it can continue to serve as example code which highlights current best practices. This patch wouldn't take long. I've tried about a half dozen cases since you first posted. Each provided a new insight (zipfile was not improved, webbrowser was improved, and urlparse was about the same). Raymond From jcarlson at uci.edu Fri Aug 26 22:54:35 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 26 Aug 2005 13:54:35 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: Message-ID: <20050826134317.7DFD.JCARLSON@uci.edu> "Terry Reedy" wrote: > > Can str.find be listed in PEP 3000 (under builtins) for removal? > Would anyone really object? I would object to the removal of str.find() . In fact, older versions of Python which only allowed for single-character 'x in str' containment tests offered 'str.find(...) != -1' as a suitable replacement option, which is found in the standard library more than a few times... Further, forcing users to use try/except when they are looking for the offset of a substring seems at least a little strange (if not a lot braindead, no offense to those who prefer their code to spew exceptions at every turn). I've been thinking for years that .find should be part of the set of operations offered to most, if not all sequences (lists, buffers, tuples, ...). Considering the apparent dislike/hatred for str.find, it seems I was wise in not requesting it in the past. > > Reasons: > > 1. Str.find is essentially redundant with str.index. The only difference > is that str.index Pythonically indicates 'not found' by raising an > exception while str.find does the same by anomalously returning -1. As > best as I can remember, this is common for Unix system calls but unique > among Python builtin functions. Learning and remembering both is a > nuisance. So pick one and forget the other. 
I think of .index as a list method (because it doesn't offer .find), not a string method, even though it is. > 2. As is being discussed in a current c.l.p thread, -1 is a legal indexing > subscript. If one uses the return value as a subscript without checking, > the bug is not caught. None would be a better return value should find not > be deleted. And would break potentially thousands of lines of code in the wild which expect -1 right now. Look in the standard library for starting examples, and google around for others. > 3. Anyone who prefers to test return values instead of catch exceptions can > write (simplified, without start,end params): > > def sfind(string, target): > try: > return string.index(target) > except ValueError: > return None # or -1 for back compatibility, but None better > > This can of course be done for any function/method that indicates input > errors with exceptions instead of a special return value. I see no reason > other than history that this particular method should be doubled. I prefer my methods to stay on my instances, and I could have sworn that the string module's functions were generally deprecated in favor of string methods. Now you are (implicitly) advocating the reversal of such for one method which doesn't return an exception under a very normal circumstance. Would you further request that .rfind be removed from strings? The inclusion of .rindex? - Josiah From tjreedy at udel.edu Sat Aug 27 01:48:54 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 26 Aug 2005 19:48:54 -0400 Subject: [Python-Dev] Remove str.find in 3.0? References: Message-ID: "Guido van Rossum" wrote in message news:ca471dc20508261310809b1e3 at mail.gmail.com... > On 8/26/05, Terry Reedy wrote: >> Can str.find be listed in PEP 3000 (under builtins) for removal? > > Yes please. (Except it's not technically a builtin but a string method.) To avoid suggesting a new header, I interpreted Built-ins broadly to include builtin types. 
The header could be expanded to Built-in Constants, Functions, and Types or Built-ins and Built-in Types but I leave such details to the PEP authors. Terry J. Reedy From tjreedy at udel.edu Sat Aug 27 03:07:49 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 26 Aug 2005 21:07:49 -0400 Subject: [Python-Dev] Remove str.find in 3.0? References: <000a01c5aa79$f0414020$a8bb9d8d@oemcomputer> Message-ID: "Raymond Hettinger" wrote in message news:000a01c5aa79$f0414020$a8bb9d8d at oemcomputer... >> Can str.find be listed in PEP 3000 (under builtins) for removal? > > FWIW, here is a sample code transformation (extracted from zipfile.py). > Judge for yourself whether the index version is better: I am sure that we both could write similar code that would be smoother if the math module also had a 'powhalf' function that was the same as sqrt except for returning -1 instead of raising an error on negative or non-numerical input. I'll continue in response to Josiah... Terry J. Reedy From tjreedy at udel.edu Sat Aug 27 03:07:46 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 26 Aug 2005 21:07:46 -0400 Subject: [Python-Dev] Remove str.find in 3.0? References: <20050826134317.7DFD.JCARLSON@uci.edu> Message-ID: "Josiah Carlson" wrote in message news:20050826134317.7DFD.JCARLSON at uci.edu... > > "Terry Reedy" wrote: >> >> Can str.find be listed in PEP 3000 (under builtins) for removal? Guido has already approved, but I will try to explain my reasoning a bit better for you. There are basically two ways for a system, such as a Python function, to indicate 'I cannot give a normal response." One (1a) is to give an inband signal that is like a normal response except that it is not (str.find returing -1). A variation (1b) is to give an inband response that is more obviously not a real response (many None returns). The other (2) is to not respond (never return normally) but to give an out-of-band signal of some sort (str.index raising ValueError). 
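
[Archive note] The three signalling styles named in the paragraph above can be put side by side. This is a small sketch in present-day Python, not code from the thread; the variable names and the choice of dict.get to stand in for style 1b are assumptions made for illustration.

```python
text = "hello"

# (1a) In-band value that resembles a normal result: str.find gives -1.
pos = text.find("z")

# (1b) In-band value that is plainly not a normal result: dict.get gives None.
missing = {"a": 1}.get("b")

# (2) Out-of-band signal: str.index raises ValueError.
try:
    text.index("z")
except ValueError:
    raised = True
else:
    raised = False

# The hazard of (1a), as raised earlier in the thread: -1 is a legal
# subscript, so using the unchecked result silently picks the last character.
last_char = text[pos]
```

With these inputs `pos` is -1, `missing` is None, `raised` is true, and `last_char` is "o", which is the unchecked-subscript bug the proposal is concerned about.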
Python as distributed usually chooses 1b or 2. I believe str.find and .rfind are unique in the choice of 1a. I am pretty sure that the choice of -1 as error return, instead of, for instance, None, goes back the the need in static languages such as C to return something of the declared return type. But Python is not C, etcetera. I believe that this pair is also unique in having exact counterparts of type 2. (But maybe I forgot something.) >> Would anyone really object? > I would object to the removal of str.find(). So, I wonder, what is your favored alternative? A. Status quo: ignore the opportunity to streamline the language. B. Change the return type of .find to None. C. Remove .(r)index instead. D. Add more redundancy for those who do not like exceptions. > Further, forcing users to use try/except when they are looking for the > offset of a substring seems at least a little strange (if not a lot > braindead, no offense to those who prefer their code to spew exceptions > at every turn). So are you advocating D above or claiming that substring indexing is uniquely deserving of having two versions? If the latter, why so special? If we only has str.index, would you actually suggest adding this particular duplication? > Considering the apparent dislike/hatred for str.find. I don't hate str.find. I simply (a) recognize that a function designed for static typing constraints is out of place in Python, which does not have those constraints and (b) believe that there is no reason other than history for the duplication and (c) believe that dropping .find is definitely better than dropping .index and changing .find. > Would you further request that .rfind be removed from strings? Of course. Thanks for reminding me. > The inclusion of .rindex? Yes, the continued inclusion of .rindex, which we already have. Terry J. 
Reedy From janssen at parc.com Sat Aug 27 04:40:29 2005 From: janssen at parc.com (Bill Janssen) Date: Fri, 26 Aug 2005 19:40:29 PDT Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: Your message of "Fri, 26 Aug 2005 18:07:46 PDT." Message-ID: <05Aug26.194031pdt."58617"@synergy1.parc.xerox.com> > There are basically two ways for a system, such as a > Python function, to indicate 'I cannot give a normal response." One (1a) > is to give an inband signal that is like a normal response except that it > is not (str.find returing -1). A variation (1b) is to give an inband > response that is more obviously not a real response (many None returns). > The other (2) is to not respond (never return normally) but to give an > out-of-band signal of some sort (str.index raising ValueError). > > Python as distributed usually chooses 1b or 2. I believe str.find and > .rfind are unique in the choice of 1a. Doubt it. The problem with returning None is that it tests as False, but so does 0, which is a valid string index position. The reason string.find() returns -1 is probably to allow a test: if line.find("\f"): ... do something Might add a boolean "str.contains()" to cover this test case. Bill From gvanrossum at gmail.com Sat Aug 27 05:05:29 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri, 26 Aug 2005 20:05:29 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <-1439615371039171070@unknownmsgid> References: <-1439615371039171070@unknownmsgid> Message-ID: On 8/26/05, Bill Janssen wrote: > Doubt it. The problem with returning None is that it tests as False, > but so does 0, which is a valid string index position. The reason > string.find() returns -1 is probably to allow a test: > > if line.find("\f"): > ... do something This has a bug; it is equivalent to "if not line.startswith("\f"):". 
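
[Archive note] The equivalence claimed just above can be checked mechanically. This is a hedged sketch in present-day Python; the function names and sample strings are made up for illustration and do not come from the thread.

```python
def formfeed_test_buggy(line):
    # find() returns -1 (truthy) when "\f" is absent and 0 (falsy)
    # when the line starts with it, so this is not a containment test.
    return bool(line.find("\f"))


def formfeed_test_correct(line):
    # The containment operator says what was actually meant.
    return "\f" in line


samples = ["\fpage two", "no form feed", "mid\ffeed"]
buggy = [formfeed_test_buggy(s) for s in samples]
correct = [formfeed_test_correct(s) for s in samples]
same_as_not_startswith = all(
    formfeed_test_buggy(s) == (not s.startswith("\f")) for s in samples
)
```

For these three samples the buggy test yields False, True, True while the containment test yields True, False, True, and the buggy test agrees with `not s.startswith("\f")` in every case.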
This mistake (which I have made more than once myself and have seen many times in code by others) is one of the main reasons to want to get rid of this style of return value. > Might add a boolean "str.contains()" to cover this test case. We already got that: "\f" in line. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Sat Aug 27 05:14:53 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri, 26 Aug 2005 20:14:53 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <000d01c5aa7d$80564ae0$a8bb9d8d@oemcomputer> References: <000d01c5aa7d$80564ae0$a8bb9d8d@oemcomputer> Message-ID: On 8/26/05, Raymond Hettinger wrote: > I had one further thought. In addition to your excellent list of > reasons, it would be great if these kind of requests were accompanied by > a patch that removed the offending construct from the standard library. Um? Are we now requiring patches for PYTHON THREE DOT OH proposals? Raymond, we all know and agree that Python 3.0 will be incompatible in many ways. range() and keys() becoming iterators, int/int returning float, and so on; we can safely say that it will break nearly every module under the sun, and no amount of defensive coding in Python 2.x will save us. > The most important reason for the patch is that looking at the context > diff will provide an objective look at how real code will look before > and after the change. This would make subsequent discussions > substantially more informed and less anecdotal. No, you're just artificially trying to raise the bar for Python 3.0 proposals to an unreasonable height. > The second reason is that the revised library code becomes more likely > to survive the transition to 3.0. Further, it can continue to serve as > example code which highlights current best practices. But we don't *want* all of the library code to survive. Much of it is 10-15 years old and in dear need of a total rewrite. 
See Anthony Baxter's lightning talk at OSCON (I'm sure Google can find it for you). > This patch wouldn't take long. I've tried about a half dozen cases > since you first posted. Each provided a new insight (zipfile was not > improved, webbrowser was improved, and urlparse was about the same). So it's neutral in terms of code readability. Great. Given all the other advantages for the proposal (an eminent member of this group just posted a buggy example :-) I'm now doubly convinced that we should do it. Also remember, the standard library is rather atypical -- while some of it makes great example code, other parts of it are highly contorted in order to either maintain backwards compatibility or provide an unusually high level of defensiveness. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From anthony at interlink.com.au Sat Aug 27 05:55:10 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Sat, 27 Aug 2005 13:55:10 +1000 Subject: [Python-Dev] [Python-checkins] python/dist/src/Lib/test test_bz2.py, 1.18, 1.19 In-Reply-To: References: <20050826132405.30B221E4003@bag.python.org> <1f7befae05082607323db5ceee@mail.gmail.com> Message-ID: <200508271355.12750.anthony@interlink.com.au> On Saturday 27 August 2005 00:46, Reinhold Birkenfeld wrote: > > Note that this added a right parenthesis to the start of the line. > > That creates a syntax error, so this test could not have been tried > > before checking in. It also causes test_compiler to fail. > > Thank you for correcting. The parenthesis must have been accidentally > slipped in while I was reviewing the change for correctness. Please ensure that you run the test suite before checking code in! Thanks, Anthony -- Anthony Baxter It's never too late to have a happy childhood. From janssen at parc.com Sat Aug 27 05:58:23 2005 From: janssen at parc.com (Bill Janssen) Date: Fri, 26 Aug 2005 20:58:23 PDT Subject: [Python-Dev] Remove str.find in 3.0? 
In-Reply-To: Your message of "Fri, 26 Aug 2005 20:05:29 PDT." Message-ID: <05Aug26.205831pdt."58617"@synergy1.parc.xerox.com> Don't know *what* I wasn't thinking :-). Bill > On 8/26/05, Bill Janssen wrote: > > Doubt it. The problem with returning None is that it tests as False, > > but so does 0, which is a valid string index position. The reason > > string.find() returns -1 is probably to allow a test: > > > > if line.find("\f"): > > ... do something > > This has a bug; it is equivalent to "if not line.startswith("\f"):". > > This mistake (which I have made more than once myself and have seen > many times in code by others) is one of the main reasons to want to > get rid of this style of return value. > > > Might add a boolean "str.contains()" to cover this test case. > > We already got that: "\f" in line. From jcarlson at uci.edu Sat Aug 27 06:48:27 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri, 26 Aug 2005 21:48:27 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: <20050826134317.7DFD.JCARLSON@uci.edu> Message-ID: <20050826184634.7E06.JCARLSON@uci.edu> "Terry Reedy" wrote: > > "Josiah Carlson" wrote in message > news:20050826134317.7DFD.JCARLSON at uci.edu... > > > > "Terry Reedy" wrote: > >> > >> Can str.find be listed in PEP 3000 (under builtins) for removal? > > Guido has already approved, I noticed, but he approved before anyone could say anything. I understand it is a dictatorship, but he seems to take advisment and reverse (or not) his decisions on occasion based on additional information. Whether this will lead to such, I don't know. > but I will try to explain my reasoning a bit > better for you. There are basically two ways for a system, such as a > Python function, to indicate 'I cannot give a normal response." One (1a) > is to give an inband signal that is like a normal response except that it > is not (str.find returing -1). 
A variation (1b) is to give an inband > response that is more obviously not a real response (many None returns). > The other (2) is to not respond (never return normally) but to give an > out-of-band signal of some sort (str.index raising ValueError). > > Python as distributed usually chooses 1b or 2. I believe str.find and > .rfind are unique in the choice of 1a. I am pretty sure that the choice > of -1 as error return, instead of, for instance, None, goes back the the > need in static languages such as C to return something of the declared > return type. But Python is not C, etcetera. I believe that this pair is > also unique in having exact counterparts of type 2. (But maybe I forgot > something.) Taking a look at the commits that Guido did way back in 1993, he doesn't mention why he added .find, only that he did. Maybe it was another of the 'functional language additions' that he now regrets, I don't know. > >> Would anyone really object? > > > I would object to the removal of str.find(). > > So, I wonder, what is your favored alternative? > > A. Status quo: ignore the opportunity to streamline the language. str.find is not a language construct. It is a method on a built-in type that many people use. This is my vote. > B. Change the return type of .find to None. Again, this would break potentially thousands of lines of user code that is in the wild. Are we talking about changes for 2.5 here, or 3.0? > C. Remove .(r)index instead. see below * > D. Add more redundancy for those who do not like exceptions. In 99% of the cases, such implementations would be minimal. While I understand that "There should be one-- and preferably only one --obvious way to do it.", please see below *. > > Further, forcing users to use try/except when they are looking for the > > offset of a substring seems at least a little strange (if not a lot > > braindead, no offense to those who prefer their code to spew exceptions > > at every turn). 
> > So are you advocating D above or claiming that substring indexing is > uniquely deserving of having two versions? If the latter, why so special? > If we only has str.index, would you actually suggest adding this particular > duplication? Apparently everyone has forgotten the dozens of threads on similar topics over the years. I'll attempt to summarize. Adding functionality that isn't used is harmful, but not nearly as harmful as removing functionality that people use. If you take just two seconds and do a search on '.find(' vs '.index(' in the standard library, you will notice that '.find(' is used more often than '.index(' regardless of type (I don't have the time this evening to pick out which ones are string only, but I doubt the standard library uses mmap.find, DocTestFinder.find, or gettext.find). This example seems to show that people find str.find to be more intuitive and/or useful than str.index, even though you spent two large paragraphs explaining that Python 'doesn't do it that way very often so it isn't Pythonic'. Apparently the majority of people who have been working on the standard library for the last decade disagree. > > Considering the apparent dislike/hatred for str.find. > > I don't hate str.find. I simply (a) recognize that a function designed for > static typing constraints is out of place in Python, which does not have > those constraints and (b) believe that there is no reason other than > history for the duplication and (c) believe that dropping .find is > definitely better than dropping .index and changing .find. * I don't see why it is necessary to drop or change either one. We've got list() and [] for construcing a list. Heck, we've even got list(iterable) and [i for i in iterable] for making a list copy of any arbitrary iterable. This goes against TSBOOWTDI, so why don't we toss list comprehensions now that we have list(generator expression)? Or did I miss something and this was already going to happen? 
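The standard-library search described above can be approximated in a few lines. This is only a rough sketch: `count_calls` is a hypothetical helper, and it counts plain text such as `.find(` without looking at the receiver's type, so non-string `find` methods (mmap, gettext and so on) are counted too.

```python
import os

def count_calls(root, needles=(".find(", ".index(", ".rfind(", ".rindex(")):
    """Count textual occurrences of each needle in the .py files under root."""
    counts = dict.fromkeys(needles, 0)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            # This is a text count, not a parse, so undecodable bytes are skipped.
            with open(path, encoding="utf-8", errors="ignore") as fh:
                text = fh.read()
            for needle in needles:
                counts[needle] += text.count(needle)
    return counts
```

Pointing it at the library directory, for example `count_calls(os.path.dirname(os.__file__))`, gives raw counts for the installed version; the numbers will differ from the 2005 figures quoted in this thread.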
> > Would you further request that .rfind be removed from strings? > > Of course. Thanks for reminding me. No problem, but again, do a search in the standard library... I found 4 examples if str.rindex, but over 40 of str.rfind. Koders.com offers 1153 and 100 for .rfind and .rindex respectively (probably not all string methods, but I'm too lazy to check every one). A common factor of over 10. If koders.com had a decent way to search for the full name of a method call, we could do a find vs. index as well, though I expect we would see closer to 4:3 with .find winning (those are the approximate numbers I get when checking the standard library). The reason I'm making a stink is because you are proposing (and Guido has agreed) to get rid of methods V,W which are used more often than methods X,Y in order to 'streamline the language' for 3.0. The removal of two methods and their implementations will not go terribly far towards streamlining the language, especially when all four methods (find, rfind, index, rindex) call the same C function to do the actual search. - Josiah From martin at v.loewis.de Sat Aug 27 08:54:12 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 27 Aug 2005 08:54:12 +0200 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: <20050826134317.7DFD.JCARLSON@uci.edu> Message-ID: <43100E14.6080009@v.loewis.de> Terry Reedy wrote: > One (1a) > is to give an inband signal that is like a normal response except that it > is not (str.find returing -1). > > Python as distributed usually chooses 1b or 2. I believe str.find and > .rfind are unique in the choice of 1a. That is not true. str.find's choice is not 1a, and there are other functions which chose 1a): -1 does *not* look like a normal response, since a normal response is non-negative. It is *not* the only method with choice 1a): dict.get returns None if the key is not found, even though None could also be the value for the key. 
For another example, file.read() returns an empty string at EOF. > I am pretty sure that the choice > of -1 as error return, instead of, for instance, None, goes back the the > need in static languages such as C to return something of the declared > return type. But Python is not C, etcetera. I believe that this pair is > also unique in having exact counterparts of type 2. dict.__getitem__ is a counterpart of type 2 of dict.get. > So, I wonder, what is your favored alternative? > > A. Status quo: ignore the opportunity to streamline the language. My favourite choice is the status quo. I probably don't fully understand the word "to streamline", but I don't see this as rationalizing. Instead, some applications will be more tedious to write. > So are you advocating D above or claiming that substring indexing is > uniquely deserving of having two versions? If the latter, why so special? Because it is no exception that a string is not part of another string, and because the question I'm asking "is the string in the other string, and if so, where?". This is similar to the question "does the dictionary have a value for that key, and if so, which?" > If we only has str.index, would you actually suggest adding this particular > duplication? That is what happened to dict.get: it was not originally there (I believe), but added later. Regards, Martin From reinhold-birkenfeld-nospam at wolke7.net Sat Aug 27 09:39:37 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sat, 27 Aug 2005 09:39:37 +0200 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <05Aug26.194031pdt."58617"@synergy1.parc.xerox.com> References: <05Aug26.194031pdt."58617"@synergy1.parc.xerox.com> Message-ID: Bill Janssen wrote: >> There are basically two ways for a system, such as a >> Python function, to indicate 'I cannot give a normal response." One (1a) >> is to give an inband signal that is like a normal response except that it >> is not (str.find returing -1). 
A variation (1b) is to give an inband >> response that is more obviously not a real response (many None returns). >> The other (2) is to not respond (never return normally) but to give an >> out-of-band signal of some sort (str.index raising ValueError). >> >> Python as distributed usually chooses 1b or 2. I believe str.find and >> .rfind are unique in the choice of 1a. > > Doubt it. The problem with returning None is that it tests as False, > but so does 0, which is a valid string index position. Heh. You know what the Perl6 folks would suggest in this case? return 0 but true; # literally! > Might add a boolean "str.contains()" to cover this test case. There's already __contains__. Reinhold -- Mail address is perfectly valid! From raymond.hettinger at verizon.net Sat Aug 27 10:20:38 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 27 Aug 2005 04:20:38 -0400 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: Message-ID: <002f01c5aae0$35aaa820$a8bb9d8d@oemcomputer> > > The most important reason for the patch is that looking at the context > > diff will provide an objective look at how real code will look before > > and after the change. This would make subsequent discussions > > substantially more informed and less anecdotal. > > No, you're just artificially trying to raise the bar for Python 3.0 > proposals to an unreasonable height. Not really. I'm mostly for the proposal (+0), but am certain the conversation about the proposal would be substantially more informed if we had a side-by-side comparison of what real-world code looks like before and after the change. There are not too many instances of str.find() in the library and it is an easy patch to make. I'm just asking for a basic, objective investigative tool. Unlike more complex proposals, this one doesn't rely on any new functionality. It just says don't use X anymore. That makes it particularly easy to investigate in an objective way. BTW, this isn't unprecedented. 
We're already done it once when backticks got slated for removal in 3.0. All instances of it got changed in the standard library. As a result of the patch, we were able to 1) get an idea of how much work it took, 2) determine every category of use case, 3) learn that the resulting code was more beautiful, readable, and only microscopically slower, 4) learn about a handful of cases that were unexpectedly difficult to convert, and 5) update the library to be an example of what we think modern code looks like. That patch painlessly informed the decision making and validated that we were doing the right thing. The premise of Terry's proposal is that Python code is better when str.find() is not used. This is a testable proposition. Why not use the wealth of data at our fingertips to augment a priori reasoning and anecdotes. I'm not at all arguing against the proposal; I'm just asking for a thoughtful design process. Raymond P.S. Josiah was not alone. The comp.lang.python discussion had other posts expressing distaste for raising exceptions instead of using return codes. While I don't feel the same way, I don't think the respondants should be ignored. "Those people who love sausage and respect the law should not watch either one being made." From raymond.hettinger at verizon.net Sat Aug 27 10:28:28 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 27 Aug 2005 04:28:28 -0400 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <43100E14.6080009@v.loewis.de> Message-ID: <003001c5aae1$4d843280$a8bb9d8d@oemcomputer> [Martin] > For another example, file.read() returns an empty string at EOF. When my turn comes for making 3.0 proposals, I'm going to recommend nixing the "empty string at EOF" API. That is a carry-over from C that made some sense before there were iterators. 
Now, we have the option of introducing much cleaner iterator versions of these methods that use compact, fast, and readable for-loops instead of multi-statement while-loop boilerplate. Raymond From reinhold-birkenfeld-nospam at wolke7.net Sat Aug 27 12:01:55 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sat, 27 Aug 2005 12:01:55 +0200 Subject: [Python-Dev] test_bz2 on Python 2.4.1 In-Reply-To: References: Message-ID: A.B., Khalid wrote: > #--------------------------- Python 2.5a0 from CVS -----------------# > # Result: passes > $ /g/projs/py25/python/dist/src/MinGW/python testbz2.py > > > #--------------------------- Python 2.4.1 from CVS -----------------# > # Result: fails > $ /g/projs/py24/python/dist/src/MinGW/python testbz2.py > Traceback (most recent call last): > File "testbz2.py", line 9, in ? > lines = bz2f.readlines() > RuntimeError: wrong sequence of bz2 library commands used I don't understand this. The sources for the bz2 modules are exactly equal in both branches. How do you check out the 2.4.1 from CVS? Reinhold -- Mail address is perfectly valid! From paragate at gmx.net Sat Aug 27 12:21:03 2005 From: paragate at gmx.net (Wolfgang Lipp) Date: Sat, 27 Aug 2005 12:21:03 +0200 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <43100E14.6080009@v.loewis.de> References: <20050826134317.7DFD.JCARLSON@uci.edu> <43100E14.6080009@v.loewis.de> Message-ID: On Sat, 27 Aug 2005 08:54:12 +0200, Martin v. L?wis wrote: > with choice 1a): dict.get returns None if the key is not found, even > though None could also be the value for the key. that's a bug! i had to *test* it to find out it's true! i've been writing code for *years* all in the understanding that dict.get(x) acts precisely like dict['x'] *except* you get a chance to define a default value. 
which, for me, has become sort of a standard solution to the problem
the last ten or so postings were all about: when i write a function and
realize that's one of the cases where python philosophy strongly favors
raising an exception because something e.g. could not be found where
expected, i make it so that a reasonable exception is raised *and* where
meaningful i give consumers a chance to pass in a default value to
eschew exceptions. i believe this is the way to go to resolve this
.index/.find conflict. and, no, returning -1 when a substring is not
found and None when a key is not found is *highly* problematic. i'd
sure like to see cases like that go.

i'm not sure why .rindex() should go (correct?), and how to do what it
does (reverse the string before doing .index()? is that what is done
internally?)

and of course, like always, there is the question why these are methods
at all and why there is a function len(str) but a method str.index();
one could just as well have *either* str.length and str.index() *or*
length(str) and, say, a builtin

    locate( x, element, start = 0 , stop = None, reversed = False, default = Misfit )

(where Misfit indicates a 'meta-None', so None is still a valid default
value; i also like to indicate 'up to the end' with stop=None) that does
on iterables (or only on sequences) what the methods do now, but with
this strange pattern:

------------------------------------------------------------------
         .index()   .find()   .get()   .pop()
list        +                  ?(3)      +
tuple                          ?(3)     ??(1)
str         +          +       ?(3)     ??(1)
dict       x(2)       x(2)      +        +

(1) one could argue this should return a copy of a tuple or str,
    but doubtful.
(2) index/find meaningless for dicts.
(3) there is no .get() for list, tuple, str, although it would
    make sense: return the indexed element, or raise IndexError
    where not found if no default return value given.
------------------------------------------------------------------

what bites me here is especially that we have both index and find for
str *but a gaping hole* for tuples. assuming tuples are not slated for
removal, i suggest to move in a direction that makes things look more
like this:

------------------------------------------------------------------
         .index()   .get()   .pop()
list        +         +        +
tuple       +         +
str         +         +
dict                  +        +
------------------------------------------------------------------

where .index() looks like locate, above:

------------------------------------------------------------------
{list,tuple,str}.index(
    element,            # element in the collection
    start = 0,          # where to start searching; default is zero
    stop = None,        # where to end; the default, None, indicates
                        # 'to the end'
    reversed = False,   # should we search from the back? *may* cause
                        # reversal of sequence, depending on impl.
    default = _Misfit,  # default value, when given, prevents
                        # IndexError from being raised
    )
------------------------------------------------------------------

hope i didn't miss out crucial points here.

_wolf

From martin at v.loewis.de Sat Aug 27 12:35:30 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 27 Aug 2005 12:35:30 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To:
References: <20050826134317.7DFD.JCARLSON@uci.edu> <43100E14.6080009@v.loewis.de>
Message-ID: <431041F2.6060307@v.loewis.de>

Wolfgang Lipp wrote:

> that's a bug! i had to *test* it to find out it's true! i've been writing
> code for *years* all in the understanding that dict.get(x) acts precisely
> like dict['x'] *except* you get a chance to define a default value.

Clearly, your understanding *all* these years *was* wrong. If you don't
specify *a* default value, *it* defaults to None.

Regards,
Martin

P.S.
Emphasis mine :-)

From paragate at gmx.net Sat Aug 27 12:47:55 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Sat, 27 Aug 2005 12:47:55 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <431041F2.6060307@v.loewis.de>
References: <20050826134317.7DFD.JCARLSON@uci.edu> <43100E14.6080009@v.loewis.de> <431041F2.6060307@v.loewis.de>
Message-ID:

On Sat, 27 Aug 2005 12:35:30 +0200, Martin v. Löwis wrote:

> P.S. Emphasis mine :-)

no, emphasis all **mine** :-) just to reflect i never expected .get()
to work that way (return an unsolicited None) -- i do consider this
behavior harmful and suggest it be removed.

_wolf

From just at letterror.com Sat Aug 27 13:01:02 2005
From: just at letterror.com (Just van Rossum)
Date: Sat, 27 Aug 2005 13:01:02 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To:
Message-ID:

Wolfgang Lipp wrote:

> On Sat, 27 Aug 2005 12:35:30 +0200, Martin v. Löwis
> wrote:
> > P.S. Emphasis mine :-)
>
> no, emphasis all **mine** :-) just to reflect i never expected .get()
> to work that way (return an unsolicited None) -- i do consider this
> behavior harmful and suggest it be removed.

Just because you don't read the documentation and guessed wrong d.get()
needs to be removed?!?

It's a *feature* of d.get(k) to never raise KeyError. If you need an
exception, why not just use d[k]?

Just

From paragate at gmx.net Sat Aug 27 13:33:40 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Sat, 27 Aug 2005 13:33:40 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To:
References:
Message-ID:

On Sat, 27 Aug 2005 13:01:02 +0200, Just van Rossum wrote:

> Just because you don't read the documentation and guessed wrong d.get()
> needs to be removed?!?

no, not removed... never said that.

> It's a *feature* of d.get(k) to never raise KeyError. If you need an
> exception, why not just use d[k]?
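The dict behaviour under dispute here is easy to check. A minimal sketch follows; `_MISSING` and `classify` are hypothetical names used only for illustration, with the sentinel playing the role of the 'meta-None' default mentioned earlier in the thread.

```python
# Hypothetical sentinel; it is not part of the dict API.
_MISSING = object()

def classify(d, key):
    """Tell a missing key apart from a key whose stored value is None."""
    value = d.get(key, _MISSING)
    if value is _MISSING:
        return "missing"
    return "present"

d = {"x": None}

# With no second argument, get() returns None for a missing key,
# which is indistinguishable from a stored None.
same = d.get("x") is d.get("y")

# d[k] raises KeyError instead.
try:
    d["y"]
    raised = False
except KeyError:
    raised = True
```

Here `same` and `raised` both come out true, while `classify(d, "x")` and `classify(d, "y")` differ, which is the distinction a plain `d.get(k)` cannot make.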
i agree i misread the specs, but then, i read the specs a lot, and i guess everyone here agrees that if it's in the specs doesn't mean it's automatically what we want or expect -- else there's nothing to discuss. i say d.get('x') == None <== { ( 'x' not in d ) OR ( d['x'] == None ) } is not what i expect (even tho the specs say so) especially since d.pop('x') *does* throw a KeyError when 'x' is not a key in mydict. ok, pop is not get and so on but still i perceive this a problematic behavior (to the point i call it a 'bug' in a jocular way, no offense implied). the reason of being for d.get() -- to me -- is simply so you get a chance to pass a default value, which is syntactically well-nigh impossible with d['x']. _wolf From skip at pobox.com Sat Aug 27 14:48:20 2005 From: skip at pobox.com (skip@pobox.com) Date: Sat, 27 Aug 2005 07:48:20 -0500 Subject: [Python-Dev] Style for raising exceptions (python-dev Summary for 2005-08-01 through 2005-08-15 [draft]) In-Reply-To: <430ECEDC.7040206@egenix.com> References: <430D90FF.6060206@egenix.com> <430ECEDC.7040206@egenix.com> Message-ID: <17168.24852.883667.347938@montanaro.dyndns.org> MAL> I don't see a need for two or more syntaxes either, but most code MAL> nowadays uses the second variant (I don't know of any code that MAL> uses the traceback argument), which puts up a high barrier for MAL> changes. Earlier this week I managed to fix all the instances in the projects I'm involved with at my day job in a couple rounds of grep/emacs macro sessions. It took all of about 20 minutes, so I don't think the conversion will be onerous. Skip From kay.schluehr at gmx.net Sat Aug 27 14:57:08 2005 From: kay.schluehr at gmx.net (Kay Schluehr) Date: Sat, 27 Aug 2005 14:57:08 +0200 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: <20050826134317.7DFD.JCARLSON@uci.edu> Message-ID: Terry Reedy wrote: >>I would object to the removal of str.find(). > > > So, I wonder, what is your favored alternative? > > A. 
Status quo: ignore the opportunity to streamline the language.

I actually don't see much benefit from the user perspective. The
discourse about Python3000 has shrunken from the expectation of the
"next big thing" into a depressive rhetorics of feature elimination.
The language doesn't seem to become deeper, smaller and more powerful
but just smaller.

> B. Change the return type of .find to None.
>
> C. Remove .(r)index instead.
>
> D. Add more redundancy for those who do not like exceptions.

Why not turn index() into an iterator that yields indices
successively? From this generalized perspective we can try to
reconstruct behaviour of Python 2.X.

Sometimes I use a custom keep() function if I want to prevent defining
a block for catching StopIteration. The keep() function takes an
iterator and returns a default value in case of StopIteration:

    def keep(iter, default=None):
        try:
            return iter.next()
        except StopIteration:
            return default

Together with an index iterator the user can mimic the behaviour he
wants. Instead of a ValueError a StopIteration exception can hold as
an "external" information ( other than a default value ):

    >>> keep( "abcdabc".index("bc"), default=-1)   # current behaviour of the
                                                   # find() function
    >>> (idx for idx in "abcdabc".rindex("bc"))    # generator expression

Since the find() method acts on a string literal it is not easy to
replace it syntactically. But why not add functions that can be hooked
into classes whose objects are represented by literals?

    def find( string, substring):
        return keep( string.index( substring), default=-1)

    str.register(find)

    >>> "abcdabc".find("bc")
    1

Now find() can be stored in a pure Python module without maintaining it
on interpreter level ( same as with reduce, map and filter ).

Kay

From raymond.hettinger at verizon.net Sat Aug 27 14:56:28 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 27 Aug 2005 08:56:28 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To:
Message-ID: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>

FWIW, here are three more comparative code fragments. They are
presented without judgment as an evaluation tool to let everyone form
their own opinion about the merits of each:

--- From CGIHTTPServer.py ---------------

    def run_cgi(self):
        """Execute a CGI script."""
        dir, rest = self.cgi_info
        i = rest.rfind('?')
        if i >= 0:
            rest, query = rest[:i], rest[i+1:]
        else:
            query = ''
        i = rest.find('/')
        if i >= 0:
            script, rest = rest[:i], rest[i:]
        else:
            script, rest = rest, ''
        . . .

    def run_cgi(self):
        """Execute a CGI script."""
        dir, rest = self.cgi_info
        try:
            i = rest.rindex('?')
        except ValueError():
            query = ''
        else:
            rest, query = rest[:i], rest[i+1:]
        try:
            i = rest.index('/')
        except ValueError():
            script, rest = rest, ''
        else:
            script, rest = rest[:i], rest[i:]
        . . .

--- From ConfigParser.py ---------------

    optname, vi, optval = mo.group('option', 'vi', 'value')
    if vi in ('=', ':') and ';' in optval:
        # ';' is a comment delimiter only if it follows
        # a spacing character
        pos = optval.find(';')
        if pos != -1 and optval[pos-1].isspace():
            optval = optval[:pos]
    optval = optval.strip()
    . . .

    optname, vi, optval = mo.group('option', 'vi', 'value')
    if vi in ('=', ':') and ';' in optval:
        # ';' is a comment delimiter only if it follows
        # a spacing character
        try:
            pos = optval.index(';')
        except ValueError():
            pass
        else:
            if optval[pos-1].isspace():
                optval = optval[:pos]
    optval = optval.strip()
    . . .

--- StringIO.py ---------------

    i = self.buf.find('\n', self.pos)
    if i < 0:
        newpos = self.len
    else:
        newpos = i+1
    . . .

    try:
        i = self.buf.find('\n', self.pos)
    except ValueError():
        newpos = self.len
    else:
        newpos = i+1
    . . .

My notes so far weren't meant to judge the proposal. I'm just
suggesting that examining fragments like the ones above will help
inform the design process.
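For anyone who wants to run the comparison rather than read it, the CGIHTTPServer fragment can be lifted into two standalone functions. This is a sketch under stated assumptions: the function names are hypothetical, the surrounding class is dropped, and the handler is spelled `except ValueError:` (the class, not an instance) so that it actually catches the exception.

```python
def split_with_find(rest):
    """Split 'script/extra?query' using find()/rfind() and -1 checks."""
    i = rest.rfind('?')
    if i >= 0:
        rest, query = rest[:i], rest[i+1:]
    else:
        query = ''
    i = rest.find('/')
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ''
    return script, rest, query

def split_with_index(rest):
    """The same split using index()/rindex() and try/except/else."""
    try:
        i = rest.rindex('?')
    except ValueError:
        query = ''
    else:
        rest, query = rest[:i], rest[i+1:]
    try:
        i = rest.index('/')
    except ValueError:
        script, rest = rest, ''
    else:
        script, rest = rest[:i], rest[i:]
    return script, rest, query
```

Both functions should return the same tuple for any input string; the difference is only in how the "not found" case is written.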
Peace, Raymond From just at letterror.com Sat Aug 27 15:08:34 2005 From: just at letterror.com (Just van Rossum) Date: Sat, 27 Aug 2005 15:08:34 +0200 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: Message-ID: Wolfgang Lipp wrote: > > Just because you don't read the documentation and guessed wrong > > d.get() needs to be removed?!? > > no, not removed... never said that. Fair enough, you proposed to remove the behavior. Not sure how that's all that much less bad, though... > implied). the reason of being for d.get() -- to me -- is simply so you > get a chance to pass a default value, which is syntactically well-nigh > impossible with d['x']. Close, but the main reason to add d.get() was to avoid the exception. The need to specify a default value followed from that. Just From paragate at gmx.net Sat Aug 27 15:16:13 2005 From: paragate at gmx.net (Wolfgang Lipp) Date: Sat, 27 Aug 2005 15:16:13 +0200 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: <20050826134317.7DFD.JCARLSON@uci.edu> Message-ID: kay, your suggestion makes perfect sense for me, i haven't actually tried the examples tho. guess there could be a find() or index() or indices() or iterIndices() ??? function 'f' roughly with these arguments: def f( x, element, start = 0, stop = None, default = _Misfit, maxcount = None, reverse = False ) that iterates over the indices of x where element (a substring, key, or value in a sequence or iterator) is found, raising sth. like IndexError when nothing at all is found except when default is not '_Misfit' (mata-None), and starts looking from the right end when reverse is True (this *may* imply that reversed(x) is done on x where no better implementation is available). not quite sure whether it makes sense to me to always return default as the last value of the iteration -- i tend to say rather not. ah yes, only up to maxcount indices are yielded. 
bet it said that passing an iterator for x would mean that the iterator is gone up to where the last index was yielded; passing an iterator is not acceptable for reverse = True. MHO, _wolf On Sat, 27 Aug 2005 14:57:08 +0200, Kay Schluehr wrote: > > def keep(iter, default=None): > try: > return iter.next() > except StopIteration: > return default > > Together with an index iterator the user can mimic the behaviour he > wants. Instead of a ValueError a StopIteration exception can hold as > an "external" information ( other than a default value ): > > >>> keep( "abcdabc".index("bc"), default=-1) # current behaviour of the > # find() function > >>> (idx for idx in "abcdabc".rindex("bc")) # generator expression > > > Since the find() method acts on a string literal it is not easy to > replace it syntactically. But why not add functions that can be hooked > into classes whose objects are represented by literals? > > def find( string, substring): > return keep( string.index( substring), default=-1) > > str.register(find) > > >>> "abcdabc".find("bc") > 1 > > Now find() can be stored in a pure Python module without maintaining it > on interpreter level ( same as with reduce, map and filter ). > > Kay From abkhd at hotmail.com Sat Aug 27 16:18:49 2005 From: abkhd at hotmail.com (A.B., Khalid) Date: Sat, 27 Aug 2005 14:18:49 +0000 Subject: [Python-Dev] test_bz2 on Python 2.4.1 Message-ID: Reinhold Birkenfeld wrote: >>#--------------------------- Python 2.5a0 from CVS -----------------# >># Result: passes >>$ /g/projs/py25/python/dist/src/MinGW/python testbz2.py >> >> >>#--------------------------- Python 2.4.1 from CVS -----------------# >># Result: fails >>$ /g/projs/py24/python/dist/src/MinGW/python testbz2.py >>Traceback (most recent call last): >> File "testbz2.py", line 9, in ? >> lines = bz2f.readlines() >>RuntimeError: wrong sequence of bz2 library commands used > >I don't understand this. The sources for the bz2 modules are exactly equal >in both branches. I know. 
Even the tests are equal. I didn't say that these files are to
blame, I just said that the test is failing in Python 2.4.1 on Windows.

>How do you check out the 2.4.1 from CVS?

Well, I've been updating Python from CVS for more than a year now and I
doubt that this is the problem. After all, Python 2.3.5 is passing the
regrtests, and last time I checked, so is Python 2.5a0. Python 2.4.1 was
also passing all the regrtests until recently (not sure exactly when, but
it could be about a month ago).

But anyway, here is how I update my copy of Python 2.4 from CVS. Roughly,

    cvs -d :pserver:anonymous@cvs.sourceforge.net:/cvsroot/python login
    [Enter]
    cvs -z7 -d :pserver:anonymous@cvs.sourceforge.net:/cvsroot/python update -dP -r release24-maint python

And it is, more or less, the same way I check out other branches.

I will download the Python 2.4.1 source archive and build it to see
what happens. I'll report back when I am done.

Regards,
Khalid

From gvanrossum at gmail.com Sat Aug 27 16:29:07 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 27 Aug 2005 07:29:07 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <003001c5aae1$4d843280$a8bb9d8d@oemcomputer>
References: <43100E14.6080009@v.loewis.de> <003001c5aae1$4d843280$a8bb9d8d@oemcomputer>
Message-ID:

On 8/27/05, Raymond Hettinger wrote:
> [Martin]
> > For another example, file.read() returns an empty string at EOF.
>
> When my turn comes for making 3.0 proposals, I'm going to recommend
> nixing the "empty string at EOF" API. That is a carry-over from C that
> made some sense before there were iterators. Now, we have the option of
> introducing much cleaner iterator versions of these methods that use
> compact, fast, and readable for-loops instead of multi-statement
> while-loop boilerplate.

-1.
For reading lines we already have that in the status quo. For
reading bytes, I *know* that a lot of code would become uglier if the
API changed to raise EOFError exceptions. It's not a coincidence that
raw_input() raises EOFError but readline() doesn't -- the readline API
was designed after extensive experience with raw_input().

The situation is different than for find():

- there aren't two APIs that only differ in their handling of the
  exceptional case

- the error return value tests false and all non-error return values
  test true

- in many cases processing the error return value the same as
  non-error return values works just fine (as long as you have another
  way to test for termination)

Also, even if read() raised EOFError instead of returning '', code
that expects certain data wouldn't be simplified -- after attempting
to read e.g. 4 bytes, you'd still have to check that you got exactly
4, so there'd be three cases to handle (EOFError, short, good) instead
of two (short, good).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com Sat Aug 27 16:36:46 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 27 Aug 2005 07:36:46 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>
References: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>
Message-ID:

On 8/27/05, Raymond Hettinger wrote:
> --- From ConfigParser.py ---------------
>
> optname, vi, optval = mo.group('option', 'vi', 'value')
> if vi in ('=', ':') and ';' in optval:
>     # ';' is a comment delimiter only if it follows
>     # a spacing character
>     pos = optval.find(';')
>     if pos != -1 and optval[pos-1].isspace():
>         optval = optval[:pos]
> optval = optval.strip()
> . . .
>
>
> optname, vi, optval = mo.group('option', 'vi', 'value')
> if vi in ('=', ':') and ';' in optval:
>     # ';' is a comment delimiter only if it follows
>     # a spacing character
>     try:
>         pos = optval.index(';')
>     except ValueError():

I'm sure you meant "except ValueError:"

>         pass
>     else:
>         if optval[pos-1].isspace():
>             optval = optval[:pos]
> optval = optval.strip()
> . . .

That code is buggy before and after the transformation -- consider
what happens if optval *starts* with a semicolon. Also, the code is
searching optval for ';' twice. Suggestion:

    if vi in ('=',':'):
        try:
            pos = optval.index(';')
        except ValueError:
            pass
        else:
            if pos > 0 and optval[pos-1].isspace():
                optval = optval[:pos]

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From reinhold-birkenfeld-nospam at wolke7.net Sat Aug 27 16:40:36 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sat, 27 Aug 2005 16:40:36 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: References: <20050826134317.7DFD.JCARLSON@uci.edu> Message-ID: On 8/27/05, Kay Schluehr wrote: > The discourse about Python3000 has shrunken from the expectation of the > "next big thing" into a depressive rhetorics of feature elimination. > The language doesn't seem to become deeper, smaller and more powerfull > but just smaller. I understand how your perception reading python-dev would make you think that, but it's not true. There is much focus on removing things, because we want to be able to add new stuff but we don't want the language to grow. Python-dev is (correctly) very focused on the status quo and the near future, so discussions on what can be removed without hurting are valuable here. Discussions on what to add should probably happen elsewhere, since the proposals tend to range from genius to insane (sometimes within one proposal :-) and the discussion tends to become even more rampant than the discussions about changes in 2.5. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Sat Aug 27 16:46:07 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 27 Aug 2005 07:46:07 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: <20050826134317.7DFD.JCARLSON@uci.edu> <43100E14.6080009@v.loewis.de> <431041F2.6060307@v.loewis.de> Message-ID: On 8/27/05, Wolfgang Lipp wrote: > i never expected .get() > to work that way (return an unsolicited None) -- i do consider this > behavior harmful and suggest it be removed. That's a bizarre attitude. You don't read the docs and hence you want a feature you weren't aware of to be removed? I'm glad you're not on *my* team. (Emphasis mine. 
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From reinhold-birkenfeld-nospam at wolke7.net Sat Aug 27 16:50:58 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sat, 27 Aug 2005 16:50:58 +0200 Subject: [Python-Dev] test_bz2 on Python 2.4.1 In-Reply-To: References: Message-ID: A.B., Khalid wrote: >>>#--------------------------- Python 2.4.1 from CVS -----------------# [test_bz2] >>>RuntimeError: wrong sequence of bz2 library commands used >> >>I don't understand this. The sources for the bz2 modules are exactly equal >>in both branches. > > I know. Even the tests are equal. I didn't say that these files are to > blame, I just said that the test is failing in Python 2.4.1 on Windows. > cvs -d :pserver:anonymous at cvs.sourceforge.net:/cvsroot/python login > cvs -z7 -d :pserver:anonymous at cvs.sourceforge.net:/cvsroot/python update -dP > -r release24-maint python > > And it is, more or less, the same way I check out other branches. No problem here, just eliminating possibilities. Could anyone else on Windows please try the test_bz2, too? Reinhold -- Mail address is perfectly valid! From gvanrossum at gmail.com Sat Aug 27 17:03:35 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 27 Aug 2005 08:03:35 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <20050826184634.7E06.JCARLSON@uci.edu> References: <20050826134317.7DFD.JCARLSON@uci.edu> <20050826184634.7E06.JCARLSON@uci.edu> Message-ID: On 8/26/05, Josiah Carlson wrote: > Taking a look at the commits that Guido did way back in 1993, he doesn't > mention why he added .find, only that he did. Maybe it was another of > the 'functional language additions' that he now regrets, I don't know. There's nothing functional about it. I remember adding it after finding it cumbersome to write code using index/rindex. However, that was long before we added startswith(), endswith(), and 's in t' for multichar s. 
Clearly all sorts of varieties of substring matching are important, or we wouldn't have so many methods devoted to it! (Not to mention the 're' module.) However, after 12 years, I believe that the small benefit of having find() is outweighed by the frequent occurrence of bugs in its use. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Sat Aug 27 17:04:54 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 27 Aug 2005 11:04:54 -0400 Subject: [Python-Dev] empty string api for files In-Reply-To: Message-ID: <000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer> > For reading bytes, I *know* that a lot of code would become uglier if > the API changed to raise EOFError exceptions I had StopIteration in mind. Instead of writing: while 1: block = f.read(20) if block == '': break . . . We would use: for block in f.readblocks(20): . . . More beauty, a little faster, more concise, and less error-prone. Of course, there are likely better choices for the method name, but you get the gist of it. From paragate at gmx.net Sat Aug 27 17:12:45 2005 From: paragate at gmx.net (Wolfgang Lipp) Date: Sat, 27 Aug 2005 17:12:45 +0200 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: <20050826134317.7DFD.JCARLSON@uci.edu> <43100E14.6080009@v.loewis.de> <431041F2.6060307@v.loewis.de> Message-ID: On Sat, 27 Aug 2005 16:46:07 +0200, Guido van Rossum wrote: > On 8/27/05, Wolfgang Lipp wrote: >> i never expected .get() >> to work that way (return an unsolicited None) -- i do consider this >> behavior harmful and suggest it be removed. > > That's a bizarre attitude. You don't read the docs and hence you want > a feature you weren't aware of to be removed? i do read the docs, and i believe i do keep a lot of detail in my head. every now and then, tho, you piece sth together using a logic that is not 100% the way it was intended, or the way it came about.
let me say that for someone who did developement for python for a while it is natural to know that ~.get() is there for avoidance of exceptions, and default values are an afterthought, but for someone who did developement *with* python (and lacks experience of the other side) this ain't necessarily so. that said, i believe it to be more expressive and safer to demand ~.get('x',None) to be written to achieve the present behavior, and let ~.get('x') raise an exception. personally, i can live with either way, and am happier the second. just my thoughts. > I'm glad you're not on *my* team. (Emphasis mine. :-) i wonder what that would be like. _wolf From abkhd at hotmail.com Sat Aug 27 17:23:35 2005 From: abkhd at hotmail.com (A.B., Khalid) Date: Sat, 27 Aug 2005 15:23:35 +0000 Subject: [Python-Dev] test_bz2 fails on Python 2.4.1 from CVS, passes on same from source archieve Message-ID: Okay here is the output of test_bz2 on Python 2.4.1 updated and compiled fresh from CVS, and on Python 2.4.1 from the source archieve from python.org (http://www.python.org/ftp/python/2.4.1/Python-2.4.1.tar.bz2). #----------------------------------------------------------------------------- # Python 2.4.1 compiled from source archieve: # Result: passes #----------------------------------------------------------------------------- $ cd /g/projs/py241-src-arc/mingw $ python ../Lib/test/test_bz2.py testIterator (__main__.BZ2FileTest) ... ok testOpenDel (__main__.BZ2FileTest) ... ok testOpenNonexistent (__main__.BZ2FileTest) ... ok testRead (__main__.BZ2FileTest) ... ok testRead100 (__main__.BZ2FileTest) ... ok testReadChunk10 (__main__.BZ2FileTest) ... ok testReadLine (__main__.BZ2FileTest) ... ok testReadLines (__main__.BZ2FileTest) ... ok testSeekBackwards (__main__.BZ2FileTest) ... ok testSeekBackwardsFromEnd (__main__.BZ2FileTest) ... ok testSeekForward (__main__.BZ2FileTest) ... ok testSeekPostEnd (__main__.BZ2FileTest) ... ok testSeekPostEndTwice (__main__.BZ2FileTest) ... 
ok testSeekPreStart (__main__.BZ2FileTest) ... ok testUniversalNewlinesCRLF (__main__.BZ2FileTest) ... ok testUniversalNewlinesLF (__main__.BZ2FileTest) ... ok testWrite (__main__.BZ2FileTest) ... ok testWriteChunks10 (__main__.BZ2FileTest) ... ok testWriteLines (__main__.BZ2FileTest) ... ok testXReadLines (__main__.BZ2FileTest) ... ok testCompress (__main__.BZ2CompressorTest) ... ok testCompressChunks10 (__main__.BZ2CompressorTest) ... ok testDecompress (__main__.BZ2DecompressorTest) ... ok testDecompressChunks10 (__main__.BZ2DecompressorTest) ... ok testDecompressUnusedData (__main__.BZ2DecompressorTest) ... ok testEOFError (__main__.BZ2DecompressorTest) ... ok test_Constructor (__main__.BZ2DecompressorTest) ... ok testCompress (__main__.FuncTest) ... ok testDecompress (__main__.FuncTest) ... ok testDecompressEmpty (__main__.FuncTest) ... ok testDecompressIncomplete (__main__.FuncTest) ... ok ---------------------------------------------------------------------- Ran 31 tests in 6.430s OK #----------------------------------------------------------------------------- # Python 2.4.1 compiled from CVS updated even today: # Result: fails #----------------------------------------------------------------------------- $ cd /g/projs/py24/python/dist/src/MinGW $ python ../Lib/test/test_bz2.py testBug1191043 (__main__.BZ2FileTest) ... ERROR ERROR testIterator (__main__.BZ2FileTest) ... ok testModeU (__main__.BZ2FileTest) ... ok testOpenDel (__main__.BZ2FileTest) ... ok testOpenNonexistent (__main__.BZ2FileTest) ... ok testRead (__main__.BZ2FileTest) ... ok testRead100 (__main__.BZ2FileTest) ... ok testReadChunk10 (__main__.BZ2FileTest) ... ok testReadLine (__main__.BZ2FileTest) ... ok testReadLines (__main__.BZ2FileTest) ... ok testSeekBackwards (__main__.BZ2FileTest) ... ok testSeekBackwardsFromEnd (__main__.BZ2FileTest) ... ok testSeekForward (__main__.BZ2FileTest) ... ok testSeekPostEnd (__main__.BZ2FileTest) ... ok testSeekPostEndTwice (__main__.BZ2FileTest) ... 
ok testSeekPreStart (__main__.BZ2FileTest) ... ok testUniversalNewlinesCRLF (__main__.BZ2FileTest) ... ok testUniversalNewlinesLF (__main__.BZ2FileTest) ... ok testWrite (__main__.BZ2FileTest) ... ok testWriteChunks10 (__main__.BZ2FileTest) ... ok testWriteLines (__main__.BZ2FileTest) ... ok testXReadLines (__main__.BZ2FileTest) ... ok testCompress (__main__.BZ2CompressorTest) ... ok testCompressChunks10 (__main__.BZ2CompressorTest) ... ok testDecompress (__main__.BZ2DecompressorTest) ... ok testDecompressChunks10 (__main__.BZ2DecompressorTest) ... ok testDecompressUnusedData (__main__.BZ2DecompressorTest) ... ok testEOFError (__main__.BZ2DecompressorTest) ... ok test_Constructor (__main__.BZ2DecompressorTest) ... ok testCompress (__main__.FuncTest) ... ok testDecompress (__main__.FuncTest) ... ok testDecompressEmpty (__main__.FuncTest) ... ok testDecompressIncomplete (__main__.FuncTest) ... ok ====================================================================== ERROR: testBug1191043 (__main__.BZ2FileTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "../Lib/test/test_bz2.py", line 255, in testBug1191043 lines = bz2f.readlines() RuntimeError: wrong sequence of bz2 library commands used ====================================================================== ERROR: testBug1191043 (__main__.BZ2FileTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "../Lib/test/test_bz2.py", line 47, in tearDown os.unlink(self.filename) OSError: [Errno 13] Permission denied: '@test' ---------------------------------------------------------------------- Ran 33 tests in 5.980s FAILED (errors=2) Traceback (most recent call last): File "../Lib/test/test_bz2.py", line 357, in ? 
test_main() File "../Lib/test/test_bz2.py", line 353, in test_main FuncTest File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 290, in run_unittest run_suite(suite, testclass) File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 274, in run_suite raise TestFailed(msg) test.test_support.TestFailed: errors occurred; run in verbose mode for details From raymond.hettinger at verizon.net Sat Aug 27 17:54:39 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 27 Aug 2005 11:54:39 -0400 Subject: [Python-Dev] Remove str.find in 3.0? Message-ID: <001801c5ab1f$a2375bc0$a8bb9d8d@oemcomputer> [Guido] > However, after 12 years, I believe that the small benefit of having > find() is outweighed by the frequent occurrence of bugs in its use. My little code transformation exercise is bearing that out. Two of the first four cases in the standard library were buggy :-( Raymond From tim.peters at gmail.com Sat Aug 27 18:38:47 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 27 Aug 2005 12:38:47 -0400 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer> References: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer> Message-ID: <1f7befae05082709382c6701b5@mail.gmail.com> [Raymond Hettinger, rewrites some code] > ... > --- StringIO.py --------------- > > i = self.buf.find('\n', self.pos) > if i < 0: > newpos = self.len > else: > newpos = i+1 > . . . > > > try: > i = self.buf.find('\n', self.pos) > except ValueError(): > newpos = self.len > else: > newpos = i+1 > . . . You probably want "except ValueError:" in all these, not "except ValueError():".
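To make the two spellings under discussion concrete, here is a minimal sketch of the readline-style search written both ways. It assumes a plain string buffer, the function names are hypothetical, and it is not the actual StringIO source. Note that the exception version has to call index(), since find() never raises, and has to name the exception class itself in the except clause.

```python
def line_end_with_find(buf, pos):
    """Return the position just past the next newline, or len(buf) if none."""
    i = buf.find('\n', pos)
    if i < 0:                      # find() signals "not there" with -1
        return len(buf)
    return i + 1


def line_end_with_index(buf, pos):
    """Same result, using index() and a deliberately narrow try block."""
    try:
        i = buf.index('\n', pos)   # the only call expected to raise
    except ValueError:             # the class itself, not ValueError()
        return len(buf)
    return i + 1
```

Both functions are meant to return the same value for any buffer and start position; only the way a missing newline is reported differs.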
Leaving that alone, the last example particularly shows one thing I dislike about try/except here: in a language with properties, how is the code reader supposed to guess that it's specifically and only the .find() call that _can_ raise ValueError in i = self.buf.find('\n', self.pos) ? I agree it's clear enough here from context, but there's no confusion possible on this point in the original spelling: it's immediately obvious that the result of find() is the only thing being tested. There's also strong temptation to slam everything into the 'try' block, and reduce nesting: newpos = self.len try: newpos = self.buf.find('\n', self.pos) + 1 except ValueError: pass I've often seen code in the wild with, say, two-three dozen lines in a ``try`` block, with an "except AttributeError:" that was _intended_ to catch an expected AttributeError only in the second of those lines. Of course that hides legitimate bugs too. Like ``object.attr``, the result of ``string.find()`` is normally used in further computation, so the temptation is to slam the computation inside the ``try`` block too. .find() is a little delicate to use, but IME sloppy try/except practice (putting much more in the ``try`` block than the specific little operation where an exception is expected) is common, and harder to get people to change because it requires thought instead of just reading the manual to see that -1 means "not there" <0.5 wink>. Another consideration is code that needs to use .find() a _lot_. In my programs of that sort, try/except is a lot more expensive than letting -1 signal "not there". From raymond.hettinger at verizon.net Sat Aug 27 18:46:17 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sat, 27 Aug 2005 12:46:17 -0400 Subject: [Python-Dev] Remove str.find in 3.0? 
In-Reply-To: <1f7befae05082709382c6701b5@mail.gmail.com> Message-ID: <001a01c5ab26$d901adc0$a8bb9d8d@oemcomputer> [Tim] > You probably want "except ValueError:" in all these, not "except > ValueError():". Right. I was misremembering the new edict to write: raise ValueError() Raymond From tim.peters at gmail.com Sat Aug 27 19:09:20 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 27 Aug 2005 13:09:20 -0400 Subject: [Python-Dev] test_bz2 on Python 2.4.1 In-Reply-To: References: Message-ID: <1f7befae05082710091228649@mail.gmail.com> [Reinhold Birkenfeld] > Could anyone else on Windows please try the test_bz2, too? test_bz2 works fine here, on WinXP Pro SP2, under release and debug builds, on current CVS HEAD and on current CVS release24-maint branch. I built those 4 Pythons with the MS compiler, not MinGW. From jcarlson at uci.edu Sat Aug 27 19:16:34 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 27 Aug 2005 10:16:34 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: <20050826184634.7E06.JCARLSON@uci.edu> Message-ID: <20050827095203.7E0C.JCARLSON@uci.edu> Guido van Rossum wrote: > > On 8/26/05, Josiah Carlson wrote: > > Taking a look at the commits that Guido did way back in 1993, he doesn't > > mention why he added .find, only that he did. Maybe it was another of > > the 'functional language additions' that he now regrets, I don't know. > > There's nothing functional about it. I remember adding it after > finding it cumbersome to write code using index/rindex. However, that > was long before we added startswith(), endswith(), and 's in t' for > multichar s. Clearly all sorts of varieties of substring matching are > important, or we wouldn't have so many methods devoted to it! (Not to > mention the 're' module.) > > However, after 12 years, I believe that the small benefit of having > find() is outweighed by the frequent occurrence of bugs in its use. Oh, there's a good thing to bring up; regular expressions! 
re.search returns a match object on success, None on failure. With this "failure -> Exception" idea, shouldn't they raise exceptions instead? And goodness, defining a good regular expression can be quite hard, possibly leading to not insignificant "my regular expression doesn't do what I want it to do" bugs. Just look at all of those escape sequences and the syntax! It's enough to make a new user of Python gasp. Most of us are consenting adults here. If someone writes buggy code with str.find, that is unfortunate, maybe they should have used regular expressions and tested for None, maybe they should have used str.startswith (which is sometimes slower than m == n[:len(m)], but I digress), maybe they should have used str.index. But just because buggy code can be written with it, doesn't mean that it should be removed. Buggy code can, will, and has been written with every Python mechanism that has ever existed or will ever exist. With the existence of literally thousands of uses of .find and .rfind in the wild, any removal consideration should be weighed heavily - which honestly doesn't seem to be the case here with the ~15 minute reply time yesterday (just my observation and opinion). If you had been ruminating over this previously, great, but that did not seem clear to me in your original reply to Terry Reedy. - Josiah From bcannon at gmail.com Sat Aug 27 20:28:20 2005 From: bcannon at gmail.com (Brett Cannon) Date: Sat, 27 Aug 2005 11:28:20 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: Message-ID: On 8/26/05, Guido van Rossum wrote: > On 8/26/05, Terry Reedy wrote: > > Can str.find be listed in PEP 3000 (under builtins) for removal? > > Yes please. (Except it's not technically a builtin but a string method.) > Done. Added an "Atomic Types" section to the PEP as well.
-Brett From aahz at pythoncraft.com Sat Aug 27 20:51:47 2005 From: aahz at pythoncraft.com (Aahz) Date: Sat, 27 Aug 2005 11:51:47 -0700 Subject: [Python-Dev] Python 3.0 blocks? In-Reply-To: References: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer> Message-ID: <20050827185146.GA28094@panix.com> On Sat, Aug 27, 2005, Guido van Rossum wrote: > > if vi in ('=',':'): > try: pos = optval.index(';') > except ValueError: pass > else: > if pos > 0 and optval[pos-1].isspace(): > optval = optval[:pos] IIRC, one of your proposals for Python 3.0 was that single-line blocks would be banned. Is my memory wrong? -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. From Scott.Daniels at Acm.Org Sat Aug 27 23:08:08 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat, 27 Aug 2005 14:08:08 -0700 Subject: [Python-Dev] empty string api for files In-Reply-To: <000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer> References: <000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer> Message-ID: Raymond Hettinger wrote: > We would use: > for block in f.readblocks(20): > . . . What would be nice is a reader that allows a range of bytes. Often when you read a chunk, you don't care about the exact size you get, example uses include the re-blocking that makes reading from compressed data sources unnecessarily inefficient. --Scott David Daniels Scott.Daniels at Acm.Org From gvanrossum at gmail.com Sat Aug 27 23:54:13 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 27 Aug 2005 14:54:13 -0700 Subject: [Python-Dev] empty string api for files In-Reply-To: <000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer> References: <000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer> Message-ID: On 8/27/05, Raymond Hettinger wrote: > > For reading bytes, I *know* that a lot of code would become uglier if > > the API changed to raise EOFError exceptions > > I had StopIteration in mind. 
Instead of writing: > > while 1: > block = f.read(20) > if block == '': > break > . . . > > We would use: > > for block in f.readblocks(20): > . . . > > More beauty, a little faster, more concise, and less error-prone. Of > course, there are likely better choices for the method name, but you get > the gist of it. I'm not convinced. Where would you ever care about reading a file in N-byte chunks? I really think you've got a solution in search of a problem by the horns here. While this would be useful for a copying loop, it falls down for most practical uses of reading bytes (e.g. reading GIF or WAVE file). I've thought a lot about redesigning the file/stream API, but the problems this API change would solve just aren't high on my list. Much more important are transparency of the buffering (for better integration with select()), and various translations like universal newlines or character set encodings. Some of my work on this is nondist/sandbox/sio/. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Sat Aug 27 23:58:23 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 27 Aug 2005 14:58:23 -0700 Subject: [Python-Dev] Python 3.0 blocks? In-Reply-To: <20050827185146.GA28094@panix.com> References: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer> <20050827185146.GA28094@panix.com> Message-ID: On 8/27/05, Aahz wrote: > On Sat, Aug 27, 2005, Guido van Rossum wrote: > > > > if vi in ('=',':'): > > try: pos = optval.index(';') > > except ValueError: pass > > else: > > if pos > 0 and optval[pos-1].isspace(): > > optval = optval[:pos] > > IIRC, one of your proposals for Python 3.0 was that single-line blocks > would be banned. Is my memory wrong? It's a proposal. I'm on the fence about it. I was just trying to get the posting out quick before my family came knocking on my door.
:) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Sun Aug 28 00:54:41 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat, 27 Aug 2005 15:54:41 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <20050827095203.7E0C.JCARLSON@uci.edu> References: <20050826184634.7E06.JCARLSON@uci.edu> <20050827095203.7E0C.JCARLSON@uci.edu> Message-ID: On 8/27/05, Josiah Carlson wrote: > With the existence of literally thousands of uses of .find and .rfind in > the wild, any removal consideration should be weighed heavily - which > honestly doesn't seem to be the case here with the ~15 minute reply time > yesterday (just my observation and opinion). If you had been ruminating > over this previously, great, but that did not seem clear to me in your > original reply to Terry Reedy. I hadn't been ruminating about deleting it previously, but I was well aware of the likelihood of writing buggy tests for find()'s return value. I believe that str.find() is not just something that can be used to write buggy code, but something that *causes* bugs over and over again. (However, see below.) The argument that there are thousands of usages in the wild doesn't carry much weight when we're talking about Python 3.0. There are at least a similar number of modules that expect dict.keys(), zip() and range() to return lists, or that depend on the distinction between Unicode strings and 8-bit strings, or on bare except:, or any other feature that is slated for deletion in Python 3.0 for which the replacement requires careful rethinking of the code rather than a mechanical translation. The *premise* of Python 3.0 is that it drops backwards compatibility in order to make the language better in the long term. Surely you believe that the majority of all Python programs have yet to be written?
The only argument in this thread in favor of find() that made sense to me was Tim Peters' observation that the requirement to use a try/except clause leads to another kind of sloppy code. It's hard to judge which is worse -- the buggy find() calls or the buggy/cumbersome try/except code. Note that all code (unless it needs to be backwards compatible to Python 2.2 and before) which is using find() to merely detect whether a given substring is present should be using 's1 in s2' instead. Another observation: despite the derogatory remarks about regular expressions, they have one thing going for them: they provide a higher level of abstraction for string parsing, which this is all about. (They are higher level in that you don't have to be counting characters, which is about the lowest-level activity in programming -- only counting bytes is lower!) Maybe if we had a *good* way of specifying string parsing we wouldn't be needing to call find() or index() so much at all! (A good example is the code that Raymond lifted from ConfigParser: a semicolon preceded by whitespace starts a comment, other semicolons don't. Surely there ought to be a better way to write that.) All in all, I'm still happy to see find() go in Python 3.0, but I'm leaving the door ajar: if you read this post carefully, you'll know what arguments can be used to persuade me. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From abo at minkirri.apana.org.au Sun Aug 28 03:52:25 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Sun, 28 Aug 2005 11:52:25 +1000 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <20050827095203.7E0C.JCARLSON@uci.edu> References: <20050826184634.7E06.JCARLSON@uci.edu> <20050827095203.7E0C.JCARLSON@uci.edu> Message-ID: <1125193945.5215.26.camel@localhost> On Sat, 2005-08-27 at 10:16 -0700, Josiah Carlson wrote: > Guido van Rossum wrote: [...] > Oh, there's a good thing to bring up; regular expressions! 
re.search > returns a match object on success, None on failure. With this "failure > -> Exception" idea, shouldn't they raise exceptions instead? And > goodness, defining a good regular expression can be quite hard, possibly > leading to not insignificant "my regular expression doesn't do what I > want it to do" bugs. Just look at all of those escape sequences and the > syntax! It's enough to make a new user of Python gasp. I think re.match() returning None is an example of 1b (as categorised by Terry Reedy). In this particular case a 1b style response is OK. Why; 1) any successful match evaluates to "True", and None evaluates to "False". This allows simple code like; if myreg.match(s): do something. Note you can't do this for find, as 0 is a successful "find" and evaluates to False, whereas other results including -1 evaluate to True. Even worse, -1 is a valid index. 2) exceptions are for unexpected events, where unexpected means "much less likely than other possibilities". The re.match() operation asks "does this match this", which implies you have an about even chance of not matching... ie a failure to match is not unexpected. The result None makes sense... "what match did we get? None, OK". For str.index() you are asking "give me the index of this inside this", which implies you expect it to be in there... ie not finding it _is_ unexpected, and should raise an exception. Note that re.match() returning None will raise exceptions if the rest of your code doesn't expect it; index = myreg.match(s).start() tail = s[index:] This will raise an exception if there was no match. Unlike str.find(); index = s.find(r) tail = s[index:] Which will happily return the last character if there was no match. This is why find() should return None instead of -1. 
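The contrast drawn above can be seen in a few lines. The following is only an illustrative sketch with hypothetical helper names; it uses re.search rather than a compiled pattern's match() so that the match position is not always 0.

```python
import re


def tail_with_find(s, sub):
    index = s.find(sub)        # -1 when sub is absent
    return s[index:]           # s[-1:] is the last character, no error raised


def tail_with_index(s, sub):
    return s[s.index(sub):]    # ValueError when sub is absent


def tail_with_search(pattern, s):
    m = re.search(pattern, s)  # None when nothing matches
    return s[m.start():]       # AttributeError when m is None
```

With find(), a failed search quietly produces a plausible-looking string, whereas the other two spellings fail at the point where the missing result is first used.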
> With the existence of literally thousands of uses of .find and .rfind in > the wild, any removal consideration should be weighed heavily - which > honestly doesn't seem to be the case here with the ~15 minute reply time > yesterday (just my observation and opinion). If you had been ruminating > over this previously, great, but that did not seem clear to me in your > original reply to Terry Reedy. bear in mind they are talking about Python 3.0... I think :-) -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From sharprazor at gmail.com Sun Aug 28 06:02:02 2005 From: sharprazor at gmail.com (FAN) Date: Sun, 28 Aug 2005 12:02:02 +0800 Subject: [Python-Dev] Any detail list of change between version 2.1-2.2-2.3-2.4 of Python? Message-ID: <51ec6a95050827210210a408e9@mail.gmail.com> hi, all As you know, Jython (the Java version of Python) has only a stable 2.1 release, and two alpha versions were released after 3 years. So if it is to evolve to 2.2, 2.3 or 2.4 like Python, a detailed change list is needed, and it would be great if there were some test case scripts to test the new implementation. Does Python have this kind of thing? Where can I find them, or something like them? Regards FAN From jcarlson at uci.edu Sun Aug 28 07:52:31 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat, 27 Aug 2005 22:52:31 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <1125193945.5215.26.camel@localhost> References: <20050827095203.7E0C.JCARLSON@uci.edu> <1125193945.5215.26.camel@localhost> Message-ID: <20050827215414.7E27.JCARLSON@uci.edu> Donovan Baarda wrote: > > On Sat, 2005-08-27 at 10:16 -0700, Josiah Carlson wrote: > > Guido van Rossum wrote: > [...] > > Oh, there's a good thing to bring up; regular expressions! re.search > > returns a match object on success, None on failure. With this "failure > > -> Exception" idea, shouldn't they raise exceptions instead?
And > > goodness, defining a good regular expression can be quite hard, possibly > > leading to not insignificant "my regular expression doesn't do what I > > want it to do" bugs. Just look at all of those escape sequences and the > > syntax! It's enough to make a new user of Python gasp. > > I think re.match() returning None is an example of 1b (as categorised by > Terry Reedy). In this particular case a 1b style response is OK. Why; My tongue was firmly planted in my cheek during my discussion of regular expressions. I was using it as an example of when one starts applying some arbitrary rule to one example, and not noticing other examples that do very similar, if not the same thing. [snip discussion of re.match, re.search, str.find] If you are really going to compare re.match, re.search and str.find, you need to point out that neither re.match nor re.search raise an exception when something isn't found (only when you try to work with None). This puts str.index as the odd-man-out in this discussion of searching a string - so the proposal of tossing str.find as the 'weird one' is a little strange. One thing that has gotten my underwear in a twist is that no one has really offered up a transition mechanism from "str.find working like now" and some future "str.find or lack of" other than "use str.index". Obviously, I personally find the removal of str.find to be a nonstarter (don't make me catch exceptions or use regular expressions when both are unnecessary, please), but a proper transition of str.find from -1 to None on failure would be beneficial (can which one be chosen at runtime via __future__ import?). During a transition which uses __future__, it would encourage the /proper/ use of str.find in all modules and extensions in which use it... x = y.find(z) if x >= 0: #... 
Forcing people to use the proper semantic in their modules so as to be compatible with other modules which may or may not use a str.find that returns None, would (I believe) result in an overall reduction (if not elimination) of bugs stemming from str.find, and would prevent former str.find users from stumbling down the try/except/else misuse that Tim Peters highlighted. Heck, if you can get the __future__ import working for choosing which str.find to use (on a global, not per-module basis), I say toss it into 2.6, or even 2.5 if there is really a push for this prior to 3.0. - Josiah From tjreedy at udel.edu Sun Aug 28 07:51:48 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 28 Aug 2005 01:51:48 -0400 Subject: [Python-Dev] Any detail list of change between version2.1-2.2-2.3-2.4 of Python? References: <51ec6a95050827210210a408e9@mail.gmail.com> Message-ID: "FAN" wrote in message news:51ec6a95050827210210a408e9 at mail.gmail.com... > As you know, Jython (the Java version of Python) has only a stable 2.1 > release, and two alpha versions were released after 3 years. > So if it is to evolve to 2.2, 2.3 or 2.4 like Python, a detailed > change list is needed, and it would be great if there were some test case > scripts to test the new implementation. > Does Python have this kind of thing? Where can I find them, or > something like them? I believe this question is off-topic here, which is for discussion of future changes. If you ask the same question on comp.lang.python or the mail or gmane.org equivalent, or perhaps in the search box at python.org, I am sure you will get an answer. Terry J. Reedy From steve at holdenweb.com Sun Aug 28 08:58:39 2005 From: steve at holdenweb.com (Steve Holden) Date: Sun, 28 Aug 2005 02:58:39 -0400 Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <20050827215414.7E27.JCARLSON@uci.edu> References: <20050827095203.7E0C.JCARLSON@uci.edu> <1125193945.5215.26.camel@localhost> <20050827215414.7E27.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > Donovan Baarda wrote: [...] > > One thing that has gotten my underwear in a twist is that no one has > really offered up a transition mechanism from "str.find working like now" > and some future "str.find or lack of" other than "use str.index". > Obviously, I personally find the removal of str.find to be a nonstarter > (don't make me catch exceptions or use regular expressions when both are > unnecessary, please), but a proper transition of str.find from -1 to > None on failure would be beneficial (can which one be chosen at runtime > via __future__ import?). > > During a transition which uses __future__, it would encourage the > /proper/ use of str.find in all modules and extensions in which use it... > > x = y.find(z) > if x >= 0: > #... > It does seem rather fragile to rely on the continuation of the current behavior >>> None >= 0 False for the correctness of "proper usage". Is this guaranteed in future implementations? Especially when: >>> type(None) >= 0 True > Forcing people to use the proper semantic in their modules so as to be > compatible with other modules which may or may not use str.find returns > None, would (I believe) result in an overall reduction (if not > elimination) of bugs stemming from str.find, and would prevent former > str.find users from stumbling down the try/except/else misuse that Tim > Peters highlighted. > Once "str.find() returns None to fail" becomes the norm then surely the correct usage would be x = y.find(z) if x is not None: #... which is still a rather ugly paradigm, but acceptable. So the transition is bound to be troubling. 
> Heck, if you can get the __future__ import working for choosing which > str.find to use (on a global, not per-module basis), I say toss it into > 2.6, or even 2.5 if there is really a push for this prior to 3.0 . The real problem is surely that one of find()'s legitimate return values evaluates false in a Boolean context. It's especially troubling that the value that does so doesn't indicate search failure. I'd prefer to live with the wart until 3.0 introduces something more satisfactory, or simply removes find() altogether. Otherwise the resulting code breakage when the future arrives just causes unnecessary pain. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ From jcarlson at uci.edu Sun Aug 28 12:50:17 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 28 Aug 2005 03:50:17 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: <20050827215414.7E27.JCARLSON@uci.edu> Message-ID: <20050828030405.7E2D.JCARLSON@uci.edu> Steve Holden wrote: > > Josiah Carlson wrote: > > Donovan Baarda wrote: > [...] > > > > One thing that has gotten my underwear in a twist is that no one has > > really offered up a transition mechanism from "str.find working like now" > > and some future "str.find or lack of" other than "use str.index". > > Obviously, I personally find the removal of str.find to be a nonstarter > > (don't make me catch exceptions or use regular expressions when both are > > unnecessary, please), but a proper transition of str.find from -1 to > > None on failure would be beneficial (can which one be chosen at runtime > > via __future__ import?). > > > > During a transition which uses __future__, it would encourage the > > /proper/ use of str.find in all modules and extensions in which use it... > > > > x = y.find(z) > > if x >= 0: > > #... 
> > > It does seem rather fragile to rely on the continuation of the current > behavior > > >>> None >= 0 > False Please see this previous post on None comparisons and why it is unlikely to change: http://mail.python.org/pipermail/python-dev/2003-December/041374.html > for the correctness of "proper usage". Is this guaranteed in future > implementations? Especially when: > > >>> type(None) >= 0 > True That is an interesting, but subjectively useless comparison: >>> type(0) >= 0 True >>> type(int) >= 0 True When do you ever compare the type of an object with the value of another object? > > Forcing people to use the proper semantic in their modules so as to be > > compatible with other modules which may or may not use str.find returns > > None, would (I believe) result in an overall reduction (if not > > elimination) of bugs stemming from str.find, and would prevent former > > str.find users from stumbling down the try/except/else misuse that Tim > > Peters highlighted. > > > Once "str.find() returns None to fail" becomes the norm then surely the > correct usage would be > > x = y.find(z) > if x is not None: > #... > > which is still a rather ugly paradigm, but acceptable. So the transition > is bound to be troubling. Perhaps, which is why I offered "x >= 0". > > Heck, if you can get the __future__ import working for choosing which > > str.find to use (on a global, not per-module basis), I say toss it into > > 2.6, or even 2.5 if there is really a push for this prior to 3.0 . > > The real problem is surely that one of find()'s legitimate return values > evaluates false in a Boolean context. It's especially troubling that the > value that does so doesn't indicate search failure. I'd prefer to live > with the wart until 3.0 introduces something more satisfactory, or > simply removes find() altogether. Otherwise the resulting code breakage > when the future arrives just causes unnecessary pain. 
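To make the pitfall in the quoted paragraph concrete, here is a minimal sketch. The helper names contains_buggy and contains_ok are made up for illustration, and it assumes the current find() semantics (an index on success, -1 on failure):

```python
def contains_buggy(haystack, needle):
    # Buggy: uses the index itself as a truth value.
    # A match at position 0 is falsy, while the -1 failure code is truthy.
    return bool(haystack.find(needle))


def contains_ok(haystack, needle):
    # Compare against the sentinel explicitly instead.
    return haystack.find(needle) >= 0


match_at_start = contains_buggy('spam', 'sp')   # False, although 'sp' is found
no_match = contains_buggy('spam', 'x')          # True, although 'x' is absent
```

When only presence matters, `needle in haystack` avoids the index altogether.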
Here's a current (horrible but common) solution:

    x = string.find(substring) + 1
    if x:
        x -= 1
        ...

...I'm up way too late.

 - Josiah

From gregory.lielens at fft.be Sun Aug 28 13:09:02 2005
From: gregory.lielens at fft.be (Gregory Lielens)
Date: Sun, 28 Aug 2005 13:09:02 +0200
Subject: [Python-Dev] info/advices about python readline implementation
Message-ID: <1125227342.13393.6.camel@Athlon64.home>

Hi,

Lisandro Dalcin and I are working on a common version of our patches ([1232343],[955928]) that we plan to submit soon (this would close the two previously proposed patches, and we also plan to review 5 other patches to push this one before 2.5 ;-) ).

We would like this new patch to be as clean and as safe as possible, but to do so we would need some information and advice from the list, in particular from people who have worked on the readline C implementation, i.e. in modules Modules/readline.c, Parser/myreadline.c (PyOS_StdioReadline, PyOS_StdioReadline, vms__StdioReadline), Python/bltinmodule.c (builtin_raw_input).

First a general question about implementation guidelines for CPython:
-is it ok to initialize a static pointer to a non-null value (the address of a predefined function) at compile time? ANSI-C (or even pre-ansi C afaik) accepts this, we have tested it on various linux and unix, and there are occurrences of similar constructs in the python C sources, but not so many and not for function pointers (or I did not find them ;) ). We wonder if this can cause problems on some platforms that do not correctly implement the C standard(s) but that python has to support nonetheless, or if there is a feeling against it...The idea is to initialize PyOS_ReadlineFunctionPointer this way.

Then something about the VMS platform support:
-readline seems to make use of the extern function vms__StdioReadline() on VMS...Where can we find the source or doc about this function?
In particular, we would like to know if this function calls (or can call) PyOS_StdioReadline, which would cause infinite recursion in some versions of our patch....without having access to VMS for testing or info about vms__StdioReadline, this is impossible to know...

Thanks for any info,

Greg.

From mozbugbox at yahoo.com.au Sun Aug 28 12:17:21 2005
From: mozbugbox at yahoo.com.au (JustFillBug)
Date: Sun, 28 Aug 2005 10:17:21 +0000 (UTC)
Subject: [Python-Dev] Remove str.find in 3.0?
References:
Message-ID:

On 2005-08-26, Terry Reedy wrote:
> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?
>

With all the discussion, I think you guys should realize that the find/index methods are actually convenience functions which do 2 things in one call:

1) Does the key exist?
2) If the key exists, find where it is.

But whether you use find or index, in the end you *have to* break it into 2 steps in order to write bug-free code. Without find, you can do:

    if s in txt:
        i = txt.index(s)
        ...
    else:
        pass

or:

    try:
        i = txt.index(s)
        ...
    except ValueError:
        pass

With find:

    i = txt.find(s)
    if i >= 0:
        ...
    else:
        pass

The code is about the same, except that with the exception the failure test is pushed far away from the call instead of following it immediately. Not much coding is saved.

From abkhd at hotmail.com Sun Aug 28 13:24:36 2005
From: abkhd at hotmail.com (A.B., Khalid)
Date: Sun, 28 Aug 2005 11:24:36 +0000
Subject: [Python-Dev] test_bz2 and Python 2.4.1
Message-ID:

Okay. Even though I know that most people here would probably find it difficult to give input when MinGW is used to build Python, I am going to post what I found out so far anyway concerning the test_bz2 situation for referencing purposes.
--------------------------------------------------------------------
Python version   Mode used     Location of test   Result
from CVS
--------------------------------------------------------------------
Python 2.5a0     normal        ../Lib/test/       PASS
Python 2.5a0     normal        CWD of Python      PASS
Python 2.5a0     interactive   ../Lib/test/       PASS
Python 2.5a0     interactive   CWD of Python      PASS
Python 2.4.1     normal        ../Lib/test/       FAIL
Python 2.4.1     normal        CWD of Python      PASS
Python 2.4.1     interactive   ../Lib/test/       PASS
Python 2.4.1     interactive   CWD of Python      PASS
--------------------------------------------------------------------

For python 2.4.1, tried both bzip2-1.0.2, and bzip2-1.0.3 on Win98 SE, and WinXP Pro SP2, using MinGW 3.4.4.

I'll try to see what else I can find out.

From raymond.hettinger at verizon.net Sun Aug 28 14:05:54 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 28 Aug 2005 08:05:54 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To:
Message-ID: <000301c5abc8$d7fe3fe0$d206a044@oemcomputer>

[Guido]
> Another observation: despite the derogatory remarks about regular
> expressions, they have one thing going for them: they provide a higher
> level of abstraction for string parsing, which this is all about.
> (They are higher level in that you don't have to be counting
> characters, which is about the lowest-level activity in programming --
> only counting bytes is lower!)
>
> Maybe if we had a *good* way of specifying string parsing we wouldn't
> be needing to call find() or index() so much at all! (A good example
> is the code that Raymond lifted from ConfigParser: a semicolon
> preceded by whitespace starts a comment, other semicolons don't.
> Surely there ought to be a better way to write that.)

A higher level abstraction is surely the way to go.
I looked over the use cases for find and index. Aside from cases which are now covered by the "in" operator, it looks like you almost always want the index to support a subsequent partition of the string.

That suggests that we need a variant of split() that has been customized for typical find/index use cases. Perhaps introduce a new pair of methods, partition() and rpartition() which work like this:

    >>> s = 'http://www.python.org'
    >>> s.partition('://')
    ('http', '://', 'www.python.org')
    >>> s.rpartition('.')
    ('http://www.python', '.', 'org')
    >>> s.partition('?')
    ('http://www.python.org', '', '')

The idea is still preliminary and I have only applied it to a handful of the library's find() and index() examples. Here are some of the design considerations:

* The function always succeeds unless the separator argument is not a string type or is an empty string. So, a typical call doesn't have to be wrapped in a try-suite for normal usage.

* The split invariant is:
      s == ''.join(s.partition(t))

* The result of the partition is always a three element tuple. This allows the results to be unpacked directly:
      head, sep, tail = s.partition(t)

* The use cases for find() indicate a need both to test for the presence of the split element and then to make a slice at that point. If we used a contains test for the first step, we could end up having to search the string twice (once for detection and once for splitting). However, by providing the middle element of the result tuple, we can determine found or not-found without an additional search. Accordingly, the middle element has a nice Boolean interpretation with '' for not-found and a non-empty string meaning found. Given (a,b,c)=s.partition(p), the following invariant holds:
      b == '' or b is p

* Returning the left, center, and right portions of the split supports a simple programming pattern for repeated partitions:
      while s:
          head, part, s = s.partition(t)
          . . .
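The design considerations above can be exercised with a short sketch. It assumes an interpreter in which partition() and rpartition() already exist as string methods with the semantics proposed here (in this thread they are still only a proposal); split_fields is a hypothetical helper name used for illustration.

```python
s = 'http://www.python.org'

# Three-element result, unpacked directly.
head, sep, tail = s.partition('://')

# Split invariant: the pieces always rebuild the original string.
assert head + sep + tail == s

# The middle element doubles as a found / not-found flag.
_, found, _ = s.partition('?')
assert found == ''


def split_fields(s, t):
    """Collect the pieces of s separated by t using repeated partitions."""
    fields = []
    while s:
        head, _, s = s.partition(t)
        fields.append(head)
    return fields


fields = split_fields('a,b,c', ',')   # ['a', 'b', 'c']
```

One consequence of the `while s` pattern worth noting: a trailing empty field is dropped, because the loop stops as soon as the remainder is empty.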
Of course, if this idea survives the day, then I'll meet my own requirements and write a context diff on the standard library. That ought to give a good indication of how well the new methods meet existing needs and whether the resulting code is better, cleaner, clearer, faster, etc. Raymond From pinard at iro.umontreal.ca Sun Aug 28 14:21:05 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Sun, 28 Aug 2005 08:21:05 -0400 Subject: [Python-Dev] Python 3.0 blocks? In-Reply-To: References: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer> <20050827185146.GA28094@panix.com> Message-ID: <20050828122105.GA6786@phenix.progiciels-bpi.ca> [Guido van Rossum] > [Aahz] > > IIRC, one of your proposals for Python 3.0 was that single-line > > blocks would be banned. Is my memory wrong? > It's a proposal. I'm on the fence about it. A difficult decision indeed. Most single line blocks I've seen would be more legible if they were written with two lines each, so I'm carefully avoiding them as a personal rule. But each rule has exceptions. There are a few rare cases, usually sequences of repetitive code, where single line blocks well succeed in stressing both the repetitive structure and the differences, making the code more legible then. As someone well put it already, this is all about Python helping good coders at writing good code, against Python preventing bad coders from writing bad code. Sadly enough, looking around, it seems Python could be a bit more aggressive against bad practices in this particular case, even if this might hurt good coders once in a while. But I'm not sure! -- Fran?ois Pinard http://pinard.progiciels-bpi.ca From mal at egenix.com Sun Aug 28 15:10:14 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 28 Aug 2005 15:10:14 +0200 Subject: [Python-Dev] Remove str.find in 3.0? 
In-Reply-To: <000301c5abc8$d7fe3fe0$d206a044@oemcomputer> References: <000301c5abc8$d7fe3fe0$d206a044@oemcomputer> Message-ID: <4311B7B6.8070503@egenix.com> Raymond Hettinger wrote: > [Guido] > >>Another observation: despite the derogatory remarks about regular >>expressions, they have one thing going for them: they provide a higher >>level of abstraction for string parsing, which this is all about. >>(They are higher level in that you don't have to be counting >>characters, which is about the lowest-level activity in programming -- >>only counting bytes is lower!) >> >>Maybe if we had a *good* way of specifying string parsing we wouldn't >>be needing to call find() or index() so much at all! (A good example >>is the code that Raymond lifted from ConfigParser: a semicolon >>preceded by whitespace starts a comment, other semicolons don't. >>Surely there ought to be a better way to write that.) > > A higher level abstraction is surely the way to go. I may be missing something, but why invent yet another parsing method - we already have the re module. I'd suggest to use it :-) If re is not fast enough or you want more control over the parsing process, you could also have a look at mxTextTools: http://www.egenix.com/files/python/mxTextTools.html -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 28 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From stephen at xemacs.org Sun Aug 28 15:27:04 2005 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Sun, 28 Aug 2005 22:27:04 +0900 Subject: [Python-Dev] [Python-checkins] python/dist/src setup.py, 1.219, 1.220 In-Reply-To: <430B9186.3010106@v.loewis.de> ( =?iso-8859-1?q?Martin_v=2E_L=F6wis's_message_of?= "Tue, 23 Aug 2005 23:13:42 +0200") References: <000201c5a816$3caacaa0$8901a044@oemcomputer> <430B9186.3010106@v.loewis.de> Message-ID: <87hddambtj.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Martin" == Martin v L?wis writes: Martin> Raymond Hettinger wrote: >> Do you have an ANSI-strict option with your compiler? Martin> gcc does have an option to force c89 compliance, but there Martin> is a good chance that Python stops compiling with option: Martin> on many systems, essential system headers fail to comply Martin> with C89 (in addition, activating that mode also makes Martin> many extensions unavailable). However, it might be a reasonable pre-checkin test to try compiling changed files with the option enabled, depending on the number of nonconforming system headers, etc., and grep the output for whinging about c89-nonconformance. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From aahz at pythoncraft.com Sun Aug 28 16:00:41 2005 From: aahz at pythoncraft.com (Aahz) Date: Sun, 28 Aug 2005 07:00:41 -0700 Subject: [Python-Dev] Any detail list of change between version2.1-2.2-2.3-2.4 of Python? In-Reply-To: References: <51ec6a95050827210210a408e9@mail.gmail.com> Message-ID: <20050828140041.GA25264@panix.com> On Sun, Aug 28, 2005, Terry Reedy wrote: > "FAN" wrote in message > news:51ec6a95050827210210a408e9 at mail.gmail.com... >> >> You know Jython (Java version of Python) has only a stable version >> of 2.1, and two alpha version was release after 3 years. 
So if it >> wants to evolve to 2.2 , 2.3 or 2.4 as Python, some detail change >> list was need, and it's great if there are some test case script to >> test the new implemention version. So does Python has this kinds of >> things? Where can I find them or something like this? All changes are supposed to be in Misc/NEWS. You should also be able to use most of the test cases in Python itself, which are in Lib/test/ However, you should also read http://www.catb.org/~esr/faqs/smart-questions.html Had you read the various docs about Python development, you would certainly have figured out Lib/test/ on your own. > I believe this question is off-topic here, which is for dicussion of > future changes. If you ask the same question on comp.lang.python or > the mail or gmane.org equivalent, or perhaps in the search box at > python.org, I am sure you will get an answer. Because this is about the future of Jython, it's entirely appropriate for discussion here -- python-dev is *NOT* just for CPython. (It's similar to questions about porting.) As long as people ask questions of the appropriate level, that is. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. From tjreedy at udel.edu Sun Aug 28 16:29:59 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 28 Aug 2005 10:29:59 -0400 Subject: [Python-Dev] empty string api for files References: <000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer> Message-ID: > I'm not convinced. Where would you ever care about reading a file in > N-bytes chucks? This was once a standard paradigm for IBM mainframe files. I vaguely remember having to specify the block/record size when opening such files. I have no idea of today's practice though. Terry J. 
Reedy From raymond.hettinger at verizon.net Sun Aug 28 16:32:24 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 28 Aug 2005 10:32:24 -0400 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <4311B7B6.8070503@egenix.com> Message-ID: <000701c5abdd$4f0b7440$d206a044@oemcomputer> [Marc-Andre Lemburg] > I may be missing something, but why invent yet another parsing > method - we already have the re module. I'd suggest to > use it :-) > > If re is not fast enough or you want more control over the > parsing process, you could also have a look at mxTextTools: > > http://www.egenix.com/files/python/mxTextTools.html Both are excellent tools. Neither is as lightweight, as trivial to learn, or as transparently obvious as the proposed s.partition(sep). The idea is to find a viable replacement for s.find(). Looking at sample code transformations shows that the high-power mxTextTools and re approaches do not simplify code that currently uses s.find(). In contrast, the proposed partition() method is a joy to use and has no surprises. The following code transformation shows unbeatable simplicity and clarity. --- From CGIHTTPServer.py --------------- def run_cgi(self): """Execute a CGI script.""" dir, rest = self.cgi_info i = rest.rfind('?') if i >= 0: rest, query = rest[:i], rest[i+1:] else: query = '' i = rest.find('/') if i >= 0: script, rest = rest[:i], rest[i:] else: script, rest = rest, '' . . . def run_cgi(self): """Execute a CGI script.""" dir, rest = self.cgi_info rest, _, query = rest.rpartition('?') script, _, rest = rest.partition('/') . . . The new proposal does not help every use case though. In ConfigParser.py, the problem description reads, "a semi-colon is a comment delimiter only if it follows a spacing character". This cries out for a regular expression. In StringIO.py, since the task at hand IS calculating an index, an indexless higher level construct doesn't help. 
However, many of the other s.find() use cases in the library simplify as readily and directly as the above cgi server example.

Raymond

-------------------------------------------------------

P.S. FWIW, if you want to experiment with it, here is a concrete implementation of partition() expressed as a function:

    def partition(s, t):
        """ Returns a three element tuple, (head, sep, tail) where:

            head + sep + tail == s
            t not in head
            sep == '' or sep is t
            bool(sep) == (t in s)       # sep indicates if the string was found

        >>> s = 'http://www.python.org'
        >>> partition(s, '://')
        ('http', '://', 'www.python.org')
        >>> partition(s, '?')
        ('http://www.python.org', '', '')
        >>> partition(s, 'http://')
        ('', 'http://', 'www.python.org')
        >>> partition(s, 'org')
        ('http://www.python.', 'org', '')

        """
        if not isinstance(t, basestring) or not t:
            raise ValueError('partition argument must be a non-empty string')
        parts = s.split(t, 1)
        if len(parts) == 1:
            result = (s, '', '')
        else:
            result = (parts[0], t, parts[1])
        assert len(result) == 3
        assert ''.join(result) == s
        assert result[1] == '' or result[1] is t
        assert t not in result[0]
        return result

    import doctest
    print doctest.testmod()

From raymond.hettinger at verizon.net Sun Aug 28 16:35:05 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 28 Aug 2005 10:35:05 -0400
Subject: [Python-Dev] empty string api for files
In-Reply-To:
Message-ID: <000801c5abdd$af18fe20$d206a044@oemcomputer>

> > I'm not convinced. Where would you ever care about reading a file in
> > N-bytes chucks?
>
> This was once a standard paradigm for IBM mainframe files. I vaguely
> remember having to specify the block/record size when opening such files.
> I have no idea of today's practice though.

I believe this still comes up in 100% of the cases where you're buffering reads of large binary files. Given lots of RAM, this probably doesn't come up as much nowadays.
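For reference, the N-byte-chunk pattern under discussion might look like the sketch below. It assumes a Python in which binary reads return bytes objects, so the end-of-file sentinel is b'' rather than ''; read_chunks is a hypothetical helper name, and the in-memory BytesIO only stands in for a large binary file.

```python
import io
from functools import partial


def read_chunks(fileobj, size):
    """Return an iterator over successive blocks of at most `size` bytes."""
    # read(size) returns an empty bytes object at end of file; the
    # two-argument form of iter() stops as soon as it sees that sentinel.
    return iter(partial(fileobj.read, size), b'')


data = io.BytesIO(b'abcdefghij')
chunks = list(read_chunks(data, 4))   # [b'abcd', b'efgh', b'ij']
```

The last block may be shorter than `size`, so callers that need fixed-width records still have to check its length.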
Raymond From guido at python.org Sun Aug 28 17:23:23 2005 From: guido at python.org (Guido van Rossum) Date: Sun, 28 Aug 2005 08:23:23 -0700 Subject: [Python-Dev] info/advices about python readline implementation In-Reply-To: <1125227342.13393.6.camel@Athlon64.home> References: <1125227342.13393.6.camel@Athlon64.home> Message-ID: On 8/28/05, Gregory Lielens wrote: > -is it ok to initialize a static pointer to a non-null value (the > address of a predefined function) at compile time? Yes. All of Python's standard types and modules use this idiom. > We wonder if this can cause problem on some platforms not correctly > implementing C standard(s) but that python have to support nonetheless, If a platform doesn't have a working C89 compiler, we generally wait for the compiler to be fixed (or for GCC to be ported). We might compromise when a platform doesn't support full POSIX, but this seems purely a language issue and there can be no excuses -- C89 is older than Python! > Then something about the VMS platform support: > -readline seems to make uses of the extern function > vms__StdioReadline() on VMS...Where can we find the source or doc about > this function? In particular, we would like to know if this function > call (or can call) PyOS_StdioReadline, which would cause infinite > recursion in some version of our patch....without havind access to VMS > for testing or info about vms__StdioReadline, this is impossible to > know... I have no idea; Googling for it only showed up discussions of readline.c. You might write the authors of the patch that introduced it (the same Google query will find the info); if they don't respond, I'm not sure that it's worth worrying about. My personal guess is that it's probably a VMS internal function, which would reduce the probability of it calling back to PyOS_StdioReadline to zero. It can't be a Python specific thing, because it doesn't have a 'Py' prefix. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From arigo at tunes.org Sun Aug 28 18:05:03 2005 From: arigo at tunes.org (Armin Rigo) Date: Sun, 28 Aug 2005 18:05:03 +0200 Subject: [Python-Dev] PyPy release 0.7.0 Message-ID: <20050828160503.GA4908@code1.codespeak.net> Hi Python-dev'ers, The first Python implementation of Python is now also the second C implementation of Python :-) Samuele & Armin (& the rest of the team) -+-+- pypy-0.7.0: first PyPy-generated Python Implementations ============================================================== What was once just an idea between a few people discussing on some nested mailing list thread and in a pub became reality ... the PyPy development team is happy to announce its first public release of a fully translatable self contained Python implementation. The 0.7 release showcases the results of our efforts in the last few months since the 0.6 preview release which have been partially funded by the European Union: - whole program type inference on our Python Interpreter implementation with full translation to two different machine-level targets: C and LLVM - a translation choice of using a refcounting or Boehm garbage collectors - the ability to translate with or without thread support - very complete language-level compliancy with CPython 2.4.1 What is PyPy (about)? ------------------------------------------------ PyPy is a MIT-licensed research-oriented reimplementation of Python written in Python itself, flexible and easy to experiment with. It translates itself to lower level languages. Our goals are to target a large variety of platforms, small and large, by providing a compilation toolsuite that can produce custom Python versions. Platform, Memory and Threading models are to become aspects of the translation process - as opposed to encoding low level details into a language implementation itself. 
Eventually, dynamic optimization techniques - implemented as another translation aspect - should become robust against language changes. Note that PyPy is mainly a research and development project and does not by itself focus on getting a production-ready Python implementation although we do hope and expect it to become a viable contender in that area sometime next year. Where to start? ----------------------------- Getting started: http://codespeak.net/pypy/dist/pypy/doc/getting-started.html PyPy Documentation: http://codespeak.net/pypy/dist/pypy/doc/ PyPy Homepage: http://codespeak.net/pypy/ The interpreter and object model implementations shipped with the 0.7 version can run on their own and implement the core language features of Python as of CPython 2.4. However, we still do not recommend using PyPy for anything else than for education, playing or research purposes. Ongoing work and near term goals --------------------------------- PyPy has been developed during approximately 15 coding sprints across Europe and the US. It continues to be a very dynamically and incrementally evolving project with many one-week meetings to follow. You are invited to consider coming to the next such meeting in Paris mid October 2005 where we intend to plan and head for an even more intense phase of the project involving building a JIT-Compiler and enabling unique features not found in other Python language implementations. PyPy has been a community effort from the start and it would not have got that far without the coding and feedback support from numerous people. Please feel free to give feedback and raise questions. 
contact points: http://codespeak.net/pypy/dist/pypy/doc/contact.html contributor list: http://codespeak.net/pypy/dist/pypy/doc/contributor.html have fun, the pypy team, of which here is a partial snapshot of mainly involved persons: Armin Rigo, Samuele Pedroni, Holger Krekel, Christian Tismer, Carl Friedrich Bolz, Michael Hudson, Eric van Riet Paap, Richard Emslie, Anders Chrigstroem, Anders Lehmann, Ludovic Aubry, Adrien Di Mascio, Niklaus Haldimann, Jacob Hallen, Bea During, Laura Creighton, and many contributors ... PyPy development and activities happen as an open source project and with the support of a consortium partially funded by a two year European Union IST research grant. Here is a list of the full partners of that consortium: Heinrich-Heine University (Germany), AB Strakt (Sweden) merlinux GmbH (Germany), tismerysoft GmbH(Germany) Logilab Paris (France), DFKI GmbH (Germany) ChangeMaker (Sweden), Impara (Germany) From gregory.lielens at fft.be Sun Aug 28 18:06:53 2005 From: gregory.lielens at fft.be (Gregory Lielens) Date: Sun, 28 Aug 2005 18:06:53 +0200 Subject: [Python-Dev] info/advices about python readline implementation In-Reply-To: References: <1125227342.13393.6.camel@Athlon64.home> Message-ID: <1125245214.13393.18.camel@Athlon64.home> > > Then something about the VMS platform support: > > -readline seems to make uses of the extern function > > vms__StdioReadline() on VMS...Where can we find the source or doc about > > this function? In particular, we would like to know if this function > > call (or can call) PyOS_StdioReadline, which would cause infinite > > recursion in some version of our patch....without havind access to VMS > > for testing or info about vms__StdioReadline, this is impossible to > > know... > > I have no idea; Googling for it only showed up discussions of > readline.c. 
You might write the authors of the patch that introduced > it (the same Google query will find the info); if they don't respond, > I'm not sure that it's worth worrying about. Googling only returned comments or queries by either Lisandro or me, but it was loewis (Martin v. L?wis ?) that comited this change in May 2003 with the comment Patch #708495: Port more stuff to OpenVMS. Tha patch was introduced by Jean-Fran?ois Pi?ronne, who explained: myreadline.c Use of vms__StdioReadline > My personal guess is that it's probably a VMS internal function, which > would reduce the probability of it calling back to PyOS_StdioReadline > to zero. It can't be a Python specific thing, because it doesn't have > a 'Py' prefix. My guess too, especially as using PyOS_StdioReadline (which is not in the python API) would be asking for trouble...We will thus consider that there is no risk of infinite recursion, except if someone say otherwise... Thanks a lot for these fast and usefull informations! Greg. From jcarlson at uci.edu Sun Aug 28 20:31:46 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 28 Aug 2005 11:31:46 -0700 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <000301c5abc8$d7fe3fe0$d206a044@oemcomputer> References: <000301c5abc8$d7fe3fe0$d206a044@oemcomputer> Message-ID: <20050828105627.7E33.JCARLSON@uci.edu> "Raymond Hettinger" wrote: > [Guido] > > Another observation: despite the derogatory remarks about regular > > expressions, they have one thing going for them: they provide a higher > > level of abstraction for string parsing, which this is all about. > > (They are higher level in that you don't have to be counting > > characters, which is about the lowest-level activity in programming -- > > only counting bytes is lower!) > > > > Maybe if we had a *good* way of specifying string parsing we wouldn't > > be needing to call find() or index() so much at all! 
(A good example > > is the code that Raymond lifted from ConfigParser: a semicolon > > preceded by whitespace starts a comment, other semicolons don't. > > Surely there ought to be a better way to write that.) > > A higher level abstraction is surely the way to go. Perhaps... > Of course, if this idea survives the day, then I'll meet my own > requirements and write a context diff on the standard library. That > ought to give a good indication of how well the new methods meet > existing needs and whether the resulting code is better, cleaner, > clearer, faster, etc. My first thought when reading the proposal was "that's just str.split/str.rsplit with maxsplit=1, returning the thing you split on, with 3 items always returned, what's the big deal?" Two seconds later it hit me: that is the big deal. Right now it is a bit of a pain to get string.split to return a consistent number of return values; I myself have used: l,r = (x.split(y, 1)+[''])[:2] ...around 10 times - 10 times more than I really should have. Taking a wander through my code, this improves the look and flow in almost every case (the exceptions being where I should have rewritten to use 'substr in str' after I started using Python 2.3). Taking a walk through examples of str.rfind at koders.com leads me to believe that .partition/.rpartition would generally improve the flow, correctness, and beauty of code which had previously been using .find/.rfind. I hope the idea survives the day. - Josiah From mal at egenix.com Sun Aug 28 20:33:58 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 28 Aug 2005 20:33:58 +0200 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <000701c5abdd$4f0b7440$d206a044@oemcomputer> References: <000701c5abdd$4f0b7440$d206a044@oemcomputer> Message-ID: <43120396.30406@egenix.com> Raymond Hettinger wrote: > [Marc-Andre Lemburg] > >>I may be missing something, but why invent yet another parsing >>method - we already have the re module.
I'd suggest to >>use it :-) >> >>If re is not fast enough or you want more control over the >>parsing process, you could also have a look at mxTextTools: >> >> http://www.egenix.com/files/python/mxTextTools.html > > > Both are excellent tools. Neither is as lightweight, as trivial to > learn, or as transparently obvious as the proposed s.partition(sep). > The idea is to find a viable replacement for s.find(). Your partition idea could be had with an additional argument to .split() (e.g. keepsep=1); no need to learn a new method. Also, as I understand Terry's request, .find() should be removed in favor of just leaving .index() (which is the identical method without the funny -1 return code logic). So your proposal really doesn't have all that much to do with Terry's request, but is a new and separate proposal (which does have some value in few cases, but not enough to warrant a new method). > Looking at sample code transformations shows that the high-power > mxTextTools and re approaches do not simplify code that currently uses > s.find(). In contrast, the proposed partition() method is a joy to use > and has no surprises. The following code transformation shows > unbeatable simplicity and clarity. > > > --- From CGIHTTPServer.py --------------- > > def run_cgi(self): > """Execute a CGI script.""" > dir, rest = self.cgi_info > i = rest.rfind('?') > if i >= 0: > rest, query = rest[:i], rest[i+1:] > else: > query = '' > i = rest.find('/') > if i >= 0: > script, rest = rest[:i], rest[i:] > else: > script, rest = rest, '' > . . . > > > def run_cgi(self): > """Execute a CGI script.""" > dir, rest = self.cgi_info > rest, _, query = rest.rpartition('?') > script, _, rest = rest.partition('/') Wouldn't this do the same ?! ... rest, query = rest.rsplit('?', maxsplit=1) script, rest = rest.split('/', maxsplit=1) > . . . > > > The new proposal does not help every use case though. 
In > ConfigParser.py, the problem description reads, "a semi-colon is a > comment delimiter only if it follows a spacing character". This cries > out for a regular expression. In StringIO.py, since the task at hand IS > calculating an index, an indexless higher level construct doesn't help. > However, many of the other s.find() use cases in the library simplify as > readily and directly as the above cgi server example. > > > > Raymond > > > ------------------------------------------------------- > > P.S. FWIW, if you want to experiment with it, here a concrete > implementation of partition() expressed as a function: > > def partition(s, t): > """ Returns a three element tuple, (head, sep, tail) where: > > head + sep + tail == s > t not in head > sep == '' or sep is t > bool(sep) == (t in s) # sep indicates if the string was > found > > >>> s = 'http://www.python.org' > >>> partition(s, '://') > ('http', '://', 'www.python.org') > >>> partition(s, '?') > ('http://www.python.org', '', '') > >>> partition(s, 'http://') > ('', 'http://', 'www.python.org') > >>> partition(s, 'org') > ('http://www.python.', 'org', '') > > """ > if not isinstance(t, basestring) or not t: > raise ValueError('partititon argument must be a non-empty > string') > parts = s.split(t, 1) > if len(parts) == 1: > result = (s, '', '') > else: > result = (parts[0], t, parts[1]) > assert len(result) == 3 > assert ''.join(result) == s > assert result[1] == '' or result[1] is t > assert t not in result[0] > return result > > > import doctest > print doctest.testmod() -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 28 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From rrr at ronadam.com Sun Aug 28 20:46:53 2005 From: rrr at ronadam.com (Ron Adam) Date: Sun, 28 Aug 2005 14:46:53 -0400 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <000701c5abdd$4f0b7440$d206a044@oemcomputer> References: <000701c5abdd$4f0b7440$d206a044@oemcomputer> Message-ID: <4312069D.9030109@ronadam.com> Raymond Hettinger wrote: > Looking at sample code transformations shows that the high-power > mxTextTools and re approaches do not simplify code that currently uses > s.find(). In contrast, the proposed partition() method is a joy to use > and has no surprises. The following code transformation shows > unbeatable simplicity and clarity. +1 This doesn't cause any backward compatible issues as well! > --- From CGIHTTPServer.py --------------- > > def run_cgi(self): > """Execute a CGI script.""" > dir, rest = self.cgi_info > i = rest.rfind('?') > if i >= 0: > rest, query = rest[:i], rest[i+1:] > else: > query = '' > i = rest.find('/') > if i >= 0: > script, rest = rest[:i], rest[i:] > else: > script, rest = rest, '' > . . . > > > def run_cgi(self): > """Execute a CGI script.""" > dir, rest = self.cgi_info > rest, _, query = rest.rpartition('?') > script, _, rest = rest.partition('/') > . . . +1 Much easier to read and understand! Cheers, Ron From raymond.hettinger at verizon.net Sun Aug 28 21:04:10 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Sun, 28 Aug 2005 15:04:10 -0400 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <43120396.30406@egenix.com> Message-ID: <001501c5ac03$4679e480$d206a044@oemcomputer> [M.-A. Lemburg] > Also, as I understand Terry's request, .find() should be removed > in favor of just leaving .index() (which is the identical method > without the funny -1 return code logic). > > So your proposal really doesn't have all that much to do > with Terry's request, but is a new and separate proposal > (which does have some value in few cases, but not enough > to warrant a new method). 
It is new and separate, but it is also related. The core of Terry's request is the assertion that str.find() is bug-prone and should not be used. The principal arguments against accepting his request (advanced by Tim) are that the str.index() alternative is slightly more awkward to code, more likely to result in try-suites that catch more than intended, and that the resulting code is slower. Those arguments fall to the wayside if str.partition() becomes available as a superior alternative. IOW, it makes Terry's request much more palatable. > > def run_cgi(self): > > """Execute a CGI script.""" > > dir, rest = self.cgi_info > > rest, _, query = rest.rpartition('?') > > script, _, rest = rest.partition('/') [MAL] > Wouldn't this do the same ?! ... > > rest, query = rest.rsplit('?', maxsplit=1) > script, rest = rest.split('/', maxsplit=1) No. The split() versions are buggy. They fail catastrophically when the original string does not contain '?' or does not contain '/': >>> rest = 'http://www.example.org/subdir' >>> rest, query = rest.rsplit('?', 1) Traceback (most recent call last): File "", line 1, in -toplevel- rest, query = rest.rsplit('?', 1) ValueError: need more than 1 value to unpack The whole point of str.partition() is to repackage str.split() in a way that is conducive to fulfilling many of the existing use cases for str.find() and str.index(). In going through the library examples, I've not found a single case where a direct use of str.split() would improve code that currently uses str.find(). Raymond From steve at holdenweb.com Sun Aug 28 23:03:26 2005 From: steve at holdenweb.com (Steve Holden) Date: Sun, 28 Aug 2005 17:03:26 -0400 Subject: [Python-Dev] empty string api for files In-Reply-To: References: <000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer> Message-ID: Terry Reedy wrote: >>I'm not convinced. Where would you ever care about reading a file in >>N-bytes chucks? > > > This was once a standard paradigm for IBM mainframe files. 
I vaguely > remember having to specify the block/record size when opening such files. > I have no idea of today's practice though. > Indeed. Something like: SYSIN DD *,BLKSIZE=80 IIRC (which I may well not do after thirty years or so). People used to solve generic programming problems in JCL just for the hell of it. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ From pinard at iro.umontreal.ca Mon Aug 29 01:05:25 2005 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Sun, 28 Aug 2005 19:05:25 -0400 Subject: [Python-Dev] empty string api for files In-Reply-To: References: Message-ID: <20050828230525.GA14562@alcyon.progiciels-bpi.ca> [Steve Holden] > Terry Reedy wrote: > > This was once a standard paradigm for IBM mainframe files. I > > vaguely remember having to specify the block/record size when > > opening such files. I have no idea of today's practice though. > Indeed. Something like: > SYSIN DD *,BLKSIZE=80 Oh! The "*" is pretty magical, and came from HASP (Houston Automatic Spooling Program, if I remember well), and not from IBM. It took a lot of years before IBM even acknowledged the existence of HASP (in dark times when salesmen and engineers ought to strictly obey company mandated attitudes). Nevertheless, almost every IBM customer was installing HASP under the scene, because without the "*", people ought to specify on their DD cards the preallocation of disk space, even for spool files, as a number of cylinders and sectors for the primary extent, and a number of cylinders and sectors for all secondary extents. I later learned that IBM gave in, including HASP facilities as standard. > People used to solve generic programming problems in JCL just for the > hell of it. The hell is the right word to describe it! :-) I wonder if JCL could emulate a Turing Machine, but it at least addressed the Halting Problem! One-who-happily-forgot-all-bout-this-ly yours... P.S. 
- How is this related to Python? Luckily! -- that is: *not*! :-) -- François Pinard http://pinard.progiciels-bpi.ca From raymond.hettinger at verizon.net Mon Aug 29 07:48:57 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon, 29 Aug 2005 01:48:57 -0400 Subject: [Python-Dev] Proof of the pudding: str.partition() Message-ID: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer> As promised, here is a full set of real-world comparative code transformations using str.partition(). The patch isn't intended to be applied; rather, it is here to test/demonstrate whether the new construct offers benefits under a variety of use cases. Overall, I found that partition() usefully encapsulated commonly occurring low-level programming patterns. In most cases, it completely eliminated the need for slicing and indices. In several cases, code was simplified dramatically; in some, the simplification was minor; and in a few cases, the complexity was about the same. No cases were made worse. Most patterns using str.find() directly translated into an equivalent using partition. The only awkwardness that arose was in cases where the original code had a test like, "if s.find(pat) > 0". That case translated to a double-term test, "if found and head". Also, some pieces of code needed a tail that included the separator. That need was met by inserting a line like "tail = sep + tail". And that solution led to a minor naming discomfort for the middle term of the result tuple: it was being used as both a Boolean found flag and as a string containing the separator (hence the conflict in the choice of names between "found" and "sep"). In most cases, there was some increase in efficiency resulting from fewer total steps and tests, and from eliminating double searches. However, in a few cases, the new code was less efficient because the fragment only needed either the head or tail but not both as provided by partition(). In every case, the code was clearer after the transformation.
Also, none of the transformations required str.partition() to be used in a tricky way. In contrast, I found many contortions using str.find() where I had to diagram every possible path to understand what the code was trying to do or to assure myself that it worked. The new methods excelled at reducing cyclomatic complexity by eliminating conditional paths. The methods were especially helpful in the context of multiple finds (i.e. split at the leftmost colon if present within a group following the rightmost forward slash if present). In several cases, the replaced code exactly matched the pure python version of str.partition() -- this confirms that people are routinely writing multi-step low-level in-line code that duplicates what str.partition() does in a single step. The more complex transformations were handled by first figuring out exactly what the original code did under all possible cases and then writing the partition() version to match that spec. The lesson was that it is much easier to program from scratch using partition() than it is to code using find(). The new method more naturally expresses a series of parsing steps interleaved with other code. With further ado, here are the comparative code fragments: Index: CGIHTTPServer.py =================================================================== *** 106,121 **** def run_cgi(self): """Execute a CGI script.""" dir, rest = self.cgi_info ! i = rest.rfind('?') ! if i >= 0: ! rest, query = rest[:i], rest[i+1:] ! else: ! query = '' ! i = rest.find('/') ! if i >= 0: ! script, rest = rest[:i], rest[i:] ! else: ! script, rest = rest, '' scriptname = dir + '/' + script scriptfile = self.translate_path(scriptname) if not os.path.exists(scriptfile): --- 106,113 ---- def run_cgi(self): """Execute a CGI script.""" dir, rest = self.cgi_info ! rest, _, query = rest.rpartition('?') !
script, _, rest = rest.partition('/') scriptname = dir + '/' + script scriptfile = self.translate_path(scriptname) if not os.path.exists(scriptfile): Index: ConfigParser.py =================================================================== *** 599,612 **** if depth > MAX_INTERPOLATION_DEPTH: raise InterpolationDepthError(option, section, rest) while rest: ! p = rest.find("%") ! if p < 0: ! accum.append(rest) return ! if p > 0: ! accum.append(rest[:p]) ! rest = rest[p:] ! # p is no longer used c = rest[1:2] if c == "%": accum.append("%") --- 599,611 ---- if depth > MAX_INTERPOLATION_DEPTH: raise InterpolationDepthError(option, section, rest) while rest: ! head, sep, rest = rest.partition("%") ! if not sep: ! accum.append(head) return ! rest = sep + rest ! if sep and head: ! accum.append(head) c = rest[1:2] if c == "%": accum.append("%") Index: cgi.py =================================================================== *** 337,346 **** key = plist.pop(0).lower() pdict = {} for p in plist: ! i = p.find('=') ! if i >= 0: ! name = p[:i].strip().lower() ! value = p[i+1:].strip() if len(value) >= 2 and value[0] == value[-1] == '"': value = value[1:-1] value = value.replace('\\\\', '\\').replace('\\"', '"') --- 337,346 ---- key = plist.pop(0).lower() pdict = {} for p in plist: ! name, found, value = p.partition('=') ! if found: ! name = name.strip().lower() ! value = value.strip() if len(value) >= 2 and value[0] == value[-1] == '"': value = value[1:-1] value = value.replace('\\\\', '\\').replace('\\"', '"') Index: cookielib.py =================================================================== *** 610,618 **** def request_port(request): host = request.get_host() ! i = host.find(':') ! if i >= 0: ! port = host[i+1:] try: int(port) except ValueError: --- 610,617 ---- def request_port(request): host = request.get_host() ! _, sep, port = host.partition(':') ! if sep: try: int(port) except ValueError: *************** *** 670,681 **** '.local' """ ! i = h.find(".") !
if i >= 0: ! #a = h[:i] # this line is only here to show what a is ! b = h[i+1:] ! i = b.find(".") ! if is_HDN(h) and (i >= 0 or b == "local"): return "."+b return h --- 669,677 ---- '.local' """ ! a, found, b = h.partition('.') ! if found: ! if is_HDN(h) and ('.' in b or b == "local"): return "."+b return h *************** *** 1451,1463 **** else: path_specified = False path = request_path(request) ! i = path.rfind("/") ! if i != -1: if version == 0: # Netscape spec parts company from reality here ! path = path[:i] else: ! path = path[:i+1] if len(path) == 0: path = "/" # set default domain --- 1447,1459 ---- else: path_specified = False path = request_path(request) ! head, sep, _ = path.rpartition('/') ! if sep: if version == 0: # Netscape spec parts company from reality here ! path = head else: ! path = head + sep if len(path) == 0: path = "/" # set default domain Index: gopherlib.py =================================================================== *** 57,65 **** """Send a selector to a given host and port, return a file with the reply.""" import socket if not port: ! i = host.find(':') ! if i >= 0: ! host, port = host[:i], int(host[i+1:]) if not port: port = DEF_PORT elif type(port) == type(''): --- 57,65 ---- """Send a selector to a given host and port, return a file with the reply.""" import socket if not port: ! head, found, tail = host.partition(':') ! if found: ! host, port = head, int(tail) if not port: port = DEF_PORT elif type(port) == type(''): Index: httplib.py =================================================================== *** 490,498 **** while True: if chunk_left is None: line = self.fp.readline() ! i = line.find(';') ! if i >= 0: ! line = line[:i] # strip chunk-extensions chunk_left = int(line, 16) if chunk_left == 0: break --- 490,496 ---- while True: if chunk_left is None: line = self.fp.readline() ! 
line, _, _ = line.partition(';') # strip chunk-extensions chunk_left = int(line, 16) if chunk_left == 0: break *************** *** 586,599 **** def _set_hostport(self, host, port): if port is None: ! i = host.rfind(':') ! j = host.rfind(']') # ipv6 addresses have [...] ! if i > j: try: ! port = int(host[i+1:]) except ValueError: ! raise InvalidURL("nonnumeric port: '%s'" % host[i+1:]) ! host = host[:i] else: port = self.default_port if host and host[0] == '[' and host[-1] == ']': --- 584,595 ---- def _set_hostport(self, host, port): if port is None: ! host, _, port = host.rpartition(':') ! if ']' not in port: # ipv6 addresses have [...] try: ! port = int(port) except ValueError: ! raise InvalidURL("nonnumeric port: '%s'" % port) else: port = self.default_port if host and host[0] == '[' and host[-1] == ']': *************** *** 976,998 **** L = [self._buf] self._buf = '' while 1: ! i = L[-1].find("\n") ! if i >= 0: break s = self._read() if s == '': break L.append(s) ! if i == -1: # loop exited because there is no more data return "".join(L) else: ! all = "".join(L) ! # XXX could do enough bookkeeping not to do a 2nd search ! i = all.find("\n") + 1 ! line = all[:i] ! self._buf = all[i:] ! return line def readlines(self, sizehint=0): total = 0 --- 972,990 ---- L = [self._buf] self._buf = '' while 1: ! head, found, tail = L[-1].partition('\n') ! if found: break s = self._read() if s == '': break L.append(s) ! if not found: # loop exited because there is no more data return "".join(L) else: ! self._buf = tail ! return "".join(L[:-1]) + head + found def readlines(self, sizehint=0): total = 0 Index: ihooks.py =================================================================== *** 426,438 **** return None def find_head_package(self, parent, name): ! if '.' in name: ! i = name.find('.') ! head = name[:i] ! tail = name[i+1:] ! else: ! head = name !
tail = "" if parent: qname = "%s.%s" % (parent.__name__, head) else: --- 426,432 ---- return None def find_head_package(self, parent, name): ! head, _, tail = name.partition('.') if parent: qname = "%s.%s" % (parent.__name__, head) else: *************** *** 449,457 **** def load_tail(self, q, tail): m = q while tail: ! i = tail.find('.') ! if i < 0: i = len(tail) ! head, tail = tail[:i], tail[i+1:] mname = "%s.%s" % (m.__name__, head) m = self.import_it(head, mname, m) if not m: --- 443,449 ---- def load_tail(self, q, tail): m = q while tail: ! head, _, tail = tail.partition('.') mname = "%s.%s" % (m.__name__, head) m = self.import_it(head, mname, m) if not m: Index: locale.py =================================================================== *** 98,106 **** seps = 0 spaces = "" if s[-1] == ' ': ! sp = s.find(' ') ! spaces = s[sp:] ! s = s[:sp] while s and grouping: # if grouping is -1, we are done if grouping[0]==CHAR_MAX: --- 98,105 ---- seps = 0 spaces = "" if s[-1] == ' ': ! s, sep, tail = s.partition(' ') ! spaces = sep + tail while s and grouping: # if grouping is -1, we are done if grouping[0]==CHAR_MAX: *************** *** 148,156 **** # so, kill as much spaces as there where separators. # Leading zeroes as fillers are not yet dealt with, as it is # not clear how they should interact with grouping. ! sp = result.find(" ") ! if sp==-1:break ! result = result[:sp]+result[sp+1:] seps -= 1 return result --- 147,156 ---- # so, kill as much spaces as there where separators. # Leading zeroes as fillers are not yet dealt with, as it is # not clear how they should interact with grouping. ! head, found, tail = result.partition(' ') ! if not found: ! break ! result = head + tail seps -= 1 return result Index: mailcap.py =================================================================== *** 105,117 **** key, view, rest = fields[0], fields[1], fields[2:] fields = {'view': view} for field in rest: ! i = field.find('=') ! if i < 0: ! fkey = field ! fvalue = "" ! else: !
fkey = field[:i].strip() ! fvalue = field[i+1:].strip() if fkey in fields: # Ignore it pass --- 105,113 ---- key, view, rest = fields[0], fields[1], fields[2:] fields = {'view': view} for field in rest: ! fkey, found, fvalue = field.partition('=') ! fkey = fkey.strip() ! fvalue = fvalue.strip() if fkey in fields: # Ignore it pass Index: mhlib.py =================================================================== *** 356,364 **** if seq == 'all': return all # Test for X:Y before X-Y because 'seq:-n' matches both ! i = seq.find(':') ! if i >= 0: ! head, dir, tail = seq[:i], '', seq[i+1:] if tail[:1] in '-+': dir, tail = tail[:1], tail[1:] if not isnumeric(tail): --- 356,364 ---- if seq == 'all': return all # Test for X:Y before X-Y because 'seq:-n' matches both ! head, found, tail = seq.partition(':') ! if found: ! dir = '' if tail[:1] in '-+': dir, tail = tail[:1], tail[1:] if not isnumeric(tail): *************** *** 394,403 **** i = bisect(all, anchor-1) return all[i:i+count] # Test for X-Y next ! i = seq.find('-') ! if i >= 0: ! begin = self._parseindex(seq[:i], all) ! end = self._parseindex(seq[i+1:], all) i = bisect(all, begin-1) j = bisect(all, end) r = all[i:j] --- 394,403 ---- i = bisect(all, anchor-1) return all[i:i+count] # Test for X-Y next ! head, found, tail = seq.partition('-') ! if found: ! begin = self._parseindex(head, all) ! end = self._parseindex(tail, all) i = bisect(all, begin-1) j = bisect(all, end) r = all[i:j] Index: modulefinder.py =================================================================== *** 140,148 **** assert caller is parent self.msgout(4, "determine_parent ->", parent) return parent ! if '.' in pname: ! i = pname.rfind('.') ! pname = pname[:i] parent = self.modules[pname] assert parent.__name__ == pname self.msgout(4, "determine_parent ->", parent) --- 140,147 ---- assert caller is parent self.msgout(4, "determine_parent ->", parent) return parent ! pname, found, _ = pname.rpartition('.') !
if found: parent = self.modules[pname] assert parent.__name__ == pname self.msgout(4, "determine_parent ->", parent) *************** *** 152,164 **** def find_head_package(self, parent, name): self.msgin(4, "find_head_package", parent, name) ! if '.' in name: ! i = name.find('.') ! head = name[:i] ! tail = name[i+1:] ! else: ! head = name ! tail = "" if parent: qname = "%s.%s" % (parent.__name__, head) else: --- 151,157 ---- def find_head_package(self, parent, name): self.msgin(4, "find_head_package", parent, name) ! head, _, tail = name.partition('.') if parent: qname = "%s.%s" % (parent.__name__, head) else: Index: pdb.py =================================================================== *** 189,200 **** # split into ';;' separated commands # unless it's an alias command if args[0] != 'alias': ! marker = line.find(';;') ! if marker >= 0: ! # queue up everything after marker ! next = line[marker+2:].lstrip() self.cmdqueue.append(next) ! line = line[:marker].rstrip() return line # Command definitions, called by cmdloop() --- 189,200 ---- # split into ';;' separated commands # unless it's an alias command if args[0] != 'alias': ! line, found, next = line.partition(';;') ! if found: ! # queue up everything after command separator ! next = next.lstrip() self.cmdqueue.append(next) ! line = line.rstrip() return line # Command definitions, called by cmdloop() *************** *** 217,232 **** filename = None lineno = None cond = None ! comma = arg.find(',') ! if comma > 0: # parse stuff after comma: "condition" ! cond = arg[comma+1:].lstrip() ! arg = arg[:comma].rstrip() # parse stuff before comma: [filename:]lineno | function - colon = arg.rfind(':') funcname = None ! if colon >= 0: ! filename = arg[:colon].rstrip() f = self.lookupmodule(filename) if not f: print '*** ', repr(filename), --- 217,232 ---- filename = None lineno = None cond = None ! arg, found, cond = arg.partition(',') ! if found and arg: # parse stuff after comma: "condition" ! arg = arg.rstrip() ! 
cond = cond.lstrip() # parse stuff before comma: [filename:]lineno | function funcname = None ! filename, found, arg = arg.rpartition(':') ! if found: ! filename = filename.rstrip() f = self.lookupmodule(filename) if not f: print '*** ', repr(filename), *************** *** 234,240 **** return else: filename = f ! arg = arg[colon+1:].lstrip() try: lineno = int(arg) except ValueError, msg: --- 234,240 ---- return else: filename = f ! arg = arg.lstrip() try: lineno = int(arg) except ValueError, msg: *************** *** 437,445 **** return if ':' in arg: # Make sure it works for "clear C:\foo\bar.py:12" ! i = arg.rfind(':') ! filename = arg[:i] ! arg = arg[i+1:] try: lineno = int(arg) except: --- 437,443 ---- return if ':' in arg: # Make sure it works for "clear C:\foo\bar.py:12" ! filename, _, arg = arg.rpartition(':') try: lineno = int(arg) except: Index: rfc822.py =================================================================== *** 197,205 **** You may override this method in order to use Message parsing on tagged data in RFC 2822-like formats with special header formats. """ ! i = line.find(':') ! if i > 0: ! return line[:i].lower() return None def islast(self, line): --- 197,205 ---- You may override this method in order to use Message parsing on tagged data in RFC 2822-like formats with special header formats. """ ! head, found, tail = line.partition(':') ! if found and head: ! return head.lower() return None def islast(self, line): *************** *** 340,348 **** else: if raw: raw.append(', ') ! i = h.find(':') ! if i > 0: ! addr = h[i+1:] raw.append(addr) alladdrs = ''.join(raw) a = AddressList(alladdrs) --- 340,348 ---- else: if raw: raw.append(', ') ! head, found, tail = h.partition(':') ! if found and head: ! addr = tail raw.append(addr) alladdrs = ''.join(raw) a = AddressList(alladdrs) *************** *** 859,867 **** data = stuff + data[1:] if len(data) == 4: s = data[3] ! i = s.find('+') ! if i > 0: ! 
data[3:] = [s[:i], s[i+1:]] else: data.append('') # Dummy tz if len(data) < 5: --- 859,867 ---- data = stuff + data[1:] if len(data) == 4: s = data[3] ! head, found, tail = s.partition('+') ! if found and head: ! data[3:] = [head, tail] else: data.append('') # Dummy tz if len(data) < 5: Index: robotparser.py =================================================================== *** 104,112 **** entry = Entry() state = 0 # remove optional comment and strip line ! i = line.find('#') ! if i>=0: ! line = line[:i] line = line.strip() if not line: continue --- 104,110 ---- entry = Entry() state = 0 # remove optional comment and strip line ! line, _, _ = line.partition('#') line = line.strip() if not line: continue Index: smtpd.py =================================================================== *** 144,156 **** self.push('500 Error: bad syntax') return method = None ! i = line.find(' ') ! if i < 0: ! command = line.upper() arg = None else: ! command = line[:i].upper() ! arg = line[i+1:].strip() method = getattr(self, 'smtp_' + command, None) if not method: self.push('502 Error: command "%s" not implemented' % command) --- 144,155 ---- self.push('500 Error: bad syntax') return method = None ! command, found, arg = line.partition(' ') ! command = command.upper() ! if not found: arg = None else: ! arg = arg.strip() method = getattr(self, 'smtp_' + command, None) if not method: self.push('502 Error: command "%s" not implemented' % command) *************** *** 495,514 **** usage(1, 'Invalid arguments: %s' % COMMASPACE.join(args)) # split into host/port pairs ! i = localspec.find(':') ! if i < 0: usage(1, 'Bad local spec: %s' % localspec) ! options.localhost = localspec[:i] try: ! options.localport = int(localspec[i+1:]) except ValueError: usage(1, 'Bad local port: %s' % localspec) ! i = remotespec.find(':') ! if i < 0: usage(1, 'Bad remote spec: %s' % remotespec) ! options.remotehost = remotespec[:i] try: !
options.remoteport = int(remotespec[i+1:]) except ValueError: usage(1, 'Bad remote port: %s' % remotespec) return options --- 494,513 ---- usage(1, 'Invalid arguments: %s' % COMMASPACE.join(args)) # split into host/port pairs ! head, found, tail = localspec.partition(':') ! if not found: usage(1, 'Bad local spec: %s' % localspec) ! options.localhost = head try: ! options.localport = int(tail) except ValueError: usage(1, 'Bad local port: %s' % localspec) ! head, found, tail = remotespec.partition(':') ! if not found: usage(1, 'Bad remote spec: %s' % remotespec) ! options.remotehost = head try: ! options.remoteport = int(tail) except ValueError: usage(1, 'Bad remote port: %s' % remotespec) return options Index: smtplib.py =================================================================== *** 276,284 **** """ if not port and (host.find(':') == host.rfind(':')): ! i = host.rfind(':') ! if i >= 0: ! host, port = host[:i], host[i+1:] try: port = int(port) except ValueError: raise socket.error, "nonnumeric port" --- 276,283 ---- """ if not port and (host.find(':') == host.rfind(':')): ! host, found, port = host.rpartition(':') ! if found: try: port = int(port) except ValueError: raise socket.error, "nonnumeric port" Index: urllib2.py =================================================================== *** 289,301 **** def add_handler(self, handler): added = False for meth in dir(handler): ! i = meth.find("_") ! protocol = meth[:i] ! condition = meth[i+1:] ! if condition.startswith("error"): ! j = condition.find("_") + i + 1 ! kind = meth[j+1:] try: kind = int(kind) except ValueError: --- 289,297 ---- def add_handler(self, handler): added = False for meth in dir(handler): ! protocol, _, condition = meth.partition('_') if condition.startswith("error"): ! 
_, _, kind = condition.partition('_') try: kind = int(kind) except ValueError: Index: zipfile.py =================================================================== *** 117,125 **** self.orig_filename = filename # Original file name in archive # Terminate the file name at the first null byte. Null bytes in file # names are used as tricks by viruses in archives. ! null_byte = filename.find(chr(0)) ! if null_byte >= 0: ! filename = filename[0:null_byte] # This is used to ensure paths in generated ZIP files always use # forward slashes as the directory separator, as required by the # ZIP format specification. --- 117,123 ---- self.orig_filename = filename # Original file name in archive # Terminate the file name at the first null byte. Null bytes in file # names are used as tricks by viruses in archives. ! filename, _, _ = filename.partition(chr(0)) # This is used to ensure paths in generated ZIP files always use # forward slashes as the directory separator, as required by the # ZIP format specification. From jcarlson at uci.edu Mon Aug 29 08:29:58 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 28 Aug 2005 23:29:58 -0700 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer> References: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer> Message-ID: <20050828231650.7E4B.JCARLSON@uci.edu> "Raymond Hettinger" wrote: > As promised, here is a full set of real-world comparative code > transformations using str.partition(). The patch isn't intended to be > applied; rather, it is here to test/demonstrate whether the new > construct offers benefits under a variety of use cases. Having looked at many of Raymond's transformations earlier today (just emailing him a copy of my thoughts and changes minutes ago), I agree that this simplifies essentially every example I have seen translated, and translated myself. 
There are a handful of errors I found during my pass, most of which seem corrected in the version he has sent to python-dev (though not all). To those who are to reply in this thread, rather than nitpicking about the correctness of individual transformations (though perhaps you should email him directly about those), comment about how much better/worse they look. Vote to add str.partition to 2.5: +1 Vote to dump str.find sometime later if str.partition makes it: +1 - Josiah From ncoghlan at gmail.com Mon Aug 29 13:16:18 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 29 Aug 2005 21:16:18 +1000 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer> References: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer> Message-ID: <4312EE82.90207@gmail.com> Raymond Hettinger wrote: > Most patterns using str.find() directly translated into an equivalent > using partition. The only awkwardness that arose was in cases where the > original code had a test like, "if s.find(pat) > 0". That case > translated to a double-term test, "if found and head". That said, the latter would give me much greater confidence that the test for "found, but not right at the start" was deliberate. With the original version I would need to study the surrounding code to satisfy myself that it wasn't a simple typo that resulted in '>' being written where '>=' was intended. > With further ado, here are the comparative code fragments: There's another one below that you previously tried rewriting to use str.index that also benefits from str.partition. This rewrite makes it easier to avoid the bug that afflicts the current code, and would make that bug raise an exception if it wasn't fixed - "head[-1]" would raise IndexError if the head was empty. Cheers, Nick. 
--- From ConfigParser.py (current) --------------- optname, vi, optval = mo.group('option', 'vi', 'value') if vi in ('=', ':') and ';' in optval: # ';' is a comment delimiter only if it follows # a spacing character pos = optval.find(';') if pos != -1 and optval[pos-1].isspace(): optval = optval[:pos] optval = optval.strip() --- From ConfigParser.py (with str.partition) --------------- optname, vi, optval = mo.group('option', 'vi', 'value') if vi in ('=', ':'): # ';' is a comment delimiter only if it follows # a spacing character head, found, _ = optval.partition(';') if found and head and head[-1].isspace(): optval = head optval = optval.strip() -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From python at dynkin.com Mon Aug 29 15:07:55 2005 From: python at dynkin.com (George Yoshida) Date: Mon, 29 Aug 2005 22:07:55 +0900 Subject: [Python-Dev] [Python-checkins] python/dist/src/Doc/whatsnew whatsnew25.tex, 1.18, 1.19 In-Reply-To: <20050827184558.6A9981E401F@bag.python.org> References: <20050827184558.6A9981E401F@bag.python.org> Message-ID: <431308AB.40401@dynkin.com> akuchling at users.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Doc/whatsnew > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv29055 > > Modified Files: > whatsnew25.tex > Log Message: > Write section on PEP 342 > > Index: whatsnew25.tex > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Doc/whatsnew/whatsnew25.tex,v > retrieving revision 1.18 > retrieving revision 1.19 > diff -u -d -r1.18 -r1.19 > --- whatsnew25.tex 23 Aug 2005 00:56:06 -0000 1.18 > +++ whatsnew25.tex 27 Aug 2005 18:45:47 -0000 1.19 > [snip] > +\begin{verbatim} > +>>> it = counter(10) > +>>> print it.next() > +0 > +>>> print it.next() > +1 > +>>> print it.send(8) > +8 > +>>> print it.next() > +9 > +>>> print it.next() > 
+Traceback (most recent call last): > + File ``t.py'', line 15, in ? > + print it.next() > +StopIteration > > +Because \keyword{yield} will often be returning \constant{None}, > +you shouldn't just use its value in expressions unless you're sure > +that only the \method{send()} method will be used. This part creates a syntax error. \begin{verbatim} does not have its end tag. - george From mcherm at mcherm.com Mon Aug 29 21:53:08 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Mon, 29 Aug 2005 12:53:08 -0700 Subject: [Python-Dev] Remove str.find in 3.0? Message-ID: <20050829125308.uylb8wyc0yw4sosc@login.werra.lunarpages.com> Raymond writes: > That suggests that we need a variant of split() that has been customized > for typical find/index use cases. Perhaps introduce a new pair of > methods, partition() and rpartition() +1 My only suggestion is that when you're about to make a truly inspired suggestion like this one, that you use a new subject header. It will make it easier for the Python-Dev summary authors and for the people who look back in 20 years to ask "That str.partition() function is really swiggy! It's everywhere now, but I wonder what language had it first and who came up with it?" -- Michael Chermside [PS: To explain what "swiggy" means I'd probably have to borrow the time machine.] From tdelaney at avaya.com Tue Aug 30 01:31:34 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Tue, 30 Aug 2005 09:31:34 +1000 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) Message-ID: <2773CAC687FD5F4689F526998C7E4E5F0742BA@au3010avexu1.global.avaya.com> Michael Chermside wrote: > Raymond writes: >> That suggests that we need a variant of split() that has been >> customized for typical find/index use cases. Perhaps introduce a >> new pair of methods, partition() and rpartition() > > +1 > > My only suggestion is that when you're about to make a truly > inspired suggestion like this one, that you use a new subject > header. 
It will make it easier for the Python-Dev summary > authors and for the people who look back in 20 years to ask > "That str.partition() function is really swiggy! It's everywhere > now, but I wonder what language had it first and who came up with > it?" +1 This is very useful behaviour IMO. Have the precise return values of partition() been defined? Specifically, given: 'a'.split('b') we could get back: ('a', '', '') ('a', None, None) Similarly: 'ab'.split('b') could be either: ('a', 'b', '') ('a', 'b', None) IMO the most useful (and intuitive) behaviour is to return strings in all cases. My major issue is with the names - partition() doesn't sound right to me. split() of course sounds best, but it has additional stuff we don't necessarily want. However, I think we should aim to get the idea accepted first, then work out the best name. Tim Delaney From t-meyer at ihug.co.nz Tue Aug 30 02:05:03 2005 From: t-meyer at ihug.co.nz (Tony Meyer) Date: Tue, 30 Aug 2005 12:05:03 +1200 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: Message-ID: [Kay Schluehr] >> The discourse about Python3000 has shrunken from the expectation >> of the "next big thing" into a depressive rhetorics of feature >> elimination. The language doesn't seem to become deeper, smaller >> and more powerfull but just smaller. [Guido] > There is much focus on removing things, because we want to be able > to add new stuff but we don't want the language to grow. ISTM that a major reason that the Python 3.0 discussion seems focused more on removal than addition is that a lot of addition can be (and is being) done in Python 2.x. This is a huge benefit, of course, since people can start doing things the "new and improved" way in 2.x, even though it's not until 3.0 that the "old and evil" ;) way is actually removed. Removal of map/filter/reduce is an example - there isn't discussion about addition of new features, because list comps/gen expressions are already here... 
=Tony.Meyer From greg.ewing at canterbury.ac.nz Tue Aug 30 02:49:26 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 30 Aug 2005 12:49:26 +1200 Subject: [Python-Dev] Alternative name for str.partition() In-Reply-To: <4312EE82.90207@gmail.com> References: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer> <4312EE82.90207@gmail.com> Message-ID: <4313AD16.3070608@canterbury.ac.nz> A more descriptive name than 'partition' would be 'split_at'. -- Greg From raymond.hettinger at verizon.net Tue Aug 30 03:26:35 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon, 29 Aug 2005 21:26:35 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742BA@au3010avexu1.global.avaya.com> Message-ID: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> [Delaney, Timothy (Tim)] > +1 > > This is very useful behaviour IMO. Thanks. It seems to be getting +1s all around. > Have the precise return values of partition() been defined? . . . > IMO the most useful (and intuitive) behaviour is to return strings in > all cases. Yes, there is a precise spec and yes it always returns three strings. Motivation and spec: http://mail.python.org/pipermail/python-dev/2005-August/055764.html Pure Python implementation, sample invocations, and tests: http://mail.python.org/pipermail/python-dev/2005-August/055764.html > My major issue is with the names - partition() doesn't sound right to > me. FWIW, I am VERY happy with the name partition(). It has a long and delightful history in conjunction with the quicksort algorithm where it does something very similar to what we're doing here: partitioning data into three groups (left,center,right) with a small center element (called a pivot in the quicksort context and called a separator in our string parsing context). This name has enjoyed great descriptive success in communicating that the total data size is unchanged and that the parts can be recombined to the whole. 
IOW, it is exactly the right word. I won't part with it easily. http://www.google.com/search?q=quicksort+partition Raymond From tdelaney at avaya.com Tue Aug 30 03:46:37 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Tue, 30 Aug 2005 11:46:37 +1000 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) Message-ID: <2773CAC687FD5F4689F526998C7E4E5F0742BB@au3010avexu1.global.avaya.com> Raymond Hettinger wrote: > Yes, there is a precise spec and yes it always returns three strings. > > Motivation and spec: > http://mail.python.org/pipermail/python-dev/2005-August/055764.html Ah - thanks. Missed that in the mass of emails. >> My major issue is with the names - partition() doesn't sound right to >> me. > > FWIW, I am VERY happy with the name partition(). It has a long and > delightful history in conjunction with the quicksort algorithm where > it does something very similar to what we're doing here: I guessed that the motivation came from quicksort. My concern is that "partition" is not something that most users would associate with strings. I know I certainly wouldn't (at least, not immediately). The behaviour is obvious from the name, but I don't feel the name is obvious from the behaviour. If I were explaining the behaviour of partition() to someone, the words I would use are something like: partition() splits a string into 3 parts - the bit before the first occurrence of the separator, the separator, and the bit after the separator. If the separator isn't in the string at all then the entire string is returned as "the bit before" and the returned separator and bit after are empty strings. I'd probably also explain that if the separator is the very last thing in the string the "bit after" would be an empty string, but that is fairly intuitive in any case IMO. It's a pity split() is already taken - but then, you would want split() to do more in any case (specifically, split multiple times). 
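The explanation above can be checked with a short sketch. It assumes the method behaves like the str.partition() that later shipped in Python 2.5; the sample strings and variable names are made up for illustration and come from nobody's patch.

```python
# Separator present: (bit before, separator, bit after), split at the
# FIRST occurrence only.
present = "key=value=more".partition("=")

# Separator absent: the whole string is "the bit before"; the separator
# and "the bit after" are empty strings, not None.
absent = "a".partition("b")

# Separator is the very last thing in the string: "the bit after" is empty.
at_end = "ab".partition("b")

# The three parts always recombine to the original string.
original = "host:8080"
head, sep, tail = original.partition(":")
recombined = head + sep + tail

print(present)   # ('key', '=', 'value=more')
print(absent)    # ('a', '', '')
print(at_end)    # ('a', 'b', '')
print(recombined == original)   # True
```

Because all three items are always strings, a caller can unpack the result unconditionally and test `sep` afterwards only when it matters whether the separator was found.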
Tim Delaney From anthony at interlink.com.au Tue Aug 30 04:09:16 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Tue, 30 Aug 2005 12:09:16 +1000 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> Message-ID: <200508301209.19693.anthony@interlink.com.au> On Tuesday 30 August 2005 11:26, Raymond Hettinger wrote: > > My major issue is with the names - partition() doesn't sound right to > > me. > > FWIW, I am VERY happy with the name partition(). I'm +1 on the functionality, and +1 on the name partition(). The only other name that comes to mind is 'separate()', but a) I always spell it 'seperate' (and I don't need another lamdba ) b) It's too similar in name to 'split()' Anthony -- Anthony Baxter It's never too late to have a happy childhood. From stephen at xemacs.org Tue Aug 30 04:37:53 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 30 Aug 2005 11:37:53 +0900 Subject: [Python-Dev] partition() In-Reply-To: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> (Raymond Hettinger's message of "Mon, 29 Aug 2005 21:26:35 -0400") References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> Message-ID: <87hdd8jgji.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Raymond" == Raymond Hettinger writes: Raymond> FWIW, I am VERY happy with the name partition(). Raymond> ... [I]t is exactly the right word. I won't part with it Raymond> easily. +1 I note that Emacs has a split-string function which does not have those happy properties. In particular it never preserves the separator, and (by default) it discards empty strings. Raymond> It has a long and delightful history in conjunction with Raymond> the quicksort algorithm Now, that is a delightful mnemonic! 
-- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From ldlandis at gmail.com Tue Aug 30 05:29:16 2005 From: ldlandis at gmail.com (LD "Gus" Landis) Date: Mon, 29 Aug 2005 22:29:16 -0500 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <200508301209.19693.anthony@interlink.com.au> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <200508301209.19693.anthony@interlink.com.au> Message-ID: Hi, How about piece() ? Anthony can have his "e"s that way too! ;-) and it's the same number of characters as .split(). Cheers, --ldl On 8/29/05, Anthony Baxter wrote: > On Tuesday 30 August 2005 11:26, Raymond Hettinger wrote: > > > My major issue is with the names - partition() doesn't sound right to > > > me. > > > > FWIW, I am VERY happy with the name partition(). > > I'm +1 on the functionality, and +1 on the name partition(). The only other > name that comes to mind is 'separate()', but > a) I always spell it 'seperate' (and I don't need another lamdba ) > b) It's too similar in name to 'split()' > > Anthony > > -- > Anthony Baxter > It's never too late to have a happy childhood. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ldlandis%40gmail.com > -- LD Landis - N0YRQ - from the St Paul side of Minneapolis From ldlandis at gmail.com Tue Aug 30 05:33:25 2005 From: ldlandis at gmail.com (LD "Gus" Landis) Date: Mon, 29 Aug 2005 22:33:25 -0500 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <200508301209.19693.anthony@interlink.com.au> Message-ID: Hi, Re: multiples, etc... 
Check out (and Pythonify) the ANSI M[UMPS] $PIECE(). See: http://www.jacquardsystems.com/Examples/function/piece.htm Cheers, --ldl On 8/29/05, LD Gus Landis wrote: > Hi, > > How about piece() ? Anthony can have his "e"s that way too! ;-) > and it's the same number of characters as .split(). > > Cheers, > --ldl > -- LD Landis - N0YRQ - from the St Paul side of Minneapolis From fdrake at acm.org Tue Aug 30 05:53:36 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon, 29 Aug 2005 23:53:36 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <200508301209.19693.anthony@interlink.com.au> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <200508301209.19693.anthony@interlink.com.au> Message-ID: <200508292353.36549.fdrake@acm.org> On Tuesday 30 August 2005 11:26, Raymond Hettinger wrote: > FWIW, I am VERY happy with the name partition(). I like it too. +1 -Fred -- Fred L. Drake, Jr. From pje at telecommunity.com Tue Aug 30 06:00:19 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 30 Aug 2005 00:00:19 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <200508301209.19693.anthony@interlink.com.au> Message-ID: <5.1.1.6.0.20050829235726.029224c0@mail.telecommunity.com> At 10:33 PM 8/29/2005 -0500, LD \"Gus\" Landis wrote: >Hi, > > Re: multiples, etc... > > Check out (and Pythonify) the ANSI M[UMPS] $PIECE(). See: > http://www.jacquardsystems.com/Examples/function/piece.htm > >Cheers, > --ldl As far as I can see, either you misunderstand what partition() does, or I'm completely misunderstanding what $PIECE does. As far as I can tell, $PIECE and partition() have absolutely nothing in common except that they take strings as arguments. :) -1 on piece(), +1 for partition(). 
From tdelaney at avaya.com Tue Aug 30 06:07:59 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Tue, 30 Aug 2005 14:07:59 +1000 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) Message-ID: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> Phillip J. Eby wrote: > +1 for partition(). Looks like I'm getting seriously outvoted here ... Still, as I said I don't think the name is overly important until the idea has been accepted anyway. How long did we go with people in favour of "resource manager" until "context manager" came up? Of course, if I (or someone else) can't come up with an obviously better name, partition() will win by default. I don't think it's a *bad* name - just don't think it's a particularly *obvious* name. I think that one of the things I have against it is that most times I type it, I get a typo. If this function is accepted, I think it will (and should!) become one of the most used string functions around. As such, the name should be *very* easy to type. Tim Delaney From shane at hathawaymix.org Tue Aug 30 06:47:44 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Mon, 29 Aug 2005 22:47:44 -0600 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> Message-ID: <4313E4F0.6070906@hathawaymix.org> Delaney, Timothy (Tim) wrote: > I think that one of the things I have against it is that most times I > type it, I get a typo. If this function is accepted, I think it will > (and should!) become one of the most used string functions around. As > such, the name should be *very* easy to type. FWIW, the analogy with quicksort convinced me that partition is a good name, even though I'm a terirlbe tpyist. I'm a pretty good proofreader, though. 
;-) Shane From aahz at pythoncraft.com Tue Aug 30 06:50:05 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon, 29 Aug 2005 21:50:05 -0700 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> Message-ID: <20050830045005.GA7988@panix.com> On Tue, Aug 30, 2005, Delaney, Timothy (Tim) wrote: > > Looks like I'm getting seriously outvoted here ... Still, as I said I > don't think the name is overly important until the idea has been > accepted anyway. How long did we go with people in favour of "resource > manager" until "context manager" came up? In that case, though, it was more, "Well, I'm not that happy with 'context manager', but there doesn't seem to be anything better." This time, it's closer to, "That's a good name for the concept, yup." As you say, if someone comes up with a clearly better name, it likely will win; however, partition has been blessed by enough people that it's not worth putting much effort into finding anything better. > Of course, if I (or someone else) can't come up with an obviously > better name, partition() will win by default. I don't think it's a > *bad* name - just don't think it's a particularly *obvious* name. It's at least as obvious as translate(). -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything. From tjreedy at udel.edu Tue Aug 30 06:54:50 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 30 Aug 2005 00:54:50 -0400 Subject: [Python-Dev] Remove str.find in 3.0? References: <43120396.30406@egenix.com> <001501c5ac03$4679e480$d206a044@oemcomputer> Message-ID: "Raymond Hettinger" wrote in message news:001501c5ac03$4679e480$d206a044 at oemcomputer... > [M.-A. 
Lemburg] >> Also, as I understand Terry's request, .find() should be removed >> in favor of just leaving .index() (which is the identical method >> without the funny -1 return code logic). My proposal is to use the 3.0 opportunity to improve the language in this particular area. I considered and ranked five alternatives more or less as follows. 1. Keep .index and delete .find. 2. Keep .index and repair .find to return None instead of -1. 3.5 Delete .index and repair .find. 3.5 Keep .index and .find as is. 5. Delete .index and keep .find as is. > It is new and separate, but it is also related. I see it as a 6th option: keep.index, delete .find, and replace with .partition. I rank this at least second and maybe first. It is separable in that the replacement can be done now, while the deletion has to wait. > The core of Terry's request is the assertion that str.find() > is bug-prone and should not be used. That and the redundancy, both of which bothered me a bit since I first learned the string module functions. Terry J. Reedy From tjreedy at udel.edu Tue Aug 30 07:12:41 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 30 Aug 2005 01:12:41 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) References: <2773CAC687FD5F4689F526998C7E4E5F0742BA@au3010avexu1.global.avaya.com> <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> Message-ID: "Raymond Hettinger" wrote in > Yes, there is a precise spec and yes it always returns three strings. While the find/index discussion was about "what is the best way to indicate 'cannot answer'", part of the conclusion is that any way can be awkward. So I am generally in favor of defining a function, when possible, so that it can always deliver an answer (giving inputs of the appropriate types) and so that the 'best way' question is moot. Nicely done. I think the name 'partition' is fine too. It does not preclude putting a quicksort-type partition function in a module of list functions. 
The only alternative I can think of is 'tripart', but I do *not* prefer that. Terry J. Reedy From mozbugbox at yahoo.com.au Tue Aug 30 07:43:43 2005 From: mozbugbox at yahoo.com.au (JustFillBug) Date: Tue, 30 Aug 2005 05:43:43 +0000 (UTC) Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <200508301209.19693.anthony@interlink.com.au> Message-ID: On 2005-08-30, Anthony Baxter wrote: > On Tuesday 30 August 2005 11:26, Raymond Hettinger wrote: >> > My major issue is with the names - partition() doesn't sound right to >> > me. >> >> FWIW, I am VERY happy with the name partition(). > > I'm +1 on the functionality, and +1 on the name partition(). The only other > name that comes to mind is 'separate()', but > a) I always spell it 'seperate' (and I don't need another lamdba ) > b) It's too similar in name to 'split()' > trisplit() From tdelaney at avaya.com Tue Aug 30 07:50:21 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Tue, 30 Aug 2005 15:50:21 +1000 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) Message-ID: <2773CAC687FD5F4689F526998C7E4E5F05CCAA@au3010avexu1.global.avaya.com> Raymond Hettinger wrote: > Heh! Maybe AttributeError and NameError should be renamed to > TypoError ;-) Afterall, the only time I get these exceptions is > when the fingers press different buttons than the brain requested. You misspelled TyopError ;) Tim Delaney From mwh at python.net Tue Aug 30 09:54:46 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 30 Aug 2005 08:54:46 +0100 Subject: [Python-Dev] partition() In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> (Timothy Delaney's message of "Tue, 30 Aug 2005 14:07:59 +1000") References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> Message-ID: <2my86jq2pl.fsf@starship.python.net> "Delaney, Timothy (Tim)" writes: > Phillip J. Eby wrote: > >> +1 for partition(). 
> > Looks like I'm getting seriously outvoted here ... Still, as I said I > don't think the name is overly important until the idea has been > accepted anyway. How long did we go with people in favour of "resource > manager" until "context manager" came up? Certainly no longer than until I got up the morning after the discussion started :) partition() works for me. It's not perfect, but it'll do. The idea works for me rather more; it even simplifies the if s.startswith(prefix): t = s[len(prefix):] ... idiom I occasionally wince at. Cheers, mwh -- Gullible editorial staff continues to post links to any and all articles that vaguely criticize Linux in any way. -- Reason #4 for quitting slashdot today, from http://www.cs.washington.edu/homes/klee/misc/slashdot.html From fredrik at pythonware.com Tue Aug 30 10:01:03 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 30 Aug 2005 10:01:03 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer><200508301209.19693.anthony@interlink.com.au> <5.1.1.6.0.20050829235726.029224c0@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: >> Check out (and Pythonify) the ANSI M[UMPS] $PIECE(). See: >> http://www.jacquardsystems.com/Examples/function/piece.htm > > As far as I can see, either you misunderstand what partition() does, or > I'm > completely misunderstanding what $PIECE does. As far as I can tell, > $PIECE > and partition() have absolutely nothing in common except that they take > strings as arguments. :) both split on a given token. partition splits once, and returns all three parts, while piece returns the part you ask for (the 3-argument form is similar to x.split(s)[i]) From pierre.barbier at cirad.fr Tue Aug 30 10:11:23 2005 From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille) Date: Tue, 30 Aug 2005 10:11:23 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) 
In-Reply-To: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> Message-ID: <431414AB.4010005@cirad.fr> Well, I want to come back to a point that wasn't discussed. I only found one positive comment here: http://mail.python.org/pipermail/python-dev/2005-August/055775.html It's about this: Raymond Hettinger wrote: > * The function always succeeds unless the separator argument is not a > string type or is an empty string. So, a typical call doesn't have to > be wrapped in a try-suite for normal usage. Well, I wonder if it's so good! Almost all the use cases I find would require something like: head, sep, tail = s.partition(t) if sep: do something else: do something else Like, if you want to extract the drive letter from a Windows path: drive, sep, tail = path.partition(":") if not sep: drive = get_current_drive() # Because it's a local path Or, if I want to iterate over all the path parts in a UNIX path: sep = '/' while sep: head, sep, path = path.partition(sep) IMO, that reads strangely ... partitioning until sep is None :S Then, testing with "if" in Python is always a lot slower than having an exception raised from a C extension inside a try...except block. So both constructs would read like a lot of existing Python code: try: head,sep,tail = s.partition(t) do something except SeparatorException: do something else and: sep='/' try: while 1: head, drop, path = path.partition(sep) except SeparatorException: The end To me, try..except blocks that test end or error conditions are just part of Python's design. So I don't understand why you don't want them! For the separator, keeping it in the return values may be very useful, mainly because I would really like to use this function with a regexp in place of the string (like a simplified version of the Qt method QStringList::split) and, in that case, the separator would be the actual matched separator string. 
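The regexp variant mentioned in the last paragraph could look roughly like the sketch below. The name re_partition and its signature are hypothetical (they are not part of the proposal or of the re module); the only real calls are re.search() and the match object's start(), end() and group().

```python
import re

def re_partition(pattern, s):
    """Hypothetical helper: split s at the first match of pattern.

    Returns (head, matched_separator, tail), mirroring the three-string
    result proposed for str.partition().  With no match it returns
    (s, '', '').  Caveat: a pattern that can match the empty string
    makes "not found" and "found an empty separator" indistinguishable.
    """
    m = re.search(pattern, s)
    if m is None:
        return s, "", ""
    return s[:m.start()], m.group(), s[m.end():]

# The middle item is the text that actually matched, which a plain
# string separator could not report.
found = re_partition(r"\s*[,;]\s*", "alpha ; beta, gamma")
missing = re_partition(r"[,;]", "no separators here")

print(found)     # ('alpha', ' ; ', 'beta, gamma')
print(missing)   # ('no separators here', '', '')
```

This keeps the always-three-strings convention rather than raising an exception; an exception-raising variant would differ only in the no-match branch.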
Pierre -- Pierre Barbier de Reuille INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP Botanique et Bio-informatique de l'Architecture des Plantes TA40/PSII, Boulevard de la Lironde 34398 MONTPELLIER CEDEX 5, France tel : (33) 4 67 61 65 77 fax : (33) 4 67 61 56 68 From oren.tirosh at gmail.com Tue Aug 30 10:17:10 2005 From: oren.tirosh at gmail.com (Oren Tirosh) Date: Tue, 30 Aug 2005 11:17:10 +0300 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <200508301209.19693.anthony@interlink.com.au> Message-ID: <7168d65a0508300117411a04ad@mail.gmail.com> On 30/08/05, JustFillBug wrote: > On 2005-08-30, Anthony Baxter wrote: > > On Tuesday 30 August 2005 11:26, Raymond Hettinger wrote: > >> > My major issue is with the names - partition() doesn't sound right to > >> > me. > >> > >> FWIW, I am VERY happy with the name partition(). > > > > I'm +1 on the functionality, and +1 on the name partition(). The only other > > name that comes to mind is 'separate()', but > > a) I always spell it 'seperate' (and I don't need another lamdba ) > > b) It's too similar in name to 'split()' > > > > trisplit() split3() ? I'm +1 on the name "partition" but I think this is shorter, communicates the similarity to split and the fact that it always returns exactly three parts. Oren From jcarlson at uci.edu Tue Aug 30 10:42:22 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 30 Aug 2005 01:42:22 -0700 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <431414AB.4010005@cirad.fr> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> Message-ID: <20050830011440.7E5E.JCARLSON@uci.edu> Pierre Barbier de Reuille wrote: > Well, I want to come back on a point that wasn't discussed. 
I only found > one positive comment here : > http://mail.python.org/pipermail/python-dev/2005-August/055775.html You apparently haven't been reading python-dev for around 36 hours, because there have been over a dozen positive comments in regards to str.partition(). > Raymond Hettinger wrote: > > * The function always succeeds unless the separator argument is not a > > string type or is an empty string. So, a typical call doesn't have to > > be wrapped in a try-suite for normal usage. > > Well, I wonder if it's so good ! Almost all the use case I find would > require something like: > > head, sep, tail = s.partition(t) > if sep: > do something > else: > do something else Why don't you pause for a second and read Raymond's post here: http://mail.python.org/pipermail/python-dev/2005-August/055781.html In that email there is a listing of standard library translations from str.find to str.partition, and in every case, it is improved. If you believe that str.index would be better used, take a moment and do a few translations of the sections provided and compare them with the str.partition examples. > Like, if you want to extract the drive letter from a windows path : > > drive, sep, tail = path.partition(":") > if not sep: > drive = get_current_drive() # Because it's a local path > > Or, if I want to iterate over all the path parts in a UNIX path: > > sep = '/' > while sep: > head, sep, path = path.partition(sep) > > IMO, that read strange ... partitionning until sep is None :S > Then, testing with "if" in Python is always a lot slower than having an > exception launched from C extension inside a try...except block. In the vast majority of cases, all three portions of the returned partition result are used. The remaining few are generally split between one or two instances. In the microbenchmarks I've conducted, manually generating the slicings are measureably slower than when Python does it automatically. 
Also, exceptions are actually quite slow in relation to comparisons, specifically in the case of find vs. index (using 2.4)...

>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...         if x.find('i')>=0:
...             pass
...     print time.time()-t
...
0.953000068665
>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...         try:
...             x.index('i')
...         except ValueError:
...             pass
...     print time.time()-t
...
6.53100013733

I urge you to take some time to read Raymond's translations. - Josiah From pierre.barbier at cirad.fr Tue Aug 30 11:02:11 2005 From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille) Date: Tue, 30 Aug 2005 11:02:11 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <20050830011440.7E5E.JCARLSON@uci.edu> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> Message-ID: <43142093.4080104@cirad.fr> Josiah Carlson wrote: > Pierre Barbier de Reuille wrote: > >>Well, I want to come back on a point that wasn't discussed. I only found >>one positive comment here : >>http://mail.python.org/pipermail/python-dev/2005-August/055775.html > > > You apparently haven't been reading python-dev for around 36 hours, > because there have been over a dozen positive comments in regards to > str.partition(). Well, I wasn't criticizing the overall idea of str.partition, which I found very useful ! I'm just discussing one particular idea, which is to avoid the use of exceptions. > >>Raymond Hettinger wrote: >> >>>* The function always succeeds unless the separator argument is not a >>>string type or is an empty string. So, a typical call doesn't have to >>>be wrapped in a try-suite for normal usage. >> >>Well, I wonder if it's so good !
Almost all the use case I find would >>require something like: >> >>head, sep, tail = s.partition(t) >>if sep: >> do something >>else: >> do something else > > > Why don't you pause for a second and read Raymond's post here: > http://mail.python.org/pipermail/python-dev/2005-August/055781.html > > In that email there is a listing of standard library translations from > str.find to str.partition, and in every case, it is improved. If you > believe that str.index would be better used, take a moment and do a few > translations of the sections provided and compare them with the > str.partition examples. Well, what it does is exactly what I tought, you can express most of the use-cases of partition with: head, sep, tail = s.partition(sep) if not sep: #do something when it does not work else: #do something when it works And I propose to replace it by : try: head, sep, tail = s.partition(sep) # do something when it works except SeparatorError: # do something when it does not work What I'm talking about is consistency. In most cases in Python, or at least AFAIU, error testing is avoided and exception launching is preferred mainly for efficiency reasons. So my question remains: why prefer for that specific method returning an "error" value (i.e. an empty separator) against an exception ? Pierre -- Pierre Barbier de Reuille INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP Botanique et Bio-informatique de l'Architecture des Plantes TA40/PSII, Boulevard de la Lironde 34398 MONTPELLIER CEDEX 5, France tel : (33) 4 67 61 65 77 fax : (33) 4 67 61 56 68 From ncoghlan at gmail.com Tue Aug 30 14:27:27 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 30 Aug 2005 22:27:27 +1000 Subject: [Python-Dev] partition() In-Reply-To: <2my86jq2pl.fsf@starship.python.net> References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> <2my86jq2pl.fsf@starship.python.net> Message-ID: <431450AF.4020902@gmail.com> Michael Hudson wrote: > partition() works for me. 
It's not perfect, but it'll do. The idea > works for me rather more; it even simplifies the > > if s.startswith(prefix): > t = s[len(prefix):] > ... How would you do it? Something like: head, found, tail = s.partition(prefix) if found and not head: ... I guess I agree that's an improvement - only a slight one, though. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Tue Aug 30 14:42:20 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 30 Aug 2005 22:42:20 +1000 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <43142093.4080104@cirad.fr> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> Message-ID: <4314542C.7080000@gmail.com> Pierre Barbier de Reuille wrote: > What I'm talking about is consistency. In most cases in Python, or at > least AFAIU, error testing is avoided and exception launching is > preferred mainly for efficiency reasons. So my question remains: why > prefer for that specific method returning an "error" value (i.e. an > empty separator) against an exception ? Because, in many cases, there is more to it than just the separator not being found. Given a non-empty some_str and some_sep: head, sep, tail = some_str.partition(some_sep) There are actually five possible results: head and not sep and not tail (the separator was not found) head and sep and not tail (the separator is at the end) head and sep and tail (the separator is somewhere in the middle) not head and sep and tail (the separator is at the start) not head and sep and not tail (the separator is the whole string) Cheers, Nick. 
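The five outcomes Nick lists can be checked directly. A minimal sketch, assuming a non-empty string and separator and the partition() behaviour described in this thread (the original string comes back as the head, with an empty separator and tail, when nothing is found); classify is a hypothetical helper name used only for illustration.

```python
def classify(some_str, some_sep):
    """Name which of the five outcomes a partition() call produced."""
    head, sep, tail = some_str.partition(some_sep)
    if not sep:
        return "not found"      # head and not sep and not tail
    if head and tail:
        return "middle"         # head and sep and tail
    if head:
        return "end"            # head and sep and not tail
    if tail:
        return "start"          # not head and sep and tail
    return "whole string"       # not head and sep and not tail


for text in ("abc", "ab:", "a:b", ":ab", ":"):
    print(repr(text), "->", classify(text, ":"))
```

A single exception could only distinguish the first case from the other four, which is the point being made above.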
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Tue Aug 30 14:49:48 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 30 Aug 2005 22:49:48 +1000 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> Message-ID: <431455EC.6050402@gmail.com> Delaney, Timothy (Tim) wrote: > Of course, if I (or someone else) can't come up with an obviously better > name, partition() will win by default. I don't think it's a *bad* name - > just don't think it's a particularly *obvious* name. What about simply "str.parts" and "str.rparts"? That is, rather than splitting the string on a separator, we are breaking it into parts - the part before the separator, the separator itself, and the part after the separator. Same concept as 'partition', just a shorter method name. Another option would be simply "str.part()" and "str.rpart()". Then you could think of it as an abbreviation of either 'partition' or 'parts' depending on your inclination. > I think that one of the things I have against it is that most times I > type it, I get a typo. If this function is accepted, I think it will > (and should!) become one of the most used string functions around. As > such, the name should be *very* easy to type. I've been typing 'partition' a lot lately at work, and Tim's right - typing this correctly is harder than you might think. It is very easy to only type the 'ti' in the middle once, so that you end up with 'partion'. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From ncoghlan at gmail.com Tue Aug 30 15:20:03 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 30 Aug 2005 23:20:03 +1000 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <431455EC.6050402@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> <431455EC.6050402@gmail.com> Message-ID: <43145D03.6090801@gmail.com> Nick Coghlan wrote: > Another option would be simply "str.part()" and "str.rpart()". Then you could > think of it as an abbreviation of either 'partition' or 'parts' depending on > your inclination. I momentarily forgot that "part" is also a verb in its own right, with the right meaning, too (think "parting your hair" and "parting the Red Sea"). So call it +1 for str.part and str.rpart, and +0 for str.partition and str.rpartition. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From solipsis at pitrou.net Tue Aug 30 15:23:36 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 30 Aug 2005 15:23:36 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <43145D03.6090801@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> <431455EC.6050402@gmail.com> <43145D03.6090801@gmail.com> Message-ID: <1125408216.17470.6.camel@p-dvsi-418-1.rd.francetelecom.fr> (unlurking) On Tuesday 30 August 2005 at 23:20 +1000, Nick Coghlan wrote: > I momentarily forgot that "part" is also a verb in its own right, with the > right meaning, too (think "parting your hair" and "parting the Red Sea"). "parts" sounds more obvious than the verb "part" which is little known to non-native English speakers (at least to me anyway). Just my 2 cents.
Regards Antoine. From jason.orendorff at gmail.com Tue Aug 30 15:51:13 2005 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Tue, 30 Aug 2005 09:51:13 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <1125408216.17470.6.camel@p-dvsi-418-1.rd.francetelecom.fr> References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> <431455EC.6050402@gmail.com> <43145D03.6090801@gmail.com> <1125408216.17470.6.camel@p-dvsi-418-1.rd.francetelecom.fr> Message-ID: Concerning names for partition(), I immediately thought of break(). Unfortunately it's taken. So, how about snap()? head, sep, tail = line.snap(':') -j From eric.nieuwland at xs4all.nl Tue Aug 30 16:28:05 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Tue, 30 Aug 2005 16:28:05 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <4314542C.7080000@gmail.com> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> Message-ID: <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> I have some use cases with: cut_at = some_str.find(sep) head, tail = some_str[:cut_at], some_str[cut_at:] and: cut_at = some_str.find(sep) head, tail = some_str[:cut_at], some_str[cut_at+offset:] # offset != len(sep) So if partition() [or whatever it'll be called] could have an optional second argument that defines the width of the 'cut' made, I would be helped enormously. The default for this second argument would be len(sep), to preserve the current proposal. --eric From python at discworld.dyndns.org Tue Aug 30 16:36:07 2005 From: python at discworld.dyndns.org (Charles Cazabon) Date: Tue, 30 Aug 2005 08:36:07 -0600 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) 
In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> <431455EC.6050402@gmail.com> <43145D03.6090801@gmail.com> <1125408216.17470.6.camel@p-dvsi-418-1.rd.francetelecom.fr> Message-ID: <20050830143607.GB23985@discworld.dyndns.org> Jason Orendorff wrote: > Concerning names for partition(), I immediately thought of break(). > Unfortunately it's taken. > > So, how about snap()? I like .part()/.rpart() (or failing that, .parts()/.rparts()). But if you really want something short that's similar in meaning, there's also .cut(). Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://pyropus.ca/software/ ----------------------------------------------------------------------- From pierre.barbier at cirad.fr Tue Aug 30 17:01:36 2005 From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille) Date: Tue, 30 Aug 2005 17:01:36 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> Message-ID: <431474D0.70300@cirad.fr> Eric Nieuwland wrote: > I have some use cases with: > cut_at = some_str.find(sep) > head, tail = some_str[:cut_at], some_str[cut_at:] > and: > cut_at = some_str.find(sep) > head, tail = some_str[:cut_at], some_str[cut_at+offset:] # offset != > len(sep) > > So if partition() [or whatever it'll be called] could have an optional > second argument that defines the width of the 'cut' made, I would be > helped enormously. The default for this second argument would be > len(sep), to preserve the current proposal.
Well, IMO, your example would be much better written as: import re rsep = re.compile(sep + '.'*offset) lst = re.split(rsep, some_str, 1) head = lst[0] tail = lst[1] Or you want to have some "partition" method which accepts regular expressions: head, sep, tail = some_str.partition(re.compile(sep+'.'*offset)) > > --eric -- Pierre Barbier de Reuille INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP Botanique et Bio-informatique de l'Architecture des Plantes TA40/PSII, Boulevard de la Lironde 34398 MONTPELLIER CEDEX 5, France tel : (33) 4 67 61 65 77 fax : (33) 4 67 61 56 68 From skip at pobox.com Tue Aug 30 17:01:12 2005 From: skip at pobox.com (skip@pobox.com) Date: Tue, 30 Aug 2005 10:01:12 -0500 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <431455EC.6050402@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> <431455EC.6050402@gmail.com> Message-ID: <17172.29880.406663.490117@montanaro.dyndns.org> Nick> What about simply "str.parts" and "str.rparts"? -1 because "parts" is not a verb. When I see an attribute that is a noun I generally expect it to be a data attribute. Skip From raymond.hettinger at verizon.net Tue Aug 30 17:11:53 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 30 Aug 2005 11:11:53 -0400 Subject: [Python-Dev] partition() In-Reply-To: <20050830143607.GB23985@discworld.dyndns.org> Message-ID: <001d01c5ad75$2841b1a0$8832c797@oemcomputer> Hey guys, don't get lost in random naming suggestions (cut, snap, part, parts, yada yada yada). Each of those is much less descriptive and provides less differentiation from other string methods. Saving a few characters is not worth introducing ambiguity.
Also, the longer name provides a useful visual balance between the three assigned variables and the separator argument. As an extreme example, contrast the following: head, found, tail = s.p(separator) head, found, tail = s.partition(separator) The verb gets lost if it doesn't have visual weight. Also, for those suggesting alternate semantics (raising exceptions when the separator is not found), I challenge you to prove their worth by doing all the code transformations that I did. It is a remarkably informative exercise that quickly reveals that this alternative is dead-on-arrival. For the poster suggesting an optional length argument, I suggest writing out the revised method invariants. I think you'll find that it snarls them into incomprehensibility and makes the tool much more difficult to learn. Also, I recommend scanning my sample library code transformations to see if any of them would benefit from the length argument. I think you'll find that it comes up so infrequently and with such differing needs that it would be a mistake to bake this into the proposal. Raymond From pje at telecommunity.com Tue Aug 30 17:22:25 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 30 Aug 2005 11:22:25 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <200508301209.19693.anthony@interlink.com.au> <5.1.1.6.0.20050829235726.029224c0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050830112036.01b21aa8@mail.telecommunity.com> At 10:01 AM 8/30/2005 +0200, Fredrik Lundh wrote: >Phillip J. Eby wrote: > > >> Check out (and Pythonify) the ANSI M[UMPS] $PIECE(). See: > >> http://www.jacquardsystems.com/Examples/function/piece.htm > > > > As far as I can see, either you misunderstand what partition() does, or > > I'm > > completely misunderstanding what $PIECE does. 
As far as I can tell, > > $PIECE > > and partition() have absolutely nothing in common except that they take > > strings as arguments. :) > >both split on a given token. partition splits once, and returns all three >parts, while piece returns the part you ask for No, because looking at that URL, there is no piece that is the token split on. partition() always returns 3 parts for 1 occurrence of the token, whereas $PIECE only has 2. >(the 3-argument form is >similar to x.split(s)[i]) Which is quite thoroughly unlike partition. From pje at telecommunity.com Tue Aug 30 17:27:54 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 30 Aug 2005 11:27:54 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> References: <4314542C.7080000@gmail.com> <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> Message-ID: <5.1.1.6.0.20050830112541.01b25cb0@mail.telecommunity.com> At 04:28 PM 8/30/2005 +0200, Eric Nieuwland wrote: >I have some use cases with: > cut_at = some_str.find(sep) > head, tail = some_str[:cut_at], some_str[cut_at:] >and: > cut_at = some_str.find(sep) > head, tail = some_str[:cut_at], some_str[cut_at+offset:] # offset != >len(sep) > >So if partition() [or whatever it'll be called] could have an optional >second argument that defines the width of the 'cut' made, I would be >helped enormously. The default for this second argument would be >len(sep), to preserve the current proposal. Unrelated comment: maybe 'cut()' and rcut() would be nice short names. I'm not seeing the offset parameter, though, because this: head,__,tail = some_str.cut(sep) tail = tail[offset:] is still better than the original example. 
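Phillip's point, that the variable-width cut needs no extra argument, can be sketched on top of plain partition(). This is an illustrative helper only: cut and its width parameter are hypothetical names, and width is counted from the start of the separator, as in Eric's slices (so width == len(sep) gives the ordinary partition() result).

```python
def cut(some_str, sep, width=None):
    """Split at the first sep, dropping `width` characters starting at sep.

    Hypothetical helper for illustration; returns (some_str, '') when
    sep is absent instead of slicing with a -1 index.
    """
    if width is None:
        width = len(sep)
    head, found, tail = some_str.partition(sep)
    if not found:
        return some_str, ""
    # Rebuild the text from the separator onwards, then drop `width` chars.
    rest = found + tail
    return head, rest[width:]


print(cut("key=value", "="))       # default width: separator removed
print(cut("key=>value", "=", 2))   # wider cut: also drops the '>'
print(cut("key=value", "=", 0))    # zero width: separator stays in the tail
```

Unlike the find()-based originals, this version does not silently chop the last character when the separator is missing.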
From bjourne at gmail.com Tue Aug 30 17:29:07 2005 From: bjourne at gmail.com (BJörn Lindqvist) Date: Tue, 30 Aug 2005 17:29:07 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <17172.29880.406663.490117@montanaro.dyndns.org> References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> <431455EC.6050402@gmail.com> <17172.29880.406663.490117@montanaro.dyndns.org> Message-ID: <740c3aec050830082963aa8b42@mail.gmail.com> I like partition() but maybe even better would be if strings supported slicing by string indices. key, sep, val = 'foo = 32'.partition('=') would be: key, val = 'foo = 32'[:'='], 'foo = 32'['=':] To me it feels very natural to extend Python's slices to string indices and would cover most of partition()'s use cases. The if sep: idiom of partition() could be solved by throwing an IndexError: e.g: _, sep, port = host.partition(':') if sep: try: int(port) except ValueError: becomes: try: port = host[':':] int(port) except IndexError: pass except ValueError: An advantage of using slices would be that you could specify both a beginning and ending string like this: >>> s 'http://192.168.12.22:8080' >>> s['http://':':'] '192.168.12.22' Sorry if this idea has already been discussed. -- mvh Björn From eric.nieuwland at xs4all.nl Tue Aug 30 17:35:24 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Tue, 30 Aug 2005 17:35:24 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <431474D0.70300@cirad.fr> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> Message-ID: Pierre Barbier de Reuille wrote: > Or you want to have some "partition" method which accept regular > expressions: > > head, sep, tail = some_str.partition(re.compile(sep+'.'*offset)) Neat!
+1 on regexps as an argument to partition(). --eric From solipsis at pitrou.net Tue Aug 30 17:40:26 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 30 Aug 2005 17:40:26 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> Message-ID: <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr> > Neat! > +1 on regexps as an argument to partition(). It sounds better to have a separate function and call it re.partition, doesn't it ? By the way, re.partition() is *really* useful compared to re.split() because with the latter you don't which string precisely matched the pattern (it isn't an issue with str.split() since matching is exact). Regards Antoine. From shane at hathawaymix.org Tue Aug 30 17:42:26 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Tue, 30 Aug 2005 09:42:26 -0600 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> Message-ID: <43147E62.2060106@hathawaymix.org> Eric Nieuwland wrote: > Pierre Barbier de Reuille wrote: > >>Or you want to have some "partition" method which accept regular >>expressions: >> >>head, sep, tail = some_str.partition(re.compile(sep+'.'*offset)) > > > Neat! > +1 on regexps as an argument to partition(). Are you sure? 
I would instead expect to find a .partition method on a regexp object: head, sep, tail = re.compile(sep+'.'*offset).partition(some_str) Shane From pierre.barbier at cirad.fr Tue Aug 30 17:50:13 2005 From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille) Date: Tue, 30 Aug 2005 17:50:13 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <43147E62.2060106@hathawaymix.org> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> Message-ID: <43148035.7020007@cirad.fr> Shane Hathaway wrote: > Eric Nieuwland wrote: > >> Pierre Barbier de Reuille wrote: >> >>> Or you want to have some "partition" method which accept regular >>> expressions: >>> >>> head, sep, tail = some_str.partition(re.compile(sep+'.'*offset)) >> >> >> >> Neat! >> +1 on regexps as an argument to partition(). > > > Are you sure? I would instead expect to find a .partition method on a > regexp object: > > head, sep, tail = re.compile(sep+'.'*offset).partition(some_str) Well, to be consistent with the current re module, it would be better to follow Antoine's suggestion: head, sep, tail = re.partition(re.compile(sep+'.'*offset), some_str) Pierre > > Shane > -- Pierre Barbier de Reuille INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP Botanique et Bio-informatique de l'Architecture des Plantes TA40/PSII, Boulevard de la Lironde 34398 MONTPELLIER CEDEX 5, France tel : (33) 4 67 61 65 77 fax : (33) 4 67 61 56 68 From shane at hathawaymix.org Tue Aug 30 17:55:28 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Tue, 30 Aug 2005 09:55:28 -0600 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <43148035.7020007@cirad.fr> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr> Message-ID: <43148170.1020903@hathawaymix.org> Pierre Barbier de Reuille wrote: > > Shane Hathaway wrote: >>Are you sure? I would instead expect to find a .partition method on a >>regexp object: >> >> head, sep, tail = re.compile(sep+'.'*offset).partition(some_str) > > > Well, to be consistent with current re module, it would be better to > follow Antoine's suggestion : > > head, sep, tail = re.partition(re.compile(sep+'.'*offset), some_str) Actually, consistency with the current re module requires new methods to be added in *both* places. Apparently Python believes TMTOWTDI is the right practice here. ;-) See search, match, split, findall, finditer, sub, and subn: http://docs.python.org/lib/node114.html http://docs.python.org/lib/re-objects.html Shane From tim.peters at gmail.com Tue Aug 30 18:14:55 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 30 Aug 2005 12:14:55 -0400 Subject: [Python-Dev] setdefault's second argument In-Reply-To: <43148170.1020903@hathawaymix.org> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr> <43148170.1020903@hathawaymix.org> Message-ID: <1f7befae05083009146a9c35ce@mail.gmail.com> Anyone remember why setdefault's second argument is optional? >>> d = {} >>> d.setdefault(666) >>> d {666: None} just doesn't seem useful.
In fact, it's so silly that someone calling setdefault with just one arg seems far more likely to have a bug in their code than to get an outcome they actually wanted. Haven't found any 1-arg uses of setdefault() either, except for test code verifying that you _can_ omit the second arg. This came up in ZODB-land, where someone volunteered to add setdefault() to BTrees. Some flavors of BTrees are specialized to hold integer or float values, and then setting None as a value is impossible. I resolved it there by making BTree.setdefault() require both arguments. It was a surprise to me that dict.setdefault() didn't also require both. If there isn't a sane use case for leaving the second argument out, I'd like to drop the possibility in P3K (assuming setdefault() survives). From jcarlson at uci.edu Tue Aug 30 18:26:10 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 30 Aug 2005 09:26:10 -0700 Subject: [Python-Dev] setdefault's second argument In-Reply-To: <1f7befae05083009146a9c35ce@mail.gmail.com> References: <43148170.1020903@hathawaymix.org> <1f7befae05083009146a9c35ce@mail.gmail.com> Message-ID: <20050830092200.8B03.JCARLSON@uci.edu> Tim Peters wrote: > > Anyone remember why setdefault's second argument is optional? > > >>> d = {} > >>> d.setdefault(666) > >>> d > {666: None} For quick reference for other people, d.setdefault(key [, value]) returns the value that is currently there, or just assigned. The only case where it makes sense to omit the value parameter is in the case where value=None. > just doesn't seem useful. In fact, it's so silly that someone calling > setdefault with just one arg seems far more likely to have a bug in > their code than to get an outcome they actually wanted. Haven't found > any 1-arg uses of setdefault() either, except for test code verifying > that you _can_ omit the second arg. > > This came up in ZODB-land, where someone volunteered to add > setdefault() to BTrees. 
Some flavors of BTrees are specialized to > hold integer or float values, and then setting None as a value is > impossible. I resolved it there by making BTree.setdefault() require > both arguments. It was a surprise to me that dict.setdefault() didn't > also require both. > > If there isn't a sane use case for leaving the second argument out, > I'd like to drop the possibility in P3K (assuming setdefault() > survives). I agree, at least that in the case where people actually want None (the only time where the second argument is really optional, I think that they should have to specify it. EIBTI and all that. - Josiah From hoffman at ebi.ac.uk Tue Aug 30 18:19:22 2005 From: hoffman at ebi.ac.uk (Michael Hoffman) Date: Tue, 30 Aug 2005 17:19:22 +0100 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <43148170.1020903@hathawaymix.org> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr> <43148170.1020903@hathawaymix.org> Message-ID: [Shane Hathaway writes about the existence of both module-level functions and object methods to do the same regex operations] > Apparently Python believes TMTOWTDI is the right practice here. ;-) > See search, match, split, findall, finditer, sub, and subn: > > http://docs.python.org/lib/node114.html > http://docs.python.org/lib/re-objects.html Dare I ask whether the uncompiled versions should be considered for removal in Python 3.0? 
*puts on his asbestos jacket* -- Michael Hoffman European Bioinformatics Institute From tim.peters at gmail.com Tue Aug 30 18:38:55 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 30 Aug 2005 12:38:55 -0400 Subject: [Python-Dev] setdefault's second argument In-Reply-To: <20050830092200.8B03.JCARLSON@uci.edu> References: <43148170.1020903@hathawaymix.org> <1f7befae05083009146a9c35ce@mail.gmail.com> <20050830092200.8B03.JCARLSON@uci.edu> Message-ID: <1f7befae05083009384acec6c9@mail.gmail.com> [Tim Peters] >> Anyone remember why setdefault's second argument is optional? >> >> >>> d = {} >> >>> d.setdefault(666) >> >>> d >> {666: None} >> ... [Josiah Carlson] > For quick reference for other people, d.setdefault(key [, value]) > returns the value that is currently there, or just assigned. The only > case where it makes sense to omit the value parameter is in the case > where value=None. Yes, that's right. Overwhelmingly most often in the wild, a just-constructed empty container object is passed as the second argument. Rarely, I see 0 passed. I've found no case where None is wanted (except in the test suite, verifying that the 1-argument form does indeed default to using None). > ... > I agree, at least that in the case where people actually want None (the > only time where the second argument is really optional, I think that > they should have to specify it. EIBTI and all that. And since there apparently aren't any such cases outside of Python's test suite, that wouldn't be much of a burden on them . From raymond.hettinger at verizon.net Tue Aug 30 18:35:05 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 30 Aug 2005 12:35:05 -0400 Subject: [Python-Dev] setdefault's second argument In-Reply-To: <1f7befae05083009146a9c35ce@mail.gmail.com> Message-ID: <003301c5ad80$c72c1020$8832c797@oemcomputer> [Tim] > Anyone remember why setdefault's second argument is optional? IIRC, this is a vestige from its ancestor. 
The proposal for setdefault() described it as behaving like dict.get() but inserting the key if not found. > Haven't found > any 1-arg uses of setdefault() either, except for test code verifying > that you _can_ omit the second arg. Likewise, I found zero occurrences in the library, in my cumulative code base, and in the third-party packages on my system. > If there isn't a sane use case for leaving the second argument out, > I'd like to drop the possibility in P3K (assuming setdefault() > survives). Given a lack of legitimate use cases, do we have to wait for Py3.0? It could likely be fixed directly and not impact any code that people care about. Raymond From mcherm at mcherm.com Tue Aug 30 18:39:49 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Tue, 30 Aug 2005 09:39:49 -0700 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) Message-ID: <20050830093949.1rqenezh01cs0w0c@login.werra.lunarpages.com> Michael Hoffman writes: > Dare I ask whether the uncompiled versions [of re object methods] should > be considered for removal in Python 3.0? > > *puts on his asbestos jacket* No flames here, but I'd rather leave them. The docs make it clear that the two sets of functions/methods are equivalent, so the conceptual overhead is small (at least it doesn't scale with the number of methods in re). The docs make it clear that the compiled versions are faster, so serious users should prefer them. But the uncompiled versions are preferable in one special situation: short simple scripts -- the kind of thing often done with shell scripting except that Python is Better (TM). For these uses, performance is irrelevant and it turns a 2-line construct into a single line. Of course the uncompiled versions can be written as little 2-line functions but that's even WORSE for short simple scripts.
Nearly everything I write these days is larger and more complex, but I retain a soft spot for short simple scripts and want Python to continue to be the best tool available for these tasks. -- Michael Chermside From tim.peters at gmail.com Tue Aug 30 18:56:11 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 30 Aug 2005 12:56:11 -0400 Subject: [Python-Dev] setdefault's second argument In-Reply-To: <003301c5ad80$c72c1020$8832c797@oemcomputer> References: <1f7befae05083009146a9c35ce@mail.gmail.com> <003301c5ad80$c72c1020$8832c797@oemcomputer> Message-ID: <1f7befae05083009565974978c@mail.gmail.com> [Raymond] > setdefault() described it as behaving like dict.get() but inserting the > key if not found. ... > Likewise, I found zero occurrences in the library, in my cumulative code > base, and in the third-party packages on my system. [Tim] >> If there isn't a sane use case for leaving the second argument out, >> I'd like to drop the possibility in P3K (assuming setdefault() >> survives). [Raymond] > Give a lack of legitimate use cases, do we have to wait to Py3.0? It > could likely be fixed directly and not impact any code that people care > about. That would be fine by me, but any change really requires a deprecation-warning release first. Dang! I may have just found a use, in Zope's lib/python/docutils/parsers/rst/directives/images.py (which is part of docutils, not really part of Zope): figwidth = options.setdefault('figwidth') figclass = options.setdefault('figclass') del options['figwidth'] del options['figclass'] I'm still thinking about what that's trying to do <0.5 wink>. Assuming options is a dict-like thingie, it probably meant to do: figwidth = options.pop('figwidth', None) figclass = options.pop('figclass', None) David, are you married to that bizarre use of setdefault ? Whatever, I can't claim there are _no_ uses of 1-arg setdefault() in the wild any more. 
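
A minimal sketch of the equivalence suggested above, assuming `options` is an ordinary dict. The dict contents here are made up for illustration and are not taken from docutils; the point is only that 1-arg setdefault() followed by del ends in the same state as pop() with an explicit None default.

```python
# Hypothetical stand-in for the docutils options mapping; contents are made up.
options = {"figwidth": "50%", "align": "left"}

# The pattern found in the wild: 1-arg setdefault() followed by del.
opts_a = dict(options)
figwidth_a = opts_a.setdefault("figwidth")  # key present: returns "50%"
figclass_a = opts_a.setdefault("figclass")  # key absent: inserts None, returns None
del opts_a["figwidth"]
del opts_a["figclass"]

# The suggested rewrite: pop() with an explicit default.
opts_b = dict(options)
figwidth_b = opts_b.pop("figwidth", None)
figclass_b = opts_b.pop("figclass", None)

# Both routes yield the same values and leave the same dict behind.
same_outcome = (figwidth_a, figclass_a, opts_a) == (figwidth_b, figclass_b, opts_b)
```

The pop() form does the lookup and the removal in one call per key and never stores a throwaway None in the dict along the way.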
From jcarlson at uci.edu Tue Aug 30 19:06:55 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 30 Aug 2005 10:06:55 -0700 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <43142093.4080104@cirad.fr> References: <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> Message-ID: <20050830091228.8B00.JCARLSON@uci.edu>

Pierre Barbier de Reuille wrote:
> Well, what it does is exactly what I thought, you can express most of the
> use-cases of partition with:
>
> head, sep, tail = s.partition(sep)
> if not sep:
>     #do something when it does not work
> else:
>     #do something when it works
>
> And I propose to replace it by :
>
> try:
>     head, sep, tail = s.partition(sep)
>     # do something when it works
> except SeparatorError:
>     # do something when it does not work

No, you can't. As Tim Peters pointed out, in order to be correct, you
need to use...

try:
    head, found, tail = s.partition(sep)
except ValueError:
    # do something when it can't find sep
else:
    # do something when it can find sep

By embedding the 'found' case inside the try/except clause as you offer,
you could be hiding another exception, which is incorrect.

> What I'm talking about is consistency. In most cases in Python, or at
> least AFAIU, error testing is avoided and exception launching is
> preferred mainly for efficiency reasons. So my question remains: why
> prefer for that specific method returning an "error" value (i.e. an
> empty separator) against an exception ?

It is known among those who tune their Python code that try/except is
relatively expensive when exceptions are raised, but not significantly
faster (if at all) when they are not. I'll provide an updated set of
microbenchmarks...

>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...         _ = x.find('h')
...         if _ >= 0:
...             pass
...         else:
...             pass
...     print time.time()-t
...
0.84299993515
>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...         try:
...             _ = x.index('h')
...         except ValueError:
...             pass
...         else:
...             pass
...     print time.time()-t
...
0.81299996376

BUT!

>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...         try:
...             _ = x.index('i')
...         except ValueError:
...             pass
...         else:
...             pass
...     print time.time()-t
...
4.29700016975

We should subtract the time of the for loop, the method call overhead,
perhaps the integer object creation/fetch, and the assignment.
str.__len__() is pretty fast (really just a member check, which is at a
constant offset...), let us use that.

>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...         _ = x.__len__()
...     print time.time()-t
...
0.5

So, subtracting that .5 seconds from all the cases gives us...

0.343 seconds for .find's comparison
0.313 seconds for .index's exception handling when an exception is not
raised
3.797 seconds for .index's exception handling when an exception is
raised.

In the case of a string being found, .index is about 10% faster than
.find. In the case of a string not being found, .index's exception
handling mechanics are over 11 times slower than .find's comparison.

Those numbers should speak for themselves. In terms of the strings being
automatically chopped up vs. manually chopping them up with slices, it
is obvious which will be faster: C-level slicing.

I agree with Raymond that if you are going to poo-poo on str.partition()
not raising an exception, you should do some translations using the
correct structure that Tim Peters provided, and post them here on
python-dev as 'proof' that raising an exception in the cases provided is
better.

- Josiah

From rrr at ronadam.com Tue Aug 30 19:09:41 2005 From: rrr at ronadam.com (Ron Adam) Date: Tue, 30 Aug 2005 13:09:41 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> Message-ID: <431492D5.6090102@ronadam.com> Raymond Hettinger wrote: > [Delaney, Timothy (Tim)] > >>+1 >> >>This is very useful behaviour IMO. > > > Thanks. It seems to be getting +1s all around. Wow, a lot of approvals! :) >>Have the precise return values of partition() been defined? +1 on the name partition, I considered split or parts, but I agree partition reads better and since it's not so generic as something like get_parts, it creates a stronger identity making the code clearer. >>IMO the most useful (and intuitive) behaviour is to return strings in >>all cases. Wow, a lot of approvals! :-) A possibility to consider: Instead of partition() and rpartition(), have just partition with an optional step or skip value which can be a positive or negative nonzero integer. head, found, tail = partition(sep, [step=1]) step = -1 would look for sep from the right. step = 2, would look for the second sep from left. step = -2, would look for the second sep from the right. Default of course would be 1, find first sep from the left. This would allow creating an iterator that could iterate through a string splitting on each sep from either the left, or right. Whether a 0 or a |value|>len(string) causes an exception would need to be decided. I can't think of an obvious use for a partition iterator at the moment, maybe someone could find an example. In any case, finding the second, or third sep is probably common enough. Cheers, Ron From skip at pobox.com Tue Aug 30 19:13:37 2005 From: skip at pobox.com (skip@pobox.com) Date: Tue, 30 Aug 2005 12:13:37 -0500 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <43145D03.6090801@gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> <431455EC.6050402@gmail.com> <43145D03.6090801@gmail.com> Message-ID: <17172.37825.153046.857408@montanaro.dyndns.org> Nick> I momentarily forgot that "part" is also a verb in its own right, Nick> with the right meaning, too (think "parting your hair" and Nick> "parting the Red Sea"). If I remember correctly from watching "The Ten Commandments" as a kid, I believe Charlton Heston only parted the Red Sea in one place... Skip From barry at python.org Tue Aug 30 19:15:59 2005 From: barry at python.org (Barry Warsaw) Date: Tue, 30 Aug 2005 13:15:59 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <5.1.1.6.0.20050830112541.01b25cb0@mail.telecommunity.com> References: <4314542C.7080000@gmail.com> <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <5.1.1.6.0.20050830112541.01b25cb0@mail.telecommunity.com> Message-ID: <1125422159.10126.11.camel@geddy.wooz.org> On Tue, 2005-08-30 at 11:27, Phillip J. Eby wrote: > >So if partition() [or whatever it'll be called] could have an optional > >second argument that defines the width of the 'cut' made, I would be > >helped enormously. The default for this second argument would be > >len(sep), to preserve the current proposal. +1 on the concept -- very nice Raymond. > Unrelated comment: maybe 'cut()' and rcut() would be nice short names. FWIW, +1 on .cut(), +0 on .partition() -Barry
From skip at pobox.com Tue Aug 30 19:29:04 2005 From: skip at pobox.com (skip@pobox.com) Date: Tue, 30 Aug 2005 12:29:04 -0500 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr> Message-ID: <17172.38752.55260.62198@montanaro.dyndns.org> Antoine> By the way, re.partition() is *really* useful compared to Antoine> re.split() because with the latter you don't which string Antoine> precisely matched the pattern (it isn't an issue with Antoine> str.split() since matching is exact). Just group your re: >>> import re >>> >>> re.split("ab", "abracadabra") ['', 'racad', 'ra'] >>> re.split("(ab)", "abracadabra") ['', 'ab', 'racad', 'ab', 'ra'] and you get it in the return value. In fact, re.split with a grouped re is very much like Raymond's str.partition method without the guarantee of returning a three-element list. Skip From skip at pobox.com Tue Aug 30 19:30:26 2005 From: skip at pobox.com (skip@pobox.com) Date: Tue, 30 Aug 2005 12:30:26 -0500 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr> Message-ID: <17172.38834.592674.741120@montanaro.dyndns.org> In fact, re.split with a grouped re is very much like Raymond's str.partition method without the guarantee of returning a three-element list. Whoops... Should also have included the maxsplit=1 constraint. Skip From s.percivall at chello.se Tue Aug 30 19:30:18 2005 From: s.percivall at chello.se (Simon Percivall) Date: Tue, 30 Aug 2005 19:30:18 +0200 Subject: [Python-Dev] partition() In-Reply-To: <001d01c5ad75$2841b1a0$8832c797@oemcomputer> References: <001d01c5ad75$2841b1a0$8832c797@oemcomputer> Message-ID: <8C21C161-7B36-45F1-AA64-9E21B3F2942E@chello.se> On 30 aug 2005, at 17.11, Raymond Hettinger wrote: > Hey guys, don't get lost in random naming suggestions (cut, snap, > part, > parts, yada yada yada). Each of those is much less descriptive and > provides less differentiation from other string methods. Saving a few > characters is not worth introducing ambiguity. Trisect would be pretty descriptive ... //Simon From python at discworld.dyndns.org Tue Aug 30 19:32:35 2005 From: python at discworld.dyndns.org (Charles Cazabon) Date: Tue, 30 Aug 2005 11:32:35 -0600 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <20050830093949.1rqenezh01cs0w0c@login.werra.lunarpages.com> References: <20050830093949.1rqenezh01cs0w0c@login.werra.lunarpages.com> Message-ID: <20050830173235.GC25381@discworld.dyndns.org> Michael Chermside wrote: > Michael Hoffman writes: > > Dare I ask whether the uncompiled versions [of re object methods] should > > be considered for removal in Python 3.0? 
> > > > *puts on his asbestos jacket* > > No flames here, but I'd rather leave them. Me too. I have various programs that construct lots of large REs on the fly, knowing they'll only be used once. Not having to compile them to objects inline makes the code cleaner and easier to read. Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://pyropus.ca/software/ ----------------------------------------------------------------------- From skip at pobox.com Tue Aug 30 19:34:01 2005 From: skip at pobox.com (skip@pobox.com) Date: Tue, 30 Aug 2005 12:34:01 -0500 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr> <43148170.1020903@hathawaymix.org> Message-ID: <17172.39049.164534.215793@montanaro.dyndns.org> >> http://docs.python.org/lib/re-objects.html Michael> Dare I ask whether the uncompiled versions should be considered Michael> for removal in Python 3.0? It is quite convenient to not have to compile regular expressions in most cases. The module takes care of compiling your patterns and caching them for you. Skip From skip at pobox.com Tue Aug 30 19:40:18 2005 From: skip at pobox.com (skip@pobox.com) Date: Tue, 30 Aug 2005 12:40:18 -0500 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) 
In-Reply-To: <1125422159.10126.11.camel@geddy.wooz.org> References: <4314542C.7080000@gmail.com> <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <5.1.1.6.0.20050830112541.01b25cb0@mail.telecommunity.com> <1125422159.10126.11.camel@geddy.wooz.org> Message-ID: <17172.39426.532398.825596@montanaro.dyndns.org> >> Unrelated comment: maybe 'cut()' and rcut() would be nice short names. Barry> FWIW, +1 on .cut(), +0 on .partition() As long as people are free associating: snip(), excise(), explode(), invade_iraq()... Skip From mwh at python.net Tue Aug 30 19:43:18 2005 From: mwh at python.net (Michael Hudson) Date: Tue, 30 Aug 2005 18:43:18 +0100 Subject: [Python-Dev] partition() In-Reply-To: <431450AF.4020902@gmail.com> (Nick Coghlan's message of "Tue, 30 Aug 2005 22:27:27 +1000") References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> <2my86jq2pl.fsf@starship.python.net> <431450AF.4020902@gmail.com> Message-ID: <2mu0h7pbgp.fsf@starship.python.net> Nick Coghlan writes: > Michael Hudson wrote: >> partition() works for me. It's not perfect, but it'll do. The idea >> works for me rather more; it even simplifies the >> >> if s.startswith(prefix): >> t = s[len(prefix):] >> ... > > How would you do it? Something like: > > head, found, tail = s.partition(prefix) > if found and not head: > ... > > I guess I agree that's an improvement - only a slight one, though. Yes. I seem to fairly often[1] do this with prefix as a literal so only having to mention it once would be a win for me. Cheers, mwh [1] But not often enough to have defined a function to do this job, it seems. -- I must be missing something. It is not possible to be this stupid. you don't meet a lot of actual people, do you? 
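
A small sketch of the prefix-stripping comparison discussed above. It assumes a Python in which the proposed str.partition() method exists (it was still a proposal at the time of this thread), and the helper names strip_prefix_slicing and strip_prefix_partition are made up for illustration.

```python
def strip_prefix_slicing(s, prefix):
    """Existing idiom: the prefix is mentioned twice (test, then slice)."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return None


def strip_prefix_partition(s, prefix):
    """partition() idiom: the prefix is mentioned once."""
    head, found, tail = s.partition(prefix)
    # `found` is non-empty only if prefix occurs somewhere in s;
    # an empty `head` means that occurrence is at the very start.
    if found and not head:
        return tail
    return None


url = "http://www.python.org"
stripped = strip_prefix_partition(url, "http://")
```

The two helpers agree for any non-empty prefix. They differ for an empty prefix: startswith('') is true, while partition('') raises ValueError. The partition() form also scans the whole string when the prefix is not at the start, so the gain is in readability rather than speed.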
From barry at python.org Tue Aug 30 20:19:24 2005 From: barry at python.org (Barry Warsaw) Date: Tue, 30 Aug 2005 14:19:24 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <20050830093949.1rqenezh01cs0w0c@login.werra.lunarpages.com> References: <20050830093949.1rqenezh01cs0w0c@login.werra.lunarpages.com> Message-ID: <1125425964.10961.3.camel@geddy.wooz.org> On Tue, 2005-08-30 at 12:39, Michael Chermside wrote: > Michael Hoffman writes: > > Dare I ask whether the uncompiled versions [of re object methods] should > > be considered for removal in Python 3.0? > No flames here, but I'd rather leave them. The docs make it clear that > the two sets of functions/methods are equivalent, so the conceptual > overhead is small (at least it doesn't scale with the number of methods > in re). The docs make it clear that the compiled versions are faster, so > serious users should prefer them. But the uncompiled versions are > preferable in one special situation: short simple scripts -- the kind > of thing often done with shell scripting except that Python is Better (TM). > For these uses, performance is irrelevant and it turns a 2-line > construct into a single line. Although it's mildly annoying that the docs describe the compiled method names in terms of the uncompiled functions. I always find myself looking up the regexp object's API only to be shuffled off to the module's API and then having to do the argument remapping myself. -Barry From hyeshik at gmail.com Tue Aug 30 20:24:40 2005 From: hyeshik at gmail.com (Hye-Shik Chang) Date: Wed, 31 Aug 2005 03:24:40 +0900 Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <000701c5abdd$4f0b7440$d206a044@oemcomputer> References: <4311B7B6.8070503@egenix.com> <000701c5abdd$4f0b7440$d206a044@oemcomputer> Message-ID: <4f0b69dc0508301124423da48e@mail.gmail.com> On 8/28/05, Raymond Hettinger wrote: > >>> s = 'http://www.python.org' > >>> partition(s, '://') > ('http', '://', 'www.python.org') > >>> partition(s, '?') > ('http://www.python.org', '', '') > >>> partition(s, 'http://') > ('', 'http://', 'www.python.org') > >>> partition(s, 'org') > ('http://www.python.', 'org', '') > What would be a result for rpartition(s, '?') ? ('', '', 'http://www.python.org') or ('http://www.python.org', '', '') BTW, I wrote a somewhat preliminary patch for this functionality to let you save little of your time. :-) http://people.freebsd.org/~perky/partition-r1.diff Hye-Shik From raymond.hettinger at verizon.net Tue Aug 30 20:25:17 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 30 Aug 2005 14:25:17 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <431492D5.6090102@ronadam.com> Message-ID: <004601c5ad90$2c847020$8832c797@oemcomputer> [Ron Adam] > This would allow creating an iterator that could iterate through a string > splitting on each sep from either the left, or right. For uses more complex than basic partitioning, people should shift to more powerful tools like re.finditer(), re.findall(), and re.split(). > I can't think of an obvious use for a partition iterator at the moment, > maybe someone could find an example. I prefer to avoid variants that are in search of a purpose. > In any case, finding the second, > or third sep is probably common enough.
That case should be handled with consecutive partitions: # keep everything after the second 'X' head, found, s = s.partition('X') head, found, s = s.partition('x') Raymond From raymond.hettinger at verizon.net Tue Aug 30 20:30:46 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 30 Aug 2005 14:30:46 -0400 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <4f0b69dc0508301124423da48e@mail.gmail.com> Message-ID: <004701c5ad90$f0faeec0$8832c797@oemcomputer> [Hye-Shik Chang] > What would be a result for rpartition(s, '?') ? > ('', '', 'http://www.python.org') > or > ('http://www.python.org', '', '') The former. The invariants for rpartition() are a mirror image of those for partition(). > BTW, I wrote a somewhat preliminary patch for this functionality > to let you save little of your time. :-) > > http://people.freebsd.org/~perky/partition-r1.diff Thanks. I've got one running already, but it is nice to have another for comparison. Raymond From paragate at gmx.net Tue Aug 30 20:37:10 2005 From: paragate at gmx.net (Wolfgang Lipp) Date: Tue, 30 Aug 2005 20:37:10 +0200 Subject: [Python-Dev] setdefault's second argument In-Reply-To: <1f7befae05083009146a9c35ce@mail.gmail.com> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr> <43148170.1020903@hathawaymix.org> <1f7befae05083009146a9c35ce@mail.gmail.com> Message-ID: On Tue, 30 Aug 2005 18:14:55 +0200, Tim Peters wrote: >>>> d = {} >>>> d.setdefault(666) >>>> d > {666: None} > > just doesn't seem useful. In fact, it's so silly that someone calling > setdefault with just one arg seems far more likely to have a bug in > their code than to get an outcome they actually wanted. Haven't found reminds me of dict.get()... 
i think in both cases being explicit:: beast = d.setdefault( 666, None ) beast = d.get( 666, None ) just reads better, allthemore since at least in my code what comes next is invariably a test 'if beast is None:...'. so beast = d.setdefault( 666 ) if beast is None: ... and beast = d.get( 666 ) if beast is None: ... a shorter but a tad too implicit for my feeling. _wolf From eric.nieuwland at xs4all.nl Tue Aug 30 20:41:21 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Tue, 30 Aug 2005 20:41:21 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr> Message-ID: <14e7b895b32acb6b09e992cb570f4b99@xs4all.nl> On 30 aug 2005, at 17:40, Antoine Pitrou wrote: >> Neat! >> +1 on regexps as an argument to partition(). > > It sounds better to have a separate function and call it re.partition, > doesn't it ? > By the way, re.partition() is *really* useful compared to re.split() > because with the latter you don't which string precisely matched the > pattern (it isn't an issue with str.split() since matching is exact). Nice, too. BUT, "spam! and eggs".partition(re.compile("!.*d")) more closely resembles "xyz".split(), and that is the way things have evolved up-to now. --eric From pje at telecommunity.com Tue Aug 30 20:44:44 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 30 Aug 2005 14:44:44 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) 
In-Reply-To: <004601c5ad90$2c847020$8832c797@oemcomputer> References: <431492D5.6090102@ronadam.com> Message-ID: <5.1.1.6.0.20050830143755.01fcc538@mail.telecommunity.com> At 02:25 PM 8/30/2005 -0400, Raymond Hettinger wrote: >That case should be handled with consecutive partitions: > ># keep everything after the second 'X' >head, found, s = s.partition('X') >head, found, s = s.partition('x') Or: s=s.partition('X')[2].partition('X')[2] which actually suggests a shorter, clearer way to do it: s = s.after('X').after('X') And the corresponding 'before' method, of course, such that if sep in s: s.before(sep), sep, s.after(sep) == s.partition(sep) Technically, these should probably be before_first and after_first, with the corresponding before_last and after_last corresponding to rpartition. From tim.peters at gmail.com Tue Aug 30 20:55:45 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 30 Aug 2005 14:55:45 -0400 Subject: [Python-Dev] setdefault's second argument In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr> <43148170.1020903@hathawaymix.org> <1f7befae05083009146a9c35ce@mail.gmail.com> Message-ID: <1f7befae0508301155a7baca3@mail.gmail.com> [Wolfgang Lipp] > reminds me of dict.get()... i think in both cases being explicit:: > > beast = d.setdefault( 666, None ) > beast = d.get( 666, None ) > > just reads better, allthemore since at least in my code what comes > next is invariably a test 'if beast is None:...'. so > > beast = d.setdefault( 666 ) > if beast is None: > ... Do you actually do this with setdefault()? It's not at all the same as the get() example next, because d.setdefault(666) may _also_ have the side effect of permanently adding a 666->None mapping to d. d.get(...) never mutates d. > and > > beast = d.get( 666 ) > if beast is None: > ... 
> > a shorter but a tad too implicit for my feeling. Nevertheless, 1-argument get() is used a lot. Outside the test suite, I've only found one use of 1-argument setdefault() so far, and it was a poor use (used two lines of code to emulate what dict.pop() does directly). From fredrik at pythonware.com Tue Aug 30 20:43:19 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 30 Aug 2005 20:43:19 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer><431414AB.4010005@cirad.fr><20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr><4314542C.7080000@gmail.com><65c606d6ef54240378726f4e4ad91f3d@xs4all.nl><431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr><43148170.1020903@hathawaymix.org> Message-ID: Michael Hoffman wrote: > Dare I ask whether the uncompiled versions should be considered for > removal in Python 3.0? > > *puts on his asbestos jacket* there are no uncompiled versions, so that's not a problem. if you mean the function level api, it's there for convenience. if you're using less than 100 expressions in your program, you don't really have to *explicitly* compile your expressions. the function api will do that for you all by itself. From fredrik at pythonware.com Tue Aug 30 19:54:25 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 30 Aug 2005 19:54:25 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer><200508301209.19693.anthony@interlink.com.au><5.1.1.6.0.20050829235726.029224c0@mail.telecommunity.com> <5.1.1.6.0.20050830112036.01b21aa8@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: >>both split on a given token. partition splits once, and returns all three >>parts, while piece returns the part you ask for > > No, because looking at that URL, there is no piece that is the token split > on. 
partition() always returns 3 parts for 1 occurrence of the token, > whereas $PIECE only has 2. so "absolutely nothing in common" has now turned into "does the same thing but doesn't return the value you passed to it" ? sorry for wasting my time. From fredrik at pythonware.com Tue Aug 30 20:53:23 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 30 Aug 2005 20:53:23 +0200 Subject: [Python-Dev] setdefault's second argument References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer><20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr><4314542C.7080000@gmail.com><65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr><43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr><43148170.1020903@hathawaymix.org> <1f7befae05083009146a9c35ce@mail.gmail.com> Message-ID: Tim Peters wrote: > Anyone remember why setdefault's second argument is optional? Some kind of symmetry with get, probably. if d.get(x) returns None if x doesn't exist, it makes some kind of sense that d.setdefault(x) returns None as well. Anyone remember why nobody managed to come up with a better name for setdefault (which is probably the worst name ever given to a method in the standard Python distribution) ? (if I were in charge, I'd rename it to something more informative. I'd also add a "join" built-in (similar to the good old string.join) and a "textfile" built- in (similar to open("U") plus support for encodings). but that's me. I want my code nice and tidy.) From fredrik at pythonware.com Tue Aug 30 21:15:19 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 30 Aug 2005 21:15:19 +0200 Subject: [Python-Dev] Proof of the pudding: str.partition() References: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer> Message-ID: Raymond Hettinger wrote: > Overall, I found that partition() usefully encapsulated commonly > occurring low-level programming patterns. In most cases, it completely > eliminated the need for slicing and indices. 
In several cases, code was > simplified dramatically; in some, the simplification was minor; and in a > few cases, the complexity was about the same. No cases were made worse. it is, however, a bit worrying that you end up ignoring one or more of the values in about 50% of your examples... > ! rest, _, query = rest.rpartition('?') > ! script, _, rest = rest.partition('/') > ! _, sep, port = host.partition(':') > ! head, sep, _ = path.rpartition('/') > ! line, _, _ = line.partition(';') # strip > chunk-extensions > ! host, _, port = host.rpartition(':') > ! head, _, tail = name.partition('.') > ! head, _, tail = tail.partition('.') > ! pname, found, _ = pname.rpartition('.') > ! head, _, tail = name.partition('.') > ! filename, _, arg = arg.rpartition(':') > ! line, _, _ = line.partition('#') > ! protocol, _, condition = meth.partition('_') > ! filename, _, _ = filename.partition(chr(0)) this is also a bit worrying > ! head, found, tail = seq.find('-') but that's more a problem with the test suite. From rrr at ronadam.com Tue Aug 30 21:37:18 2005 From: rrr at ronadam.com (Ron Adam) Date: Tue, 30 Aug 2005 15:37:18 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <5.1.1.6.0.20050830143755.01fcc538@mail.telecommunity.com> References: <431492D5.6090102@ronadam.com> <5.1.1.6.0.20050830143755.01fcc538@mail.telecommunity.com> Message-ID: <4314B56E.6070609@ronadam.com> Phillip J. Eby wrote: > At 02:25 PM 8/30/2005 -0400, Raymond Hettinger wrote: > >> That case should be handled with consecutive partitions: >> >> # keep everything after the second 'X' >> head, found, s = s.partition('X') >> head, found, s = s.partition('x') I was thinking of cases where head is everything before the second 'X'. A posible use case might be getting items in comma delimited string. 
> Or:
>
>     s=s.partition('X')[2].partition('X')[2]
>
> which actually suggests a shorter, clearer way to do it:
>
>     s = s.after('X').after('X')
>
> And the corresponding 'before' method, of course, such that if sep in s:
>
>     s.before(sep), sep, s.after(sep) == s.partition(sep)
>
> Technically, these should probably be before_first and after_first, with
> the corresponding before_last and after_last corresponding to rpartition.

Do you really think these are easier than:

    head, found, tail = s.partition('X',2)

I don't feel there is a need to avoid numbers entirely. In this case I think it's the better way to find the n'th separator and since it's an optional value I feel it doesn't add a lot of complication.

Anyway... It's just a suggestion.

Cheers,
Ron

From paragate at gmx.net Tue Aug 30 21:45:23 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Tue, 30 Aug 2005 21:45:23 +0200
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <1f7befae0508301155a7baca3@mail.gmail.com>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr> <43148170.1020903@hathawaymix.org> <1f7befae05083009146a9c35ce@mail.gmail.com> <1f7befae0508301155a7baca3@mail.gmail.com>
Message-ID:

On Tue, 30 Aug 2005 20:55:45 +0200, Tim Peters wrote:

> [Wolfgang Lipp]
>> reminds me of dict.get()... i think in both cases being explicit::
>>
>>     beast = d.setdefault( 666, None )
>> ...
>
> Do you actually do this with setdefault()?

well, actually more like::

    def f( x ): return x % 3
    R = {}
    for x in range( 30 ):
        R.setdefault( f( x ), [] ).append( x )

still contrived, but you get the idea. i was really excited when finding out that d.pop, d.get and d.setdefault work in very much the same way in respect to the default argument, and my code has greatly benefitted from that. e.g.
def f( **Q ): myoption = Q.pop( 'myoption', 42 ) if Q: raise TypeError(...) _w From pje at telecommunity.com Tue Aug 30 21:55:52 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 30 Aug 2005 15:55:52 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <200508301209.19693.anthony@interlink.com.au> <5.1.1.6.0.20050829235726.029224c0@mail.telecommunity.com> <5.1.1.6.0.20050830112036.01b21aa8@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050830154853.01b29a30@mail.telecommunity.com> At 07:54 PM 8/30/2005 +0200, Fredrik Lundh wrote: >Phillip J. Eby wrote: > > >>both split on a given token. partition splits once, and returns all three > >>parts, while piece returns the part you ask for > > > > No, because looking at that URL, there is no piece that is the token split > > on. partition() always returns 3 parts for 1 occurrence of the token, > > whereas $PIECE only has 2. > >so "absolutely nothing in common" has now turned into "does the >same thing but doesn't return the value you passed to it" ? $PIECE returns exactly one value. partition returns exactly 3. partition always returns the separator as one of the three values. $PIECE never does. How many more differences does it have to have before you consider them to be nothing alike? >sorry for wasting my time. And sorry for you being either illiterate or humor-impaired, to have missed the smiley on the sentence that said "absolutely nothing in common except having string arguments". You quoted it in your first reply, so it's not like it didn't make it into your email client. 
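For reference, the three-part contract argued over above can be checked against str.partition and str.rpartition as they later shipped in Python 2.5. This is only a minimal sketch: the sample strings and variable names are made up for illustration and do not come from the thread.

```python
# Illustrative strings only; they do not come from the thread.
url = "http://www.python.org/index.html?lang=en"

# partition() always returns exactly three strings: (before, sep, after).
found = url.partition("?")

# If the separator is absent, the middle slot is the empty string:
# partition() leaves the whole string first, rpartition() leaves it last.
missing = "no-query".partition("?")
rmissing = "no-query".rpartition("?")

# rpartition() splits at the last occurrence instead of the first.
first = "a/b/c".partition("/")
last = "a/b/c".rpartition("/")

for parts in (found, missing, rmissing, first, last):
    print(parts)
```

In every case the three pieces join back into the original string, which is why the separator slot has to be present even when nothing was found.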
From barry at python.org Tue Aug 30 22:18:38 2005 From: barry at python.org (Barry Warsaw) Date: Tue, 30 Aug 2005 16:18:38 -0400 Subject: [Python-Dev] setdefault's second argument In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com><65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr> <43148170.1020903@hathawaymix.org> <1f7befae05083009146a9c35ce@mail.gmail.com> Message-ID: <1125433118.10961.11.camel@geddy.wooz.org> On Tue, 2005-08-30 at 14:53, Fredrik Lundh wrote: > Some kind of symmetry with get, probably. if > > d.get(x) > > returns None if x doesn't exist, it makes some kind of sense that > > d.setdefault(x) I think that's right, and IIRC the specific detail about the optional second argument was probably hashed out in private Pythonlabs email, or over a tasty lunch of kung pao chicken. I don't have access to my private archives at the moment, though the public record seems to start about here: http://mail.python.org/pipermail/python-dev/2000-August/007819.html > Anyone remember why nobody managed to come up with a better name > for setdefault (which is probably the worst name ever given to a method > in the standard Python distribution) ? Heh. http://mail.python.org/pipermail/python-dev/2000-August/008059.html > (if I were in charge, I'd rename it to something more informative. Maybe like getorset() . Oh, and yeah, I don't care if we change .setdefault() to require its second argument -- I've never used it without one. But don't remove the method, it's quite handy. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050830/e7a714ee/attachment.pgp From tim.peters at gmail.com Tue Aug 30 22:27:44 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 30 Aug 2005 16:27:44 -0400 Subject: [Python-Dev] setdefault's second argument In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr> <43148170.1020903@hathawaymix.org> <1f7befae05083009146a9c35ce@mail.gmail.com> Message-ID: <1f7befae05083013274c1b7521@mail.gmail.com> [Fredrik Lundh] > ... > Anyone remember why nobody managed to come up with a better name > for setdefault (which is probably the worst name ever given to a method > in the standard Python distribution) ? I suggested a perfect name at the time: http://mail.python.org/pipermail/python-dev/2000-August/008036.html To save you from following that link, to this day I still mentally translate "setdefault" to "getorset" whenever I see it. That it didn't get that name is probably Skip's fault, for whining that "getorsetandget" would be "more accurate" . Actually, there's no evidence that Guido noticed: http://mail.python.org/pipermail/python-dev/2000-August/008059.html > (if I were in charge, I'd rename it to something more informative. I'd also > add a "join" built-in (similar to the good old string.join) and a "textfile" > built-in (similar to open("U") plus support for encodings). but that's me. I > want my code nice and tidy.) 
I'm not sure who is in charge, but I am sure they can be bribed ;-) From raymond.hettinger at verizon.net Tue Aug 30 22:27:48 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue, 30 Aug 2005 16:27:48 -0400 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: Message-ID: <005301c5ada1$4a52afc0$8832c797@oemcomputer> [Fredrik Lundh] > it is, however, a bit worrying that you end up ignoring one or more > of the values in about 50% of your examples... It drops to about 25% when you skip the ones that don't care about the found/not-found field: > > ! _, sep, port = host.partition(':') > > ! head, sep, _ = path.rpartition('/') > > ! line, _, _ = line.partition(';') # strip > > ! pname, found, _ = pname.rpartition('.') > > ! line, _, _ = line.partition('#') > > ! filename, _, _ = filename.partition(chr(0)) The remaining cases don't bug me much. They clearly say, ignore the left piece or ignore the right piece. We could, of course, make these clearer and more efficient by introducing more methods: s.before(sep) --> (left, sep) s.after(sep) --> (right, sep) s.rbefore(sep) --> (left, sep) s.r_after(sep) --> (right, sep) But who wants all of that? Raymond From fredrik at pythonware.com Tue Aug 30 22:46:45 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue, 30 Aug 2005 22:46:45 +0200 Subject: [Python-Dev] setdefault's second argument References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer><4314542C.7080000@gmail.com><65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr><43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr><43148170.1020903@hathawaymix.org><1f7befae05083009146a9c35ce@mail.gmail.com> <1f7befae05083013274c1b7521@mail.gmail.com> Message-ID: Tim Peters wrote: >> Anyone remember why nobody managed to come up with a better name >> for setdefault (which is probably the worst name ever given to a method >> in the standard Python distribution) ? 
> > I suggested a perfect name at the time: > > http://mail.python.org/pipermail/python-dev/2000-August/008036.html > > To save you from following that link, to this day I still mentally > translate "setdefault" to "getorset" whenever I see it. from this day, I'll do that as well. I have to admit that I had to follow that link anyway, just to make sure I wasn't involved in the decision at that time (which I wasn't, from what I can tell). But I stumbled upon this little naming protocol Protocol: if you have a suggestion for a name for this function, mail it to me. DON'T MAIL THE LIST. (If you mail it to the list, that name is disqualified.) Don't explain me why the name is good -- if it's good, I'll know, if it needs an explanation, it's not good. which I thought was most excellent, and something that we might PEP:ify for future use, until I realized that it gave us the "worst name ever"... oh well. From benji at benjiyork.com Tue Aug 30 23:06:06 2005 From: benji at benjiyork.com (Benji York) Date: Tue, 30 Aug 2005 17:06:06 -0400 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: <005301c5ada1$4a52afc0$8832c797@oemcomputer> References: <005301c5ada1$4a52afc0$8832c797@oemcomputer> Message-ID: <4314CA3E.3020606@benjiyork.com> Raymond Hettinger wrote: > [Fredrik Lundh] > >>it is, however, a bit worrying that you end up ignoring one or more >>of the values in about 50% of your examples... > > It drops to about 25% when you skip the ones that don't care about the > found/not-found field: > >>>! _, sep, port = host.partition(':') >>>! head, sep, _ = path.rpartition('/') >>>! line, _, _ = line.partition(';') # strip >>>! pname, found, _ = pname.rpartition('.') >>>! line, _, _ = line.partition('#') >>>! filename, _, _ = filename.partition(chr(0)) I know it's been discussed in the past, but this makes me wonder about language support for "dummy" or "don't care" placeholders for tuple unpacking. 
Would the above cases benefit from that, or (as has been suggested in the past) should slicing be used instead? Original: _, sep, port = host.partition(':') head, sep, _ = path.rpartition('/') line, _, _ = line.partition(';') pname, found, _ = pname.rpartition('.') line, _, _ = line.partition('#') Slicing: sep, port = host.partition(':')[1:] head, sep = path.rpartition('/')[:2] line = line.partition(';')[0] pname, found = pname.rpartition('.')[:2] line = line.partition('#')[0] I think I like the original better, but can't use "_" in my code because it's used for internationalization. -- Benji York From barry at python.org Tue Aug 30 23:05:52 2005 From: barry at python.org (Barry Warsaw) Date: Tue, 30 Aug 2005 17:05:52 -0400 Subject: [Python-Dev] setdefault's second argument In-Reply-To: References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <4314542C.7080000@gmail.com><65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr> <43148170.1020903@hathawaymix.org> <1f7befae05083009146a9c35ce@mail.gmail.com> <1f7befae05083013274c1b7521@mail.gmail.com> Message-ID: <1125435952.10961.13.camel@geddy.wooz.org> On Tue, 2005-08-30 at 16:46, Fredrik Lundh wrote: > But I stumbled upon this little naming protocol > > Protocol: if you have a suggestion for a name for this function, mail > it to me. DON'T MAIL THE LIST. (If you mail it to the list, that > name is disqualified.) Don't explain me why the name is good -- if > it's good, I'll know, if it needs an explanation, it's not good. > > which I thought was most excellent, and something that we might PEP:ify > for future use, until I realized that it gave us the "worst name ever"... /And/ the rule was self-admittedly broken by Guido not a few posts after that one. ;) -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050830/db154c7e/attachment.pgp From mcherm at mcherm.com Tue Aug 30 23:35:42 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Tue, 30 Aug 2005 14:35:42 -0700 Subject: [Python-Dev] Revising RE docs (was: partition() (was: Remove str.find in 3.0?)) Message-ID: <20050830143542.niq7a9s8bsrkc8ok@login.werra.lunarpages.com> Barry Warsaw writes: > Although it's mildly annoying that the docs describe the compiled method > names in terms of the uncompiled functions. I always find myself > looking up the regexp object's API only to be shuffled off to the > module's API and then having to do the argument remapping myself. An excellent point. Obviously, EITHER (1) the module functions ought to be documented by reference to the RE object methods, or vice versa: (2) document the RE object methods by reference to the module functions. (2) is what we have today, but I would prefer (1) to gently encourage people to use the precompiled objects (which are distinctly faster when re-used). Does anyone else think we ought to swap that around in the documentation? I'm not trying to assign more work to Fred... but if there were a python-dev consensus that this would be desirable, then perhaps someone would be encouraged to supply a patch. -- Michael Chermside From fdrake at acm.org Tue Aug 30 23:41:28 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue, 30 Aug 2005 17:41:28 -0400 Subject: [Python-Dev] Revising RE docs (was: partition() (was: Remove str.find in 3.0?)) In-Reply-To: <20050830143542.niq7a9s8bsrkc8ok@login.werra.lunarpages.com> References: <20050830143542.niq7a9s8bsrkc8ok@login.werra.lunarpages.com> Message-ID: <200508301741.28656.fdrake@acm.org> On Tuesday 30 August 2005 17:35, Michael Chermside wrote: > An excellent point. 
Obviously, EITHER (1) the module functions ought to > be documented by reference to the RE object methods, or vice versa: > (2) document the RE object methods by reference to the module functions. Agreed. I think the current arrangement is primarily a historical accident more than anything else, but I didn't write that section, so could be wrong. > Does anyone else think we ought to swap that around in the documentation? > I'm not trying to assign more work to Fred... but if there were a > python-dev consensus that this would be desirable, then perhaps someone > would be encouraged to supply a patch. I'd rather see it reversed from what it is as well. While I don't have the time myself (and don't consider it a critical issue), I certainly won't revert a patch to make the change without good reason. :-) -Fred -- Fred L. Drake, Jr. From goodger at python.org Tue Aug 30 22:24:44 2005 From: goodger at python.org (David Goodger) Date: Tue, 30 Aug 2005 16:24:44 -0400 Subject: [Python-Dev] setdefault's second argument In-Reply-To: <1f7befae05083009565974978c@mail.gmail.com> References: <1f7befae05083009146a9c35ce@mail.gmail.com> <003301c5ad80$c72c1020$8832c797@oemcomputer> <1f7befae05083009565974978c@mail.gmail.com> Message-ID: <4314C08C.6060302@python.org> [Tim Peters] > Dang! I may have just found a use, in Zope's > lib/python/docutils/parsers/rst/directives/images.py (which is part > of docutils, not really part of Zope): > > figwidth = options.setdefault('figwidth') > figclass = options.setdefault('figclass') > del options['figwidth'] > del options['figclass'] If a feature is available, it *will* eventually be used! Whose law is that? > I'm still thinking about what that's trying to do <0.5 wink>. The code needs to store the values of certain dict entries, then delete them. This is because the "options" dict is passed on to another function, where those entries are not welcome. 
The code above is simply shorter than this:

    if options.has_key('figwidth'):
        figwidth = options['figwidth']
        del options['figwidth']
    # again for 'figclass'

Alternatively,

    try:
        figwidth = options['figwidth']
        del options['figwidth']
    except KeyError:
        pass

It saves between one line and three lines of code per entry. But since those entries are probably not so common, it would actually be faster to use one of the above patterns.

> Assuming options is a dict-like thingie, it probably meant to do:
>
>     figwidth = options.pop('figwidth', None)
>     figclass = options.pop('figclass', None)

Yes, but the "pop" method was only added in Python 2.3. Docutils currently maintains compatibility with Python 2.1, so that's RIGHT OUT!

> David, are you married to that bizarre use of setdefault ?

No, not at all. In fact, I will vehemently deny that I ever wrote such code, and will continue to do so until someone looks up its history and proves that I'm guilty, which I probably am.

--
David Goodger
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 253 bytes
Desc: OpenPGP digital signature
Url : http://mail.python.org/pipermail/python-dev/attachments/20050830/8588c318/signature.pgp

From rrr at ronadam.com Wed Aug 31 00:45:54 2005
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 30 Aug 2005 18:45:54 -0400
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To: <4314CA3E.3020606@benjiyork.com>
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer> <4314CA3E.3020606@benjiyork.com>
Message-ID: <4314E1A2.4060409@ronadam.com>

Benji York wrote:
> Raymond Hettinger wrote:
>
>>[Fredrik Lundh]
>>
>>
>>>it is, however, a bit worrying that you end up ignoring one or more
>>>of the values in about 50% of your examples...
>>
>>It drops to about 25% when you skip the ones that don't care about the
>>found/not-found field:
>>
>>
>>>>! _, sep, port = host.partition(':')
>>>>!
head, sep, _ = path.rpartition('/')
>>>>! line, _, _ = line.partition(';') # strip
>>>>! pname, found, _ = pname.rpartition('.')
>>>>! line, _, _ = line.partition('#')
>>>>! filename, _, _ = filename.partition(chr(0))
>
>
> I know it's been discussed in the past, but this makes me wonder about
> language support for "dummy" or "don't care" placeholders for tuple
> unpacking. Would the above cases benefit from that, or (as has been
> suggested in the past) should slicing be used instead?
>
> Original:
> _, sep, port = host.partition(':')
> head, sep, _ = path.rpartition('/')
> line, _, _ = line.partition(';')
> pname, found, _ = pname.rpartition('.')
> line, _, _ = line.partition('#')
>
> Slicing:
> sep, port = host.partition(':')[1:]
> head, sep = path.rpartition('/')[:2]
> line = line.partition(';')[0]
> pname, found = pname.rpartition('.')[:2]
> line = line.partition('#')[0]
>
> I think I like the original better, but can't use "_" in my code because
> it's used for internationalization.
> --
> Benji York

For cases where single values are desired, attributes could work.

Slicing:
    line = line.partition(';').head
    line = line.partition('#').head

But it gets awkward as soon as you want more than one.

    sep, port = host.partition(':').head, host.partition(':').sep

Ron

From shane at hathawaymix.org Wed Aug 31 01:00:43 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Tue, 30 Aug 2005 17:00:43 -0600
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To: <4314E1A2.4060409@ronadam.com>
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer> <4314CA3E.3020606@benjiyork.com> <4314E1A2.4060409@ronadam.com>
Message-ID: <4314E51B.1050507@hathawaymix.org>

Ron Adam wrote:
> For cases where single values are desired, attributes could work.
>
> Slicing:
> line = line.partition(';').head
> line = line.partition('#').head
>
> But it gets awkward as soon as you want more than one.
> > sep, port = host.partition(':').head, host.partition(':').sep You can do both: make partition() return a sequence with attributes, similar to os.stat(). However, I would call the attributes "before", "sep", and "after". Shane From tdelaney at avaya.com Wed Aug 31 01:06:22 2005 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Wed, 31 Aug 2005 09:06:22 +1000 Subject: [Python-Dev] Proof of the pudding: str.partition() Message-ID: <2773CAC687FD5F4689F526998C7E4E5F05CCAE@au3010avexu1.global.avaya.com> Shane Hathaway wrote: > Ron Adam wrote: >> For cases where single values are desired, attribues could work. >> >> Slicing: >> line = line.partition(';').head >> line = line.partition('#').head >> >> But it gets awkward as soon as you want more than one. >> >> sep, port = host.partition(':').head, host.partition(':').sep > > You can do both: make partition() return a sequence with attributes, > similar to os.stat(). However, I would call the attributes "before", > "sep", and "after". +0 I thought the same thing. I don't see a lot of use cases for it, but it could be useful. I don't see how it could hurt. Tim Delaney From fredrik at pythonware.com Wed Aug 31 01:05:20 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 31 Aug 2005 01:05:20 +0200 Subject: [Python-Dev] Proof of the pudding: str.partition() References: <005301c5ada1$4a52afc0$8832c797@oemcomputer><4314CA3E.3020606@benjiyork.com> <4314E1A2.4060409@ronadam.com> Message-ID: Ron Adam wrote: > For cases where single values are desired, attribues could work. > > Slicing: > line = line.partition(';').head > line = line.partition('#').head > > But it gets awkward as soon as you want more than one. 
> > sep, port = host.partition(':').head, host.partition(':').sep unless you go for the piece approach host, port = host.piece(":", 1, 2) (which, of course, is short for host, port = host.piece(":").group(1, 2) ) and wait for Mr Eby to tell you that piece has nothing whatsoever to do with string splitting. From tony at lownds.com Wed Aug 31 03:09:39 2005 From: tony at lownds.com (tony@lownds.com) Date: Tue, 30 Aug 2005 18:09:39 -0700 (PDT) Subject: [Python-Dev] Proof of the pudding: str.partition() Message-ID: <44572.67.127.59.200.1125450579.squirrel@lownds.com> I once wrote a similar method called cleave(). My use case involved a string-like class (Substr) whose instances could report their position in the original string. The re module wasn't preserving my class so I had to provide a different API. def cleave(self, pattern, start=0): """return Substr until match, match, Substr after match If there is no match, return Substr, None, '' """ Here are some observations/questions on Raymond's partition() idea. First of all, partition() is a much better name than cleave()! Substr didn't copy as partition() will have to, won't many of uses of partition() end up being O(N^2)? One way that gives the programmer a way avoid the copying would be to provide a string method findspan(). findspan() would returns the start and end of the found position in the string. start > end could signal no match; and since 0-character strings are disallowed in partition, end == 0 could also signal no match. partition() could be defined in terms of findspan(): start, end = s.findspan(sep) before, sep, after = s[:start], s[start:end], s[end:] Just a quick thought, -Tony From greg.ewing at canterbury.ac.nz Wed Aug 31 03:27:13 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 31 Aug 2005 13:27:13 +1200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) 
In-Reply-To:
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <200508301209.19693.anthony@interlink.com.au>
Message-ID: <43150771.1000102@canterbury.ac.nz>

JustFillBug wrote:
> trisplit()

And then for when you need to record the result somewhere, tricord(). :-)

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz     +--------------------------------------+

From jcarlson at uci.edu Wed Aug 31 03:35:37 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 30 Aug 2005 18:35:37 -0700
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To: <44572.67.127.59.200.1125450579.squirrel@lownds.com>
References: <44572.67.127.59.200.1125450579.squirrel@lownds.com>
Message-ID: <20050830182304.8B11.JCARLSON@uci.edu>

tony at lownds.com wrote:
> Substr didn't copy as partition() will have to, won't many of uses of
> partition() end up being O(N^2)?

Yes. But if you look at most cases provided for in the standard library, that isn't an issue. In the case where it becomes an issue, it is generally because a user wants to do repeated splitting on the same token...which is better suited for str.split or re.split.

> One way that gives the programmer a way avoid the copying would be to
> provide a string method
> findspan(). findspan() would returns the start and end of the found
> position in the string. start >
> end could signal no match; and since 0-character strings are disallowed in
> partition, end == 0
> could also signal no match. partition() could be defined in terms of
> findspan():
> start, end = s.findspan(sep)
> before, sep, after = s[:start], s[start:end], s[end:]

Actually no. When str.partition() doesn't find the separator, you get s, '', ''. Yours would produce '', '', s. On not found, you would need to use start==end==len(s).
Further, findspan could be defined in terms of find...

    def findspan(s, sep):
        if len(sep) == 0:
            raise ValueError("null separator strings are not allowed")
        x = s.find(sep)
        if x >= 0:
            return x, x+len(sep)
        return len(s),len(s)

Conceptually they are all the same. The trick with partition is that in the vast majority of use cases, one wants 2 or 3 of the resulting strings, and constructing the strings in the C-level code is far faster than manually slicing (which can be error prone).

I will say the same thing that I've said at least three times already (with a bit of an expansion):

IF YOU ARE GOING TO PROPOSE AN ALTERNATIVE, SHOW SOME COMPARATIVE CODE SAMPLES WHERE YOUR PROPOSAL DEFINITELY WINS OVER BOTH str.find AND str.partition. IF YOU CAN'T PROVIDE SUCH SAMPLES, THEN YOUR PROPOSAL ISN'T BETTER, AND YOU PROBABLY SHOULDN'T PROPOSE IT.

bonus points for those who take the time to compare all of those that Raymond provided.

- Josiah

From greg.ewing at canterbury.ac.nz Wed Aug 31 03:43:59 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Aug 2005 13:43:59 +1200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <431455EC.6050402@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com> <431455EC.6050402@gmail.com>
Message-ID: <43150B5F.5070102@canterbury.ac.nz>

Nick Coghlan wrote:
> Another option would be simply "str.part()" and "str.rpart()". Then you could
> think of it as an abbreviation of either 'partition' or 'parts' depending on
> your inclination.

Or simply as the verb 'part', which also makes sense!

Also it's short and snappy, whereas 'partition' seems rather too long-winded for such a useful little function.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From tony at lownds.com Wed Aug 31 03:58:00 2005 From: tony at lownds.com (tony@lownds.com) Date: Tue, 30 Aug 2005 18:58:00 -0700 (PDT) Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: <20050830182304.8B11.JCARLSON@uci.edu> References: <44572.67.127.59.200.1125450579.squirrel@lownds.com> <20050830182304.8B11.JCARLSON@uci.edu> Message-ID: <44836.67.127.59.200.1125453480.squirrel@lownds.com> > Actually no. When str.parition() doesn't find the separator, you get s, > '', ''. > Yours would produce '', '', s. On not found, you would need to use > start==end==len(s). > You're right. Nevermind, then. > I will say the same > thing that I've said at least three times already (with a bit of an > expansion): > Thanks for the re-re-emphasis. -Tony From adurdin at gmail.com Wed Aug 31 04:23:08 2005 From: adurdin at gmail.com (Andrew Durdin) Date: Wed, 31 Aug 2005 12:23:08 +1000 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <004701c5ad90$f0faeec0$8832c797@oemcomputer> References: <4f0b69dc0508301124423da48e@mail.gmail.com> <004701c5ad90$f0faeec0$8832c797@oemcomputer> Message-ID: <59e9fd3a050830192344ffeafd@mail.gmail.com> On 8/31/05, Raymond Hettinger wrote: > [Hye-Shik Chang] > > What would be a result for rpartition(s, '?') ? > > ('', '', 'http://www.python.org') > > or > > ('http://www.python.org', '', '') > > The former. The invariants for rpartition() are a mirror image of those > for partition(). Just to put my spoke in the wheel, I find the difference in the ordering of return values for partition() and rpartition() confusing: head, sep, remainder = partition(s) remainder, sep, head = rpartition(s) My first expectation for rpartition() was that it would return exactly the same values as partition(), but just work from the end of the string. 
IOW, I expected "www.python.org".partition("python") to return exactly the same as "www.python.org".rpartition("python")

To try out partition(), I wrote a quick version of split() using partition, and using partition() was obvious and easy:

    def mysplit(s, sep):
        l = []
        while s:
            part, _, s = s.partition(sep)
            l.append(part)
        return l

I tripped up when trying to make an rsplit() (I'm using Python 2.3), because the return values were in "reverse order"; I had expected the only change to be using rpartition() instead of partition().

For a second example: one of the "fixed stdlib" examples that Raymond posted actually uses rpartition and partition in two consecutive lines -- I found this example not immediately obvious for the above reason:

    def run_cgi(self):
        """Execute a CGI script."""
        dir, rest = self.cgi_info
        rest, _, query = rest.rpartition('?')
        script, _, rest = rest.partition('/')
        scriptname = dir + '/' + script
        scriptfile = self.translate_path(scriptname)
        if not os.path.exists(scriptfile):

Anyway, I'm definitely +1 on partition(), but -1 on rpartition() returning in "reverse order".

Andrew

From tdelaney at avaya.com Wed Aug 31 04:27:57 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Wed, 31 Aug 2005 12:27:57 +1000
Subject: [Python-Dev] Remove str.find in 3.0?
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com>

Andrew Durdin wrote:
> Just to put my spoke in the wheel, I find the difference in the
> ordering of return values for partition() and rpartition() confusing:
>
> head, sep, remainder = partition(s)
> remainder, sep, head = rpartition(s)

This is the confusion - you've got the terminology wrong.
before, sep, after = s.partition('?') ('http://www.python.org', '', '') before, sep, after = s.rpartition('?') ('', '', 'http://www.python.org') Tim Delaney From adurdin at gmail.com Wed Aug 31 05:23:25 2005 From: adurdin at gmail.com (Andrew Durdin) Date: Wed, 31 Aug 2005 13:23:25 +1000 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com> Message-ID: <59e9fd3a050830202334c69649@mail.gmail.com> On 8/31/05, Delaney, Timothy (Tim) wrote: > Andrew Durdin wrote: > > > Just to put my spoke in the wheel, I find the difference in the > > ordering of return values for partition() and rpartition() confusing: > > > > head, sep, remainder = partition(s) > > remainder, sep, head = rpartition(s) > > This is the confusion - you've got the terminology wrong. > > before, sep, after = s.partition('?') > ('http://www.python.org', '', '') > > before, sep, after = s.rpartition('?') > ('', '', 'http://www.python.org') That's still confusing (to me), though -- when the string is being processed, what comes before the separator is the stuff at the end of the string, and what comes after is the bit at the beginning of the string. It's not the terminology that's confusing me, though I find it hard to describe exactly what is. Maybe it's just me -- does anyone else have the same confusion? From guido at python.org Wed Aug 31 05:27:40 2005 From: guido at python.org (Guido van Rossum) Date: Tue, 30 Aug 2005 20:27:40 -0700 Subject: [Python-Dev] Remove str.find in 3.0? 
In-Reply-To: <59e9fd3a050830202334c69649@mail.gmail.com> References: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com> <59e9fd3a050830202334c69649@mail.gmail.com> Message-ID: On 8/30/05, Andrew Durdin wrote: > On 8/31/05, Delaney, Timothy (Tim) wrote: > > Andrew Durdin wrote: > > > > > Just to put my spoke in the wheel, I find the difference in the > > > ordering of return values for partition() and rpartition() confusing: > > > > > > head, sep, remainder = partition(s) > > > remainder, sep, head = rpartition(s) > > > > This is the confusion - you've got the terminology wrong. > > > > before, sep, after = s.partition('?') > > ('http://www.python.org', '', '') > > > > before, sep, after = s.rpartition('?') > > ('', '', 'http://www.python.org') > > That's still confusing (to me), though -- when the string is being > processed, what comes before the separator is the stuff at the end of > the string, and what comes after is the bit at the beginning of the > string. It's not the terminology that's confusing me, though I find > it hard to describe exactly what is. Maybe it's just me -- does anyone > else have the same confusion? Hm. The example is poorly chosen because it's an end case. The invariant for both is (I'd hope!) "".join(s.partition()) == s == "".join(s.rpartition()) Thus, "a/b/c".partition("/") returns ("a", "/", "b/c") "a/b/c".rpartition("/") returns ("a/b", "/", "c") That can't be confusing can it? (Just think of it as rpartition() stopping at the last occurrence, rather than searching from the right. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From adurdin at gmail.com Wed Aug 31 05:44:16 2005 From: adurdin at gmail.com (Andrew Durdin) Date: Wed, 31 Aug 2005 13:44:16 +1000 Subject: [Python-Dev] Remove str.find in 3.0? 
In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com> <59e9fd3a050830202334c69649@mail.gmail.com> Message-ID: <59e9fd3a05083020444caecff4@mail.gmail.com> On 8/31/05, Guido van Rossum wrote: > > Hm. The example is poorly chosen because it's an end case. The > invariant for both is (I'd hope!) > > "".join(s.partition()) == s == "".join(s.rpartition()) > (Just think of it as rpartition() stopping at the last occurrence, > rather than searching from the right. :-) Ah, that makes a difference. I could see that there was a different way of looking at the function, I just couldn't see what it was... Now I understand the way it's been done. Cheers, Andrew. From pje at telecommunity.com Wed Aug 31 05:49:15 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 30 Aug 2005 23:49:15 -0400 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: References: <005301c5ada1$4a52afc0$8832c797@oemcomputer> <4314CA3E.3020606@benjiyork.com> <4314E1A2.4060409@ronadam.com> Message-ID: <5.1.1.6.0.20050830233356.01b34118@mail.telecommunity.com> At 01:05 AM 8/31/2005 +0200, Fredrik Lundh wrote: >Ron Adam wrote: > > > For cases where single values are desired, attribues could work. > > > > Slicing: > > line = line.partition(';').head > > line = line.partition('#').head > > > > But it gets awkward as soon as you want more than one. > > > > sep, port = host.partition(':').head, host.partition(':').sep > >unless you go for the piece approach > > host, port = host.piece(":", 1, 2) > >(which, of course, is short for > > host, port = host.piece(":").group(1, 2) > >) > >and wait for Mr Eby to tell you that piece has nothing whatsoever >to do with string splitting. 
No, just to point out that you can make up whatever semantics you want, but the semantics you show above are *not* the same as what are shown at the page the person who posted about $PIECE cited, and on whose content I based my reply: http://www.jacquardsystems.com/Examples/function/piece.htm If you were following those semantics, then the code you presented above is buggy, as host.piece(':',1,2) would return the original string! Of course, since I know nothing of MUMPS besides what's on that page, it's entirely possible I've misinterpreted that page in some hideously subtle way -- as I pointed out in my original post regarding $PIECE. I like to remind myself and others of the possibility that I *could* be wrong, even when I'm *certain* I'm right, because it helps keep me from appearing any more arrogant than I already do, and it also helps to keep me from looking too stupid in those cases where I turn out to be wrong. Perhaps you might find that approach useful as well. In any case, to avoid confusion, you should probably specify the semantics of your piece() proposal in Python terms, so that those of us who don't know MUMPS have some possibility of grasping the inner mysteries of your proposal. From tjreedy at udel.edu Wed Aug 31 05:58:41 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 30 Aug 2005 23:58:41 -0400 Subject: [Python-Dev] Remove str.find in 3.0? References: <20050826134317.7DFD.JCARLSON@uci.edu> <43100E14.6080009@v.loewis.de> Message-ID: ""Martin v. Löwis"" wrote in message news:43100E14.6080009 at v.loewis.de... > Terry Reedy wrote: >> One (1a) is to give an inband signal that is like a normal >> response except that it is not (str.find returing -1). >> >> Python as distributed usually chooses 1b or 2. >> I believe str.find and >> .rfind are unique in the choice of 1a. > > That is not true. str.find's choice is not 1a, It it the paradigm example of 1a as I meant my definition. 
> -1 does *not* look like a normal response, > since a normal response is non-negative. Actually, the current doc does not clearly specify to some people that the response is a count. That is what lead to the 'str.find is buggy' thread on c.l.p, and something I will clarify when I propose a doc patch. In any case, Python does not have a count type, though I sometime wish it did. The return type is int and -1 is int, though it is not meant to be used as an int and it is a bug to do so. >It is *not* the only method with choice 1a): > dict.get returns None if the key is not found, None is only the default default, and whatever the default is, it is not necessarily an error return. A dict accessed via .get can be regarded as an infinite association matching all but a finite set of keys with the default. Example: a doubly infinite array of numbers with only a finite number of non-zero entries, implemented as a dict. This is the view actually used if one does normal calculations with that default return. There is no need, at least for that access method, for any key to be explicitly associated with the default. If the default *is* regarded as an error indicator, and is only used to guard normal processing of the value returned, then that default must not be associated any key. There is the problem that the domain of dict values is normally considered to be any Python object and functions can only return Python objects and not any non-Python-object error return. So the effective value domain for the particular dict must be the set 'Python objects' minus the error indicator. With discipline, None often works. Or, to guarantee 1b-ness, one can create a new type that cannot be in the dict. > For another example, file.read() returns an empty string at EOF. If the request is 'give me the rest of the file as a string', then '' is the answer, not a 'cannot answer' indicator. 
Similarly, if the request is 'how many bytes are left to read', then zero is a numerical answer, not a non-numerical 'cannot answer' indicator. Terry J. Reedy From tjreedy at udel.edu Wed Aug 31 06:08:25 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 31 Aug 2005 00:08:25 -0400 Subject: [Python-Dev] Remove str.find in 3.0? References: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com> Message-ID: "Delaney, Timothy (Tim)" wrote in message > before, sep, after = s.partition('?') > ('http://www.python.org', '', '') > > before, sep, after = s.rpartition('?') > ('', '', 'http://www.python.org') I can also see this as left, sep, right, with the sep not found case putting all in left or right depending on whether one scanned to the right or left. In other words, when the scanner runs out of chars to scan, everything is 'behind' the scan, where 'behind' depends on the direction of scanning. That seems nicely symmetric. Terry J. Reedy From tjreedy at udel.edu Wed Aug 31 06:13:40 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 31 Aug 2005 00:13:40 -0400 Subject: [Python-Dev] Revising RE docs (was: partition() (was: Removestr.find in 3.0?)) References: <20050830143542.niq7a9s8bsrkc8ok@login.werra.lunarpages.com> <200508301741.28656.fdrake@acm.org> Message-ID: "Fred L. Drake, Jr." wrote in message news:200508301741.28656.fdrake at acm.org... > I'd rather see it reversed from what it is as well. While I don't have > the > time myself (and don't consider it a critical issue), I certainly won't > revert a patch to make the change without good reason. :-) Do you mean 'not reject' rather than 'not revert'? Terry J. 
Reedy From rrr at ronadam.com Wed Aug 31 06:27:23 2005 From: rrr at ronadam.com (Ron Adam) Date: Wed, 31 Aug 2005 00:27:23 -0400 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: References: <005301c5ada1$4a52afc0$8832c797@oemcomputer><4314CA3E.3020606@benjiyork.com> <4314E1A2.4060409@ronadam.com> Message-ID: <431531AB.4080305@ronadam.com> Fredrik Lundh wrote: > Ron Adam wrote: > > >>For cases where single values are desired, attribues could work. >> >>Slicing: >> line = line.partition(';').head >> line = line.partition('#').head >> >>But it gets awkward as soon as you want more than one. >> >> sep, port = host.partition(':').head, host.partition(':').sep > > > unless you go for the piece approach > > host, port = host.piece(":", 1, 2) > > (which, of course, is short for > > host, port = host.piece(":").group(1, 2) > > ) I'm not familiar with piece, but it occurred to me it might be useful to get attributes groups in some way. My first (passing) thought was to do... host, port = host.partition(':').(head, sep) Where that would be short calling a method to return them: host, port = host.partition(':').getattribs('head','sep') But with only three items, the '_' is in the category of "Looks kind of strange, but I can get used to it because it works well.". Cheers, Ron From steve at holdenweb.com Wed Aug 31 06:50:12 2005 From: steve at holdenweb.com (Steve Holden) Date: Tue, 30 Aug 2005 23:50:12 -0500 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: References: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com> <59e9fd3a050830202334c69649@mail.gmail.com> Message-ID: <43153704.6080304@holdenweb.com> Guido van Rossum wrote: > On 8/30/05, Andrew Durdin wrote: [confusion] > > > Hm. The example is poorly chosen because it's an end case. The > invariant for both is (I'd hope!) 
> > "".join(s.partition()) == s == "".join(s.rpartition()) > > Thus, > > "a/b/c".partition("/") returns ("a", "/", "b/c") > > "a/b/c".rpartition("/") returns ("a/b", "/", "c") > > That can't be confusing can it? > > (Just think of it as rpartition() stopping at the last occurrence, > rather than searching from the right. :-) > So we can check that a substring x appears precisely once in the string s using s.partition(x) == s.rpartition(x) Oops, it fails if s == "". I can usually find some way to go wrong ... tongue-in-cheek-ly y'rs - steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ From tjreedy at udel.edu Wed Aug 31 06:52:47 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 31 Aug 2005 00:52:47 -0400 Subject: [Python-Dev] Proof of the pudding: str.partition() References: <005301c5ada1$4a52afc0$8832c797@oemcomputer> <4314CA3E.3020606@benjiyork.com><4314E1A2.4060409@ronadam.com> <4314E51B.1050507@hathawaymix.org> Message-ID: "Shane Hathaway" wrote in message news:4314E51B.1050507 at hathawaymix.org... > You can do both: make partition() return a sequence with attributes, > similar to os.stat(). However, I would call the attributes "before", > "sep", and "after". One could see that as a special-case back-compatibility kludge that maybe should disappear in 3.0. My impression is that the attributes were added precisely because unpacking several related attributes into several disconnected vars was found to be often awkward. The sequencing is arbitrary and one often needs less that all attributes. Terry J. 
Reedy From shane at hathawaymix.org Wed Aug 31 07:29:11 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Tue, 30 Aug 2005 23:29:11 -0600 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: References: <005301c5ada1$4a52afc0$8832c797@oemcomputer> <4314CA3E.3020606@benjiyork.com><4314E1A2.4060409@ronadam.com> <4314E51B.1050507@hathawaymix.org> Message-ID: <43154027.9020301@hathawaymix.org> Terry Reedy wrote: > "Shane Hathaway" wrote in message > news:4314E51B.1050507 at hathawaymix.org... > >>You can do both: make partition() return a sequence with attributes, >>similar to os.stat(). However, I would call the attributes "before", >>"sep", and "after". > > > One could see that as a special-case back-compatibility kludge that maybe > should disappear in 3.0. My impression is that the attributes were added > precisely because unpacking several related attributes into several > disconnected vars was found to be often awkward. The sequencing is > arbitrary and one often needs less that all attributes. Good point. Unlike os.stat(), it's very easy to remember the order of the return values from partition(). I'll add my +1 vote for part() and +0.9 for partition(). As for the regex version of partition(), I wonder if a little cleanup effort is in order so that new regex features don't have to be added in two places. I suggest a builtin for compiling regular expressions, perhaps called "regex". It would be easier to use the builtin than to import the re module, so there would no longer be a reason for the re module to have functions that duplicate the regular expression methods. Shane From jcarlson at uci.edu Wed Aug 31 07:30:58 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Tue, 30 Aug 2005 22:30:58 -0700 Subject: [Python-Dev] Remove str.find in 3.0? 
In-Reply-To: <43153704.6080304@holdenweb.com> References: <43153704.6080304@holdenweb.com> Message-ID: <20050830222856.8B14.JCARLSON@uci.edu> Steve Holden wrote: > > Guido van Rossum wrote: > > On 8/30/05, Andrew Durdin wrote: > [confusion] > > > > > > Hm. The example is poorly chosen because it's an end case. The > > invariant for both is (I'd hope!) > > > > "".join(s.partition()) == s == "".join(s.rpartition()) > > > > Thus, > > > > "a/b/c".partition("/") returns ("a", "/", "b/c") > > > > "a/b/c".rpartition("/") returns ("a/b", "/", "c") > > > > That can't be confusing can it? > > > > (Just think of it as rpartition() stopping at the last occurrence, > > rather than searching from the right. :-) > > > So we can check that a substring x appears precisely once in the string > s using > > s.partition(x) == s.rpartition(x) > > Oops, it fails if s == "". I can usually find some way to go wrong ... There was an example in the standard library that used "s.find(y) == s.rfind(y)" as a test for zero or 1 instances of the searched for item. Generally though, s.count(x)==1 is a better test. - Josiah From pierre.barbier at cirad.fr Wed Aug 31 10:16:59 2005 From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille) Date: Wed, 31 Aug 2005 10:16:59 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <20050830091228.8B00.JCARLSON@uci.edu> References: <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <20050830091228.8B00.JCARLSON@uci.edu> Message-ID: <4315677B.1000004@cirad.fr> Josiah Carlson wrote: > Pierre Barbier de Reuille wrote: > > 0.5 > > So, subtracting that .5 seconds from all the cases gives us... > > 0.343 seconds for .find's comparison > 0.313 seconds for .index's exception handling when an exception is not > raised > 3.797 seconds for .index's exception handling when an exception is > raised. Well, when I did benchmark that (two years ago) the difference was, AFAIR, much greater !
But well, I just have to adjust my internal data sets ;) Pierre > > In the case of a string being found, .index is about 10% faster than > .find . In the case of a string not being found, .index's exception > handlnig mechanics are over 11 times slower than .find's comparison. > > [...] > > - Josiah > -- Pierre Barbier de Reuille INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP Botanique et Bio-informatique de l'Architecture des Plantes TA40/PSII, Boulevard de la Lironde 34398 MONTPELLIER CEDEX 5, France tel : (33) 4 67 61 65 77 fax : (33) 4 67 61 56 68 From ncoghlan at gmail.com Wed Aug 31 11:10:48 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 31 Aug 2005 19:10:48 +1000 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <4314B56E.6070609@ronadam.com> References: <431492D5.6090102@ronadam.com> <5.1.1.6.0.20050830143755.01fcc538@mail.telecommunity.com> <4314B56E.6070609@ronadam.com> Message-ID: <43157418.5040603@gmail.com> Ron Adam wrote: > I don't feel there is a need to avoid numbers entirely. In this case I > think it's the better way to find the n'th seperator and since it's an > optional value I feel it doesn't add a lot of complication. Anyway... > It's just a suggestion. Avoid overengineering this without genuine use cases. Raymond's review of the standard library shows that the basic version of str.partition provides definite readability benefits and also makes it easier to write correct code - enhancements can wait until we have some real experience with how people use the method. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.blogspot.com From fredrik at pythonware.com Wed Aug 31 12:16:51 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 31 Aug 2005 12:16:51 +0200 Subject: [Python-Dev] Proof of the pudding: str.partition() References: <005301c5ada1$4a52afc0$8832c797@oemcomputer><4314CA3E.3020606@benjiyork.com> <4314E1A2.4060409@ronadam.com> <431531AB.4080305@ronadam.com> Message-ID: Ron Adam wrote: > I'm not familiar with piece, but it occurred to me it might be useful to > get attributes groups in some way. My first (passing) thought was to do... > > host, port = host.partition(':').(head, sep) > > Where that would be short calling a method to return them: > > host, port = host.partition(':').getattribs('head','sep') note, however, that your first syntax doesn't work in today's python (bare names are always evaluated in the current scope, before any calls are made) given that you want both the pieces *and* a way to see if a split was made, the only half-reasonable alternatives to "I can always ignore the values I don't need" that I can think of are flag, part1, part2, ... = somestring.partition(sep, count=2) or flag, part1, part2, ... = somestring.piec^H^H^Hartition(sep, group, group, ...) where flag is true if the separator was found, and the number of parts returned corresponds to either count or the number of group indices (the latter is of course the external influence that cannot be named, but with an API modelled after RE's group method). From gmccaughan at synaptics-uk.com Wed Aug 31 12:38:01 2005 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Wed, 31 Aug 2005 11:38:01 +0100 Subject: [Python-Dev] Remove str.find in 3.0? 
In-Reply-To: <59e9fd3a050830192344ffeafd@mail.gmail.com> References: <4f0b69dc0508301124423da48e@mail.gmail.com> <004701c5ad90$f0faeec0$8832c797@oemcomputer> <59e9fd3a050830192344ffeafd@mail.gmail.com> Message-ID: <200508311138.02330.gmccaughan@synaptics-uk.com> > Just to put my spoke in the wheel, I find the difference in the > ordering of return values for partition() and rpartition() confusing: > > head, sep, remainder = partition(s) > remainder, sep, head = rpartition(s) > > My first expectation for rpartition() was that it would return exactly > the same values as partition(), but just work from the end of the > string. > > IOW, I expected "www.python.org".partition("python") to return exactly > the same as "www.python.org".rpartition("python") Yow. Me too, and indeed I've been skimming this thread without it ever occurring to me that it would be otherwise. > Anyway, I'm definitely +1 on partition(), but -1 on rpartition() > returning in "reverse order". +1. -- g From gmccaughan at synaptics-uk.com Wed Aug 31 12:43:11 2005 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Wed, 31 Aug 2005 11:43:11 +0100 Subject: [Python-Dev] Remove str.find in 3.0? In-Reply-To: <200508311138.02330.gmccaughan@synaptics-uk.com> References: <4f0b69dc0508301124423da48e@mail.gmail.com> <59e9fd3a050830192344ffeafd@mail.gmail.com> <200508311138.02330.gmccaughan@synaptics-uk.com> Message-ID: <200508311143.11809.gmccaughan@synaptics-uk.com> I wrote: [Andrew Durdin:] > > IOW, I expected "www.python.org".partition("python") to return exactly > > the same as "www.python.org".rpartition("python") > > Yow. Me too, and indeed I've been skimming this thread without > it ever occurring to me that it would be otherwise. And, on re-skimming the thread, I think that was always the plan. So that's OK, then. 
:-) -- g From solipsis at pitrou.net Wed Aug 31 13:41:20 2005 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 31 Aug 2005 13:41:20 +0200 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <17172.38752.55260.62198@montanaro.dyndns.org> References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> <431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com> <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr> <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr> <17172.38752.55260.62198@montanaro.dyndns.org> Message-ID: <1125488480.31857.1.camel@p-dvsi-418-1.rd.francetelecom.fr> On Tuesday 30 August 2005 at 12:29 -0500, skip at pobox.com wrote: > Just group your re: > > >>> import re > >>> > >>> re.split("ab", "abracadabra") > ['', 'racad', 'ra'] > >>> re.split("(ab)", "abracadabra") > ['', 'ab', 'racad', 'ab', 'ra'] > > and you get it in the return value. In fact, re.split with a grouped re is > very much like Raymond's str.partition method without the guarantee of > returning a three-element list. Thanks! I guess I should have read the documentation carefully instead of assuming re.split() worked like in some other language (namely, PHP). Regards Antoine. From mcherm at mcherm.com Wed Aug 31 13:55:35 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Wed, 31 Aug 2005 04:55:35 -0700 Subject: [Python-Dev] Proof of the pudding: str.partition() Message-ID: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com> Raymond's original definition for partition() did NOT support any of the following: (*) Regular Expressions (*) Ways to generate just 1 or 2 of the 3 values if some are not going to be used (*) Clever use of indices to avoid copying strings (*) Behind-the-scenes tricks to allow repeated re-partitioning to be faster than O(n^2) The absence of these "features" is a GOOD thing.
It makes the behavior of partition() so simple and straightforward that it is easily documented and can be instantly understood by a competent programmer. I *like* keeping it simple. In fact, if anything, I'd give UP the one fancy feature he chose to include: (*) An rpartition() function that searches from the right ...except that I understand why he included it and am convinced by the arguments (use cases can be demonstrated and people would expect it to be there and complain if it weren't). Simplicity and elegence are two of the reasons that this is such an excellent proposal, let's not lose them. We have existing tools (like split() and the re module) to handle the tricky problems. -- Michael Chermside From amk at amk.ca Wed Aug 31 14:23:34 2005 From: amk at amk.ca (A.M. Kuchling) Date: Wed, 31 Aug 2005 08:23:34 -0400 Subject: [Python-Dev] Switching re and sre Message-ID: <20050831122334.GA4104@rogue.amk.ca> FYI: In a discussion on the Python security response list, Guido suggested that the sre.py and re.py modules should be switched. Currently re.py just imports the contents of sre.py -- once it supported both sre and the PCRE-based pre.py -- and sre.py contains the actual code. Now that pre.py is gone, we can move the actual code into re.py and make sre.py just import re.py, so that any user code that actually imports sre will still work. I'll make this change today. --amk From skip at pobox.com Wed Aug 31 15:02:40 2005 From: skip at pobox.com (skip@pobox.com) Date: Wed, 31 Aug 2005 08:02:40 -0500 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: References: <005301c5ada1$4a52afc0$8832c797@oemcomputer> <4314CA3E.3020606@benjiyork.com> <4314E1A2.4060409@ronadam.com> <4314E51B.1050507@hathawaymix.org> Message-ID: <17173.43632.145313.858480@montanaro.dyndns.org> >> You can do both: make partition() return a sequence with attributes, >> similar to os.stat(). However, I would call the attributes "before", >> "sep", and "after". 
Terry> One could see that as a special-case back-compatibility kludge Terry> that maybe should disappear in 3.0. Back compatibility with what? Since partition doesn't exist now there is nothing to be backward compatible with is there? I'm -1 on the notion of generating groups or attributes. In other cases (regular expressions, stat() results) there are good reasons to provide them. The results of a regular expression match are variable, depending on how many groups the user defines in his pattern. In the case of stat() there is no reason other than historic for the results to be returned in any particular order, so having named attributes makes the results easier to work with. The partition method has neither. It always returns a fixed tuple of three elements whose order is clearly based on the physical relationship of the three pieces of the string that have been partitioned. I think Raymond's original formulation is the correct one. Always return a three-element tuple of strings, nothing more. Use '_' or 'dummy' if there is some element you're not interested in. Skip From pje at telecommunity.com Wed Aug 31 15:40:15 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 31 Aug 2005 09:40:15 -0400 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com > Message-ID: <5.1.1.6.0.20050831092223.01b56d98@mail.telecommunity.com> At 04:55 AM 8/31/2005 -0700, Michael Chermside wrote: >Raymond's original definition for partition() did NOT support any >of the following: > > (*) Regular Expressions This can be orthogonally added to the 're' module, and definitely should not be part of the string method. 
> (*) Ways to generate just 1 or 2 of the 3 values if some are > not going to be used Yep, subscripting and slicing are more than adequate to handle *all* of those use cases, even the ones that some people have been jumping through odd hoops to express: before = x.partition(sep)[0] found = x.partition(sep)[1] after = x.partition(sep)[2] before, found = x.partition("foo")[:2] found, after = x.partition("foo")[1:] before, after = x.partition("foo")[::2] Okay, that last one is maybe a little too clever. I'd personally just use '__' or 'DONTCARE' or something like that for the value(s) I didn't care about, because it actually takes slightly less time to unpack a 3-tuple into three function-local variables than it does to pull out a single element of the tuple, and it's almost twice as fast as taking a slice and unpacking it into two variables. So, using three variables is both faster *and* easier to read than any of the variations anybody has proposed, including the ones I just showed above. > (*) Clever use of indices to avoid copying strings > (*) Behind-the-scenes tricks to allow repeated re-partitioning > to be faster than O(n^2) Yep, -1 on these. >The absence of these "features" is a GOOD thing. It makes the >behavior of partition() so simple and straightforward that it is >easily documented and can be instantly understood by a competent >programmer. I *like* keeping it simple. In fact, if anything, I'd >give UP the one fancy feature he chose to include: > > (*) An rpartition() function that searches from the right > >...except that I understand why he included it and am convinced >by the arguments (use cases can be demonstrated and people would >expect it to be there and complain if it weren't). I'd definitely like to keep rpartition. 
For example, splitting an HTTP url's hostname from its port should be done with rpartition, since you can have a 'username:password@' part before the host, and because the host can be a bracketed IPv6 host address with colons in it. >Simplicity and elegence are two of the reasons that this is such >an excellent proposal, +1. From rrr at ronadam.com Wed Aug 31 15:41:38 2005 From: rrr at ronadam.com (Ron Adam) Date: Wed, 31 Aug 2005 09:41:38 -0400 Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?) In-Reply-To: <43157418.5040603@gmail.com> References: <431492D5.6090102@ronadam.com> <5.1.1.6.0.20050830143755.01fcc538@mail.telecommunity.com> <4314B56E.6070609@ronadam.com> <43157418.5040603@gmail.com> Message-ID: <4315B392.9040906@ronadam.com> Nick Coghlan wrote: > Ron Adam wrote: > >>I don't feel there is a need to avoid numbers entirely. In this case I >>think it's the better way to find the n'th seperator and since it's an >>optional value I feel it doesn't add a lot of complication. Anyway... >>It's just a suggestion. > > > Avoid overengineering this without genuine use cases. Raymond's review of the > standard library shows that the basic version of str.partition provides > definite readability benefits and also makes it easier to write correct code - > enhancements can wait until we have some real experience with how people use > the method. > > Cheers, > Nick. The use cases for nth items 1 and -1 are the same ones for partition() and rpartition. It's only values greater or less than those that need use cases. (I'll try to find some.) True, a directional index enhancement could be added later, but not considering it now and then adding it later would mean rpartition() would become redundant and/or an argument against doing it later. As it's been stated fairly often, it's hard to remove something once it's put in. So it's prudent to consider a few alternative forms and rule them out, rather than try to change things later.
Cheers, Ron From fredrik at pythonware.com Wed Aug 31 16:03:29 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed, 31 Aug 2005 16:03:29 +0200 Subject: [Python-Dev] Proof of the pudding: str.partition() References: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com > <5.1.1.6.0.20050831092223.01b56d98@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: > Yep, subscripting and slicing are more than adequate to handle *all* of > those use cases, even the ones that some people have been jumping through > odd hoops to express: > > before = x.partition(sep)[0] > found = x.partition(sep)[1] > after = x.partition(sep)[2] > > before, found = x.partition("foo")[:2] > found, after = x.partition("foo")[1:] > before, after = x.partition("foo")[::2] > > Okay, that last one is maybe a little too clever. I'd personally just use > '__' or 'DONTCARE' or something like that for the value(s) I didn't care > about, because it actually takes slightly less time to unpack a 3-tuple > into three function-local variables than it does to pull out a single > element of the tuple, and it's almost twice as fast as taking a slice and > unpacking it into two variables. you're completely missing the point. the problem isn't the time it takes to unpack the return value, the problem is that it takes time to create the substrings that you don't need. for some use cases, a naive partition-based solution is going to be a lot slower than the old find+slice approach, no matter how you slice, index, or unpack the return value. > So, using three variables is both faster *and* easier to read than any of > the variations anybody has proposed, including the ones I just showed above. try again. 
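[Editorial sketch, not part of the original thread.] The trade-off raised in the message above can be illustrated roughly as follows. This is a minimal sketch in present-day Python, assuming the three-tuple semantics described in the thread (str.partition() and str.rpartition() were later added in Python 2.5). The helper names head_via_partition and head_via_find are hypothetical and exist only for this illustration. Both helpers return the text before the first separator, but the find-based one never builds the "after" substring, which appears to be the cost being pointed at. No timings are claimed here.

```python
def head_via_partition(s, sep):
    """Return the text before the first sep (all of s if sep is absent)."""
    # partition() always builds all three pieces, including the unused tail.
    before, _, _ = s.partition(sep)
    return before


def head_via_find(s, sep):
    """Same result via find + slice; only the needed substring is created."""
    i = s.find(sep)
    return s if i < 0 else s[:i]


if __name__ == "__main__":
    line = "key = value  # trailing comment"
    print(head_via_partition(line, "#"))
    print(head_via_find(line, "#"))

    # The invariant stated earlier in the thread holds whether or not
    # the separator is found.
    for s in ("a/b/c", "abc", ""):
        assert "".join(s.partition("/")) == s == "".join(s.rpartition("/"))
```

Whether the extra substring matters in practice depends on string sizes and call frequency; the sketch only shows that the two spellings agree on the result.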
From python at discworld.dyndns.org Wed Aug 31 16:30:46 2005 From: python at discworld.dyndns.org (Charles Cazabon) Date: Wed, 31 Aug 2005 08:30:46 -0600 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com> References: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com> Message-ID: <20050831143046.GE522@discworld.dyndns.org> Michael Chermside wrote: > > (*) An rpartition() function that searches from the right > > ...except that I understand why he included it and am convinced > by the arguments (use cases can be demonstrated and people would > expect it to be there and complain if it weren't). I would think that perhaps an optional second argument to the method that controls whether it searches from the start (default) or end of the string might be nicer than having two separate methods, even though that would lose parallelism with the current .find/.index. While I'm at it, why not propose that for py3k that .rfind/.rindex/.rjust/.rsplit disappear, and .find/.index/.just/.split grow an optional "fromright" (or equivalent) optional keyword argument? 
Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://pyropus.ca/software/ ----------------------------------------------------------------------- From guido at python.org Wed Aug 31 16:54:07 2005 From: guido at python.org (Guido van Rossum) Date: Wed, 31 Aug 2005 07:54:07 -0700 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: <20050831143046.GE522@discworld.dyndns.org> References: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com> <20050831143046.GE522@discworld.dyndns.org> Message-ID: On 8/31/05, Charles Cazabon wrote: > I would think that perhaps an optional second argument to the method that > controls whether it searches from the start (default) or end of the string > might be nicer than having two separate methods, even though that would lose > parallelism with the current .find/.index. > > While I'm at it, why not propose that for py3k that > .rfind/.rindex/.rjust/.rsplit disappear, and .find/.index/.just/.split grow an > optional "fromright" (or equivalent) optional keyword argument? This violates one of my design principles: don't add boolean options to an API that control the semantics in such a way that the option value is (nearly) always a constant. Instead, provide two different method names. The motivation for this rule comes partly for performance: parameters are relatively expensive, and you shouldn't make the method test dynamically for a parameter value that is constant for the call site; and partly from readability: don't bother the reader with having to remember the full general functionality and how it is affected by the various flags; also, a Boolean positional argument is a really poor clue about its meaning, and it's easy to misremember the sense reversed. PS. This is a special case of a stronger design principle: don't let the *type* of the return value depend on the *value* of the arguments. PS2. 
As with all design principles, there are exceptions. But they are, um, exceptional. index/rindex is not such an exception. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Wed Aug 31 18:51:17 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 31 Aug 2005 12:51:17 -0400 Subject: [Python-Dev] Proof of the pudding: str.partition() References: <005301c5ada1$4a52afc0$8832c797@oemcomputer><4314CA3E.3020606@benjiyork.com> <4314E1A2.4060409@ronadam.com><4314E51B.1050507@hathawaymix.org> <17173.43632.145313.858480@montanaro.dyndns.org> Message-ID: wrote in message news:17173.43632.145313.858480 at montanaro.dyndns.org... > > >> You can do both: make partition() return a sequence with > attributes, > >> similar to os.stat(). However, I would call the attributes > "before", > >> "sep", and "after". > > Terry> One could see that as a special-case back-compatibility kludge > Terry> that maybe should disappear in 3.0. > > Back compatibility with what? os.stat without attributes. 'that' referred to its current 'sequence with attributes' return. > I'm -1 on the notion of generating groups or attributes. We agree. A back-compatibility kludge is not a precedent to be emulated. >In the case of stat() there is no reason other than historic > for the results to be returned in any particular order, Which is why I wonder whether the sequence part should be dropped in 3.0. Terry J. 
Reedy From python at discworld.dyndns.org Wed Aug 31 19:24:58 2005 From: python at discworld.dyndns.org (Charles Cazabon) Date: Wed, 31 Aug 2005 11:24:58 -0600 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: References: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com> <20050831143046.GE522@discworld.dyndns.org> Message-ID: <20050831172458.GB2476@discworld.dyndns.org> Guido van Rossum wrote: > On 8/31/05, Charles Cazabon wrote: > > > While I'm at it, why not propose that for py3k that > > .rfind/.rindex/.rjust/.rsplit disappear, and .find/.index/.just/.split > > grow an optional "fromright" (or equivalent) optional keyword argument? > > This violates one of my design principles: Ah, excellent response. Are your design principles written down anywhere? I didn't see anything on your essays page about them, but I'd like to learn at the feet of the BDFL. > don't add boolean options to an API that control the semantics in such a way > that the option value is (nearly) always a constant. Instead, provide two > different method names. Hmmm. I really dislike the additional names, but ... > The motivation for this rule comes partly for performance: parameters > are relatively expensive, and you shouldn't make the method test > dynamically for a parameter value that is constant for the call site; I can see this. > and partly from readability: don't bother the reader with having to > remember the full general functionality and how it is affected by the > various flags; This I don't think is so bad. It's analogous to providing the "reverse" parameter to sorted et al, and I don't think that's particularly hard to remember. It would also be rarely used; I use find/index tens of times more often than I use rfind/rindex, and I presume it would be the same for a hypothetical .part/.rpart. > also, a Boolean positional argument is a really poor clue about its meaning, > and it's easy to misremember the sense reversed. I totally agree. 
I therefore borrowed the time machine and modified my proposal to suggest it should be a keyword argument, not a positional one :).

> PS. This is a special case of a stronger design principle: don't let
> the *type* of the return value depend on the *value* of the arguments.

Hmmm. In all of these cases, the type of the return value is constant. Only the value would change based on the value of the arguments. ... ?

Charles
--
-----------------------------------------------------------------------
Charles Cazabon
GPL'ed software available at: http://pyropus.ca/software/
-----------------------------------------------------------------------

From jimjjewett at gmail.com Wed Aug 31 20:43:02 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 31 Aug 2005 14:43:02 -0400
Subject: [Python-Dev] Alternative name for str.partition()
Message-ID:

[In http://mail.python.org/pipermail/python-dev/2005-August/055880.html ]

Andrew Durdin wrote:
> one of the "fixed stdlib" examples that Raymond
> posted actually uses rpartition and partition in two consecutive lines

Even with that lead-in, even right next to each other, it took me a bit of time to see the difference between rest.rpartition and rest.partition.

> rest, _, query = rest.rpartition('?')
> script, _, rest = rest.partition('/')

Shortening the names helps, because a single letter matters more.

> rest, _, query = rest.rpart('?')
> script, _, rest = rest.part('/')

A different-looking word (such as Greg's suggestion) might be even better, if the word also works on its own.

> rest, _, query = rest.rsplit_at('?')
> script, _, rest = rest.split_at('/')

-jJ

From jimjjewett at gmail.com Wed Aug 31 20:56:44 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 31 Aug 2005 14:56:44 -0400
Subject: [Python-Dev] Proof of the pudding: str.partition()
Message-ID:

Michael Chermside wrote (but I reordered):

>Simplicity and elegance are two of the reasons that this
> is such an excellent proposal, let's not lose them.
> Raymond's original definition for partition() did NOT support > any of the following: > (*) Regular Expressions While this is obviously more powerful, and an analogue should probably go in re ... it doesn't belong in strings. I don't want to have to explain why "www.python.org".part('.') acts strangely (forget to escape the period). > (*) Ways to generate just 1 or 2 of the 3 values if some are > not going to be used > (*) Clever use of indices to avoid copying strings > (*) Behind-the-scenes tricks to allow repeated re-partitioning > to be faster than O(n^2) I think these may be useful behind the scenes, but the API should not expose them unless they are made more general. For instance, the compiler could recognize that junk variables (or variable names matching a certain pattern?) don't really have to be created -- and that would be useful for more than string splitting. Doing it as a special case here just leads to a backwards compatibility wart later. -jJ From raymond.hettinger at verizon.net Wed Aug 31 21:37:18 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed, 31 Aug 2005 15:37:18 -0400 Subject: [Python-Dev] Design Principles In-Reply-To: Message-ID: <003501c5ae63$664f5e40$4320c797@oemcomputer> > > While I'm at it, why not propose that for py3k that > > .rfind/.rindex/.rjust/.rsplit disappear, and .find/.index/.just/.split > grow an > > optional "fromright" (or equivalent) optional keyword argument? > > This violates one of my design principles: don't add boolean options > to an API that control the semantics in such a way that the option > value is (nearly) always a constant. Instead, provide two different > method names. 
>
> The motivation for this rule comes partly from performance: parameters
> are relatively expensive, and you shouldn't make the method test
> dynamically for a parameter value that is constant for the call site;
> and partly from readability: don't bother the reader with having to
> remember the full general functionality and how it is affected by the
> various flags; also, a Boolean positional argument is a really poor
> clue about its meaning, and it's easy to misremember the sense
> reversed.
>
> PS. This is a special case of a stronger design principle: don't let
> the *type* of the return value depend on the *value* of the arguments.
>
> PS2. As with all design principles, there are exceptions. But they
> are, um, exceptional. index/rindex is not such an exception.

FWIW, after this is over, I'll put together a draft list of these principles. The one listed above has served us well. An early draft of itertools.ifilter() had an invert flag. The toolset improved when that was split to a separate function, ifilterfalse().

Other thoughts:

Tim's rule on algorithm selection: We read Knuth so you don't have to.

Raymond's rule on language proposals: Assertions that construct X is better than an existing construct Y should be backed up by a variety of side-by-side comparisons using real-world code samples.

I'm sure there are plenty more of these in the archives.

Raymond

From janssen at parc.com Wed Aug 31 22:04:55 2005
From: janssen at parc.com (Bill Janssen)
Date: Wed, 31 Aug 2005 13:04:55 PDT
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To: Your message of "Wed, 31 Aug 2005 06:40:15 PDT." <5.1.1.6.0.20050831092223.01b56d98@mail.telecommunity.com>
Message-ID: <05Aug31.130458pdt."58617"@synergy1.parc.xerox.com>

> > (*) Regular Expressions
>
> This can be orthogonally added to the 're' module, and definitely should
> not be part of the string method.
Sounds right to me, and it *should* be orthogonally added to the 're' module coincidentally simultaneously with the change to the string object :-). I have to say, it would be nice if "foo bar".partition(re.compile('\s')) would work. That is, if the argument is an re pattern object instead of a string, it would be nice if it were understood appropriately, just for symmetry's sake. But it's hardly necessary. Presumably in the re module, there would be a function like re.partition("\s", "foo bar") for one-shot usage, or the expression re.compile('\s').partition("foo bar") Bill From oren.tirosh at gmail.com Wed Aug 31 22:24:52 2005 From: oren.tirosh at gmail.com (Oren Tirosh) Date: Wed, 31 Aug 2005 23:24:52 +0300 Subject: [Python-Dev] Python 3 design principles Message-ID: <7168d65a050831132415118382@mail.gmail.com> Most of the changes in PEP 3000 are tightening up of "There should be one obvious way to do it.": * Remove multiple forms of raising exceptions, leaving just "raise instance" * Remove exec as statement, leaving the compatible tuple/call form. * Remove <>, ``, leaving !=, repr etc. Other changes are to disallow things already considered poor style like: * No assignment to True/False/None * No input() * No access to list comprehension variable And there is also completely new stuff like static type checking. While a lot of existing code will break on 3.0 it is still generally possible to write code that will run on both 2.x and 3.0: use only the "proper" forms above, do not assume the result of zip or range is a list, use absolute imports (and avoid static types, of course). I already write all my new code this way. Is this "common subset" a happy coincidence or a design principle? Not all proposed changes remove redundancy or add completely new things. Some of them just change the way certain things must be done. 
For example: * Moving compile, id, intern to sys * Replacing print with write/writeln And possibly the biggest change: * Reorganize the standard library to not be as shallow I'm between +0 and -1 on these. I don't find them enough of an improvement to break this "common subset" behavior. It's not quite the same as strict backward compatibility and I find it worthwhile to try to keep it. Writing programs that run on both 2.x and 3 may require ugly version-dependent tricks like: try: compile except NameError: from sys import compile or perhaps try: import urllib except ImportError: from www import urllib Should the "common subset" be a design principle of Python 3? Do compile and id really have to be moved from __builtins__ to sys? Could the rearrangement of the standard library be a bit less aggressive and try to leave commonly used modules in place? Oren From python at discworld.dyndns.org Wed Aug 31 22:44:39 2005 From: python at discworld.dyndns.org (Charles Cazabon) Date: Wed, 31 Aug 2005 14:44:39 -0600 Subject: [Python-Dev] Python 3 design principles In-Reply-To: <7168d65a050831132415118382@mail.gmail.com> References: <7168d65a050831132415118382@mail.gmail.com> Message-ID: <20050831204439.GA3775@discworld.dyndns.org> Oren Tirosh wrote: > > Not all proposed changes remove redundancy or add completely new > things. Some of them just change the way certain things must be done. > For example: > * Moving compile, id, intern to sys > * Replacing print with write/writeln > And possibly the biggest change: > * Reorganize the standard library to not be as shallow > > I'm between +0 and -1 on these. I don't find them enough of an > improvement to break this "common subset" behavior. It's not quite the > same as strict backward compatibility and I find it worthwhile to try > to keep it. 
> > Writing programs that run on both 2.x and 3 may require ugly > version-dependent tricks like: > > try: > compile > except NameError: > from sys import compile Perhaps py3k could have a py2compat module. Importing it could have the effect of (for instance) putting compile, id, and intern into the global namespace, making print an alias for writeln, alias the standard library namespace, ... ? Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://pyropus.ca/software/ ----------------------------------------------------------------------- From reinhold-birkenfeld-nospam at wolke7.net Wed Aug 31 22:49:23 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Wed, 31 Aug 2005 22:49:23 +0200 Subject: [Python-Dev] Proof of the pudding: str.partition() In-Reply-To: <05Aug31.130458pdt."58617"@synergy1.parc.xerox.com> References: <5.1.1.6.0.20050831092223.01b56d98@mail.telecommunity.com> <05Aug31.130458pdt."58617"@synergy1.parc.xerox.com> Message-ID: Bill Janssen wrote: >> > (*) Regular Expressions >> >> This can be orthogonally added to the 're' module, and definitely should >> not be part of the string method. > > Sounds right to me, and it *should* be orthogonally added to the 're' > module coincidentally simultaneously with the change to the string > object :-). > > I have to say, it would be nice if > > "foo bar".partition(re.compile('\s')) > > would work. That is, if the argument is an re pattern object instead > of a string, it would be nice if it were understood appropriately, > just for symmetry's sake. But it's hardly necessary. And it's horrible, for none of the other string methods accept a RE. In Python, RE functionality is in the re module and nowhere else, and this is a Good Thing. There are languages which give REs too much weight by philosophy (hint, hint), but Python isn't one of them. 
Interestingly, Python programmers suffer less from the "help me, my RE doesn't work" problem.

Reinhold

--
Mail address is perfectly valid!

From nnorwitz at gmail.com Wed Aug 31 22:56:37 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 31 Aug 2005 13:56:37 -0700
Subject: [Python-Dev] Python 3 design principles
In-Reply-To: <7168d65a050831132415118382@mail.gmail.com>
References: <7168d65a050831132415118382@mail.gmail.com>
Message-ID:

On 8/31/05, Oren Tirosh wrote:
>
> Writing programs that run on both 2.x and 3 may require ugly
> version-dependent tricks like:
>
> try:
>     compile
> except NameError:
>     from sys import compile

Note we can ease this process a little by making a copy without removing, e.g., adding compile to sys now without removing it. As programs support only Python 2.5+, they could use sys.compile and wouldn't need to resort to the try/except above.

I realize this is only a marginal improvement. However, if we don't start making changes, we will be stuck maintaining suboptimal behaviour forever.

n

From collinw at gmail.com Wed Aug 31 23:00:54 2005
From: collinw at gmail.com (Collin Winter)
Date: Wed, 31 Aug 2005 16:00:54 -0500
Subject: [Python-Dev] Python 3 design principles
In-Reply-To: <20050831204439.GA3775@discworld.dyndns.org>
References: <7168d65a050831132415118382@mail.gmail.com> <20050831204439.GA3775@discworld.dyndns.org>
Message-ID: <43aa6ff705083114004924f4a9@mail.gmail.com>

On 31-Aug 05, Charles Cazabon wrote:
> Perhaps py3k could have a py2compat module. Importing it could have the
> effect of (for instance) putting compile, id, and intern into the global
> namespace, making print an alias for writeln, alias the standard library
> namespace, ... ?
from __past__ import python2

Regards,
Collin Winter

From rkern at ucsd.edu Wed Aug 31 23:00:08 2005
From: rkern at ucsd.edu (Robert Kern)
Date: Wed, 31 Aug 2005 14:00:08 -0700
Subject: [Python-Dev] Python 3 design principles
In-Reply-To: <7168d65a050831132415118382@mail.gmail.com>
References: <7168d65a050831132415118382@mail.gmail.com>
Message-ID:

Oren Tirosh wrote:
> While a lot of existing code will break on 3.0 it is still generally
> possible to write code that will run on both 2.x and 3.0: use only the
> "proper" forms above, do not assume the result of zip or range is a
> list, use absolute imports (and avoid static types, of course). I
> already write all my new code this way.
>
> Is this "common subset" a happy coincidence or a design principle?

I think it's because those are the most obvious things right now. The really radical stuff won't come up until active development on Python 3000 actually starts. And it will, so any "common subset" will probably not be very large.

IMO, if we are going to restrict Python 3000 enough to protect that "common subset," then there's not enough payoff to justify breaking *any* backwards compatibility. If my current codebase[1] isn't going to be supported in Python 3000, I'm going to want the Python developers to use that opportunity to the fullest advantage to make a better language.

[1] By which I mean the sum total of the code that I use, not just code that I've personally written. I am a library-whore.

--
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter

From tjreedy at udel.edu Wed Aug 31 23:05:17 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 31 Aug 2005 17:05:17 -0400
Subject: [Python-Dev] Design Principles
References: <003501c5ae63$664f5e40$4320c797@oemcomputer>
Message-ID:

"Raymond Hettinger" wrote in message news:003501c5ae63$664f5e40$4320c797 at oemcomputer...
> FWIW, after this is over, I'll put together a draft list of these > principles. The one listed above has served us well. An early draft of > itertools.ifilter() had an invert flag. The toolset improved when that > was split to a separate function, ifilterfalse(). > > Other thoughts: > > Tim's rule on algorithm selection: We read Knuth so you don't have to. > > Raymond's rule on language proposals: Assertions that construct X is > better than an existing construct Y should be backed up by a variety of > side-by-side comparisons using real-world code samples. > > I'm sure there are plenty more if these in the archives. This would make a good information PEP to point people to when they ask 'Why ...' and the answer goes back to one of these principles. Terry J. Reedy From foom at fuhm.net Wed Aug 31 23:39:54 2005 From: foom at fuhm.net (James Y Knight) Date: Wed, 31 Aug 2005 17:39:54 -0400 Subject: [Python-Dev] Python 3 design principles In-Reply-To: References: <7168d65a050831132415118382@mail.gmail.com> Message-ID: On Aug 31, 2005, at 5:00 PM, Robert Kern wrote: > IMO, if we are going to restrict Python 3000 enough to protect that > "common subset," then there's not enough payoff to justify breaking > *any* backwards compatibility. If my current codebase[1] isn't > going to > be supported in Python 3000, I'm going to want the Python > developers to > use that opportunity to the fullest advantage to make a better > language. I disagree fully. As a maintainer in the Twisted project I very much hope that it is possible to adapt the code such that it will work on Python 3 while still maintaining compatibility with Python 2.X. Otherwise, it will be impossible to make the transition to Python 3 without either maintaining two forks of the codebase (I doubt that'll happen) or abandoning all users still on Python 2. And that surely won't happen either, for a while. Maybe by the time Python 3.1 or 3.2 comes out it'll be possible to completely abandon Python 2. 
I'm perfectly happy to see backwards-incompatible changes in Python 3, as long as they do not make it completely impossible to write code that can run on both Python 3 and Python 2.X. This suggests a few things to me:

a) new features should be added to the python 2.x series first wherever possible.
b) 3.0 should by and large be simply a feature-removal release, removing support for features already marked as going away by the end of the 2.x series and which have replacements.
c) don't make any radical syntax changes which make it impossible to write code that can even parse in both versions.
d) for all backwards-incompatible-change proposals, have a section dedicated to compatibility and migration of old code that explains both how to modify old code to do things purely the new way, _and_ how to modify code to work under both the old and new ways. Strive to make this as simple as possible, but if totally necessary, it may be reasonable to suggest writing a wrapper function which changes behavior based on python version/existence of new methods.

James

From steve at holdenweb.com Wed Aug 31 23:51:14 2005
From: steve at holdenweb.com (Steve Holden)
Date: Wed, 31 Aug 2005 16:51:14 -0500
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To:
References: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com > <5.1.1.6.0.20050831092223.01b56d98@mail.telecommunity.com>
Message-ID:

Fredrik Lundh wrote:
> Phillip J. Eby wrote:
>
>
>>Yep, subscripting and slicing are more than adequate to handle *all* of
>>those use cases, even the ones that some people have been jumping through
>>odd hoops to express:
>>
>>     before = x.partition(sep)[0]
>>     found = x.partition(sep)[1]
>>     after = x.partition(sep)[2]
>>
>>     before, found = x.partition("foo")[:2]
>>     found, after = x.partition("foo")[1:]
>>     before, after = x.partition("foo")[::2]
>>
>>Okay, that last one is maybe a little too clever. I'd personally just use
>>'__' or 'DONTCARE' or something like that for the value(s) I didn't care
>>about, because it actually takes slightly less time to unpack a 3-tuple
>>into three function-local variables than it does to pull out a single
>>element of the tuple, and it's almost twice as fast as taking a slice and
>>unpacking it into two variables.
>
>
> you're completely missing the point.
>
> the problem isn't the time it takes to unpack the return value, the problem is that
> it takes time to create the substrings that you don't need.
>
Indeed, and therefore the performance of rpartition is likely to get worse as the length of the input string increases. I don't like to think about all those strings being created just to be garbage-collected. Pity the poor CPU ... :-)

> for some use cases, a naive partition-based solution is going to be a lot slower
> than the old find+slice approach, no matter how you slice, index, or unpack the
> return value.
>
Yup. Then it gets down to statistical arguments about the distribution of use cases and input lengths. If we had a type that represented a substring of an existing string it might avoid the stress, but I'm not sure I see that one flying.

>
>>So, using three variables is both faster *and* easier to read than any of
>>the variations anybody has proposed, including the ones I just showed above.
>
>
> try again.
>
The collective brainpower that's been exercised on this one enhancement already must be phenomenal, but the proposal still isn't perfect.
regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ From aahz at pythoncraft.com Wed Aug 31 23:55:50 2005 From: aahz at pythoncraft.com (Aahz) Date: Wed, 31 Aug 2005 14:55:50 -0700 Subject: [Python-Dev] Design Principles In-Reply-To: <003501c5ae63$664f5e40$4320c797@oemcomputer> References: <003501c5ae63$664f5e40$4320c797@oemcomputer> Message-ID: <20050831215550.GA437@panix.com> On Wed, Aug 31, 2005, Raymond Hettinger wrote: > > FWIW, after this is over, I'll put together a draft list of these > principles. The one listed above has served us well. An early draft of > itertools.ifilter() had an invert flag. The toolset improved when that > was split to a separate function, ifilterfalse(). > > Other thoughts: > > Tim's rule on algorithm selection: We read Knuth so you don't have to. > > Raymond's rule on language proposals: Assertions that construct X is > better than an existing construct Y should be backed up by a variety of > side-by-side comparisons using real-world code samples. > > I'm sure there are plenty more if these in the archives. Nice! Also a pointer to the Zen of Python. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The way to build large Python applications is to componentize and loosely-couple the hell out of everything.
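
The flag-versus-two-names principle quoted in the Design Principles messages above can be sketched as follows. This is a minimal illustration, not code from the thread: select() and its invert parameter are hypothetical stand-ins for the early ifilter() draft that was described, and the sketch uses the Python 3 spellings filter() and itertools.filterfalse() in place of ifilter()/ifilterfalse().

```python
from itertools import filterfalse


def is_even(n):
    return n % 2 == 0


def select(pred, items, invert=False):
    """Hypothetical flag-based API: one name, with a boolean that is almost
    always a literal constant at the call site."""
    return [x for x in items if bool(pred(x)) != invert]


numbers = range(10)

# Flag style: the reader has to remember which sense True and False carry.
odds_by_flag = select(is_even, numbers, invert=True)

# Two-name style: each call site states its intent in the function name.
evens = list(filter(is_even, numbers))
odds = list(filterfalse(is_even, numbers))

assert odds_by_flag == odds
```

The two-name form also avoids testing the flag on every call, which is the performance half of the argument made above.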