From jeremy@beopen.com Tue Dec 5 02:34:24 2000 From: jeremy@beopen.com (Jeremy Hylton) Date: Tue Dec 5 02:38:43 2000 Subject: [Python-Dev] 2.0 final is nearly ready There are provisional source tarballs available at ftp://python.beopen.com/pub/python/2.0 These are NOT the source tarballs that we intend to release. They are known to contain old README and Misc/NEWS files. But any reports of successful builds on your platform would be appreciated. I expect to put the final release in place in a few hours; any reports of success after the release will be expected <0.5 wink>. Jeremy From fredrik@effbot.org Fri Dec 1 06:39:57 2000 From: fredrik@effbot.org (Fredrik Lundh) Date: Fri, 1 Dec 2000 07:39:57 +0100 Subject: [Python-Dev] TypeError: foo, bar Message-ID: <008f01c05b61$877263b0$3c6340d5@hagrid> just stumbled upon yet another (high-profile) python newbie confused by a "TypeError: read-only character buffer, dictionary" message. how about changing "read-only character buffer" to "string or read-only character buffer", and the "foo, bar" format to "expected foo, found bar", so we get: "TypeError: expected string or read-only character buffer, found dictionary" From tim.one@home.com Fri Dec 1 06:58:53 2000 From: tim.one@home.com (Tim Peters) Date: Fri, 1 Dec 2000 01:58:53 -0500 Subject: [Python-Dev] TypeError: foo, bar In-Reply-To: <008f01c05b61$877263b0$3c6340d5@hagrid> Message-ID: [Fredrik Lundh] > just stumbled upon yet another (high-profile) python newbie > confused by a "TypeError: read-only character buffer, dictionary" > message. > > how about changing "read-only character buffer" to "string > or read-only character buffer", and the "foo, bar" format to > "expected foo, found bar", so we get: > > "TypeError: expected string or read-only character > buffer, found dictionary" +0. +1 if "found" is changed to "got". 
"found"-implies-a-search-ly y'rs - tim From thomas.heller@ion-tof.com Fri Dec 1 08:10:21 2000 From: thomas.heller@ion-tof.com (Thomas Heller) Date: Fri, 1 Dec 2000 09:10:21 +0100 Subject: [Python-Dev] PEP 229 and 222 References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> <20001130181438.A21596@ludwig.cnri.reston.va.us> Message-ID: <014301c05b6e$269716a0$e000a8c0@thomasnotebook> > > Beats me.  I'm not even sure if the Distutils offers a way to compile > > a static Python binary. (GPW: well, does it?) > > It's in the CCompiler interface, but hasn't been exposed to the outside > world. (IOW, it's mainly a question of designing the right setup > script/command line interface: the implementation should be fairly > straightforward, assuming the existing CCompiler classes do the right > thing for generating binary executables.) Distutils currently only supports build_*** commands for C-libraries and Python extensions. Shouldn't there also be build commands for shared libraries, executable programs and static Python binaries? Thomas BTW: Distutils-sig seems pretty dead these days... From ping@lfw.org Fri Dec 1 10:23:56 2000 From: ping@lfw.org (Ka-Ping Yee) Date: Fri, 1 Dec 2000 02:23:56 -0800 (PST) Subject: [Python-Dev] Cryptic error messages Message-ID: An attempt to use sockets for the first time yesterday left a friend of mine bewildered: >>> import socket >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) >>> s.connect('localhost:234') Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: 2-sequence, 13-sequence >>> "What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked. I see in getargs.c (line 275) that this type of message is documented: /* Convert a tuple argument. [...] If the argument is invalid: [...] 
*msgbuf contains an error message, whose format is: "<typename1>, <typename2>", where <typename1> is the name of the expected type, and <typename2> is the name of the actual type (so you can surround it by "expected ... found"), and msgbuf is returned. */ It's clear that the socketmodule is not prepending "expected" and appending "found", as the author of converttuple intended. But when i grepped through the source code, i couldn't find anyone applying this "expected %s found" % msgbuf convention outside of getargs.c. Is it really in use? Could we just change getargs.c so that converttuple() returns a message like "expected ..., got ..." instead of seterror()? Additionally it would be nice to say '13-character string' instead of '13-sequence'... -- ?!ng "All models are wrong; some models are useful." -- George Box From mwh21@cam.ac.uk Fri Dec 1 11:20:23 2000 From: mwh21@cam.ac.uk (Michael Hudson) Date: 01 Dec 2000 11:20:23 +0000 Subject: [Python-Dev] Cryptic error messages In-Reply-To: Ka-Ping Yee's message of "Fri, 1 Dec 2000 02:23:56 -0800 (PST)" References: Message-ID: Ka-Ping Yee writes: > An attempt to use sockets for the first time yesterday left a > friend of mine bewildered: > > >>> import socket > >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > >>> s.connect('localhost:234') > Traceback (most recent call last): > File "<stdin>", line 1, in ? > TypeError: 2-sequence, 13-sequence > >>> > > "What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked. > I'm not sure about the general case, but in this case you could do something like: http://sourceforge.net/patch/?func=detailpatch&patch_id=102599&group_id=5470 Now you get an error message like: TypeError: getsockaddrarg: AF_INET address must be tuple, not string Cheers, M. -- I have gathered a posie of other men's flowers, and nothing but the thread that binds them is my own. 
-- Montaigne From guido@python.org Fri Dec 1 13:02:02 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 01 Dec 2000 08:02:02 -0500 Subject: [Python-Dev] TypeError: foo, bar In-Reply-To: Your message of "Fri, 01 Dec 2000 07:39:57 +0100." <008f01c05b61$877263b0$3c6340d5@hagrid> References: <008f01c05b61$877263b0$3c6340d5@hagrid> Message-ID: <200012011302.IAA31609@cj20424-a.reston1.va.home.com> > just stumbled upon yet another (high-profile) python newbie > confused by a "TypeError: read-only character buffer, dictionary" > message. > > how about changing "read-only character buffer" to "string > or read-only character buffer", and the "foo, bar" format to > "expected foo, found bar", so we get: > > "TypeError: expected string or read-only character > buffer, found dictionary" The first was easy, and I've done it. The second one, for some reason, is hard. I forget why. Sorry. --Guido van Rossum (home page: http://www.python.org/~guido/) From cgw@fnal.gov Fri Dec 1 13:41:04 2000 From: cgw@fnal.gov (Charles G Waldman) Date: Fri, 1 Dec 2000 07:41:04 -0600 (CST) Subject: [Python-Dev] TypeError: foo, bar In-Reply-To: <008f01c05b61$877263b0$3c6340d5@hagrid> References: <008f01c05b61$877263b0$3c6340d5@hagrid> Message-ID: <14887.43632.812342.414156@buffalo.fnal.gov> Fredrik Lundh writes: > how about changing "read-only character buffer" to "string > or read-only character buffer", and the "foo, bar" format to > "expected foo, found bar", so we get: > > "TypeError: expected string or read-only character > buffer, found dictionary" +100. Recently, I've been teaching Python to some beginners and they find this message absolutely inscrutable. Also agree with Tim about "found" vs. "got", but this is of secondary importance. 
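In Python terms, the wording being proposed in this thread amounts to something like the following (a hypothetical helper for illustration only; the actual change lives in C, in getargs.c and friends):

```python
def type_error(expected, got_value):
    # Sketch of the proposed message format:
    # "expected <expected>, got <actual type name>".
    raise TypeError("expected %s, got %s"
                    % (expected, type(got_value).__name__))

msg = None
try:
    type_error("string or read-only character buffer", {})
except TypeError as exc:
    msg = str(exc)
print(msg)  # expected string or read-only character buffer, got dict
```

The point of the "got" spelling is exactly what Tim notes: "found" implies a search happened, while "got" just names what the caller passed in.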
From Moshe Zadka Fri Dec 1 14:26:03 2000 From: Moshe Zadka (Moshe Zadka) Date: Fri, 1 Dec 2000 16:26:03 +0200 (IST) Subject: [Python-Dev] [OT] Change of Address Message-ID: I'm sorry to bother you all with this, but from time to time you might need to reach me by e-mail... 30 days from now, this e-mail address will no longer be valid. Please use anything@zadka.site.co.il to reach me. Thank you for your time. -- Moshe Zadka -- 95855124 http://advogato.org/person/moshez From gward@mems-exchange.org Fri Dec 1 15:14:53 2000 From: gward@mems-exchange.org (Greg Ward) Date: Fri, 1 Dec 2000 10:14:53 -0500 Subject: [Python-Dev] PEP 229 and 222 In-Reply-To: <014301c05b6e$269716a0$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Fri, Dec 01, 2000 at 09:10:21AM +0100 References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> <20001130181438.A21596@ludwig.cnri.reston.va.us> <014301c05b6e$269716a0$e000a8c0@thomasnotebook> Message-ID: <20001201101452.A26074@ludwig.cnri.reston.va.us> On 01 December 2000, Thomas Heller said: > Distutils currently only supports build_*** commands for > C-libraries and Python extensions. > > Shouldn't there also be build commands for shared libraries, > executable programs and static Python binaries? Andrew and I talked about this a bit yesterday, and the proposed interface is as follows: python setup.py build_ext --static will compile all extensions in the current module distribution, but instead of creating a .so (.pyd) file for each one, will create a new python binary in build/bin.. Issue to be resolved: what to call the new python binary, especially when installing it (presumably we *don't* want to clobber the stock binary, but supplement it with (eg.) "foopython"). Note that there is no provision for selectively building some extensions as shared. 
This means that Andrew's Distutil-ization of the standard library will have to override the build_ext command and have some extra way to select extensions for shared/static. Neither of us considered this a problem. > BTW: Distutils-sig seems pretty dead these days... Yeah, that's a combination of me playing on other things and python.net email being dead for over a week. I'll cc the sig on this and see if this interface proposal gets anyone's attention. Greg From jeremy@alum.mit.edu Fri Dec 1 19:27:14 2000 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Dec 2000 14:27:14 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test Message-ID: <14887.64402.88530.714821@bitdiddle.concentric.net> There was recently some idle chatter in Guido's living room about using a unit testing framework (like PyUnit) for the Python regression test suite. We're also writing tests for some DC projects, and need to decide what framework to use. Does anyone have opinions on test frameworks? A quick web search turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver that implements xUnit-style unit tests. Are there other tools we should consider? Is anyone else interested in migrating the current test suite to a new framework? I hope the new framework will allow us to improve the test suite in a number of ways: - run an entire test suite to completion instead of stopping on the first failure - clearer reporting of what went wrong - better support for conditional tests, e.g. write a test for httplib that only runs if the network is up. This is tied into better error reporting, since the current test suite could only report that httplib succeeded or failed. Jeremy From fdrake@acm.org Fri Dec 1 19:24:46 2000 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Fri, 1 Dec 2000 14:24:46 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net> References: <14887.64402.88530.714821@bitdiddle.concentric.net> Message-ID: <14887.64254.399477.935828@cj42289-a.reston1.va.home.com> Jeremy Hylton writes: > - better support for conditional tests, e.g. write a test for > httplib that only runs if the network is up. This is tied into > better error reporting, since the current test suite could only > report that httplib succeeded or failed. There is a TestSkipped exception that can be raised with an explanation of why. It's used in the largefile test (at least). I think it is documented in the README. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From akuchlin@mems-exchange.org Fri Dec 1 19:58:27 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Fri, 1 Dec 2000 14:58:27 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 02:27:14PM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> Message-ID: <20001201145827.D16751@kronos.cnri.reston.va.us> On Fri, Dec 01, 2000 at 02:27:14PM -0500, Jeremy Hylton wrote: >There was recently some idle chatter in Guido's living room about >using a unit testing framework (like PyUnit) for the Python regression >test suite. We're also writing tests for some DC projects, and need Someone remembered my post of 23 Nov, I see... The only other test framework I know of is the unittest.py inside Quixote, written because we thought PyUnit was kind of clunky. Greg Ward, who primarily wrote it, used more sneaky interpreter tricks to make the interface more natural, though it still worked with Jython last time we checked (some time ago, though). No GUI, but it can optionally show the code coverage of a test suite, too. 
See http://x63.deja.com/=usenet/getdoc.xp?AN=683946404 for some notes on using it. Obviously I think the Quixote unittest.py is the best choice for the stdlib. --amk From jeremy@alum.mit.edu Fri Dec 1 20:55:28 2000 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Dec 2000 15:55:28 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us> References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> Message-ID: <14888.4160.838336.537708@bitdiddle.concentric.net> Is there any documentation for the Quixote unittest tool? The Example page is helpful, but it feels like there are some details that are not explained. Jeremy From akuchlin@mems-exchange.org Fri Dec 1 21:12:12 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Fri, 1 Dec 2000 16:12:12 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14888.4160.838336.537708@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 03:55:28PM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> Message-ID: <20001201161212.A12372@kronos.cnri.reston.va.us> On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote: >Is there any documentation for the Quixote unittest tool? The Example >page is helpful, but it feels like there are some details that are not >explained. I don't believe we've written docs at all for internal use. What details seem to be missing? 
--amk From jeremy@alum.mit.edu Fri Dec 1 21:21:27 2000 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Dec 2000 16:21:27 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <20001201161212.A12372@kronos.cnri.reston.va.us> References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> <20001201161212.A12372@kronos.cnri.reston.va.us> Message-ID: <14888.5719.844387.435471@bitdiddle.concentric.net> >>>>> "AMK" == Andrew Kuchling writes: AMK> On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote: >> Is there any documentation for the Quixote unittest tool? The >> Example page is helpful, but it feels like there are some details >> that are not explained. AMK> I don't believe we've written docs at all for internal use. AMK> What details seem to be missing? Details: - I assume setup/shutdown are equivalent to setUp/tearDown - Is it possible to override constructor for TestScenario? - Is there something equivalent to PyUnit self.assert_ - What does parse_args() do? - What does run_scenarios() do? - If I have multiple scenarios, how do I get them to run? 
Jeremy From akuchlin@mems-exchange.org Fri Dec 1 21:34:30 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Fri, 1 Dec 2000 16:34:30 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14888.5719.844387.435471@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 04:21:27PM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> <20001201161212.A12372@kronos.cnri.reston.va.us> <14888.5719.844387.435471@bitdiddle.concentric.net> Message-ID: <20001201163430.A12417@kronos.cnri.reston.va.us> On Fri, Dec 01, 2000 at 04:21:27PM -0500, Jeremy Hylton wrote: > - I assume setup/shutdown are equivalent to setUp/tearDown Correct. > - Is it possible to override constructor for TestScenario? Beats me; I see no reason why you couldn't, though. > - Is there something equivalent to PyUnit self.assert_ Probably test_bool(), I guess: self.test_bool('self.run.is_draft()') asserts that self.run.is_draft() will return true. Or does self.assert_() do something more? > - What does parse_args() do? > - What does run_scenarios() do? > - If I have multiple scenarios, how do I get them to run? These 3 questions are all related, really. At the bottom of our test scripts, we have the following stereotyped code: if __name__ == "__main__": (scenarios, options) = parse_args() run_scenarios (scenarios, options) parse_args() ensures consistent arguments to test scripts; -c measures code coverage, -v is verbose, etc. 
It also looks in the __main__ module and finds all subclasses of TestScenario, so you can do: python test_process_run.py # Runs all N scenarios python test_process_run.py ProcessRunTest # Runs all cases for 1 scenario python test_process_run.py ProcessRunTest:check_access # Runs one test case # in one scenario class --amk From tim.one@home.com Fri Dec 1 21:47:54 2000 From: tim.one@home.com (Tim Peters) Date: Fri, 1 Dec 2000 16:47:54 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net> Message-ID: [Jeremy Hylton] > There was recently some idle chatter in Guido's living room about > using a unit testing framework (like PyUnit) for the Python regression > test suite. We're also writing tests for some DC projects, and need > to decide what framework to use. > > Does anyone have opinions on test frameworks? A quick web search > turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver > that allows implements xUnit-style unit tests. Are there other tools > we should consider? My own doctest is loved by people other than just me <wink>, but is aimed at ensuring that examples in docstrings work exactly as shown (which is why it starts with "doc" instead of "test"). > Is anyone else interested in migrating the current test suite to a new > framework? Yes. > I hope the new framework will allow us to improve the test > suite in a number of ways: > > - run an entire test suite to completion instead of stopping on > the first failure doctest does that. > - clearer reporting of what went wrong Ditto. > - better support for conditional tests, e.g. write a test for > httplib that only runs if the network is up. This is tied into > better error reporting, since the current test suite could only > report that httplib succeeded or failed. A doctest test is simply an interactive Python session pasted into a docstring (or more than one session, and/or interspersed with prose). 
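For instance (the function here is hypothetical, not from the stdlib), the docstring sessions double as the test cases:

```python
def average(values):
    """Return the arithmetic mean of a sequence of numbers.

    >>> average([1, 2, 3, 4])
    2.5
    >>> average([10])
    10.0
    """
    return sum(values) / len(values)

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # re-runs the docstring sessions, reports any mismatch
```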
If you can write an example in the interactive shell, doctest will verify it still works as advertised. This allows for embedding unit tests into the docs for each function, method and class. Nothing about them "looks like" an artificial test tacked on: the examples in the docs *are* the test cases. I need to try the other frameworks. I dare say doctest is ideal for computational functions, where the intended input->output relationship can be clearly explicated via examples. It's useless for GUIs. Usefulness varies accordingly between those extremes (doctest is natural exactly to the extent that a captured interactive session is helpful for documentation purposes). testing-ain't-easy-ly y'rs - tim From barry@digicool.com Sat Dec 2 03:52:29 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Fri, 1 Dec 2000 22:52:29 -0500 Subject: [Python-Dev] PEP 231, __findattr__() Message-ID: <14888.29181.355023.669030@anthem.concentric.net> I've just uploaded PEP 231, which describes a new hook in the instance access mechanism, called __findattr__() after a similar mechanism that exists in Jython (but is not exposed at the Python layer). You can do all kinds of interesting things with __findattr__(), including implement the __of__() protocol of ExtensionClass, and thus implicit and explicit acquisitions, in pure Python. You can also do Java Bean-like interfaces and C++-like access control. The PEP contains sample implementations of all of these, although the latter isn't as clean as I'd like, due to other restrictions in Python. My hope is that __findattr__() would eliminate most, if not all, the need for ExtensionClass, at least within the Zope and ZODB contexts. I haven't tried to implement Persistent using it though. Since it's a long PEP, I won't include it here. You can read about it at this URL http://python.sourceforge.net/peps/pep-0231.html It includes a link to the patch implementing this feature on SourceForge. 
Enjoy, -Barry From Moshe Zadka Sat Dec 2 09:11:50 2000 From: Moshe Zadka (Moshe Zadka) Date: Sat, 2 Dec 2000 11:11:50 +0200 (IST) Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: On Fri, 1 Dec 2000, Barry A. Warsaw wrote: > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). There's one thing that bothers me about this: what exactly is "the call stack"? Let me clarify: what happens when you have threads. Both machine-level threads and stackless threads confuse the issues here, not to mention stackless continuations. Can you add a few words to the PEP about dealing with those? From mal@lemburg.com Sat Dec 2 10:03:11 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 02 Dec 2000 11:03:11 +0100 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: <3A28C8DF.E430484F@lemburg.com> "Barry A. Warsaw" wrote: > > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). > > You can do all kinds of interesting things with __findattr__(), > including implement the __of__() protocol of ExtensionClass, and thus > implicit and explicit acquisitions, in pure Python. You can also do > Java Bean-like interfaces and C++-like access control. The PEP > contains sample implementations of all of these, although the latter > isn't as clean as I'd like, due to other restrictions in Python. > > My hope is that __findattr__() would eliminate most, if not all, the > need for ExtensionClass, at least within the Zope and ZODB contexts. > I haven't tried to implement Persistent using it though. 
The PEP does define when and how __findattr__() is called, but makes no statement about what it should do or return... Here's a slightly different idea: Given the name, I would expect it to go look for an attribute and then return the attribute and its container (this doesn't seem to be what you have in mind here, though). An alternative approach given the semantics above would then be to first try a __getattr__() lookup and revert to __findattr__() in case this fails. I don't think there is any need to overload __setattr__() in such a way, because you cannot be sure which object actually gets the new attribute. By exposing the functionality using a new builtin, findattr(), this could be used for all the examples you give too. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From barry@digicool.com Sat Dec 2 16:50:02 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Sat, 2 Dec 2000 11:50:02 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> <3A28C8DF.E430484F@lemburg.com> Message-ID: <14889.10298.621133.961677@anthem.concentric.net> >>>>> "M" == M writes: M> The PEP does define when and how __findattr__() is called, M> but makes no statement about what it should do or return... Good point. I've clarified that in the PEP. M> Here's a slightly different idea: M> Given the name, I would expect it to go look for an attribute M> and then return the attribute and its container (this doesn't M> seem to be what you have in mind here, though). No, because some applications won't need a wrapped object. E.g. in the Java bean example, it just returns the attribute (which is stored with a slightly different name). 
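For reference, the Bean-style storage Barry describes — the attribute kept under a slightly different, mangled name, with dotted access interposed on — can be approximated in pure Python with the existing __getattr__/__setattr__ hooks (an illustrative sketch only, not PEP 231's __findattr__ mechanism):

```python
class Bean:
    # Illustrative sketch: attributes live under a "_"-prefixed name,
    # and every dotted get/set is routed through the hooks below.
    def __init__(self, x):
        self.__dict__["_x"] = x  # bypass __setattr__ during init

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for the unmangled name.
        try:
            return self.__dict__["_" + name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Every assignment is redirected to the mangled slot.
        self.__dict__["_" + name] = value
```

The difference from __findattr__ is exactly the one under discussion: __getattr__ fires only after normal lookup fails, whereas the proposed hook would get first crack at every access.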
M> An alternative approach given the semantics above would then be M> to first try a __getattr__() lookup and revert to M> __findattr__() in case this fails. I don't think this is as useful. What would that buy you that you can't already do today? The key concept here is that you want to give the class first crack to interpose on every attribute access. You want this hook to get called before anybody else can get at, or set, your attributes. That gives you (the class) total control to implement whatever policy is useful. M> I don't think there is any need to overload __setattr__() in M> such a way, because you cannot be sure which object actually M> gets the new attribute. M> By exposing the functionality using a new builtin, findattr(), M> this could be used for all the examples you give too. No, because then people couldn't use the object in the normal dot-notational way. -Barry From tismer@tismer.com Sat Dec 2 16:27:33 2000 From: tismer@tismer.com (Christian Tismer) Date: Sat, 02 Dec 2000 18:27:33 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: <3A2922F5.C2E0D10@tismer.com> Hi Barry, "Barry A. Warsaw" wrote: > > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). > > You can do all kinds of interesting things with __findattr__(), > including implement the __of__() protocol of ExtensionClass, and thus > implicit and explicit acquisitions, in pure Python. You can also do > Java Bean-like interfaces and C++-like access control. The PEP > contains sample implementations of all of these, although the latter > isn't as clean as I'd like, due to other restrictions in Python. > > My hope is that __findattr__() would eliminate most, if not all, the > need for ExtensionClass, at least within the Zope and ZODB contexts. 
> I haven't tried to implement Persistent using it though. I have been using ExtensionClass for quite a long time, and I have to say that you indeed eliminate most of its need through this super-elegant idea. Congratulations! Besides acquisition and persistence interception, wrapping plain C objects and giving them Class-like behavior while retaining fast access to internal properties but being able to override methods by Python methods was my other use of ExtensionClass. I assume this is the other "20%" part you mention, which is much harder to achieve? But that part also looks easier to implement now, by the support of the __findattr__ method. > Since it's a long PEP, I won't include it here. You can read about it > at this URL > > http://python.sourceforge.net/peps/pep-0231.html Great. I had to read it twice, but it was fun. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From tismer@tismer.com Sat Dec 2 16:55:21 2000 From: tismer@tismer.com (Christian Tismer) Date: Sat, 02 Dec 2000 18:55:21 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: Message-ID: <3A292979.60BB1731@tismer.com> Moshe Zadka wrote: > > On Fri, 1 Dec 2000, Barry A. Warsaw wrote: > > > I've just uploaded PEP 231, which describes a new hook in the instance > > access mechanism, called __findattr__() after a similar mechanism that > > exists in Jython (but is not exposed at the Python layer). > > There's one thing that bothers me about this: what exactly is "the > call stack"? Let me clarify: what happens when you have threads. > Both machine-level threads and stackless threads confuse the issues > here, not to mention stackless continuations. 
Can you add a few > words to the PEP about dealing with those? As far as I understood the patch (just skimmed), there is no stack involved directly, but the instance increments and decrements a variable infindattr.
+ if (v != NULL && !inst->infindaddr &&
+ (func = inst->in_class->cl_findattr))
+ {
+ PyObject *args, *res;
+ args = Py_BuildValue("(OOO)", inst, name, v);
+ if (args == NULL)
+ return -1;
+ ++inst->infindaddr;
+ res = PyEval_CallObject(func, args);
+ --inst->infindaddr;
This is: The call modifies the instance's state, while calling the findattr method. You are right: I see a serious problem with this. It doesn't even need continuations to get things messed up. Guido's proposed coroutines, together with uThread-Switching, might be able to enter the same instance twice with ease. Barry, after second thought, I feel this can become a problem in the future. This infindattr attribute only works correctly if we are guaranteed to use strict stack order of execution. What you're *intending* to do is to tell the PyEval_CallObject that it should not find the __findattr__ attribute. But this should be done only for this call and all of its descendants, but no *fresh* access from elsewhere. The hard way to get out of this would be to stop scheduling in that case. Maybe this is very cheap, but quite inelegant. We have a quite peculiar system state here: A function call acts like an escape, to make all subsequent calls behave differently, until this call is finished. Without blocking microthreads, a clean way to do this would be a search up in the frame chain, if there is a running __findattr__ method of this object. Fairly expensive. Well, the problem also exists with real threads, if they are allowed to switch in such a context. I fear it is necessary to either block this stuff until it is ready, or to maintain some thread-wise structure for the state of this object. Ok, after thinking some more, I'll start an extra message to Barry on this topic. 
cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From tismer@tismer.com Sat Dec 2 17:21:18 2000 From: tismer@tismer.com (Christian Tismer) Date: Sat, 02 Dec 2000 19:21:18 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: <3A292F8D.7C616449@tismer.com> "Barry A. Warsaw" wrote: > > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). Ok, as I announced already, here are some thoughts on __findattr__, system state, and how it could work. Looking at your patch, I realize that you are blocking __findattr__ for your whole instance, until this call ends. This is not what you want to do, I guess. This ends up affecting the whole system state when threads are involved. Also you cannot use __findattr__ on any other attribute during this call. You most probably want to do this: __findattr__ should not be invoked again for this instance, with this attribute name, for this "thread", until you are done. The correct way to find out whether __findattr__ is active or not would be to look up the frame chain and inspect it. Moshe also asked about continuations: I think this would resolve quite fine. However we jump around, the current chain of frames dictates the semantics of __findattr__. It even applies to Guido's tamed coroutines, given that an explicit switch were allowed in the context of __findattr__. In a sense, we get some kind of dynamic context here, since we need to do a lookup for something in the dynamic call chain. 
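The "search up in the frame chain" idea can be spelled out in pure Python (a sketch only, using CPython's sys._getframe; the names are illustrative, not from Barry's patch):

```python
import sys

def findattr_is_active(obj):
    # Walk up the chain of calling frames and report whether a
    # __findattr__ call for this same object is already in progress.
    frame = sys._getframe(1)
    while frame is not None:
        if (frame.f_code.co_name == "__findattr__"
                and frame.f_locals.get("self") is obj):
            return True
        frame = frame.f_back
    return False

class Demo:
    def __findattr__(self, name):
        # Re-entry is now detectable without storing any flag on the
        # instance, so other threads (or coroutines) are unaffected.
        return findattr_is_active(self)
```

This also makes the cost concrete: every interposed attribute access would pay a walk over the entire call stack.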
I guess this would be quite messy to implement, and inefficient. Isn't there a way to accomplish the desired effect without modifying the instance? In the context of __findattr__, *we* know that we don't want to get a recursive call. Let's assume __getattr__ and __setattr__ had yet another optional parameter: infindattr, defaulting to 0. We would then have to pass a positive value in this context, which would tell object.c not to try to invoke __findattr__ again. With explicit passing of state, no problems with threads can occur. Readability might improve as well. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From Moshe Zadka Sun Dec 3 13:14:43 2000 From: Moshe Zadka (Moshe Zadka) Date: Sun, 3 Dec 2000 15:14:43 +0200 (IST) Subject: [Python-Dev] Another Python Developer Missing Message-ID: Gordon McMillan is not a possible assignee in the assign_to field. -- Moshe Zadka -- 95855124 http://moshez.org From tim.one@home.com Sun Dec 3 17:35:36 2000 From: tim.one@home.com (Tim Peters) Date: Sun, 3 Dec 2000 12:35:36 -0500 Subject: [Python-Dev] Another Python Developer Missing In-Reply-To: Message-ID: [Moshe Zadka] > Gordon McMillan is not a possible assignee in the assign_to field. We almost never add people as Python developers unless they ask for that, since it comes with responsibility as well as riches beyond the dreams of avarice. If Gordon would like to apply, we won't charge him any interest until 2001 . From mal@lemburg.com Sun Dec 3 19:21:11 2000 From: mal@lemburg.com (M.-A.
Lemburg) Date: Sun, 03 Dec 2000 20:21:11 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib urllib.py,1.107,1.108 References: <200012031830.KAA30620@slayer.i.sourceforge.net> Message-ID: <3A2A9D27.AF43D665@lemburg.com> "Martin v. Löwis" wrote: > > Update of /cvsroot/python/python/dist/src/Lib > In directory slayer.i.sourceforge.net:/tmp/cvs-serv30506 > > Modified Files: > urllib.py > Log Message: > Convert Unicode strings to byte strings before passing them into specific > protocols. Closes bug #119822. > > ... > + > + def toBytes(url): > + """toBytes(u"URL") --> 'URL'.""" > + # Most URL schemes require ASCII. If that changes, the conversion > + # can be relaxed > + if type(url) is types.UnicodeType: > + try: > + url = url.encode("ASCII") You should make this: 'ascii' -- encoding names are lower case per convention (and the implementation has a short-cut to speed up conversion to 'ascii' -- not for 'ASCII'). > + except UnicodeError: > + raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters") Would it be better to use a simple ValueError here? (UnicodeError is a subclass of ValueError, but the error doesn't really have anything to do with Unicode conversions...)
> + return url > > def unwrap(url): -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tismer@tismer.com Sun Dec 3 20:01:07 2000 From: tismer@tismer.com (Christian Tismer) Date: Sun, 03 Dec 2000 22:01:07 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib filecmp.py,1.6,1.7 References: <200012032048.MAA10353@slayer.i.sourceforge.net> Message-ID: <3A2AA683.3840AA8A@tismer.com> Moshe Zadka wrote: > > Update of /cvsroot/python/python/dist/src/Lib > In directory slayer.i.sourceforge.net:/tmp/cvs-serv9465 > > Modified Files: > filecmp.py > Log Message: > Call of _cmp had wrong number of parameters. > Fixed definition of _cmp. ... > ! return not abs(cmp(a, b, sh, st)) > except os.error: > return 2 Ugh! Wouldn't that be a fine chance to rename the cmp function in this module? Overriding a built-in is really not nice to have in a library. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From Moshe Zadka Sun Dec 3 21:01:07 2000 From: Moshe Zadka (Moshe Zadka) Date: Sun, 3 Dec 2000 23:01:07 +0200 (IST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib filecmp.py,1.6,1.7 In-Reply-To: <3A2AA683.3840AA8A@tismer.com> Message-ID: On Sun, 3 Dec 2000, Christian Tismer wrote: > Ugh! Wouldn't that be a fine chance to rename the cmp > function in this module? Overriding a built-in > is really not nice to have in a library. The fine chance was when we moved cmp.py->filecmp.py. Now it would just break backwards compatibility.
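The shadowing hazard being objected to, in miniature. Built-in cmp() itself no longer exists in modern Python, so len stands in for it here; the mechanics are the same for any module-level name that matches a builtin:

```python
# A module-level name that matches a builtin shadows it for every
# function in that module -- which is what filecmp.cmp did to the
# old built-in cmp().

def length(seq):
    return len(seq)           # name resolves via globals first, then builtins

builtin_result = length([1, 2, 3])    # 3: the real builtin

def len(seq):                 # the shadowing definition
    return -1

shadowed_result = length([1, 2, 3])   # -1: the module global now wins

del len                       # removing the shadow restores the builtin
restored_result = length([1, 2, 3])   # 3 again
```

Callers outside the module are unaffected either way; the trap is entirely for code *inside* the module that still expects the builtin.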
-- Moshe Zadka -- 95855124 http://moshez.org From tismer@tismer.com Sun Dec 3 20:12:15 2000 From: tismer@tismer.com (Christian Tismer) Date: Sun, 03 Dec 2000 22:12:15 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Libfilecmp.py,1.6,1.7 References: Message-ID: <3A2AA91F.843E2BAE@tismer.com> Moshe Zadka wrote: > > On Sun, 3 Dec 2000, Christian Tismer wrote: > > > Ugh! Wouldn't that be a fine chance to rename the cmp > > function in this module? Overriding a built-in > > is really not nice to have in a library. > > The fine chance was when we moved cmp.py->filecmp.py. > Now it would just break backwards compatibility. Yes, I see. cmp belongs to the module's interface. Maybe it could be renamed anyway, and be assigned to cmp at the very end of the file, but not using cmp anywhere in the code. My first reaction on reading the patch was "yuck!" since I didn't know this module. python-dev/null - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From martin@loewis.home.cs.tu-berlin.de Sun Dec 3 21:56:44 2000 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 3 Dec 2000 22:56:44 +0100 Subject: [Python-Dev] PEP 231, __findattr__() Message-ID: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> > Isn't there a way to accomplish the desired effect without modifying > the instance? In the context of __findattr__, *we* know that we > don't want to get a recursive call. Let's assume __getattr__ and > __setattr__ had yet another optional parameter: infindattr, > defaulting to 0. We would then have to pass a positive value in > this context, which would tell object.c not to try to invoke > __findattr__ again. Who is "we" here?
The Python code implementing __findattr__? How would it pass a value to __setattr__? It doesn't call __setattr__, instead it has "self.__myfoo = x"... I agree that the current implementation is not thread-safe. To solve that, you'd need to associate with each instance not a single "infindattr" attribute, but a whole set of them - one per "thread of execution" (which would be a thread-id in most threading systems). Of course, that would need some cooperation from any thread scheme (including uthreads), which would need to provide an identification for a "calling context". Regards, Martin From martin@loewis.home.cs.tu-berlin.de Sun Dec 3 22:07:17 2000 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 3 Dec 2000 23:07:17 +0100 Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108 Message-ID: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de> > You should make this: 'ascii' -- encoding names are lower case per > convention (and the implementation has a short-cut to speed up > conversion to 'ascii' -- not for 'ASCII'). With conventions, it is a difficult story. I'm pretty certain that users typically see that particular American standard as ASCII (to the extent of calling it "a s c two"), not ascii. As for speed - feel free to change the code if you think it matters. > + raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters") > Would it be better to use a simple ValueError here? (UnicodeError > is a subclass of ValueError, but the error doesn't really have > anything to do with Unicode conversions...) Why does it not have to do with Unicode conversion? A conversion from Unicode to ASCII was attempted, and failed. I guess I would be more open to suggested changes if you had put them into the patch manager at the time you reviewed the patch...
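For reference, a modern-Python rendering of the toBytes() logic under review (str/bytes in place of the 1.x Unicode/string types; in today's CPython codec name lookup is case-insensitive, so 'ascii' vs 'ASCII' is a matter of convention and of which spelling hits the fast path):

```python
def to_bytes(url):
    """toBytes(u"URL") -> b'URL'; most URL schemes require ASCII only."""
    if isinstance(url, str):
        try:
            return url.encode('ascii')   # lower-case name, per convention
        except UnicodeError:
            raise UnicodeError(
                "URL " + repr(url) + " contains non-ASCII characters")
    return url

ok = to_bytes('http://example.com/')     # b'http://example.com/'
# both spellings reach the same codec:
same = 'abc'.encode('ASCII') == 'abc'.encode('ascii')
try:
    to_bytes('http://\u00fcmlaut.example/')
    raised = False
except UnicodeError:
    raised = True
```

The exception question stands either way: UnicodeEncodeError is a subclass of both UnicodeError and ValueError, so callers catching ValueError are covered under either choice.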
Regards, Martin From tismer@tismer.com Sun Dec 3 21:38:11 2000 From: tismer@tismer.com (Christian Tismer) Date: Sun, 03 Dec 2000 23:38:11 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> Message-ID: <3A2ABD43.AB56BD60@tismer.com> "Martin v. Loewis" wrote: > > > Isn't there a way to accomplish the desired effect without modifying > > the instance? In the context of __findattr__, *we* know that we > > don't want to get a recursive call. Let's assume __getattr__ and > > __setattr__ had yet another optional parameter: infindattr, > > defaulting to 0. We would then have to pass a positive value in > > this context, which would tell object.c not to try to invoke > > __findattr__ again. > > Who is "we" here? The Python code implementing __findattr__? How would > it pass a value to __setattr__? It doesn't call __setattr__, instead > it has "self.__myfoo = x"... Ouch - right! Sorry :) > I agree that the current implementation is not thread-safe. To solve > that, you'd need to associate with each instance not a single > "infindattr" attribute, but a whole set of them - one per "thread of > execution" (which would be a thread-id in most threading systems). Of > course, that would need some cooperation from any thread scheme > (including uthreads), which would need to provide an identification > for a "calling context". Right, that is one possible way to do it. I also thought about some alternatives, but they all sound too complicated to justify them. Also, I don't think this is only thread-related, since a mess can happen even with an explicit coroutine jump. Furthermore, how do we deal with multiple attribute names? The function works incorrectly if __findattr__ tries to inspect another attribute. IMO, the state of the current interpreter changes here (or should do so), and this changed state needs to be carried down with all subsequent function calls.
confused - ly chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From mal@lemburg.com Sun Dec 3 22:51:10 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 03 Dec 2000 23:51:10 +0100 Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108 References: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de> Message-ID: <3A2ACE5E.A9F860A8@lemburg.com> "Martin v. Loewis" wrote: > > > You should make this: 'ascii' -- encoding names are lower case per > > convention (and the implementation has a short-cut to speed up > > conversion to 'ascii' -- not for 'ASCII'). > > With conventions, it is a difficult story. I'm pretty certain that > users typically see that particular american standard as ASCII (to the > extend of calling it "a s c two"), not ascii. It's a convention in the codec registry design and used as such in the Unicode implementation. > As for speed - feel free to change the code if you think it matters. Hey... this was just a suggestion. I thought that you didn't know of the internal short-cut and wanted to hint at it. > > + raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters") > > > Would it be better to use a simple ValueError here ? (UnicodeError > > is a subclass of ValueError, but the error doesn't really have > > something to do with Unicode conversions...) > > Why does it not have to do with Unicode conversion? A conversion from > Unicode to ASCII was attempted, and failed. Sure, but the fact that URLs have to be ASCII is not something that is enforced by the Unicode implementation. > I guess I would be more open to suggested changes if you had put them > into the patch manager at the time you've reviewed the patch... 
I didn't review the patch, only the summary... Don't have much time to look into these things closely right now, so all I can do is comment. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From barry@scottb.demon.co.uk Mon Dec 4 00:55:32 2000 From: barry@scottb.demon.co.uk (Barry Scott) Date: Mon, 4 Dec 2000 00:55:32 -0000 Subject: [Python-Dev] A house upon the sand In-Reply-To: <20001130181937.B21596@ludwig.cnri.reston.va.us> Message-ID: <000201c05d8c$e7a15b10$060210ac@private> I fully support Greg Ward's view. If string was removed I'd not update the old code but add in my own string module. Given the effort you guys went to to keep the C extension protocol the same (in the context of crashing on importing a 1.5 dll into 2.0) I'm amazed you think that string could be removed... Could you split the lib into blessed and backward compatibility sections? Then by some suitable mechanism I can choose the compatibility I need? Oh, and as for join: obviously a method of a list... ['thats','better'].join(' ') Barry From fredrik@pythonware.com Mon Dec 4 10:37:18 2000 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 4 Dec 2000 11:37:18 +0100 Subject: [Python-Dev] unit testing and Python regression test References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> Message-ID: <00e701c05dde$2d77c240$0900a8c0@SPIFF> andrew kuchling wrote: > Someone remembered my post of 23 Nov, I see... The only other test > framework I know of is the unittest.py inside Quixote, written because > we thought PyUnit was kind of clunky. the pythonware team agrees -- we've been using an internal reimplementation of Kent Beck's original Smalltalk work, but we're switching to unittest.py. > Obviously I think the Quixote unittest.py is the best choice for the stdlib. +1 from here.
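For readers comparing the frameworks: the stdlib module that Python eventually shipped as unittest descends from PyUnit's xUnit style. A minimal case in that style, sketched in modern Python:

```python
import unittest

class JoinTest(unittest.TestCase):
    def setUp(self):
        # shared fixture, rebuilt fresh for every test method
        self.words = ['a', 'b', 'c']

    def test_join(self):
        self.assertEqual(' '.join(self.words), 'a b c')

    def test_split_roundtrip(self):
        self.assertEqual('a b c'.split(), self.words)

# run the suite programmatically, quietly
suite = unittest.defaultTestLoader.loadTestsFromTestCase(JoinTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The setUp() fixture is the xUnit answer to the "shared setup scenario" problem that comes up again below in the doctest discussion.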
From mal@lemburg.com Mon Dec 4 11:14:20 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 04 Dec 2000 12:14:20 +0100 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> <3A28C8DF.E430484F@lemburg.com> <14889.10298.621133.961677@anthem.concentric.net> Message-ID: <3A2B7C8C.D6B889EE@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "M" == M writes: > > M> The PEP does define when and how __findattr__() is called, > M> but makes no statement about what it should do or return... > > Good point. I've clarified that in the PEP. > > M> Here's a slightly different idea: > > M> Given the name, I would expect it to go look for an attribute > M> and then return the attribute and its container (this doesn't > M> seem to be what you have in mind here, though). > > No, because some applications won't need a wrapped object. E.g. in > the Java bean example, it just returns the attribute (which is stored > with a slightly different name). I was thinking of a standardised helper which could then be used for all kinds of attribute retrieval techniques. Acquisition would be easy to do, access control too. In most cases __findattr__ would simply return (self, self.attrname). > M> An alternative approach given the semantics above would then be > M> to first try a __getattr__() lookup and revert to > M> __findattr__() in case this fails. > > I don't think this is as useful. What would that buy you that you > can't already do today? Forget that idea... *always* calling __findattr__ is the more useful way, just like you intended. > The key concept here is that you want to give the class first crack to > interpose on every attribute access. You want this hook to get called > before anybody else can get at, or set, your attributes. That gives > you (the class) total control to implement whatever policy is useful. Right. 
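The interposition being discussed can be approximated today with __getattr__/__setattr__; here is a hypothetical Bean-style sketch (not the proposed __findattr__ mechanism itself, which exists only as a patch):

```python
class Bean:
    """Bean-style access: reads and writes of the public name 'x'
    are interposed on and routed to a mangled storage slot '_x'."""

    def __init__(self, x):
        # write through __dict__ so __setattr__ is not triggered
        self.__dict__['_x'] = x

    def __getattr__(self, name):
        # only called when normal lookup fails, so '_x' reads
        # never recurse through here
        if name == 'x':
            return self.__dict__['_x']
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # called for *every* attribute binding
        if name == 'x':
            self.__dict__['_x'] = value
        else:
            self.__dict__[name] = value

b = Bean(1)
first = b.x        # routed through __getattr__ -> 1
b.x = 5            # routed through __setattr__, stored as _x
```

The asymmetry Barry's hook addresses is visible here: __setattr__ already sees every write, but __getattr__ only fires when normal lookup fails, so it does not get "first crack" at existing attributes.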
> M> I don't think there is any need to overload __setattr__() in > M> such a way, because you cannot be sure which object actually > M> gets the new attribute. > > M> By exposing the functionality using a new builtin, findattr(), > M> this could be used for all the examples you give too. > > No, because then people couldn't use the object in the normal > dot-notational way. Uhm, why not? -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gvwilson@nevex.com Mon Dec 4 14:40:58 2000 From: gvwilson@nevex.com (Greg Wilson) Date: Mon, 4 Dec 2000 09:40:58 -0500 Subject: [Python-Dev] Q: Python standard library re-org plans/schedule? In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us> Message-ID: Hi, everyone. A potential customer has asked whether there are any plans to re-organize and rationalize the Python standard library. If there are any firm plans, and a schedule (however tentative), I'd be grateful for a pointer. Thanks, Greg From barry@digicool.com Mon Dec 4 15:13:23 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Mon, 4 Dec 2000 10:13:23 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> Message-ID: <14891.46227.785856.307437@anthem.concentric.net> >>>>> "MvL" == Martin v Loewis writes: MvL> I agree that the current implementation is not MvL> thread-safe. To solve that, you'd need to associate with each MvL> instance not a single "infindattr" attribute, but a whole set MvL> of them - one per "thread of execution" (which would be a MvL> thread-id in most threading systems). Of course, that would MvL> need some cooperation from any thread scheme (including MvL> uthreads), which would need to provide an identification for MvL> a "calling context". I'm still catching up on several hundred emails over the weekend.
I had a sneaking suspicion that infindattr wasn't thread-safe, so I'm convinced this is a bug in the implementation. One approach might be to store the info in the thread state object (isn't that how the recursive repr stop flag is stored?) That would also save having to allocate an extra int for every instance (yuck) but might impose a bit more of a performance overhead. I'll work more on this later today. -Barry From jeremy@alum.mit.edu Mon Dec 4 15:23:10 2000 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Mon, 4 Dec 2000 10:23:10 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <00e701c05dde$2d77c240$0900a8c0@SPIFF> References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <00e701c05dde$2d77c240$0900a8c0@SPIFF> Message-ID: <14891.46814.359333.76720@bitdiddle.concentric.net> >>>>> "FL" == Fredrik Lundh writes: FL> andrew kuchling wrote: >> Someone remembered my post of 23 Nov, I see... The only other >> test framework I know of is the unittest.py inside Quixote, >> written because we thought PyUnit was kind of clunky. FL> the pythonware teams agree -- we've been using an internal FL> reimplementation of Kent Beck's original Smalltalk work, but FL> we're switching to unittest.py. Can you provide any specifics about what you like about unittest.py (perhaps as opposed to PyUnit)? Jeremy From guido@python.org Mon Dec 4 15:20:11 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 04 Dec 2000 10:20:11 -0500 Subject: [Python-Dev] Q: Python standard library re-org plans/schedule? In-Reply-To: Your message of "Mon, 04 Dec 2000 09:40:58 EST." References: Message-ID: <200012041520.KAA20979@cj20424-a.reston1.va.home.com> > Hi, everyone. A potential customer has asked whether there are any > plans to re-organize and rationalize the Python standard library. > If there are any firms plans, and a schedule (however tentative), > I'd be grateful for a pointer. 
Alas, none that I know of except the ineffable Python 3000 schedule. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Mon Dec 4 15:46:53 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Mon, 4 Dec 2000 10:46:53 -0500 Subject: [Python-Dev] Quixote unit testing docs (Was: unit testing) In-Reply-To: <14891.46814.359333.76720@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Mon, Dec 04, 2000 at 10:23:10AM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <00e701c05dde$2d77c240$0900a8c0@SPIFF> <14891.46814.359333.76720@bitdiddle.concentric.net> Message-ID: <20001204104653.A19387@kronos.cnri.reston.va.us> Prodded by Jeremy, I went and actually wrote some documentation for the Quixote unittest.py; please see . The HTML is from a manually hacked Library Reference, so ignore the broken image links and other formatting goofiness. In case anyone needs it, the LaTeX is in /files/python/. The plain text version comes out to around 290 lines; I can post it to this list if that's desired. --amk From pf@artcom-gmbh.de Mon Dec 4 17:59:54 2000 From: pf@artcom-gmbh.de (Peter Funk) Date: Mon, 4 Dec 2000 18:59:54 +0100 (MET) Subject: Tim Peter's doctest compared to Quixote unit testing (was Re: [Python-Dev] Quixote unit testing docs) In-Reply-To: <20001204104653.A19387@kronos.cnri.reston.va.us> from Andrew Kuchling at "Dec 4, 2000 10:46:53 am" Message-ID: Hi all, Andrew Kuchling: > ... I ... actually wrote some documentation for > the Quixote unittest.py; please see > . [...] > comes out to around 290 lines; I can post it to this list if that's > desired. After reading Andrew's docs, I think Quixote basically offers three additional features compared with Tim Peters' 'doctest': 1. integration of Skip Montanaro's code coverage analysis. 2.
the idea of Scenario objects useful to share the setup needed to test related functions or methods of a class (same start condition). 3. Some useful functions to check whether the result returned by some test fulfills certain properties, without having to be as explicit as cut'n'paste from the interactive interpreter session would have been. As I've pointed out before in private mail to Jeremy, I've used Tim Peters' 'doctest.py' to accomplish all testing of Python apps in our company. In doctest each doc string is an independent unit, which starts fresh. Sometimes this leads to duplicated setup stuff, which is needed to test each method of a set of related methods from a class. This is distracting if you intend the test cases to take their double role of being at the same time useful documentational examples for the intended use of the provided API. Tim_one: Do you read this? What do you think about the idea to add something like the following two functions to 'doctest': use_module_scenario() -- imports all objects created and preserved during execution of the module doc string examples. use_class_scenario() -- imports all objects created and preserved during the execution of doc string examples of a class. Only allowed in doc string examples of methods. This would make it easy to provide the same setup scenario to a group of related test cases. AFAI understand, doctest handles test-shutdown automatically, iff the doc string test examples leave no persistent resources behind. Regards, Peter From moshez@zadka.site.co.il Tue Dec 5 03:31:18 2000 From: moshez@zadka.site.co.il (Moshe Zadka) Date: Tue, 05 Dec 2000 05:31:18 +0200 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) of "Mon, 04 Dec 2000 10:13:23 EST."
<14891.46227.785856.307437@anthem.concentric.net> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> Message-ID: <20001205033118.9135CA817@darjeeling.zadka.site.co.il> > I'm still catching up on several hundred emails over the weekend. I > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm > convinced this is a bug in the implementation. One approach might be > to store the info in the thread state object I don't think this is a good idea -- continuations and coroutines might mess it up. Maybe the right thing is to mess with the *compilation* of __findattr__ so that it would call __setattr__ and __getattr__ with special flags that stop them from calling __findattr__? This is ugly, but I can't think of a better way. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tismer@tismer.com Mon Dec 4 18:35:19 2000 From: tismer@tismer.com (Christian Tismer) Date: Mon, 04 Dec 2000 20:35:19 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> Message-ID: <3A2BE3E7.60A8E220@tismer.com> Moshe Zadka wrote: > > > I'm still catching up on several hundred emails over the weekend. I > > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm > > convinced this is a bug in the implementation. One approach might be > > to store the info in the thread state object > > I don't think this is a good idea -- continuations and coroutines might > mess it up. Maybe the right thing is to mess with the *compilation* of > __findattr__ so that it would call __setattr__ and __getattr__ with > special flags that stop them from calling __findattr__? This is > ugly, but I can't think of a better way. 
Yeah, this is what I tried to say by "different machine state"; compiling different behavior in the case of a special method is an interesting idea. It is limited somewhat, since the changed system state is not inherited by called functions. But if __findattr__ performs its one, single task in its body alone, we are fine. still-thinking-of-alternatives - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From tismer@tismer.com Mon Dec 4 18:52:43 2000 From: tismer@tismer.com (Christian Tismer) Date: Mon, 04 Dec 2000 20:52:43 +0200 Subject: [Python-Dev] A house upon the sand References: <000201c05d8c$e7a15b10$060210ac@private> Message-ID: <3A2BE7FB.831F2F93@tismer.com> Barry Scott wrote: > > I fully support Greg Ward's view. If string was removed I'd not > update the old code but add in my own string module. > > Given the effort you guys went to to keep the C extension protocol the > same (in the context of crashing on importing a 1.5 dll into 2.0) I'm > amazed you think that string could be removed... > > Could you split the lib into blessed and backward compatibility sections? > Then by some suitable mechanism I can choose the compatibility I need? > > Oh and as for join obviously a method of a list... > > ['thats','better'].join(' ') The above is the way it is defined for JavaScript. But in JavaScript, the list join method performs an implicit str() on the list elements. As has been discussed some time ago, Python's lists are too versatile to justify a string-centric method. Marc André pointed out that one could do a reduction with the semantics of the "+" operator, but Guido said that he wouldn't like to see [2, 3, 5].join(7) being reduced to 2+7+3+7+5 == 24. That could only be avoided if there were a way to distinguish numeric addition from concatenation.
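The two spellings can be put side by side: Python's actual separator-method join, and a sketch of the "+"-reduction semantics MAL described, which indeed yields 24 for a hypothetical [2, 3, 5].join(7):

```python
from functools import reduce

# Python's actual spelling: the separator string is the receiver.
joined = ' '.join(['thats', 'better'])          # 'thats better'

# The reduction semantics sketched above: interleave the separator
# and fold with "+", whatever "+" means for the element type.
def join_reduce(items, sep):
    interleaved = items[:1] + [x for item in items[1:] for x in (sep, item)]
    return reduce(lambda a, b: a + b, interleaved)

numeric = join_reduce([2, 3, 5], 7)             # 2+7+3+7+5 == 24
texty = join_reduce(['thats', 'better'], ' ')   # 'thats better'
```

The sketch makes the objection concrete: because "+" is both addition and concatenation, the same list method would silently sum numbers, which is exactly the behavior Guido did not want.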
That could only be avoided if there were a way to distinguish numeric addition from concatenation. but-I-could-live-with-it - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From barry@digicool.com Mon Dec 4 21:23:00 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Mon, 4 Dec 2000 16:23:00 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> Message-ID: <14892.2868.982013.313562@anthem.concentric.net> >>>>> "CT" == Christian Tismer writes: CT> You want most probably do this: __findattr__ should not be CT> invoked again for this instance, with this attribute name, for CT> this "thread", until you are done. First, I think the rule should be "__findattr__ should not be invoked again for this instance, in this thread, until you are done". I.e. once in __findattr__, you want all subsequent attribute references to bypass findattr, because presumably, your instance now has complete control for all accesses in this thread. You don't want to limit it to just the currently named attribute. Second, if "this thread" is defined as _PyThreadState_Current, then we have a simple solution, as I mapped out earlier. We do a PyThreadState_GetDict() and store the instance in that dict on entry to __findattr__ and remove it on exit from __findattr__. If the instance can be found in the current thread's dict, we bypass __findattr__. >>>>> "MZ" == Moshe Zadka writes: MZ> I don't think this is a good idea -- continuations and MZ> coroutines might mess it up. You might be right, but I'm not sure. 
If we make __findattr__ thread safe according to the definition above, and if uthread/coroutine/continuation safety can be accomplished by the __findattr__ programmer's discipline, then I think that is enough. IOW, if we can tell the __findattr__ author to not relinquish the uthread explicitly during the __findattr__ call, we're cool. Oh, and as long as we're not somehow substantially reducing the utility of __findattr__ by making that restriction. What I worry about is re-entrancy that isn't under the programmer's control, like the Real Thread-safety problem. -Barry From barry@digicool.com Mon Dec 4 22:58:33 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Mon, 4 Dec 2000 17:58:33 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <3A2C0E0D.E042D026@tismer.com> Message-ID: <14892.8601.41178.81475@anthem.concentric.net> >>>>> "CT" == Christian Tismer writes: CT> Hmm. WHat do you think about Moshe's idea to change compiling CT> of the method? It has the nice advantage that there are no CT> Thread-safety problems by design. The only drawback is that CT> the contract of not-calling-myself only holds for this CT> function. I'm not sure I understand what Moshe was proposing. Moshe: are you saying that we should change the way the compiler works, so that it somehow recognizes this special case? I'm not sure I like that approach. I think I want something more runtime-y, but I'm not sure why (maybe just because I'm more comfortable mucking about in the run-time than in the compiler). -Barry From guido@python.org Mon Dec 4 23:16:17 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 04 Dec 2000 18:16:17 -0500 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: Your message of "Mon, 04 Dec 2000 16:23:00 EST." 
<14892.2868.982013.313562@anthem.concentric.net> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> Message-ID: <200012042316.SAA23081@cj20424-a.reston1.va.home.com> I'm unconvinced by the __findattr__ proposal as it now stands. - Do you really think that JimF would do away with ExtensionClasses if __findattr__ was introduced? I kinda doubt it. See [*footnote]. It seems that *using* __findattr__ is expensive (even if *not* using it is cheap :-). - Why is deletion not supported? What if you want to enforce a policy on deletions too? - It's ugly to use the same call for get and set. The examples indicate that it's not such a great idea: every example has *two* tests whether it's get or set. To share a policy, the proper thing to do is to write a method that either get or set can use. - I think it would be sufficient to *only* use __findattr__ for getattr -- __setattr__ and __delattr__ already have full control. The "one routine to implement the policy" argument doesn't really hold, I think. - The PEP says that the "in-findattr" flag is set on the instance. We've already determined that this is not thread-safe. This is not just a bug in the implementation -- it's a bug in the specification. I also find it ugly. But if we decide to do this, it can go in the thread-state -- if we ever add coroutines, we have to decide on what stuff to move from the thread state to the coroutine state anyway. - It's also easy to conceive situations where recursive __findattr__ calls on the same instance in the same thread/coroutine are perfectly desirable -- e.g. when __findattr__ ends up calling a method that uses a lot of internal machinery of the class. You don't want all the machinery to have to be aware of the fact that it may be called with __findattr__ on the stack and without it.
So perhaps it may be better to only treat the body of __findattr__ itself special, as Moshe suggested. What does Jython do here?

- The code examples require a *lot* of effort to understand. These are complicated issues! (I rewrote the Bean example using __getattr__ and __setattr__ and found no need for __findattr__; the __getattr__ version is simpler and easier to understand. I'm still studying the other __findattr__ examples.)

- The PEP really isn't that long, except for the code examples. I recommend reading the patch first -- the patch is probably shorter than any specification of the feature can be.

--Guido van Rossum (home page: http://www.python.org/~guido/)

[*footnote] There's an easy way (that few people seem to know) to cause __getattr__ to be called for virtually all attribute accesses: put *all* (user-visible) attributes in a separate dictionary. If you want to prevent access to this dictionary too (for Zope security enforcement), make it a global indexed by id() -- a destructor (__del__) can take care of deleting entries here.

From martin@loewis.home.cs.tu-berlin.de Mon Dec 4 23:10:43 2000 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 5 Dec 2000 00:10:43 +0100 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <14891.46227.785856.307437@anthem.concentric.net> (barry@digicool.com) References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> Message-ID: <200012042310.AAA00786@loewis.home.cs.tu-berlin.de>

> I'm still catching up on several hundred emails over the weekend. I
> had a sneaking suspicion that infindattr wasn't thread-safe, so I'm
> convinced this is a bug in the implementation. One approach might be
> to store the info in the thread state object (isn't that how the
> recursive repr stop flag is stored?)

Whether this works depends on how exactly the info is stored.
A single flag won't be sufficient, since multiple objects may have __findattr__ in progress in a given thread. With a set of instances, it would work, though.

Regards,
Martin

From martin@loewis.home.cs.tu-berlin.de Mon Dec 4 23:13:15 2000 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 5 Dec 2000 00:13:15 +0100 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <20001205033118.9135CA817@darjeeling.zadka.site.co.il> (message from Moshe Zadka on Tue, 05 Dec 2000 05:31:18 +0200) References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> Message-ID: <200012042313.AAA00832@loewis.home.cs.tu-berlin.de>

> I don't think this is a good idea -- continuations and coroutines
> might mess it up.

If coroutines and continuations operate preemptively, then they should present themselves as an implementation of the thread API; perhaps the thread API needs to be extended to allow for such a feature. If yielding control is in the hands of the implementation, it would be easy to rule out a context switch while findattr is in progress.

Regards,
Martin

From martin@loewis.home.cs.tu-berlin.de Mon Dec 4 23:19:37 2000 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 5 Dec 2000 00:19:37 +0100 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <14892.8601.41178.81475@anthem.concentric.net> (barry@digicool.com) References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <3A2C0E0D.E042D026@tismer.com> <14892.8601.41178.81475@anthem.concentric.net> Message-ID: <200012042319.AAA00877@loewis.home.cs.tu-berlin.de>

> I'm not sure I understand what Moshe was proposing.
Moshe: are you
> saying that we should change the way the compiler works, so that it
> somehow recognizes this special case? I'm not sure I like that
> approach. I think I want something more runtime-y, but I'm not sure
> why (maybe just because I'm more comfortable mucking about in the
> run-time than in the compiler).

I guess you are also uncomfortable with the problem that the compile-time analysis cannot "see" through levels of indirection. E.g. if findattr is written as

    return self.compute_attribute(real_attribute)

then compile-time analysis could figure out to call compute_attribute directly. However, that method may be implemented as

    def compute_attribute(self, name):
        return self.mapping[name]

where the access to mapping could not be detected statically.

Regards,
Martin

From tismer@tismer.com Mon Dec 4 21:35:09 2000 From: tismer@tismer.com (Christian Tismer) Date: Mon, 04 Dec 2000 23:35:09 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> Message-ID: <3A2C0E0D.E042D026@tismer.com>

"Barry A. Warsaw" wrote:
> 
> >>>>> "CT" == Christian Tismer writes:
> 
> CT> You want most probably do this: __findattr__ should not be
> CT> invoked again for this instance, with this attribute name, for
> CT> this "thread", until you are done.
> 
> First, I think the rule should be "__findattr__ should not be invoked
> again for this instance, in this thread, until you are done".

Maybe this is better. Surely easier. :)

[ThreadState solution - well fine so far]

> MZ> I don't think this is a good idea -- continuations and
> MZ> coroutines might mess it up.
> 
> You might be right, but I'm not sure.
> 
> If we make __findattr__ thread safe according to the definition above,
> and if uthread/coroutine/continuation safety can be accomplished by
> the __findattr__ programmer's discipline, then I think that is enough.
> IOW, if we can tell the __findattr__ author to not relinquish the
> uthread explicitly during the __findattr__ call, we're cool. Oh, and
> as long as we're not somehow substantially reducing the utility of
> __findattr__ by making that restriction.
> 
> What I worry about is re-entrancy that isn't under the programmer's
> control, like the Real Thread-safety problem.

Hmm. What do you think about Moshe's idea to change compiling of the method? It has the nice advantage that there are no Thread-safety problems by design. The only drawback is that the contract of not-calling-myself only holds for this function. I don't know how Threadstate scales up when there are more things like these invented.

Well, for the moment, the simple solution with Stackless would just be to let the interpreter recurse in this call, the same as it happens during __init__ and anything else that isn't easily turned into tail-recursion. It just blocks :-)

ciao - chris

-- 
Christian Tismer :^)
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com

From barry@digicool.com Tue Dec 5 02:54:23 2000 From: barry@digicool.com (Barry A.
Warsaw) Date: Mon, 4 Dec 2000 21:54:23 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> Message-ID: <14892.22751.921264.156010@anthem.concentric.net> >>>>> "GvR" == Guido van Rossum writes: GvR> - Do you really think that JimF would do away with GvR> ExtensionClasses if __findattr__ was intruduced? I kinda GvR> doubt it. See [*footnote]. It seems that *using* GvR> __findattr__ is expensive (even if *not* using is cheap :-). That's not even the real reason why JimF wouldn't stop using ExtensionClass. He's already got too much code invested in EC. However EC can be a big pill to swallow for some applications because it's a C extension (and because it has some surprising non-Pythonic side effects). In those situations, a pure Python approach, even though slower, is useful. GvR> - Why is deletion not supported? What if you want to enforce GvR> a policy on deletions too? It could be, without much work. GvR> - It's ugly to use the same call for get and set. The GvR> examples indicate that it's not such a great idea: every GvR> example has *two* tests whether it's get or set. To share a GvR> policy, the proper thing to do is to write a method that GvR> either get or set can use. I don't have strong feelings either way. GvR> - I think it would be sufficient to *only* use __findattr__ GvR> for getattr -- __setattr__ and __delattr__ already have full GvR> control. The "one routine to implement the policy" argument GvR> doesn't really hold, I think. What about the ability to use "normal" x.name attribute access syntax inside the hook? Let me guess your answer. :) GvR> - The PEP says that the "in-findattr" flag is set on the GvR> instance. We've already determined that this is not GvR> thread-safe. 
This is not just a bug in the implementation -- GvR> it's a bug in the specification. I also find it ugly. But GvR> if we decide to do this, it can go in the thread-state -- if GvR> we ever add coroutines, we have to decide on what stuff to GvR> move from the thread state to the coroutine state anyway. Right. That's where we've ended up in subsequent messages on this thread. GvR> - It's also easy to conceive situations where recursive GvR> __findattr__ calls on the same instance in the same GvR> thread/coroutine are perfectly desirable -- e.g. when GvR> __findattr__ ends up calling a method that uses a lot of GvR> internal machinery of the class. You don't want all the GvR> machinery to have to be aware of the fact that it may be GvR> called with __findattr__ on the stack and without it. Hmm, okay, I don't really understand your example. I suppose I'm envisioning __findattr__ as a way to provide an interface to clients of the class. Maybe it's a bean interface, maybe it's an acquisition interface or an access control interface. The internal machinery has to know something about how that interface is implemented, so whether __findattr__ is recursive or not doesn't seem to enter into it. And also, allowing __findattr__ to be recursive will just impose different constraints on the internal machinery methods, just like __setattr__ currently does. I.e. you better know that you're in __setattr__ and not do self.name type things, or you'll recurse forever. GvR> So perhaps it may be better to only treat the body of GvR> __findattr__ itself special, as Moshe suggested. Maybe I'm being dense, but I'm not sure exactly what this means, or how you would do this. GvR> What does Jython do here? It's not exactly equivalent, because Jython's __findattr__ can't call back into Python. GvR> - The code examples require a *lot* of effort to understand. GvR> These are complicated issues! 
(I rewrote the Bean example
GvR> using __getattr__ and __setattr__ and found no need for
GvR> __findattr__; the __getattr__ version is simpler and easier
GvR> to understand. I'm still studying the other __findattr__
GvR> examples.)

Is it simpler because you separated out the set and get behavior? If __findattr__ only did getting, I think it would be a lot simpler too (but I'd still be interested in seeing your __getattr__ only example). The acquisition examples are complicated because I wanted to support the same interface that EC's acquisition classes support. All that detail isn't necessary for example code.

GvR> - The PEP really isn't that long, except for the code
GvR> examples. I recommend reading the patch first -- the patch
GvR> is probably shorter than any specification of the feature can
GvR> be.

Would it be more helpful to remove the examples? If so, where would you put them? It's certainly useful to have examples someplace I think.

GvR> There's an easy way (that few people seem to know) to cause
GvR> __getattr__ to be called for virtually all attribute
GvR> accesses: put *all* (user-visible) attributes in a separate
GvR> dictionary. If you want to prevent access to this dictionary
GvR> too (for Zope security enforcement), make it a global indexed
GvR> by id() -- a destructor (__del__) can take care of deleting
GvR> entries here.

Presumably that'd be a module global, right? Maybe within Zope that could be protected, but outside of that, that global's always going to be accessible. So are methods, even if given private names. And I don't think that such code would be any more readable since instead of self.name you'd see stuff like

    def __getattr__(self, name):
        global instdict
        mydict = instdict[id(self)]
        obj = mydict[name]
        ...

    def __setattr__(self, name, val):
        global instdict
        mydict = instdict[id(self)]
        mydict[name] = val
        ...

and that /might/ be a problem with Jython currently, because id()'s may be reused.
And relying on __del__ may have unfortunate side effects when viewed in conjunction with garbage collection.

You're probably still unconvinced, but are you dead-set against it? I can try implementing __findattr__() as a pre-__getattr__ hook only. Then we can live with the current __setattr__() restrictions and see what the examples look like in that situation.

-Barry

From guido@python.org Tue Dec 5 12:54:20 2000 From: guido@python.org (Guido van Rossum) Date: Tue, 05 Dec 2000 07:54:20 -0500 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: Your message of "Mon, 04 Dec 2000 21:54:23 EST." <14892.22751.921264.156010@anthem.concentric.net> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> <14892.22751.921264.156010@anthem.concentric.net> Message-ID: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>

> >>>>> "GvR" == Guido van Rossum writes:
> 
> GvR> - Do you really think that JimF would do away with
> GvR> ExtensionClasses if __findattr__ was introduced? I kinda
> GvR> doubt it. See [*footnote]. It seems that *using*
> GvR> __findattr__ is expensive (even if *not* using is cheap :-).
> 
> That's not even the real reason why JimF wouldn't stop using
> ExtensionClass. He's already got too much code invested in EC.
> However EC can be a big pill to swallow for some applications because
> it's a C extension (and because it has some surprising non-Pythonic
> side effects). In those situations, a pure Python approach, even
> though slower, is useful.

Agreed. But I'm still hoping to find the silver bullet that lets Jim (and everybody else) do what ExtensionClass does without needing another extension.

> GvR> - Why is deletion not supported? What if you want to enforce
> GvR> a policy on deletions too?
> 
> It could be, without much work.
Then it should be -- except I prefer to do only getattr anyway, see below. > GvR> - It's ugly to use the same call for get and set. The > GvR> examples indicate that it's not such a great idea: every > GvR> example has *two* tests whether it's get or set. To share a > GvR> policy, the proper thing to do is to write a method that > GvR> either get or set can use. > > I don't have strong feelings either way. What does Jython do? I thought it only did set (hence the name :-). I think there's no *need* for findattr to catch the setattr operation, because __setattr__ *already* gets invoked on each set not just ones where the attr doesn't yet exist. > GvR> - I think it would be sufficient to *only* use __findattr__ > GvR> for getattr -- __setattr__ and __delattr__ already have full > GvR> control. The "one routine to implement the policy" argument > GvR> doesn't really hold, I think. > > What about the ability to use "normal" x.name attribute access syntax > inside the hook? Let me guess your answer. :) Aha! You got me there. Clearly the REAL reason for wanting __findattr__ is the no-recursive-calls rule -- which is also the most uncooked feature... Traditional getattr hooks don't need this as much because they don't get called when the attribute already exists; traditional setattr hooks deal with it by switching on the attribute name. The no-recursive-calls rule certainly SEEMS an attractive way around this. But I'm not sure that it really is... I need to get my head around this more. (The only reason I'm still posting this reply is to test the new mailing lists setup via mail.python.org.) > GvR> - The PEP says that the "in-findattr" flag is set on the > GvR> instance. We've already determined that this is not > GvR> thread-safe. This is not just a bug in the implementation -- > GvR> it's a bug in the specification. I also find it ugly. 
But
> GvR> if we decide to do this, it can go in the thread-state -- if
> GvR> we ever add coroutines, we have to decide on what stuff to
> GvR> move from the thread state to the coroutine state anyway.
> 
> Right. That's where we've ended up in subsequent messages on this thread.
> 
> GvR> - It's also easy to conceive situations where recursive
> GvR> __findattr__ calls on the same instance in the same
> GvR> thread/coroutine are perfectly desirable -- e.g. when
> GvR> __findattr__ ends up calling a method that uses a lot of
> GvR> internal machinery of the class. You don't want all the
> GvR> machinery to have to be aware of the fact that it may be
> GvR> called with __findattr__ on the stack and without it.
> 
> Hmm, okay, I don't really understand your example. I suppose I'm
> envisioning __findattr__ as a way to provide an interface to clients
> of the class. Maybe it's a bean interface, maybe it's an acquisition
> interface or an access control interface. The internal machinery has
> to know something about how that interface is implemented, so whether
> __findattr__ is recursive or not doesn't seem to enter into it.

But the class is also a client of itself, and not all cases where it is a client of itself are inside a findattr call. Take your bean example. Suppose your bean class also has a spam() method. The findattr code needs to account for this, e.g.:

    def __findattr__(self, name, *args):
        if name == "spam" and not args:
            return self.spam
        ...original body here...

Or you have to add a _get_spam() method:

    def _get_spam(self):
        return self.spam

Either solution gets tedious if there are a lot of methods; instead, findattr could check if the attr is defined on the class, and then return that:

    def __findattr__(self, name, *args):
        if not args and name[0] != '_' and hasattr(self.__class__, name):
            return getattr(self, name)
        ...original body here...

Anyway, let's go back to the spam method. Suppose it references self.foo. The findattr machinery will access it. Fine.
But now consider another attribute (bar) with _set_bar() and _get_bar() methods that do a little more. Maybe bar is really calculated from the value of self.foo. Then _get_bar cannot use self.foo (because it's inside findattr so findattr won't resolve it, and self.foo doesn't actually exist on the instance) so it has to use self.__myfoo. Fine -- after all this is inside a _get_* handler, which knows it's being called from findattr. But what if, instead of needing self.foo, _get_bar wants to call self.spam() in order? Then self.spam() is being called from inside findattr, so when it access self.foo, findattr isn't used -- and it fails with an AttributeError! Sorry for the long detour, but *that's* the problem I was referring to. I think the scenario is quite realistic. > And also, allowing __findattr__ to be recursive will just impose > different constraints on the internal machinery methods, just like > __setattr__ currently does. I.e. you better know that you're in > __setattr__ and not do self.name type things, or you'll recurse > forever. Actually, this is usually solved by having __setattr__ check for specific names only, and for others do self.__dict__[name] = value; that way, recursive __setattr__ calls are okay. Similar for __getattr__ (which has to raise AttributeError for unrecognized names). > GvR> So perhaps it may be better to only treat the body of > GvR> __findattr__ itself special, as Moshe suggested. > > Maybe I'm being dense, but I'm not sure exactly what this means, or > how you would do this. Read Moshe's messages (and Martin's replies) again. I don't care that much for it so I won't explain it again. > GvR> What does Jython do here? > > It's not exactly equivalent, because Jython's __findattr__ can't call > back into Python. I'd say that Jython's __findattr__ is an entirely different beast than what we have here. 
Its main purpose in life appears to be a getattr equivalent that returns NULL instead of raising an exception when the attribute isn't found -- which is reasonable because from within Java, testing for null is much cheaper than checking for an exception, and you often need to look whether a given attribute exists and do some default action if not. (In fact, I'd say that CPython could also use a findattr of this kind...)

This is really too bad. Based on the name similarity and things I thought you'd said in private before, I thought that they would be similar. Then the experience with Jython would be a good argument for adding a findattr hook to CPython. But now that they are totally different beasts it doesn't help at all.

> GvR> - The code examples require a *lot* of effort to understand.
> GvR> These are complicated issues! (I rewrote the Bean example
> GvR> using __getattr__ and __setattr__ and found no need for
> GvR> __findattr__; the __getattr__ version is simpler and easier
> GvR> to understand. I'm still studying the other __findattr__
> GvR> examples.)
> 
> Is it simpler because you separated out the set and get behavior? If
> __findattr__ only did getting, I think it would be a lot simpler too
> (but I'd still be interested in seeing your __getattr__ only
> example).

Here's my getattr example. It's more lines of code, but cleaner IMHO:

    class Bean:
        def __init__(self, x):
            self.__myfoo = x

        def __isprivate(self, name):
            return name.startswith('_')

        def __getattr__(self, name):
            if self.__isprivate(name):
                raise AttributeError, name
            return getattr(self, "_get_" + name)()

        def __setattr__(self, name, value):
            if self.__isprivate(name):
                self.__dict__[name] = value
            else:
                return getattr(self, "_set_" + name)(value)

        def _set_foo(self, x):
            self.__myfoo = x

        def _get_foo(self):
            return self.__myfoo

    b = Bean(3)
    print b.foo
    b.foo = 9
    print b.foo

> The acquisition examples are complicated because I wanted
> to support the same interface that EC's acquisition classes support.
> All that detail isn't necessary for example code.

I *still* have to study the examples... :-( Will do next.

> GvR> - The PEP really isn't that long, except for the code
> GvR> examples. I recommend reading the patch first -- the patch
> GvR> is probably shorter than any specification of the feature can
> GvR> be.
> 
> Would it be more helpful to remove the examples? If so, where would
> you put them? It's certainly useful to have examples someplace I
> think.

No, my point is that the examples need more explanation. Right now the EC example is over 200 lines of brain-exploding code! :-)

> GvR> There's an easy way (that few people seem to know) to cause
> GvR> __getattr__ to be called for virtually all attribute
> GvR> accesses: put *all* (user-visible) attributes in a separate
> GvR> dictionary. If you want to prevent access to this dictionary
> GvR> too (for Zope security enforcement), make it a global indexed
> GvR> by id() -- a destructor (__del__) can take care of deleting
> GvR> entries here.
> 
> Presumably that'd be a module global, right? Maybe within Zope that
> could be protected,

Yes.

> but outside of that, that global's always going to
> be accessible. So are methods, even if given private names.

Aha! Another thing that I expect has been on your agenda for a long time, but which isn't explicit in the PEP (AFAICT): findattr gives *total* control over attribute access, unlike __getattr__ and __setattr__ and private name mangling, which can all be defeated. And this may be one of the things that Jim is after with ExtensionClasses in Zope. Although I believe that in DTML, he doesn't trust this: he uses source-level (or bytecode-level) transformations to turn all X.Y operations into a call into a security manager. So I'm not sure that the argument is very strong.
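The side-table trick discussed in this exchange reads, in modern Python, roughly like this. A minimal sketch: the names `instdict` and `Guarded` are invented for the illustration and are not from Zope or the PEP 231 patch.

```python
# All user-visible attributes live in a module-level dict keyed by id(),
# so every instance attribute lookup falls through to __getattr__.

instdict = {}   # id(instance) -> {attribute name: value}

class Guarded:
    def __init__(self):
        instdict[id(self)] = {}

    def __getattr__(self, name):
        # Called for every attribute missing from self.__dict__ --
        # which is all of them, since __setattr__ never touches it.
        try:
            return instdict[id(self)][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        instdict[id(self)][name] = value

    def __del__(self):
        # The destructor removes the side-table entry, as described above.
        instdict.pop(id(self), None)

g = Guarded()
g.x = 42
print(g.x)    # -> 42
```

Note that this inherits exactly the caveats raised in the discussion: the table entry's lifetime depends on __del__ running, and id() values can be reused.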
> And I
> don't think that such code would be any more readable since instead of
> self.name you'd see stuff like
> 
>     def __getattr__(self, name):
>         global instdict
>         mydict = instdict[id(self)]
>         obj = mydict[name]
>         ...
> 
>     def __setattr__(self, name, val):
>         global instdict
>         mydict = instdict[id(self)]
>         mydict[name] = val
>         ...
> 
> and that /might/ be a problem with Jython currently, because id()'s
> may be reused. And relying on __del__ may have unfortunate side
> effects when viewed in conjunction with garbage collection.

Fair enough. I withdraw the suggestion, and propose restricted execution instead. There, you can use Bastions -- which have problems of their own, but you do get total control.

> You're probably still unconvinced, but are you dead-set against
> it? I can try implementing __findattr__() as a pre-__getattr__ hook
> only. Then we can live with the current __setattr__() restrictions
> and see what the examples look like in that situation.

I am dead-set against introducing a feature that I don't fully understand. Let's continue this discussion.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bckfnn@worldonline.dk Tue Dec 5 15:40:10 2000 From: bckfnn@worldonline.dk (Finn Bock) Date: Tue, 05 Dec 2000 15:40:10 GMT Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> <14892.22751.921264.156010@anthem.concentric.net> <200012051254.HAA25502@cj20424-a.reston1.va.home.com> Message-ID: <3a2d0c29.242749@smtp.worldonline.dk>

On Tue, 05 Dec 2000 07:54:20 -0500, you wrote:

>> GvR> What does Jython do here?
>> 
>> It's not exactly equivalent, because Jython's __findattr__ can't call
>> back into Python.
>
>I'd say that Jython's __findattr__ is an entirely different beast than
>what we have here. Its main purpose in life appears to be a
>getattr equivalent that returns NULL instead of raising an exception
>when the attribute isn't found -- which is reasonable because from
>within Java, testing for null is much cheaper than checking for an
>exception, and you often need to look whether a given attribute exists
>and do some default action if not.

Correct. It is also the method to override when making a new builtin type, and it will be called on such a type subclass regardless of the presence of any __getattr__ hook and __dict__ content. So I think it has some of the properties which Barry wants.

regards,
finn

From greg@cosc.canterbury.ac.nz Tue Dec 5 23:07:06 2000 From: greg@cosc.canterbury.ac.nz (greg@cosc.canterbury.ac.nz) Date: Wed, 06 Dec 2000 12:07:06 +1300 (NZDT) Subject: Are you all mad? (Re: [Python-Dev] PEP 231, __findattr__()) In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com> Message-ID: <200012052307.MAA01082@s454.cosc.canterbury.ac.nz>

I can't believe you're even considering a magic dynamically-scoped flag that invisibly changes the semantics of fundamental operations. To me the idea is utterly insane! If I understand correctly, the problem is that if you do something like

    def __findattr__(self, name):
        if name == 'spam':
            return self.__dict__['spam']

then self.__dict__ is going to trigger a recursive __findattr__ call.

It seems to me that if you're going to have some sort of hook that is always called on any x.y reference, you need some way of explicitly bypassing it and getting at the underlying machinery. I can think of a couple of ways:

1) Make the __dict__ attribute special, so that accessing it always bypasses __findattr__.

2) Provide some other way of getting direct access to the attributes of an object, e.g. new builtins called peekattr() and pokeattr().
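Neither builtin was ever added; a rough sketch of option 2 in today's Python, where plain __dict__ access is not hooked. The names peekattr and pokeattr are taken from the proposal above and are hypothetical, not real builtins.

```python
def peekattr(obj, name):
    # Read straight from the instance dict, bypassing __getattr__-style hooks.
    try:
        return obj.__dict__[name]
    except KeyError:
        raise AttributeError(name)

def pokeattr(obj, name, value):
    # Write straight into the instance dict, bypassing __setattr__ hooks.
    obj.__dict__[name] = value

class C:
    pass

c = C()
pokeattr(c, 'spam', 1)
print(peekattr(c, 'spam'))   # -> 1
```

Under the proposed __findattr__, even the obj.__dict__ access inside these helpers would go through the hook, so they would additionally need option 1's special treatment of __dict__.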
This assumes that you always know when you write a particular access whether you want it to be a "normal" or "special" one, so that you can use the appropriate mechanism. Are there any cases where this is not true? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From barry@digicool.com Wed Dec 6 02:20:40 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Tue, 5 Dec 2000 21:20:40 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> <14892.22751.921264.156010@anthem.concentric.net> <200012051254.HAA25502@cj20424-a.reston1.va.home.com> <3a2d0c29.242749@smtp.worldonline.dk> Message-ID: <14893.41592.701128.58110@anthem.concentric.net> >>>>> "FB" == Finn Bock writes: FB> Correct. It is also the method to override when making a new FB> builtin type and it will be called on such a type subclass FB> regardless of the presence of any __getattr__ hook and FB> __dict__ content. So I think it have some of the properties FB> which Barry wants. We had a discussion about this PEP at our group meeting today. Rather than write it all twice, I'm going to try to update the PEP and patch tonight. I think what we came up with will solve most of the problems raised, and will be implementable in Jython (I'll try to work up a Jython patch too, if I don't fall asleep first :) -Barry From barry@digicool.com Wed Dec 6 02:54:36 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Tue, 5 Dec 2000 21:54:36 -0500 Subject: Are you all mad? 
(Re: [Python-Dev] PEP 231, __findattr__()) References: <200012051254.HAA25502@cj20424-a.reston1.va.home.com> <200012052307.MAA01082@s454.cosc.canterbury.ac.nz> Message-ID: <14893.43628.61063.905227@anthem.concentric.net> >>>>> "greg" == writes: | 1) Make the __dict__ attribute special, so that accessing | it always bypasses __findattr__. You're not far from what I came up with right after our delicious lunch. We're going to invent a new protocol which passes __dict__ into the method as an argument. That way self.__dict__ doesn't need to be special cased at all because you can get at all the attributes via a local! So no recursion stop hack is necessary. More in the updated PEP and patch. -Barry From dgoodger@bigfoot.com Thu Dec 7 04:33:33 2000 From: dgoodger@bigfoot.com (David Goodger) Date: Wed, 06 Dec 2000 23:33:33 -0500 Subject: [Python-Dev] unit testing and Python regression test Message-ID: There is another unit testing implementation out there, OmPyUnit, available from: http://www.objectmentor.com/freeware/downloads.html -- David Goodger dgoodger@bigfoot.com Open-source projects: - The Go Tools Project: http://gotools.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net (soon!) From fdrake@users.sourceforge.net Thu Dec 7 06:26:54 2000 From: fdrake@users.sourceforge.net (Fred L. 
Drake) Date: Wed, 6 Dec 2000 22:26:54 -0800 Subject: [Python-Dev] [development doc updates] Message-ID: <200012070626.WAA22103@orbital.p.sourceforge.net>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Lots of small changes, but most important, more DOM documentation:

    http://python.sourceforge.net/devel-docs/lib/module-xml.dom.html

From guido@python.org Thu Dec 7 17:48:53 2000 From: guido@python.org (Guido van Rossum) Date: Thu, 07 Dec 2000 12:48:53 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons Message-ID: <200012071748.MAA26523@cj20424-a.reston1.va.home.com>

After perusing David Ascher's proposal, several versions of his patches, and hundreds of emails exchanged on this subject (almost all of them dated April or May of 1998), I've produced a reasonable semblance of PEP 207. Get it from CVS or here on the web:

    http://python.sourceforge.net/peps/pep-0207.html

I'd like to hear your comments, praise, and criticisms! The PEP still needs work; in particular, the minority point of view back then (that comparisons should return only Boolean results) is not adequately represented (but I *did* work in a reference to tabnanny, to ensure Tim's support :-). I'd like to work on a patch next, but I think there will be interference with Neil's coercion patch. I'm not sure how to resolve that yet; maybe I'll just wait until Neil's coercion patch is checked in.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido@python.org Thu Dec 7 17:54:51 2000 From: guido@python.org (Guido van Rossum) Date: Thu, 07 Dec 2000 12:54:51 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) Message-ID: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>

I'm maybe about three quarters of the way through writing PEP 230 -- far enough along to be asking for comments.
Get it from CVS or go to: http://python.sourceforge.net/peps/pep-0230.html A prototype implementation in Python is included in the PEP; I think this shows that the implementation is not too complex (Paul Prescod's fear about my proposal). This is pretty close to what I proposed earlier (Nov 5), except that I have added warning category classes (inspired by Paul's proposal). This class also serves as the exception to be raised when warnings are turned into exceptions. Do I need to include a discussion of Paul's counter-proposal and why I rejected it? --Guido van Rossum (home page: http://www.python.org/~guido/) From Barrett@stsci.edu Thu Dec 7 22:49:02 2000 From: Barrett@stsci.edu (Paul Barrett) Date: Thu, 7 Dec 2000 17:49:02 -0500 (EST) Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays Message-ID: <14896.1191.240597.632888@nem-srvr.stsci.edu> What is the status of PEP 209? I see David Ascher is the champion of this PEP, but nothing has been written up. Is the intention of this PEP to make the current Numeric a built-in feature of Python or to re-implement and replace the current Numeric module? The reason that I ask these questions is because I'm working on a prototype of a new N-dimensional Array module which I call Numeric 2. This new module will be much more extensible than the current Numeric. For example, new array types and universal functions can be loaded or imported on demand. We also intend to implement a record (or C-structure) type, because 1-D arrays or lists of records are a common data structure for storing photon events in astronomy and related fields. The current Numeric does not handle record types efficiently, particularly when the data type is not aligned and is in non-native endian format. To handle such data, temporary arrays must be created and alignment and byte-swapping done on them. Numeric 2 does such pre- and post-processing inside the inner-most loop which is more efficient in both time and memory. 
It also does type conversion at this level which is consistent with that proposed for PEP 208. Since many scientific users would like direct access to the array data via C pointers, we have investigated using the buffer object. We have not had much success with it, because of its implementation. I have scanned the python-dev mailing list for discussions of this issue and found that it now appears to be deprecated. My opinion on this is that a new _fundamental_ built-in type should be created for memory allocation with features and an interface similar to the _mmap_ object. I'll call this a _malloc_ object. This would allow Numeric 2 to use either object interchangeably depending on the circumstance. The _string_ type could also benefit from this new object by using a read-only version of it. Since it's an object, its memory area should be safe from inadvertent deletion. Because of these and other new features in Numeric 2, I have a keen interest in the status of PEPs 207, 208, 211, 225, and 228; and also in the proposed buffer object. I'm willing to implement this new _malloc_ object if members of the python-dev list are in agreement. Actually I see no alternative, given the current design of Numeric 2, since the Array class will initially be written completely in Python and will need a mutable memory buffer, while the _string_ type is meant to be a read-only object. All comments welcome. -- Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From DavidA@ActiveState.com Fri Dec 8 01:13:04 2000 From: DavidA@ActiveState.com (David Ascher) Date: Thu, 7 Dec 2000 17:13:04 -0800 (Pacific Standard Time) Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays In-Reply-To: <14896.1191.240597.632888@nem-srvr.stsci.edu> Message-ID: On Thu, 7 Dec 2000, Paul Barrett wrote: > What is the status of PEP 209?
I see David Ascher is the champion of > this PEP, but nothing has been written up. Is the intention of this I put my name on the PEP just to make sure it wasn't forgotten. If someone wants to champion it, their name should go on it. --david From guido@python.org Fri Dec 8 16:10:50 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Dec 2000 11:10:50 -0500 Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays In-Reply-To: Your message of "Thu, 07 Dec 2000 17:49:02 EST." <14896.1191.240597.632888@nem-srvr.stsci.edu> References: <14896.1191.240597.632888@nem-srvr.stsci.edu> Message-ID: <200012081610.LAA30679@cj20424-a.reston1.va.home.com> > What is the status of PEP 209? I see David Ascher is the champion of > this PEP, but nothing has been written up. Is the intention of this > PEP to make the current Numeric a built-in feature of Python or to > re-implement and replace the current Numeric module? David has already explained why his name is on it -- basically, David's name is on several PEPs but he doesn't currently have any time to work on these, so other volunteers are most welcome to join. It is my understanding that the current Numeric is sufficiently messy in implementation and controversial in semantics that it would not be a good basis to start from. However, I do think that a basic multi-dimensional array object would be a welcome addition to core Python. > The reason that I ask these questions is because I'm working on a > prototype of a new N-dimensional Array module which I call Numeric 2. > This new module will be much more extensible than the current Numeric. > For example, new array types and universal functions can be loaded or > imported on demand. We also intend to implement a record (or > C-structure) type, because 1-D arrays or lists of records are a common > data structure for storing photon events in astronomy and related > fields. 
I'm not familiar with the use of computers in astronomy and related fields, so I'll take your word for that! :-) > The current Numeric does not handle record types efficiently, > particularly when the data type is not aligned and is in non-native > endian format. To handle such data, temporary arrays must be created > and alignment and byte-swapping done on them. Numeric 2 does such > pre- and post-processing inside the inner-most loop which is more > efficient in both time and memory. It also does type conversion at > this level which is consistent with that proposed for PEP 208. > > Since many scientific users would like direct access to the array data > via C pointers, we have investigated using the buffer object. We have > not had much success with it, because of its implementation. I have > scanned the python-dev mailing list for discussions of this issue and > found that it now appears to be deprecated. Indeed. I think it's best to leave the buffer object out of your implementation plans. There are several problems with it, and one of the backburner projects is to redesign it to be much more to the point (providing less, not more functionality). > My opinion on this is that a new _fundamental_ built-in type should be > created for memory allocation with features and an interface similar > to the _mmap_ object. I'll call this a _malloc_ object. This would > allow Numeric 2 to use either object interchangeably depending on the > circumstance. The _string_ type could also benefit from this new > object by using a read-only version of it. Since its an object, it's > memory area should be safe from inadvertent deletion. Interesting. I'm actually not sufficiently familiar with mmap to comment. But would the existing array module's array object be at all useful? You can get to the raw bytes in C (using the C buffer API, which is not deprecated) and it is extensible. 
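[The array object Guido mentions can be exercised from Python as well; a minimal sketch follows. `memoryview` is the modern descendant of the buffer interface, so this is an anachronism relative to the discussion, shown only to illustrate the idea of typed raw memory.]

```python
import array

a = array.array('d', [0.0] * 8)    # eight C doubles in one contiguous block
addr, nitems = a.buffer_info()     # raw address and element count, for C code

view = memoryview(a)               # zero-copy view over the same memory
view[0] = 3.5                      # writes through to the array's buffer
print(a[0])                        # → 3.5
```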
> Because of these and other new features in Numeric 2, I have a keen > interest in the status of PEPs 207, 208, 211, 225, and 228; and also > in the proposed buffer object. Here are some quick comments on the mentioned PEPs.

207: Rich Comparisons. This will go into Python 2.1. (I just finished the first draft of the PEP, please read it and comment.)

208: Reworking the Coercion Model. This will go into Python 2.1. Neil Schemenauer has mostly finished the patches already. Please comment.

211: Adding New Linear Algebra Operators (Greg Wilson). This is unlikely to go into Python 2.1. I don't like the idea much. If you disagree, please let me know! (Also, a choice has to be made between 211 and 225; I don't want to accept both, so until 225 is rejected, 211 is in limbo.)

225: Elementwise/Objectwise Operators (Zhu, Lielens). This will definitely not go into Python 2.1. It adds too many new operators.

228: Reworking Python's Numeric Model. This is a total pie-in-the-sky PEP, and this kind of change is not likely to happen before Python 3000.

> I'm willing to implement this new _malloc_ object if members of the > python-dev list are in agreement. Actually I see no alternative, > given the current design of Numeric 2, since the Array class will > initially be written completely in Python and will need a mutable > memory buffer, while the _string_ type is meant to be a read-only > object. Would you be willing to take over authorship of PEP 209? David Ascher and the Numeric Python community will thank you. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Dec 8 18:43:39 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Dec 2000 13:43:39 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: Your message of "Thu, 30 Nov 2000 17:46:52 EST."
References: Message-ID: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> After the last round of discussion, I was left with the idea that the best thing we could do to help destructive iteration is to introduce a {}.popitem() that returns an arbitrary (key, value) pair and deletes it. I wrote about this: > > One more concern: if you repeatedly remove the *first* item, the hash > > table will start looking lopsided. Since we don't resize the hash > > table on deletes, maybe picking an item at random (but not using an > > expensive random generator!) would be better. and Tim replied: > Which is the reason SETL doesn't specify *which* set item is removed: if > you always start looking at "the front" of a dict that's being consumed, the > dict fills with turds without shrinking, you skip over them again and again, > and consuming the entire dict is still quadratic time. > > Unfortunately, while using a random start point is almost always quicker > than that, the expected time for consuming the whole dict remains quadratic. > > The clearest way around that is to save a per-dict search finger, recording > where the last search left off. Start from its current value. Failure if > it wraps around. This is linear time in non-pathological cases (a > pathological case is one in which it isn't linear time). I've implemented this, except I use a static variable for the finger instead of a per-dict finger. I'm concerned about adding 4-8 extra bytes to each dict object for a feature that most dictionaries never need. So, instead, I use a single shared finger. This works just as well as long as this is used for a single dictionary. For multiple dictionaries (either used by the same thread or in different threads), it'll work almost as well, although it's possible to make up a pathological example that would work quadratically. An easy example is to call popitem() for two identical dictionaries in lock step. Comments please!
We could:

- Live with the pathological cases.

- Forget the whole thing; and then also forget about firstkey() etc. which has the same problem only worse.

- Fix the algorithm. Maybe jumping criss-cross through the hash table like lookdict does would improve that; but I don't understand the math used for that ("Cycle through GF(2^n)-{0}" ???).

I've placed a patch on SourceForge: http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470 The algorithm is:

    static PyObject *
    dict_popitem(dictobject *mp, PyObject *args)
    {
        static int finger = 0;
        int i;
        dictentry *ep;
        PyObject *res;

        if (!PyArg_NoArgs(args))
            return NULL;
        if (mp->ma_used == 0) {
            PyErr_SetString(PyExc_KeyError,
                            "popitem(): dictionary is empty");
            return NULL;
        }
        i = finger;
        if (i >= mp->ma_size)
            ir = 0;
        while ((ep = &mp->ma_table[i])->me_value == NULL) {
            i++;
            if (i >= mp->ma_size)
                i = 0;
        }
        finger = i+1;
        res = PyTuple_New(2);
        if (res != NULL) {
            PyTuple_SET_ITEM(res, 0, ep->me_key);
            PyTuple_SET_ITEM(res, 1, ep->me_value);
            Py_INCREF(dummy);
            ep->me_key = dummy;
            ep->me_value = NULL;
            mp->ma_used--;
        }
        return res;
    }

--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Dec 8 18:51:49 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Dec 2000 13:51:49 -0500 Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use Message-ID: <200012081851.NAA32254@cj20424-a.reston1.va.home.com> Moshe proposes to add an overridable function sys.displayhook(obj) which will be called by the interpreter for the PRINT_EXPR opcode, instead of hardcoding the behavior. The default implementation will of course have the current behavior, but this makes it much simpler to experiment with alternatives, e.g. using str() instead of repr() (or to choose between str() and repr() based on the type). Moshe has asked me to pronounce on this PEP. I've thought about it, and I'm now all for it. Moshe (or anyone else), please submit a patch to SF that shows the complete implementation!
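[A sketch of what such a hook could look like from Python. sys.displayhook only fires for expression statements evaluated interactively, but the function can be called directly; this is an illustration of the idea, not Moshe's patch.]

```python
import sys

def str_displayhook(value):
    # Like the default hook, but display values with str() instead of repr().
    if value is not None:
        sys.stdout.write(str(value) + "\n")

sys.displayhook = str_displayhook
# At an interactive prompt, the expression  'a\nb'  would now display as
# two lines of text instead of the single-line repr "'a\\nb'".
```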
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Fri Dec 8 19:06:50 2000 From: tim.one@home.com (Tim Peters) Date: Fri, 8 Dec 2000 14:06:50 -0500 Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: [Guido, on sharing a search finger and getting worse-than-linear behavior in a simple test case] See my reply on SourceForge (crossed in the mails). I predict that fixing this in an acceptable way (not bulletproof, but linear-time for all predictably common cases) is a two-character change. Surprise, although maybe I'm hallucinating (would someone please confirm?): when I went to the SF patch manager page to look for your patch (using the Open Patches view), I couldn't find it. My guess is that if there are "too many" patches to fit on one screen, then unlike the SF *bug* manager, you don't get any indication that more patches exist or any control to go to the next page. From barry@digicool.com Fri Dec 8 19:18:26 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Fri, 8 Dec 2000 14:18:26 -0500 Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: <14897.13314.469255.853298@anthem.concentric.net> >>>>> "TP" == Tim Peters writes: TP> Surprise, although maybe I'm hallucinating (would someone TP> please confirm?): when I went to the SF patch manager page to TP> look for your patch (using the Open Patches view), I couldn't TP> find it. My guess is that if there are "too many" patches to TP> fit on one screen, then unlike the SF *bug* manager, you don't TP> get any indication that more patches exist or any control to TP> go to the next page. I haven't checked recently, but this was definitely true a few weeks ago. I think I even submitted an admin request on it, but I don't remember for sure. 
-Barry From Barrett@stsci.edu Fri Dec 8 21:22:39 2000 From: Barrett@stsci.edu (Paul Barrett) Date: Fri, 8 Dec 2000 16:22:39 -0500 (EST) Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays In-Reply-To: <200012081610.LAA30679@cj20424-a.reston1.va.home.com> References: <14896.1191.240597.632888@nem-srvr.stsci.edu> <200012081610.LAA30679@cj20424-a.reston1.va.home.com> Message-ID: <14897.10309.686024.254701@nem-srvr.stsci.edu> Guido van Rossum writes: > > What is the status of PEP 209? I see David Ascher is the champion of > > this PEP, but nothing has been written up. Is the intention of this > > PEP to make the current Numeric a built-in feature of Python or to > > re-implement and replace the current Numeric module? > > David has already explained why his name is on it -- basically, > David's name is on several PEPs but he doesn't currently have any time > to work on these, so other volunteers are most welcome to join. > > It is my understanding that the current Numeric is sufficiently messy > in implementation and controversial in semantics that it would not be > a good basis to start from. That is our (Rick's, Perry's, and my) belief also. > However, I do think that a basic multi-dimensional array object would > be a welcome addition to core Python. That's reassuring. > Indeed. I think it's best to leave the buffer object out of your > implementation plans. There are several problems with it, and one of > the backburner projects is to redesign it to be much more to the point > (providing less, not more functionality). I agree and have already made the decision to leave it out. > > My opinion on this is that a new _fundamental_ built-in type should be > > created for memory allocation with features and an interface similar > > to the _mmap_ object. I'll call this a _malloc_ object. This would > > allow Numeric 2 to use either object interchangeably depending on the > > circumstance.
The _string_ type could also benefit from this new > > object by using a read-only version of it. Since its an object, it's > > memory area should be safe from inadvertent deletion. > > Interesting. I'm actually not sufficiently familiar with mmap to > comment. But would the existing array module's array object be at all > useful? You can get to the raw bytes in C (using the C buffer API, > which is not deprecated) and it is extensible. I tried using this but had problems. I'll look into it again. > > Because of these and other new features in Numeric 2, I have a keen > > interest in the status of PEPs 207, 208, 211, 225, and 228; and also > > in the proposed buffer object. > > Here are some quick comments on the mentioned PEPs. I've got these PEPs on my desk and will comment on them when I can. > > I'm willing to implement this new _malloc_ object if members of the > > python-dev list are in agreement. Actually I see no alternative, > > given the current design of Numeric 2, since the Array class will > > initially be written completely in Python and will need a mutable > > memory buffer, while the _string_ type is meant to be a read-only > > object. > > Would you be willing to take over authorship of PEP 209? David Ascher > and the Numeric Python community will thank you. Yes, I'd gladly wield vast and inconsiderate power over unsuspecting pythoneers. ;-) -- Paul From guido@python.org Fri Dec 8 22:58:03 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 08 Dec 2000 17:58:03 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Thu, 07 Dec 2000 12:54:51 EST." <200012071754.MAA26557@cj20424-a.reston1.va.home.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> Message-ID: <200012082258.RAA02389@cj20424-a.reston1.va.home.com> Nobody seems to care much about the warnings PEP so far. What's up? Are you all too busy buying presents for the holidays? Then get me some too, please? 
:-) > http://python.sourceforge.net/peps/pep-0230.html I've now produced a prototype implementation for the C code: http://sourceforge.net/patch/?func=detailpatch&patch_id=102715&group_id=5470 Issues: - This defines a C API PyErr_Warn(category, message) instead of Py_Warn(message, category) as the PEP proposes. I actually like this better: it's consistent with PyErr_SetString() etc. rather than with the Python warn(message[, category]) function. - This calls the Python module from C. We'll have to see if this is fast enough. I wish I could postpone the import of warnings.py until the first call to PyErr_Warn(), but unfortunately the warning category classes must be initialized first (so they can be passed into PyErr_Warn()). The current version of warnings.py imports rather a lot of other modules (e.g. re and getopt); this can be reduced by placing those imports inside the functions that use them. - All the issues listed in the PEP. Please comment! BTW: somebody overwrote the PEP on SourceForge with an older version. Please remember to do a "cvs update" before running "make install" in the peps directory! --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Fri Dec 8 23:26:51 2000 From: gstein@lyra.org (Greg Stein) Date: Fri, 8 Dec 2000 15:26:51 -0800 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Dec 08, 2000 at 01:43:39PM -0500 References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: <20001208152651.H30644@lyra.org> On Fri, Dec 08, 2000 at 01:43:39PM -0500, Guido van Rossum wrote: >... > Comments please! We could: > > - Live with the pathological cases. I agree: live with it. The typical case will operate just fine. > - Forget the whole thing; and then also forget about firstkey() > etc. which has the same problem only worse. No opinion. > - Fix the algorithm. 
Maybe jumping criss-cross through the hash table > like lookdict does would improve that; but I don't understand the > math used for that ("Cycle through GF(2^n)-{0}" ???). No need. The keys were inserted randomly, so sequencing through is effectively random. :-)

>...
>     static PyObject *
>     dict_popitem(dictobject *mp, PyObject *args)
>     {
>         static int finger = 0;
>         int i;
>         dictentry *ep;
>         PyObject *res;
>
>         if (!PyArg_NoArgs(args))
>             return NULL;
>         if (mp->ma_used == 0) {
>             PyErr_SetString(PyExc_KeyError,
>                             "popitem(): dictionary is empty");
>             return NULL;
>         }
>         i = finger;
>         if (i >= mp->ma_size)
>             ir = 0;

Should be "i = 0" Cheers, -g -- Greg Stein, http://www.lyra.org/ From tismer@tismer.com Sat Dec 9 16:44:14 2000 From: tismer@tismer.com (Christian Tismer) Date: Sat, 09 Dec 2000 18:44:14 +0200 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: <3A32615E.D39B68D2@tismer.com> Guido van Rossum wrote: > > After the last round of discussion, I was left with the idea that the > best thing we could do to help destructive iteration is to introduce a > {}.popitem() that returns an arbitrary (key, value) pair and deletes > it. I wrote about this: > > > > One more concern: if you repeatedly remove the *first* item, the hash > > > table will start looking lobsided. Since we don't resize the hash > > > table on deletes, maybe picking an item at random (but not using an > > > expensive random generator!) would be better. > > and Tim replied: > > > Which is the reason SETL doesn't specify *which* set item is removed: if > > you always start looking at "the front" of a dict that's being consumed, the > > dict fills with turds without shrinking, you skip over them again and again, > > and consuming the entire dict is still quadratic time.
> > > > Unfortunately, while using a random start point is almost always quicker > > than that, the expected time for consuming the whole dict remains quadratic. > > > > The clearest way around that is to save a per-dict search finger, recording > > where the last search left off. Start from its current value. Failure if > > it wraps around. This is linear time in non-pathological cases (a > > pathological case is one in which it isn't linear time ). > > I've implemented this, except I use a static variable for the finger > intead of a per-dict finger. I'm concerned about adding 4-8 extra > bytes to each dict object for a feature that most dictionaries never > need. So, instead, I use a single shared finger. This works just as > well as long as this is used for a single dictionary. For multiple > dictionaries (either used by the same thread or in different threads), > it'll work almost as well, although it's possible to make up a > pathological example that would work qadratically. > > An easy example of such a pathological example is to call popitem() > for two identical dictionaries in lock step. > > Comments please! We could: > > - Live with the pathological cases. > > - Forget the whole thing; and then also forget about firstkey() > etc. which has the same problem only worse. > > - Fix the algorithm. Maybe jumping criss-cross through the hash table > like lookdict does would improve that; but I don't understand the > math used for that ("Cycle through GF(2^n)-{0}" ???). That algorithm is really a gem which you should know, so let me try to explain it. Intro: A little story about finite field theory (very basic). ------------------------------------------------------------- For every prime p and every power p^n, there exists a Galois Field ( GF(p^n) ), which is a finite field. The additive group is called "elementary Abelian", it is commutative, and it looks a little like a vector space, since addition works in cycles modulo p for every p cell. 
The multiplicative group is cyclic, and it never touches 0. Cyclic groups are generated by a single primitive element. The powers of that element make up all the other elements. For all elements of the multiplicative group GF(p^n)*, the equality x^(p^n - 1) == 1 holds. A generator element is therefore a primitive (p^n-1)th root of unity. From another point of view, the elements of GF(p^n) can be seen as coefficients of polynomials over GF(p). It can be easily shown that every generator of the multiplicative group is an irreducible polynomial of degree n with coefficients in GF(p). An irreducible polynomial over a field has the property not to vanish for any value of the field. It has no zero in the field. By adjoining such a zero to the field, we turn it into an extension field: GF(2^n).

Now on the dictionary case.
---------------------------

The idea is to conceive every non-zero bit pattern as coefficients of a polynomial over GF(2)[x]. We need to find an irreducible polynomial of degree n over the prime field GF(2). There exists a primitive (2^n-1)th root µ of unity in GF(2^n) which generates every non-zero bit pattern of length n, being coefficients of a polynomial over µ. That means, every given bit pattern can be seen as some power of µ. µ enumerates the whole multiplicative group, and the given pattern is just one position in that enumeration. We can go to the next position in this cycle simply by multiplying the pattern by µ. But since we are dealing with polynomials over µ, this multiplication isn't much more than adding one to every exponent in the polynomial, hence a left shift of our pattern. Adjusting the overflow of this pattern involves a single addition, which is just an XOR in GF(2^n).

Example: p=2 n=3 G = GF(8) = GF(2^3)
----------------------------------------

""" Since 8 = 2^3, the prime field is GF(2) and we need to find a monic irreducible cubic polynomial over that field.
Since the coefficients can only be 0 and 1, the list of irreducible candidates is easily obtained.

    x^3 + 1
    x^3 + x + 1
    x^3 + x^2 + 1
    x^3 + x^2 + x + 1

Now substituting 0 gives 1 in all cases, and substituting 1 will give 0 only if there are an odd number of x terms, so the irreducible cubics are just x^3 + x + 1 and x^3 + x^2 + 1. Now the multiplicative group of this field is a cyclic group of order 7 and so every nonidentity element is a generator. Letting µ be a root of the first polynomial, we have µ^3 + µ + 1 = 0, or µ^3 = µ + 1, so the powers of µ are:

    µ^1 = µ
    µ^2 = µ^2
    µ^3 = µ + 1
    µ^4 = µ^2 + µ
    µ^5 = µ^2 + µ + 1
    µ^6 = µ^2 + 1
    µ^7 = 1

""" We could of course equally choose the second polynomial with an isomorphic result. The above example was taken from http://www-math.cudenver.edu/~wcherowi/courses/finflds.html Note that finding the irreducible polynomial was so easy since a reducible cubic always has a linear factor. In the general case, we have to check against all possible subpolynomials or use much more of the theory.

Application of the example to the dictionary algorithm (DA)
-----------------------------------------------------------

We stay in GF(8) and use the above example. The maximum allowed pattern value in our system is 2^n - 1. This is the variable "mask" in the program. We assume a given non-zero bit pattern with coefficients (a2, a1, a0) and write down a polynomial in µ for it:

    p = a2*µ^2 + a1*µ + a0

To obtain the next pattern in the group's enumeration, we multiply by the generator polynomial µ:

    p*µ = a2*µ^3 + a1*µ^2 + a0*µ

In the program, this is line 210: incr = incr << 1; a simple shift. But in the case that a2 is not zero, we get an overflow, and we have to fold the bit back, by the following identity: µ^3 = µ+1 That means, we have to subtract µ^3 from the pattern and add µ+1 instead. But since addition and subtraction are identical in GF(2), we just add the whole polynomial.
From a different POV, we just add zero here, since µ^3 + µ + 1 = 0. The full program to perform the polynomial multiplication gets down to just a shift, a test and an XOR:

    incr = incr << 1;
    if (incr > mask)
        incr ^= mp->ma_poly;

Summary
=======

For every power 2^n of two, we can find a generator element µ of GF(2^n). Every non-zero bit pattern can be taken as coefficients of a polynomial in µ. The powers of µ reach all these patterns. Therefore, each pattern *is* some power of µ. By multiplication with µ we can reach every possible pattern exactly once. Since these patterns are used as distances from the primary hash-computed slot modulo 2^n, and the distances are never zero, all slots can be reached.

-----------------------------------
Appendix, on the use of finger:
-------------------------------

Instead of using a global finger variable, you can do the following (involving a cast from object to int):

- if the 0'th slot of the dict is non-empty: return this element and insert the dummy element as key. Set the value field to what the Dictionary Algorithm would give for the removed object's hash. This is the next finger.

- else: treat the value field of the 0'th slot as the last finger. If it is zero, initialize it with 2^n-1. Repetitively use the DA until you find an entry. Save the finger in slot 0 again.

This doesn't cost an extra slot, and even when the dictionary is written between removals, the chance to lose the finger is just 1:(2^n-1) on every insertion. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today?
http://www.stackless.com From nas@arctrix.com Sat Dec 9 11:30:06 2000 From: nas@arctrix.com (Neil Schemenauer) Date: Sat, 9 Dec 2000 03:30:06 -0800 Subject: [Python-Dev] PEP 208 and __coerce__ Message-ID: <20001209033006.A3737@glacier.fnational.com> While working on the implementation of PEP 208, I discovered that __coerce__ has some surprising properties. Initially I implemented __coerce__ so that the numeric operation currently being performed was called on the values returned by __coerce__. This caused test_class to blow up due to code like this:

    class Test:
        def __coerce__(self, other):
            return (self, other)

The 2.0 "solves" this by not calling __coerce__ again if the objects returned by __coerce__ are instances. This has the effect of making code like:

    class A:
        def __coerce__(self, other):
            return B(), other

    class B:
        def __coerce__(self, other):
            return 1, other

    A() + 1

fail to work in the expected way. The question is: how should __coerce__ work? One option is to leave it working the way it does in 2.0. Alternatively, I could change it so that if coerce returns (self, *) then __coerce__ is not called again. Neil From mal@lemburg.com Sat Dec 9 18:49:29 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Sat, 09 Dec 2000 19:49:29 +0100 Subject: [Python-Dev] PEP 208 and __coerce__ References: <20001209033006.A3737@glacier.fnational.com> Message-ID: <3A327EB9.BD2CA3CC@lemburg.com> Neil Schemenauer wrote:
>
> While working on the implementation of PEP 208, I discovered that
> __coerce__ has some surprising properties.  Initially I
> implemented __coerce__ so that the numeric operation currently
> being performed was called on the values returned by __coerce__.
> This caused test_class to blow up due to code like this:
>
>     class Test:
>         def __coerce__(self, other):
>             return (self, other)
>
> The 2.0 "solves" this by not calling __coerce__ again if the
> objects returned by __coerce__ are instances.
This has the > effect of making code like: > > class A: > def __coerce__(self, other): > return B(), other > > class B: > def __coerce__(self, other): > return 1, other > > A() + 1 > > fail to work in the expected way. The question is: how should > __coerce__ work? One option is to leave it working the way it does > in 2.0. Alternatively, I could change it so that if coerce > returns (self, *) then __coerce__ is not called again. +0 -- the idea behind PEP 208 is to get rid of the centralized coercion mechanism, so fixing it to allow yet more obscure variants should be carefully considered. I see __coerce__ et al. as old style mechanisms -- operator methods have much more information available to do the right thing than the single bottleneck __coerce__. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tim.one@home.com Sat Dec 9 20:49:04 2000 From: tim.one@home.com (Tim Peters) Date: Sat, 9 Dec 2000 15:49:04 -0500 Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > ... > I've implemented this, except I use a static variable for the finger > instead of a per-dict finger. I'm concerned about adding 4-8 extra > bytes to each dict object for a feature that most dictionaries never > need. It's a bit ironic that dicts are guaranteed to be at least 1/3 wasted space. Let's pick on Christian's idea to reclaim a few bytes of that. > So, instead, I use a single shared finger. This works just as > well as long as this is used for a single dictionary. For multiple > dictionaries (either used by the same thread or in different threads), > it'll work almost as well, although it's possible to make up a > pathological example that would work quadratically.
> > An easy example of such a pathological example is to call popitem() > for two identical dictionaries in lock step. Please see my later comments attached to the patch: http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470 In short, for me (truly) identical dicts perform well with or without my suggestion, while dicts cloned via dict.copy() perform horribly with or without my suggestion (their internal structures differ); still curious as to whether that's also true for you (am I looking at a Windows bug? I don't see how, but it's possible ...). In any case, my suggestion turned out to be worthless on my box. Playing around via simulations suggests that a shared finger is going to be disastrous when consuming more than one dict unless they have identical internal structure (not just compare equal). As soon as they get a little out of synch, it just gets worse with each succeeding probe. > Comments please! We could: > > - Live with the pathological cases. How boring . > - Forget the whole thing; and then also forget about firstkey() > etc. which has the same problem only worse. I don't know that this is an important idea for dicts in general (it is important for sets) -- it's akin to an xrange for dicts. But then I've had more than one real-life program that built giant dicts then ran out of memory trying to iterate over them! I'd like to fix that. > - Fix the algorithm. Maybe jumping criss-cross through the hash table > like lookdict does would improve that; but I don't understand the > math used for that ("Cycle through GF(2^n)-{0}" ???). Christian explained that well (thanks!). However, I still don't see any point to doing that business in .popitem(): when inserting keys, the jitterbug probe sequence has the crucial benefit of preventing primary clustering when keys collide. But when we consume a dict, we just want to visit every slot as quickly as possible. 
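Christian's multiply-by-µ probe really is just the shift/test/XOR quoted above. A short Python sketch (an illustration for this archive, not code from the patch; 0b1011 encodes the primitive polynomial x^3 + x + 1 for n = 3) shows it visiting every non-zero 3-bit pattern exactly once before cycling:

```python
def probe_increments(n, poly):
    """Yield every non-zero n-bit pattern once, by repeatedly
    multiplying by the generator µ (a shift) and reducing modulo
    the primitive polynomial (an XOR), exactly as in lookdict."""
    mask = (1 << n) - 1
    incr = 1
    for _ in range(mask):        # the cycle length is 2^n - 1
        yield incr
        incr <<= 1
        if incr > mask:
            incr ^= poly

# x^3 + x + 1 (0b1011) is primitive for GF(2^3):
seq = list(probe_increments(3, 0b1011))
assert seq == [1, 2, 4, 3, 6, 7, 5]          # each pattern hit once
assert sorted(seq) == list(range(1, 8))
```

With a primitive polynomial for the table size, the increments sweep all of GF(2^n)-{0}, which is why the probe sequence can never get stuck skipping slots.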
[Christian] > Appendix, on the use of finger: > ------------------------------- > > Instead of using a global finger variable, you can do the > following (involving a cast from object to int): > > - if the 0'th slot of the dict is non-empty: > return this element and insert the dummy element > as key. Set the value field to what the Dictionary Algorithm > would give for the removed object's hash. This is the > next finger. > - else: > treat the value field of the 0'th slot as the last finger. > If it is zero, initialize it with 2^n-1. > Repetitively use the DA until you find an entry. Save > the finger in slot 0 again. > > This doesn't cost an extra slot, and even when the dictionary > is written between removals, the chance to lose the finger > is just 1:(2^n-1) on every insertion. I like that, except: 1) As above, I don't believe the GF business buys anything over a straightforward search when consuming a dict. 2) Overloading the value field bristles with problems, in part because it breaks the invariant that a slot is unused if and only if the value field is NULL, in part because C doesn't guarantee that you can get away with casting an arbitrary int to a pointer and back again. None of the problems in #2 arise if we abuse the me_hash field instead, so the attached does that.
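The slot-0 finger can be modeled in a few lines of Python. This is a toy open-addressing table, not the real dictobject: the real patch below stores the finger in ma_table[0].me_hash, while the toy keeps it in a plain attribute and simply reserves slot 0.

```python
class ToyTable:
    """Toy model of popitem() with a persistent search finger."""

    def __init__(self, size):
        self.slots = [None] * size   # None marks an empty slot
        self.finger = 1              # next index to try; slot 0 is reserved
        self.used = 0

    def insert(self, i, item):
        if self.slots[i] is None:
            self.used += 1
        self.slots[i] = item

    def popitem(self):
        if self.used == 0:
            raise KeyError("popitem(): table is empty")
        i = self.finger
        if not 1 <= i < len(self.slots):
            i = 1                    # finger may be stale; clamp it
        while self.slots[i] is None: # plain linear scan -- no GF needed
            i += 1
            if i >= len(self.slots):
                i = 1
        item, self.slots[i] = self.slots[i], None
        self.used -= 1
        self.finger = i + 1          # next place to start
        return item

t = ToyTable(8)
for slot in (2, 3, 5):
    t.insert(slot, "item%d" % slot)
# Each pop resumes scanning at the finger instead of at slot 1, so
# emptying the table is O(size) overall rather than O(size**2).
assert [t.popitem() for _ in range(3)] == ["item2", "item3", "item5"]
```

The point of the finger is visible in the assertion: without it, every pop would rescan the leading empty slots.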
Here's a typical run of Guido's test case using this (on an 866MHz machine w/ 256Mb RAM -- the early values jump all over the place from run to run): run = 0 log2size = 10 size = 1024 7.4 usec per item to build (total 0.008 sec) 3.4 usec per item to destroy twins (total 0.003 sec) log2size = 11 size = 2048 6.7 usec per item to build (total 0.014 sec) 3.4 usec per item to destroy twins (total 0.007 sec) log2size = 12 size = 4096 7.0 usec per item to build (total 0.029 sec) 3.7 usec per item to destroy twins (total 0.015 sec) log2size = 13 size = 8192 7.1 usec per item to build (total 0.058 sec) 5.9 usec per item to destroy twins (total 0.048 sec) log2size = 14 size = 16384 14.7 usec per item to build (total 0.241 sec) 6.4 usec per item to destroy twins (total 0.105 sec) log2size = 15 size = 32768 12.2 usec per item to build (total 0.401 sec) 3.9 usec per item to destroy twins (total 0.128 sec) log2size = 16 size = 65536 7.8 usec per item to build (total 0.509 sec) 4.0 usec per item to destroy twins (total 0.265 sec) log2size = 17 size = 131072 7.9 usec per item to build (total 1.031 sec) 4.1 usec per item to destroy twins (total 0.543 sec) The last one is over 100 usec per item using the original patch (with or without my first suggestion). if-i-were-a-betting-man-i'd-say-"bingo"-ly y'rs - tim Drop-in replacement for the popitem in the patch: static PyObject * dict_popitem(dictobject *mp, PyObject *args) { int i = 0; dictentry *ep; PyObject *res; if (!PyArg_NoArgs(args)) return NULL; if (mp->ma_used == 0) { PyErr_SetString(PyExc_KeyError, "popitem(): dictionary is empty"); return NULL; } /* Set ep to "the first" dict entry with a value. We abuse the hash * field of slot 0 to hold a search finger: * If slot 0 has a value, use slot 0. * Else slot 0 is being used to hold a search finger, * and we use its hash value as the first index to look. 
*/ ep = &mp->ma_table[0]; if (ep->me_value == NULL) { i = (int)ep->me_hash; /* The hash field may be uninitialized trash, or it * may be a real hash value, or it may be a legit * search finger, or it may be a once-legit search * finger that's out of bounds now because it * wrapped around or the table shrunk -- simply * make sure it's in bounds now. */ if (i >= mp->ma_size || i < 1) i = 1; /* skip slot 0 */ while ((ep = &mp->ma_table[i])->me_value == NULL) { i++; if (i >= mp->ma_size) i = 1; } } res = PyTuple_New(2); if (res != NULL) { PyTuple_SET_ITEM(res, 0, ep->me_key); PyTuple_SET_ITEM(res, 1, ep->me_value); Py_INCREF(dummy); ep->me_key = dummy; ep->me_value = NULL; mp->ma_used--; } assert(mp->ma_table[0].me_value == NULL); mp->ma_table[0].me_hash = i + 1; /* next place to start */ return res; } From tim.one@home.com Sat Dec 9 21:09:30 2000 From: tim.one@home.com (Tim Peters) Date: Sat, 9 Dec 2000 16:09:30 -0500 Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: Message-ID: > assert(mp->ma_table[0].me_value == NULL); > mp->ma_table[0].me_hash = i + 1; /* next place to start */ Ack, those two lines should move up into the "if (res != NULL)" block. errors-are-error-prone-ly y'rs - tim From gvwilson@nevex.com Sun Dec 10 16:11:09 2000 From: gvwilson@nevex.com (Greg Wilson) Date: Sun, 10 Dec 2000 11:11:09 -0500 Subject: [Python-Dev] re: So You Want to Write About Python? Message-ID: Hi, folks. Jon Erickson (Doctor Dobb's Journal), Frank Willison (O'Reilly), and I (professional loose cannon) are doing a workshop at IPC on writing books and magazine articles about Python. It would be great to have a few articles (in various stages of their lives) and/or book proposals from people on this list to use as examples. So, if you think the world oughta know about the things you're doing, and would like to use this to help get yourself motivated to start writing, please drop me a line. 
I'm particularly interested in: - the real-world issues involved in moving to Unicode - non-trivial XML processing using SAX and DOM (where "non-trivial" means "including namespaces, entity references, error handling, and all that") - the theory and practice of stackless, generators, and continuations - the real-world tradeoffs between the various memory management schemes that are now available for Python - feature comparisons of various Foobars that can be used with Python (where "Foobar" could be "GUI toolkit", "IDE", "web scripting toolkit", or just about anything else) - performance analysis and tuning of Python itself (as an example of how you speed up real applications --- this is something that matters a lot in the real world, but tends to get forgotten in school) - just about anything else that you wish someone had written for you before you started your last big project Thanks, Greg From paul@prescod.net Sun Dec 10 18:02:27 2000 From: paul@prescod.net (Paul Prescod) Date: Sun, 10 Dec 2000 10:02:27 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> Message-ID: <3A33C533.ABA27C7C@prescod.net> Guido van Rossum wrote: > > Nobody seems to care much about the warnings PEP so far. What's up? > Are you all too busy buying presents for the holidays? Then get me > some too, please? :-) My opinions: * it should be a built-in or keyword, not a function in "sys". Warning is supposed to be as easy as possible so people will do it often. sys.argv and sys.stdout annoy me as it is. * the term "level" applied to warnings typically means "warning level" as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something. * this level idea gives rise to another question. What if I want to see the full stack context of a warning? Do I have to implement a whole new warning output hook? 
It seems like I should be able to specify this as a command line option alongside the action. * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out. * should there be a sys.formatwarning? What if I want to redirect warnings to a socket -- I'd like to use the standard formatting machinery. Or vice versa, I might want to change the formatting but not override the destination. * there should be a "RuntimeWarning" -- base category for warnings about dubious runtime behaviors (e.g. integer division truncated value) * it should be possible to strip warnings as an optimization step. That may require interpreter and syntax support. * warnings will usually be tied to tests which the user will want to be able to optimize out also. (e.g. if __debug__ and type(foo)==StringType: warn "Should be Unicode!") I propose: >>> warn conditional, message[, category] to be very parallel with >>> assert conditional, message I'm not proposing the use of the assert keyword anymore, but I am trying to reuse the syntax for familiarity. Perhaps -Wstrip would strip warnings out of the bytecode. Paul Prescod From nas@arctrix.com Sun Dec 10 13:46:46 2000 From: nas@arctrix.com (Neil Schemenauer) Date: Sun, 10 Dec 2000 05:46:46 -0800 Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) Message-ID: <20001210054646.A5219@glacier.fnational.com> Sourceforge uploads are not working. The latest version of the patch for PEP 208 is here: http://arctrix.com/nas/python/coerce-6.0.diff Operations on instances now call __coerce__ if it exists. I think the patch is now complete. Converting other builtin types to "new style numbers" can be done with a separate patch. Neil From guido@python.org Sun Dec 10 22:17:08 2000 From: guido@python.org (Guido van Rossum) Date: Sun, 10 Dec 2000 17:17:08 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Sun, 10 Dec 2000 10:02:27 PST."
<3A33C533.ABA27C7C@prescod.net> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> Message-ID: <200012102217.RAA12550@cj20424-a.reston1.va.home.com> > My opinions: > > * it should be a built-in or keyword, not a function in "sys". Warning > is supposed to be as easy as possible so people will do it often. Disagree. Warnings are there mostly for the Python system to warn the Python programmer. The most heavy use will come from the standard library, not from user code. > sys.argv and sys.stdout annoy me as it is. Too bad. > * the term "level" applied to warnings typically means "warning level" > as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something. Good point. > * this level idea gives rise to another question. What if I want to see > the full stack context of a warning? Do I have to implement a whole new > warning output hook? It seems like I should be able to specify this as a > command line option alongside the action. Turn warnings into errors and you'll get a full traceback. If you really want a full traceback without exiting, some creative use of sys._getframe() and the traceback module will probably suit you well. > * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out. I don't. > * should there be a sys.formatwarning? What if I want to redirect > warnings to a socket -- I'd like to use the standard formatting > machinery. Or vice versa, I might want to change the formatting but not > override the destination. Good point. I'm changing this to: def showwarning(message, category, filename, lineno, file=None): """Hook to write a warning to a file; replace if you like.""" and def formatwarning(message, category, filename, lineno): """Hook to format a warning the standard way.""" > * there should be a "RuntimeWarning" -- base category for warnings > about dubious runtime behaviors (e.g. integer division truncated value) OK.
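This pair of hooks is essentially what shipped in the warnings module (the signatures later grew a trailing line argument). Redirecting warnings somewhere other than stderr then comes down to replacing showwarning while reusing formatwarning, e.g. (a sketch against the module as it exists today, not the PEP 230 draft):

```python
import io
import warnings

# Collect warnings in an in-memory log instead of stderr by replacing
# the showwarning hook; formatwarning still does the standard rendering.
log = io.StringIO()

def showwarning(message, category, filename, lineno, file=None, line=None):
    log.write(warnings.formatwarning(message, category, filename, lineno, line))

warnings.showwarning = showwarning
warnings.simplefilter("always")   # don't suppress repeats in this demo
warnings.warn("integer division truncated value", RuntimeWarning)

assert "RuntimeWarning" in log.getvalue()
```

A socket, GUI dialog, or logging call would slot into showwarning the same way, which is exactly the separation Paul asked for.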
> * it should be possible to strip warnings as an optimization step. That > may require interpreter and syntax support. I don't see the point of this. I think this comes from our different views on who should issue warnings. > * warnings will usually be tied to tests which the user will want to be > able to optimize out also. (e.g. if __debug__ and type(foo)==StringType: > warn "Should be Unicode!") > > I propose: > > >>> warn conditional, message[, category] Sorry, this is not worth a new keyword. > to be very parallel with > > >>> assert conditional, message > > I'm not proposing the use of the assert keyword anymore, but I am trying > to reuse the syntax for familiarity. Perhaps -Wstrip would strip > warnings out of the bytecode. Why? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@effbot.org Mon Dec 11 00:16:25 2000 From: fredrik@effbot.org (Fredrik Lundh) Date: Mon, 11 Dec 2000 01:16:25 +0100 Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com> Message-ID: <000901c06307$9a814d60$3c6340d5@hagrid> Guido wrote: > Moshe proposes to add an overridable function sys.displayhook(obj) > which will be called by the interpreter for the PRINT_EXPR opcode, > instead of hardcoding the behavior. The default implementation will > of course have the current behavior, but this makes it much simpler to > experiment with alternatives, e.g. using str() instead of repr() (or > to choose between str() and repr() based on the type). hmm. instead of patching here and there, what's stopping us from doing it the right way? 
I'd prefer something like: import code class myCLI(code.InteractiveConsole): def displayhook(self, data): # non-standard display hook print str(data) sys.setcli(myCLI()) (in other words, why not move the *entire* command line interface over to Python code) From guido@python.org Mon Dec 11 02:24:20 2000 From: guido@python.org (Guido van Rossum) Date: Sun, 10 Dec 2000 21:24:20 -0500 Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use In-Reply-To: Your message of "Mon, 11 Dec 2000 01:16:25 +0100." <000901c06307$9a814d60$3c6340d5@hagrid> References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com> <000901c06307$9a814d60$3c6340d5@hagrid> Message-ID: <200012110224.VAA12844@cj20424-a.reston1.va.home.com> > Guido wrote: > > Moshe proposes to add an overridable function sys.displayhook(obj) > > which will be called by the interpreter for the PRINT_EXPR opcode, > > instead of hardcoding the behavior. The default implementation will > > of course have the current behavior, but this makes it much simpler to > > experiment with alternatives, e.g. using str() instead of repr() (or > > to choose between str() and repr() based on the type). Effbot regurgitates: > hmm. instead of patching here and there, what's stopping us > from doing it the right way? I'd prefer something like: > > import code > > class myCLI(code.InteractiveConsole): > def displayhook(self, data): > # non-standard display hook > print str(data) > > sys.setcli(myCLI()) > > (in other words, why not move the *entire* command line interface > over to Python code) Indeed, this is why I've been hesitant to bless Moshe's hack. I finally decided to go for it because I don't see this redesign of the CLI happening anytime soon. In order to do it right, it would require a redesign of the parser input handling, which is probably the oldest code in Python (short of the long integer math, which predates Python by several years). 
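Moshe's hook is what 2.1 ended up shipping as sys.displayhook, and it is enough to get Fredrik's str()-instead-of-repr() behavior without rewriting the CLI. A sketch (in modern print-function syntax, so not literally 2.1-era code):

```python
import builtins
import sys

def str_displayhook(value):
    # Like the default hook, but render with str() instead of repr().
    if value is None:
        return
    print(str(value))
    builtins._ = value   # the default hook also rebinds '_'

sys.displayhook = str_displayhook

# The interactive loop calls sys.displayhook for each expression
# statement; calling it directly shows the difference from the default:
sys.displayhook("a\nb")   # prints the string on two lines, not 'a\nb'
```

The code.InteractiveConsole route Fredrik sketches would still be needed to replace the rest of the CLI, but for display behavior alone the hook suffices.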
The current code module is a hack, alas, and doesn't always get it right the same way as the *real* CLI does things. So, rather than wait forever for the perfect solution, I think it's okay to settle for less sooner. "Now is better than never." --Guido van Rossum (home page: http://www.python.org/~guido/) From paulp@ActiveState.com Mon Dec 11 06:59:29 2000 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 10 Dec 2000 22:59:29 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com> Message-ID: <3A347B51.ADB3F12C@ActiveState.com> Guido van Rossum wrote: > >... > > Disagree. Warnings are there mostly for the Python system to warn the > Python programmer. The most heavy use will come from the standard > library, not from user code. Most Python code is part of some library or another. It may not be the standard library but it's still a library. Perl and Java both make warnings (especially about deprecation) very easy *for user code*. > > * it should be possible to strip warnings as an optimization step. That > > may require interpreter and syntax support. > > I don't see the point of this. I think this comes from our different > views on who should issue warnings. Everyone who creates a reusable library will want to issue warnings. That is to say, most serious Python programmers. Anyhow, let's presume that it is only the standard library that issues warnings (for argument's sake). What if I have a speed-critical module that triggers warnings in an inner loop? Turning off the warning doesn't turn off the overhead of the warning infrastructure. I should be able to turn off the overhead easily -- ideally from the Python command line. And I still feel that part of that "overhead" is in the code that tests to determine whether to issue the warnings.
There should be a way to turn off that overhead also. Paul From paulp@ActiveState.com Mon Dec 11 07:23:17 2000 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 10 Dec 2000 23:23:17 -0800 Subject: [Python-Dev] Online help PEP Message-ID: <3A3480E5.C2577AE6@ActiveState.com> PEP: ??? Title: Python Online Help Version: $Revision: 1.0 $ Author: paul@prescod.net, paulp@activestate.com (Paul Prescod) Status: Draft Type: Standards Track Python-Version: 2.1 Status: Incomplete Abstract This PEP describes a command-line driven online help facility for Python. The facility should be able to build on existing documentation facilities such as the Python documentation and docstrings. It should also be extensible for new types and modules. Interactive use: Simply typing "help" describes the help function (through repr overloading). "help" can also be used as a function: The function takes the following forms of input: help( "string" ) -- built-in topic or global help( ) -- docstring from object or type help( "doc:filename" ) -- filename from Python documentation If you ask for a global, it can be a fully-qualified name such as help("xml.dom"). You can also use the facility from the command line: python --help if In either situation, the output does paging similar to the "more" command. Implementation The help function is implemented in an onlinehelp module which is demand-loaded. There should be options for fetching help information from environments other than the command line through the onlinehelp module: onlinehelp.gethelp(object_or_string) -> string It should also be possible to override the help display function by assigning to onlinehelp.displayhelp(object_or_string). The module should be able to extract module information from either the HTML or LaTeX versions of the Python documentation. Links should be accommodated in a "lynx-like" manner.
Over time, it should also be able to recognize when docstrings are in "special" syntaxes like structured text, HTML and LaTeX and decode them appropriately. A prototype implementation is available with the Python source distribution as nondist/sandbox/doctools/onlinehelp.py. Built-in Topics help( "intro" ) - What is Python? Read this first! help( "keywords" ) - What are the keywords? help( "syntax" ) - What is the overall syntax? help( "operators" ) - What operators are available? help( "builtins" ) - What functions, types, etc. are built-in? help( "modules" ) - What modules are in the standard library? help( "copyright" ) - Who owns Python? help( "moreinfo" ) - Where is there more information? help( "changes" ) - What changed in Python 2.0? help( "extensions" ) - What extensions are installed? help( "faq" ) - What questions are frequently asked? help( "ack" ) - Who has done work on Python lately? Security Issues This module will attempt to import modules with the same names as requested topics. Don't use the modules if you are not confident that everything in your pythonpath is from a trusted source. Local Variables: mode: indented-text indent-tabs-mode: nil End: From tim.one@home.com Mon Dec 11 07:36:57 2000 From: tim.one@home.com (Tim Peters) Date: Mon, 11 Dec 2000 02:36:57 -0500 Subject: [Python-Dev] FW: [Python-Help] indentation Message-ID: While we're talking about pluggable CLIs, I share this fellow's confusion over IDLE's CLI variant: block code doesn't "look right" under IDLE because sys.ps2 doesn't exist under IDLE. Some days you can't make *anybody* happy . -----Original Message----- ... Subject: [Python-Help] indentation Sent: Sunday, December 10, 2000 7:32 AM ... My Problem has to do with identation: I put the following to idle: >>> if not 1: print 'Hallo' else: SyntaxError: invalid syntax I get the Message above. I know that else must be 4 spaces to the left, but idle doesn't let me do this. I have only the alternative to put to most left point. 
But than I disturb the block structure and I get again the error message. I want to have it like this: >>> if not 1: print 'Hallo' else: Can you help me? ... From fredrik@pythonware.com Mon Dec 11 11:36:53 2000 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 11 Dec 2000 12:36:53 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> Message-ID: <033701c06366$ab746580$0900a8c0@SPIFF> christian wrote: > That algorithm is really a gem which you should know, > so let me try to explain it. I think someone just won the "brain exploder 2000" award ;-) to paraphrase Bertrand Russell, "Mathematics may be defined as the subject where I never know what you are talking about, nor whether what you are saying is true" cheers /F From thomas@xs4all.net Mon Dec 11 12:12:09 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 11 Dec 2000 13:12:09 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF>; from fredrik@pythonware.com on Mon, Dec 11, 2000 at 12:36:53PM +0100 References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: <20001211131208.G4396@xs4all.nl> On Mon, Dec 11, 2000 at 12:36:53PM +0100, Fredrik Lundh wrote: > christian wrote: > > That algorithm is really a gem which you should know, > > so let me try to explain it. > I think someone just won the "brain exploder 2000" award ;-) By acclamation, I'd expect. I know it was the best laugh I had since last week's Have I Got News For You, even though trying to understand it made me glad I had boring meetings to recuperate in ;) Highschool-dropout-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Mon Dec 11 12:33:18 2000 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Mon, 11 Dec 2000 13:33:18 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: <3A34C98E.7C42FD24@lemburg.com> Fredrik Lundh wrote: > > christian wrote: > > That algorithm is really a gem which you should know, > > so let me try to explain it. > > I think someone just won the "brain exploder 2000" award ;-) > > to paraphrase Bertrand Russell, > > "Mathematics may be defined as the subject where I never > know what you are talking about, nor whether what you are > saying is true" Hmm, I must have missed that one... care to repost ? -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tismer@tismer.com Mon Dec 11 13:49:48 2000 From: tismer@tismer.com (Christian Tismer) Date: Mon, 11 Dec 2000 15:49:48 +0200 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: <3A34DB7C.FF7E82CE@tismer.com> Fredrik Lundh wrote: > > christian wrote: > > That algorithm is really a gem which you should know, > > so let me try to explain it. > > I think someone just won the "brain exploder 2000" award ;-) > > to paraphrase Bertrand Russell, > > "Mathematics may be defined as the subject where I never > know what you are talking about, nor whether what you are > saying is true" :-)) Well, I was primarily targeting Guido, who said that he came from math, and one cannot study math without standing a basic algebra course, I think. I tried my best to explain it for those who know at least how groups, fields, rings and automorphisms work. 
Going into more details of the theory would be off-topic for python-dev, but I will try it in an upcoming DDJ article. As you might have guessed, I didn't do this just for fun. It is the old game of explaining what is there, convincing everybody that you at least know what you are talking about, and then three days later coming up with an improved application of the theory. Today is Monday, 2 days left. :-) ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From guido@python.org Mon Dec 11 15:12:24 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 10:12:24 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: Your message of "Mon, 11 Dec 2000 15:49:48 +0200." <3A34DB7C.FF7E82CE@tismer.com> References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> Message-ID: <200012111512.KAA23622@cj20424-a.reston1.va.home.com> > Fredrik Lundh wrote: > > > > christian wrote: > > > That algorithm is really a gem which you should know, > > > so let me try to explain it. > > > > I think someone just won the "brain exploder 2000" award ;-) > > > > to paraphrase Bertrand Russell, > > > > "Mathematics may be defined as the subject where I never > > know what you are talking about, nor whether what you are > > saying is true" > > :-)) > > Well, I was primarily targeting Guido, who said that he > came from math, and one cannot study math without standing > a basic algebra course, I think. I tried my best to explain > it for those who know at least how groups, fields, rings > and automorphisms work. 
Going into more details of the > theory would be off-topic for python-dev, but I will try > it in an upcoming DDJ article. I do have a math degree, but it is 18 years old and I had to give up after the first paragraph of your explanation. It made me vividly recall the first and only class on Galois Theory that I ever took -- after one hour I realized that this was not for me and I didn't have a math brain after all. I went back to the basement where the software development lab was (i.e. a row of card punches :-). > As you might have guessed, I didn't do this just for fun. > It is the old game of explaining what is there, convincing > everybody that you at least know what you are talking about, > and then three days later coming up with an improved > application of the theory. > > Today is Monday, 2 days left. :-) I'm very impressed. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Dec 11 15:15:02 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 10:15:02 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Sun, 10 Dec 2000 22:59:29 PST." <3A347B51.ADB3F12C@ActiveState.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com> <3A347B51.ADB3F12C@ActiveState.com> Message-ID: <200012111515.KAA23764@cj20424-a.reston1.va.home.com> [me] > > Disagree. Warnings are there mostly for the Python system to warn the > > Python programmer. The most heavy use will come from the standard > > library, not from user code. [Paul Prescod] > Most Python code is part of some library or another. It may not be the > standard library but its still a library. Perl and Java both make > warnings (especially about deprecation) very easy *for user code*. Hey. I'm not making it impossible to use warnings. I'm making it very easy. 
All you have to do is put "from warnings import warn" at the top of your library module. Requesting a built-in or even a new statement is simply excessive. > > > * it should be possible to strip warnings as an optimization step. That > > > may require interpreter and syntax support. > > > > I don't see the point of this. I think this comes from our different > > views on who should issue warnings. > > Everyone who creates a reusable library will want to issue warnings. > That is to say, most serious Python programmers. > > Anyhow, let's presume that it is only the standard library that issues > warnings (for arguments sake). What if I have a speed-critical module > that triggers warnings in an inner loop. Turning off the warning doesn't > turn off the overhead of the warning infrastructure. I should be able to > turn off the overhead easily -- ideally from the Python command line. > And I still feel that part of that "overhead" is in the code that tests > to determine whether to issue the warnings. There should be a way to > turn off that overhead also. So rewrite your code so that it doesn't trigger the warning. When you get a warning, you're doing something that could be done in a better way. So don't whine about the performance. It's a quality of implementation issue whether C code that tests for issues that deserve warnings can do the test without slowing down code that doesn't deserve a warning. Ditto for standard library code. Here's an example. I expect there will eventually (not in 2.1 yet!) warnings in the deprecated string module. If you get such a warning in a time-critical piece of code, the solution is to use string methods -- not to whine about the performance of the backwards compatibility code. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@digicool.com Mon Dec 11 16:02:29 2000 From: barry@digicool.com (Barry A.
Warsaw) Date: Mon, 11 Dec 2000 11:02:29 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> Message-ID: <14900.64149.910989.998348@anthem.concentric.net> Some of my thoughts after reading the PEP and Paul/Guido's exchange. - A function in the warn module is better than one in the sys module. "from warnings import warn" is good enough to not warrant a built-in. I get the sense that the PEP description is behind Guido's current implementation here. - When PyErr_Warn() returns 1, does that mean a warning has been transmuted into an exception, or some other exception occurred during the setting of the warning? (I think I know, but the PEP could be clearer here). - It would be nice if lineno can be a range specification. Other matches are based on regexps -- think of this as a line number regexp. - Why not do setupwarnings() in site.py? - Regexp matching on messages should be case insensitive. - The second argument to sys.warn() or PyErr_Warn() can be any class, right? If so, it's easy for me to have my own warning classes. What if I want to set up my own warnings filters? Maybe if `action' could be a callable as well as a string. Then in my IDE, I could set that to "mygui.popupWarningsDialog". -Barry From guido@python.org Mon Dec 11 15:57:33 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 10:57:33 -0500 Subject: [Python-Dev] Online help PEP In-Reply-To: Your message of "Sun, 10 Dec 2000 23:23:17 PST." <3A3480E5.C2577AE6@ActiveState.com> References: <3A3480E5.C2577AE6@ActiveState.com> Message-ID: <200012111557.KAA24266@cj20424-a.reston1.va.home.com> I approve of the general idea. Barry, please assign a PEP number. > PEP: ???
> Title: Python Online Help > Version: $Revision: 1.0 $ > Author: paul@prescod.net, paulp@activestate.com (Paul Prescod) > Status: Draft > Type: Standards Track > Python-Version: 2.1 > Status: Incomplete > > Abstract > > This PEP describes a command-line driven online help facility > for Python. The facility should be able to build on existing > documentation facilities such as the Python documentation > and docstrings. It should also be extensible for new types and > modules. > > Interactive use: > > Simply typing "help" describes the help function (through repr > overloading). Cute -- like license, copyright, credits I suppose. > "help" can also be used as a function: > > The function takes the following forms of input: > > help( "string" ) -- built-in topic or global Why does a global require string quotes? > help( ) -- docstring from object or type > help( "doc:filename" ) -- filename from Python documentation I'm missing help() -- table of contents I'm not sure if the table of contents should be printed by the repr output. > If you ask for a global, it can be a fully-qualified name such as > help("xml.dom"). Why are the string quotes needed? When are they useful? > You can also use the facility from a command-line > > python --help if Is this really useful? Sounds like Perlism to me. > In either situation, the output does paging similar to the "more" > command. Agreed. But how to implement paging in a platform-dependent manner? On Unix, os.system("more") or "$PAGER" is likely to work. On Windows, I suppose we could use its MORE, although that's pretty braindead. On the Mac? Also, inside IDLE or Pythonwin, invoking the system pager isn't a good idea. > Implementation > > The help function is implemented in an onlinehelp module which is > demand-loaded. What does "demand-loaded" mean in a Python context?
> There should be options for fetching help information from > environments other than the command line through the onlinehelp > module: > > onlinehelp.gethelp(object_or_string) -> string Good idea. > It should also be possible to override the help display function by > assigning to onlinehelp.displayhelp(object_or_string). Good idea. Pythonwin and IDLE could use this. But I'd like it to work at least "okay" if they don't. > The module should be able to extract module information from either > the HTML or LaTeX versions of the Python documentation. Links should > be accommodated in a "lynx-like" manner. I think this is beyond the scope. The LaTeX isn't installed anywhere (and processing would be too much work). The HTML is installed only on Windows, where there already is a way to get it to pop up in your browser (actually two: it's in the Start menu, and also in IDLE's Help menu). > Over time, it should also be able to recognize when docstrings are > in "special" syntaxes like structured text, HTML and LaTeX and > decode them appropriately. A standard syntax for docstrings is under development, PEP 216. I don't agree with the proposal there, but in any case the help PEP should not attempt to legalize a different format than PEP 216. > A prototype implementation is available with the Python source > distribution as nondist/sandbox/doctools/onlinehelp.py. Neat. I noticed that in a 24-line screen, the pagesize must be set to 21 to avoid stuff scrolling off the screen. Maybe there's an off-by-3 error somewhere? I also noticed that it always prints '1' when invoked as a function. The new license pager in site.py avoids this problem. help("operators") and several others raise an AttributeError('handledocrl'). The "lynx-like" links don't work. > Built-in Topics > > help( "intro" ) - What is Python? Read this first! > help( "keywords" ) - What are the keywords? > help( "syntax" ) - What is the overall syntax? > help( "operators" ) - What operators are available?
> help( "builtins" ) - What functions, types, etc. are built-in? > help( "modules" ) - What modules are in the standard library? > help( "copyright" ) - Who owns Python? > help( "moreinfo" ) - Where is there more information? > help( "changes" ) - What changed in Python 2.0? > help( "extensions" ) - What extensions are installed? > help( "faq" ) - What questions are frequently asked? > help( "ack" ) - Who has done work on Python lately? I think it's naive to expect this help facility to replace browsing the website or the full documentation package. There should be one entry that says to point your browser there (giving the local filesystem URL if available), and that's it. The rest of the online help facility should be concerned with exposing doc strings. > Security Issues > > This module will attempt to import modules with the same names as > requested topics. Don't use the modules if you are not confident > that everything in your pythonpath is from a trusted source. Yikes! Another reason to avoid the "string" -> global variable option. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Dec 11 16:53:37 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 11:53:37 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 11:02:29 EST." <14900.64149.910989.998348@anthem.concentric.net> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> Message-ID: <200012111653.LAA24545@cj20424-a.reston1.va.home.com> > Some of my thoughts after reading the PEP and Paul/Guido's exchange. > > - A function in the warn module is better than one in the sys module. > "from warnings import warn" is good enough to not warrant a > built-in. I get the sense that the PEP description is behind > Guido's currently implementation here. Yes. 
I've updated the PEP to match the (2nd) implementation. > - When PyErr_Warn() returns 1, does that mean a warning has been > transmuted into an exception, or some other exception occurred > during the setting of the warning? (I think I know, but the PEP > could be clearer here). I've clarified this now: it returns 1 in either case. You have to do exception handling in either case. I'm not telling why -- you don't need to know. The caller of PyErr_Warn() should not attempt to catch the exception -- if that's your intent, you shouldn't be calling PyErr_Warn(). And PyErr_Warn() is complicated enough that it has to allow raising an exception. > - It would be nice if lineno can be a range specification. Other > matches are based on regexps -- think of this as a line number > regexp. Too much complexity already. > - Why not do setupwarnings() in site.py? See the PEP and the current implementation. The delayed-loading of the warnings module means that we have to save the -W options as sys.warnoptions. (This also makes them work when multiple interpreters are used -- they all get the -W options.) > - Regexp matching on messages should be case insensitive. Good point! Done in my version of the code. > - The second argument to sys.warn() or PyErr_Warn() can be any class, > right? Almost. It must be derived from __builtin__.Warning. > If so, it's easy for me to have my own warning classes. > What if I want to set up my own warnings filters? Maybe if `action' > could be a callable as well as a string. Then in my IDE, I could > set that to "mygui.popupWarningsDialog". No, for that purpose you would override warnings.showwarning(). 
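The showwarning() override Guido points to is enough to build Barry's popup idea on, at least in the warnings module as it exists today. A minimal sketch, with a recording list standing in for a real dialog:

```python
import warnings

shown = []

def popup_warnings_dialog(message, category, filename, lineno,
                          file=None, line=None):
    # a real IDE would raise a dialog here; this stand-in just records
    shown.append((category.__name__, str(message)))

# replace the display hook rather than making the filter 'action' callable
warnings.showwarning = popup_warnings_dialog
warnings.simplefilter("always")      # ensure the warning is not filtered out
warnings.warn("this API is deprecated", UserWarning)
print(shown)   # [('UserWarning', 'this API is deprecated')]
```

The filters still decide *whether* a warning is delivered; the hook only controls *how* a delivered warning is presented, which is exactly the split Guido is defending.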
--Guido van Rossum (home page: http://www.python.org/~guido/) From thomas@xs4all.net Mon Dec 11 16:58:39 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 11 Dec 2000 17:58:39 +0100 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <14900.64149.910989.998348@anthem.concentric.net>; from barry@digicool.com on Mon, Dec 11, 2000 at 11:02:29AM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> Message-ID: <20001211175839.H4396@xs4all.nl> On Mon, Dec 11, 2000 at 11:02:29AM -0500, Barry A. Warsaw wrote: > - A function in the warn module is better than one in the sys module. > "from warnings import warn" is good enough to not warrant a > built-in. I get the sense that the PEP description is behind > Guido's currently implementation here. +1 on this. I have a response to Guido's first posted PEP on my laptop, but due to a weekend in Germany wasn't able to post it before he updated the PEP. I guess I can delete the arguments for this, now ;) but lets just say I think 'sys' is being a bit overused, and the case of a function in sys and its data in another module is just plain silly. > - When PyErr_Warn() returns 1, does that mean a warning has been > transmuted into an exception, or some other exception occurred > during the setting of the warning? (I think I know, but the PEP > could be clearer here). How about returning 1 for 'warning turned into exception' and -1 for 'normal exception' ? It would be slightly more similar to other functions if '-1' meant 'exception', and it would be easy to put in an if statement -- and still allow C code to ignore the produced error, if it wanted to. > - It would be nice if lineno can be a range specification. Other > matches are based on regexps -- think of this as a line number > regexp. +0 on this... I'm not sure if such fine-grained control is really necessary. 
I liked the hint at 'per function' granularity, but I realise it's tricky to do right, what with naming issues and all that. > - Regexp matching on messages should be case insensitive. How about being able to pass in compiled regexp objects as well as strings ? I haven't looked at the implementation at all, so I'm not sure how expensive it would be, but it might also be nice to have users (= programmers) pass in an object with its own 'match' method, so you can 'interactively' decide whether or not to raise an exception, popup a window, and what not. Sort of like letting 'action' be a callable, which I think is a good idea as well. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido@python.org Mon Dec 11 17:11:02 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 12:11:02 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 17:58:39 +0100." <20001211175839.H4396@xs4all.nl> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> Message-ID: <200012111711.MAA24818@cj20424-a.reston1.va.home.com> > > - When PyErr_Warn() returns 1, does that mean a warning has been > > transmuted into an exception, or some other exception occurred > > during the setting of the warning? (I think I know, but the PEP > > could be clearer here). > > How about returning 1 for 'warning turned into exception' and -1 for 'normal > exception' ? It would be slightly more similar to other functions if '-1' > meant 'exception', and it would be easy to put in an if statement -- and > still allow C code to ignore the produced error, if it wanted to. Why would you want this? The user clearly said that they wanted the exception! 
--Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik@effbot.org Mon Dec 11 17:13:10 2000 From: fredrik@effbot.org (Fredrik Lundh) Date: Mon, 11 Dec 2000 18:13:10 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com> Message-ID: <009a01c06395$a9da3220$3c6340d5@hagrid> > Hmm, I must have missed that one... care to repost ? doesn't everyone here read the daily URL? here's a link: http://mail.python.org/pipermail/python-dev/2000-December/010913.html From barry@digicool.com Mon Dec 11 17:18:04 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Mon, 11 Dec 2000 12:18:04 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> Message-ID: <14901.3149.109401.151742@anthem.concentric.net> >>>>> "GvR" == Guido van Rossum writes: GvR> I've clarified this now: it returns 1 in either case. You GvR> have to do exception handling in either case. I'm not GvR> telling why -- you don't need to know. The caller of GvR> PyErr_Warn() should not attempt to catch the exception -- if GvR> that's your intent, you shouldn't be calling PyErr_Warn(). GvR> And PyErr_Warn() is complicated enough that it has to allow GvR> raising an exception. Makes sense. >> - It would be nice if lineno can be a range specification. >> Other matches are based on regexps -- think of this as a line >> number regexp. GvR> Too much complexity already. Okay, no biggie I think. >> - Why not do setupwarnings() in site.py? GvR> See the PEP and the current implementation. 
The GvR> delayed-loading of the warnings module means that we have to GvR> save the -W options as sys.warnoptions. (This also makes GvR> them work when multiple interpreters are used -- they all get GvR> the -W options.) Cool. >> - Regexp matching on messages should be case insensitive. GvR> Good point! Done in my version of the code. Cool. >> - The second argument to sys.warn() or PyErr_Warn() can be any >> class, right? GvR> Almost. It must be derived from __builtin__.Warning. __builtin__.Warning == exceptions.Warning, right? >> If so, it's easy for me to have my own warning classes. What >> if I want to set up my own warnings filters? Maybe if `action' >> could be a callable as well as a string. Then in my IDE, I >> could set that to "mygui.popupWarningsDialog". GvR> No, for that purpose you would override GvR> warnings.showwarning(). Cool. Looks good. -Barry From thomas@xs4all.net Mon Dec 11 18:04:56 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 11 Dec 2000 19:04:56 +0100 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012111711.MAA24818@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 12:11:02PM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com> Message-ID: <20001211190455.I4396@xs4all.nl> On Mon, Dec 11, 2000 at 12:11:02PM -0500, Guido van Rossum wrote: > > How about returning 1 for 'warning turned into exception' and -1 for 'normal > > exception' ? It would be slightly more similar to other functions if '-1' > > meant 'exception', and it would be easy to put in an if statement -- and > > still allow C code to ignore the produced error, if it wanted to. > Why would you want this? The user clearly said that they wanted the > exception! 
The difference is that in one case, the user will see the original warning-turned-exception, and in the other she won't -- the warning will be lost. At best she'll see (by looking at the traceback) the code intended to give a warning (that might or might not have been turned into an exception) and failed. The warning code might decide to do something additional to notify the user of the thing it intended to warn about, which ended up as a 'real' exception because of something else. It's no biggy, obviously, except that if you change your mind it will be hard to add it without breaking code. Even if you explicitly state the return value should be tested for boolean value, not greater-than-zero value. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido@python.org Mon Dec 11 18:16:58 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 13:16:58 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 19:04:56 +0100." <20001211190455.I4396@xs4all.nl> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com> <20001211190455.I4396@xs4all.nl> Message-ID: <200012111816.NAA25214@cj20424-a.reston1.va.home.com> > > > How about returning 1 for 'warning turned into exception' and -1 for 'normal > > > exception' ? It would be slightly more similar to other functions if '-1' > > > meant 'exception', and it would be easy to put in an if statement -- and > > > still allow C code to ignore the produced error, if it wanted to. > > > Why would you want this? The user clearly said that they wanted the > > exception!
> > The difference is that in one case, the user will see the original > warning-turned-exception, and in the other she won't -- the warning will be > lost. At best she'll see (by looking at the traceback) the code intended to > give a warning (that might or might not have been turned into an exception) > and failed. Yes -- this is a standard convention in Python. if there's a bug in code that is used to raise or handle an exception, you get a traceback from that bug. > The warning code might decide to do something aditional to > notify the user of the thing it intended to warn about, which ended up as a > 'real' exception because of something else. Nah. The warning code shouldn't worry about that. If there's a bug in PyErr_Warn(), that should get top priority until it's fixed. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Dec 11 18:12:56 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 11 Dec 2000 19:12:56 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com> <009a01c06395$a9da3220$3c6340d5@hagrid> Message-ID: <3A351928.3A41C970@lemburg.com> Fredrik Lundh wrote: > > > Hmm, I must have missed that one... care to repost ? > > doesn't everyone here read the daily URL? No time for pull logic... only push logic ;-) > here's a link: > http://mail.python.org/pipermail/python-dev/2000-December/010913.html Thanks. A very nice introduction indeed. The only thing which didn't come through in the first reading: why do we need GF(p^n)'s in the first place ? The second reading then made this clear: we need to assure that by iterating through the set of possible coefficients we can actually reach all slots in the dictionary... a gem indeed. 
Now if we could only figure out an equally simple way of producing perfect hash functions on-the-fly we could eliminate the need for the PyObject_Compare()s... ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tim.one@home.com Mon Dec 11 20:22:55 2000 From: tim.one@home.com (Tim Peters) Date: Mon, 11 Dec 2000 15:22:55 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: [/F, on Christian's GF tutorial] > I think someone just won the "brain exploder 2000" award ;-) Well, anyone can play. When keys collide, what we need is a function f(i) such that repeating i = f(i) visits every int in (0, 2**N) exactly once before setting i back to its initial value, for a fixed N and where the first i is in (0, 2**N). This is the quickest:

def f(i):
    i -= 1
    if i == 0:
        i = 2**N-1
    return i

Unfortunately, this leads to performance-destroying "primary collisions" (see Knuth, or any other text w/ a section on hashing). Other *good* possibilities include a pseudo-random number generator of maximal period, or viewing the ints in (0, 2**N) as bit vectors indicating set membership and generating all subsets of an N-element set in a Gray code order. The *form* of the function dictobject.c actually uses is:

def f(i):
    i <<= 1
    if i >= 2**N:
        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

which is suitably non-linear and as fast as the naive method. Given the form of the function, you don't need any theory at all to find a value for MAGIC_CONSTANT_DEPENDING_ON_N that simply works. In fact, I verified all the magic values in dictobject.c via brute force, because the person who contributed the original code botched the theory slightly and gave us some values that didn't work.
I'll rely on the theory if and only if we have to extend this to 64-bit machines someday: I'm too old to wait for a brute search of a space with 2**64 elements . mathematics-is-a-battle-against-mortality-ly y'rs - tim From greg@cosc.canterbury.ac.nz Mon Dec 11 21:46:11 2000 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 12 Dec 2000 10:46:11 +1300 (NZDT) Subject: [Python-Dev] Online help PEP In-Reply-To: <200012111557.KAA24266@cj20424-a.reston1.va.home.com> Message-ID: <200012112146.KAA01771@s454.cosc.canterbury.ac.nz> Guido: > Paul Prescod: > > In either situation, the output does paging similar to the "more" > > command. > Agreed. Only if it can be turned off! I usually prefer to use the scrolling capabilities of whatever shell window I'm using rather than having some program's own idea of how to do paging forced upon me when I don't want it. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From moshez@zadka.site.co.il Tue Dec 12 06:33:02 2000 From: moshez@zadka.site.co.il (Moshe Zadka) Date: Tue, 12 Dec 2000 08:33:02 +0200 (IST) Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) Message-ID: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il> On Mon, 11 Dec 2000 15:22:55 -0500, "Tim Peters" wrote: > Well, anyone can play. When keys collide, what we need is a function f(i) > such that repeating > i = f(i) > visits every int in (0, 2**N) exactly once before setting i back to its > initial value, for a fixed N and where the first i is in (0, 2**N). OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)? Did 0 harm you in your childhood, and you're trying to get back? <0 wink>. If we had an affine operation, instead of a linear one, we could have [0, 2**n). 
I won't repeat the proof here but changing

> def f(i):
> i <<= 1
i^=1 # This is the line I added
> if i >= 2**N:
> i ^= MAGIC_CONSTANT_DEPENDING_ON_N
> return i

Makes you waltz all over [0, 2**n) if the original made you complete (0, 2**n). if-i'm-wrong-then-someone-should-shoot-me-to-save-me-the-embarrassment-ly y'rs, Z. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tim.one@home.com Mon Dec 11 22:38:56 2000 From: tim.one@home.com (Tim Peters) Date: Mon, 11 Dec 2000 17:38:56 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il> Message-ID: [Tim] > Well, anyone can play. When keys collide, what we need is a > function f(i) such that repeating > i = f(i) > visits every int in (0, 2**N) exactly once before setting i back to its > initial value, for a fixed N and where the first i is in (0, 2**N). [Moshe Zadka] > OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)? > Did 0 harm you in your childhood, and you're trying to get > back? <0 wink>. We don't need f at all unless we've already determined there's a collision at some index h. The i sequence is used to offset h (mod 2**N). An increment of 0 would waste time (h+0 == h, but we've already done a full compare on the h'th table entry and already determined it wasn't equal to what we're looking for). IOW, there are only 2**N-1 slots still of interest by the time f is needed. > If we had an affine operation, instead of a linear one, we could have > [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
> i <<= 1
> i^=1 # This is the line I added
> if i >= 2**N:
> i ^= MAGIC_CONSTANT_DEPENDING_ON_N
> return i
>
> Makes you waltz all over [0, 2**n) if the original made you complete
> (0, 2**n).

But, Moshe! The proof would have been the most interesting part.
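The brute-force verification Tim mentions is easy to reproduce. The sketch below (N kept small for speed) simply tries every candidate constant and keeps the ones whose probe sequence covers all of (0, 2**N):

```python
# Brute-force check, as Tim describes doing for dictobject.c: find the
# MAGIC_CONSTANT_DEPENDING_ON_N values for which
#     i <<= 1
#     if i >= 2**N: i ^= MAGIC
# visits every int in (0, 2**N) exactly once before returning to its start.

def visits_all(magic, n):
    """True iff the probe sequence starting at 1 covers all of 1..2**n-1."""
    size = 2 ** n
    seen = set()
    i = 1
    while i not in seen:
        seen.add(i)
        i <<= 1
        if i >= size:
            i ^= magic
    return seen == set(range(1, size))

def find_magics(n):
    """All magic constants in [2**n, 2**(n+1)) that give the full period."""
    size = 2 ** n
    return [m for m in range(size, 2 * size) if visits_all(m, n)]

for n in range(2, 6):
    print(n, find_magics(n))   # n=2 -> [7], n=3 -> [11, 13], ...
```

The survivors are exactly the constants whose bit pattern spells a primitive polynomial over GF(2), which is what Christian's theory predicts -- but, as Tim says, the brute search needs none of that.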
From gstein@lyra.org Tue Dec 12 00:15:50 2000 From: gstein@lyra.org (Greg Stein) Date: Mon, 11 Dec 2000 16:15:50 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012111653.LAA24545@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 11:53:37AM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> Message-ID: <20001211161550.Y7732@lyra.org> On Mon, Dec 11, 2000 at 11:53:37AM -0500, Guido van Rossum wrote: >... > > - The second argument to sys.warn() or PyErr_Warn() can be any class, > > right? > > Almost. It must be derived from __builtin__.Warning. Since you must do "from warnings import warn" before using the warnings, then I think it makes sense to put the Warning classes into the warnings module. (e.g. why increase the size of the builtins?) Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@python.org Tue Dec 12 00:39:31 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 19:39:31 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 16:15:50 PST." <20001211161550.Y7732@lyra.org> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> Message-ID: <200012120039.TAA02983@cj20424-a.reston1.va.home.com> > Since you must do "from warnings import warn" before using the warnings, > then I think it makes sense to put the Warning classes into the warnings > module. (e.g. why increase the size of the builtins?) 
I don't particularly care whether the Warning category classes are builtins, but I can't declare them in the warnings module. Typical use from C is: if (PyErr_Warn(PyExc_DeprecationWarning, "the strop module is deprecated")) return NULL; PyErr_Warn() imports the warnings module on its first call. But the value of PyExc_DeprecationWarning c.s. must be available *before* the first call, so they can't be imported from the warnings module! My first version imported warnings at the start of the program, but this almost doubled the start-up time, hence the design where the module is imported only when needed. The most convenient place to create the Warning category classes is in the _exceptions module; doing it the easiest way there means that they are automatically exported to __builtin__. This doesn't bother me enough to try and hide them. --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Dec 12 01:11:02 2000 From: gstein@lyra.org (Greg Stein) Date: Mon, 11 Dec 2000 17:11:02 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012120039.TAA02983@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 07:39:31PM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> Message-ID: <20001211171102.C7732@lyra.org> On Mon, Dec 11, 2000 at 07:39:31PM -0500, Guido van Rossum wrote: > > Since you must do "from warnings import warn" before using the warnings, > > then I think it makes sense to put the Warning classes into the warnings > > module. (e.g. why increase the size of the builtins?) 
> > I don't particularly care whether the Warning category classes are > builtins, but I can't declare them in the warnings module. Typical > use from C is: > > if (PyErr_Warn(PyExc_DeprecationWarning, > "the strop module is deprecated")) > return NULL; > > PyErr_Warn() imports the warnings module on its first call. But the > value of PyExc_DeprecationWarning c.s. must be available *before* the > first call, so they can't be imported from the warnings module! Do the following: pywarn.h or pyerrors.h: #define PyWARN_DEPRECATION "DeprecationWarning" ... if (PyErr_Warn(PyWARN_DEPRECATION, "the strop module is deprecated")) return NULL; The PyErr_Warn would then use the string to dynamically look up / bind to the correct value from the warnings module. By using the symbolic constant, you will catch typos in the C code (e.g. if people passed raw strings, then a typo won't be found until runtime; using symbols will catch the problem at compile time). The above strategy will allow for fully-delayed loading, and for all the warnings to be located in the "warnings" module. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@python.org Tue Dec 12 01:21:41 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 20:21:41 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 17:11:02 PST." <20001211171102.C7732@lyra.org> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> <20001211171102.C7732@lyra.org> Message-ID: <200012120121.UAA04576@cj20424-a.reston1.va.home.com> > > PyErr_Warn() imports the warnings module on its first call. But the > > value of PyExc_DeprecationWarning c.s. 
must be available *before* the > > first call, so they can't be imported from the warnings module! > > Do the following: > > pywarn.h or pyerrors.h: > > #define PyWARN_DEPRECATION "DeprecationWarning" > > ... > if (PyErr_Warn(PyWARN_DEPRECATION, > "the strop module is deprecated")) > return NULL; > > The PyErr_Warn would then use the string to dynamically look up / bind to > the correct value from the warnings module. By using the symbolic constant, > you will catch typos in the C code (e.g. if people passed raw strings, then > a typo won't be found until runtime; using symbols will catch the problem at > compile time). > > The above strategy will allow for fully-delayed loading, and for all the > warnings to be located in the "warnings" module. Yeah, that would be a possibility, if it was deemed evil that the warnings appear in __builtin__. I don't see what's so evil about that. (There's also the problem that the C code must be able to create new warning categories, as long as they are derived from the Warning base class. Your approach above doesn't support this. I'm sure you can figure a way around that too. But I prefer to hear why you think it's needed first.) 
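The Python-level counterpart of what Guido describes, a new category that is just a subclass of Warning, can be sketched with today's warnings module (the class name below is illustrative, not part of any API):

```python
import warnings

# Any subclass of Warning is a valid category, mirroring the
# requirement on the C side; StropDeprecationWarning is made up.
class StropDeprecationWarning(DeprecationWarning):
    pass

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn("the strop module is deprecated",
                  StropDeprecationWarning)

print(caught[0].category.__name__)
```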
--Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Tue Dec 12 01:26:00 2000 From: gstein@lyra.org (Greg Stein) Date: Mon, 11 Dec 2000 17:26:00 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012120121.UAA04576@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 08:21:41PM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> <20001211171102.C7732@lyra.org> <200012120121.UAA04576@cj20424-a.reston1.va.home.com> Message-ID: <20001211172600.E7732@lyra.org> On Mon, Dec 11, 2000 at 08:21:41PM -0500, Guido van Rossum wrote: >... > > The above strategy will allow for fully-delayed loading, and for all the > > warnings to be located in the "warnings" module. > > Yeah, that would be a possibility, if it was deemed evil that the > warnings appear in __builtin__. I don't see what's so evil about > that. > > (There's also the problem that the C code must be able to create new > warning categories, as long as they are derived from the Warning base > class. Your approach above doesn't support this. I'm sure you can > figure a way around that too. But I prefer to hear why you think it's > needed first.) I'm just attempting to avoid dumping more names into __builtins__ is all. 
I don't believe there is anything intrinsically bad about putting more names in there, but avoiding the kitchen-sink metaphor for __builtins__ has got to be a Good Thing :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido@python.org Tue Dec 12 13:43:59 2000 From: guido@python.org (Guido van Rossum) Date: Tue, 12 Dec 2000 08:43:59 -0500 Subject: [Python-Dev] Request review of gdbm patch Message-ID: <200012121343.IAA06713@cj20424-a.reston1.va.home.com> I'm asking for a review of the patch to gdbm at http://sourceforge.net/patch/?func=detailpatch&patch_id=102638&group_id=5470 I asked the author for clarification and this is what I got. Can anybody suggest what to do? His mail doesn't give me much confidence in the patch. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message Date: Tue, 12 Dec 2000 13:24:13 +0100 From: Damjan To: Guido van Rossum Subject: Re: your gdbm patch for Python On Mon, Dec 11, 2000 at 03:51:03PM -0500, Guido van Rossum wrote: > I'm looking at your patch at SourceForge: First, I'm sorry it was such a mess of a patch, but I couldn't figure out how to send a more elaborate comment. (But then again, I wouldn't have an email from Guido van Rossum in my mail-box, to show off to my friends :) > and wondering two things: > > (1) what does the patch do? > > (2) why does the patch remove the 'f' / GDBM_FAST option? From the gdbm info page: ...The following may also be logically or'd into the database flags: GDBM_SYNC, which causes all database operations to be synchronized to the disk, and GDBM_NOLOCK, which prevents the library from performing any locking on the database file. The option GDBM_FAST is now obsolete, since `gdbm' defaults to no-sync mode... ^^^^^^^^ (1) My patch adds two options to the gdbm.open(..) function. These are 'u' for GDBM_NOLOCK, and 's' for GDBM_SYNC. (2) GDBM_FAST is obsolete because gdbm defaults to GDBM_FAST, so it's removed.
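For reference, the flag letters proposed here are the ones that survive in the modern dbm.gnu bindings: 'f' for fast, 's' for synchronized, 'u' for unlocked, appended to the usual 'r'/'w'/'c'/'n'. A guarded sketch, since gdbm is an optional part of the build:

```python
import os
import tempfile

try:
    import dbm.gnu as gdbm  # the modern home of the gdbm bindings
except ImportError:
    gdbm = None  # gdbm support was not compiled in

if gdbm is not None:
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    db = gdbm.open(path, "cs")   # 'c' = create, 's' = GDBM_SYNC writes
    db[b"spam"] = b"eggs"
    db.close()
    db = gdbm.open(path, "ru")   # 'r' = read-only, 'u' = GDBM_NOLOCK
    value = db[b"spam"]
    db.close()
else:
    value = b"eggs"  # fall through on builds without gdbm

print(value.decode())
```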
I'm also thinking about adding lock and unlock methods to the gdbm object, but it seems that a gdbm database can only be locked and not unlocked. - -- Damjan Georgievski | Дамјан Георгиевски Skopje, Macedonia | Скопје, Македонија ------- End of Forwarded Message From mal@lemburg.com Tue Dec 12 13:49:40 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 12 Dec 2000 14:49:40 +0100 Subject: [Python-Dev] Codec aliasing and naming Message-ID: <3A362CF4.2082A606@lemburg.com> I just wanted to inform you of a change I plan for the standard encodings search function to enable better support for aliasing of encoding names. The current implementation caches the aliases returned from the codecs .getaliases() function in the encodings lookup cache rather than in the alias cache. As a consequence, the hyphen to underscore mapping is not applied to the aliases. A codec would have to return a list of all combinations of names with hyphens and underscores in order to emulate the standard lookup behaviour. I have a patch which fixes this and also assures that aliases cannot be overwritten by codecs which register at some later point in time. This assures that we won't run into situations where a codec import suddenly overrides behaviour of previously active codecs. I would also like to propose the use of a new naming scheme for codecs which enables drop-in installation. As discussed on the i18n-sig list, people would like to install codecs without requiring users to call a codec registration function or to touch site.py. The standard search function in the encodings package has a nice property (which I only noticed after the fact ;) which allows using Python package names in the encoding names, e.g. you can install a package 'japanese' and then access the codecs in that package using 'japanese.shiftjis' without having to bother registering a new codec search function for the package -- the encodings package search function will redirect the lookup to the 'japanese' package.
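The lookup machinery being described here is the public codecs.register() hook. The dotted name below is a stand-in for the hypothetical 'japanese' package, wired to the stock rot_13 codec so the sketch is self-contained:

```python
import codecs

def demo_search(name):
    # A real package would import and return its own CodecInfo here;
    # we redirect a package-style name to a codec that ships with Python.
    if name == "demo.rot13":
        return codecs.lookup("rot_13")
    return None  # unknown name: let other search functions try

codecs.register(demo_search)
print(codecs.encode("hello", "demo.rot13"))
```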
Using package names in the encoding name has several advantages:
* you know where the codec comes from
* you can have multiple codecs for the same encoding
* drop-in installation without registration is possible
* the need for a non-default encoding package is visible in the source code
* you no longer need to drop new codecs into the Python standard lib
Perhaps someone could add a note about this possibility to the codec docs ?! If no one objects, I'll apply the patch for the enhanced alias support later today. Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@python.org Tue Dec 12 13:57:01 2000 From: guido@python.org (Guido van Rossum) Date: Tue, 12 Dec 2000 08:57:01 -0500 Subject: [Python-Dev] Codec aliasing and naming In-Reply-To: Your message of "Tue, 12 Dec 2000 14:49:40 +0100." <3A362CF4.2082A606@lemburg.com> References: <3A362CF4.2082A606@lemburg.com> Message-ID: <200012121357.IAA06846@cj20424-a.reston1.va.home.com> > Perhaps someone could add a note about this possibility > to the codec docs ?! You can check it in yourself or mail it to Fred or submit it to SF... I don't expect anyone else will jump in and document this properly. > If no one objects, I'll apply the patch for the enhanced alias > support later today. Fine with me (but I don't use codecs -- where's the Dutch language support? :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Tue Dec 12 14:38:20 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Tue, 12 Dec 2000 15:38:20 +0100 Subject: [Python-Dev] Codec aliasing and naming References: <3A362CF4.2082A606@lemburg.com> <200012121357.IAA06846@cj20424-a.reston1.va.home.com> Message-ID: <3A36385C.60C7F2B@lemburg.com> Guido van Rossum wrote: > > > Perhaps someone could add a note about this possibility > > to the codec docs ?!
> > You can check it in yourself or mail it to Fred or submit it to SF... > I don't expect anyone else will jump in and document this properly. I'll submit a bug report so that this doesn't get lost in the archives. Don't have time for it myself... alas, no one really seems to have time these days ;-) > > If no one objects, I'll apply the patch for the enhanced alias > > support later today. > > Fine with me (but I don't use codecs -- where's the Dutch language > support? :-). OK. About the Dutch language support: this would make a nice Christmas fun-project... a new standard module which interfaces to babel.altavista.com (hmm, they don't list Dutch as a supported language yet, but maybe if we bug them enough... ;). -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From paulp@ActiveState.com Tue Dec 12 18:11:13 2000 From: paulp@ActiveState.com (Paul Prescod) Date: Tue, 12 Dec 2000 10:11:13 -0800 Subject: [Python-Dev] Online help PEP References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> Message-ID: <3A366A41.1A14EFD4@ActiveState.com> Guido van Rossum wrote: > >... > > help( "string" ) -- built-in topic or global > > Why does a global require string quotes? It doesn't, but if you happen to say help( "dir" ) instead of help( dir ), I think it should do the right thing. > I'm missing > > help() -- table of contents > > I'm not sure if the table of contents should be printed by the repr > output. I don't see any benefit in having different behaviors for help and help(). > > If you ask for a global, it can be a fully-qualified name such as > > help("xml.dom"). > > Why are the string quotes needed? When are they useful? When you haven't imported the thing you are asking about. Or when the string comes from another UI like an editor window, command line or web form.
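This string-or-object behaviour is how the facility eventually shipped (as pydoc, in Python 2.1): the lookup behind help() accepts either form and produces the same documentation, as a quick sketch shows:

```python
import pydoc

# Passing the object itself and passing its name as a string
# resolve to the same documentation text.
by_object = pydoc.render_doc(len, renderer=pydoc.plaintext)
by_name = pydoc.render_doc("len", renderer=pydoc.plaintext)

print(by_object == by_name)
print(by_object.splitlines()[0])
```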
> > You can also use the facility from a command-line > > > > python --help if > > Is this really useful? Sounds like Perlism to me. I'm just trying to make it easy to quickly get answers to Python questions. I could totally see someone writing code in VIM, switching to a bash window to type: python --help os.path.dirname That's a lot easier than: $ python Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. >>> import os >>> help(os.path.dirname) And what does it hurt? > > In either situation, the output does paging similar to the "more" > > command. > > Agreed. But how to implement paging in a platform-dependent manner? > On Unix, os.system("more") or "$PAGER" is likely to work. On Windows, > I suppose we could use its MORE, although that's pretty braindead. On > the Mac? Also, inside IDLE or Pythonwin, invoking the system pager > isn't a good idea. The current implementation does paging internally. You could override it to use the system pager (or no pager). > What does "demand-loaded" mean in a Python context? When you "touch" the help object, it loads the onlinehelp module which has the real implementation. The thing in __builtins__ is just a lightweight proxy. > > It should also be possible to override the help display function by > > assigning to onlinehelp.displayhelp(object_or_string). > > Good idea. Pythonwin and IDLE could use this. But I'd like it to > work at least "okay" if they don't. Agreed. > > The module should be able to extract module information from either > > the HTML or LaTeX versions of the Python documentation. Links should > > be accommodated in a "lynx-like" manner. > > I think this is beyond the scope.
Well, we have to do one of:
* re-write a subset of the docs in a form that can be accessed from the command line
* access the existing docs in a form that's installed
* auto-convert the docs into a form that's compatible
I've already implemented HTML parsing, and LaTeX parsing is actually not that far off. I just need impetus to finish a LaTeX-parsing project I started on my last vacation. The reason that LaTeX is interesting is because it would be nice to be able to move documentation from existing LaTeX files into docstrings. > The LaTeX isn't installed anywhere > (and processing would be too much work). > The HTML is installed only > on Windows, where there already is a way to get it to pop up in your > browser (actually two: it's in the Start menu, and also in IDLE's Help > menu). If the documentation becomes an integral part of the Python code, then it will be installed. It's ridiculous that it isn't already. ActivePython does install the docs on all platforms. > A standard syntax for docstrings is under development, PEP 216. I > don't agree with the proposal there, but in any case the help PEP > should not attempt to legalize a different format than PEP 216. I won't hold my breath for a standard Python docstring format. I've gone out of my way to make the code format independent. > Neat. I noticed that in a 24-line screen, the pagesize must be set to > 21 to avoid stuff scrolling off the screen. Maybe there's an off-by-3 > error somewhere? Yes. > I also noticed that it always prints '1' when invoked as a function. > The new license pager in site.py avoids this problem. Okay. > help("operators") and several others raise an > AttributeError('handledocrl'). Fixed. > The "lynx-like links" don't work. I don't think that's implemented yet. > I think it's naive to expect this help facility to replace browsing > the website or the full documentation package.
There should be one entry that says to point your browser there (giving the local filesystem URL if available), and that's it. The rest of the online help facility should be concerned with exposing doc strings. I don't want to replace the documentation. But there is no reason we should set out to make it incomplete. If it's integrated with the HTML, then people can choose whatever access mechanism is easiest for them. Right now I'm trying hard not to be "naive". Realistically, nobody is going to write a million docstrings between now and Python 2.1. It is much more feasible to leverage the existing documentation that Fred and others have spent months on. > > Security Issues > > > > This module will attempt to import modules with the same names as > > requested topics. Don't use the modules if you are not confident > > that everything in your pythonpath is from a trusted source. > Yikes! Another reason to avoid the "string" -> global variable > option. I don't think we should lose that option. People will want to look up information from non-executable environments like command lines, GUIs and web pages. Perhaps you can point me to techniques for extracting information from Python modules and packages without executing them. Paul From guido@python.org Tue Dec 12 20:46:09 2000 From: guido@python.org (Guido van Rossum) Date: Tue, 12 Dec 2000 15:46:09 -0500 Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE Message-ID: <200012122046.PAA16915@cj20424-a.reston1.va.home.com> ------- Forwarded Message Date: Tue, 12 Dec 2000 12:38:20 -0800 From: noreply@sourceforge.net To: noreply@sourceforge.net Subject: SourceForge: PROJECT DOWNTIME NOTICE ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS This update is being sent to project administrators only and contains important information regarding your project. Please read it in its entirety.
INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION As noted in the sitewide email sent this week, the SourceForge.net infrastructure is being upgraded (and relocated). As part of this project, plans are underway to further increase capacity and responsiveness. We are scheduling the relocation of the systems serving project subdomain web pages. IMPORTANT: This move will affect you in the following ways:
1. Service and availability of SourceForge.net and the development tools provided will continue uninterrupted.
2. Project page webservers hosting subdomains (yourprojectname.sourceforge.net) will be down Friday December 15 from 9PM PST (12AM EST) until 3AM PST.
3. CVS will be unavailable (read only part of the time) from 7PM until 3AM PST
4. Mailing lists and mail aliases will be unavailable until 3AM PST
- --------------------- This email was sent from sourceforge.net. To change your email receipt preferences, please visit the site and edit your account via the "Account Maintenance" link. Direct any questions to admin@sourceforge.net, or reply to this email. ------- End of Forwarded Message From greg@cosc.canterbury.ac.nz Tue Dec 12 22:42:01 2000 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 13 Dec 2000 11:42:01 +1300 (NZDT) Subject: [Python-Dev] Online help PEP In-Reply-To: <3A366A41.1A14EFD4@ActiveState.com> Message-ID: <200012122242.LAA01902@s454.cosc.canterbury.ac.nz> Paul Prescod: > Guido: > > Why are the string quotes needed? When are they useful? > When you haven't imported the thing you are asking about. It would be interesting if the quoted form allowed you to extract doc info from a module *without* having the side effect of importing it. This could no doubt be done for pure Python modules. Would be rather tricky for extension modules, though, I expect.
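Greg's idea is workable along exactly these lines: parse the source and read the docstrings without ever executing the module. A sketch with the (much later) ast module; the toy source and function name are illustrative:

```python
import ast

SOURCE = '''\
"""Module docstring, readable without importing the module."""

def dirname(p):
    """Return the directory component of a pathname."""
    return p.rsplit("/", 1)[0]
'''

# Parsing never runs the module's code, so there are no import
# side effects; we only walk the syntax tree.
tree = ast.parse(SOURCE)
print(ast.get_docstring(tree))
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print(node.name, "-", ast.get_docstring(node))
```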
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From barry@digicool.com Wed Dec 13 02:21:36 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Tue, 12 Dec 2000 21:21:36 -0500 Subject: [Python-Dev] Two new PEPs, 232 & 233 Message-ID: <14902.56624.20961.768525@anthem.concentric.net> I've just uploaded two new PEPs. 232 is a revision of my pre-PEP era function attribute proposal. 233 is Paul Prescod's proposal for an on-line help facility. http://python.sourceforge.net/peps/pep-0232.html http://python.sourceforge.net/peps/pep-0233.html Let the games begin, -Barry From tim.one@home.com Wed Dec 13 03:34:35 2000 From: tim.one@home.com (Tim Peters) Date: Tue, 12 Dec 2000 22:34:35 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: Message-ID: [Moshe Zadka]
> If we had an affine operation, instead of a linear one, we could have
> [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you cover (0, 2**n).
[Tim] > But, Moshe! The proof would have been the most interesting part . Turns out the proof would have been intensely interesting, as you can see by running the attached with and without the new line commented out. don't-ever-trust-a-theoretician-ly y'rs - tim

N = 2
MAGIC_CONSTANT_DEPENDING_ON_N = 7

def f(i):
    i <<= 1
    # i^=1 # This is the line I added
    if i >= 2**N:
        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

i = 1
for nothing in range(4):
    print i,
    i = f(i)
print i
Kuchling) Date: Tue, 12 Dec 2000 22:55:33 -0500 Subject: [Python-Dev] Splitting up _cursesmodule Message-ID: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> At 2502 lines, _cursesmodule.c is cumbersomely large. I've just received a patch from Thomas Gellekum that adds support for the panel library that will add another 500 lines. I'd like to split the C file into several subfiles (_curses_panel.c, _curses_window.c, etc.) that get #included from the master _cursesmodule.c file. Do the powers that be approve of this idea? --amk From tim.one@home.com Wed Dec 13 03:54:20 2000 From: tim.one@home.com (Tim Peters) Date: Tue, 12 Dec 2000 22:54:20 -0500 Subject: FW: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE Message-ID: FYI, looks like SourceForge is scheduled to be unusable in a span covering late Friday thru early Saturday (OTT -- One True Time, defined by the clocks in Guido's house). -----Original Message----- From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On Behalf Of Guido van Rossum Sent: Tuesday, December 12, 2000 3:46 PM To: python-dev@python.org Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE ------- Forwarded Message Date: Tue, 12 Dec 2000 12:38:20 -0800 From: noreply@sourceforge.net To: noreply@sourceforge.net Subject: SourceForge: PROJECT DOWNTIME NOTICE ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS This update is being sent to project administrators only and contains important information regarding your project. Please read it in its entirety. INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION As noted in the sitewide email sent this week, the SourceForge.net infrastructure is being upgraded (and relocated). As part of this project, plans are underway to further increase capacity and responsiveness. We are scheduling the relocation of the systems serving project subdomain web pages. IMPORTANT: This move will affect you in the following ways:
1. Service and availability of SourceForge.net and the development tools provided will continue uninterrupted.
2. Project page webservers hosting subdomains (yourprojectname.sourceforge.net) will be down Friday December 15 from 9PM PST (12AM EST) until 3AM PST.
3. CVS will be unavailable (read only part of the time) from 7PM until 3AM PST
4. Mailing lists and mail aliases will be unavailable until 3AM PST
--------------------- This email was sent from sourceforge.net. To change your email receipt preferences, please visit the site and edit your account via the "Account Maintenance" link. Direct any questions to admin@sourceforge.net, or reply to this email. ------- End of Forwarded Message _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://www.python.org/mailman/listinfo/python-dev From esr@thyrsus.com Wed Dec 13 04:29:17 2000 From: esr@thyrsus.com (Eric S. Raymond) Date: Tue, 12 Dec 2000 23:29:17 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> Message-ID: <20001212232917.A22839@thyrsus.com> A.M. Kuchling : > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > received a patch from Thomas Gellekum that adds support for the panel > library that will add another 500 lines. I'd like to split the C file > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > get #included from the master _cursesmodule.c file. > > Do the powers that be approve of this idea? I doubt I qualify as a power that be, but I'm certainly +1 on panel support. -- Eric S. Raymond The biggest hypocrites on gun control are those who live in upscale developments with armed security guards -- and who want to keep other people from having guns to defend themselves.
But what about lower-income people living in high-crime, inner city neighborhoods? Should such people be kept unarmed and helpless, so that limousine liberals can 'make a statement' by adding to the thousands of gun laws already on the books?" --Thomas Sowell From fdrake@acm.org Wed Dec 13 06:24:01 2000 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 13 Dec 2000 01:24:01 -0500 (EST) Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> Message-ID: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> A.M. Kuchling writes: > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > received a patch from Thomas Gellekum that adds support for the panel > library that will add another 500 lines. I'd like to split the C file > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > get #included from the master _cursesmodule.c file. Would it be reasonable to add panel support as a second extension module? Is there really a need for them to be in the same module, since the panel library is a separate library? -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From gstein@lyra.org Wed Dec 13 07:58:38 2000 From: gstein@lyra.org (Greg Stein) Date: Tue, 12 Dec 2000 23:58:38 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> Message-ID: <20001212235838.T8951@lyra.org> On Tue, Dec 12, 2000 at 10:55:33PM -0500, A.M. Kuchling wrote: > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > received a patch from Thomas Gellekum that adds support for the panel > library that will add another 500 lines. 
I'd like to split the C file > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > get #included from the master _cursesmodule.c file. Why should they be #included? I thought that we can build multiple .c files into a module... Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 13 08:05:05 2000 From: gstein@lyra.org (Greg Stein) Date: Wed, 13 Dec 2000 00:05:05 -0800 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects dictobject.c,2.68,2.69 In-Reply-To: <200012130102.RAA31828@slayer.i.sourceforge.net>; from tim_one@users.sourceforge.net on Tue, Dec 12, 2000 at 05:02:49PM -0800 References: <200012130102.RAA31828@slayer.i.sourceforge.net> Message-ID: <20001213000505.U8951@lyra.org> On Tue, Dec 12, 2000 at 05:02:49PM -0800, Tim Peters wrote: > Update of /cvsroot/python/python/dist/src/Objects > In directory slayer.i.sourceforge.net:/tmp/cvs-serv31776/python/dist/src/objects > > Modified Files: > dictobject.c > Log Message: > Bring comments up to date (e.g., they still said the table had to be > a prime size, which is in fact never true anymore ...). >... > --- 55,78 ---- > > /* > ! There are three kinds of slots in the table: > ! > ! 1. Unused. me_key == me_value == NULL > ! Does not hold an active (key, value) pair now and never did. Unused can > ! transition to Active upon key insertion. This is the only case in which > ! me_key is NULL, and is each slot's initial state. > ! > ! 2. Active. me_key != NULL and me_key != dummy and me_value != NULL > ! Holds an active (key, value) pair. Active can transition to Dummy upon > ! key deletion. This is the only case in which me_value != NULL. > ! > ! 3. Dummy. me_key == dummy && me_value == NULL > ! Previously held an active (key, value) pair, but that was deleted and an > ! active pair has not yet overwritten the slot. Dummy can transition to > ! Active upon key insertion. Dummy slots cannot be made Unused again > ! 
(cannot have me_key set to NULL), else the probe sequence in case of > ! collision would have no way to know they were once active. 4. The popitem finger. :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From moshez@zadka.site.co.il Wed Dec 13 19:19:53 2000 From: moshez@zadka.site.co.il (Moshe Zadka) Date: Wed, 13 Dec 2000 21:19:53 +0200 (IST) Subject: [Python-Dev] Splitting up _cursesmodule Message-ID: <20001213191953.7208DA82E@darjeeling.zadka.site.co.il> On Tue, 12 Dec 2000 23:29:17 -0500, "Eric S. Raymond" wrote: > A.M. Kuchling : > > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > > received a patch from Thomas Gellekum that adds support for the panel > > library that will add another 500 lines. I'd like to split the C file > > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > > get #included from the master _cursesmodule.c file. > > > > Do the powers that be approve of this idea? > > I doubt I qualify as a power that be, but I'm certainly +1 on panel support. I'm +1 on panel support, but that seems the wrong solution. Why not have several C modules (_curses_panel,...) and manage a more unified namespace with the Python wrapper modules? /curses/panel.py -- from _curses_panel import * etc. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From akuchlin@mems-exchange.org Wed Dec 13 12:44:23 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 13 Dec 2000 07:44:23 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 01:24:01AM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> Message-ID: <20001213074423.A30348@kronos.cnri.reston.va.us> [CC'ing Thomas Gellekum ] On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr.
wrote: > Would it be reasonable to add panel support as a second extension >module? Is there really a need for them to be in the same module, >since the panel library is a separate library? Quite possibly, though the patch isn't structured that way. The panel module will need access to the type object for the curses window object, so it'll have to ensure that _curses is already imported, but that's no problem. Thomas, do you feel capable of implementing it as a separate module, or should I work on it? Probably a _cursesmodule.h header will have to be created to make various definitions available to external users of the basic objects in _curses. (Bonus: this means that the menu and form libraries, if they ever get wrapped, can be separate modules, too.) --amk From tg@melaten.rwth-aachen.de Wed Dec 13 14:00:46 2000 From: tg@melaten.rwth-aachen.de (Thomas Gellekum) Date: 13 Dec 2000 15:00:46 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 07:44:23 -0500" References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> Message-ID: Andrew Kuchling writes: > [CC'ing Thomas Gellekum ] > > On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr. wrote: > > Would it be reasonable to add panel support as a second extension > >module? Is there really a need for them to be in the same module, > >since the panel library is a separate library? > > Quite possibly, though the patch isn't structured that way. The panel > module will need access to the type object for the curses window > object, so it'll have to ensure that _curses is already imported, but > that's no problem. You mean as separate modules like import curses import panel ? Hm. A panel object is associated with a window object, so it's created from a window method. 
This means you'd need to add window.new_panel() to PyCursesWindow_Methods[] and curses.update_panels(), curses.panel_above() and curses.panel_below() (or whatever they're called after we're through discussing this ;-)) to PyCurses_Methods[]. Also, the curses.panel_{above,below}() wrappers need access to the list_of_panels via find_po(). > Thomas, do you feel capable of implementing it as a separate module, > or should I work on it? It's probably finished a lot sooner when you do it; OTOH, it would be fun to try it. Let's carry this discussion a bit further. > Probably a _cursesmodule.h header will have > to be created to make various definitions available to external > users of the basic objects in _curses. That's easy. The problem is that we want to extend those basic objects in _curses. > (Bonus: this means that the > menu and form libraries, if they ever get wrapped, can be separate > modules, too.) Sure, if we solve this for panel, the others are a SMOP. :-) tg From guido@python.org Wed Dec 13 14:31:52 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 09:31:52 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src README,1.106,1.107 In-Reply-To: Your message of "Wed, 13 Dec 2000 06:14:35 PST." <200012131414.GAA20849@slayer.i.sourceforge.net> References: <200012131414.GAA20849@slayer.i.sourceforge.net> Message-ID: <200012131431.JAA21243@cj20424-a.reston1.va.home.com> > + --with-cxx=: Some C++ compilers require that main() is > + compiled with the C++ if there is any C++ code in the application. > + Specifically, g++ on a.out systems may require that to support > + construction of global objects. With this option, the main() function > + of Python will be compiled with ; use that only if you > + plan to use C++ extension modules, and if your compiler requires > + compilation of main() as a C++ program. Thanks for documenting this; see my continued reservation in the (reopened) bug report. 
Another question remains regarding the docs though: why is it bad to always compile main.c with a C++ compiler? --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Wed Dec 13 15:19:01 2000 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 13 Dec 2000 10:19:01 -0500 (EST) Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> Message-ID: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> Thomas Gellekum writes: > You mean as separate modules like > > import curses > import panel Or better yet: import curses import curses.panel > ? Hm. A panel object is associated with a window object, so it's > created from a window method. This means you'd need to add > window.new_panel() to PyCursesWindow_Methods[] and > curses.update_panels(), curses.panel_above() and curses.panel_below() > (or whatever they're called after we're through discussing this ;-)) > to PyCurses_Methods[]. Do these new functions have to be methods on the window objects, or can they be functions in the new module that take a window as a parameter? The underlying window object can certainly provide slots for the use of the panel (forms, ..., etc.) bindings, and simply initialize them to NULL (or whatever) for newly created windows. > Also, the curses.panel_{above,below}() wrappers need access to the > list_of_panels via find_po(). There's no reason that underlying utilities can't be provided by _curses using a CObject. The Extending & Embedding manual has a section on using CObjects to provide a C API to a module without having to link to it directly. > That's easy. The problem is that we want to extend those basic objects > in _curses. Again, I'm curious about the necessity of this. I suspect it can be avoided. 
I think the approach I've hinted at above will allow you to avoid this, and will allow the panel (forms, ...) support to be added simply by adding additional modules as they are written and the underlying libraries are installed on the host. I know the question of including these modules in the core distribution has come up before, but the resurgence in interest in these makes me want to bring it up again: Does the curses package (and the associated C extension(s)) belong in the standard library, or does it make sense to spin out a distutils-based package? I've no objection to them being in the core, but it seems that the release cycle may want to diverge from Python's. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From guido@python.org Wed Dec 13 15:48:50 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 10:48:50 -0500 Subject: [Python-Dev] Online help PEP In-Reply-To: Your message of "Tue, 12 Dec 2000 10:11:13 PST." <3A366A41.1A14EFD4@ActiveState.com> References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> <3A366A41.1A14EFD4@ActiveState.com> Message-ID: <200012131548.KAA21344@cj20424-a.reston1.va.home.com> [Paul's PEP] > > > help( "string" ) -- built-in topic or global [me] > > Why does a global require string quotes? [Paul] > It doesn't, but if you happen to say > > help( "dir" ) instead of help( dir ), I think it should do the right > thing. Fair enough. > > I'm missing > > > > help() -- table of contents > > > > I'm not sure if the table of contents should be printed by the repr > > output. > > I don't see any benefit in having different behaviors for help and > help(). Having the repr() overloading invoke the pager is dangerous. The beta version of the license command did this, and it caused some strange side effects, e.g. vars(__builtins__) would start reading from input and confuse the users. 
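Stepping back to the curses thread for a moment: Fred's suggestion — panel operations as module-level functions taking a window argument, backed by a module-level list of panels — can be sketched in pure Python. Every name below (Window, new_panel, update_panels, _panels) is illustrative only; the real code would live in a C extension loaded as curses.panel:

```python
class Window:
    """Pure-Python stand-in for the _curses window object; note that
    the window type itself needs no knowledge of panels at all."""
    def __init__(self, nlines, ncols):
        self.size = (nlines, ncols)

_panels = []  # the module-level list_of_panels from the discussion

def new_panel(win):
    """A function taking a window argument, rather than a window method."""
    _panels.append(win)
    return len(_panels) - 1

def update_panels():
    """Return the panels in stacking order (bottom first)."""
    return list(_panels)

win = Window(10, 40)
pan = new_panel(win)
```

This keeps the basic window object in _curses untouched, which is the point of Fred's "I suspect it can be avoided" remark.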
The new version's repr() returns the desired string if it's less than a page, and 'Type license() to see the full license text' if the pager would need to be invoked. > > > If you ask for a global, it can be a fully-qualified name such as > > > help("xml.dom"). > > > > Why are the string quotes needed? When are they useful? > > When you haven't imported the thing you are asking about. Or when the > string comes from another UI like an editor window, command line or web > form. The implied import is a major liability. If you can do this without importing (e.g. by source code inspection), fine. Otherwise, you might issue some kind of message like "you must first import XXX.YYY". > > > You can also use the facility from a command-line > > > > > > python --help if > > > > Is this really useful? Sounds like Perlism to me. > > I'm just trying to make it easy to quickly get answers to Python > questions. I could totally see someone writing code in VIM switching to > a bash window to type: > > python --help os.path.dirname > > That's a lot easier than: > > $ python > Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32 > Type "copyright", "credits" or "license" for more information. > >>> import os > >>> help(os.path.dirname) > > And what does it hurt? The hurt is code bloat in the interpreter and creeping featurism. If you need command line access to the docs (which may be a reasonable thing to ask for, although to me it sounds backwards :-), it's better to provide a separate command, e.g. pythondoc. (Analogous to perldoc.) > > > In either situation, the output does paging similar to the "more" > > > command. > > > > Agreed. But how to implement paging in a platform-dependent manner? > > On Unix, os.system("more") or "$PAGER" is likely to work. On Windows, > > I suppose we could use its MORE, although that's pretty braindead. On > > the Mac? Also, inside IDLE or Pythonwin, invoking the system pager > > isn't a good idea.
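Guido's paging question above can be answered without any platform-specific pager at all, which is what the PEP's implementation does. The sketch below is not the PEP's actual code — the function name and default page size are illustrative, and it is written in present-day Python:

```python
def page_string(text, pagesize=21):
    """Print text one page at a time, pausing between pages.

    Paging internally (rather than shelling out to more(1) or $PAGER)
    behaves the same on Unix, Windows, and inside IDLE or Pythonwin.
    A pagesize of 21 leaves room for the prompt on a 24-line terminal.
    """
    lines = text.splitlines()
    for start in range(0, len(lines), pagesize):
        for line in lines[start:start + pagesize]:
            print(line)
        if start + pagesize < len(lines):
            # Only prompt when there is actually more text to show.
            input('-- more -- (press Enter) ')

page_string("Type help(object) for help about object.")
```

Picking pagesize relative to the terminal height, rather than hard-coding it, would also address the off-by-3 complaint raised later in this thread.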
> > The current implementation does paging internally. You could override it > to use the system pager (or no pager). Yes. Please add that option to the PEP. > > What does "demand-loaded" mean in a Python context? > > When you "touch" the help object, it loads the onlinehelp module which > has the real implementation. The thing in __builtins__ is just a > lightweight proxy. Please suggest an implementation. > > > It should also be possible to override the help display function by > > > assigning to onlinehelp.displayhelp(object_or_string). > > > > Good idea. Pythonwin and IDLE could use this. But I'd like it to > > work at least "okay" if they don't. > > Agreed. Glad you're so agreeable. :) > > > The module should be able to extract module information from either > > > the HTML or LaTeX versions of the Python documentation. Links should > > > be accommodated in a "lynx-like" manner. > > > > I think this is beyond the scope. > > Well, we have to do one of: > > * re-write a subset of the docs in a form that can be accessed from the > command line > * access the existing docs in a form that's installed > * auto-convert the docs into a form that's compatible I really don't think that this tool should attempt to do everything. If someone *really* wants to browse the existing (large) doc set in a terminal emulation window, let them use lynx and point it to the documentation set. (I agree that the HTML docs should be installed, by the way.) > I've already implemented HTML parsing and LaTeX parsing is actually not > that far off. I just need impetus to finish a LaTeX-parsing project I > started on my last vacation. A LaTeX parser would be most welcome -- if it could replace latex2html! That old Perl program is really ready for retirement. (Ask Fred.) > The reason that LaTeX is interesting is because it would be nice to be > able to move documentation from existing LaTeX files into docstrings. That's what some people think.
I disagree that it would be either feasible or a good idea to put all documentation for a typical module in its doc strings. > > The LaTeX isn't installed anywhere > > (and processing would be too much work). > > The HTML is installed only > > on Windows, where there already is a way to get it to pop up in your > > browser (actually two: it's in the Start menu, and also in IDLE's Help > > menu). > > If the documentation becomes an integral part of the Python code, then > it will be installed. It's ridiculous that it isn't already. Why is that ridiculous? It's just as easy to access them through the web for most people. If it's not, they are available in easily downloadable tarballs supporting a variety of formats. That's just too much to be included in the standard RPMs. (Also, latex2html requires so much hand-holding, and is so slow, that it's really not a good idea to let "make install" install the HTML by default.) > ActivePython does install the docs on all platforms. Great. More power to you. > > A standard syntax for docstrings is under development, PEP 216. I > don't agree with the proposal there, but in any case the help PEP > should not attempt to legalize a different format than PEP 216. > > I won't hold my breath for a standard Python docstring format. I've gone > out of my way to make the code format independent. To tell you the truth, I'm not holding my breath either. :-) So your code should just dump the doc string on stdout without interpreting it in any way (except for paging). > > Neat. I noticed that in a 24-line screen, the pagesize must be set to > > 21 to avoid stuff scrolling off the screen. Maybe there's an off-by-3 > > error somewhere? > > Yes. It's buggier than just that. The output of the pager prints an extra "| " at the start of each page except for the first, and the first page is a line longer than subsequent pages. BTW, another bug: try help(cgi).
It's nice that it gives the default value for arguments, but the defaults for FieldStorage.__init__ happen to include os.environ. Its entire value is dumped -- which causes the pager to be off (it wraps over about 20 lines for me). I think you may have to truncate long values a bit, e.g. by using the repr module. > > I also noticed that it always prints '1' when invoked as a function. > > The new license pager in site.py avoids this problem. > > Okay. Where's the check-in? :-) > > help("operators") and several others raise an > > AttributeError('handledocrl'). > > Fixed. > > > The "lynx-like links" don't work. > > I don't think that's implemented yet. I'm not sure what you intended to implement there. I prefer to see the raw URLs, then I can do whatever I normally do to paste them into my preferred webbrowser (which is *not* lynx :-). > > I think it's naive to expect this help facility to replace browsing > the website or the full documentation package. There should be one > entry that says to point your browser there (giving the local > filesystem URL if available), and that's it. The rest of the online > help facility should be concerned with exposing doc strings. > > I don't want to replace the documentation. But there is no reason we > should set out to make it incomplete. If it's integrated with the HTML > then people can choose whatever access mechanism is easiest for them right now. > > I'm trying hard not to be "naive". Realistically, nobody is going to > write a million docstrings between now and Python 2.1. It is much more > feasible to leverage the existing documentation that Fred and others > have spent months on. I said above, and I'll say it again: I think the majority of people would prefer to use their standard web browser to read the standard docs. It's not worth the effort to try to make those accessible through help().
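The repr-module trick Guido mentions for the os.environ default is worth spelling out; in today's Python the module is called reprlib (it was named repr in the 2.0 era), and everything else here is standard library:

```python
import os
import reprlib  # the standard repr module, renamed reprlib in Python 3

# Bound how much of any single default value reaches the help output.
short = reprlib.Repr()
short.maxstring = 40   # truncate long string reprs to ~40 characters
short.maxdict = 3      # show at most 3 dictionary entries

# os.environ's full contents would wrap over ~20 lines in help(cgi);
# the bounded repr stays a short one-liner ending in "...".
print(short.repr(dict(os.environ)))
```

Feeding every default value through such a Repr instance before display would keep the page accounting honest no matter what a signature contains.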
In fact, I'd encourage the development of a command-line-invoked help facility that shows doc strings in the user's preferred web browser -- the webbrowser module makes this trivial. > > > Security Issues > > > > > > This module will attempt to import modules with the same names as > > > requested topics. Don't use the modules if you are not confident > > > that everything in your pythonpath is from a trusted source. > > Yikes! Another reason to avoid the "string" -> global variable > > option. > > I don't think we should lose that option. People will want to look up > information from non-executable environments like command lines, GUIs > and web pages. Perhaps you can point me to techniques for extracting > information from Python modules and packages without executing them. I don't know specific tools, but any serious docstring processing tool ends up parsing the source code for this very reason, so there's probably plenty of prior art. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Wed Dec 13 16:07:22 2000 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 13 Dec 2000 11:07:22 -0500 (EST) Subject: [Python-Dev] Online help PEP In-Reply-To: <200012131548.KAA21344@cj20424-a.reston1.va.home.com> References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> <3A366A41.1A14EFD4@ActiveState.com> <200012131548.KAA21344@cj20424-a.reston1.va.home.com> Message-ID: <14903.40634.569192.704368@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > A LaTeX parser would be most welcome -- if it could replace > latex2html! That old Perl program is really ready for retirement. > (Ask Fred.) Note that Doc/tools/sgmlconv/latex2esis.py already includes a moderate start at a LaTeX parser. Paragraph marking is done as a separate step in Doc/tools/sgmlconv/docfixer.py, but I'd like to push that down into the LaTeX handler.
(Note that these tools are mostly broken at the moment, except for latex2esis.py, which does most of what I need other than paragraph marking.) -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From Barrett@stsci.edu Wed Dec 13 16:34:40 2000 From: Barrett@stsci.edu (Paul Barrett) Date: Wed, 13 Dec 2000 11:34:40 -0500 (EST) Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) In-Reply-To: <20001210054646.A5219@glacier.fnational.com> References: <20001210054646.A5219@glacier.fnational.com> Message-ID: <14903.41669.883591.420446@nem-srvr.stsci.edu> Neil Schemenauer writes: > SourceForge uploads are not working. The latest version of the > patch for PEP 208 is here: > > http://arctrix.com/nas/python/coerce-6.0.diff > > Operations on instances now call __coerce__ if it exists. I > think the patch is now complete. Converting other builtin types > to "new style numbers" can be done with a separate patch. My one concern about this patch is whether the non-commutativity of operators is preserved. This issue is important for matrix operations (not to be confused with element-wise array operations). -- Paul From guido@python.org Wed Dec 13 16:45:12 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 11:45:12 -0500 Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) In-Reply-To: Your message of "Wed, 13 Dec 2000 11:34:40 EST." <14903.41669.883591.420446@nem-srvr.stsci.edu> References: <20001210054646.A5219@glacier.fnational.com> <14903.41669.883591.420446@nem-srvr.stsci.edu> Message-ID: <200012131645.LAA21719@cj20424-a.reston1.va.home.com> > Neil Schemenauer writes: > > SourceForge uploads are not working. The latest version of the > > patch for PEP 208 is here: > > > > http://arctrix.com/nas/python/coerce-6.0.diff > > > > Operations on instances now call __coerce__ if it exists. I > > think the patch is now complete. Converting other builtin types > > to "new style numbers" can be done with a separate patch.
> > My one concern about this patch is whether the non-commutativity of > operators is preserved. This issue is important for matrix > operations (not to be confused with element-wise array operations). Yes, this is preserved. (I'm spending most of my waking hours understanding this patch -- it is a true piece of wizardry.) --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Wed Dec 13 17:38:00 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Wed, 13 Dec 2000 18:38:00 +0100 Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) References: <20001210054646.A5219@glacier.fnational.com> <14903.41669.883591.420446@nem-srvr.stsci.edu> <200012131645.LAA21719@cj20424-a.reston1.va.home.com> Message-ID: <3A37B3F7.5640FAFC@lemburg.com> Guido van Rossum wrote: > > > Neil Schemenauer writes: > > > SourceForge uploads are not working. The latest version of the > > > patch for PEP 208 is here: > > > > > > http://arctrix.com/nas/python/coerce-6.0.diff > > > > > > Operations on instances now call __coerce__ if it exists. I > > > think the patch is now complete. Converting other builtin types > > > to "new style numbers" can be done with a separate patch. > > > > My one concern about this patch is whether the non-commutativity of > > operators is preserved. This issue is important for matrix > > operations (not to be confused with element-wise array operations). > > Yes, this is preserved. (I'm spending most of my waking hours > understanding this patch -- it is a true piece of wizardry.) The fact that coercion didn't allow detection of parameter order was the initial cause for my try at fixing it back then. I was confronted with the fact that at C level there was no way to tell whether the operands were in the order left, right or right, left -- as a result I used a gross hack in mxDateTime to still make this work...
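Paul's non-commutativity concern, and Marc-Andre's operand-order complaint, boil down to one requirement: the dispatch machinery must be able to tell x*y from y*x. At the Python level that is exactly what the reflected special methods express — a toy illustration of the idea, not the PEP 208 patch itself:

```python
class Mat:
    """Toy matrix stand-in that records operand order."""
    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        # self is the *left* operand: Mat * other
        rhs = getattr(other, 'name', other)
        return '%s*%s' % (self.name, rhs)

    def __rmul__(self, other):
        # self is the *right* operand: other * Mat (e.g. scalar * Mat)
        lhs = getattr(other, 'name', other)
        return '%s*%s' % (lhs, self.name)

a, b = Mat('a'), Mat('b')
print(a * b)   # a*b -- distinct from...
print(b * a)   # b*a
print(2 * a)   # 2*a -- scalar on the left dispatches to __rmul__
```

A coercion scheme that folded both operands into one symmetric call would lose this left/right information, which is the hole the patch closes.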
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From esr@thyrsus.com Wed Dec 13 21:01:46 2000 From: esr@thyrsus.com (Eric S. Raymond) Date: Wed, 13 Dec 2000 16:01:46 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> Message-ID: <20001213160146.A24753@thyrsus.com> Fred L. Drake, Jr. : > I know the question of including these modules in the core > distribution has come up before, but the resurgence in interest in > these makes me want to bring it up again: Does the curses package > (and the associated C extension(s)) belong in the standard library, or > does it make sense to spin out a distutils-based package? I've no > objection to them being in the core, but it seems that the release > cycle may want to diverge from Python's. Curses needs to be in the core for political reasons. Specifically, to support CML2 without requiring any extra packages or downloads beyond the stock Python interpreter. And what makes CML2 so constrained and so important? It's my bid to replace the Linux kernel's configuration machinery. It has many advantages over the existing config system, but the linux developers are *very* resistant to adding things to the kernel's minimum build kit. Python alone may prove too much for them to swallow (though there are hopeful signs they will); Python plus a separately downloadable curses module would definitely be too much. 
Guido attaches sufficient importance to getting Python into the kernel build machinery that he approved adding ncurses to the standard modules on that basis. This would be a huge design win for us, raising Python's visibility considerably. So curses must stay in the core. I don't have a requirement for panels; my present curses front end simulates them. But if panels were integrated into the core I could simplify the front-end code significantly. Every line I can remove from my stuff (even if it, in effect, is just migrating into the Python core) makes it easier to sell CML2 into the kernel. -- Eric S. Raymond "Experience should teach us to be most on our guard to protect liberty when the government's purposes are beneficent... The greatest dangers to liberty lurk in insidious encroachment by men of zeal, well meaning but without understanding." -- Supreme Court Justice Louis Brandeis From jheintz@isogen.com Wed Dec 13 21:10:32 2000 From: jheintz@isogen.com (John D. Heintz) Date: Wed, 13 Dec 2000 15:10:32 -0600 Subject: [Python-Dev] Announcing ZODB-Corba code release Message-ID: <3A37E5C8.7000800@isogen.com> Here is the first release of code that exposes a ZODB database through CORBA (omniORB). The code is functioning, the docs are sparse, and it should work on your machines. ;-) I am only going to be in town for the next two days, then I will be unavailable until Jan 1. See http://www.zope.org/Members/jheintz/ZODB_CORBA_Connection to download the code. It's not perfect, but it works for me. Enjoy, John -- . . . . . . . . . . . . . . . . . . . . . . . . John D. Heintz | Senior Engineer 1016 La Posada Dr. | Suite 240 | Austin TX 78752 T 512.633.1198 | jheintz@isogen.com w w w . d a t a c h a n n e l . c o m From guido@python.org Wed Dec 13 21:19:01 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 16:19:01 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Your message of "Wed, 13 Dec 2000 16:01:46 EST."
<20001213160146.A24753@thyrsus.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> Message-ID: <200012132119.QAA11060@cj20424-a.reston1.va.home.com> > So curses must stay in the core. I don't have a requirement for > panels; my present curses front end simulates them. But if panels were > integrated into the core I could simplify the front-end code > significantly. Every line I can remove from my stuff (even if it, in > effect, is just migrating into the Python core) makes it easier to > sell CML2 into the kernel. On the other hand you may want to be conservative. You already have to require Python 2.0 (I presume). The panel stuff will be available in 2.1 at the earliest. You probably shouldn't throw out your panel emulation until your code has already been accepted... --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@loewis.home.cs.tu-berlin.de Wed Dec 13 21:56:27 2000 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 13 Dec 2000 22:56:27 +0100 Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107 Message-ID: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de> > Another question remains regarding the docs though: why is it bad to > always compile main.c with a C++ compiler? For the whole thing to work, it may also be necessary to link the entire application with a C++ compiler; that in turn may bind to the C++ library. Linking with the system's C++ library means that the Python executable cannot be as easily exchanged between installations of the operating system - you'd also need to have the right version of the C++ library to run it. If the C++ library is static, that may also increase the size of the executable. 
I can't really point to a specific problem that would occur on a specific system I use if main() was compiled with a C++ compiler. However, on the systems I use (Windows, Solaris, Linux), you can build C++ extension modules even if Python was not compiled as a C++ application. On Solaris and Windows, you'd also have to choose the C++ compiler you want to use (MSVC++, SunPro CC, or g++); in turn, different C++ runtime systems would be linked into the application. Regards, Martin From esr@thyrsus.com Wed Dec 13 22:03:59 2000 From: esr@thyrsus.com (Eric S. Raymond) Date: Wed, 13 Dec 2000 17:03:59 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132119.QAA11060@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 04:19:01PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> Message-ID: <20001213170359.A24915@thyrsus.com> Guido van Rossum : > > So curses must stay in the core. I don't have a requirement for > > panels; my present curses front end simulates them. But if panels were > > integrated into the core I could simplify the front-end code > > significantly. Every line I can remove from my stuff (even if it, in > > effect, is just migrating into the Python core) makes it easier to > > sell CML2 into the kernel. > > On the other hand you may want to be conservative. You already have > to require Python 2.0 (I presume). The panel stuff will be available > in 2.1 at the earliest. You probably shouldn't throw out your panel > emulation until your code has already been accepted... Yes, that's how I am currently expecting it to play out -- but if the 2.4.0 kernel is delayed another six months, I'd change my mind.
I'll explain this, because python-dev people should grok what the surrounding politics and timing are. I actually debated staying with 1.5.2 as a base version. What changed my mind was two things. One: by going to 2.0 I could drop close to 600 lines and three entire support modules from CML2, slimming down its footprint in the kernel tree significantly (by more than 10% of the entire code volume, actually). Second: CML2 is not going to be seriously evaluated until 2.4.0 final is out. Linus made this clear when I demoed it for him at LWE. My best guess about when that will happen is late January into February. By the time Red Hat issues its next distro after that (probably May or thenabouts) it's a safe bet 2.0 will be on it, and everywhere else. But if the 2.4.0 kernel slips another six months yet again, and our 2.1 comes out relatively quickly (like, just before the 9th Python Conference :-)) then we *might* have time to get 2.1 into the distros before CML2 gets the imprimatur. So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel will be delayed yet again :-). -- Eric S. Raymond Ideology, politics and journalism, which luxuriate in failure, are impotent in the face of hope and joy. -- P. J. O'Rourke From nas@arctrix.com Wed Dec 13 15:37:45 2000 From: nas@arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 07:37:45 -0800 Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107 In-Reply-To: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Wed, Dec 13, 2000 at 10:56:27PM +0100 References: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de> Message-ID: <20001213073745.C17148@glacier.fnational.com> These are issues to consider for Python 3000 as well. AFAIK, C++ ABIs are a nightmare. Neil From fdrake@acm.org Wed Dec 13 22:29:25 2000 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 17:29:25 -0500 (EST) Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213170359.A24915@thyrsus.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> Message-ID: <14903.63557.282592.796169@cj42289-a.reston1.va.home.com> Eric S. Raymond writes: > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel > will be delayed yet again :-). Politics aside, I think development of curses-related extensions like panels and forms doesn't need to be delayed. I've posted what I think are relevant technical comments already, and leave it up to the developers of any new modules to get them written -- I don't know enough curses to offer any help there. Regardless of how the curses package is distributed and deployed, I don't see any reason to delay development in its existing location in the Python CVS repository. -Fred -- Fred L. Drake, Jr.
PythonLabs at Digital Creations From nas@arctrix.com Wed Dec 13 15:41:54 2000 From: nas@arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 07:41:54 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> Message-ID: <20001213074154.D17148@glacier.fnational.com> On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote: > CML2 is not going to be seriously evaluated until 2.4.0 final > is out. Linus made this clear when I demoed it for him at LWE. > My best guess about when that will happen is late January into > February. By the time Red Hat issues its next distro after > that (probably May or thenabouts) it's a safe bet 2.0 will be > on it, and everywhere else. I don't think that is a very safe bet. Python 2.0 missed the Debian Potato boat. I have no idea when Woody is expected to be released but I expect it may take longer than that if history is any indication. Neil From guido@python.org Wed Dec 13 23:03:31 2000 From: guido@python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 18:03:31 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Your message of "Wed, 13 Dec 2000 07:41:54 PST."
<20001213074154.D17148@glacier.fnational.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> Message-ID: <200012132303.SAA12434@cj20424-a.reston1.va.home.com> > I don't think that is a very safe bet. Python 2.0 missed the > Debian Potato boat. This may have had to do more with the unresolved GPL issues. I recently received a mail from Stallman indicating that an agreement with CNRI has been reached; they have agreed (in principle, at least) to specific changes to the CNRI license that will defuse the choice-of-law clause when it is combined with GPL-licensed code "in a non-separable way". A glitch here is that the BeOpen license probably has to be changed too, but I believe that that's all doable. > I have no idea when Woody is expected to be > released but I expect it may take longer than that if history is > any indication. And who or what is Woody? 
Feeling-left-out, --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein@lyra.org Wed Dec 13 23:16:09 2000 From: gstein@lyra.org (Greg Stein) Date: Wed, 13 Dec 2000 15:16:09 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> Message-ID: <20001213151609.E8951@lyra.org> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: >... > > I have no idea when Woody is expected to be > > released but I expect it may take longer than that if history is > > any indication. > > And who or what is Woody? One of the Debian releases. Dunno if it is the "next" release, but there ya go. 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein@lyra.org Wed Dec 13 23:18:34 2000 From: gstein@lyra.org (Greg Stein) Date: Wed, 13 Dec 2000 15:18:34 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> Message-ID: <20001213151834.F8951@lyra.org> On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote: >... > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel > will be delayed yet again :-). The kernel is not going to be delayed that much. Linus wants it to go out this month. Worst case, I could see January. But no way on six months. But as Fred said: that should not change panels going into the curses support at all. You can always have a "compat.py" module in CML2 that provides functionality for prior-to-2.1 releases of Python. I'd also be up for a separate _curses_panels module, loaded into the curses package. Cheers, -g -- Greg Stein, http://www.lyra.org/ From esr@thyrsus.com Wed Dec 13 23:33:02 2000 From: esr@thyrsus.com (Eric S. 
Raymond) Date: Wed, 13 Dec 2000 18:33:02 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213151834.F8951@lyra.org>; from gstein@lyra.org on Wed, Dec 13, 2000 at 03:18:34PM -0800 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213151834.F8951@lyra.org> Message-ID: <20001213183302.A25160@thyrsus.com> Greg Stein <gstein@lyra.org>: > On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote: > >... > > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel > > will be delayed yet again :-). > > The kernel is not going to be delayed that much. Linus wants it to go out > this month. Worst case, I could see January. But no way on six months. I know what Linus wants. That's why I'm estimating end of January or early February -- the man's error curve on these estimates has a certain, er, *consistency* about it. -- Eric S. Raymond Alcohol still kills more people every year than all `illegal' drugs put together, and Prohibition only made it worse. Oppose the War On Some Drugs! 
From nas@arctrix.com Wed Dec 13 17:18:48 2000 From: nas@arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 09:18:48 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> Message-ID: <20001213091848.A17326@glacier.fnational.com> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: > > I don't think that is a very safe bet. Python 2.0 missed the > > Debian Potato boat. > > This may have had to do more with the unresolved GPL issues. I can't remember the exact dates but I think Debian Potato was frozen before Python 2.0 was released. Once a Debian release is frozen packages are not upgraded except under unusual circumstances. > I recently received a mail from Stallman indicating that an > agreement with CNRI has been reached; they have agreed (in > principle, at least) to specific changes to the CNRI license > that will defuse the choice-of-law clause when it is combined > with GPL-licensed code "in a non-separable way". A glitch here > is that the BeOpen license probably has to be changed too, but > I believe that that's all doable. This is great news. > > I have no idea when Woody is expected to be > > released but I expect it may take longer than that if history is > > any indication. > > And who or what is Woody? Woody would be another character from the Pixar movie "Toy Story" (just like Rex, Bo, Potato, Slink, and Hamm). 
I believe Bruce Perens used to work at Pixar. Debian uses a code name for the development release until a release number is assigned. This avoids some problems but has the disadvantage of confusing people who are not familiar with Debian. I should have said "the next stable release of Debian". Neil (aka nas@debian.org) From akuchlin@mems-exchange.org Thu Dec 14 00:26:32 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 13 Dec 2000 19:26:32 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> Message-ID: <20001213192632.A30585@kronos.cnri.reston.va.us> On Wed, Dec 13, 2000 at 10:19:01AM -0500, Fred L. Drake, Jr. wrote: > Do these new functions have to be methods on the window objects, or >can they be functions in the new module that take a window as a >parameter? The underlying window object can certainly provide slots Panels and windows have a 1-1 association, but they're separate objects. The window.new_panel function could become just a function which takes a window as its first argument; it would only need the TypeObject for PyCursesWindow, in order to do typechecking. > > Also, the curses.panel_{above,below}() wrappers need access to the > > list_of_panels via find_po(). The list_of_panels is used only in the curses.panel module, so it could be private to that module, since only panel-related functions care about it. I'm ambivalent about the list_of_panels. It's a linked list storing (PyWindow, PyPanel) pairs. Probably it should use a dictionary instead of implementing a little list, just to reduce the amount of code. 
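To make the dictionary idea concrete, here is a hypothetical Python-level sketch of the association amk describes; the real code lives in C inside _cursesmodule, and `Window`, `Panel`, `new_panel`, and `find_panel` below are illustrative stand-ins, not the actual API:

```python
# Hypothetical sketch of the window/panel association, keyed by id(window)
# instead of walking a hand-rolled linked list of (PyWindow, PyPanel) pairs.

class Window:
    """Stand-in for a curses window object."""

class Panel:
    """Stand-in for a curses panel; each panel wraps exactly one window."""
    def __init__(self, window):
        self.window = window

# Private to the panel module, replacing the list walked by find_po().
_panels = {}

def new_panel(window):
    panel = Panel(window)
    _panels[id(window)] = panel
    return panel

def find_panel(window):
    return _panels.get(id(window))

w = Window()
p = new_panel(w)
print(find_panel(w) is p)  # -> True
```

The C version would key on the window pointer rather than `id()`, but the 1-1 mapping and the private-to-the-module ownership are the same.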
>does it make sense to spin out a distutils-based package? I've no >objection to them being in the core, but it seems that the release >cycle may want to diverge from Python's. Consensus seemed to be to leave it in; I'd have no objection to removing it, but either course is fine with me. So, I suggest we create _curses_panel.c, which would be available as curses.panel. (A panel.py module could then add any convenience functions that are required.) Thomas, do you want to work on this, or should I? --amk From nas@arctrix.com Wed Dec 13 17:43:06 2000 From: nas@arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 09:43:06 -0800 Subject: [Python-Dev] OT: Debian and Python In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100 References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> Message-ID: <20001213094306.C17326@glacier.fnational.com> On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote: > Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0. > Someone created a separate set of 2.0-packages, but they didn't include > readline and gdbm support because of the licencing issues. (Posted on c.l.py > sometime this week.) I've had Python packages for Debian stable for a while. I guess I should have posted a link: http://arctrix.com/nas/python/debian/ Most useful modules are enabled. > I'm *almost* tempted enough to learn enough about > dpkg/.deb files to build my own licence-be-damned set It's quite easy. Debian source packages are basically a diff. 
Applying the diff will create a "debian" directory and in that directory will be a makefile called "rules". Use the target "binary" to create new binary packages. Good things to know are that you must be in the source directory when you run the makefile (i.e. ./debian/rules binary). You should be running a shell under fakeroot to get the install permissions right (running "fakeroot" will do). You need to have the Debian developer tools installed. There is a list somewhere on debian.org. "apt-get source <package>" will get, extract and patch a package ready for tweaking and building (handy for getting stuff from unstable to run on stable). This is too off topic for python-dev. If anyone needs more info they can email me directly. Neil From thomas@xs4all.net Thu Dec 14 00:05:34 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 14 Dec 2000 01:05:34 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> Message-ID: <20001214010534.M4396@xs4all.nl> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: > > I don't think that is a very safe bet. Python 2.0 missed the Debian > > Potato boat. > > This may have had to do more with the unresolved GPL issues. This is very likely. Debian is very licence -- or at least GPL -- aware. 
Which is a pity, really, because I already prefer it over RedHat in all other cases (and RedHat is also pretty licence aware, just less piously, devoutly, beyond-practicality-IMHO dedicated to the GPL.) > > I have no idea when Woody is expected to be released but I expect it may > > take longer than that if history is any indication. BTW, I believe Debian uses a fairly steady release schedule, something like an unstable->stable switch every year or 6 months or so ? I seem to recall seeing something like that on the debian website, but can't check right now. > And who or what is Woody? Woody is Debian's current development branch, the current bearer of the alias 'unstable'. It'll become Debian 2.3 (I believe, I don't pay attention to version numbers, I just run unstable :) once it's stabilized. 'potato' is the previous development branch, and currently the 'stable' branch. You can compare them with 'rawhide' and 'redhat-7.0', respectively :) (With the enormous difference that you can upgrade your debian install to a new version (even the devel version, or update your machine to the latest devel snapshot) while you are using it, without having to reboot ;) Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0. Someone created a separate set of 2.0-packages, but they didn't include readline and gdbm support because of the licencing issues. (Posted on c.l.py sometime this week.) I'm *almost* tempted enough to learn enough about dpkg/.deb files to build my own licence-be-damned set, but it'd be a lot of work to mirror the current debian 1.5.2 set of packages (which include numeric, imaging, mxTools, GTK/GNOME, and a shitload of 3rd party modules) in 2.0. Ponder, maybe it could be done semi-automatically, from the src-deb's of those packages. By the way, in woody, there are 52 packages with 'python' in the name, and 32 with 'perl' in the name... 
Pity all of my perl-hugging hippy-friends are still blindly using RedHat, and refuse to listen to my calls from the Debian/Python-dark-side :-) Oh, and the names 'woody' and 'potato' came from the movie Toy Story, in case you wondered ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From esr@snark.thyrsus.com Thu Dec 14 00:46:37 2000 From: esr@snark.thyrsus.com (Eric S. Raymond) Date: Wed, 13 Dec 2000 19:46:37 -0500 Subject: [Python-Dev] Business related to the upcoming Python conference Message-ID: <200012140046.TAA25289@snark.thyrsus.com> I'm sending this to python-dev because I believe most or all of the reviewers for my PC9 paper are on this list. Paul, would you please forward to any who were not? First, my humble apologies for not having got my PC9 reviews in on time. I diligently read my assigned papers early, but I couldn't do the reviews early because of technical problems with my Foretec account -- and then I couldn't do them late because the pre-deadline crunch happened while I was on a ten-day speaking and business trip in Japan and California, with mostly poor or nonexistent Internet access. Matters were not helped by a nasty four-month-old problem in my personal life coming to a head right in the middle of the trip. Nor by the fact that the trip included the VA Linux Systems annual stockholders' meeting and the toughest Board of Directors' meeting in my tenure. We had to hammer out a strategic theory of what to do now that the dot-com companies who used to be our best companies aren't getting funded any more. Unfortunately, it's at times like this that Board members earn their stock options. Management oversight. Fiduciary responsibility. Mumble... Second, the feedback I received on the paper was *excellent*, and I will be making many of the recommended changes. I've already extended the discussion of "Why Python?" including addressing the weaknesses of Scheme and Prolog for this application. 
I have said more about uses of CML2 beyond the Linux kernel. I am working on a discussion of the politics of CML2 adoption, but may save that for the stand-up talk rather than the written paper. I will try to trim the CML2 language reference for the final version. (The reviewer who complained about the lack of references on the SAT problem should be pleased to hear that URLs to relevant papers are in fact included in the masters. I hope they show in the final version as rendered for publication.) -- Eric S. Raymond The Constitution is not neutral. It was designed to take the government off the backs of the people. -- Justice William O. Douglas From moshez@zadka.site.co.il Thu Dec 14 12:22:24 2000 From: moshez@zadka.site.co.il (Moshe Zadka) Date: Thu, 14 Dec 2000 14:22:24 +0200 (IST) Subject: [Python-Dev] Splitting up _cursesmodule Message-ID: <20001214122224.739EEA82E@darjeeling.zadka.site.co.il> On Wed, 13 Dec 2000 07:41:54 -0800, Neil Schemenauer wrote: > I don't think that is a very safe bet. Python 2.0 missed the > Debian Potato boat. By a long time -- potato was frozen for a few months when 2.0 came out. > I have no idea when Woody is expected to be > released but I expect it may take longer than that if history is > any indication. My bet is that woody starts freezing as soon as 2.4.0 is out. Note that once it starts freezing, 2.1 doesn't have a shot of getting in, regardless of how long it takes to freeze. OTOH, since in woody time there's a good chance for the "testing" distribution, a lot more people would be running something that *can* and *will* upgrade to 2.1 almost as soon as it is out. (For the record, most of the Debian users I know run woody on their server) -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! 
From jeremy@alum.mit.edu Thu Dec 14 05:04:43 2000 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 14 Dec 2000 00:04:43 -0500 (EST) Subject: [Python-Dev] new draft of PEP 227 Message-ID: <14904.21739.804346.650062@bitdiddle.concentric.net> I've got a new draft of PEP 227. The terminology and wording are more convoluted than they need to be. I'll do at least one revision just to say things more clearly, but I'd appreciate comments on the proposed spec if you can read the current draft. Jeremy From cgw@fnal.gov Thu Dec 14 06:03:01 2000 From: cgw@fnal.gov (Charles G Waldman) Date: Thu, 14 Dec 2000 00:03:01 -0600 (CST) Subject: [Python-Dev] Memory leaks in tupleobject.c Message-ID: <14904.25237.654143.861733@buffalo.fnal.gov> I've been running a set of memory-leak tests against the latest Python and have found that running "test_extcall" leaks memory. This gave me a strange sense of deja vu, having fixed this once before... From the CVS logs for tupleobject.c: revision 2.31 date: 2000/04/21 21:15:05; author: guido; state: Exp; lines: +59 -16 Patch by Charles G Waldman to avoid a sneaky memory leak in _PyTuple_Resize(). revision 2.47 date: 2000/10/05 19:36:49; author: nascheme; state: Exp; lines: +24 -86 Simplify _PyTuple_Resize by not using the tuple free list and dropping support for the last_is_sticky flag. A few hard to find bugs may be fixed by this patch since the old code was buggy. The 2.47 patch seems to have re-introduced the memory leak which was fixed in 2.31. Maybe the old code was buggy, but the "right thing" would have been to fix it, not to throw it away.... if _PyTuple_Resize simply ignores the tuple free list, memory will be leaked. 
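For readers following along, the kind of check Charles describes can be sketched like this in modern Python. Note the hedges: `tracemalloc` is a much later addition (Python 3.4), and in 2000 the equivalent was running the suspect code in a loop while watching process size in ps or top; `suspect()` is a placeholder for whatever code is under test.

```python
# Sketch of a leak-spotting loop in the spirit of the leaktest.py Charles
# mentions: sample traced memory, run the suspect code many times, sample
# again, and look for steady growth between samples.
import tracemalloc

def suspect():
    # Placeholder for the code under test, e.g. something that would
    # exercise _PyTuple_Resize in the C implementation.
    return tuple(range(100))

tracemalloc.start()
for _ in range(1000):
    suspect()
first, _peak = tracemalloc.get_traced_memory()
for _ in range(1000):
    suspect()
second, _peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# A genuine leak shows up as steady growth between the two samples;
# allocator-cache noise stays roughly constant.
print("grew by", second - first, "bytes")
```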
From nas@arctrix.com Wed Dec 13 23:43:43 2000 From: nas@arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 15:43:43 -0800 Subject: [Python-Dev] Memory leaks in tupleobject.c In-Reply-To: <14904.25237.654143.861733@buffalo.fnal.gov>; from cgw@fnal.gov on Thu, Dec 14, 2000 at 12:03:01AM -0600 References: <14904.25237.654143.861733@buffalo.fnal.gov> Message-ID: <20001213154343.A18303@glacier.fnational.com> On Thu, Dec 14, 2000 at 12:03:01AM -0600, Charles G Waldman wrote: > date: 2000/10/05 19:36:49; author: nascheme; state: Exp; lines: +24 -86 > Simplify _PyTuple_Resize by not using the tuple free list and dropping > support for the last_is_sticky flag. A few hard to find bugs may be > fixed by this patch since the old code was buggy. > > The 2.47 patch seems to have re-introduced the memory leak which was > fixed in 2.31. Maybe the old code was buggy, but the "right thing" > would have been to fix it, not to throw it away.... if _PyTuple_Resize > simply ignores the tuple free list, memory will be leaked. Guilty as charged. Can you explain how the current code is leaking memory? I can see one problem with deallocating size=0 tuples. Are there any more leaks? Neil From cgw@fnal.gov Thu Dec 14 06:57:05 2000 From: cgw@fnal.gov (Charles G Waldman) Date: Thu, 14 Dec 2000 00:57:05 -0600 (CST) Subject: [Python-Dev] Memory leaks in tupleobject.c In-Reply-To: <20001213154343.A18303@glacier.fnational.com> References: <14904.25237.654143.861733@buffalo.fnal.gov> <20001213154343.A18303@glacier.fnational.com> Message-ID: <14904.28481.292539.354303@buffalo.fnal.gov> Neil Schemenauer writes: > Guilty as charged. Can you explain how the current code is > leaking memory? I can see one problem with deallocating size=0 > tuples. Are there any more leaks? 
Actually, I think I may have spoken too hastily - it's late and I'm tired and I should be sleeping rather than staring at the screen (like I've been doing since 8:30 this morning) - I jumped to conclusions - I'm not really sure that it was your patch that caused the leak; all I can say with 100% certainty is that if you run "test_extcall" in a loop, memory usage goes through the ceiling.... It's not just the cyclic garbage caused by the "saboteur" function because even with this commented out, the memory leak persists. I'm actually trying to track down a different memory leak, something which is currently causing trouble in one of our production servers (more about this some other time) and just as a sanity check I ran my little "leaktest.py" script over all the test_*.py modules in the distribution, and found that test_extcall triggers leaks... having analyzed and fixed this once before (see the CVS logs for tupleobject.c), I jumped to conclusions about the reason for its return. I'll take a more clear-headed and careful look tomorrow and post something (hopefully) a little more conclusive. It may have been some other change that caused this memory leak to re-appear. If you feel inclined to investigate, just do "reload(test.test_extcall)" in a loop and watch the memory usage with ps or top or what-have-you... -C From paulp@ActiveState.com Thu Dec 14 07:00:21 2000 From: paulp@ActiveState.com (Paul Prescod) Date: Wed, 13 Dec 2000 23:00:21 -0800 Subject: [Python-Dev] new draft of PEP 227 References: <14904.21739.804346.650062@bitdiddle.concentric.net> Message-ID: <3A387005.6725DAAE@ActiveState.com> Jeremy Hylton wrote: > > I've got a new draft of PEP 227. The terminology and wording are more > convoluted than they need to be. I'll do at least one revision just > to say things more clearly, but I'd appreciate comments on the > proposed spec if you can read the current draft. It set me to thinking: Python should never require declarations. 
But would it necessarily be a problem for Python to have a variable declaration syntax? Might not the existence of declarations simplify some aspects of the proposal and of backwards compatibility? Along the same lines, might a new rule make Python code more robust? We could say that a local can only shadow a global if the local is formally declared. It's pretty rare that there is a good reason to shadow a global and Python makes it too easy to do accidentally. Paul Prescod From paulp@ActiveState.com Thu Dec 14 07:29:35 2000 From: paulp@ActiveState.com (Paul Prescod) Date: Wed, 13 Dec 2000 23:29:35 -0800 Subject: [Python-Dev] Online help scope Message-ID: <3A3876DF.5554080C@ActiveState.com> I think Guido and I are pretty far apart on the scope and requirements of this online help thing so I'd like some clarification and opinions from the peanut gallery. Consider these scenarios a) Signature >>> help( dir ) dir([object]) -> list of strings b) Usage hint >>> help( dir ) dir([object]) -> list of strings Return an alphabetized list of names comprising (some of) the attributes of the given object. Without an argument, the names in the current scope are listed. With an instance argument, only the instance attributes are returned. With a class argument, attributes of the base class are not returned. For other types or arguments, this may list members or methods. c) Complete documentation, paged (man-style) >>> help( dir ) dir([object]) -> list of strings Without arguments, return the list of names in the current local symbol table. With an argument, attempts to return a list of valid attributes for that object. This information is gleaned from the object's __dict__, __methods__ and __members__ attributes, if defined. The list is not necessarily complete; e.g., for classes, attributes defined in base classes are not included, and for class instances, methods are not included. The resulting list is sorted alphabetically. 
For example: >>> import sys >>> dir() ['sys'] >>> dir(sys) ['argv', 'exit', 'modules', 'path', 'stderr', 'stdin', 'stdout'] d) Complete documentation in a user-chosen hypertext window >>> help( dir ) (Netscape or lynx pops up) I'm thinking that maybe we need two functions: * help * pythondoc pythondoc("dir") would launch the Python documentation for the "dir" command. > That'S What Some People Think. I Disagree That It Would Be Either > Feasible Or A Good Idea To Put All Documentation For A Typical Module > In Its Doc Strings. Java and Perl people do it regularly. I think that in the greater world of software development, the inline model has won (or is winning) and I don't see a compelling reason to fight the tide. There will always be out-of-line tutorials, discussions, books etc. The canonical module documentation could be inline. That improves the likelihood of it being maintained. The LaTeX documentation is a major bottleneck and moving to XML or SGML will not help. Programmers do not want to learn documentation systems or syntaxes. They want to write code and comments. > I said above, and I'll say it again: I think the majority of people > would prefer to use their standard web browser to read the standard > docs. It's not worth the effort to try to make those accessible > through help(). No matter what we decide on the issue above, reusing the standard documentation is the only practical way of populating the help system in the short-term. Right now, today, there is a ton of documentation that exists only in LaTeX and HTML. Tons of modules have no docstrings. Keywords have no docstrings. Compare the docstring for urllib.urlretrieve to the HTML documentation. In fact, you've given me a good idea: if the HTML is not available locally, I can access it over the web. 
Paul Prescod From paulp@ActiveState.com Thu Dec 14 07:29:53 2000 From: paulp@ActiveState.com (Paul Prescod) Date: Wed, 13 Dec 2000 23:29:53 -0800 Subject: [Python-Dev] Online help PEP References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> <3A366A41.1A14EFD4@ActiveState.com> <200012131548.KAA21344@cj20424-a.reston1.va.home.com> Message-ID: <3A3876F1.D3E65E90@ActiveState.com> Guido van Rossum wrote: > > Having the repr() overloading invoke the pager is dangerous. The beta > version of the license command did this, and it caused some strange > side effects, e.g. vars(__builtins__) would start reading from input > and confuse the users. The new version's repr() returns the desired > string if it's less than a page, and 'Type license() to see the full > license text' if the pager would need to be invoked. I'll add this to the PEP. > The implied import is a major liability. If you can do this without > importing (e.g. by source code inspection), fine. Otherwise, you > might issue some kind of message like "you must first import XXX.YYY". Okay, I'll add to the PEP that an open issue is what strategy to use, but that we want to avoid implicit import. > The hurt is code bloat in the interpreter and creeping featurism. If > you need command line access to the docs (which may be a reasonable > thing to ask for, although to me it sounds backwards :-), it's better > to provide a separate command, e.g. pythondoc. (Analog to perldoc.) Okay, I'll add a pythondoc proposal to the PEP. > Yes. Please add that option to the PEP. Done. > > > What does "demand-loaded" mean in a Python context? > > > > When you "touch" the help object, it loads the onlinehelp module which > > has the real implementation. The thing in __builtins__ is just a > > lightweight proxy. > > Please suggest an implementation. In the PEP. > Glad You'Re So Agreeable. :) What happened to your capitalization? elisp gone awry? > ... 
> To Tell You The Truth, I'M Not Holding My Breath Either. :-) So your > code should just dump the doc string on stdout without interpreting it > in any way (except for paging). I'll do this for the first version. > It's buggier than just that. The output of the pager prints an extra > "| " at the start of each page except for the first, and the first > page is a line longer than subsequent pages. For some reason that I now forget, that code is pretty hairy. > BTW, another bug: try help(cgi). It's nice that it gives the default > value for arguments, but the defaults for FieldStorage.__init__ happen > to include os.environ. Its entire value is dumped -- which causes the > pager to be off (it wraps over about 20 lines for me). I think you > may have to truncate long values a bit, e.g. by using the repr module. Okay. There are a lot of little things we need to figure out. Such as whether we should print out docstrings for private methods etc. >... > I don't know specific tools, but any serious docstring processing tool > ends up parsing the source code for this very reason, so there's > probably plenty of prior art. Okay, I'll look into it. Paul From tim.one@home.com Thu Dec 14 07:35:00 2000 From: tim.one@home.com (Tim Peters) Date: Thu, 14 Dec 2000 02:35:00 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A387005.6725DAAE@ActiveState.com> Message-ID: [Paul Prescod] > ... > Along the same lines, might a new rule make Python code more robust? > We could say that a local can only shadow a global if the local is > formally declared. It's pretty rare that there is a good reason to > shadow a global and Python makes it too easy to do accidentally. I've rarely seen problems due to shadowing a global, but have often seen problems due to shadowing a builtin. 
Alas, if this rule were extended to builtins too-- where it would do the most good --then the names of builtins would effectively become reserved words (any code shadowing them today would be broken until declarations were added, and any code working today may break tomorrow if a new builtin were introduced that happened to have the same name as a local). From pf@artcom-gmbh.de Thu Dec 14 07:42:59 2000 From: pf@artcom-gmbh.de (Peter Funk) Date: Thu, 14 Dec 2000 08:42:59 +0100 (MET) Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: <200012132039.MAA07496@slayer.i.sourceforge.net> from Moshe Zadka at "Dec 13, 2000 12:39:24 pm" Message-ID: Hi, I think the following change is incompatible and will break applications. At least I have some server type applications that rely on 'allow_reuse_address' defaulting to 0, because they use the 'address already in use' exception, to make sure, that exactly one server process is running on this port. One of these applications, which is BTW build on top of Fredrik Lundhs 'xmlrpclib' fails to work, if I change this default in SocketServer.py. Would you please explain the reasoning behind this change? Moshe Zadka: > *** SocketServer.py 2000/09/01 03:25:14 1.19 > --- SocketServer.py 2000/12/13 20:39:17 1.20 > *************** > *** 158,162 **** > request_queue_size = 5 > > ! allow_reuse_address = 0 > > def __init__(self, server_address, RequestHandlerClass): > --- 158,162 ---- > request_queue_size = 5 > > ! 
allow_reuse_address = 1 > > def __init__(self, server_address, RequestHandlerClass): Regards, Peter -- Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260 office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen) From paul@prescod.net Thu Dec 14 07:57:30 2000 From: paul@prescod.net (Paul Prescod) Date: Wed, 13 Dec 2000 23:57:30 -0800 Subject: [Python-Dev] new draft of PEP 227 References: Message-ID: <3A387D6A.782E6A3B@prescod.net> Tim Peters wrote: > > ... > > I've rarely seen problems due to shadowing a global, but have often seen > problems due to shadowing a builtin. Really? I think that there are two different issues here. One is consciously choosing to create a new variable but not understanding that there already exists a variable by that name. (i.e. str, list). Another is trying to assign to a global but actually shadowing it. There is no way that anyone coming from another language is going to consider this transcript reasonable: >>> a=5 >>> def show(): ... print a ... >>> def set(val): ... a=val ... >>> a 5 >>> show() 5 >>> set(10) >>> show() 5 It doesn't seem to make any sense. My solution is to make the assignment in "set" illegal unless you add a declaration that says: "No, really. I mean it. Override that sucker." As the PEP points out, overriding is seldom a good idea so the requirement to declare would be rarely invoked. Actually, one could argue that there is no good reason to even *allow* the shadowing of globals. You can always add an underscore to the end of the variable name to disambiguate. > Alas, if this rule were extended to > builtins too-- where it would do the most good --then the names of builtins > would effectively become reserved words (any code shadowing them today would > be broken until declarations were added, and any code working today may > break tomorrow if a new builtin were introduced that happened to have the > same name as a local). 
I have no good solutions to the accidentally-shadowing-builtins problem. But I will say that those sorts of problems are typically less subtle: str = "abcdef" ... str(5) # You'll get a pretty good error message here! The "right answer" in terms of namespace theory is to consistently refer to builtins with a prefix (whether "__builtins__" or "$") but that's pretty unpalatable from an aesthetic point of view. Paul Prescod From tim.one@home.com Thu Dec 14 08:41:19 2000 From: tim.one@home.com (Tim Peters) Date: Thu, 14 Dec 2000 03:41:19 -0500 Subject: [Python-Dev] Online help scope In-Reply-To: <3A3876DF.5554080C@ActiveState.com> Message-ID: [Paul Prescod] > I think Guido and I are pretty far apart on the scope and requirements > of this online help thing so I'd like some clarification and opinions > from the peanut gallery. > > Consider these scenarios > > a) Signature > ... > b) Usage hint > ... > c) Complete documentation, paged(man-style) > ... > d) Complete documentation in a user-chosen hypertext window > ... Guido's style guide has a lot to say about docstrings, suggesting that they were intended to support two scenarios: #a+#b together (the first line of a multi-line docstring), and #c+#d together (the entire docstring). In this respect I think Guido was (consciously or not) aping elisp's conventions, up to but not including the elisp convention for naming the arguments in the first line of a docstring. The elisp conventions were very successful (simple, and useful in practice), so aping them is a good thing. We've had stalemate ever since: there isn't a single style of writing docstrings in practice because no single docstring processor has been blessed, while no docstring processor can gain momentum before being blessed. Every attempt to date has erred by trying to do too much, thus attracting so much complaint that it can't ever become blessed. The current argument over PEP 233 appears to be more of the same. 
The way to break the stalemate is to err on the side of simplicity: just cater to the two obvious (first-line vs whole-string) cases, and for existing docstrings only. HTML vs plain text is fluff. Paging vs non-paging is fluff. Dumping to stdout vs displaying in a browser is fluff. Jumping through hoops for functions and modules whose authors didn't bother to write docstrings is fluff. Etc. People fight over fluff until it fills the air and everyone chokes to death on it <0.9 wink>. Something dirt simple can get blessed, and once *anything* is blessed, a million docstrings will bloom. [Guido] > That'S What Some People Think. I Disagree That It Would Be Either > Feasible Or A Good Idea To Put All Documentation For A Typical Module > In Its Doc Strings. I'm with Paul on this one: that's what module.__doc__ is for, IMO (Javadoc is great, Eiffel's embedded doc tools are great, Perl POD is great, even REBOL's interactive help is great). All Java, Eiffel, Perl and REBOL have in common that Python lacks is *a* blessed system, no matter how crude. [back to Paul] > ... > No matter what we decide on the issue above, reusing the standard > documentation is the only practical way of populating the help system > in the short-term. Right now, today, there is a ton of documentation > that exists only in LaTeX and HTML. Tons of modules have no docstrings. Then write tools to automatically create docstrings from the LaTeX and HTML, but *check in* the results (i.e., add the docstrings so created to the codebase), and keep the help system simple. > Keywords have no docstrings. Neither do integers, but they're obvious too . 
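The dirt-simple processor Tim argues for needs little more than the first-line/whole-string split; a minimal sketch (the helper names here are hypothetical, not taken from any proposed module):

```python
def doc_summary(obj):
    """Return the first line of obj's docstring (the usage-hint case)."""
    doc = obj.__doc__
    if not doc:
        return "(no docstring)"
    # First line of the docstring serves as the short summary.
    return doc.strip().splitlines()[0]

def doc_full(obj):
    """Return the entire docstring (the full-documentation case)."""
    return obj.__doc__ or "(no docstring)"

print(doc_summary(doc_summary))
```

Everything beyond that split (paging, HTML, browsers) is, per the argument above, fluff that can be layered on later.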
From thomas@xs4all.net Thu Dec 14 09:13:49 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 14 Dec 2000 10:13:49 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100 References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> Message-ID: <20001214101348.N4396@xs4all.nl> On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote: > By the way, in woody, there are 52 packages with 'python' in the name, and > 32 with 'perl' in the name... Ah, not true, sorry. I shouldn't have posted off-topic stuff after being awoken by machine-down-alarms ;) That was just what my reasonably-default install had installed. Debian has what looks like most CPAN modules as packages, too, so it's closer to a 110/410 spread (python/perl.) Still, not a bad number :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal@lemburg.com Thu Dec 14 10:32:58 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 14 Dec 2000 11:32:58 +0100 Subject: [Python-Dev] new draft of PEP 227 References: <14904.21739.804346.650062@bitdiddle.concentric.net> Message-ID: <3A38A1DA.7EC49149@lemburg.com> Jeremy Hylton wrote: > > I've got a new draft of PEP 227. The terminology and wording are more > convoluted than they need to be. I'll do at least one revision just > to say things more clearly, but I'd appreciate comments on the > proposed spec if you can read the current draft. 
The PEP doesn't mention the problems I pointed out about breaking the lookup schemes w.r.t. symbols in methods, classes and globals. Please add a comment about this to the PEP + maybe the example I gave in one of the posts to python-dev about it. I consider the problem serious enough to limit the nested scoping to lambda functions (or functions in general) only, if that's possible. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Thu Dec 14 10:55:38 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 14 Dec 2000 11:55:38 +0100 Subject: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule) References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> Message-ID: <3A38A72A.4011B5BD@lemburg.com> Thomas Wouters wrote: > > On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: > > > I don't think that is a very safe bet. Python 2.0 missed the Debian > > > Potato boat. > > > > This may have had to do more with the unresolved GPL issues. > > This is very likely. Debian is very licence -- or at least GPL -- aware. > Which is a pity, really, because I already prefer it over RedHat in all > other cases (and RedHat is also pretty licence aware, just less piously, > devoutly, beyond-practicality-IMHO dedicated to the GPL.) 
About the GPL issue: as I understood Guido's post, RMS still regards the choice of law clause as being incompatible with the GPL (heck, doesn't this guy ever think about international trade terms, the United Nations Convention on International Sale of Goods or local law in one of the 200+ countries where you could deploy GPLed software... is the GPL only meant for US programmers?). I am currently rewriting my open source licenses as well and among other things I chose to integrate a choice of law clause as well. Seeing RMS' view of things, I guess that my license will be regarded as incompatible with the GPL, which is sad even though I'm in good company... e.g. the Apache license, the Zope license, etc. Dual licensing is not possible as it would reopen the loopholes in the GPL I tried to fix in my license. Any idea on how to proceed? Another issue: since Python doesn't link Python scripts, is it still true that if one (pure) Python package is covered by the GPL, then all other packages needed by that application will also fall under the GPL? 
Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein@lyra.org Thu Dec 14 11:57:43 2000 From: gstein@lyra.org (Greg Stein) Date: Thu, 14 Dec 2000 03:57:43 -0800 Subject: (offtopic) Re: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <3A38A72A.4011B5BD@lemburg.com>; from mal@lemburg.com on Thu, Dec 14, 2000 at 11:55:38AM +0100 References: <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> <3A38A72A.4011B5BD@lemburg.com> Message-ID: <20001214035742.Z8951@lyra.org> On Thu, Dec 14, 2000 at 11:55:38AM +0100, M.-A. Lemburg wrote: >... > I am currently rewriting my open source licenses as well and among > other things I chose to integrate a choice of law clause as well. > Seeing RMS' view of things, I guess that my license will be regarded > as incompatible to the GPL which is sad even though I'm in good > company... e.g. the Apache license, the Zope license, etc. Dual > licensing is not possible as it would reopen the loop-wholes in the > GPL I tried to fix in my license. Any idea on how to proceed ? Only RMS is under the belief that the Apache license is incompatible. It is either clause 4 or 5 (I forget which) where we state that certain names (e.g. "Apache") cannot be used in derived products' names and promo materials. RMS views this as an "additional restriction on redistribution", which is apparently not allowed by the GPL. We (the ASF) generally feel he is being a royal pain in the ass with this. 
We've sent him a big, long email asking for clarification / resolution, but haven't heard back (we sent it a month or so ago). Basically, his FUD creates views such as yours ("the Apache license is incompatible with the GPL") because people just take his word for it. We plan to put together a web page to outline our own thoughts and licensing beliefs/philosophy. We're also planning to rev our license to rephrase/alter the particular clause, but for logistic purposes (putting the project name in there ties it to the particular project; we want a generic ASF license that can be applied to all of the projects without a search/replace). At this point, the ASF is taking the position of ignoring him and his controlling attitude(*) and beliefs. There is the outstanding letter to him, but that doesn't really change our point of view. Cheers, -g (*) for a person espousing freedom, it is rather ironic just how much of a control freak he is (stemming from a no-compromise position to guarantee peoples' freedoms, he always wants things done his way) -- Greg Stein, http://www.lyra.org/ From tg@melaten.rwth-aachen.de Thu Dec 14 13:07:12 2000 From: tg@melaten.rwth-aachen.de (Thomas Gellekum) Date: 14 Dec 2000 14:07:12 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 19:26:32 -0500" References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213192632.A30585@kronos.cnri.reston.va.us> Message-ID: Andrew Kuchling writes: > I'm ambivalent about the list_of_panels. It's a linked list storing > (PyWindow, PyPanel) pairs. Probably it should use a dictionary > instead of implementing a little list, just to reduce the amount of > code. I don't like it either, so feel free to shred it. 
As I said, this is the first (piece of an) extension module I've written and I thought it would be easier to implement a little list than to manage a Python list or such in C. > So, I suggest we create _curses_panel.c, which would be available as > curses.panel. (A panel.py module could then add any convenience > functions that are required.) > > Thomas, do you want to work on this, or should I? Just do it. I'll try to add more examples in the meantime. tg From fredrik@pythonware.com Thu Dec 14 13:19:08 2000 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 14 Dec 2000 14:19:08 +0100 Subject: [Python-Dev] fuzzy logic? Message-ID: <015101c065d0$717d1680$0900a8c0@SPIFF> here's a simple (but somewhat strange) test program: def spam(): a = 1 if (0): global a print "global a" a = 2 def egg(): b = 1 if 0: global b print "global b" b = 2 egg() spam() print a print b if I run this under 1.5.2, I get: 2 Traceback (innermost last): File "", line 19, in ? NameError: b From gstein@lyra.org Thu Dec 14 13:42:11 2000 From: gstein@lyra.org (Greg Stein) Date: Thu, 14 Dec 2000 05:42:11 -0800 Subject: [Python-Dev] fuzzy logic? In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>; from fredrik@pythonware.com on Thu, Dec 14, 2000 at 02:19:08PM +0100 References: <015101c065d0$717d1680$0900a8c0@SPIFF> Message-ID: <20001214054210.G8951@lyra.org> I would take a guess that the "if 0:" is optimized away *before* the inspection for a "global" statement. But the compiler doesn't know how to optimize away "if (0):", so the global statement remains. Ah. Just checked. Look at compile.c::com_if_stmt(). There is a call to "is_constant_false()" in there. Heh. Looks like is_constant_false() could be made a bit smarter. But the point is valid: you can make is_constant_false() as smart as you want, and you'll still end up with "funny" global behavior. 
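For what it's worth, the split Greg describes (the `global` statement is collected before dead-code elimination ever runs) is still observable in modern CPython, where the symbol-table pass sees the whole function body; a small sketch of that behavior (checked against current Python 3 only, not 1.5.2):

```python
a = 1

def spam():
    # The symbol-table pass records `global a` even though this branch
    # is dead code that the compiler may optimize away entirely...
    if 0:
        global a
    # ...so this assignment targets the module-level `a`, not a local.
    a = 2

spam()
print(a)  # -> 2
```

The same mechanism is why `if 0: yield` still turns a function into a generator: declarations are gathered from the source text, not from the code that survives optimization.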
Cheers, -g On Thu, Dec 14, 2000 at 02:19:08PM +0100, Fredrik Lundh wrote: > here's a simple (but somewhat strange) test program: > > def spam(): > a = 1 > if (0): > global a > print "global a" > a = 2 > > def egg(): > b = 1 > if 0: > global b > print "global b" > b = 2 > > egg() > spam() > > print a > print b > > if I run this under 1.5.2, I get: > > 2 > Traceback (innermost last): > File "", line 19, in ? > NameError: b > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://www.python.org/mailman/listinfo/python-dev -- Greg Stein, http://www.lyra.org/ From mwh21@cam.ac.uk Thu Dec 14 13:58:24 2000 From: mwh21@cam.ac.uk (Michael Hudson) Date: 14 Dec 2000 13:58:24 +0000 Subject: [Python-Dev] fuzzy logic? In-Reply-To: "Fredrik Lundh"'s message of "Thu, 14 Dec 2000 14:19:08 +0100" References: <015101c065d0$717d1680$0900a8c0@SPIFF> Message-ID: 1) Is there anything in the standard library that does the equivalent of import symbol,token def decode_ast(ast): if token.ISTERMINAL(ast[0]): return (token.tok_name[ast[0]], ast[1]) else: return (symbol.sym_name[ast[0]],)+tuple(map(decode_ast,ast[1:])) so that, eg: >>> pprint.pprint(decode.decode_ast(parser.expr("0").totuple())) ('eval_input', ('testlist', ('test', ('and_test', ('not_test', ('comparison', ('expr', ('xor_expr', ('and_expr', ('shift_expr', ('arith_expr', ('term', ('factor', ('power', ('atom', ('NUMBER', '0'))))))))))))))), ('NEWLINE', ''), ('ENDMARKER', '')) ? Should there be? (Especially if it was a bit better written). ... and Greg's just said everything else I wanted to! Cheers, M. -- please realize that the Common Lisp community is more than 40 years old. collectively, the community has already been where every clueless newbie will be going for the next three years. so relax, please. 
-- Erik Naggum, comp.lang.lisp From guido@python.org Thu Dec 14 14:51:26 2000 From: guido@python.org (Guido van Rossum) Date: Thu, 14 Dec 2000 09:51:26 -0500 Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: Your message of "Thu, 14 Dec 2000 08:42:59 +0100." References: Message-ID: <200012141451.JAA15637@cj20424-a.reston1.va.home.com> > I think the following change is incompatible and will break applications. > > At least I have some server type applications that rely on > 'allow_reuse_address' defaulting to 0, because they use > the 'address already in use' exception, to make sure, that exactly one > server process is running on this port. One of these applications, > which is BTW build on top of Fredrik Lundhs 'xmlrpclib' fails to work, > if I change this default in SocketServer.py. > > Would you please explain the reasoning behind this change? The reason for the patch is that without this, if you kill a TCP server and restart it right away, you'll get a 'port in use" error -- TCP has some kind of strange wait period after a connection is closed before it can be reused. The patch avoids this error. As far as I know, with TCP, code using SO_REUSEADDR still cannot bind to the port when another process is already using it, but for UDP, the semantics may be different. Is your server using UDP? Try this patch if your problem is indeed related to UDP: *** SocketServer.py 2000/12/13 20:39:17 1.20 --- SocketServer.py 2000/12/14 14:48:16 *************** *** 268,273 **** --- 268,275 ---- """UDP server class.""" + allow_reuse_address = 0 + socket_type = socket.SOCK_DGRAM max_packet_size = 8192 If this works for you, I'll check it in, of course. 
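For reference, the flag under discussion controls a single socket option: when `allow_reuse_address` is true, `TCPServer.server_bind()` sets SO_REUSEADDR before calling `bind()`, which lets a restarted server rebind an address still sitting in TCP's post-close wait state (TIME_WAIT). Stripped of the SocketServer machinery, it amounts to this sketch (port 0 is used so the OS picks a throwaway port):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# This is the call server_bind() makes when allow_reuse_address is set:
# permit rebinding an address whose previous owner is still in TIME_WAIT.
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
reuse = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
s.close()
print(reuse)  # nonzero once the option is set
```

Note that, as Guido says, SO_REUSEADDR does not let a TCP socket bind to a port another process is actively listening on; it only short-circuits the TIME_WAIT delay.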
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy@alum.mit.edu Thu Dec 14 14:52:37 2000 From: jeremy@alum.mit.edu (Jeremy Hylton) Date: Thu, 14 Dec 2000 09:52:37 -0500 (EST) Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A38A1DA.7EC49149@lemburg.com> References: <14904.21739.804346.650062@bitdiddle.concentric.net> <3A38A1DA.7EC49149@lemburg.com> Message-ID: <14904.57013.371474.691948@bitdiddle.concentric.net> >>>>> "MAL" == M -A Lemburg writes: MAL> Jeremy Hylton wrote: >> >> I've got a new draft of PEP 227. The terminology and wording are >> more convoluted than they need to be. I'll do at least one >> revision just to say things more clearly, but I'd appreciate >> comments on the proposed spec if you can read the current draft. MAL> The PEP doesn't mention the problems I pointed out about MAL> breaking the lookup schemes w/r to symbols in methods, classes MAL> and globals. I believe it does. There was some discussion on python-dev and with others in private email about how classes should be handled. The relevant section of the specification is: If a name is used within a code block, but it is not bound there and is not declared global, the use is treated as a reference to the nearest enclosing function region. (Note: If a region is contained within a class definition, the name bindings that occur in the class block are not visible to enclosed functions.) MAL> Please add a comment about this to the PEP + maybe the example MAL> I gave in one the posts to python-dev about it. I consider the MAL> problem serious enough to limit the nested scoping to lambda MAL> functions (or functions in general) only if that's possible. If there was some other concern you had, then I don't know what it was. I recall that you had a longish example that raised a NameError immediately :-). Jeremy From mal@lemburg.com Thu Dec 14 15:02:33 2000 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 14 Dec 2000 16:02:33 +0100 Subject: [Python-Dev] new draft of PEP 227 References: <14904.21739.804346.650062@bitdiddle.concentric.net> <3A38A1DA.7EC49149@lemburg.com> <14904.57013.371474.691948@bitdiddle.concentric.net> Message-ID: <3A38E109.54C07565@lemburg.com> Jeremy Hylton wrote: > > >>>>> "MAL" == M -A Lemburg writes: > > MAL> Jeremy Hylton wrote: > >> > >> I've got a new draft of PEP 227. The terminology and wording are > >> more convoluted than they need to be. I'll do at least one > >> revision just to say things more clearly, but I'd appreciate > >> comments on the proposed spec if you can read the current draft. > > MAL> The PEP doesn't mention the problems I pointed out about > MAL> breaking the lookup schemes w/r to symbols in methods, classes > MAL> and globals. > > I believe it does. There was some discussion on python-dev and > with others in private email about how classes should be handled. > > The relevant section of the specification is: > > If a name is used within a code block, but it is not bound there > and is not declared global, the use is treated as a reference to > the nearest enclosing function region. (Note: If a region is > contained within a class definition, the name bindings that occur > in the class block are not visible to enclosed functions.) Well hidden ;-) Honestly, I think that you should either make this specific case more visible to readers of the PEP since this single detail would produce most of the problems with nested scopes. BTW, what about nested classes ? AFAIR, the PEP only talks about nested functions. > MAL> Please add a comment about this to the PEP + maybe the example > MAL> I gave in one the posts to python-dev about it. I consider the > MAL> problem serious enough to limit the nested scoping to lambda > MAL> functions (or functions in general) only if that's possible. > > If there was some other concern you had, then I don't know what it > was. 
I recall that you had a longish example that raised a NameError > immediately :-). The idea behind the example should have been clear, though. x = 1 class C: x = 2 def test(self): print x -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fdrake@acm.org Thu Dec 14 15:09:57 2000 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 14 Dec 2000 10:09:57 -0500 (EST) Subject: [Python-Dev] fuzzy logic? In-Reply-To: References: <015101c065d0$717d1680$0900a8c0@SPIFF> Message-ID: <14904.58053.282537.260186@cj42289-a.reston1.va.home.com> Michael Hudson writes: > 1) Is there anything is the standard library that does the equivalent > of No, but I have a chunk of code that does in a different way. Where in the library do you think it belongs? The compiler package sounds like the best place, but that's not installed by default. (Jeremy, is that likely to change soon?) -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From mwh21@cam.ac.uk Thu Dec 14 15:47:33 2000 From: mwh21@cam.ac.uk (Michael Hudson) Date: 14 Dec 2000 15:47:33 +0000 Subject: [Python-Dev] fuzzy logic? In-Reply-To: "Fred L. Drake, Jr."'s message of "Thu, 14 Dec 2000 10:09:57 -0500 (EST)" References: <015101c065d0$717d1680$0900a8c0@SPIFF> <14904.58053.282537.260186@cj42289-a.reston1.va.home.com> Message-ID: "Fred L. Drake, Jr." writes: > Michael Hudson writes: > > 1) Is there anything is the standard library that does the equivalent > > of > > No, but I have a chunk of code that does in a different way. I'm guessing everyone who's played with the parser much does, hence the suggestion. I agree my implementation is probably not optimal - I just threw it together as quickly as I could! > Where in the library do you think it belongs? The compiler package > sounds like the best place, but that's not installed by default. 
> (Jeremy, is that likely to change soon?) Actually, I'd have thought the parser module would be most natural, but that would probably mean doing the _module.c trick, and it's probably not worth the bother. OTOH, it seems that wrapping any given extension module in a python module is becoming if anything the norm, so maybe it is. Cheers, M. -- I don't remember any dirty green trousers. -- Ian Jackson, ucam.chat From nowonder@nowonder.de Thu Dec 14 15:50:10 2000 From: nowonder@nowonder.de (Peter Schneider-Kamp) Date: Thu, 14 Dec 2000 16:50:10 +0100 Subject: [Python-Dev] [PEP-212] new draft Message-ID: <3A38EC32.210BD1A2@nowonder.de> In an attempt to revive PEP 212 - Loop counter iteration I have updated the draft. The HTML version can be found at: http://python.sourceforge.net/peps/pep-0212.html I will appreciate any form of comments and/or criticisms. Peter P.S.: Now I have posted it - should I update the Post-History? Or is that for posts to c.l.py? From pf@artcom-gmbh.de Thu Dec 14 15:56:08 2000 From: pf@artcom-gmbh.de (Peter Funk) Date: Thu, 14 Dec 2000 16:56:08 +0100 (MET) Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: <200012141451.JAA15637@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 14, 2000 9:51:26 am" Message-ID: Hi, Moshes checkin indeed makes a lot of sense. Sorry for the irritation. Guido van Rossum: > The reason for the patch is that without this, if you kill a TCP server > and restart it right away, you'll get a 'port in use" error -- TCP has > some kind of strange wait period after a connection is closed before > it can be reused. The patch avoids this error. > > As far as I know, with TCP, code using SO_REUSEADDR still cannot bind > to the port when another process is already using it, but for UDP, the > semantics may be different. > > Is your server using UDP? 
No, and I must admit that I didn't test carefully enough: From a quick look at my process listing I assumed there were indeed two server processes running concurrently, which would have broken the needed mutual exclusion. But the second process went into a sleep-and-retry-to-connect loop which I simply forgot about. This loop was initially built into my server to wait until the "strange wait period" you mentioned above was over or a certain number of retries had been exceeded. I guess I can take this ugly work-around out with Python 2.0 and newer, since the BaseHTTPServer.py shipped with Python 2.0 already contained allow_reuse_address = 1 default in the HTTPServer class. BTW: I took my old W. Richard Stevens Unix Network Programming from the shelf. After rereading the rather terse paragraph about SO_REUSEADDR I guess the wait period is necessary to make sure that there is no connect pending from an outside client on this TCP port. I can't find anything about UDP and REUSE. Regards, Peter From guido@python.org Thu Dec 14 16:17:27 2000 From: guido@python.org (Guido van Rossum) Date: Thu, 14 Dec 2000 11:17:27 -0500 Subject: [Python-Dev] Online help scope In-Reply-To: Your message of "Wed, 13 Dec 2000 23:29:35 PST." <3A3876DF.5554080C@ActiveState.com> References: <3A3876DF.5554080C@ActiveState.com> Message-ID: <200012141617.LAA16179@cj20424-a.reston1.va.home.com> > I think Guido and I are pretty far apart on the scope and requirements > of this online help thing so I'd like some clarification and opinions > from the peanut gallery. I started replying but I think Tim's said it all. Let's do something dead simple. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@digicool.com Thu Dec 14 17:14:01 2000 From: barry@digicool.com (Barry A. 
Warsaw) Date: Thu, 14 Dec 2000 12:14:01 -0500 Subject: [Python-Dev] [PEP-212] new draft References: <3A38EC32.210BD1A2@nowonder.de> Message-ID: <14904.65497.940293.975775@anthem.concentric.net> >>>>> "PS" == Peter Schneider-Kamp writes: PS> P.S.: Now I have posted it - should I update the Post-History? PS> Or is that for posts to c.l.py? Originally, I'd thought of it as tracking the posting history to c.l.py. I'm not sure how useful that header is after all -- maybe in just giving a start into the python-list archives... -Barry From tim.one@home.com Thu Dec 14 17:33:41 2000 From: tim.one@home.com (Tim Peters) Date: Thu, 14 Dec 2000 12:33:41 -0500 Subject: [Python-Dev] fuzzy logic? In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF> Message-ID: Note that the behavior of both functions is undefined ("Names listed in a global statement must not be used in the same code block textually preceding that global statement", from the Lang Ref, and "if" does not introduce a new code block in Python's terminology). But you'll get the same outcome via these trivial variants, which sidestep that problem: def spam(): if (0): global a print "global a" a = 2 def egg(): if 0: global b print "global b" b = 2 *Now* you can complain . > -----Original Message----- > From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On > Behalf Of Fredrik Lundh > Sent: Thursday, December 14, 2000 8:19 AM > To: python-dev@python.org > Subject: [Python-Dev] fuzzy logic? > > > here's a simple (but somewhat strange) test program: > > def spam(): > a = 1 > if (0): > global a > print "global a" > a = 2 > > def egg(): > b = 1 > if 0: > global b > print "global b" > b = 2 > > egg() > spam() > > print a > print b > > if I run this under 1.5.2, I get: > > 2 > Traceback (innermost last): > File "", line 19, in ? 
> NameError: b > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://www.python.org/mailman/listinfo/python-dev From tim.one@home.com Thu Dec 14 18:46:09 2000 From: tim.one@home.com (Tim Peters) Date: Thu, 14 Dec 2000 13:46:09 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule) In-Reply-To: <3A38A72A.4011B5BD@lemburg.com> Message-ID: [MAL] > About the GPL issue: as I understood Guido's post, RMS still regards > the choice of law clause as being incompatible to the GPL Yes. Actually, I don't know what RMS really thinks -- his public opinions on legal issues appear to be echoes of what Eben Moglen tells him. Like his views or not, Moglen is a tenured law professor > (heck, doesn't this guy ever think about international trade terms, > the United Nations Convention on International Sale of Goods > or local law in one of the 200+ countries where you could deploy > GPLed software... Yes. > is the GPL only meant for US programmers ?). No. Indeed, that's why the GPL is grounded in copyright law, because copyright law is the most uniform (across countries) body of law we've got. Most commentary I've seen suggests that the GPL has its *weakest* legal legs in the US! > I am currently rewriting my open source licenses as well and among > other things I chose to integrate a choice of law clause as well. > Seeing RMS' view of things, I guess that my license will be regarded > as incompatible to the GPL Yes. > which is sad even though I'm in good company... e.g. the Apache > license, the Zope license, etc. Dual licensing is not possible as > it would reopen the loop-wholes in the GPL I tried to fix in my > license. Any idea on how to proceed ? 
You can wait to see how the CNRI license turns out, then copy it if it's successful; you can approach the FSF directly; you can stop trying to do it yourself and reuse some license that's already been blessed by the FSF; or you can give up on GPL compatibility (according to the FSF). I don't see any other choices. > Another issue: since Python doesn't link Python scripts, is it > still true that if one (pure) Python package is covered by the GPL, > then all other packages needed by that application will also fall > under GPL ? Sorry, couldn't make sense of the question. Just as well, since you should ask about it on a GNU forum anyway . From mal@lemburg.com Thu Dec 14 20:02:05 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 14 Dec 2000 21:02:05 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: Message-ID: <3A39273D.4AE24920@lemburg.com> Tim Peters wrote: > > [MAL] > > About the GPL issue: as I understood Guido's post, RMS still regards > > the choice of law clause as being incompatible to the GPL > > Yes. Actually, I don't know what RMS really thinks -- his public opinions > on legal issues appear to be echoes of what Eben Moglen tells him. Like his > views or not, Moglen is a tenured law professor But it's his piece of work, isn't it? He's the one who can change it. > > (heck, doesn't this guy ever think about international trade terms, > > the United Nations Convention on International Sale of Goods > > or local law in one of the 200+ countries where you could deploy > > GPLed software... > > Yes. Strange, then how come he sees the choice of law clause as a problem: without explicitly ruling out the applicability of the UN CISG, this clause is waived by it anyway... at least according to a specialist on software law here in Germany. > > is the GPL only meant for US programmers ?). > > No. 
Indeed, that's why the GPL is grounded in copyright law, because > copyright law is the most uniform (across countries) body of law we've got. > Most commentary I've seen suggests that the GPL has its *weakest* legal legs > in the US! Huh ? Just an example: in Germany customer rights assure a 6 month warranty on everything you buy or obtain in some other way. Liability is another issue: there are some very unpleasant laws which render most of the "no liability" paragraphs in licenses useless in Germany. Even better: since the license itself is written in English a German party could simply consider the license non-binding, since he or she hasn't agreed to accept contract in foreign languages. France has similar interpretations. > > I am currently rewriting my open source licenses as well and among > > other things I chose to integrate a choice of law clause as well. > > Seeing RMS' view of things, I guess that my license will be regarded > > as incompatible to the GPL > > Yes. > > > which is sad even though I'm in good company... e.g. the Apache > > license, the Zope license, etc. Dual licensing is not possible as > > it would reopen the loop-wholes in the GPL I tried to fix in my > > license. Any idea on how to proceed ? > > You can wait to see how the CNRI license turns out, then copy it if it's > successful; you can approach the FSF directly; you can stop trying to do it > yourself and reuse some license that's already been blessed by the FSF; or > you can give up on GPL compatibility (according to the FSF). I don't see > any other choices. I guess I'll go with the latter. > > Another issue: since Python doesn't link Python scripts, is it > > still true that if one (pure) Python package is covered by the GPL, > > then all other packages needed by that application will also fall > > under GPL ? > > Sorry, couldn't make sense of the question. Just as well, since you should > ask about it on a GNU forum anyway . 
Isn't this question (whether the GPL virus applies to byte-code as well) important to Python programmers as well ? Oh well, nevermind... it's still nice to hear that CNRI and RMS have finally made up their minds to render Python GPL-compatible -- whatever this means ;-)

--
Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From cgw@fnal.gov Thu Dec 14 21:06:43 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 15:06:43 -0600 (CST)
Subject: [Python-Dev] memory leaks
Message-ID: <14905.13923.659879.100243@buffalo.fnal.gov>

The following code (extracted from test_extcall.py) leaks memory:

    class Foo:
        def method(self, arg1, arg2):
            return arg1 + arg2

    def f():
        err = None
        try:
            Foo.method(*(1, 2, 3))
        except TypeError, err:
            pass
        del err

One-line fix (also posted to Sourceforge):

    --- Python/ceval.c  2000/10/30 17:15:19  2.213
    +++ Python/ceval.c  2000/12/14 20:54:02
    @@ -1905,8 +1905,7 @@
                     class))) {
                    PyErr_SetString(PyExc_TypeError,
         "unbound method must be called with instance as first argument");
    -               x = NULL;
    -               break;
    +               goto extcall_fail;
                }
            }
        }

I think that there are a bunch more memory leaks lurking around... this only fixes one of them. I'll send more info as I find out what's going on.

From tim.one@home.com Thu Dec 14 21:28:09 2000
From: tim.one@home.com (Tim Peters)
Date: Thu, 14 Dec 2000 16:28:09 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A39273D.4AE24920@lemburg.com>
Message-ID: 

I'm not going to argue about the GPL. Take it up with the FSF! I will say that if you do get the FSF's attention, Moglen will have an instant counter to any objection you're likely to raise -- he's been thinking about this for 10 years, and he's heard it all. And in our experience, RMS won't commit to anything before running it past Moglen.

[MAL]
> But it's his [RMS's] piece of work, isn't it ?
He's the one who can > change it. Akin to saying Python is Guido's piece of work. Yes, no, kinda, more true at some times than others, ditto respects. RMS has consistently said that any changes for the next version of the GPL will take at least a year, due to extensive legal review required first. Would be more clearly true to say that the first version of the GPL was RMS's alone -- but version 2 came out in 1991. > ... > Strange, then how come he sees the choice of law clause as a problem: > without explicitely ruling out the applicability of the UN CISC, > this clause is waived by it anyway... at least according to a > specialist on software law here in Germany. > ... [and other "who knows?" objections] ... Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re: [License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who dismissed it almost offhandedly as "layman's commentary". You'll have to ask him why: MAL, we're not lawyers. We're incompetent to have this discussion -- or at least I am, and Moglen thinks you are too . >>> Another issue: since Python doesn't link Python scripts, is it >>> still true that if one (pure) Python package is covered by the GPL, >>> then all other packages needed by that application will also fall >>> under GPL ? [Tim] >> Sorry, couldn't make sense of the question. Just as well, >> since you should ask about it on a GNU forum anyway . [MAL] > Isn't this question (whether the GPL virus applies to byte-code > as well) important to Python programmers as well ? I don't know -- like I said, I couldn't make sense of the question, i.e. I couldn't figure out what it is you're asking. I *suspect* it's based on a misunderstanding of the GPL; for example, gcc is a GPL'ed application that requires stuff from the OS in order to do its job of compiling, but that doesn't mean that every OS it runs on falls under the GPL. 
The GPL contains no restrictions on *use*, it restricts only copying, modifying and distributing (the specific rights granted by copyright law). I don't see any way to read the GPL as restricting your ability to distribute a GPL'ed program P on its own, no matter what the status of the packages that P may rely upon for operation. The GPL is also not viral in the sense that it cannot infect an unwitting victim. Nothing whatsoever you do or don't do can make *any* other program Q "fall under" the GPL -- only Q's owner can set the license for Q. The GPL purportedly can prevent you from distributing (but not from using) a program that links with a GPL'ed program, but that doesn't appear to be what you're asking about. Or is it? If you were to put, say, mxDateTime, under the GPL, then yes, I believe the FSF would claim I could not distribute my program T that uses mxDateTime unless T were also under the GPL or a GPL-compatible license. But if mxDateTime is not under the GPL, then nothing I do with T can magically change the mxDateTime license to the GPL (although if your mxDateTime license allows me to redistribute mxDateTime under a different license, then it allows me to ship a copy of mxDateTime under the GPL). That said, the whole theory of GPL linking is muddy to me, especially since the word "link" (and its variants) doesn't appear in the GPL. > Oh well, nevermind... it's still nice to hear that CNRI and RMS > have finally made up their minds to render Python GPL-compatible -- > whatever this means ;-) I'm not sure it means anything yet. CNRI and the FSF believed they reached agreement before, but that didn't last after Moglen and Kahn each figured out what the other was really suggesting. From mal@lemburg.com Thu Dec 14 22:25:31 2000 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Thu, 14 Dec 2000 23:25:31 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: Message-ID: <3A3948DB.9165E404@lemburg.com> Tim Peters wrote: > > I'm not going to argue about the GPL. Take it up with the FSF! Sorry, I got a bit carried away -- I don't want to take it up with the FSF, simply because I couldn't care less. What's bugging me is that this one guy is splitting the OSS world in two even though both halfs actually want the same thing: software which you can use for free with full source code. I find that a very poor situation. > I will say > that if you do get the FSF's attention, Moglen will have an instant counter > to any objection you're likely to raise -- he's been thinking about this for > 10 years, and he's heard it all. And in our experience, RMS won't commit to > anything before running it past Moglen. > > [MAL] > > But it's his [RMS's] piece of work, isn't it ? He's the one who can > > change it. > > Akin to saying Python is Guido's piece of work. Yes, no, kinda, more true > at some times than others, ditto respects. RMS has consistently said that > any changes for the next version of the GPL will take at least a year, due > to extensive legal review required first. Would be more clearly true to say > that the first version of the GPL was RMS's alone -- but version 2 came out > in 1991. Point taken. > > ... > > Strange, then how come he sees the choice of law clause as a problem: > > without explicitely ruling out the applicability of the UN CISC, > > this clause is waived by it anyway... at least according to a > > specialist on software law here in Germany. > > ... [and other "who knows?" objections] ... > > Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re: > [License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who > dismissed it almost offhandedly as "layman's commentary". You'll have to > ask him why: MAL, we're not lawyers. 
We're incompetent to have this > discussion -- or at least I am, and Moglen thinks you are too . I'm not a lawyer either, but I am able to apply common sense and know about German trade laws. Anyway, here a reference which covers all the controversial subjects. It's in German, but these guys qualify as lawyers ;-) ... http://www.ifross.de/ifross_html/index.html There's also a book on the subject in German which covers all aspects of software licensing. Here's the reference in case anyone cares: Jochen Marly, Softwareüberlassungsverträge C.H. Beck, München, 2000 > >>> Another issue: since Python doesn't link Python scripts, is it > >>> still true that if one (pure) Python package is covered by the GPL, > >>> then all other packages needed by that application will also fall > >>> under GPL ? > > [Tim] > >> Sorry, couldn't make sense of the question. Just as well, > >> since you should ask about it on a GNU forum anyway . > > [MAL] > > Isn't this question (whether the GPL virus applies to byte-code > > as well) important to Python programmers as well ? > > I don't know -- like I said, I couldn't make sense of the question, i.e. I > couldn't figure out what it is you're asking. I *suspect* it's based on a > misunderstanding of the GPL; for example, gcc is a GPL'ed application that > requires stuff from the OS in order to do its job of compiling, but that > doesn't mean that every OS it runs on falls under the GPL. The GPL contains > no restrictions on *use*, it restricts only copying, modifying and > distributing (the specific rights granted by copyright law). I don't see > any way to read the GPL as restricting your ability to distribute a GPL'ed > program P on its own, no matter what the status of the packages that P may > rely upon for operation. This is very controversial: if an application Q needs a GPLed library P to work, then P and Q form a new whole in the sense of the GPL. And this even though P wasn't even distributed together with Q. 
Don't ask me why, but that's how RMS and folks look at it. It can be argued that the dynamic linker actually integrates P into Q, but is the same argument valid for a Python program Q which relies on a GPLed package P ? (The relationship between Q and P is one of providing interfaces -- there is no call address patching required for the setup to work.) > The GPL is also not viral in the sense that it cannot infect an unwitting > victim. Nothing whatsoever you do or don't do can make *any* other program > Q "fall under" the GPL -- only Q's owner can set the license for Q. The GPL > purportedly can prevent you from distributing (but not from using) a program > that links with a GPL'ed program, but that doesn't appear to be what you're > asking about. Or is it? No. What's viral about the GPL is that you can turn an application into a GPLed one by merely linking the two together -- that's why e.g. the libc is distributed under the LGPL which doesn't have this viral property. > If you were to put, say, mxDateTime, under the GPL, then yes, I believe the > FSF would claim I could not distribute my program T that uses mxDateTime > unless T were also under the GPL or a GPL-compatible license. But if > mxDateTime is not under the GPL, then nothing I do with T can magically > change the mxDateTime license to the GPL (although if your mxDateTime > license allows me to redistribute mxDateTime under a different license, then > it allows me to ship a copy of mxDateTime under the GPL). > > That said, the whole theory of GPL linking is muddy to me, especially since > the word "link" (and its variants) doesn't appear in the GPL. True. > > Oh well, nevermind... it's still nice to hear that CNRI and RMS > > have finally made up their minds to render Python GPL-compatible -- > > whatever this means ;-) > > I'm not sure it means anything yet. 
CNRI and the FSF believed they reached > agreement before, but that didn't last after Moglen and Kahn each figured > out what the other was really suggesting. Oh boy... -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From greg@cosc.canterbury.ac.nz Thu Dec 14 23:19:09 2000 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 15 Dec 2000 12:19:09 +1300 (NZDT) Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <3A3948DB.9165E404@lemburg.com> Message-ID: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> "M.-A. Lemburg" : > if an application Q needs a GPLed > library P to work, then P and Q form a new whole in the sense of > the GPL. I don't see how Q can *need* any particular library P to work. The most it can need is some library with an API which is compatible with P's. So I don't buy that argument. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg@cosc.canterbury.ac.nz Thu Dec 14 23:58:24 2000 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 15 Dec 2000 12:58:24 +1300 (NZDT) Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A387005.6725DAAE@ActiveState.com> Message-ID: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> Paul Prescod : > We could say that a local can only shadow a global > if the local is formally declared. How do you intend to enforce that? Seems like it would require a test on every assignment to a local, to make sure nobody has snuck in a new global since the function was compiled. > Actually, one could argue that there is no good reason to > even *allow* the shadowing of globals. 
If shadowing were completely disallowed, it would make it impossible to write a completely self-contained function whose source could be moved from one environment to another without danger of it breaking. I wouldn't like the language to have a characteristic like that.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg@cosc.canterbury.ac.nz Fri Dec 15 00:06:12 2000
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 13:06:12 +1300 (NZDT)
Subject: [Python-Dev] Online help scope
In-Reply-To: 
Message-ID: <200012150006.NAA02154@s454.cosc.canterbury.ac.nz>

Tim Peters :
> [Paul Prescod]
> > Keywords have no docstrings.
> Neither do integers, but they're obvious too .

Oh, I don't know, it could be useful.

>>> help(2)
The first prime number.

>>> help(2147483647)
sys.maxint, the largest Python small integer.

>>> help(42)
The answer to the ultimate question of life, the universe and everything.
See also: ultimate_question.

>>> help("ultimate_question")
[Importing research.mice.earth]
[Calling earth.find_ultimate_question]
This may take about 10 million years, please be patient...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From barry@digicool.com Fri Dec 15 00:33:16 2000 From: barry@digicool.com (Barry A.
Warsaw) Date: Thu, 14 Dec 2000 19:33:16 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <3A3948DB.9165E404@lemburg.com> <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> Message-ID: <14905.26316.407495.981198@anthem.concentric.net> >>>>> "GE" == Greg Ewing writes: GE> I don't see how Q can *need* any particular library P to GE> work. The most it can need is some library with an API which GE> is compatible with P's. So I don't buy that argument. It's been my understanding that the FSF's position on this is as follows. If the only functional implementation of the API is GPL'd software then simply writing your code against that API is tantamount to linking with that software. Their reasoning is that the clear intent of the programmer (shut up, Chad) is to combine the program with GPL code. As soon as there is a second, non-GPL implementation of the API, you're fine because while you may not distribute your program with the GPL'd software linked in, those who receive your software wouldn't be forced to combine GPL and non-GPL code. -Barry From tim.one@home.com Fri Dec 15 03:01:36 2000 From: tim.one@home.com (Tim Peters) Date: Thu, 14 Dec 2000 22:01:36 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <3A3948DB.9165E404@lemburg.com> Message-ID: [MAL] > Sorry, I got a bit carried away -- I don't want to take it up > with the FSF, simply because I couldn't care less. Well, nobody else is able to Pronounce on what the FSF believes or will do. Which tells me that you're not really interested in playing along with the FSF here after all -- which we both knew from the start anyway . > What's bugging me is that this one guy is splitting the OSS world There are many people on the FSF bandwagon. I'm not one of them, but I can count. > in two even though both halfs actually want the same thing: software > which you can use for free with full source code. I find that a very > poor situation. 
RMS would not agree that both halves want the same thing; to the contrary, he's openly contemptuous of the Open Source movement -- which you also knew from the start. > [stuff about German law I won't touch with 12-foot schnitzel] OTOH, a German FSF advocate assured me: I also tend to forget that the system of the law works different in the US as in Germany. In Germany something that most people will believe (called "common grounds") play a role in the court. So if you knew, because it is widely known what the GPL means, than it is harder to attack that in court. In the US, when something gets to court it doesn't matter at all what people believed about it. Heck, we'll let mass murderers go free if a comma was in the wrong place in a 1592 statute, or send a kid to jail for life for using crack cocaine instead of the flavor favored by stockbrokers . I hope the US is unique in that respect, but it does makes the GPL weaker here because even if *everyone* in our country believed the GPL means what RMS says it means, a US court would give that no weight in its logic-chopping. >>> Another issue: since Python doesn't link Python scripts, is it >>> still true that if one (pure) Python package is covered by the GPL, >>> then all other packages needed by that application will also fall >>> under GPL ? > This is very controversial: if an application Q needs a GPLed > library P to work, then P and Q form a new whole in the sense of > the GPL. And this even though P wasn't even distributed together > with Q. Don't ask me why, but that's how RMS and folks look at it. Understood, but have you reread your question above, which I've said twice I can't make sense of? That's not what you were asking about. Your question above asks, if anything, the opposite: the *application* Q is GPL'ed, and the question above asks whether that means the *Ps* it depends on must also be GPL'ed. 
To the best of my ability, I've answered "NO" to that one, and "YES" to the question it appears you meant to ask. > It can be argued that the dynamic linker actually integrates > P into Q, but is the same argument valid for a Python program Q > which relies on a GPLed package P ? (The relationship between > Q and P is one of providing interfaces -- there is no call address > patching required for the setup to work.) As before, I believe the FSF will say YES. Unless there's also a non-GPL'ed implementation of the same interface that people could use just as well. See my extended mxDateTime example too. > ... > No. What's viral about the GPL is that you can turn an application > into a GPLed one by merely linking the two together No, you cannot. You can link them together all day without any hassle. What you cannot do is *distribute* it unless the aggregate is first placed under the GPL (or a GPL-compatible license) too. If you distribute it without taking that step, that doesn't turn it into a GPL'ed application either -- in that case you've simply (& supposedly) violated the license on P, so your distribution was simply (& supposedly) illegal. And that is in fact the end result that people who knowingly use the GPL want (granting that it appears most people who use the GPL do so unknowing of its consequences). > -- that's why e.g. the libc is distributed under the LGPL which > doesn't have this viral property. You should read RMS on why glibc is under the LGPL: http://www.fsf.org/philosophy/why-not-lgpl.html It will at least disabuse you of the notion that RMS and you are after the same thing . 
From paulp@ActiveState.com Fri Dec 15 04:02:08 2000 From: paulp@ActiveState.com (Paul Prescod) Date: Thu, 14 Dec 2000 20:02:08 -0800 Subject: [Python-Dev] new draft of PEP 227 References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> Message-ID: <3A3997C0.F977AF51@ActiveState.com> Greg Ewing wrote: > > Paul Prescod : > > > We could say that a local can only shadow a global > > if the local is formally declared. > > How do you intend to enforce that? Seems like it would > require a test on every assignment to a local, to make > sure nobody has snuck in a new global since the function > was compiled. I would expect that all of the checks would be at compile-time. Except for __dict__ hackery, I think it is doable. Python already keeps track of all assignments to locals and all assignments to globals in a function scope. The only addition is keeping track of assignments at a global scope. > > Actually, one could argue that there is no good reason to > > even *allow* the shadowing of globals. > > If shadowing were completely disallowed, it would make it > impossible to write a completely self-contained function > whose source could be moved from one environment to another > without danger of it breaking. I wouldn't like the language > to have a characteristic like that. That seems like a very esoteric requirement. How often do you have functions that do not rely *at all* on their environment (other functions, import statements, global variables). When you move code you have to do some rewriting or customizing of the environment in 94% of the cases. How much effort do you want to spend on the other 6%? Also, there are tools that are designed to help you move code without breaking programs (refactoring editors). They can just as easily handle renaming local variables as adding import statements and fixing up function calls. Paul Prescod From mal@lemburg.com Fri Dec 15 10:05:59 2000 From: mal@lemburg.com (M.-A. 
Lemburg) Date: Fri, 15 Dec 2000 11:05:59 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> Message-ID: <3A39ED07.6B3EE68E@lemburg.com> Greg Ewing wrote: > > "M.-A. Lemburg" : > > if an application Q needs a GPLed > > library P to work, then P and Q form a new whole in the sense of > > the GPL. > > I don't see how Q can *need* any particular library P > to work. The most it can need is some library with > an API which is compatible with P's. So I don't > buy that argument. It's the view of the FSF, AFAIK. You can't distribute an application in binary which dynamically links against libreadline (which is GPLed) on the user's machine, since even though you don't distribute libreadline the application running on the user's machine is considered the "whole" in terms of the GPL. FWIW, I don't agree with that view either, but that's probably because I'm a programmer and not a lawyer :) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Fri Dec 15 10:25:12 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 11:25:12 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: Message-ID: <3A39F188.E366B481@lemburg.com> Tim Peters wrote: > > [Tim and MAL talking about the FSF and their views] > > [Tim and MAL showing off as hobby advocates ;-)] > > >>> Another issue: since Python doesn't link Python scripts, is it > >>> still true that if one (pure) Python package is covered by the GPL, > >>> then all other packages needed by that application will also fall > >>> under GPL ? > > > This is very controversial: if an application Q needs a GPLed > > library P to work, then P and Q form a new whole in the sense of > > the GPL. 
> > And this even though P wasn't even distributed together
> > with Q. Don't ask me why, but that's how RMS and folks look at it.
>
> Understood, but have you reread your question above, which I've said twice I
> can't make sense of?

I know, it was backwards. Take an example: I have a program which wants to process MP3 files in some way. Now because of some stroke of luck, all Python MP3 modules out there are covered by the GPL. Now I could write an application which uses a certain interface and then tell the user to install the MP3 module separately. As Barry mentioned, this setup will cause distribution of my application to be illegal because I could have only done so by putting the application under the GPL.

> You should read RMS on why glibc is under the LGPL:
>
> http://www.fsf.org/philosophy/why-not-lgpl.html
>
> It will at least disabuse you of the notion that RMS and you are after the
> same thing .

:-) Let's stop this discussion and get back to those cheerful things like Christmas Bells and Santa Claus... :-)

--
Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From akuchlin@mems-exchange.org Fri Dec 15 13:27:24 2000
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Fri, 15 Dec 2000 08:27:24 -0500
Subject: [Python-Dev] Use of %c and Py_UNICODE
Message-ID: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com>

unicodeobject.c contains this code:

    PyErr_Format(PyExc_ValueError,
                 "unsupported format character '%c' (0x%x) "
                 "at index %i",
                 c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));

c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits, so '%\u3000' % 1 results in an error message containing "'\000' (0x3000)". Is this worth fixing? I'd say no, since the hex value is more useful for Unicode strings anyway.
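[To make the truncation concrete: C's %c conversion takes an int argument and converts it to unsigned char, so only the low 8 bits of a Py_UNICODE value survive. A tiny Python model of that behavior -- an illustrative sketch, not the actual unicodeobject.c code, and it assumes 8-bit chars:]

```python
def c_percent_c(value):
    # Model of C's %c: the int argument is converted to unsigned char,
    # so only the low 8 bits are kept (assumes CHAR_BIT == 8).
    return value & 0xFF

# 0x3000 is IDEOGRAPHIC SPACE; C's %c reduces it to byte 0x00, which is
# why the message shows '\000' next to the correct hex value (0x3000).
assert c_percent_c(0x3000) == 0x00
assert c_percent_c(ord("A")) == ord("A")  # values below 256 pass through intact
```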
(I still wanted to mention this little buglet, since I just touched this bit of code.) --amk From jack@oratrix.nl Fri Dec 15 14:26:15 2000 From: jack@oratrix.nl (Jack Jansen) Date: Fri, 15 Dec 2000 15:26:15 +0100 Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: Message by Guido van Rossum , Thu, 14 Dec 2000 09:51:26 -0500 , <200012141451.JAA15637@cj20424-a.reston1.va.home.com> Message-ID: <20001215142616.705993B9B44@snelboot.oratrix.nl> > The reason for the patch is that without this, if you kill a TCP server > and restart it right away, you'll get a 'port in use" error -- TCP has > some kind of strange wait period after a connection is closed before > it can be reused. The patch avoids this error. Well, actually there's a pretty good reason for the "port in use" behaviour: the TCP standard more-or-less requires it. A srchost/srcport/dsthost/dstport combination should not be reused until the maximum TTL has passed, because there may still be "old" retransmissions around. Especially the "open" packets are potentially dangerous. Setting the reuse bit while you're debugging is fine, but setting it in general is not a very good idea... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido@python.org Fri Dec 15 14:31:19 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 09:31:19 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: Your message of "Thu, 14 Dec 2000 20:02:08 PST." 
<3A3997C0.F977AF51@ActiveState.com> References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> Message-ID: <200012151431.JAA19799@cj20424-a.reston1.va.home.com> > Greg Ewing wrote: > > > > Paul Prescod : > > > > > We could say that a local can only shadow a global > > > if the local is formally declared. > > > > How do you intend to enforce that? Seems like it would > > require a test on every assignment to a local, to make > > sure nobody has snuck in a new global since the function > > was compiled. > > I would expect that all of the checks would be at compile-time. Except > for __dict__ hackery, I think it is doable. Python already keeps track > of all assignments to locals and all assignments to globals in a > function scope. The only addition is keeping track of assignments at a > global scope. > > > > Actually, one could argue that there is no good reason to > > > even *allow* the shadowing of globals. > > > > If shadowing were completely disallowed, it would make it > > impossible to write a completely self-contained function > > whose source could be moved from one environment to another > > without danger of it breaking. I wouldn't like the language > > to have a characteristic like that. > > That seems like a very esoteric requirement. How often do you have > functions that do not rely *at all* on their environment (other > functions, import statements, global variables). > > When you move code you have to do some rewriting or customizing of the > environment in 94% of the cases. How much effort do you want to spend on > the other 6%? Also, there are tools that are designed to help you move > code without breaking programs (refactoring editors). They can just as > easily handle renaming local variables as adding import statements and > fixing up function calls. Can we cut this out please? Paul is misguided. There's no reason to forbid a local shadowing a global. All languages with nested scopes allow this. 
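[For readers skimming the thread, a minimal sketch of the rule being defended here: in Python, an assignment anywhere in a function body makes that name local for the entire function, shadowing any same-named global. The function names below are illustrative, not from the thread:]

```python
x = "global"

def reads_global():
    # no assignment to x in this function, so x refers to the global
    return x

def shadows_global():
    x = "local"  # assignment is declaration: x is local throughout
    return x

def reads_before_assigning():
    # x is local here too (it is assigned below), so reading it first
    # raises UnboundLocalError instead of falling back to the global
    try:
        seen = x
    except UnboundLocalError:
        seen = "unbound"
    x = "local"
    return seen

assert reads_global() == "global"
assert shadows_global() == "local"
assert reads_before_assigning() == "unbound"
assert x == "global"  # the module-level binding is never touched
```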
--Guido van Rossum (home page: http://www.python.org/~guido/) From barry@digicool.com Fri Dec 15 16:17:08 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Fri, 15 Dec 2000 11:17:08 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> Message-ID: <14906.17412.221040.895357@anthem.concentric.net> >>>>> "M" == M writes: M> It's the view of the FSF, AFAIK. You can't distribute an M> application in binary which dynamically links against M> libreadline (which is GPLed) on the user's machine, since even M> though you don't distribute libreadline the application running M> on the user's machine is considered the "whole" in terms of the M> GPL. M> FWIW, I don't agree with that view either, but that's probably M> because I'm a programmer and not a lawyer :) I'm not sure I agree with that view either, but mostly because there is a non-GPL replacement for parts of the readline API: http://www.cstr.ed.ac.uk/downloads/editline.html Don't know anything about it, so it may not be featureful enough for Python's needs, but if licensing is really a problem, it might be worth looking into. -Barry From paulp@ActiveState.com Fri Dec 15 16:16:37 2000 From: paulp@ActiveState.com (Paul Prescod) Date: Fri, 15 Dec 2000 08:16:37 -0800 Subject: [Python-Dev] new draft of PEP 227 References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com> Message-ID: <3A3A43E5.347AAF6C@ActiveState.com> Guido van Rossum wrote: > > ... > > Can we cut this out please? Paul is misguided. There's no reason to > forbid a local shadowing a global. All languages with nested scopes > allow this. Python is the only one I know of that implicitly shadows without requiring some form of declaration. JavaScript has it right: reading and writing of globals are symmetrical. 
In the rare case that you explicitly want to shadow, you need a declaration. Python's rule is confusing, implicit and error causing. In my opinion, of course. If you are dead-set against explicit declarations then I would say that disallowing the ambiguous construct is better than silently treating it as a declaration. Paul Prescod From guido@python.org Fri Dec 15 16:23:07 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 11:23:07 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: Your message of "Fri, 15 Dec 2000 08:16:37 PST." <3A3A43E5.347AAF6C@ActiveState.com> References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com> <3A3A43E5.347AAF6C@ActiveState.com> Message-ID: <200012151623.LAA27630@cj20424-a.reston1.va.home.com> > Python is the only one I know of that implicitly shadows without > requiring some form of declaration. JavaScript has it right: reading and > writing of globals are symmetrical. In the rare case that you explicitly > want to shadow, you need a declaration. Python's rule is confusing, > implicit and error causing. In my opinion, of course. If you are > dead-set against explicit declarations then I would say that disallowing > the ambiguous construct is better than silently treating it as a > declaration. Let's agree to differ. This will never change. In Python, assignment is declaration. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Dec 15 17:01:33 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 12:01:33 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE In-Reply-To: Your message of "Fri, 15 Dec 2000 08:27:24 EST." 
<200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> Message-ID: <200012151701.MAA28058@cj20424-a.reston1.va.home.com> > unicodeobject.c contains this code: > > PyErr_Format(PyExc_ValueError, > "unsupported format character '%c' (0x%x) " > "at index %i", > c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat)); > > c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits, > so '%\u3000' % 1 results in an error message containing "'\000' > (0x3000)". Is this worth fixing? I'd say no, since the hex value is > more useful for Unicode strings anyway. (I still wanted to mention > this little buglet, since I just touched this bit of code.) Sounds like the '%c' should just be deleted. --Guido van Rossum (home page: http://www.python.org/~guido/) From bckfnn@worldonline.dk Fri Dec 15 17:05:42 2000 From: bckfnn@worldonline.dk (Finn Bock) Date: Fri, 15 Dec 2000 17:05:42 GMT Subject: [Python-Dev] CWD in sys.path. Message-ID: <3a3a480b.28490597@smtp.worldonline.dk> Hi, I'm trying to understand the initialization of sys.path and especially if CWD is supposed to be included in sys.path by default. (I understand the purpose of sys.path[0], that is not the focus of my question). My setup is Python2.0 on Win2000, no PYTHONHOME or PYTHONPATH envvars. In this setup, an empty string exists as sys.path[1], but I'm unsure if this is by careful design or some freak accident. The empty entry is added because HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath does *not* have any subkey. There are a default value, but that value appears to be ignored. If I add a subkey "foo": HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath\foo with a default value of "d:\foo", the CWD is no longer in sys.path. 
i:\java\jython.cvs\org\python\util>d:\Python20\python.exe -S Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. >>> import sys >>> sys.path ['', 'd:\\foo', 'D:\\PYTHON20\\DLLs', 'D:\\PYTHON20\\lib', 'D:\\PYTHON20\\lib\\plat-win', 'D:\\PYTHON20\\lib\\lib-tk', 'D:\\PYTHON20'] >>> I noticed that some of the PYTHONPATH macros in PC/config.h includes the '.', others does not. So, to put it as a question (for jython): Should CWD be included in sys.path? Are there some situation (like embedding) where CWD shouldn't be in sys.path? regards, finn From guido@python.org Fri Dec 15 17:12:03 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 12:12:03 -0500 Subject: [Python-Dev] CWD in sys.path. In-Reply-To: Your message of "Fri, 15 Dec 2000 17:05:42 GMT." <3a3a480b.28490597@smtp.worldonline.dk> References: <3a3a480b.28490597@smtp.worldonline.dk> Message-ID: <200012151712.MAA02544@cj20424-a.reston1.va.home.com> On Unix, CWD is not in sys.path unless as sys.path[0]. --Guido van Rossum (home page: http://www.python.org/~guido/) From moshez@zadka.site.co.il Sat Dec 16 01:43:41 2000 From: moshez@zadka.site.co.il (Moshe Zadka) Date: Sat, 16 Dec 2000 03:43:41 +0200 (IST) Subject: [Python-Dev] new draft of PEP 227 Message-ID: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il> On Fri, 15 Dec 2000 08:16:37 -0800, Paul Prescod wrote: > Python is the only one I know of that implicitly shadows without > requiring some form of declaration. Perl and Scheme permit implicit shadowing too. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tismer@tismer.com Fri Dec 15 16:42:18 2000 From: tismer@tismer.com (Christian Tismer) Date: Fri, 15 Dec 2000 18:42:18 +0200 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule) References: Message-ID: <3A3A49EA.5D9418E@tismer.com> Tim Peters wrote: ... 
> > Another issue: since Python doesn't link Python scripts, is it > > still true that if one (pure) Python package is covered by the GPL, > > then all other packages needed by that application will also fall > > under GPL ? > > Sorry, couldn't make sense of the question. Just as well, since you should > ask about it on a GNU forum anyway . The GNU license is transitive. It automatically extends on other parts of a project, unless they are identifiable, independent developments. As soon as a couple of modules is published together, based upon one GPL-ed module, this propagates. I think this is what MAL meant? Anyway, I'd be interested to hear what the GNU forum says. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From akuchlin@mems-exchange.org Fri Dec 15 18:10:34 2000 From: akuchlin@mems-exchange.org (A.M. Kuchling) Date: Fri, 15 Dec 2000 13:10:34 -0500 Subject: [Python-Dev] What to do about PEP 229? Message-ID: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> I began writing the fabled fancy setup script described in PEP 229, and then realized there was duplication going on here. The code in setup.py would need to know what libraries, #defines, &c., are needed by each module in order to check if they're needed and set them. But if Modules/Setup can be used to override setup.py's behaviour, then much of this information would need to be in that file, too; the details of compiling a module are in two places. Possibilities: 1) Setup contains fully-loaded module descriptions, and the setup script drops unneeded bits. For example, the socket module requires -lnsl on some platforms. 
The Setup file would contain "socket socketmodule.c -lnsl" on all platforms, and setup.py would check for an nsl library and only use if it's there. This seems dodgy to me; what if -ldbm is needed on one platform and -lndbm on another? 2) Drop setup completely and just maintain setup.py, with some different overriding mechanism. This is more radical. Adding a new module is then not just a matter of editing a simple text file; you'd have to modify setup.py, making it more like maintaining an autoconf script. Remember, the underlying goal of PEP 229 is to have the out-of-the-box Python installation you get from "./configure;make" contain many more useful modules; right now you wouldn't get zlib, syslog, resource, any of the DBM modules, PyExpat, &c. I'm not wedded to using Distutils to get that, but think that's the only practical way; witness the hackery required to get the DB module automatically compiled. You can also wave your hands in the direction of packagers such as ActiveState or Red Hat, and say "let them make to compile everything". But this problem actually inconveniences *me*, since I always build Python myself and have to extensively edit Setup, so I'd like to fix the problem. Thoughts? --amk From nas@arctrix.com Fri Dec 15 12:03:04 2000 From: nas@arctrix.com (Neil Schemenauer) Date: Fri, 15 Dec 2000 04:03:04 -0800 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <14906.17412.221040.895357@anthem.concentric.net>; from barry@digicool.com on Fri, Dec 15, 2000 at 11:17:08AM -0500 References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> Message-ID: <20001215040304.A22056@glacier.fnational.com> On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. 
Warsaw wrote: > I'm not sure I agree with that view either, but mostly because there > is a non-GPL replacement for parts of the readline API: > > http://www.cstr.ed.ac.uk/downloads/editline.html It doesn't work with the current readline module. It is much smaller than readline and works just as well in my experience. Would there be any interest in including a copy with the standard distribution? The license is quite nice (X11 type). Neil From nas@arctrix.com Fri Dec 15 12:14:50 2000 From: nas@arctrix.com (Neil Schemenauer) Date: Fri, 15 Dec 2000 04:14:50 -0800 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012151509.HAA18093@slayer.i.sourceforge.net>; from gvanrossum@users.sourceforge.net on Fri, Dec 15, 2000 at 07:09:46AM -0800 References: <200012151509.HAA18093@slayer.i.sourceforge.net> Message-ID: <20001215041450.B22056@glacier.fnational.com> On Fri, Dec 15, 2000 at 07:09:46AM -0800, Guido van Rossum wrote: > Update of /cvsroot/python/python/dist/src/Lib > In directory slayer.i.sourceforge.net:/tmp/cvs-serv18082 > > Modified Files: > httplib.py > Log Message: > Get rid of string functions. Can you explain the logic behind this recent interest in removing string functions from the standard library? Is it performance? Some unicode issue? I don't have a great attachment to string.py but I also don't see the justification for the amount of work it requires. Neil From guido@python.org Fri Dec 15 19:29:37 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 14:29:37 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Fri, 15 Dec 2000 04:14:50 PST." <20001215041450.B22056@glacier.fnational.com> References: <200012151509.HAA18093@slayer.i.sourceforge.net> <20001215041450.B22056@glacier.fnational.com> Message-ID: <200012151929.OAA03073@cj20424-a.reston1.va.home.com> > Can you explain the logic behind this recent interest in removing > string functions from the standard library? Is it performance?
> Some unicode issue? I don't have a great attachment to string.py > but I also don't see the justification for the amount of work it > requires. I figure that at *some* point we should start putting our money where our mouth is, deprecate most uses of the string module, and start warning about it. Not in 2.1 probably, given my experience below. As a realistic test of the warnings module I played with some warnings about the string module, and then found that say most of the std library modules use it, triggering an extraordinary amount of warnings. I then decided to experiment with the conversion. I quickly found out it's too much work to do manually, so I'll hold off until someone comes up with a tool that does 99% of the work. (The selection of std library modules to convert manually was triggered by something pretty random -- I decided to silence a particular cron job I was running. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From Barrett@stsci.edu Fri Dec 15 19:32:10 2000 From: Barrett@stsci.edu (Paul Barrett) Date: Fri, 15 Dec 2000 14:32:10 -0500 (EST) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> Message-ID: <14906.17712.830224.481130@nem-srvr.stsci.edu> Guido, Here are my comments on PEP 207. (I've also gone back and read most of the 1998 discussion. What a tedious, in terms of time, but enlightening, in terms of content, discussion that was.) | - New function: | | PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op) | | This performs the requested rich comparison, returning a Python | object or raising an exception. The 3rd argument must be one of | LT, LE, EQ, NE, GT or GE. I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc. | Classes | | - Classes can define new special methods __lt__, __le__, __gt__, | __ge__, __eq__, __ne__ to override the corresponding operators. 
| (You gotta love the Fortran heritage.) If a class overrides | __cmp__ as well, it is only used by PyObject_Compare(). Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to __lt__, __le__, __eq__, etc. I'm not keen on the FORTRAN derived symbolism. I also find it contrary to Python's heritage of being clear and concise. I don't mind typing __lessequal__ (or __less_equal__) once per class for the additional clarity. | - Should we even bother upgrading the existing types? Isn't this question partly related to the coercion issue and which type of comparison takes precedence? And if so, then I would think the answer would be 'yes'. Or better still see below my suggestion of adding poor and rich comparison operators along with matrix-type operators. - If so, how should comparisons on container types be defined? Suppose we have a list whose items define rich comparisons. How should the itemwise comparisons be done? For example:

    def __lt__(a, b): # a<b
        for i in range(min(len(a), len(b))):
            ai, bi = a[i], b[i]
            if ai < bi: return 1
            if ai == bi: continue
            if ai > bi: return 0
            raise TypeError, "incomparable item types"
        return len(a) < len(b)

This uses the same sequence of comparisons as cmp(), so it may as well use cmp() instead:

    def __lt__(a, b): # a<b
        for i in range(min(len(a), len(b))):
            c = cmp(a[i], b[i])
            if c < 0: return 1
            if c == 0: continue
            if c > 0: return 0
            assert 0 # unreachable
        return len(a) < len(b)

And now there's not really a reason to change lists to rich comparisons. I don't understand this example. If a[i] and b[i] define rich comparisons, then 'a[i] < b[i]' is likely to return a non-boolean value. Yet the 'if' statement expects a boolean value. I don't see how the above example will work. This example also makes me think that the proposals for new operators (ie. PEP 211 and 225) are a good idea. The discussion of rich comparisons in 1998 also lends some support to this. I can see many uses for two types of comparison operators (as well as the proposed matrix-type operators), one set for poor or boolean comparisons and one for rich or non-boolean comparisons. For example, numeric arrays can define both.
Rich comparison operators would return an array of boolean values, while poor comparison operators return a boolean value by performing an implied 'and.reduce' operation. These operators provide clarity and conciseness, without much change to current Python behavior. -- Paul From guido@python.org Fri Dec 15 19:51:04 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 14:51:04 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Fri, 15 Dec 2000 14:32:10 EST." <14906.17712.830224.481130@nem-srvr.stsci.edu> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> Message-ID: <200012151951.OAA03219@cj20424-a.reston1.va.home.com> > Here are my comments on PEP 207. (I've also gone back and read most > of the 1998 discussion. What a tedious, in terms of time, but > enlightening, in terms of content, discussion that was.) > > | - New function: > | > | PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op) > | > | This performs the requested rich comparison, returning a Python > | object or raising an exception. The 3rd argument must be one of > | LT, LE, EQ, NE, GT or GE. > > I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc. This is only at the C level. Having to do a string compare is too slow. Since some of these are multi-character symbols, a character constant doesn't suffice (multi-character character constants are not portable). > | Classes > | > | - Classes can define new special methods __lt__, __le__, __gt__, > | __ge__, __eq__, __ne__ to override the corresponding operators. > | (You gotta love the Fortran heritage.) If a class overrides > | __cmp__ as well, it is only used by PyObject_Compare(). > > Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to > __lt__, __le__, __eq__, etc. I'm not keen on the FORTRAN derived > symbolism. I also find it contrary to Python's heritage of being > clear and concise. 
I don't mind typing __lessequal__ (or > __less_equal__) once per class for the additional clarity. I don't care about Fortran, but you just showed why I think the short operator names are better: there's less guessing or disagreement about how they are to be spelled. E.g. should it be __lessthan__ or __less_than__ or __less__? > | - Should we even bother upgrading the existing types? > > Isn't this question partly related to the coercion issue and which > type of comparison takes precedence? And if so, then I would think > the answer would be 'yes'. It wouldn't make much of a difference -- comparisons between different types of numbers would get the same outcome either way. > Or better still see below my suggestion of > adding poor and rich comparison operators along with matrix-type > operators. > > > - If so, how should comparisons on container types be defined? > Suppose we have a list whose items define rich comparisons. How > should the itemwise comparisons be done? For example:
>
> def __lt__(a, b): # a<b
>     for i in range(min(len(a), len(b))):
>         ai, bi = a[i], b[i]
>         if ai < bi: return 1
>         if ai == bi: continue
>         if ai > bi: return 0
>         raise TypeError, "incomparable item types"
>     return len(a) < len(b)
>
> This uses the same sequence of comparisons as cmp(), so it may > as well use cmp() instead:
>
> def __lt__(a, b): # a<b
>     for i in range(min(len(a), len(b))):
>         c = cmp(a[i], b[i])
>         if c < 0: return 1
>         if c == 0: continue
>         if c > 0: return 0
>         assert 0 # unreachable
>     return len(a) < len(b)
>
> And now there's not really a reason to change lists to rich > comparisons. > > I don't understand this example. If a[i] and b[i] define rich > comparisons, then 'a[i] < b[i]' is likely to return a non-boolean > value. Yet the 'if' statement expects a boolean value. I don't see > how the above example will work. Sorry. I was thinking of list items that contain objects that respond to the new overloading protocol, but still return Boolean outcomes.
My conclusion is that __cmp__ is just as well. > This example also makes me think that the proposals for new operators > (ie. PEP 211 and 225) are a good idea. The discussion of rich > comparisons in 1998 also lends some support to this. I can see many > uses for two types of comparison operators (as well as the proposed > matrix-type operators), one set for poor or boolean comparisons and > one for rich or non-boolean comparisons. For example, numeric arrays > can define both. Rich comparison operators would return an array of > boolean values, while poor comparison operators return a boolean value > by performing an implied 'and.reduce' operation. These operators > provide clarity and conciseness, without much change to current Python > behavior. Maybe. That can still be decided later. Right now, adding operators is not on the table for 2.1 (if only because there are two conflicting PEPs); adding rich comparisons *is* on the table because it doesn't change the parser (and because the rich comparisons idea was already pretty much worked out two years ago). --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Fri Dec 15 21:08:02 2000 From: tim.one@home.com (Tim Peters) Date: Fri, 15 Dec 2000 16:08:02 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012151929.OAA03073@cj20424-a.reston1.va.home.com> Message-ID: [Neil Schemenauer] > Can you explain the logic behind this recent interest in removing > string functions from the standard library? It it performance? > Some unicode issue? I don't have a great attachment to string.py > but I also don't see the justification for the amount of work it > requires. [Guido] > I figure that at *some* point we should start putting our money where > our mouth is, deprecate most uses of the string module, and start > warning about it. Not in 2.1 probably, given my experience below. I think this begs Neil's questions: *is* our mouth there , and if so, why? 
The only public notice of impending string module deprecation anyone came up with was a vague note on the 1.6 web page, and one not repeated in any of the 2.0 release material. "string" is right up there with "os" and "sys" as a FIM (Frequently Imported Module), so the required code changes will be massive. As a user, I don't see what's in it for me to endure that pain: the string module functions work fine! Neither are they warts in the language, any more than that we say sin(pi) instead of pi.sin(). Keeping the functions around doesn't hurt anybody that I can see. > As a realistic test of the warnings module I played with some warnings > about the string module, and then found that say most of the std > library modules use it, triggering an extraordinary amount of > warnings. I then decided to experiment with the conversion. I > quickly found out it's too much work to do manually, so I'll hold off > until someone comes up with a tool that does 99% of the work. Ah, so that's the *easy* way to kill this crusade -- forget I said anything . From Barrett@stsci.edu Fri Dec 15 21:20:20 2000 From: Barrett@stsci.edu (Paul Barrett) Date: Fri, 15 Dec 2000 16:20:20 -0500 (EST) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <200012151951.OAA03219@cj20424-a.reston1.va.home.com> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> <200012151951.OAA03219@cj20424-a.reston1.va.home.com> Message-ID: <14906.33325.5784.118110@nem-srvr.stsci.edu> >> This example also makes me think that the proposals for new operators >> (ie. PEP 211 and 225) are a good idea. The discussion of rich >> comparisons in 1998 also lends some support to this. I can see many >> uses for two types of comparison operators (as well as the proposed >> matrix-type operators), one set for poor or boolean comparisons and >> one for rich or non-boolean comparisons. For example, numeric arrays >> can define both. 
Rich comparison operators would return an array of >> boolean values, while poor comparison operators return a boolean value >> by performing an implied 'and.reduce' operation. These operators >> provide clarity and conciseness, without much change to current Python >> behavior. > > Maybe. That can still be decided later. Right now, adding operators > is not on the table for 2.1 (if only because there are two conflicting > PEPs); adding rich comparisons *is* on the table because it doesn't > change the parser (and because the rich comparisons idea was already > pretty much worked out two years ago). Yes, it was worked out previously _assuming_ rich comparisons do not use any new operators. But let's stop for a moment and contemplate adding rich comparisons along with new comparison operators. What do we gain? 1. The current boolean operator behavior does not have to change, and hence will be backward compatible. 2. It eliminates the need to decide whether or not rich comparisons takes precedence over boolean comparisons. 3. The new operators add additional behavior without directly impacting current behavior and the use of them is unambigous, at least in relation to current Python behavior. You know by the operator what type of comparison will be returned. This should appease Jim Fulton, based on his arguments in 1998 about comparison operators always returning a boolean value. 4. Compound objects, such as lists, could implement both rich and boolean comparisons. The boolean comparison would remain as is, while the rich comparison would return a list of boolean values. Current behavior doesn't change; just a new feature, which you may or may not choose to use, is added. If we go one step further and add the matrix-style operators along with the comparison operators, we can provide a consistent user interface to array/complex operations without changing current Python behavior. 
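[Editor's note: the two flavors described here can be sketched with a toy class; the `Vec` name and the `lt_all` helper are hypothetical, purely for illustration.]

```python
class Vec:
    """Toy numeric vector illustrating rich vs. boolean comparison."""
    def __init__(self, items):
        self.items = list(items)

    def __lt__(self, other):
        # "Rich" comparison: an element-wise result, not a single boolean.
        return [a < b for a, b in zip(self.items, other.items)]

    def lt_all(self, other):
        # "Poor"/boolean comparison: an implied and.reduce over the elements.
        return all(self < other)

assert (Vec([1, 5, 3]) < Vec([2, 4, 9])) == [True, False, True]
assert Vec([1, 2]).lt_all(Vec([3, 4]))
```

Note the parentheses around the `<` expression in the first assert: without them, Python would chain the comparisons.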
If a user has no need for these new operators, he doesn't have to use them or even know about them. All we've done is made Python richer, but I believe without making it more complex. For example, all element-wise operations could have a ':' appended to them, e.g. '+:', '<:', etc.; and will define element-wise addition, element-wise less-than, etc. The traditional '*', '/', etc. operators can then be used for matrix operations, which will appease the Matlab people. Therefore, I don't think rich comparisons and matrix-type operators should be considered separable. I really think you should consider this suggestion. It appeases many groups while providing a consistent and clear user interface, without greatly impacting current Python behavior. Always-causing-havoc-at-the-last-moment-ly Yours, Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From guido@python.org Fri Dec 15 21:23:46 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 16:23:46 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Fri, 15 Dec 2000 16:08:02 EST." References: Message-ID: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> > "string" is right up there with "os" and "sys" as a FIM (Frequently > Imported Module), so the required code changes will be massive. As > a user, I don't see what's in it for me to endure that pain: the > string module functions work fine! Neither are they warts in the > language, any more than that we say sin(pi) instead of pi.sin(). > Keeping the functions around doesn't hurt anybody that I can see. Hm. I'm not saying that this one will be easy. But I don't like having "two ways to do it". It means more learning, etc. (you know the drill). We could have chosen to make the strop module support Unicode; instead, we chose to give string objects methods and promote the use of those methods instead of the string module.
(And in a generous mood, we also supported Unicode in the string module -- by providing wrappers that invoke string methods.) If you're saying that we should give users ample time for the transition, I'm with you. If you're saying that you think the string module is too prominent to ever start deprecating its use, I'm afraid we have a problem. I'd also like to note that using the string module's wrappers incurs the overhead of a Python function call -- using string methods is faster. Finally, I like the look of fields[i].strip().lower() much better than that of string.lower(string.strip(fields[i])) -- an actual example from mimetools.py. Ideally, I would like to deprecate the entire string module, so that I can place a single warning at its top. This will cause a single warning to be issued for programs that still use it (no matter how many times it is imported). Unfortunately, there are a couple of things that still need it: string.letters etc., and string.maketrans(). --Guido van Rossum (home page: http://www.python.org/~guido/) From gvwilson@nevex.com Fri Dec 15 21:43:47 2000 From: gvwilson@nevex.com (Greg Wilson) Date: Fri, 15 Dec 2000 16:43:47 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <14906.33325.5784.118110@nem-srvr.stsci.edu> Message-ID: <002901c066e0$1b3f13c0$770a0a0a@nevex.com> This is a multi-part message in MIME format. ------=_NextPart_000_002A_01C066B6.32690BC0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Hi, Paul; thanks for your mail. W.r.t. adding matrix operators to Python, you may want to take a look at the counter-arguments in PEP 0211 (attached). Basically, I spoke with the authors of GNU Octave (a GPL'd clone of MATLAB) about what users really used. 
They felt that the only matrix operator that really mattered was matrix-matrix multiply; other operators (including the left and right division operators that even experienced MATLAB users often mix up) were second order at best, and were better handled with methods or functions. Thanks, Greg p.s. PEP 0225 (also attached) is an alternative to PEP 0211 which would add most of the MATLAB-ish operators to Python. ------=_NextPart_000_002A_01C066B6.32690BC0 Content-Type: text/plain; name="pep-0211.txt" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="pep-0211.txt" PEP: 211=0A= Title: Adding New Linear Algebra Operators to Python=0A= Version: $Revision: 1.5 $=0A= Author: gvwilson@nevex.com (Greg Wilson)=0A= Status: Draft=0A= Type: Standards Track=0A= Python-Version: 2.1=0A= Created: 15-Jul-2000=0A= Post-History:=0A= =0A= =0A= Introduction=0A= =0A= This PEP describes a conservative proposal to add linear algebra=0A= operators to Python 2.0. It discusses why such operators are=0A= desirable, and why a minimalist approach should be adopted at this=0A= point. This PEP summarizes discussions held in mailing list=0A= forums, and provides URLs for further information, where=0A= appropriate. The CVS revision history of this file contains the=0A= definitive historical record.=0A= =0A= =0A= Summary=0A= =0A= Add a single new infix binary operator '@' ("across"), and=0A= corresponding special methods "__across__()", "__racross__()", and=0A= "__iacross__()". This operator will perform mathematical matrix=0A= multiplication on NumPy arrays, and generate cross-products when=0A= applied to built-in sequence types. No existing operator=0A= definitions will be changed.=0A= =0A= =0A= Background=0A= =0A= The first high-level programming language, Fortran, was invented=0A= to do arithmetic. 
While this is now just a small part of=0A= computing, there are still many programmers who need to express=0A= complex mathematical operations in code.=0A= =0A= The most influential of Fortran's successors was APL [1]. Its=0A= author, Kenneth Iverson, designed the language as a notation for=0A= expressing matrix algebra, and received the 1980 Turing Award for=0A= his work.=0A= =0A= APL's operators supported both familiar algebraic operations, such=0A= as vector dot product and matrix multiplication, and a wide range=0A= of structural operations, such as stitching vectors together to=0A= create arrays. Even by programming standards, APL is=0A= exceptionally cryptic: many of its symbols did not exist on=0A= standard keyboards, and expressions have to be read right to left.=0A= =0A= Most subsequent numerical languages, such as Fortran-90,=0A= MATLAB, and Mathematica, have tried to provide the power of APL=0A= without the obscurity. Python's NumPy [2] has most of the=0A= features that users of such languages expect, but these are=0A= provided through named functions and methods, rather than=0A= overloaded operators. This makes NumPy clumsier than most=0A= alternatives.=0A= =0A= The author of this PEP therefore consulted the developers of GNU=0A= Octave [3], an open source clone of MATLAB. When asked how=0A= important it was to have infix operators for matrix solution,=0A= Prof. James Rawlings replied [4]:=0A= =0A= I DON'T think it's a must have, and I do a lot of matrix=0A= inversion. I cannot remember if its A\b or b\A so I always=0A= write inv(A)*b instead. I recommend dropping \.=0A= =0A= Rawlings' feedback on other operators was similar. 
It is worth=0A= noting in this context that notations such as "/" and "\" for=0A= matrix solution were invented by programmers, not mathematicians,=0A= and have not been adopted by the latter.=0A= =0A= Based on this discussion, and feedback from classes at the US=0A= national laboratories and elsewhere, we recommend only adding a=0A= matrix multiplication operator to Python at this time. If there=0A= is significant user demand for syntactic support for other=0A= operations, these can be added in a later release.=0A= =0A= =0A= Requirements=0A= =0A= The most important requirement is minimal impact on existing=0A= Python programs and users: the proposal must not break existing=0A= code (except possibly NumPy).=0A= =0A= The second most important requirement is the ability to handle all=0A= common cases cleanly and clearly. There are nine such cases:=0A= =0A= |5 6| * 9 =3D |45 54| MS: matrix-scalar multiplication=0A= |7 8| |63 72|=0A= =0A= 9 * |5 6| =3D |45 54| SM: scalar-matrix multiplication=0A= |7 8| |63 72|=0A= =0A= |2 3| * |4 5| =3D |8 15| VE: vector elementwise = multiplication=0A= =0A= =0A= |2 3| * |4| =3D 23 VD: vector dot product=0A= |5|=0A= =0A= |2| * |4 5| =3D | 8 10| VO: vector outer product=0A= |3| |12 15|=0A= =0A= |1 2| * |5 6| =3D | 5 12| ME: matrix elementwise = multiplication=0A= |3 4| |7 8| |21 32|=0A= =0A= |1 2| * |5 6| =3D |19 22| MM: mathematical matrix = multiplication=0A= |3 4| |7 8| |43 50|=0A= =0A= |1 2| * |5 6| =3D |19 22| VM: vector-matrix multiplication=0A= |7 8|=0A= =0A= |5 6| * |1| =3D |17| MV: matrix-vector multiplication=0A= |7 8| |2| |23|=0A= =0A= Note that 1-dimensional vectors are treated as rows in VM, as=0A= columns in MV, and as both in VD and VO. Both are special cases=0A= of 2-dimensional matrices (Nx1 and 1xN respectively). 
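[Editorial note: the nine cases in the table above can be checked today with NumPy's named functions -- the very forms the PEP calls clumsy. The modern "np" API postdates this message; the values are the table's own.]

```python
import numpy as np

# The table's example operands.
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
v = np.array([2, 3])
w = np.array([4, 5])

assert (A * B == [[5, 12], [21, 32]]).all()           # ME: elementwise
assert (np.dot(A, B) == [[19, 22], [43, 50]]).all()   # MM: mathematical
assert np.dot(v, w) == 23                             # VD: vector dot product
assert (np.outer(v, w) == [[8, 10], [12, 15]]).all()  # VO: vector outer product
```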
We will=0A= therefore define the new operator only for 2-dimensional arrays,=0A= and provide an easy (and efficient) way for users to treat=0A= 1-dimensional structures as 2-dimensional.=0A= =0A= Third, we must avoid confusion between Python's notation and those=0A= of MATLAB and Fortran-90. In particular, mathematical matrix=0A= multiplication (case MM) should not be represented as '.*', since:=0A= =0A= (a) MATLAB uses prefix-'.' forms to mean 'elementwise', and raw=0A= forms to mean "mathematical"; and=0A= =0A= (b) even if the Python parser can be taught how to handle dotted=0A= forms, '1.*A' will still be visually ambiguous.=0A= =0A= =0A= Proposal=0A= =0A= The meanings of all existing operators will be unchanged. In=0A= particular, 'A*B' will continue to be interpreted elementwise.=0A= This takes care of the cases MS, SM, VE, and ME, and ensures=0A= minimal impact on existing programs.=0A= =0A= A new operator '@' (pronounced "across") will be added to Python,=0A= along with special methods "__across__()", "__racross__()", and=0A= "__iacross__()", with the usual semantics. 
(We recommend using=0A= "@", rather than the times-like "><", because of the ease with=0A= which the latter could be mis-typed as inequality "<>".)=0A= =0A= No new operators will be defined to mean "solve a set of linear=0A= equations", or "invert a matrix".=0A= =0A= (Optional) When applied to sequences, the "@" operator will return=0A= a tuple of tuples containing the cross-product of their elements=0A= in left-to-right order:=0A= =0A= >>> [1, 2] @ (3, 4)=0A= ((1, 3), (1, 4), (2, 3), (2, 4))=0A= =0A= >>> [1, 2] @ (3, 4) @ (5, 6)=0A= ((1, 3, 5), (1, 3, 6), =0A= (1, 4, 5), (1, 4, 6),=0A= (2, 3, 5), (2, 3, 6),=0A= (2, 4, 5), (2, 4, 6))=0A= =0A= This will require the same kind of special support from the parser=0A= as chained comparisons (such as "a<b<c"). It will also be=0A= possible to write:=0A= =0A= >>> for (i, j) in [1, 2] @ [3, 4]:=0A= >>> print i, j=0A= 1 3=0A= 1 4=0A= 2 3=0A= 2 4=0A= =0A= as a short-hand for the common nested loop idiom:=0A= =0A= >>> for i in [1, 2]:=0A= >>> for j in [3, 4]:=0A= >>> print i, j=0A= =0A= Response to the 'lockstep loop' questionnaire [5] indicated that=0A= newcomers would be comfortable with this (so comfortable, in fact,=0A= that most of them interpreted most multi-loop 'zip' syntaxes [6]=0A= as implementing single-stage nesting).=0A= =0A= =0A= Alternatives=0A= =0A= 01. Don't add new operators.=0A= =0A= Python is not primarily a numerical language; it may not be worth=0A= complexifying it for this special case. NumPy's success is proof=0A= that users can and will use functions and methods for linear=0A= algebra. However, support for real matrix multiplication is=0A= frequently requested, as:=0A= =0A= * functional forms are cumbersome for lengthy formulas, and do not=0A= respect the operator precedence rules of conventional mathematics;=0A= and=0A= =0A= * method forms are asymmetric in their operands.=0A= =0A= What's more, the proposed semantics for "@" for built-in sequence=0A= types would simplify expression of a very common idiom (nested=0A= loops). 
User testing during discussion of 'lockstep loops'=0A= indicated that both new and experienced users would understand=0A= this immediately.=0A= =0A= 02. Introduce prefixed forms of all existing operators, such as=0A= "~*" and "~+", as proposed in PEP 0225 [7].=0A= =0A= This proposal would duplicate all built-in mathematical operators=0A= with matrix equivalents, as in numerical languages such as=0A= MATLAB. Our objections to this are:=0A= =0A= * Python is not primarily a numerical programming language. While=0A= the (self-selected) participants in the discussions that led to=0A= PEP 0225 may want all of these new operators, the majority of=0A= Python users would be indifferent. The extra complexity they=0A= would introduce into the language therefore does not seem=0A= merited. (See also Rawlings' comments, quoted in the Background=0A= section, about these operators not being essential.)=0A= =0A= * The proposed syntax is difficult to read (i.e. fails the "low=0A= toner" readability test).=0A= =0A= 03. Retain the existing meaning of all operators, but create a=0A= behavioral accessor for arrays, such that:=0A= =0A= A * B=0A= =0A= is elementwise multiplication (ME), but:=0A= =0A= A.m() * B.m()=0A= =0A= is mathematical multiplication (MM). The method "A.m()" would=0A= return an object that aliased A's memory (for efficiency), but=0A= which had a different implementation of __mul__().=0A= =0A= This proposal was made by Moshe Zadka, and is also considered by=0A= PEP 0225 [7]. Its advantage is that it has no effect on the=0A= existing implementation of Python: changes are localized in the=0A= Numeric module. The disadvantages are:=0A= =0A= * The semantics of "A.m() * B", "A + B.m()", and so on would have=0A= to be defined, and there is no "obvious" choice for them.=0A= =0A= * Aliasing objects to trigger different operator behavior feels=0A= less Pythonic than either calling methods (as in the existing=0A= Numeric module) or using a different operator. 
This PEP is=0A= primarily about look and feel, and about making Python more=0A= attractive to people who are not already using it.=0A= =0A= =0A= Related Proposals=0A= =0A= 0207 : Rich Comparisons=0A= =0A= It may become possible to overload comparison operators=0A= such as '<' so that an expression such as 'A < B' returns=0A= an array, rather than a scalar value.=0A= =0A= 0209 : Adding Multidimensional Arrays=0A= =0A= Multidimensional arrays are currently an extension to=0A= Python, rather than a built-in type.=0A= =0A= 0225 : Elementwise/Objectwise Operators=0A= =0A= A larger proposal that addresses the same subject, but=0A= which proposes many more additions to the language.=0A= =0A= =0A= Acknowledgments=0A= =0A= I am grateful to Huaiyu Zhu [8] for initiating this discussion,=0A= and for some of the ideas and terminology included below.=0A= =0A= =0A= References=0A= =0A= [1] http://www.acm.org/sigapl/whyapl.htm=0A= [2] http://numpy.sourceforge.net=0A= [3] http://bevo.che.wisc.edu/octave/=0A= [4] http://www.egroups.com/message/python-numeric/4=0A= [5] http://www.python.org/pipermail/python-dev/2000-July/013139.html=0A= [6] PEP-0201.txt "Lockstep Iteration"=0A= [7] = http://www.python.org/pipermail/python-list/2000-August/112529.html=0A= =0A= =0A= Appendix: NumPy=0A= =0A= NumPy will overload "@" to perform mathematical multiplication of=0A= arrays where shapes permit, and to throw an exception otherwise.=0A= Its implementation of "@" will treat built-in sequence types as if=0A= they were column vectors. This takes care of the cases MM and MV.=0A= =0A= An attribute "T" will be added to the NumPy array type, such that=0A= "m.T" is:=0A= =0A= (a) the transpose of "m" for a 2-dimensional array=0A= =0A= (b) the 1xN matrix transpose of "m" if "m" is a 1-dimensional=0A= array; or=0A= =0A= (c) a runtime error for an array with rank >=3D 3.=0A= =0A= This attribute will alias the memory of the base object. 
NumPy's=0A= "transpose()" function will be extended to turn built-in sequence=0A= types into row vectors. This takes care of the VM, VD, and VO=0A= cases. We propose an attribute because:=0A= =0A= (a) the resulting notation is similar to the 'superscript T' (at=0A= least, as similar as ASCII allows), and=0A= =0A= (b) it signals that the transposition aliases the original object.=0A= =0A= NumPy will define a value "inv", which will be recognized by the=0A= exponentiation operator, such that "A ** inv" is the inverse of=0A= "A". This is similar in spirit to NumPy's existing "newaxis"=0A= value.=0A= =0A= =0A= =0C=0A= Local Variables:=0A= mode: indented-text=0A= indent-tabs-mode: nil=0A= End:=0A= ------=_NextPart_000_002A_01C066B6.32690BC0 Content-Type: text/plain; name="pep-0225.txt" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="pep-0225.txt" PEP: 225=0A= Title: Elementwise/Objectwise Operators=0A= Version: $Revision: 1.2 $=0A= Author: hzhu@users.sourceforge.net (Huaiyu Zhu),=0A= gregory.lielens@fft.be (Gregory Lielens)=0A= Status: Draft =0A= Type: Standards Track=0A= Python-Version: 2.1=0A= Created: 19-Sep-2000=0A= Post-History:=0A= =0A= =0A= Introduction=0A= =0A= This PEP describes a proposal to add new operators to Python which=0A= are useful for distinguishing elementwise and objectwise=0A= operations, and summarizes discussions in the news group=0A= comp.lang.python on this topic. See Credits and Archives section=0A= at end. Issues discussed here include:=0A= =0A= - Background.=0A= - Description of proposed operators and implementation issues.=0A= - Analysis of alternatives to new operators.=0A= - Analysis of alternative forms.=0A= - Compatibility issues=0A= - Description of wider extensions and other related ideas.=0A= =0A= A substantial portion of this PEP describes ideas that do not go=0A= into the proposed extension. 
They are presented because the=0A= extension is essentially syntactic sugar, so its adoption must be=0A= weighed against various possible alternatives. While many=0A= alternatives may be better in some aspects, the current proposal=0A= appears to be overall advantageous.=0A= =0A= The issues concerning elementwise-objectwise operations extend to=0A= wider areas than numerical computation. This document also=0A= describes how the current proposal may be integrated with more=0A= general future extensions.=0A= =0A= =0A= Background=0A= =0A= Python provides six binary infix math operators: + - * / % **=0A= hereafter generically represented by "op". They can be overloaded=0A= with new semantics for user-defined classes. However, for objects=0A= composed of homogeneous elements, such as arrays, vectors and=0A= matrices in numerical computation, there are two essentially=0A= distinct flavors of semantics. The objectwise operations treat=0A= these objects as points in multidimensional spaces. The=0A= elementwise operations treat them as collections of individual=0A= elements. These two flavors of operations are often intermixed in=0A= the same formulas, thereby requiring syntactical distinction.=0A= =0A= Many numerical computation languages provide two sets of math=0A= operators. 
For example, in MatLab, the ordinary op is used for=0A= objectwise operation while .op is used for elementwise operation.=0A= In R, op stands for elementwise operation while %op% stands for=0A= objectwise operation.=0A= =0A= In Python, there are other methods of representation, some of=0A= which are already used by available numerical packages, such as=0A= =0A= - function: mul(a,b)=0A= - method: a.mul(b)=0A= - casting: a.E*b =0A= =0A= In several aspects these are not as adequate as infix operators.=0A= More details will be shown later, but the key points are=0A= =0A= - Readability: Even for moderately complicated formulas, infix=0A= operators are much cleaner than alternatives.=0A= =0A= - Familiarity: Users are familiar with ordinary math operators.=0A= =0A= - Implementation: New infix operators will not unduly clutter=0A= Python syntax. They will greatly ease the implementation of=0A= numerical packages.=0A= =0A= While it is possible to assign current math operators to one=0A= flavor of semantics, there are simply not enough infix operators to=0A= overload for the other flavor. It is also impossible to maintain=0A= visual symmetry between these two flavors if one of them does not=0A= contain symbols for ordinary math operators.=0A= =0A= =0A= Proposed extension=0A= =0A= - Six new binary infix operators ~+ ~- ~* ~/ ~% ~** are added to=0A= core Python. They parallel the existing operators + - * / % **.=0A= =0A= - Six augmented assignment operators ~+=3D ~-=3D ~*=3D ~/=3D ~%=3D = ~**=3D are=0A= added to core Python. 
They parallel the operators +=3D -=3D *=3D = /=3D=0A= %=3D **=3D available in Python 2.0.=0A= =0A= - Operator ~op retains the syntactical properties of operator op,=0A= including precedence.=0A= =0A= - Operator ~op retains the semantical properties of operator op on=0A= built-in number types.=0A= =0A= - Operator ~op raises a syntax error on non-number builtin types.=0A= This is temporary until the proper behavior can be agreed upon.=0A= =0A= - These operators are overloadable in classes with names that=0A= prepend "t" (for tilde) to names of ordinary math operators.=0A= For example, __tadd__ and __rtadd__ work for ~+ just as __add__=0A= and __radd__ work for +.=0A= =0A= - As with existing operators, the __r*__() methods are invoked when=0A= the left operand does not provide the appropriate method.=0A= =0A= It is intended that one set of op or ~op is used for elementwise=0A= operations, the other for objectwise operations, but it is not=0A= specified which version of operators stands for elementwise or=0A= objectwise operations, leaving the decision to applications.=0A= =0A= The proposed implementation is to patch several files relating to=0A= the tokenizer, parser, grammar and compiler to duplicate the=0A= functionality of corresponding existing operators as necessary.=0A= All new semantics are to be implemented in the classes that=0A= overload them.=0A= =0A= The symbol ~ is already used in Python as the unary "bitwise not"=0A= operator. Currently it is not allowed for binary operators. The=0A= new operators are completely backward compatible.=0A= =0A= =0A= Prototype Implementation=0A= =0A= Greg Lielens implemented the infix ~op as a patch against Python=0A= 2.0b1 source[1].=0A= =0A= To allow ~ to be part of binary operators, the tokenizer would=0A= treat ~+ as one token. This means that the currently valid expression=0A= ~+1 would be tokenized as ~+ 1 instead of ~ + 1. The parser would=0A= then treat ~+ as a composite of ~ +. 
The effect is invisible to=0A= applications.=0A= =0A= Notes about the current patch:=0A= =0A= - It does not include ~op=3D operators yet.=0A= =0A= - The ~op behaves the same as op on lists, instead of raising=0A= exceptions.=0A= =0A= These should be fixed when the final version of this proposal is=0A= ready.=0A= =0A= - It reserves xor as an infix operator with the semantics=0A= equivalent to:=0A= =0A= def __xor__(a, b):=0A= if not b: return a=0A= elif not a: return b=0A= else: return 0=0A= =0A= This preserves a true value as much as possible, and otherwise=0A= preserves the left-hand side value.=0A= =0A= This is done so that bitwise operators could be regarded as=0A= elementwise logical operators in the future (see below).=0A= =0A= =0A= Alternatives to adding new operators=0A= =0A= The discussions on comp.lang.python and python-dev mailing list=0A= explored many alternatives. Some of the leading alternatives are=0A= listed here, using the multiplication operator as an example.=0A= =0A= 1. Use function mul(a,b).=0A= =0A= Advantage:=0A= - No need for new operators.=0A= =0A= Disadvantage: =0A= - Prefix forms are cumbersome for composite formulas.=0A= - Unfamiliar to the intended users.=0A= - Too verbose for the intended users.=0A= - Unable to use natural precedence rules.=0A= =0A= 2. Use method call a.mul(b)=0A= =0A= Advantage:=0A= - No need for new operators.=0A= =0A= Disadvantage:=0A= - Asymmetric for both operands.=0A= - Unfamiliar to the intended users.=0A= - Too verbose for the intended users.=0A= - Unable to use natural precedence rules.=0A= =0A= 3. Use "shadow classes". 
For matrix class define a shadow array=0A= class accessible through a method .E, so that for matrices a=0A= and b, a.E*b would be a matrix object that is=0A= elementwise_mul(a,b).=0A= =0A= Likewise define a shadow matrix class for arrays accessible=0A= through a method .M so that for arrays a and b, a.M*b would be=0A= an array that is matrixwise_mul(a,b).=0A= =0A= Advantage:=0A= - No need for new operators.=0A= - Benefits of infix operators with correct precedence rules.=0A= - Clean formulas in applications.=0A= =0A= Disadvantage:=0A= - Hard to maintain in current Python because ordinary numbers=0A= cannot have user defined class methods; i.e. a.E*b will fail=0A= if a is a pure number.=0A= - Difficult to implement, as this will interfere with existing=0A= method calls, like .T for transpose, etc.=0A= - Runtime overhead of object creation and method lookup.=0A= - The shadowing class cannot replace a true class, because it=0A= does not return its own type. So there need to be a M class=0A= with shadow E class, and an E class with shadow M class.=0A= - Unnatural to mathematicians.=0A= =0A= 4. Implement matrixwise and elementwise classes with easy casting=0A= to the other class. So matrixwise operations for arrays would=0A= be like a.M*b.M and elementwise operations for matrices would=0A= be like a.E*b.E. For error detection a.E*b.M would raise=0A= exceptions.=0A= =0A= Advantage:=0A= - No need for new operators.=0A= - Similar to infix notation with correct precedence rules.=0A= =0A= Disadvantage:=0A= - Similar difficulty due to lack of user-methods for pure numbers.=0A= - Runtime overhead of object creation and method lookup.=0A= - More cluttered formulas=0A= - Switching of flavor of objects to facilitate operators=0A= becomes persistent. This introduces long range context=0A= dependencies in application code that would be extremely hard=0A= to maintain.=0A= =0A= 5. 
Using mini parser to parse formulas written in arbitrary=0A= extension placed in quoted strings.=0A= =0A= Advantage:=0A= - Pure Python, without new operators=0A= =0A= Disadvantage:=0A= - The actual syntax is within the quoted string, which does not=0A= resolve the problem itself.=0A= - Introducing zones of special syntax.=0A= - Demanding on the mini-parser.=0A= =0A= 6. Introducing a single operator, such as @, for matrix=0A= multiplication.=0A= =0A= Advantage:=0A= - Introduces less operators=0A= =0A= Disadvantage:=0A= - The distinctions for operators like + - ** are equally=0A= important. Their meaning in matrix or array-oriented=0A= packages would be reversed (see below).=0A= - The new operator occupies a special character.=0A= - This does not work well with more general object-element issues.=0A= =0A= Among these alternatives, the first and second are used in current=0A= applications to some extent, but found inadequate. The third is=0A= the most favorite for applications, but it will incur huge=0A= implementation complexity. The fourth would make applications=0A= codes very context-sensitive and hard to maintain. These two=0A= alternatives also share significant implementational difficulties=0A= due to current type/class split. The fifth appears to create more=0A= problems than it would solve. The sixth does not cover the same=0A= range of applications.=0A= =0A= =0A= Alternative forms of infix operators=0A= =0A= Two major forms and several minor variants of new infix operators=0A= were discussed:=0A= =0A= - Bracketed form=0A= =0A= (op)=0A= [op]=0A= {op}=0A= =0A= :op:=0A= ~op~=0A= %op%=0A= =0A= - Meta character form=0A= =0A= .op=0A= @op=0A= ~op=0A= =0A= Alternatively the meta character is put after the operator.=0A= =0A= - Less consistent variations of these themes. These are=0A= considered unfavorably. 
For completeness some are listed here=0A= =0A= - Use @/ and /@ for left and right division=0A= - Use [*] and (*) for outer and inner products=0A= - Use a single operator @ for multiplication.=0A= =0A= - Use __call__ to simulate multiplication.=0A= a(b) or (a)(b)=0A= =0A= Criteria for choosing among the representations include:=0A= =0A= - No syntactical ambiguities with existing operators. =0A= =0A= - Higher readability in actual formulas. This makes the=0A= bracketed forms unfavorable. See examples below.=0A= =0A= - Visually similar to existing math operators.=0A= =0A= - Syntactically simple, without blocking possible future=0A= extensions.=0A= =0A= With these criteria the overall winner in bracket form appears to=0A= be {op}. A clear winner in the meta character form is ~op.=0A= Comparing these it appears that ~op is the favorite among them=0A= all.=0A= =0A= Some analysis follows:=0A= =0A= - The .op form is ambiguous: 1.+a would be different from 1 .+a=0A= =0A= - The bracket type operators are most favorable when standing=0A= alone, but not in formulas, as they interfere with visual=0A= parsing of parentheses for precedence and function arguments.=0A= This is so for (op) and [op], and somewhat less so for {op}=0A= and <op>.=0A= =0A= - The <op> form has the potential to be confused with < > and =3D=0A= =0A= - The @op is not favored because @ is visually heavy (dense,=0A= more like a letter): a@+b is more readily read as a@ + b=0A= than a @+ b.=0A= =0A= - For choosing meta-characters: Most of the existing ASCII symbols=0A= have already been used. The only three unused are @ $ ?.=0A= =0A= =0A= Semantics of new operators=0A= =0A= There are convincing arguments for using either set of operators=0A= as objectwise or elementwise. Some of them are listed here:=0A= =0A= 1. 
op for element, ~op for object=0A= =0A= - Consistent with current multiarray interface of Numeric package=0A= - Consistent with some other languages=0A= - Perception that elementwise operations are more natural=0A= - Perception that elementwise operations are used more frequently=0A= =0A= 2. op for object, ~op for element=0A= =0A= - Consistent with current linear algebra interface of MatPy = package=0A= - Consistent with some other languages=0A= - Perception that objectwise operations are more natural=0A= - Perception that objectwise operations are used more frequently=0A= - Consistent with the current behavior of operators on lists=0A= - Allow ~ to be a general elementwise meta-character in future=0A= extensions.=0A= =0A= It is generally agreed upon that =0A= =0A= - there is no absolute reason to favor one or the other=0A= - it is easy to cast from one representation to another in a=0A= sizable chunk of code, so the other flavor of operators is=0A= always minority=0A= - there are other semantic differences that favor existence of=0A= array-oriented and matrix-oriented packages, even if their=0A= operators are unified.=0A= - whatever the decision is taken, codes using existing=0A= interfaces should not be broken for a very long time.=0A= =0A= Therefore not much is lost, and much flexibility retained, if the=0A= semantic flavors of these two sets of operators are not dictated=0A= by the core language. The application packages are responsible=0A= for making the most suitable choice. This is already the case for=0A= NumPy and MatPy which use opposite semantics. Adding new=0A= operators will not break this. 
See also observation after=0A= subsection 2 in the Examples below.=0A= =0A= The issue of numerical precision was raised, but if the semantics=0A= is left to the applications, the actual precisions should also go=0A= there.=0A= =0A= =0A= Examples=0A= =0A= Following are examples of the actual formulas that will appear=0A= using various operators or other representations described above.=0A= =0A= 1. The matrix inversion formula:=0A= =0A= - Using op for object and ~op for element:=0A= =0A= b =3D a.I - a.I * u / (c.I + v/a*u) * v / a=0A= =0A= b =3D a.I - a.I * u * (c.I + v*a.I*u).I * v * a.I=0A= =0A= - Using op for element and ~op for object:=0A= =0A= b =3D a.I @- a.I @* u @/ (c.I @+ v@/a@*u) @* v @/ a=0A= =0A= b =3D a.I ~- a.I ~* u ~/ (c.I ~+ v~/a~*u) ~* v ~/ a=0A= =0A= b =3D a.I (-) a.I (*) u (/) (c.I (+) v(/)a(*)u) (*) v (/) a=0A= =0A= b =3D a.I [-] a.I [*] u [/] (c.I [+] v[/]a[*]u) [*] v [/] a=0A= =0A= b =3D a.I <-> a.I <*> u (c.I <+> va<*>u) <*> v a=0A= =0A= b =3D a.I {-} a.I {*} u {/} (c.I {+} v{/}a{*}u) {*} v {/} a=0A= =0A= Observation: For linear algebra using op for object is preferable.=0A= =0A= Observation: The ~op type operators look better than (op) type=0A= in complicated formulas.=0A= =0A= - using named operators=0A= =0A= b =3D a.I @sub a.I @mul u @div (c.I @add v @div a @mul u) @mul = v @div a=0A= =0A= b =3D a.I ~sub a.I ~mul u ~div (c.I ~add v ~div a ~mul u) ~mul = v ~div a=0A= =0A= Observation: Named operators are not suitable for math formulas.=0A= =0A= 2. Plotting a 3d graph=0A= =0A= - Using op for object and ~op for element:=0A= =0A= z =3D sin(x~**2 ~+ y~**2); plot(x,y,z)=0A= =0A= - Using op for element and ~op for object:=0A= =0A= z =3D sin(x**2 + y**2); plot(x,y,z)=0A= =0A= Observation: Elementwise operations with broadcasting allows=0A= much more efficient implementation than MatLab.=0A= =0A= Observation: It is useful to have two related classes with the=0A= semantics of op and ~op swapped. 
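[Editorial note: the matrix-inversion formula in example 1 is the Woodbury identity, (a + u c v)^-1 = a^-1 - a^-1 u (c^-1 + v a^-1 u)^-1 v a^-1. It can be verified numerically with NumPy, using the "@" matrix-multiplication operator that Python eventually did gain in 3.5 via PEP 465, and inv() in place of the delayed .I method.]

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 3)) + 3 * np.eye(3)  # well-conditioned square matrices
c = rng.random((2, 2)) + 3 * np.eye(2)
u = rng.random((3, 2))
v = rng.random((2, 3))

inv = np.linalg.inv
# b = a.I - a.I * u * (c.I + v*a.I*u).I * v * a.I, in the PEP's notation.
b = inv(a) - inv(a) @ u @ inv(inv(c) + v @ inv(a) @ u) @ v @ inv(a)
assert np.allclose(b, inv(a + u @ c @ v))
```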
Using these the ~op=0A= operators would only need to appear in chunks of code where=0A= the other flavor dominates, while maintaining consistent=0A= semantics of the code.=0A= =0A= 3. Using + and - with automatic broadcasting=0A= =0A= a =3D b - c; d =3D a.T*a=0A= =0A= Observation: This would silently produce hard-to-trace bugs if=0A= one of b or c is row vector while the other is column vector.=0A= =0A= =0A= Miscellaneous issues:=0A= =0A= - Need for the ~+ ~- operators. The objectwise + - are important=0A= because they provide important sanity checks as per linear=0A= algebra. The elementwise + - are important because they allow=0A= broadcasting that are very efficient in applications.=0A= =0A= - Left division (solve). For matrix, a*x is not necessarily equal=0A= to x*a. The solution of a*x=3D=3Db, denoted x=3Dsolve(a,b), is=0A= therefore different from the solution of x*a=3D=3Db, denoted=0A= x=3Ddiv(b,a). There are discussions about finding a new symbol=0A= for solve. [Background: MatLab use b/a for div(b,a) and a\b for=0A= solve(a,b).]=0A= =0A= It is recognized that Python provides a better solution without=0A= requiring a new symbol: the inverse method .I can be made to be=0A= delayed so that a.I*b and b*a.I are equivalent to Mat lab's a\b=0A= and b/a. The implementation is quite simple and the resulting=0A= application code clean.=0A= =0A= - Power operator. Python's use of a**b as pow(a,b) has two=0A= perceived disadvantages:=0A= =0A= - Most mathematicians are more familiar with a^b for this purpose.=0A= - It results in long augmented assignment operator ~**=3D.=0A= =0A= However, this issue is distinct from the main issue here.=0A= =0A= - Additional multiplication operators. Several forms of=0A= multiplications are used in (multi-)linear algebra. Most can be=0A= seen as variations of multiplication in linear algebra sense=0A= (such as Kronecker product). But two forms appear to be more=0A= fundamental: outer product and inner product. 
However, their=0A= specification includes indices, which can be either=0A= =0A= - associated with the operator, or=0A= - associated with the objects.=0A= =0A= The latter (the Einstein notation) is used extensively on paper,=0A= and is also the easier one to implement. By implementing a=0A= tensor-with-indices class, a general form of multiplication=0A= would cover both outer and inner products, and specialize to=0A= linear algebra multiplication as well. The index rule can be=0A= defined as class methods, like,=0A= =0A= a =3D b.i(1,2,-1,-2) * c.i(4,-2,3,-1) # a_ijkl =3D b_ijmn = c_lnkm=0A= =0A= Therefore one objectwise multiplication is sufficient.=0A= =0A= - Bitwise operators. =0A= =0A= - The proposed new math operators use the symbol ~ that is=0A= "bitwise not" operator. This poses no compatibility problem=0A= but somewhat complicates implementation.=0A= =0A= - The symbol ^ might be better used for pow than bitwise xor.=0A= But this depends on the future of bitwise operators. It does=0A= not immediately impact on the proposed math operator.=0A= =0A= - The symbol | was suggested to be used for matrix solve. But=0A= the new solution of using delayed .I is better in several=0A= ways.=0A= =0A= - The current proposal fits in a larger and more general=0A= extension that will remove the need for special bitwise=0A= operators. (See elementization below.)=0A= =0A= - Alternative to special operator names used in definition,=0A= =0A= def "+"(a, b) in place of def __add__(a, b)=0A= =0A= This appears to require greater syntactical change, and would=0A= only be useful when arbitrary additional operators are allowed.=0A= =0A= =0A= Impact on general elementization=0A= =0A= The distinction between objectwise and elementwise operations are=0A= meaningful in other contexts as well, where an object can be=0A= conceptually regarded as a collection of elements. 
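[Editorial note: the tensor-with-indices rule sketched above corresponds closely to what NumPy's later einsum provides; the comment's "a_ijkl = b_ijmn c_lnkm" translates directly. The shapes below are arbitrary, chosen only to make the index bookkeeping visible.]

```python
import numpy as np

b = np.ones((2, 3, 4, 5))  # indices i, j, m, n
c = np.ones((6, 5, 7, 4))  # indices l, n, k, m

# a_ijkl = sum over m, n of b_ijmn * c_lnkm
a = np.einsum("ijmn,lnkm->ijkl", b, c)
assert a.shape == (2, 3, 7, 6)
assert np.allclose(a, 20.0)  # each entry sums 4 * 5 = 20 ones
```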
It is=0A= important that the current proposal does not preclude possible=0A= future extensions.=0A= =0A= One general future extension is to use ~ as a meta operator to=0A= "elementize" a given operator. Several examples are listed here:=0A= =0A= 1. Bitwise operators. Currently Python assigns six operators to=0A= bitwise operations: and (&), or (|), xor (^), complement (~),=0A= left shift (<<) and right shift (>>), with their own precedence=0A= levels.=0A= =0A= Among them, the & | ^ ~ operators can be regarded as=0A= elementwise versions of lattice operators applied to integers=0A= regarded as bit strings.=0A= =0A= 5 and 6 # 6=0A= 5 or 6 # 5=0A= =0A= 5 ~and 6 # 4=0A= 5 ~or 6 # 7=0A= =0A= These can be regarded as general elementwise lattice operators,=0A= not restricted to bits in integers.=0A= =0A= In order to have named operators for xor ~xor, it is necessary=0A= to make xor a reserved word.=0A= =0A= 2. List arithmetics. =0A= =0A= [1, 2] + [3, 4] # [1, 2, 3, 4]=0A= [1, 2] ~+ [3, 4] # [4, 6]=0A= =0A= ['a', 'b'] * 2 # ['a', 'b', 'a', 'b']=0A= 'ab' * 2 # 'abab'=0A= =0A= ['a', 'b'] ~* 2 # ['aa', 'bb']=0A= [1, 2] ~* 2 # [2, 4]=0A= =0A= It is also consistent to Cartesian product=0A= =0A= [1,2]*[3,4] # [(1,3),(1,4),(2,3),(2,4)]=0A= =0A= 3. List comprehension.=0A= =0A= a =3D [1, 2]; b =3D [3, 4]=0A= ~f(a,b) # [f(x,y) for x, y in zip(a,b)]=0A= ~f(a*b) # [f(x,y) for x in a for y in b]=0A= a ~+ b # [x + y for x, y in zip(a,b)]=0A= =0A= 4. Tuple generation (the zip function in Python 2.0)=0A= =0A= [1, 2, 3], [4, 5, 6] # ([1,2, 3], [4, 5, 6])=0A= [1, 2, 3]~,[4, 5, 6] # [(1,4), (2, 5), (3,6)]=0A= =0A= 5. Using ~ as generic elementwise meta-character to replace map=0A= =0A= ~f(a, b) # map(f, a, b)=0A= ~~f(a, b) # map(lambda *x:map(f, *x), a, b)=0A= =0A= More generally,=0A= =0A= def ~f(*x): return map(f, *x)=0A= def ~~f(*x): return map(~f, *x)=0A= ...=0A= =0A= 6. 
Elementwise format operator (with broadcasting)=0A= =0A= a =3D [1,2,3,4,5]=0A= print ["%5d "] ~% a =0A= a =3D [[1,2],[3,4]]=0A= print ["%5d "] ~~% a=0A= =0A= 7. Rich comparison=0A= =0A= [1, 2, 3] ~< [3, 2, 1] # [1, 0, 0]=0A= [1, 2, 3] ~=3D=3D [3, 2, 1] # [0, 1, 0]=0A= =0A= 8. Rich indexing=0A= =0A= [a, b, c, d] ~[2, 3, 1] # [c, d, b]=0A= =0A= 9. Tuple flattening=0A= =0A= a =3D (1,2); b =3D (3,4)=0A= f(~a, ~b) # f(1,2,3,4) =0A= =0A= 10. Copy operator=0A= =0A= a ~=3D b # a =3D b.copy()=0A= =0A= There can be specific levels of deep copy=0A= =0A= a ~~=3D b # a =3D b.copy(2)=0A= =0A= Notes:=0A= =0A= 1. There are probably many other similar situations. This general=0A= approach seems well suited for most of them, in place of=0A= several separated extensions for each of them (parallel and=0A= cross iteration, list comprehension, rich comparison, etc).=0A= =0A= 2. The semantics of "elementwise" depends on applications. For=0A= example, an element of matrix is two levels down from the=0A= list-of-list point of view. This requires more fundamental=0A= change than the current proposal. In any case, the current=0A= proposal will not negatively impact on future possibilities of=0A= this nature.=0A= =0A= Note that this section describes a type of future extensions that=0A= is consistent with current proposal, but may present additional=0A= compatibility or other problems. They are not tied to the current=0A= proposal.=0A= =0A= =0A= Impact on named operators=0A= =0A= The discussions made it generally clear that infix operators is a=0A= scarce resource in Python, not only in numerical computation, but=0A= in other fields as well. Several proposals and ideas were put=0A= forward that would allow infix operators be introduced in ways=0A= similar to named functions. We show here that the current=0A= extension does not negatively impact on future extensions in this=0A= regard.=0A= =0A= 1. 
Named infix operators.=0A= =0A= Choose a meta character, say @, so that for any identifier=0A= "opname", the combination "@opname" would be a binary infix=0A= operator, and=0A= =0A= a @opname b =3D=3D opname(a,b)=0A= =0A= Other representations mentioned include .name ~name~ :name:=0A= (.name) %name% and similar variations. The pure bracket based=0A= operators cannot be used this way.=0A= =0A= This requires a change in the parser to recognize @opname, and=0A= parse it into the same structure as a function call. The=0A= precedence of all these operators would have to be fixed at=0A= one level, so the implementation would be different from=0A= additional math operators which keep the precedence of=0A= existing math operators.=0A= =0A= The current proposed extension do not limit possible future=0A= extensions of such form in any way.=0A= =0A= 2. More general symbolic operators.=0A= =0A= One additional form of future extension is to use meta=0A= character and operator symbols (symbols that cannot be used in=0A= syntactical structures other than operators). Suppose @ is=0A= the meta character. Then=0A= =0A= a + b, a @+ b, a @@+ b, a @+- b=0A= =0A= would all be operators with a hierarchy of precedence, defined by=0A= =0A= def "+"(a, b)=0A= def "@+"(a, b)=0A= def "@@+"(a, b)=0A= def "@+-"(a, b)=0A= =0A= One advantage compared with named operators is greater=0A= flexibility for precedences based on either the meta character=0A= or the ordinary operator symbols. This also allows operator=0A= composition. The disadvantage is that they are more like=0A= "line noise". In any case the current proposal does not=0A= impact its future possibility.=0A= =0A= These kinds of future extensions may not be necessary when=0A= Unicode becomes generally available.=0A= =0A= Note that this section discusses compatibility of the proposed=0A= extension with possible future extensions. 
The desirability=0A= or compatibility of these other extensions themselves are=0A= specifically not considered here.=0A= =0A= =0A= Credits and archives=0A= =0A= The discussions mostly happened in July to August of 2000 on news=0A= group comp.lang.python and the mailing list python-dev. There are=0A= altogether several hundred postings, most can be retrieved from=0A= these two pages (and searching word "operator"):=0A= =0A= http://www.python.org/pipermail/python-list/2000-July/=0A= http://www.python.org/pipermail/python-list/2000-August/=0A= =0A= The names of contributers are too numerous to mention here,=0A= suffice to say that a large proportion of ideas discussed here are=0A= not our own.=0A= =0A= Several key postings (from our point of view) that may help to=0A= navigate the discussions include:=0A= =0A= http://www.python.org/pipermail/python-list/2000-July/108893.html=0A= http://www.python.org/pipermail/python-list/2000-July/108777.html=0A= http://www.python.org/pipermail/python-list/2000-July/108848.html=0A= http://www.python.org/pipermail/python-list/2000-July/109237.html=0A= http://www.python.org/pipermail/python-list/2000-July/109250.html=0A= http://www.python.org/pipermail/python-list/2000-July/109310.html=0A= http://www.python.org/pipermail/python-list/2000-July/109448.html=0A= http://www.python.org/pipermail/python-list/2000-July/109491.html=0A= http://www.python.org/pipermail/python-list/2000-July/109537.html=0A= http://www.python.org/pipermail/python-list/2000-July/109607.html=0A= http://www.python.org/pipermail/python-list/2000-July/109709.html=0A= http://www.python.org/pipermail/python-list/2000-July/109804.html=0A= http://www.python.org/pipermail/python-list/2000-July/109857.html=0A= http://www.python.org/pipermail/python-list/2000-July/110061.html=0A= http://www.python.org/pipermail/python-list/2000-July/110208.html=0A= = http://www.python.org/pipermail/python-list/2000-August/111427.html=0A= = 
http://www.python.org/pipermail/python-list/2000-August/111558.html=0A= = http://www.python.org/pipermail/python-list/2000-August/112551.html=0A= = http://www.python.org/pipermail/python-list/2000-August/112606.html=0A= = http://www.python.org/pipermail/python-list/2000-August/112758.html=0A= =0A= http://www.python.org/pipermail/python-dev/2000-July/013243.html=0A= http://www.python.org/pipermail/python-dev/2000-July/013364.html=0A= = http://www.python.org/pipermail/python-dev/2000-August/014940.html=0A= =0A= These are earlier drafts of this PEP:=0A= =0A= = http://www.python.org/pipermail/python-list/2000-August/111785.html=0A= = http://www.python.org/pipermail/python-list/2000-August/112529.html=0A= = http://www.python.org/pipermail/python-dev/2000-August/014906.html=0A= =0A= There is an alternative PEP (with official PEP number 211) by Greg=0A= Wilson, titled "Adding New Linear Algebra Operators to Python".=0A= =0A= Its first (and current) version is at:=0A= =0A= = http://www.python.org/pipermail/python-dev/2000-August/014876.html=0A= http://python.sourceforge.net/peps/pep-0211.html=0A= =0A= =0A= Additional References=0A= =0A= [1] http://MatPy.sourceforge.net/Misc/index.html=0A= =0A= =0A= =0C=0A= Local Variables:=0A= mode: indented-text=0A= indent-tabs-mode: nil=0A= End:=0A= ------=_NextPart_000_002A_01C066B6.32690BC0-- From guido@python.org Fri Dec 15 21:55:46 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 16:55:46 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Fri, 15 Dec 2000 16:20:20 EST." <14906.33325.5784.118110@nem-srvr.stsci.edu> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> <200012151951.OAA03219@cj20424-a.reston1.va.home.com> <14906.33325.5784.118110@nem-srvr.stsci.edu> Message-ID: <200012152155.QAA03879@cj20424-a.reston1.va.home.com> > > Maybe. That can still be decided later. 
Right now, adding operators > > is not on the table for 2.1 (if only because there are two conflicting > > PEPs); adding rich comparisons *is* on the table because it doesn't > > change the parser (and because the rich comparisons idea was already > > pretty much worked out two years ago). > > Yes, it was worked out previously _assuming_ rich comparisons do not > use any new operators. > > But let's stop for a moment and contemplate adding rich comparisons > along with new comparison operators. What do we gain? > > 1. The current boolean operator behavior does not have to change, and > hence will be backward compatible. What incompatibility do you see in the current proposal? > 2. It eliminates the need to decide whether or not rich comparisons > take precedence over boolean comparisons. Only if you want different semantics -- that's only an issue for NumPy. > 3. The new operators add additional behavior without directly impacting > current behavior and the use of them is unambiguous, at least in > relation to current Python behavior. You know by the operator what > type of comparison will be returned. This should appease Jim > Fulton, based on his arguments in 1998 about comparison operators > always returning a boolean value. As you know, I'm now pretty close to Jim. :-) He seemed pretty mellow about this now. > 4. Compound objects, such as lists, could implement both rich > and boolean comparisons. The boolean comparison would remain as > is, while the rich comparison would return a list of boolean > values. Current behavior doesn't change; just a new feature, which > you may or may not choose to use, is added. > > If we go one step further and add the matrix-style operators along > with the comparison operators, we can provide a consistent user > interface to array/complex operations without changing current Python > behavior. If a user has no need for these new operators, he doesn't > have to use them or even know about them.
All we've done is made > Python richer, but I believe without making it more complex. For > example, all element-wise operations could have a ':' appended to > them, e.g. '+:', '<:', etc.; and will define element-wise addition, > element-wise less-than, etc. The traditional '*', '/', etc. operators > can then be used for matrix operations, which will appease the Matlab > people. > > Therefore, I don't think rich comparisons and matrix-type operators > should be considered separable. I really think you should consider > this suggestion. It appeases many groups while providing a consistent > and clear user interface, without greatly impacting current Python > behavior. > > Always-causing-havoc-at-the-last-moment-ly Yours, I think you misunderstand. Rich comparisons are mostly about allowing the separate overloading of <, <=, ==, !=, >, and >=. This is useful in its own right. If you don't want to use this overloading facility for elementwise comparisons in NumPy, that's fine with me. Nobody says you have to -- it's just that you *could*. Read my lips: there won't be *any* new operators in 2.1. There will be a better way to overload the existing Boolean operators, and they will be able to return non-Boolean results. That's useful in other situations besides NumPy. Feel free to lobby for elementwise operators -- but based on the discussion about this subject so far, I don't give it much of a chance even past Python 2.1. They would add a lot of baggage to the language (e.g. the table of operators in all Python books would be about twice as long) and by far most users don't care about them. (Read the intro to PEP 211 for some of the concerns -- that PEP tries to make the addition palatable by adding exactly *one* new operator.)
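[For readers following along: a minimal sketch of the per-operator overloading described here, using a hypothetical Vector class whose elementwise semantics are chosen purely for illustration -- nothing in the thread prescribes this exact class.]

```python
class Vector:
    """Toy sequence type whose rich comparisons return elementwise results."""

    def __init__(self, data):
        self.data = list(data)

    def __lt__(self, other):
        # Elementwise <, returning a list of 0/1 flags rather than one Boolean.
        return [int(x < y) for x, y in zip(self.data, other.data)]

    def __eq__(self, other):
        # == is overloaded independently of < -- the point of rich comparisons.
        return [int(x == y) for x, y in zip(self.data, other.data)]

print(Vector([1, 2, 3]) < Vector([3, 2, 1]))   # [1, 0, 0]
print(Vector([1, 2, 3]) == Vector([3, 2, 1]))  # [0, 1, 0]
```

Each comparison method is overloaded separately and nothing forces the result to be a Boolean, which is exactly the hook a NumPy-style elementwise comparison needs.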
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Fri Dec 15 22:16:34 2000 From: guido@python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 17:16:34 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Fri, 08 Dec 2000 17:58:03 EST." <200012082258.RAA02389@cj20424-a.reston1.va.home.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> Message-ID: <200012152216.RAA11098@cj20424-a.reston1.va.home.com> I've checked in the essential parts of the warnings PEP, and closed the SF patch. I haven't checked in the examples in the patch -- it's too early for that. But I figured that it's easier to revise the code once it's checked in. I'm pretty confident that it works as advertised. Still missing is documentation: the warnings module, the new API functions, and the new command line option should all be documented. I'll work on that over the holidays. I consider the PEP done. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Fri Dec 15 22:21:24 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:21:24 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> Message-ID: <3A3A9964.A6B3DD11@lemburg.com> Neil Schemenauer wrote: > > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote: > > I'm not sure I agree with that view either, but mostly because there > > is a non-GPL replacement for parts of the readline API: > > > > http://www.cstr.ed.ac.uk/downloads/editline.html > > It doesn't work with the current readline module. It is much > smaller than readline and works just as well in my experience. > Would there be any interest in including a copy with the standard > distribution? 
The license is quite nice (X11 type). +1 from here -- line editing is simply a very important part of an interactive prompt and readline is not only big, slow and full of strange surprises, but also GPLed ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Fri Dec 15 22:24:34 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:24:34 +0100 Subject: [Python-Dev] Use of %c and Py_UNICODE References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> Message-ID: <3A3A9A22.E9BA9551@lemburg.com> "A.M. Kuchling" wrote: > > unicodeobject.c contains this code: > > PyErr_Format(PyExc_ValueError, > "unsupported format character '%c' (0x%x) " > "at index %i", > c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat)); > > c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits, > so '%\u3000' % 1 results in an error message containing "'\000' > (0x3000)". Is this worth fixing? I'd say no, since the hex value is > more useful for Unicode strings anyway. (I still wanted to mention > this little buglet, since I just touched this bit of code.) Why would you want to fix it ? Format characters will always be ASCII and thus 7-bit -- there's really no need to expand the set of possibilities beyond 8 bits ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fdrake@acm.org Fri Dec 15 22:22:34 2000 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 15 Dec 2000 17:22:34 -0500 (EST) Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012152216.RAA11098@cj20424-a.reston1.va.home.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <200012152216.RAA11098@cj20424-a.reston1.va.home.com> Message-ID: <14906.39338.795843.947683@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > Still missing is documentation: the warnings module, the new API > functions, and the new command line option should all be documented. > I'll work on that over the holidays. I've assigned a bug to you in case you forget. I've given it a "show-stopper" priority level, so I'll feel good ripping the code out if you don't get docs written in time. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From mal@lemburg.com Fri Dec 15 22:39:18 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:39:18 +0100 Subject: [Python-Dev] What to do about PEP 229? References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> Message-ID: <3A3A9D96.80781D61@lemburg.com> "A.M. Kuchling" wrote: > > I began writing the fabled fancy setup script described in PEP 229, > and then realized there was duplication going on here. The code in > setup.py would need to know what libraries, #defines, &c., are needed > by each module in order to check if they're needed and set them. But > if Modules/Setup can be used to override setup.py's behaviour, then > much of this information would need to be in that file, too; the > details of compiling a module are in two places. > > Possibilities: > > 1) Setup contains fully-loaded module descriptions, and the setup > script drops unneeded bits. For example, the socket module > requires -lnsl on some platforms. The Setup file would contain > "socket socketmodule.c -lnsl" on all platforms, and setup.py would > check for an nsl library and only use it if it's there.
> > This seems dodgy to me; what if -ldbm is needed on one platform and > -lndbm on another? Can't distutils try both and then settle for the working combination ? [distutils isn't really ready for auto-configure yet, but Greg has already provided most of the needed functionality -- it's just not well integrated into the rest of the build process in version 1.0.1 ... BTW, where is Greg ? I haven't heard from him in quite a while.] > 2) Drop setup completely and just maintain setup.py, with some > different overriding mechanism. This is more radical. Adding a > new module is then not just a matter of editing a simple text file; > you'd have to modify setup.py, making it more like maintaining an > autoconf script. Why not parse Setup and use it as input to distutils setup.py ? > Remember, the underlying goal of PEP 229 is to have the out-of-the-box > Python installation you get from "./configure;make" contain many more > useful modules; right now you wouldn't get zlib, syslog, resource, any > of the DBM modules, PyExpat, &c. I'm not wedded to using Distutils to > get that, but think that's the only practical way; witness the hackery > required to get the DB module automatically compiled. > > You can also wave your hands in the direction of packagers such as > ActiveState or Red Hat, and say "let them make sure everything compiles". > But this problem actually inconveniences *me*, since I always build > Python myself and have to extensively edit Setup, so I'd like to fix > the problem. > > Thoughts? Nice idea :-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Fri Dec 15 22:44:15 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:44:15 +0100 Subject: [Python-Dev] Death to string functions!
References: <200012151509.HAA18093@slayer.i.sourceforge.net> <20001215041450.B22056@glacier.fnational.com> <200012151929.OAA03073@cj20424-a.reston1.va.home.com> Message-ID: <3A3A9EBF.3F9306B6@lemburg.com> Guido van Rossum wrote: > > > Can you explain the logic behind this recent interest in removing > > string functions from the standard library? Is it performance? > > Some unicode issue? I don't have a great attachment to string.py > > but I also don't see the justification for the amount of work it > > requires. > > I figure that at *some* point we should start putting our money where > our mouth is, deprecate most uses of the string module, and start > warning about it. Not in 2.1 probably, given my experience below. > > As a realistic test of the warnings module I played with some warnings > about the string module, and then found that most of the std > library modules use it, triggering an extraordinary amount of > warnings. I then decided to experiment with the conversion. I > quickly found out it's too much work to do manually, so I'll hold off > until someone comes up with a tool that does 99% of the work. This would also help a lot of programmers out there who are stuck with 100k LOCs of Python code using string.py ;) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Fri Dec 15 22:49:01 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:49:01 +0100 Subject: [Python-Dev] Death to string functions! References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> Message-ID: <3A3A9FDD.E6F021AF@lemburg.com> Guido van Rossum wrote: > > Ideally, I would like to deprecate the entire string module, so that I > can place a single warning at its top.
This will cause a single > warning to be issued for programs that still use it (no matter how > many times it is imported). Unfortunately, there are a couple of > things that still need it: string.letters etc., and > string.maketrans(). Can't we come up with a module similar to unicodedata[.py] ? string.py could then still provide the interfaces, but the implementation would live in stringdata.py [Perhaps we won't need stringdata by then... Unicode will have taken over and the discussion be moot ;-)] -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From thomas@xs4all.net Fri Dec 15 22:54:25 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 15 Dec 2000 23:54:25 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <20001215040304.A22056@glacier.fnational.com>; from nas@arctrix.com on Fri, Dec 15, 2000 at 04:03:04AM -0800 References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> Message-ID: <20001215235425.A29681@xs4all.nl> On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote: > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote: > > I'm not sure I agree with that view either, but mostly because there > > is a non-GPL replacement for parts of the readline API: > > > > http://www.cstr.ed.ac.uk/downloads/editline.html > > It doesn't work with the current readline module. It is much > smaller than readline and works just as well in my experience. > Would there be any interest in including a copy with the standard > distribution? The license is quite nice (X11 type). Definitely +1 from here. Readline reminds me of the cold war, for some reason.
(Actually, multiple reasons ;) I don't have time to do it myself, unfortunately, or I would. (Looking at editline has been on my TODO list for a while... :P) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From martin@loewis.home.cs.tu-berlin.de Sat Dec 16 12:32:30 2000 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sat, 16 Dec 2000 13:32:30 +0100 Subject: [Python-Dev] PEP 226 Message-ID: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> I remember earlier discussion on the Python 2.1 release schedule, and never managed to comment on those. I believe that Python contributors and maintainers did an enormous job in releasing Python 2, which took quite some time from everybody's life. I think it is unrealistic to expect the same amount of commitment for the next release, especially if that release appears just a few months after the previous release (that is, one month from now). So I'd like to ask the release manager to take that into account. I'm not quite sure what kind of action I expect; possible alternatives are: - declare 2.1 a pure bug fix release only, with a minimal set of new features. In particular, don't push for completion of PEPs; everybody should then accept that most features that are currently discussed will appear in Python 2.2. - move the schedule for Python 2.1 back (or is it forward?) by, say, a few months. This will give people some time to do the things that did not get the right amount of attention during the 2.0 release, and will still allow work on new and interesting features. Just my 0.02EUR, Martin From guido@python.org Sat Dec 16 16:38:28 2000 From: guido@python.org (Guido van Rossum) Date: Sat, 16 Dec 2000 11:38:28 -0500 Subject: [Python-Dev] PEP 226 In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100."
<200012161232.NAA01779@loewis.home.cs.tu-berlin.de> References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> Message-ID: <200012161638.LAA13888@cj20424-a.reston1.va.home.com> > I remember earlier discussion on the Python 2.1 release schedule, and > never managed to comment on those. > > I believe that Python contributors and maintainers did an enormous > job in releasing Python 2, which took quite some time from everybody's > life. I think it is unrealistic to expect the same amount of > commitment for the next release, especially if that release appears > just a few months after the previous release (that is, one month from > now). > > So I'd like to ask the release manager to take that into > account. I'm not quite sure what kind of action I expect; possible > alternatives are: > - declare 2.1 a pure bug fix release only, with a minimal set of new > features. In particular, don't push for completion of PEPs; everybody > should then accept that most features that are currently discussed > will appear in Python 2.2. > > - move the schedule for Python 2.1 back (or is it forward?) by, say, a > few months. This will give people some time to do the things that did > not get the right amount of attention during the 2.0 release, and will > still allow work on new and interesting features. > > Just my 0.02EUR, You're right -- 2.0 (including 1.6) was a monumental effort, and I'm grateful to all who contributed. I don't expect that 2.1 will be anywhere near the same amount of work! Let's look at what's on the table.
     0042               Small Feature Requests            Hylton
 SD   205  pep-0205.txt  Weak References                   Drake
 S    207  pep-0207.txt  Rich Comparisons                  Lemburg, van Rossum
 S    208  pep-0208.txt  Reworking the Coercion Model      Schemenauer
 S    217  pep-0217.txt  Display Hook for Interactive Use  Zadka
 S    222  pep-0222.txt  Web Library Enhancements          Kuchling
 I    226  pep-0226.txt  Python 2.1 Release Schedule       Hylton
 S    227  pep-0227.txt  Statically Nested Scopes          Hylton
 S    230  pep-0230.txt  Warning Framework                 van Rossum
 S    232  pep-0232.txt  Function Attributes               Warsaw
 S    233  pep-0233.txt  Python Online Help                Prescod

From guido@python.org Sat Dec 16 16:46:32 2000 From: guido@python.org (Guido van Rossum) Date: Sat, 16 Dec 2000 11:46:32 -0500 Subject: [Python-Dev] PEP 226 In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100." <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> Message-ID: <200012161646.LAA13947@cj20424-a.reston1.va.home.com> [Oops, I posted a partial edit of this message by mistake before.] > I remember earlier discussion on the Python 2.1 release schedule, and > never managed to comment on those. > > I believe that Python contributors and maintainers did an enormous > job in releasing Python 2, which took quite some time from everybody's > life. I think it is unrealistic to expect the same amount of > commitment for the next release, especially if that release appears > just a few months after the previous release (that is, one month from > now). > > So I'd like to ask the release manager to take that into > account. I'm not quite sure what kind of action I expect; possible > alternatives are: > - declare 2.1 a pure bug fix release only, with a minimal set of new > features. In particular, don't push for completion of PEPs; everybody > should then accept that most features that are currently discussed > will appear in Python 2.2. > > - move the schedule for Python 2.1 back (or is it forward?) by, say, a > few months.
This will give people some time to do the things that did > not get the right amount of attention during the 2.0 release, and will > still allow work on new and interesting features. > > Just my 0.02EUR, You're right -- 2.0 (including 1.6) was a monumental effort, and I'm grateful to all who contributed. I don't expect that 2.1 will be anywhere near the same amount of work! Let's look at what's on the table. These are listed as Active PEPs -- under serious consideration for Python 2.1: > 0042 Small Feature Requests Hylton We can do some of these or leave them. > 0205 Weak References Drake This one's open. > 0207 Rich Comparisons Lemburg, van Rossum This is really not that much work -- I would've done it already if I weren't distracted by the next one. > 0208 Reworking the Coercion Model Schemenauer Neil has most of this under control. I don't doubt for a second that it will be finished. > 0217 Display Hook for Interactive Use Zadka Probably a 20-line fix. > 0222 Web Library Enhancements Kuchling Up to Andrew. If he doesn't get to it, no big deal. > 0226 Python 2.1 Release Schedule Hylton I still think this is realistic -- a release before the conference seems doable! > 0227 Statically Nested Scopes Hylton This one's got a 50% chance at least. Jeremy seems motivated to do it. > 0230 Warning Framework van Rossum Done except for documentation. > 0232 Function Attributes Warsaw We need to discuss this more, but it's not much work to implement. > 0233 Python Online Help Prescod If Paul can control his urge to want to solve everything at once, I see no reason why this one couldn't find its way into 2.1. Now, officially the PEP deadline is closed today: the schedule says "16-Dec-2000: 2.1 PEPs ready for review". That means that no new PEPs will be considered for inclusion in 2.1, and PEPs not in the active list won't be considered either. But the PEPs in the list above are all ready for review, even if we don't agree with all of them.
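[Two of the smaller PEPs above are easy to picture concretely. A minimal sketch of what PEP 232 function attributes and the PEP 217 display hook provide, using later-Python syntax and hypothetical names chosen for illustration:]

```python
import sys

# PEP 232: arbitrary attributes can be attached to function objects.
def greet(name):
    return "Hello, " + name

greet.author = "A. N. Example"   # hypothetical metadata, not a real API

# PEP 217: sys.displayhook is the replaceable callable that the
# interactive interpreter invokes with each expression result.
results = []
def recording_hook(value):
    if value is not None:
        results.append(value)

old_hook = sys.displayhook
sys.displayhook = recording_hook
sys.displayhook(6 * 7)       # what the REPL would do after evaluating 6 * 7
sys.displayhook = old_hook   # restore the default behavior

print(greet.author)          # A. N. Example
print(results)               # [42]
```

Swapping the hook is all a "display hook for interactive use" amounts to, which is why the PEP was estimated above as roughly a 20-line fix.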
I'm actually more worried about the ever-growing number of bug reports and submitted patches. But that's for another time. --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin@mems-exchange.org Sun Dec 17 00:09:28 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Sat, 16 Dec 2000 19:09:28 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:24:34PM +0100 References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> <3A3A9A22.E9BA9551@lemburg.com> Message-ID: <20001216190928.A6703@kronos.cnri.reston.va.us> On Fri, Dec 15, 2000 at 11:24:34PM +0100, M.-A. Lemburg wrote: >Why would you want to fix it ? Format characters will always >be ASCII and thus 7-bit -- theres really no need to expand the >set of possibilities beyond 8 bits ;-) This message is for characters that aren't format characters, which therefore includes all characters >127. --amk From akuchlin@mems-exchange.org Sun Dec 17 00:17:39 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Sat, 16 Dec 2000 19:17:39 -0500 Subject: [Python-Dev] What to do about PEP 229? In-Reply-To: <3A3A9D96.80781D61@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:39:18PM +0100 References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com> Message-ID: <20001216191739.B6703@kronos.cnri.reston.va.us> On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote: >Can't distutils try both and then settle for the working combination ? I'm worried about subtle problems; what if an unneeded -lfoo drags in a customized malloc, or has symbols which conflict with some other library. >... BTW, where is Greg ? I haven't heard from him in quite a while.] Still around; he just hasn't been posting much these days. >Why not parse Setup and use it as input to distutils setup.py ? That was option 1. 
The existing Setup format doesn't really contain enough intelligence, though; the intelligence is usually in comments such as "Uncomment the following line for Solaris". So either the Setup format is modified (bad, since we'd break existing 3rd-party packages that still use a Makefile.pre.in), or I give up and just do everything in a setup.py. --amk From guido@python.org Sun Dec 17 02:38:01 2000 From: guido@python.org (Guido van Rossum) Date: Sat, 16 Dec 2000 21:38:01 -0500 Subject: [Python-Dev] What to do about PEP 229? In-Reply-To: Your message of "Sat, 16 Dec 2000 19:17:39 EST." <20001216191739.B6703@kronos.cnri.reston.va.us> References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com> <20001216191739.B6703@kronos.cnri.reston.va.us> Message-ID: <200012170238.VAA14466@cj20424-a.reston1.va.home.com> > >Why not parse Setup and use it as input to distutils setup.py ? > > That was option 1. The existing Setup format doesn't really contain > enough intelligence, though; the intelligence is usually in comments > such as "Uncomment the following line for Solaris". So either the > Setup format is modified (bad, since we'd break existing 3rd-party > packages that still use a Makefile.pre.in), or I give up and just do > everything in a setup.py. Forget Setup. Convert it and be done with it. There really isn't enough there to hang on to. We'll support Setup format (through the makesetup script and the Misc/Makefile.pre.in file) for 3rd party b/w compatibility, but we won't need to use it ourselves. (Too bad for 3rd party documentation that describes the Setup format. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Sun Dec 17 07:34:27 2000 From: tim.one@home.com (Tim Peters) Date: Sun, 17 Dec 2000 02:34:27 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE In-Reply-To: <20001216190928.A6703@kronos.cnri.reston.va.us> Message-ID: [MAL] > Why would you want to fix it ? 
Format characters will always > be ASCII and thus 7-bit -- theres really no need to expand the > set of possibilities beyond 8 bits ;-) [AMK] > This message is for characters that aren't format characters, which > therefore includes all characters >127. I'm with the wise man who suggested to drop the %c in this case and just display the hex value. Although it would be more readable to drop the %c if and only if the bogus format character isn't printable 7-bit ASCII. Which is obvious, yes? A new if/else isn't going to hurt anything. From tim.one@home.com Sun Dec 17 07:57:01 2000 From: tim.one@home.com (Tim Peters) Date: Sun, 17 Dec 2000 02:57:01 -0500 Subject: [Python-Dev] PEP 226 In-Reply-To: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> Message-ID: [Martin v. Loewis] > ... > - move the schedule for Python 2.1 back (or is it forward?) by, say, a > few month. This will people give some time to do the things that did > not get the right amount of attention during 2.0 release, and will > still allow to work on new and interesting features. Just a stab in the dark, but is one of your real concerns the spotty state of Unicode support in the std libraries? If so, nobody working on the PEPs Guido identified would be likely to work on improving Unicode support even if the PEPs vanished. I don't know how Unicode support is going to improve, but in the absence of visible work in that direction-- or even A Plan to get some --I doubt we're going to hold up 2.1 waiting for magic. no-feature-is-ever-done-ly y'rs - tim From tim.one@home.com Sun Dec 17 08:30:24 2000 From: tim.one@home.com (Tim Peters) Date: Sun, 17 Dec 2000 03:30:24 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A387D6A.782E6A3B@prescod.net> Message-ID: [Tim] >> I've rarely seen problems due to shadowing a global, but have often >> seen problems due to shadowing a builtin. [Paul Prescod] > Really? Yes. > I think that there are two different issues here. 
One is consciously > choosing to create a new variable but not understanding that there > already exists a variable by that name. (i.e. str, list). Yes, and that's what I've often seen, typically long after the original code is written: someone sticks in some debugging output, or makes a small change to the implementation, and introduces e.g. str = some_preexisting_var + ":" yadda(str) "Suddenly" the program misbehaves in baffling ways. They're "baffling" because the errors do not occur on the lines where the changes were made, and are almost never related to the programmer's intent when making the changes. > Another is trying to assign to a global but actually shadowing it. I've rarely seen that. > There is no way that anyone coming from another language is going > to consider this transcript reasonable: True, but I don't really care: everyone gets burned once, the better ones eventually learn to use classes instead of mutating globals, and even the dull get over it. It is not, in my experience, an on-going problem for anyone. But I still get burned regularly by shadowing builtins. The burns are not fatal, however, and I can't think of an ointment less painful than the blisters. > >>> a=5 > >>> def show(): > ... print a > ... > >>> def set(val): > ... a=val > ... > >>> a > 5 > >>> show() > 5 > >>> set(10) > >>> show() > 5 > > It doesn't seem to make any sense. My solution is to make the assignment > in "set" illegal unless you add a declaration that says: "No, really. I > mean it. Override that sucker." As the PEP points out, overriding is > seldom a good idea so the requirement to declare would be rarely > invoked. I expect it would do less harm to introduce a compile-time warning for locals that are never referenced (such as the "a" in "set"). > ... > The "right answer" in terms of namespace theory is to consistently refer > to builtins with a prefix (whether "__builtins__" or "$") but that's > pretty unpalatable from an aesthetic point of view. 
Right, that's one of the ointments I won't apply to my own code, so wouldn't think of asking others to either.

WRT mutable globals, people who feel they have to use them would be well served to adopt a naming convention. For example, begin each name with "g" and capitalize the second letter. This can make global-rich code much easier to follow (I've done-- and very happily --similar things in Javascript and C++).

From pf@artcom-gmbh.de Sun Dec 17 09:59:11 2000
From: pf@artcom-gmbh.de (Peter Funk)
Date: Sun, 17 Dec 2000 10:59:11 +0100 (MET)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 15, 2000 4:23:46 pm"
Message-ID:

Hi,

Guido van Rossum:
> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.

I strongly believe the string module is too prominent.

> I'd also like to note that using the string module's wrappers incurs
> the overhead of a Python function call -- using string methods is
> faster.

I think most care more about readability than about run-time performance. For people without much OOP experience, the method syntax hurts readability.

> Finally, I like the look of fields[i].strip().lower() much better than
> that of string.lower(string.strip(fields[i])) -- an actual example
> from mimetools.py.

Hmmmm.... Maybe this is just a matter of taste? Like my preference for '<>' instead of '!='? Personally I still like the old-fashioned form more, especially if string.join() or string.split() are involved. Since Python 1.5.2 will stay around for several years, keeping backward compatibility in our Python coding is still a major issue for us. So we won't change our Python coding style soon, if ever.

> Ideally, I would like to deprecate the entire string module, so that I
[...]

I share Mark Lutz and Tim Peters' opinion that this crusade will do more harm than good to the Python community.
IMO this is a really bad idea. Just my $0.02, Peter From martin@loewis.home.cs.tu-berlin.de Sun Dec 17 11:13:09 2000 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 17 Dec 2000 12:13:09 +0100 Subject: [Python-Dev] PEP 226 In-Reply-To: References: Message-ID: <200012171113.MAA00733@loewis.home.cs.tu-berlin.de> > Just a stab in the dark, but is one of your real concerns the spotty state > of Unicode support in the std libraries? Not at all. I really responded to amk's message # All the PEPs for 2.1 are supposed to be complete for Dec. 16, and # some of those PEPs are pretty complicated. I'm a bit worried that # it's been so quiet on python-dev lately, especially after the # previous two weeks of lively discussion. I just thought that something was wrong here - contributing to a free software project ought to be fun for contributors, not a cause for worries. There-are-other-things-but-i18n-although-they-are-not-that-interesting y'rs, Martin From guido@python.org Sun Dec 17 14:38:07 2000 From: guido@python.org (Guido van Rossum) Date: Sun, 17 Dec 2000 09:38:07 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: Your message of "Sun, 17 Dec 2000 03:30:24 EST." References: Message-ID: <200012171438.JAA21603@cj20424-a.reston1.va.home.com> > I expect it would do less harm to introduce a compile-time warning for > locals that are never referenced (such as the "a" in "set"). Another warning that would be quite useful (and trap similar cases) would be "local variable used before set". --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Sun Dec 17 14:40:40 2000 From: guido@python.org (Guido van Rossum) Date: Sun, 17 Dec 2000 09:40:40 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Sun, 17 Dec 2000 10:59:11 +0100." References: Message-ID: <200012171440.JAA21620@cj20424-a.reston1.va.home.com> > I think most care more about readbility than about run time performance. 
> For people without much OOP experience, the method syntax hurts > readability. I don't believe one bit of this. By that standard, we would do better to define a new module "list" and start writing list.append(L, x) for L.append(x). > I share Mark Lutz and Tim Peters oppinion, that this crusade will do > more harm than good to Python community. IMO this is a really bad > idea. You are entitled to your opinion, but given that your arguments seem very weak I will continue to ignore it (except to argue with you :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From barry@digicool.com Sun Dec 17 16:17:12 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Sun, 17 Dec 2000 11:17:12 -0500 Subject: [Python-Dev] Death to string functions! References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> Message-ID: <14908.59144.321167.419762@anthem.concentric.net> >>>>> "PF" == Peter Funk writes: PF> Hmmmm.... May be this is just a matter of taste? Like my PF> preference for '<>' instead of '!='? Personally I still like PF> the old fashinoned form more. Especially, if string.join() or PF> string.split() are involved. Hey cool! I prefer <> over != too, but I also (not surprisingly) strongly prefer string methods over string module functions. TOOWTDI-MA-ly y'rs, -Barry From gvwilson@nevex.com Sun Dec 17 16:25:17 2000 From: gvwilson@nevex.com (Greg Wilson) Date: Sun, 17 Dec 2000 11:25:17 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <14908.59144.321167.419762@anthem.concentric.net> Message-ID: <000201c06845$f1afdb40$770a0a0a@nevex.com> +1 on deprecating string functions. Every Python book and tutorial (including mine) emphasizes Python's simplicity and lack of Perl-ish redundancy; the more we practice what we preach, the more persuasive this argument is. 
Greg (who admittedly only has a few thousand lines of Python to maintain) From pf@artcom-gmbh.de Sun Dec 17 17:40:06 2000 From: pf@artcom-gmbh.de (Peter Funk) Date: Sun, 17 Dec 2000 18:40:06 +0100 (MET) Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012171440.JAA21620@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 17, 2000 9:40:40 am" Message-ID: [string.function(S, ...) vs. S.method(...)] Guido van Rossum: > I don't believe one bit of this. By that standard, we would do better > to define a new module "list" and start writing list.append(L, x) for > L.append(x). list objects have only very few methods. Strings have so many methods. Some of them have names, that clash easily with the method names of other kind of objects. Since there are no type declarations in Python, looking at the code in isolation and seeing a line i = string.index(some_parameter) tells at the first glance, that some_parameter should be a string object even if the doc string of this function is too terse. However in i = some_parameter.index() it could be a list, a database or whatever. > You are entitled to your opinion, but given that your arguments seem > very weak I will continue to ignore it (except to argue with you :-). I see. But given the time frame that the string module wouldn't go away any time soon, I guess I have a lot of time to either think about some stronger arguments or to get finally accustomed to that new style of coding. But since we have to keep compatibility with Python 1.5.2 for at least the next two years chances for the latter are bad. 
Regards and have a nice vacation, Peter From mwh21@cam.ac.uk Sun Dec 17 18:18:24 2000 From: mwh21@cam.ac.uk (Michael Hudson) Date: 17 Dec 2000 18:18:24 +0000 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: Thomas Wouters's message of "Fri, 15 Dec 2000 23:54:25 +0100" References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> <20001215235425.A29681@xs4all.nl> Message-ID: Thomas Wouters writes: > On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote: > > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote: > > > I'm not sure I agree with that view either, but mostly because there > > > is a non-GPL replacement for parts of the readline API: > > > > > > http://www.cstr.ed.ac.uk/downloads/editline.html > > > > It doesn't work with the current readline module. It is much > > smaller than readline and works just as well in my experience. > > Would there be any interest in including a copy with the standard > > distribution? The license is quite nice (X11 type). > > Definately +1 from here. Readline reminds me of the cold war, for > some reason. (Actually, multiple reasons ;) I don't have time to do > it myself, unfortunately, or I would. (Looking at editline has been > on my TODO list for a while... :P) It wouldn't be particularly hard to rewrite editline in Python (we have termios & the terminal handling functions in curses - and even ioctl if we get really keen). I've been hacking on my own Python line reader on and off for a while; it's still pretty buggy, but if you're feeling brave you could look at: http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.0.0.tar.gz To try it out, unpack it, cd into the ./pyrl directory and try: >>> import foo # sorry >>> foo.test_loop() It sort of imitates the Python command prompt, except that it doesn't actually execute the code you type. 
You need a recent _cursesmodule.c for it to work. Cheers, M. -- 41. Some programming languages manage to absorb change, but withstand progress. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From thomas@xs4all.net Sun Dec 17 18:30:38 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Sun, 17 Dec 2000 19:30:38 +0100 Subject: [Python-Dev] Death to string functions! In-Reply-To: <000201c06845$f1afdb40$770a0a0a@nevex.com>; from gvwilson@nevex.com on Sun, Dec 17, 2000 at 11:25:17AM -0500 References: <14908.59144.321167.419762@anthem.concentric.net> <000201c06845$f1afdb40$770a0a0a@nevex.com> Message-ID: <20001217193038.C29681@xs4all.nl> On Sun, Dec 17, 2000 at 11:25:17AM -0500, Greg Wilson wrote: > +1 on deprecating string functions. How wonderfully ambiguous ! Do you mean string methods, or the string module? :) FWIW, I agree that in time, the string module should be deprecated. But I also think that 'in time' should be a considerable timespan. Don't deprecate it before everything it provides is available though some other means. Wait a bit longer than that, even, before calling it deprecated -- that scares people off. And then keep it for practically forever (until Py3K) just to support old code. And don't forget to document it 'deprecated' everywhere, not just one minor release note. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tismer@tismer.com Sun Dec 17 17:38:31 2000 From: tismer@tismer.com (Christian Tismer) Date: Sun, 17 Dec 2000 19:38:31 +0200 Subject: [Python-Dev] The Dictionary Gem is polished! References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> Message-ID: <3A3CFA17.ED26F51A@tismer.com> This is a multi-part message in MIME format. 
--------------0B643A01C67D836AADED505B
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Old topic: {}.popitem() (was Re: {}.first[key,value,item] ...)

Christian Tismer wrote:
>
> Fredrik Lundh wrote:
> >
> > christian wrote:
> > > That algorithm is really a gem which you should know,
> > > so let me try to explain it.
> >
> > I think someone just won the "brain exploder 2000" award ;-)
>
> As you might have guessed, I didn't do this just for fun.
> It is the old game of explaining what is there, convincing
> everybody that you at least know what you are talking about,
> and then three days later coming up with an improved
> application of the theory.
>
> Today is Monday, 2 days left. :-)

Ok, today is Sunday, I had no time to finish this. But now it is here.

===========================
=====     Claim:      =====
===========================

- Dictionary access time can be improved with a minimal change.

- On the hash() function: all objects are supposed to provide a hash function which is as good as possible. Good means providing a wide range of different hash values for different keys.

Problem: There are hash functions which are "good" in this sense, but they do not spread their randomness uniformly over the 32 bits.

Example: Integers use their own value as hash. This is ok, as far as the integers are uniformly distributed. But if they all contain a high power of two, for instance, the low bits give a very bad hash function.

Take a dictionary with the integers range(1000) as keys and access all entries. Then use a dictionary with the integers shifted left by 16. Access time is slowed down by a factor of 100, since every access is a linear search now.

This is not an urgent problem, although applications exist where this can play a role (memory addresses, for instance, can have high factors of two when people do statistics on page accesses...)

While this is not a big problem, it is ugly enough to think of a solution.
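The slowdown described above is easy to reproduce without touching the interpreter: with a table of size 2^k, only the masked low bits of the hash pick the first slot, and for small integers the hash is the value itself. A minimal sketch in modern Python (the table size 2**10 is an illustrative assumption, not the interpreter's actual sizing):

```python
# Why keys like i << 16 are pathological for a masked hash:
# a table of size 2**10 picks slots with hash & (2**10 - 1), and
# small-integer hashes are the integer values themselves.
MASK = 2**10 - 1

plain_slots = {i & MASK for i in range(1000)}             # keys 0..999
shifted_slots = {(i << 16) & MASK for i in range(1000)}   # keys i << 16

print(len(plain_slots))    # 1000 -- essentially one slot per key
print(len(shifted_slots))  # 1 -- every key lands in the same slot
```

Every one of those collisions then has to be resolved by the probe sequence, which is exactly the linear-search behavior measured above.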
Solution 1:
-----------

Try to involve more bits of the hash value by doing extra shuffling, either
a) in the dictlook function, or
b) in the hash generation itself.

I believe neither can really be justified for a rare problem. But how about changing the existing solution in a way that an improvement is gained without extra cost?

Solution 2: (*the* solution)
----------------------------

Some people may remember what I wrote about re-hashing functions through the multiplicative group GF(2^n)*, and I don't want to repeat this here. The simple idea can be summarized quickly:

The original algorithm uses multiplication by polynomials, and it is guaranteed that these re-hash values jitter through all possible nonzero patterns of the n bits.

Observation: We are using an operation of a finite field. This means that the inverse of multiplication also exists!

Old algorithm (multiplication):

    shift the index left by 1
    if index > mask:
        xor the index with the generator polynomial

New algorithm (division):

    if low bit of index set:
        xor the index with the generator polynomial
    shift the index right by 1

What does this mean? Not so much; we are just cycling through our bit patterns in reverse order. But now for the big difference.

First change: We change from multiplication to division.
Second change: We do not mask the hash value before!

The second change is what I was after: by not masking the hash value when computing the initial index, all the existing bits in the hash come into play. This can be seen like a polynomial division, but the initial remainder (the hash value) was not normalized. After a number of steps, all the extra bits are wheeled into our index, but not wasted by masking them off. That gives our re-hash some more randomness. When all the extra bits are sucked in, the guaranteed single-visit cycle begins. There cannot be more than 27 extra cycles in the worst case (dict size = 32, so there are 27 bits to consume).
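The division step can be checked in a few lines. Using the polynomial encoding from the attached script (size + k, e.g. 8 + 3 for a table of size 8), the step visits every nonzero pattern of the low bits exactly once before returning to where it started:

```python
# The "division" re-hash step cycles through GF(2^3)-{0}:
# all 7 nonzero 3-bit patterns are visited, then the cycle repeats.
size = 8
poly = 8 + 3              # irreducible-polynomial encoding from the script

incr = 1
seen = []
for _ in range(size - 1):
    seen.append(incr)
    if incr & 1:              # low bit of index set:
        incr = incr ^ poly    #   xor with the generator polynomial
    incr = incr >> 1          # shift the index right by 1

print(sorted(seen))   # [1, 2, 3, 4, 5, 6, 7]
print(incr)           # 1 -- back at the start: a full single-visit cycle
```

The multiplication form walks the same cycle in the opposite direction; the division form is what lets the unmasked high bits of the hash drain into the index one step at a time.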
I do not expect any bad effect from this modification. Here are some results; the dictionaries have 1000 entries:

    timing for strings             old=  5.097  new= 5.088
    timing for bad integers (<<10) old=101.540  new=12.610
    timing for bad integers (<<16) old=571.210  new=19.220

On strings, both algorithms behave the same. On numbers, they differ dramatically. While the current algorithm is 110 times slower on a worst-case dict (quadratic behavior), the new algorithm accounts a little for the extra cycle, but is only 4 times slower.

Alternative implementation:

The above approach is conservative in the sense that it tries not to slow down the current implementation in any way. An alternative would be to consume all of the extra bits at once. But this would add an extra "warmup" loop like this to the algorithm:

    while index > mask:
        if low bit of index set:
            xor the index with the generator polynomial
        shift the index right by 1

This is of course a very good digest of the higher bits, since it is a polynomial division and not just some bit xor-ing which might give quite predictable cancellations, therefore it is "the right way" in my sense. It might be cheap, but it would add over 20 cycles to every small dict. I therefore don't think it is worth doing.

Personally, I prefer the solution of merging the bits in during the actual lookup, since it suffices to get access time from quadratic down to logarithmic.

Attached is a direct translation of the relevant parts of dictobject.c into Python, with both algorithms implemented.

cheers - chris

--
Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today?
http://www.stackless.com

--------------0B643A01C67D836AADED505B
Content-Type: text/plain; charset=us-ascii; name="dictest.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="dictest.py"

## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3, 256 + 29,
    512 + 17, 1024 + 9, 2048 + 5, 4096 + 83, 8192 + 27,
    16384 + 43, 32768 + 3, 65536 + 45, 131072 + 9, 262144 + 39,
    524288 + 39, 1048576 + 9, 2097152 + 5, 4194304 + 3,
    8388608 + 33, 16777216 + 27, 33554432 + 9, 67108864 + 71,
    134217728 + 39, 268435456 + 9, 536870912 + 5, 1073741824 + 83,
    0
]

class NULL:
    pass

class Dictionary:
    dummy = ""

    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3)  # rec slots
        dummy = mp.dummy
        mask = mp.ma_size - 1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0):
                return ep
            freeslot = NULL
        ###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
        ###### TO HERE
        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i + incr) & mask]
            if (ep[me_key] is NULL):
                if (freeslot != NULL):
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy):
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                  (ep[me_hash] == _hash and
                   cmp(ep[me_key], key) == 0)):
                return ep
            # Cycle through GF(2^n)-{0}
            ###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
            ###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3)  # rec slots
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL):
            old_value = ep[me_value]
            ep[me_value] = value
        else:
            if (ep[me_key] is NULL):
                mp.ma_fill = mp.ma_fill + 1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used + 1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3)  # rec slots
        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused):
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1
        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL
        newtable = map(lambda x, y=_nullentry: y[:], range(newsize))
        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0
        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key], ep[me_hash], ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3)  # rec slots
        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
        ## /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill * 3 >= mp.ma_size * 2):
            if (mp.dictresize(mp.ma_used * 2) != 0):
                if (mp.ma_fill + 1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3)  # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3)  # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3)  # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append((_key, _value))
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords):
    import time
    time = time.time
    appl = apply
    if args is None:
        args = ()
    if type(args) != type(()):
        args = (args,)
    rep = range(n)
    dummyarg = ("",)
    dummykw = {}
    dummyfunc = len
    if keywords:
        before = time()
        for i in rep:
            res = appl(dummyfunc, dummyarg, dummykw)
        empty = time() - before
        before = time()
        for i in rep:
            res = appl(func, args, keywords)
    else:
        before = time()
        for i in rep:
            res = appl(dummyfunc, dummyarg)
        empty = time() - before
        before = time()
        for i in rep:
            res = appl(func, args)
    after = time()
    return round(after - before - empty, 4), res

def test(lis, dic):
    for key in lis:
        dic[key]

def nulltest(lis, dic):
    for key in lis:
        dic

def string_dicts():
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    for i in range(1000):
        s = str(i) * 5
        d1[s] = d2[s] = i
    return d1, d2

def badnum_dicts():
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    for i in range(1000):
        bad = i << 16
        d1[bad] = d2[bad] = i
    return d1, d2

def do_test(dict, keys, n):
    t0 = timing(nulltest, (keys, dict), n)[0]
    t1 = timing(test, (keys, dict), n)[0]
    return t1 - t0

if __name__ == "__main__":
    sdold, sdnew = string_dicts()
    bdold, bdnew = badnum_dicts()
    print "timing for strings old=%.3f new=%.3f" % (
        do_test(sdold, sdold.keys(), 100),
        do_test(sdnew, sdnew.keys(), 100))
    print "timing for bad integers old=%.3f new=%.3f" % (
        do_test(bdold, bdold.keys(), 10) * 10,
        do_test(bdnew, bdnew.keys(), 10) * 10)

"""
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610
"""

--------------0B643A01C67D836AADED505B--

From fdrake@acm.org Sun Dec 17 18:49:58 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Sun, 17 Dec 2000 13:49:58 -0500 (EST)
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <20001217193038.C29681@xs4all.nl>
References: <14908.59144.321167.419762@anthem.concentric.net> <000201c06845$f1afdb40$770a0a0a@nevex.com> <20001217193038.C29681@xs4all.nl>
Message-ID: <14909.2774.158973.760077@cj42289-a.reston1.va.home.com>

Thomas Wouters writes:
> FWIW, I agree that in time, the string module should be deprecated. But I
> also think that 'in time' should be a considerable timespan. Don't deprecate

*If* most functions in the string module are going to be deprecated, that should be done *now*, so that the documentation will include the appropriate warning to users. When they should actually be removed is another matter, and I think Guido is sufficiently aware of their widespread use and won't remove them too quickly -- his creation of Python isn't the reason he's *accepted* as BDFL, it just made it a possibility. He's had to actually *earn* the BDFL position, I think.

With regard to converting the standard library to string methods: that needs to be done as part of the deprecation. The code in the library is commonly used as example code, and should be good example code wherever possible.

> support old code. And don't forget to document it 'deprecated' everywhere,
> not just one minor release note.

When Guido tells me exactly what is deprecated, the documentation will be updated with proper deprecation notices in the appropriate places.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Digital Creations From tismer@tismer.com Sun Dec 17 18:10:07 2000 From: tismer@tismer.com (Christian Tismer) Date: Sun, 17 Dec 2000 20:10:07 +0200 Subject: [Python-Dev] The Dictionary Gem is polished! References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> Message-ID: <3A3D017F.62AD599F@tismer.com> This is a multi-part message in MIME format. --------------D1825E07B23FE5AC1D48DB49 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Christian Tismer wrote: ... (my timings) Attached is the updated script with the timings mentioned in the last posting. Sorry, I posted an older version before. -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com --------------D1825E07B23FE5AC1D48DB49 Content-Type: text/plain; charset=us-ascii; name="dictest.py" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="dictest.py" ## dictest.py ## Test of a new rehash algorithm ## Chris Tismer ## 2000-12-17 ## Mission Impossible 5oftware Team # The following is a partial re-implementation of # Python dictionaries in Python. # The original algorithm was literally turned # into Python code. ##/* ##Table of irreducible polynomials to efficiently cycle through ##GF(2^n)-{0}, 2<=n<=30. 
##*/ polys = [ 4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3, 256 + 29, 512 + 17, 1024 + 9, 2048 + 5, 4096 + 83, 8192 + 27, 16384 + 43, 32768 + 3, 65536 + 45, 131072 + 9, 262144 + 39, 524288 + 39, 1048576 + 9, 2097152 + 5, 4194304 + 3, 8388608 + 33, 16777216 + 27, 33554432 + 9, 67108864 + 71, 134217728 + 39, 268435456 + 9, 536870912 + 5, 1073741824 + 83, 0 ] class NULL: pass class Dictionary: dummy = "" def __init__(mp, newalg=0): mp.ma_size = 0 mp.ma_poly = 0 mp.ma_table = [] mp.ma_fill = 0 mp.ma_used = 0 mp.oldalg = not newalg def lookdict(mp, key, _hash): me_hash, me_key, me_value = range(3) # rec slots dummy = mp.dummy mask = mp.ma_size-1 ep0 = mp.ma_table i = (~_hash) & mask ep = ep0[i] if ep[me_key] is NULL or ep[me_key] == key: return ep if ep[me_key] == dummy: freeslot = ep else: if (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0) : return ep freeslot = NULL ###### FROM HERE if mp.oldalg: incr = (_hash ^ (_hash >> 3)) & mask else: # note that we do not mask! # even the shifting my not be worth it. 
incr = _hash ^ (_hash >> 3) ###### TO HERE if (not incr): incr = mask while 1: ep = ep0[(i+incr)&mask] if (ep[me_key] is NULL) : if (freeslot != NULL) : return freeslot else: return ep if (ep[me_key] == dummy) : if (freeslot == NULL): freeslot = ep elif (ep[me_key] == key or (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0)) : return ep # Cycle through GF(2^n)-{0} ###### FROM HERE if mp.oldalg: incr = incr << 1 if (incr > mask): incr = incr ^ mp.ma_poly else: # new algorithm: do a division if (incr & 1): incr = incr ^ mp.ma_poly incr = incr >> 1 ###### TO HERE def insertdict(mp, key, _hash, value): me_hash, me_key, me_value = range(3) # rec slots ep = mp.lookdict(key, _hash) if (ep[me_value] is not NULL) : old_value = ep[me_value] ep[me_value] = value else : if (ep[me_key] is NULL): mp.ma_fill=mp.ma_fill+1 ep[me_key] = key ep[me_hash] = _hash ep[me_value] = value mp.ma_used = mp.ma_used+1 def dictresize(mp, minused): me_hash, me_key, me_value = range(3) # rec slots oldsize = mp.ma_size oldtable = mp.ma_table MINSIZE = 4 newsize = MINSIZE for i in range(len(polys)): if (newsize > minused) : newpoly = polys[i] break newsize = newsize << 1 else: return -1 _nullentry = range(3) _nullentry[me_hash] = 0 _nullentry[me_key] = NULL _nullentry[me_value] = NULL newtable = map(lambda x,y=_nullentry:y[:], range(newsize)) mp.ma_size = newsize mp.ma_poly = newpoly mp.ma_table = newtable mp.ma_fill = 0 mp.ma_used = 0 for ep in oldtable: if (ep[me_value] is not NULL): mp.insertdict(ep[me_key],ep[me_hash],ep[me_value]) return 0 # PyDict_GetItem def __getitem__(op, key): me_hash, me_key, me_value = range(3) # rec slots if not op.ma_table: raise KeyError, key _hash = hash(key) return op.lookdict(key, _hash)[me_value] # PyDict_SetItem def __setitem__(op, key, value): mp = op _hash = hash(key) ## /* if fill >= 2/3 size, double in size */ if (mp.ma_fill*3 >= mp.ma_size*2) : if (mp.dictresize(mp.ma_used*2) != 0): if (mp.ma_fill+1 > mp.ma_size): raise MemoryError mp.insertdict(key, 
_hash, value) # more interface functions def keys(self): me_hash, me_key, me_value = range(3) # rec slots res = [] for _hash, _key, _value in self.ma_table: if _value is not NULL: res.append( _key) return res def values(self): me_hash, me_key, me_value = range(3) # rec slots res = [] for _hash, _key, _value in self.ma_table: if _value is not NULL: res.append( _value) return res def items(self): me_hash, me_key, me_value = range(3) # rec slots res = [] for _hash, _key, _value in self.ma_table: if _value is not NULL: res.append( (_key, _value) ) return res def __cmp__(self, other): mine = self.items() others = other.items() mine.sort() others.sort() return cmp(mine, others) ###################################################### ## tests def timing(func, args=None, n=1, **keywords) : import time time=time.time appl=apply if args is None: args = () if type(args) != type(()) : args=(args,) rep=range(n) dummyarg = ("",) dummykw = {} dummyfunc = len if keywords: before=time() for i in rep: res=appl(dummyfunc, dummyarg, dummykw) empty = time()-before before=time() for i in rep: res=appl(func, args, keywords) else: before=time() for i in rep: res=appl(dummyfunc, dummyarg) empty = time()-before before=time() for i in rep: res=appl(func, args) after = time() return round(after-before-empty,4), res def test(lis, dic): for key in lis: dic[key] def nulltest(lis, dic): for key in lis: dic def string_dicts(): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash for i in range(1000): s = str(i) * 5 d1[s] = d2[s] = i return d1, d2 def badnum_dicts(): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash shift = 10 if EXTREME: shift = 16 for i in range(1000): bad = i << 16 d1[bad] = d2[bad] = i return d1, d2 def do_test(dict, keys, n): t0 = timing(nulltest, (keys, dict), n)[0] t1 = timing(test, (keys, dict), n)[0] return t1-t0 EXTREME=1 if __name__ == "__main__": sdold, sdnew = string_dicts() bdold, bdnew = badnum_dicts() print "timing for strings old=%.3f 
new=%.3f" % ( do_test(sdold, sdold.keys(), 100), do_test(sdnew, sdnew.keys(), 100) ) print "timing for bad integers old=%.3f new=%.3f" % ( do_test(bdold, bdold.keys(), 10) *10, do_test(bdnew, bdnew.keys(), 10) *10) """ Results with a shift of 10 (EXTREME=0): D:\crml_doc\platf\py>python dictest.py timing for strings old=5.097 new=5.088 timing for bad integers old=101.540 new=12.610 Results with a shift of 16 (EXTREME=1): D:\crml_doc\platf\py>python dictest.py timing for strings old=5.218 new=5.147 timing for bad integers old=571.210 new=19.220 """ --------------D1825E07B23FE5AC1D48DB49-- From lutz@rmi.net Sun Dec 17 19:09:47 2000 From: lutz@rmi.net (Mark Lutz) Date: Sun, 17 Dec 2000 12:09:47 -0700 Subject: [Python-Dev] Death to string functions! References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> Message-ID: <001f01c0685c$ef555200$7bdb5da6@vaio> As a longstanding Python advocate and user, I find this thread disturbing, and feel compelled to add a few words: > > [Tim wrote:] > > "string" is right up there with "os" and "sys" as a FIM (Frequently > > Imported Module), so the required code changes will be massive. As > > a user, I don't see what's in it for me to endure that pain: the > > string module functions work fine! Neither are they warts in the > > language, any more than that we say sin(pi) instead of pi.sin(). > > Keeping the functions around doesn't hurt anybody that I can see. > > [Guido wrote:] > Hm. I'm not saying that this one will be easy. But I don't like > having "two ways to do it". It means more learning, etc. (you know > the drill). But with all due respect, there are already _lots_ of places in Python that provide at least two ways to do something already. Why be so strict on this one alone? Consider lambda and def; tuples and lists; map and for loops; the loop else and boolean exit flags; and so on. The notion of Python forcing a single solution is largely a myth. 
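A small runnable sketch of one such redundant-but-idiomatic pair — map with lambda versus an explicit for loop — written in present-day syntax, purely for illustration:

```python
# Two equally idiomatic ways to square a list of numbers:
# a map() call with a lambda, and an explicit for loop with def-style code.
nums = [1, 2, 3, 4]

via_map = list(map(lambda n: n * n, nums))  # functional spelling

via_loop = []                               # imperative spelling
for n in nums:
    via_loop.append(n * n)

assert via_map == via_loop == [1, 4, 9, 16]
```

Neither form has ever been deprecated in favor of the other.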
And as someone who makes a living teaching this stuff, I can tell you that none of the existing redundancies prevent anyone from learning Python. More to the point, many of those shiny new features added to 2.0 fall squarely into this category too, and are completely redundant with other tools. Consider list comprehensions and simple loops; extended print statements and sys.std* assignments; augmented assignment statements and simpler ones. Eliminating redundancy at a time when we're also busy introducing it seems a tough goal to sell. I understand the virtues of aesthetics too, but removing the string module seems an incredibly arbitrary application of it. > If you're saying that you think the string module is too prominent to > ever start deprecating its use, I'm afraid we have a problem. > > [...] > Ideally, I'd like to deprecate the entire string module, so that I > can place a single warning at its top. This will cause a single > warning to be issued for programs that still use it (no matter how > many times it is imported). And to me, this seems the real crux of the matter. For a decade now, the string module _has_ been the right way to do it. And today, half a million Python developers absolutely rely on it as an essential staple in their toolbox. What could possibly be wrong with keeping it around for backward compatibility, albeit as a less recommended option? If almost every Python program ever written suddenly starts issuing warning messages, then I think we do have a problem indeed. Frankly, a Python that changes without regard to its user base seems an ominous thing to me. And keep in mind that I like Python; others will look much less generously upon a tool that seems inclined to rip the rug out from under its users. Trust me on this; I've already heard the rumblings out there. So please: can we keep string around? Like it or not, we're way past the point of removing such core modules at this point. 
Such a radical change might pass in a future non-backward-compatible Python mutation; I'm not sure such a different system will still be "Python", but that's a topic for another day. All IMHO, of course, --Mark Lutz (http://www.rmi.net/~lutz) From tim.one@home.com Sun Dec 17 19:50:55 2000 From: tim.one@home.com (Tim Peters) Date: Sun, 17 Dec 2000 14:50:55 -0500 Subject: [Python-Dev] SourceForge SSH silliness Message-ID: Starting last night, I get this msg whenever I update Python code w/ CVSROOT=:ext:tim_one@cvs.python.sourceforge.net:/cvsroot/python: """ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ @ WARNING: HOST IDENTIFICATION HAS CHANGED! @ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY! Someone could be eavesdropping on you right now (man-in-the-middle attack)! It is also possible that the host key has just been changed. Please contact your system administrator. Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message. Password authentication is disabled to avoid trojan horses. """ This is SourceForge's doing, and is permanent (they've changed keys on their end). Here's a link to a thread that may or may not make sense to you: http://sourceforge.net/forum/forum.php?forum_id=52867 Deleting the sourceforge entries from my .ssh/known_hosts file worked for me. But everyone in the thread above who tried it says that they haven't been able to get scp working again (I haven't tried it yet ...). From paulp@ActiveState.com Sun Dec 17 20:04:27 2000 From: paulp@ActiveState.com (Paul Prescod) Date: Sun, 17 Dec 2000 12:04:27 -0800 Subject: [Python-Dev] Pragmas and warnings Message-ID: <3A3D1C4B.8F08A744@ActiveState.com> A couple of other threads started me to thinking that there are a couple of things missing from our warnings framework. Many languages have pragmas that allow you to turn warnings on and off in code.
For instance, I should be able to put a pragma at the top of a module that uses string functions to say: "I know that this module doesn't adhere to the latest Python conventions. Please don't warn me about it." I should also be able to put a declaration that says: "I'm really paranoid about shadowing globals and builtins. Please warn me when I do that." Batch and visual linters could also use the declarations to customize their behaviors. And of course we have a stack of other features that could use pragmas: * type signatures * Unicode syntax declarations * external object model language binding hints * ... A case could be made that warning pragmas could use a totally different syntax from "user-defined" pragmas. I don't care much. Paul From thomas@xs4all.net Sun Dec 17 21:00:08 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Sun, 17 Dec 2000 22:00:08 +0100 Subject: [Python-Dev] SourceForge SSH silliness In-Reply-To: ; from tim.one@home.com on Sun, Dec 17, 2000 at 02:50:55PM -0500 References: Message-ID: <20001217220008.D29681@xs4all.nl> On Sun, Dec 17, 2000 at 02:50:55PM -0500, Tim Peters wrote: > Starting last night, I get this msg whenever I update Python code w/ > CVSROOT=:ext:tim_one@cvs.python.sourceforge.net:/cvsroot/python: > """ > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ > @ WARNING: HOST IDENTIFICATION HAS CHANGED! @ > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ > IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY! > Someone could be eavesdropping on you right now (man-in-the-middle attack)! > It is also possible that the host key has just been changed. > Please contact your system administrator. > Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message. > Password authentication is disabled to avoid trojan horses. > """ > This is SourceForge's doing, and is permanent (they've changed keys on their > end). 
Here's a link to a thread that may or may not make sense to you: > http://sourceforge.net/forum/forum.php?forum_id=52867 > Deleting the sourceforge entries from my .ssh/known_hosts file worked for > me. But everyone in the thread above who tried it says that they haven't > been able to get scp working again (I haven't tried it yet ...). What sourceforge did was switch Linux distributions, and upgrade. The switch doesn't really matter for the SSH problem, because recent Debian and recent RedHat releases both use a new ssh, the OpenBSD ssh implementation. Apparently, it isn't entirely backwards compatible with old versions of F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This might or might not be your problem; if it is, you should get a relatively clear error message, such as 'cypher type 'idea' not supported'. You should be able to pass the '-c' option to scp/ssh to use a different cypher, like 3des (aka triple-des.) Or maybe the windows versions have a menu to configure that kind of thing :) Another possible problem is that it might not have good support for older protocol versions. The 'current' protocol version, at least for 'ssh1', is 1.5. The one message on the sourceforge thread above that actually mentions a version in the *cough* bugreport is using an older ssh that only supports protocol version 1.4. Since that particular version of F-secure ssh has known problems (why else would they release 16 more versions ?) I'd suggest anyone with problems first try a newer version. I hope that doesn't break WinCVS, but it would suck if it did :P If that doesn't work, which is entirely possible, it might be an honest bug in the OpenBSD ssh that Sourceforge is using. If anyone cared, we could do a bit of experimenting with the openssh-2.0 betas installed by Debian woody (unstable) to see if the problem occurs there as well. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
From greg@cosc.canterbury.ac.nz Sun Dec 17 23:05:41 2000 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 18 Dec 2000 12:05:41 +1300 (NZDT) Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il> Message-ID: <200012172305.MAA02512@s454.cosc.canterbury.ac.nz> Moshe Zadka : > Perl and Scheme permit implicit shadowing too. But Scheme always requires declarations! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin@loewis.home.cs.tu-berlin.de Sun Dec 17 23:45:56 2000 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 18 Dec 2000 00:45:56 +0100 Subject: [Python-Dev] Death to string functions! Message-ID: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> > But with all due respect, there are already _lots_ of places in > Python that provide at least two ways to do something already. Exactly. My favourite one here is string exceptions, which have quite some analogy to the string module. At some time, there were only string exceptions. Then, instance exceptions were added, some releases later they were considered the better choice, so the standard library was converted to use them. Still, there is no sign whatsoever that anybody plans to deprecate string exceptions. I believe the string module will become less important over time. Comparing it with string exceptions, that may well be 5 years. It seems there are two ways of "deprecation": a loud "we will remove that, change your code", and a silent "strings have methods" (i.e. don't mention the module when educating people). The latter approach requires educators to agree that the module is "uninteresting", and people to really not use it once they find out it exists.
I think deprecation should only be attempted once there is a clear sign that people don't use it massively for new code anymore. Removal should only occur once keeping the module is more pain than it's worth. Regards, Martin From skip@mojam.com (Skip Montanaro) Sun Dec 17 23:55:10 2000 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Sun, 17 Dec 2000 17:55:10 -0600 (CST) Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in? Message-ID: <14909.21086.92774.940814@beluga.mojam.com> I executed cvs update today (removing the sourceforge machines from .ssh/known_hosts worked fine for me, btw) followed by a configure and a make clean. The last step failed with this output: ... make[1]: Entering directory `/home/beluga/skip/src/python/dist/src/Modules' Makefile.pre.in:20: *** missing separator. Stop. make[1]: Leaving directory `/home/beluga/skip/src/python/dist/src/Modules' make: [clean] Error 2 (ignored) I found the following at line 20 of Modules/Makefile.pre.in: @SET_CXX@ I then tried a cvs annotate on that file but saw that line 20 had been there since rev 1.60 (16-Dec-99). I then checked the top-level Makefile.in thinking something must have changed in the clean target recently, but cvs annotate shows no recent changes there either: 1.1 (guido 24-Dec-93): clean: localclean 1.1 (guido 24-Dec-93): -for i in $(SUBDIRS); do \ 1.74 (guido 19-May-98): if test -d $$i; then \ 1.24 (guido 20-Jun-96): (echo making clean in subdirectory $$i; cd $$i; \ 1.4 (guido 01-Aug-94): if test -f Makefile; \ 1.4 (guido 01-Aug-94): then $(MAKE) clean; \ 1.4 (guido 01-Aug-94): else $(MAKE) -f Makefile.*in clean; \ 1.4 (guido 01-Aug-94): fi); \ 1.74 (guido 19-May-98): else true; fi; \ 1.1 (guido 24-Dec-93): done Make distclean succeeded so I tried the following: make distclean ./configure make clean but the last step still failed. Any idea why make clean is now failing (for me)? Can anyone else reproduce this problem?
Skip From greg@cosc.canterbury.ac.nz Mon Dec 18 00:02:32 2000 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Mon, 18 Dec 2000 13:02:32 +1300 (NZDT) Subject: [Python-Dev] Use of %c and Py_UNICODE In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com> Message-ID: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz> "M.-A. Lemburg" : > Format characters will always > be ASCII and thus 7-bit -- there's really no need to expand the > set of possibilities beyond 8 bits ;-) But the error message is being produced because the character is NOT a valid format character. One of the reasons for that might be because it's not in the 7-bit range! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From MarkH@ActiveState.com Mon Dec 18 06:02:27 2000 From: MarkH@ActiveState.com (Mark Hammond) Date: Mon, 18 Dec 2000 17:02:27 +1100 Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in? In-Reply-To: <14909.21086.92774.940814@beluga.mojam.com> Message-ID: > I found the following at line 20 of Modules/Makefile.pre.in: > > @SET_CXX@ I don't have time to investigate this specific problem, but I definitely had problems with SET_CXX around 6 months back. This was trying to build an external C++ application, so may be different. My message and other followups at the time implied no one really knew and everyone agreed it was likely SET_CXX was broken :-( I even referenced the CVS checkin that I thought broke it. Mark. From mal@lemburg.com Mon Dec 18 09:58:37 2000 From: mal@lemburg.com (M.-A.
Lemburg) Date: Mon, 18 Dec 2000 10:58:37 +0100 Subject: [Python-Dev] Pragmas and warnings References: <3A3D1C4B.8F08A744@ActiveState.com> Message-ID: <3A3DDFCD.34AB05B2@lemburg.com> Paul Prescod wrote: > > A couple of other threads started me to thinking that there are a couple > of things missing from our warnings framework. > > Many languages have pragmas that allow you to turn warnings on and off in > code. For instance, I should be able to put a pragma at the top of a > module that uses string functions to say: "I know that this module > doesn't adhere to the latest Python conventions. Please don't warn me > about it." I should also be able to put a declaration that says: "I'm > really paranoid about shadowing globals and builtins. Please warn me > when I do that." > > Batch and visual linters could also use the declarations to customize > their behaviors. > > And of course we have a stack of other features that could use pragmas: > > * type signatures > * Unicode syntax declarations > * external object model language binding hints > * ... > > A case could be made that warning pragmas could use a totally different > syntax from "user-defined" pragmas. I don't care much. There was a long thread about this some months ago. We agreed to add a new keyword to the language (I think it was "define") which then uses a very simple syntax which can be interpreted at compile time to modify the behaviour of the compiler, e.g. define <name> = <constant literal> There was also a discussion about allowing limited forms of expressions instead of the constant literal. define source_encoding = "utf-8" was the original motivation for this, but (as always ;) the usefulness for other application areas was quickly recognized, e.g. to enable compilation in optimization mode on a per module basis. PS: "define" is perhaps not obscure enough as keyword...
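How such pragma lines might be scanned can be sketched in a few lines. The `define <name> = <constant>` syntax is hypothetical (it was never adopted into the language), and the helper names here are invented for illustration:

```python
import ast
import re

# Hypothetical "define <name> = <constant>" pragma lines, scanned from
# module source before compilation. Only a sketch of the proposal above,
# not anything Python ever implemented.
DEFINE_RE = re.compile(r'^define\s+([A-Za-z_]\w*)\s*=\s*(.+)$')

def scan_defines(source):
    """Collect compile-time define pragmas from module source text."""
    defines = {}
    for line in source.splitlines():
        m = DEFINE_RE.match(line.strip())
        if m:
            # Only constant literals are allowed, per the proposal.
            defines[m.group(1)] = ast.literal_eval(m.group(2))
    return defines

src = 'define source_encoding = "utf-8"\ndefine optimize = 2\nx = 1\n'
assert scan_defines(src) == {"source_encoding": "utf-8", "optimize": 2}
```

Restricting the right-hand side to `ast.literal_eval` mirrors the "constant literal" restriction discussed above.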
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Mon Dec 18 10:04:08 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 18 Dec 2000 11:04:08 +0100 Subject: [Python-Dev] Use of %c and Py_UNICODE References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz> Message-ID: <3A3DE118.3355896D@lemburg.com> Greg Ewing wrote: > > "M.-A. Lemburg" : > > > Format characters will always > > be ASCII and thus 7-bit -- theres really no need to expand the > > set of possibilities beyond 8 bits ;-) > > But the error message is being produced because the > character is NOT a valid format character. One of the > reasons for that might be because it's not in the > 7-bit range! True. I think removing %c completely in that case is the right solution (in case you don't want to convert the Unicode char using the default encoding to a string first). -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Mon Dec 18 10:09:16 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 18 Dec 2000 11:09:16 +0100 Subject: [Python-Dev] What to do about PEP 229? References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com> <20001216191739.B6703@kronos.cnri.reston.va.us> Message-ID: <3A3DE24C.DA0B2F6C@lemburg.com> Andrew Kuchling wrote: > > On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote: > >Can't distutils try both and then settle for the working combination ? > > I'm worried about subtle problems; what if an unneeded -lfoo drags in > a customized malloc, or has symbols which conflict with some other > library. In that case, I think the user will have to decide. 
setup.py should then default to not integrating the module in question and issue a warning telling the user what to look for and how to call setup.py in order to add the right combination of libs. > >... BTW, where is Greg ? I haven't heard from him in quite a while.] > > Still around; he just hasn't been posting much these days. Good to know :) > >Why not parse Setup and use it as input to distutils setup.py ? > > That was option 1. The existing Setup format doesn't really contain > enough intelligence, though; the intelligence is usually in comments > such as "Uncomment the following line for Solaris". So either the > Setup format is modified (bad, since we'd break existing 3rd-party > packages that still use a Makefile.pre.in), or I give up and just do > everything in a setup.py. I would still like a simple input to setup.py -- one that doesn't require hacking setup.py just to enable a few more modules. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fredrik@effbot.org Mon Dec 18 10:15:26 2000 From: fredrik@effbot.org (Fredrik Lundh) Date: Mon, 18 Dec 2000 11:15:26 +0100 Subject: [Python-Dev] Use of %c and Py_UNICODE References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz> <3A3DE118.3355896D@lemburg.com> Message-ID: <004a01c068db$72403170$3c6340d5@hagrid> mal wrote: > > But the error message is being produced because the > > character is NOT a valid format character. One of the > > reasons for that might be because it's not in the > > 7-bit range! > > True. > > I think removing %c completely in that case is the right > solution (in case you don't want to convert the Unicode char > using the default encoding to a string first). how likely is it that a human programmer will use a bad formatting character that's not in the ASCII range?
-1 on removing it -- people shouldn't have to learn the octal ASCII table just to be able to fix trivial typos. +1 on mapping the character back to a string in the same way as "repr" -- that is, print ASCII characters as is, map anything else to an octal escape. +0 on leaving it as it is, or mapping non-printables to "?". From mal@lemburg.com Mon Dec 18 10:34:02 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 18 Dec 2000 11:34:02 +0100 Subject: [Python-Dev] The Dictionary Gem is polished! References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> Message-ID: <3A3DE81A.4B825D89@lemburg.com> > Here some results, dictionaries have 1000 entries: > > timing for strings old= 5.097 new= 5.088 > timing for bad integers (<<10) old=101.540 new=12.610 > timing for bad integers (<<16) old=571.210 new=19.220 Even though I think concentrating on string keys would provide more performance boost for Python in general, I think you have a point there. +1 from here. BTW, would changing the hash function on strings from the simple XOR scheme to something a little smarter help improve the performance too (e.g. most strings used in programming never use the 8-th bit) ? I also think that we could inline the string compare function in dictobject:lookdict_string to achieve even better performance. Currently it uses a function which doesn't trigger compiler inlining. And finally: I think a generic PyString_Compare() API would be useful in a lot of places where strings are being compared (e.g. dictionaries and keyword parameters). Unicode already has such an API (along with dozens of other useful APIs which are not available for strings).
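The "<<16" pathology in the timings quoted above is easy to reproduce from the masking step alone; a sketch, relying on CPython's convention that small non-negative integers hash to their own value:

```python
# With a power-of-two table, the first probe slot is hash & (size - 1):
# only the low bits of the hash matter. Keys of the form i << 16 therefore
# all collapse into slot 0 of a 1024-slot table.
SIZE = 1024
MASK = SIZE - 1

plain = {hash(i) & MASK for i in range(1000)}
shifted = {hash(i << 16) & MASK for i in range(1000)}

assert len(plain) == 1000   # 1000 distinct first-probe slots
assert len(shifted) == 1    # every shifted key lands in slot 0
```

With every key starting in the same slot, each lookup degenerates into the linear probe sequence described in the thread.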
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Mon Dec 18 10:41:38 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 18 Dec 2000 11:41:38 +0100 Subject: [Python-Dev] Use of %c and Py_UNICODE References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz> <3A3DE118.3355896D@lemburg.com> <004a01c068db$72403170$3c6340d5@hagrid> Message-ID: <3A3DE9E2.77FF0FA9@lemburg.com> Fredrik Lundh wrote: > > mal wrote: > > > > But the error message is being produced because the > > > character is NOT a valid format character. One of the > > > reasons for that might be because it's not in the > > > 7-bit range! > > > > True. > > > > I think removing %c completely in that case is the right > > solution (in case you don't want to convert the Unicode char > > using the default encoding to a string first). > > how likely is it that a human programmer will use a bad formatting > character that's not in the ASCII range? Not very likely... the most common case of this error is probably the use of % as percent sign in a formatting string. The next character in those cases is usually whitespace. > -1 on removing it -- people shouldn't have to learn the octal ASCII > table just to be able to fix trivial typos. > > +1 on mapping the character back to a string in the same was as > "repr" -- that is, print ASCII characters as is, map anything else to > an octal escape. > > +0 on leaving it as it is, or mapping non-printables to "?". Agreed. 
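Fredrik's repr-style mapping of a bad format character can be sketched in a few lines. The helper name is invented for illustration, and present-day Python spells the escapes as \x/\u rather than the octal escapes of the time:

```python
def describe_format_char(ch):
    """Render a (bad) format character the way Fredrik suggests:
    printable ASCII as-is, anything else as an escape sequence."""
    if " " <= ch <= "~":
        return ch
    # ascii() escapes everything outside printable ASCII;
    # strip the quotes it adds around the literal.
    return ascii(ch)[1:-1]

assert describe_format_char("z") == "z"
assert describe_format_char("\x01") == "\\x01"
assert describe_format_char("\u20ac") == "\\u20ac"
```

An error message built this way stays readable for typos like a stray "%" while still pinpointing non-ASCII characters.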
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tismer@tismer.com Mon Dec 18 11:08:34 2000 From: tismer@tismer.com (Christian Tismer) Date: Mon, 18 Dec 2000 13:08:34 +0200 Subject: [Python-Dev] The Dictionary Gem is polished! References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> <3A3DE81A.4B825D89@lemburg.com> Message-ID: <3A3DF032.5F86AD15@tismer.com> "M.-A. Lemburg" wrote: > > > Here some results, dictionaries have 1000 entries: > > > > timing for strings old= 5.097 new= 5.088 > > timing for bad integers (<<10) old=101.540 new=12.610 > > timing for bad integers (<<16) old=571.210 new=19.220 > > Even though I think concentrating on string keys would provide more > performance boost for Python in general, I think you have a point > there. +1 from here. > > BTW, would changing the hash function on strings from the simple > XOR scheme to something a little smarter help improve the performance > too (e.g. most strings used in programming never use the 8-th > bit) ? Yes, it would. I spent the rest of last night to do more accurate tests, also refined the implementation (using longs for the shifts etc), and turned from timing over to trip counting, i.e. a dict counts every round through the re-hash. That showed two things: - The bits used from the string hash are not well distributed - using a "warmup wheel" on the hash to suck all bits in gives the same quality of hashes like random numbers. I will publish some results later today. > I also think that we could inline the string compare function > in dictobject:lookdict_string to achieve even better performance. > Currently it uses a function which doesn't trigger compiler > inlining. Sure! 
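The division-based rehash from dictest.py can be checked in isolation: starting from any nonzero increment, the update should visit every element of GF(2^n)-{0} before repeating. A sketch using poly values from the script's table:

```python
def cycle_length(poly):
    """Count steps until the dictest.py-style increment returns to 1."""
    incr = 1
    steps = 0
    while True:
        # "new algorithm" from dictest.py: divide by x in GF(2^n)
        if incr & 1:
            incr ^= poly
        incr >>= 1
        steps += 1
        if incr == 1:
            return steps

# polys table entries are size + constant, e.g. 8 + 3 and 16 + 3
assert cycle_length(8 + 3) == 7     # all of GF(2^3) - {0}
assert cycle_length(16 + 3) == 15   # all of GF(2^4) - {0}
```

A cycle length of 2^n - 1 confirms the probe sequence touches every slot offset, so a lookup can never loop without finding a free slot.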
ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From guido@python.org Mon Dec 18 14:20:22 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 09:20:22 -0500 Subject: [Python-Dev] The Dictionary Gem is polished! In-Reply-To: Your message of "Sun, 17 Dec 2000 19:38:31 +0200." <3A3CFA17.ED26F51A@tismer.com> References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> Message-ID: <200012181420.JAA25063@cj20424-a.reston1.va.home.com> > Problem: There are hash functions which are "good" in this sense, > but they do not spread their randomness uniformly over the > 32 bits. > > Example: Integers use their own value as hash. > This is ok, as far as the integers are uniformly distributed. > But if they all contain a high power of two, for instance, > the low bits give a very bad hash function. > > Take a dictionary with integers range(1000) as keys and access > all entries. Then use a dictionay with the integers shifted > left by 16. > Access time is slowed down by a factor of 100, since every > access is a linear search now. Ai. I think what happened is this: long ago, the hash table sizes were primes, or at least not powers of two! I'll leave it to the more mathematically-inclined to judge your solution... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Dec 18 14:52:35 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 09:52:35 -0500 Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in? In-Reply-To: Your message of "Sun, 17 Dec 2000 17:55:10 CST." 
<14909.21086.92774.940814@beluga.mojam.com> References: <14909.21086.92774.940814@beluga.mojam.com> Message-ID: <200012181452.JAA04372@cj20424-a.reston1.va.home.com> > Make distclean succeeded so I tried the following: > > make distclean > ./configure > make clean > > but the last step still failed. Any idea why make clean is now failing (for > me)? Can anyone else reproduce this problem? Yes. I don't understand it, but this takes care of it: make distclean ./configure make Makefiles # <--------- !!! make clean --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Dec 18 14:54:20 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 09:54:20 -0500 Subject: [Python-Dev] Pragmas and warnings In-Reply-To: Your message of "Mon, 18 Dec 2000 10:58:37 +0100." <3A3DDFCD.34AB05B2@lemburg.com> References: <3A3D1C4B.8F08A744@ActiveState.com> <3A3DDFCD.34AB05B2@lemburg.com> Message-ID: <200012181454.JAA04394@cj20424-a.reston1.va.home.com> > There was a long thread about this some months ago. We agreed > to add a new keyword to the language (I think it was "define") I don't recall agreeing. :-) This is PEP material. For 2.2, please! --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Mon Dec 18 14:56:33 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 18 Dec 2000 15:56:33 +0100 Subject: [Python-Dev] Pragmas and warnings References: <3A3D1C4B.8F08A744@ActiveState.com> <3A3DDFCD.34AB05B2@lemburg.com> <200012181454.JAA04394@cj20424-a.reston1.va.home.com> Message-ID: <3A3E25A1.DFD2BDBF@lemburg.com> Guido van Rossum wrote: > > > There was a long thread about this some months ago. We agreed > > to add a new keyword to the language (I think it was "define") > > I don't recall agreeing. :-) Well, maybe it was a misinterpretation on my part... you said something like "add a new keyword and live with the consequences". AFAIR, of course :-) > This is PEP material. For 2.2, please! Ok. 
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@python.org Mon Dec 18 15:15:26 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 10:15:26 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Sun, 17 Dec 2000 12:09:47 MST." <001f01c0685c$ef555200$7bdb5da6@vaio> References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> <001f01c0685c$ef555200$7bdb5da6@vaio> Message-ID: <200012181515.KAA04571@cj20424-a.reston1.va.home.com> [Mark Lutz] > So please: can we keep string around? Like it or not, we're > way past the point of removing such core modules at this point. Of course we're keeping string around. I already said that for backwards compatibility reasons it would not disappear before Py3K. I think there's a misunderstanding about the meaning of deprecation, too. That word doesn't mean to remove a feature. It doesn't even necessarily mean to warn every time a feature is used. It just means (to me) that at some point in the future the feature will change or disappear, there's a new and better way to do it, and that we encourage users to start using the new way, to save them from work later. In my mind, there's no reason to start emitting warnings about every deprecated feature. The warnings are only needed late in the deprecation cycle. PEP 5 says "There must be at least a one-year transition period between the release of the transitional version of Python and the release of the backwards incompatible version." Can we now stop getting all bent out of shape over this? String methods *are* recommended over equivalent string functions. Those string functions *are* already deprecated, in the informal sense (i.e. just that it is recommended to use string methods instead). This *should* (take notice, Fred!) be documented per 2.1. 
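The recommendation reads, in code, as follows; a small sketch (the deprecated module spelling appears only in the comment, since it is exactly what is being discouraged):

```python
# Preferred: string methods, called directly on the object.
# Deprecated equivalent (Python 2.0): string.lower(string.strip(field))
# after "import string" -- each wrapper also costs a Python function call.
field = "  MixedCase  "
cleaned = field.strip().lower()
assert cleaned == "mixedcase"
```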
We won't however be issuing run-time warnings about the use of string functions until much later. (Lint-style tools may start warning sooner -- that's up to the author of the lint tool to decide.) Note that I believe Java makes a useful distinction that PEP 5 misses: it defines both deprecated features and obsolete features. *Deprecated* features are simply features for which a better alternative exists. *Obsolete* features are features that are only being kept around for backwards compatibility. Deprecated features may also be (and usually are) *obsolescent*, meaning they will become obsolete in the future. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@python.org Mon Dec 18 15:22:09 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 10:22:09 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Mon, 18 Dec 2000 00:45:56 +0100." <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> References: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> Message-ID: <200012181522.KAA04597@cj20424-a.reston1.va.home.com> > At some time, there were only string exceptions. Then, instance > exceptions were added, some releases later they were considered the > better choice, so the standard library was converted to use them. > Still, there is no sign whatsoever that anybody plans to deprecate > string exceptions. Now there is: I hereby state that I officially deprecate string exceptions. Py3K won't support them, and it *may* even require that all exception classes are derived from Exception. > I believe the string module will get less importance over > time. Comparing it with string exception, that may be well 5 years. > It seems there are two ways of "deprecation": a loud "we will remove > that, change your code", and a silent "strings have methods" > (i.e. don't mention the module when educating people). 
The latter > approach requires educators to agree that the module is > "uninteresting", and people to really not use once they find out it > exists. Exactly. This is what I hope will happen. I certainly hope that Mark Lutz has already started teaching string methods! > I think deprecation should be only attempted once there is a clear > sign that people don't use it massively for new code anymore. Right. So now we're on the first step: get the word out! > Removal should only occur if keeping the module [is] less pain than > maintaining it. Exactly. Guess where the string module falls today. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From Barrett@stsci.edu Mon Dec 18 16:50:49 2000 From: Barrett@stsci.edu (Paul Barrett) Date: Mon, 18 Dec 2000 11:50:49 -0500 (EST) Subject: [Python-Dev] PEP 207 -- Rich Comparisons Message-ID: <14910.16431.554136.374725@nem-srvr.stsci.edu> Guido van Rossum writes: > > > > 1. The current boolean operator behavior does not have to change, and > > hence will be backward compatible. > > What incompatibility do you see in the current proposal? You have to choose between using rich comparisons or boolean comparisons. You can't use both for the same (rich/complex) object. > > 2. It eliminates the need to decide whether or not rich comparisons > > takes precedence over boolean comparisons. > > Only if you want different semantics -- that's only an issue for NumPy. No. I think NumPy is the tip of the iceberg, when discussing new semantics. Most users don't consider these broader semantic issues, because Python doesn't give them the opportunity to do so. I can see possible scenarios of using both boolean and non-boolean comparisons for Python lists and dictionaries in addition to NumPy. I chose to use Python because it provides a richer framework than other languages. When Python fails to provide such benefits, I'll move to another language. 
I moved from PERL to Python because the multi-dimensional array syntax is vastly better in Python than PERL, though as a novice I don't have to know that it exists. What I'm proposing here is in a similar vein. > > > 3. The new operators add additional behavior without directly impacting > > > current behavior and the use of them is unambiguous, at least in > > > relation to current Python behavior. You know by the operator what > > > type of comparison will be returned. This should appease Jim > > > Fulton, based on his arguments in 1998 about comparison operators > > > always returning a boolean value. > > > > As you know, I'm now pretty close to Jim. :-) He seemed pretty mellow > > about this now. Yes, I would hope so! It appears though that you misunderstand me. My point was that I tend to agree with Jim Fulton's arguments for a limited interpretation of the current comparison operators. I too expect them to return a boolean result. I have never felt comfortable using such comparison operators in an array context, e.g. as in the array language, IDL. It just looks wrong. So my suggestion is to create new ones whose implicit meaning is to provide element-wise or rich comparison behavior. And to add similar behavior for the other operators for consistency. Can someone provide an example in mathematics where comparison operators are used in a non-boolean, ie. rich comparison, context. If so, this might shut me up! > > 4. Compound objects, such as lists, could implement both rich > > and boolean comparisons. The boolean comparison would remain as > > is, while the rich comparison would return a list of boolean > > values. Current behavior doesn't change; just a new feature, which > > you may or may not choose to use, is added. > > > > If we go one step further and add the matrix-style operators along > > with the comparison operators, we can provide a consistent user > > interface to array/complex operations without changing current Python > > behavior.
If a user has no need for these new operators, he doesn't > > have to use them or even know about them. All we've done is made > > Python richer, but I believe with making it more complex. For Phrase should be: "but I believe without making it more complex.". ------- > > example, all element-wise operations could have a ':' appended to > > them, e.g. '+:', '<:', etc.; and will define element-wise addition, > > element-wise less-than, etc. The traditional '*', '/', etc. operators > > can then be used for matrix operations, which will appease the Matlab > > people. > > > > Therefore, I don't think rich comparisons and matrix-type operators > > should be considered separable. I really think you should consider > > this suggestion. It appeases many groups while providing a consistent > > and clear user interface, while greatly impacting current Python > > behavior. The last phrase should read: "while not greatly impacting current --- Python behavior." > > > > Always-causing-havoc-at-the-last-moment-ly Yours, > > I think you misunderstand. Rich comparisons are mostly about allowing > the separate overloading of <, <=, ==, !=, >, and >=. This is useful > in its own light. No, I do understand. I've read most of the early discussions on this issue and one of those issues was about having to choose between boolean and rich comparisons and what should take precedence, when both may be appropriate. I'm suggesting an alternative here. > If you don't want to use this overloading facility for elementwise > comparisons in NumPy, that's fine with me. Nobody says you have to -- > it's just that you *could*. Yes, I understand. > Read my lips: there won't be *any* new operators in 2.1. OK, I didn't expect this to make it into 2.1. > There will be a better way to overload the existing Boolean operators, > and they will be able to return non-Boolean results. That's useful in > other situations besides NumPy. Yes, I agree, this should be done anyway.
I'm just not sure that the implicit meaning that these comparison operators are being given is the best one. I'm just looking for ways to incorporate rich comparisons into a broader framework, numpy just currently happens to be the primary example of this proposal. Assuming the current comparison operator overloading is already implemented and has been used to implement rich comparisons for some objects, then my rich comparison proposal would cause confusion. This is what I'm trying to avoid. > Feel free to lobby for elementwise operators -- but based on the > discussion about this subject so far, I don't give it much of a chance > even past Python 2.1. They would add a lot of baggage to the language > (e.g. the table of operators in all Python books would be about twice > as long) and by far the most users don't care about them. (Read the > intro to 211 for some of the concerns -- this PEP tries to make the > addition palatable by adding exactly *one* new operator.) So! Introductory books don't have to discuss these additional operators. I don't have to know about XML and socket modules to start using Python effectively, nor do I have to know about 'zip' or list comprehensions. These additions decrease the code size and increase efficiency, but don't really add any new expressive power that can't already be done by a 'for' loop. I'll try to convince myself that this suggestion is crazy and not bother you with this issue for awhile. Cheers, Paul From guido@python.org Mon Dec 18 17:18:11 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 12:18:11 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Mon, 18 Dec 2000 11:50:49 EST." <14910.16431.554136.374725@nem-srvr.stsci.edu> References: <14910.16431.554136.374725@nem-srvr.stsci.edu> Message-ID: <200012181718.MAA14030@cj20424-a.reston1.va.home.com> Paul Barret: > > > 1. 
The current boolean operator behavior does not have to change, and > > > hence will be backward compatible. Guido van Rossum: > > What incompatibility do you see in the current proposal? Paul Barret: > You have to choose between using rich comparisons or boolean > comparisons. You can't use both for the same (rich/complex) object. Sure. I thought that the NumPy folks were happy with this. Certainly two years ago they seemed to be. > > > 2. It eliminates the need to decide whether or not rich comparisons > > > takes precedence over boolean comparisons. > > > > Only if you want different semantics -- that's only an issue for NumPy. > > No. I think NumPy is the tip of the iceberg, when discussing new > semantics. Most users don't consider these broader semantic issues, > because Python doesn't give them the opportunity to do so. I can see > possible scenarios of using both boolean and non-boolean comparisons > for Python lists and dictionaries in addition to NumPy. That's the same argument that has been made for new operators all along. I've explained already why they are not on the table for 2.1. > I chose to use Python because it provides a richer framework than > other languages. When Python fails to provide such benefits, I'll > move to another language. I moved from PERL to Python because the > multi-dimensional array syntax is vastly better in Python than PERL, > though as a novice I don't have to know that it exists. What I'm > proposing here is in a similar vein. > > > > 3. The new operators add additional behavior without directly impacting > > > current behavior and the use of them is unambigous, at least in > > > relation to current Python behavior. You know by the operator what > > > type of comparison will be returned. This should appease Jim > > > Fulton, based on his arguments in 1998 about comparison operators > > > always returning a boolean value. > > > > As you know, I'm now pretty close to Jim. :-) He seemed pretty mellow > > about this now. 
> > Yes, I would hope so! > > It appears though that you misunderstand me. My point was that I tend > to agree with Jim Fulton's arguments for a limited interpretation of > the current comparison operators. I too expect them to return a > boolean result. I have never felt comfortable using such comparison > operators in an array context, e.g. as in the array language, IDL. It > just looks wrong. So my suggestion is to create new ones whose > implicit meaning is to provide element-wise or rich comparison > behavior. And to add similar behavior for the other operators for > consistency. > > Can someone provide an example in mathematics where comparison > operators are used in a non-boolean, ie. rich comparison, context. > If so, this might shut me up! Not me (I no longer consider myself a mathematician :-). Why are you requiring an example from math though? Again, you will be able to make this argument to the NumPy folks when they are ready to change the meaning of A<B for their arrays. > > 4. Compound objects, such as lists, could implement both rich > > and boolean comparisons. The boolean comparison would remain as > > is, while the rich comparison would return a list of boolean > > values. Current behavior doesn't change; just a new feature, which > > you may or may not choose to use, is added. > > > > If we go one step further and add the matrix-style operators along > > with the comparison operators, we can provide a consistent user > > interface to array/complex operations without changing current Python > > behavior.
'+:', '<:', etc.; and will define element-wise addition, > > > element-wise less-than, etc. The traditional '*', '/', etc. operators > > > can then be used for matrix operations, which will appease the Matlab > > > people. > > > > > > Therefore, I don't think rich comparisons and matrix-type operators > > > should be considered separable. I really think you should consider > > > this suggestion. It appeases many groups while providing a consistent > > > and clear user interface, while greatly impacting current Python > > > behavior. > > The last phrase should read: "while not greatly impacting current > --- > Python behavior." I don't see any argument for elementwise operators here that I haven't heard before, and AFAIK it's all in the two PEPs. > > > Always-causing-havoc-at-the-last-moment-ly Yours, > > > > I think you misunderstand. Rich comparisons are mostly about allowing > > the separate overloading of <, <=, ==, !=, >, and >=. This is useful > > in its own light. > > No, I do understand. I've read most of the early discussions on this > issue and one of those issues was about having to choose between > boolean and rich comparisons and what should take precedence, when > both may be appropriate. I'm suggesting an alternative here. Note that Python doesn't decide which should take precedence. The implementer of an individual extension type decides what his comparison operators will return. > > If you don't want to use this overloading facility for elementwise > > comparisons in NumPy, that's fine with me. Nobody says you have to -- > > it's just that you *could*. > > Yes, I understand. > > > Read my lips: there won't be *any* new operators in 2.1. > > OK, I didn't expect this to make it into 2.1. > > > There will be a better way to overload the existing Boolean operators, > > and they will be able to return non-Boolean results. That's useful in > > other situations besides NumPy. > > Yes, I agree, this should be done anyway.
I'm just not sure that the > implicit meaning that these comparison operators are being given is > the best one. I'm just looking for ways to incorporate rich > comparisons into a broader framework, numpy just currently happens to > be the primary example of this proposal. > > Assuming the current comparison operator overloading is already > implemented and has been used to implement rich comparisons for some > objects, then my rich comparison proposal would cause confusion. This > is what I'm trying to avoid. AFAIK, rich comparisons haven't been used anywhere to return non-Boolean results. > > Feel free to lobby for elementwise operators -- but based on the > > discussion about this subject so far, I don't give it much of a chance > > even past Python 2.1. They would add a lot of baggage to the language > > (e.g. the table of operators in all Python books would be about twice > > as long) and by far the most users don't care about them. (Read the > > intro to 211 for some of the concerns -- this PEP tries to make the > > addition palatable by adding exactly *one* new operator.) > > So! Introductory books don't have to discuss these additional > operators. I don't have to know about XML and socket modules to start > using Python effectively, nor do I have to know about 'zip' or list > comprehensions. These additions decrease the code size and increase > efficiency, but don't really add any new expressive power that can't > already be done by a 'for' loop. > > I'll try to convince myself that this suggestion is crazy and not > bother you with this issue for awhile. Happy holidays nevertheless. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one@home.com Mon Dec 18 18:38:13 2000 From: tim.one@home.com (Tim Peters) Date: Mon, 18 Dec 2000 13:38:13 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <14910.16431.554136.374725@nem-srvr.stsci.edu> Message-ID: [Paul Barrett] > ... 
> Can someone provide an example in mathematics where comparison > operators are used in a non-boolean, ie. rich comparison, context. > If so, this might shut me up! By my informal accounting, over the years there have been more requests for three-outcome comparison operators than for elementwise ones, although the three-outcome lobby isn't organized so is less visible. It's a natural request for anyone working with partial orderings (a < b -> one of {yes, no, unordered}). Another large group of requests comes from people working with variants of fuzzy logic, where it's desired that the comparison operators be definable to return floats (intuitively corresponding to the probability that the stated relation "is true"). Another desire comes from the symbolic math camp, which would like to be able to-- as is possible for "+", "*", etc --define "<" so that e.g. "x < y" return an object capturing that somebody *asked* for "x < y"; they're not interested in numeric or Boolean results so much as symbolic expressions. "<" is used for all these things in the literature too. Whatever. "<" and friends are just collections of pixels. Add 300 new operator symbols, and people will want to redefine all of them at will too. draw-a-line-in-the-sand-and-the-wind-blows-it-away-ly y'rs - tim From tim.one@home.com Mon Dec 18 20:37:13 2000 From: tim.one@home.com (Tim Peters) Date: Mon, 18 Dec 2000 15:37:13 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > ... > If you're saying that we should give users ample time for the > transition, I'm with you. Then we're with each other, for suitably large values of "ample" . > If you're saying that you think the string module is too prominent to > ever start deprecating its use, I'm afraid we have a problem. We may. Time will tell. It needs a conversion tool, else I think it's unsellable. > ... 
> I'd also like to note that using the string module's wrappers incurs > the overhead of a Python function call -- using string methods is > faster. > > Finally, I like the look of fields[i].strip().lower() much better than > that of string.lower(string.strip(fields[i])) -- an actual example > from mimetools.py. I happen to like string methods better myself; I don't think that's at issue (except that loads of people apparently don't like "join" as a string method -- idiots ). The issue to me is purely breaking old code someday -- "string" is in very heavy use, and unlike as when deprecating regex in favor of re (either pre or especially sre), string methods aren't orders of magnitude better than the old way; and also unlike regex-vs-re it's not the case that the string module has become unmaintainable (to the contrary, string.py has become trivial). IOW, this one would be unprecedented fiddling. > ... > Note that I believe Java makes a useful distinction that PEP 5 misses: > it defines both deprecated features and obsolete features. > *Deprecated* features are simply features for which a better > alternative exists. *Obsolete* features are features that are only > being kept around for backwards compatibility. Deprecated features > may also be (and usually are) *obsolescent*, meaning they will become > obsolete in the future. I agree it would be useful to define these terms, although those particular definitions appear to be missing the most important point from the user's POV (not a one says "going away someday"). A Google search on "java obsolete obsolescent deprecated" doesn't turn up anything useful, so I doubt the usages you have in mind come from Java (it has "deprecated", but doesn't appear to have any well-defined meaning for the others). In keeping with the religious nature of the battle-- and religion offers precise terms for degrees of damnation! 
--I suggest: struggling -- a supported feature; the initial state of all features; may transition to Anathematized anathematized -- this feature is now cursed, but is supported; may transition to Condemned or Struggling; intimacy with Anathematized features is perilous condemned -- a feature scheduled for crucifixion; may transition to Crucified, Anathematized (this transition is called "a pardon"), or Struggling (this transition is called "a miracle"); intimacy with Condemned features is suicidal crucified -- a feature that is no longer supported; may transition to Resurrected resurrected -- a once-Crucified feature that is again supported; may transition to Condemned, Anathematized or Struggling; although since Resurrection is a state of grace, there may be no point in human time at which a feature is identifiably Resurrected (i.e., it may *appear*, to the unenlightened, that a feature moved directly from Crucified to Anathematized or Struggling or Condemned -- although saying so out loud is heresy). From tismer@tismer.com Mon Dec 18 22:58:03 2000 From: tismer@tismer.com (Christian Tismer) Date: Mon, 18 Dec 2000 23:58:03 +0100 Subject: [Python-Dev] The Dictionary Gem is polished! References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> <200012181420.JAA25063@cj20424-a.reston1.va.home.com> Message-ID: <3A3E967B.BE404114@tismer.com> Guido van Rossum wrote: [me, expanding on hashes, integers,and how to tame them cheaply] > Ai. I think what happened is this: long ago, the hash table sizes > were primes, or at least not powers of two! At some time I will wake up and they tell me that I'm reducible :-) > I'll leave it to the more mathematically-inclined to judge your > solution... I love small lists! - ciao - chris +1 (being a member, hopefully) -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From greg@cosc.canterbury.ac.nz Mon Dec 18 23:04:42 2000 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Dec 2000 12:04:42 +1300 (NZDT) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Message-ID: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz> [Paul Barrett] > ... > Can someone provide an example in mathematics where comparison > operators are used in a non-boolean, ie. rich comparison, context. > If so, this might shut me up! Not exactly mathematical, but some day I'd like to create a database access module which lets you say things like mydb = OpenDB("inventory") parts = mydb.parts tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42) Of course, to really make this work I need to be able to overload "and" and "or" as well, but that's a whole 'nother PEP... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido@python.org Mon Dec 18 23:32:51 2000 From: guido@python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 18:32:51 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Tue, 19 Dec 2000 12:04:42 +1300." 
<200012182304.MAA02642@s454.cosc.canterbury.ac.nz> References: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz> Message-ID: <200012182332.SAA18456@cj20424-a.reston1.va.home.com> > Not exactly mathematical, but some day I'd like to create > a database access module which lets you say things like > > mydb = OpenDB("inventory") > parts = mydb.parts > tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42) > > Of course, to really make this work I need to be able > to overload "and" and "or" as well, but that's a whole > 'nother PEP... Believe it or not, in 1998 we already had a suggestion for overloading these too. This is hinted at in David Ascher's proposal (the Appendix of PEP 208) where objects could define __boolean_and__ to overload x Message-ID: Sounds good to me! It's a very cheap way to get the high bits into play. > i = (~_hash) & mask The ~ here seems like pure superstition to me (and the comments in the C code don't justify it at all -- I added a nag of my own about that the last time I checked in dictobject.c -- and see below for a bad consequence of doing ~). > # note that we do not mask! > # even the shifting my not be worth it. > incr = _hash ^ (_hash >> 3) The shifting was just another cheap trick to get *some* influence from the high bits. It's very limited, though. Toss it (it appears to be from the "random operations yield random results" matchbook school of design). [MAL] > BTW, would changing the hash function on strings from the simple > XOR scheme to something a little smarter help improve the performance > too (e.g. most strings used in programming never use the 8-th > bit) ? Don't understand -- the string hash uses multiplication: x = (1000003*x) ^ *p++; in a loop. Replacing "^" there by "+" should yield slightly better results. As is, string hashes are a lot like integer hashes, in that "consecutive" strings J001 J002 J003 J004 ... yield hashes very close together in value. 
But, because the current dict algorithm uses ~ on the full hash but does not use ~ on the initial increment, (~hash)+incr too often yields the same result for distinct hashes (i.e., there's a systematic (but weak) form of clustering). Note that Python is doing something very unusual: hashes are *usually* designed to yield an approximation to randomness across all bits. But Python's hashes never achieve that. This drives theoreticians mad (like the fellow who originally came up with the GF idea), but tends to work "better than random" in practice (e.g., a truly random hash function would almost certainly produce many collisions when fed a fat range of consecutive integers but still less than half the table size; but Python's trivial "identity" integer hash produces no collisions in that common case). [Christian] > - The bits used from the string hash are not well distributed > - using a "warmup wheel" on the hash to suck all bits in > gives the same quality of hashes like random numbers. See above and be very cautious: none of Python's hash functions produce well-distributed bits, and-- in effect --that's why Python dicts often perform "better than random" on common data. Even what you've done so far appears to provide marginally worse statistics for Guido's favorite kind of test case ("worse" in two senses: total number of collisions (a measure of amortized lookup cost), and maximum collision chain length (a measure of worst-case lookup cost)): d = {} for i in range(N): d[repr(i)] = i check-in-one-thing-then-let-it-simmer-ly y'rs - tim From tismer@tismer.com Tue Dec 19 01:16:27 2000 From: tismer@tismer.com (Christian Tismer) Date: Tue, 19 Dec 2000 02:16:27 +0100 Subject: [Python-Dev] The Dictionary Gem is polished! References: Message-ID: <3A3EB6EB.C79A3896@tismer.com> This is a multi-part message in MIME format. 
--------------E592273E7D1C3FC9F78A4489
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Greg Wilson wrote:
>
> > > > Here some results, dictionaries have 1000 entries:
> > I will publish some results later today.
>
> In Doctor Dobb's Journal, right? :-) We'd *really* like this article...

Well, the results are not so bad:

I stopped testing computation time for the Python dictionary
implementation, in favor of "trips": how many trips does the re-hash
take in a dictionary?

Tests were run for dictionaries of size 1000, 2000, 3000, 4000.

Dictionary 1 consists of i, formatted as a string.
Dictionary 2 consists of strings containing the binary of i.
Dictionary 3 consists of random numbers.
Dictionary 4 consists of i << 16.

Algorithms:
old  is the original dictionary algorithm implemented in Python
     (probably quite correct now, using longs :-)
new  is the proposed incremental bits-suck-in-division algorithm.
new2 is a version of new, where all extra bits of the hash function
     are wheeled in, in advance. The computation time of this is not
     negligible, so please use this result for reference only.
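(Why Dictionary 4 is the pathological one can be seen without running the whole test. A minimal sketch, assuming an identity integer hash and an example table size of 1024: masking keys of the form i << 16 keeps only zero bits.)

```python
# With an identity hash and a power-of-two table, the slot index is
# hash & (size - 1). Keys of the form i << 16 have no bits below bit 16,
# so every key masks to slot 0 -- a complete hash clash.
size = 1024
mask = size - 1
slots = {(i << 16) & mask for i in range(1000)}
assert slots == {0}
```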
Here are the results (bad integers (old) not computed for N > 1000):

"""
D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293 new=302 new2=221
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=499500 new=13187 new2=999
trips for random integers old=377 new=371 new2=393
trips for windows names old=230 new=207 new2=200
N=2000
trips for strings old=1093 new=1109 new2=786
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=26455 new2=1999
trips for random integers old=691 new=710 new2=741
trips for windows names old=503 new=542 new2=564
N=3000
trips for strings old=810 new=839 new2=609
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=38681 new2=2999
trips for random integers old=762 new=740 new2=735
trips for windows names old=712 new=711 new2=691
N=4000
trips for strings old=1850 new=1843 new2=1375
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=52994 new2=3999
trips for random integers old=1440 new=1450 new2=1414
trips for windows names old=1449 new=1434 new2=1457
D:\crml_doc\platf\py>
"""

Interpretation:
---------------
Short numeric strings show a slightly too high trip number. This means
that the hash() function could be enhanced. But the effect would be
below 10 percent compared to random hashes, therefore not worth it.

Binary representations of numbers as strings still create perfect
hash numbers.

Bad integers (complete hash clash due to a high power of 2) are handled
fairly well by the new algorithm. "new2" shows that they can be brought
down to nearly perfect hashes just by applying the "hash melting wheel".

Windows names are almost all upper case, and almost verbose. They appear
to perform nearly as well as random numbers. This means: the Python
string hash function is very good for a wide area of applications.

In summary: I would try to modify the string hash function slightly for
short strings, but only if this does not negatively affect the results
above.
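(The guarantee both the old and new probe sequences rely on is that the irreducible polynomials cycle through GF(2^n)-{0}. A minimal sketch in today's Python, using the 16 + 3 entry from the polys table, checks that the division-based step really visits every nonzero 4-bit value exactly once:)

```python
# One step of the division-based probe update, as in the "new" algorithm:
#   if incr & 1: incr ^= poly
#   incr >>= 1
# For the GF(2^4) polynomial 16 + 3 = 19, the sequence starting at 1
# must visit all of 1..15 before repeating.
poly = 16 + 3
incr = 1
seen = []
while True:
    if incr & 1:
        incr ^= poly
    incr >>= 1
    if incr in seen:
        break
    seen.append(incr)

assert len(seen) == 15
assert sorted(seen) == list(range(1, 16))
```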
Summary of summary:
There is no really low-hanging fruit in string hashing.

ciao - chris

--
Christian Tismer             :^)
Mission Impossible 5oftware  :    Have a break! Take a ride on Python's
Kaunstr. 26                  :   *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9  9D15 D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com

--------------E592273E7D1C3FC9F78A4489
Content-Type: text/plain; charset=us-ascii; name="dictest.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="dictest.py"

## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/

polys = [
    4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3, 256 + 29,
    512 + 17, 1024 + 9, 2048 + 5, 4096 + 83, 8192 + 27, 16384 + 43,
    32768 + 3, 65536 + 45, 131072 + 9, 262144 + 39, 524288 + 39,
    1048576 + 9, 2097152 + 5, 4194304 + 3, 8388608 + 33, 16777216 + 27,
    33554432 + 9, 67108864 + 71, 134217728 + 39, 268435456 + 9,
    536870912 + 5, 1073741824 + 83,
    0
]
polys = map(long, polys)

class NULL: pass

class Dictionary:
    dummy = ""

    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg > 1
        mp.trips = 0

    def getTrips(self):
        trips = self.trips
        self.trips = 0
        return trips

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3)  # rec slots
        dummy = mp.dummy
        mask = mp.ma_size - 1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0):
                return ep
            freeslot = NULL
        ###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.
            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffL
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffL ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2 >> 1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)
        ###### TO HERE
        if (not incr):
            incr = mask
        while 1:
            mp.trips = mp.trips + 1
            ep = ep0[int((i + incr) & mask)]
            if (ep[me_key] is NULL):
                if (freeslot is not NULL):
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy):
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                  (ep[me_hash] == _hash and
                   cmp(ep[me_key], key) == 0)):
                return ep
            # Cycle through GF(2^n)-{0}
            ###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
            ###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3)  # rec slots
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL):
            old_value = ep[me_value]
            ep[me_value] = value
        else:
            if (ep[me_key] is NULL):
                mp.ma_fill = mp.ma_fill + 1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used + 1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3)  # rec slots
        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused):
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1
        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL
        newtable = map(lambda x, y=_nullentry: y[:], range(newsize))
        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0
        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key], ep[me_hash], ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3)  # rec slots
        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
        ## /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill * 3 >= mp.ma_size * 2):
            if (mp.dictresize(mp.ma_used * 2) != 0):
                if (mp.ma_fill + 1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3)  # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3)  # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3)  # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append((_key, _value))
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    for i in range(n):
        s = str(i) #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = i
    return d1, d2, d3

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    for i in range(n):
        s = chr(i % 256) + chr(i >> 8)
        d1[s] = d2[s] = d3[s] = i
    return d1, d2, d3

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint - 1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = i
    return d1, d2, d3

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << 16
        d2[bad] = d3[bad] = i
        if n <= 1000: d1[bad] = i
    return d1, d2, d3

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n:
        keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = s
    return d1, d2, d3

def do_test(dict):
    keys = dict.keys()
    dict.getTrips()  # reset
    test(keys, dict)
    return dict.getTrips()

EXTREME = 1

if __name__ == "__main__":
    for N in (1000, 2000, 3000, 4000):
        sdold, sdnew, sdnew2 = string_dicts(N)
        idold, idnew, idnew2 = istring_dicts(N)
        bdold, bdnew, bdnew2 = badnum_dicts(N)
        rdold, rdnew, rdnew2 = random_dicts(N)
        ndold, ndnew, ndnew2 = names_dicts(N)
        print "N=%d" % N
        print "trips for strings old=%d new=%d new2=%d" % tuple(
            map(do_test, (sdold, sdnew, sdnew2)))
        print "trips for bin strings old=%d new=%d new2=%d" % tuple(
            map(do_test, (idold, idnew, idnew2)))
        print "trips for bad integers old=%d new=%d new2=%d" % tuple(
            map(do_test, (bdold, bdnew, bdnew2)))
        print "trips for random integers old=%d new=%d new2=%d" % tuple(
            map(do_test, (rdold, rdnew, rdnew2)))
        print "trips for windows names old=%d new=%d new2=%d" % tuple(
            map(do_test, (ndold, ndnew, ndnew2)))

"""
Results with a shift of 10 (EXTREME=0):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""

--------------E592273E7D1C3FC9F78A4489--

From tismer@tismer.com  Tue Dec 19 01:51:32 2000
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 02:51:32 +0100
Subject: [Python-Dev] Re: The
Dictionary Gem is polished! References: Message-ID: <3A3EBF23.750CF761@tismer.com>

Tim Peters wrote:
>
> Sounds good to me!  It's a very cheap way to get the high bits into play.

That's what I wanted to hear. It's also the reason why I try to stay
conservative: Just do an obviously useful bit, but do not break any of
the inherent benefits, like those "better than random" amenities.
Python's dictionary algorithm appears to be "near perfect", of the
"never touch it but very carefully, or redo it completely" kind. I
tried the tightrope walk of just adding a tiny topping.

> >     i = (~_hash) & mask

Yes, that stuff was 2 hours last nite :-)
I just decided to not touch it. Arbitrary crap!
Although an XOR with hash >> number of mask bits would perform much
better (in many cases, but not all). Anyway, simple shifting cannot
solve general bit distribution problems. Nor can I :-)

> The ~ here seems like pure superstition to me (and the comments in the C
> code don't justify it at all -- I added a nag of my own about that the last
> time I checked in dictobject.c -- and see below for a bad consequence of
> doing ~).
>
> >     # note that we do not mask!
> >     # even the shifting may not be worth it.
> >     incr = _hash ^ (_hash >> 3)
>
> The shifting was just another cheap trick to get *some* influence from the
> high bits.  It's very limited, though.  Toss it (it appears to be from the
> "random operations yield random results" matchbook school of design).

Now, comment it out, and you see my new algorithm perform much worse.
I just kept it since it had an advantage in "my case" (bad guy, I know).
And I wanted to have an argument for my change to get accepted.
"No cost, just profit, nearly the same" was what I tried to sell.

> [MAL]
> > BTW, would changing the hash function on strings from the simple
> > XOR scheme to something a little smarter help improve the performance
> > too (e.g. most strings used in programming never use the 8-th
> > bit) ?
>
> Don't understand -- the string hash uses multiplication:
>
>     x = (1000003*x) ^ *p++;
>
> in a loop. Replacing "^" there by "+" should yield slightly better
> results.

For short strings, this prime has a bad influence on the low bits,
making it perform suboptimally for small dicts.
See the new2 algo, which funnily corrects for that.
The reason is obvious: Just look at the bit pattern
of 1000003: '0xf4243'

Without giving proof, this smells like bad bit distribution on small
strings to me. You smell it too, right?

> As is, string hashes are a lot like integer hashes, in that
> "consecutive" strings
>
>     J001
>     J002
>     J003
>     J004
>     ...
>
> yield hashes very close together in value.

A bad generator in that case. I'll look for a better one.

> But, because the current dict algorithm uses ~ on the full hash but
> does not use ~ on the initial increment, (~hash)+incr too often yields
> the same result for distinct hashes (i.e., there's a systematic (but
> weak) form of clustering).

You name it.

> Note that Python is doing something very unusual: hashes are *usually*
> designed to yield an approximation to randomness across all bits. But
> Python's hashes never achieve that. This drives theoreticians mad (like
> the fellow who originally came up with the GF idea), but tends to work
> "better than random" in practice (e.g., a truly random hash function
> would almost certainly produce many collisions when fed a fat range of
> consecutive integers but still less than half the table size; but
> Python's trivial "identity" integer hash produces no collisions in that
> common case).

A good reason to be careful with changes (ahem).

> [Christian]
> > - The bits used from the string hash are not well distributed
> > - using a "warmup wheel" on the hash to suck all bits in
> >   gives the same quality of hashes like random numbers.
> > See above and be very cautious: none of Python's hash functions produce > well-distributed bits, and-- in effect --that's why Python dicts often > perform "better than random" on common data. Even what you've done so far > appears to provide marginally worse statistics for Guido's favorite kind of > test case ("worse" in two senses: total number of collisions (a measure of > amortized lookup cost), and maximum collision chain length (a measure of > worst-case lookup cost)): > > d = {} > for i in range(N): > d[repr(i)] = i Nah, I did quite a lot of tests, and the trip number shows a variation of about 10%, without judging old or new for better. This is just the randomness inside. > check-in-one-thing-then-let-it-simmer-ly y'rs - tim This is why I think to be even more conservative: Try to use a division wheel, but with the inverses of the original primitive roots, just in order to get at Guido's results :-) making-perfect-hashes-of-interneds-still-looks-promising - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From greg@cosc.canterbury.ac.nz Tue Dec 19 03:07:56 2000 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Dec 2000 16:07:56 +1300 (NZDT) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <200012182332.SAA18456@cj20424-a.reston1.va.home.com> Message-ID: <200012190307.QAA02663@s454.cosc.canterbury.ac.nz> > The problem I have with this is that the code to evaluate g() has to > be generated twice! I have an idea how to fix that. There need to be two methods, __boolean_and_1__ and __boolean_and_2__. The first operand is evaluated and passed to __boolean_and_1__. 
If it returns a result, that becomes the result of the expression, and the second operand is short-circuited. If __boolean_and_1__ raises a NeedOtherOperand exception (or there is no __boolean_and_1__ method), the second operand is evaluated, and both operands are passed to __boolean_and_2__. The bytecode would look something like BOOLEAN_AND_1 label BOOLEAN_AND_2 label: ... I'll make a PEP out of this one day if I get enthusiastic enough. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one@home.com Tue Dec 19 04:55:33 2000 From: tim.one@home.com (Tim Peters) Date: Mon, 18 Dec 2000 23:55:33 -0500 Subject: [Python-Dev] The Dictionary Gem is polished! In-Reply-To: <3A3EB6EB.C79A3896@tismer.com> Message-ID: Something else to ponder: my tests show that the current ("old") algorithm performs much better (somewhat worse than "new2" == new algorithm + warmup) if incr is simply initialized like so instead: if mp.oldalg: incr = (_hash & 0xffffffffL) % (mp.ma_size - 1) That's another way to get all the bits to contribute to the result. Note that a mod by size-1 is analogous to "casting out nines" in decimal: it's the same as breaking hash into fixed-sized pieces from the right (10 bits each if size=2**10, etc), adding the pieces together, and repeating that process until only one piece remains. IOW, it's a degenerate form of division, but works well all the same. It didn't improve over that when I tried a mod by the largest prime less than the table size (which suggests we're sucking all we can out of the *probe* sequence given a sometimes-poor starting index). However, it's subject to the same weak clustering phenomenon as the old method due to the ill-advised "~hash" operation in computing the initial index. 
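(The "casting out nines" analogy can be checked directly. A minimal sketch, with the 10-bit piece width matching a table of size 2**10: folding fixed-width pieces together really is a mod by size-1, since 2**bits == 1 modulo 2**bits - 1.)

```python
def fold_mod(h, bits=10):
    # Binary "casting out nines": repeatedly break h into bits-wide
    # pieces from the right and add them; the sum is congruent to h
    # mod (2**bits - 1), and the loop ends with the remainder itself.
    m = (1 << bits) - 1
    while h > m:
        h = (h & m) + (h >> bits)
    return h if h < m else 0

for h in (0, 1, 1023, 1024, 123456789, 2**32 - 1):
    assert fold_mod(h) == h % 1023
```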
If ~ is also thrown away, it's as good as new2 (here I've tossed out the
"windows names", and "old" == existing algorithm except (a) get rid of ~
when computing index and (b) do mod by size-1 when computing incr):

N=1000
trips for strings old=230 new=261 new2=239
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=999 new=13212 new2=999
trips for random integers old=399 new=421 new2=410
N=2000
trips for strings old=787 new=1066 new2=827
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=26481 new2=1999
trips for random integers old=652 new=733 new2=650
N=3000
trips for strings old=547 new=760 new2=617
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=38701 new2=2999
trips for random integers old=724 new=743 new2=768
N=4000
trips for strings old=1311 new=1657 new2=1315
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=53014 new2=3999
trips for random integers old=1476 new=1587 new2=1493

The new and new2 values differ in minor ways from the ones you posted
because I got rid of the ~ (the ~ has a bad interaction with "additive"
means of computing incr, because the ~ tends to move the index in the
opposite direction, and these moves in opposite directions tend to
cancel out when computing incr+index the first time).

too-bad-mod-is-expensive!-ly y'rs  - tim

From tim.one@home.com  Tue Dec 19 05:50:01 2000
From: tim.one@home.com (Tim Peters)
Date: Tue, 19 Dec 2000 00:50:01 -0500
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: <20001217220008.D29681@xs4all.nl>
Message-ID:

[Tim]
> Starting last night, I get this msg whenever I update Python code w/
> CVSROOT=:ext:tim_one@cvs.python.sourceforge.net:/cvsroot/python:
>
> """
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @ WARNING: HOST IDENTIFICATION HAS CHANGED! @
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> Someone could be eavesdropping on you right now
> (man-in-the-middle attack)!
> It is also possible that the host key has just been changed.
> Please contact your system administrator.
> Add correct host key in C:\Code/.ssh/known_hosts to get rid of
> this message.
> Password authentication is disabled to avoid trojan horses.
> """
>
> This is SourceForge's doing, and is permanent (they've changed
> keys on their end).
> ...

[Thomas Wouters]
> What sourceforge did was switch Linux distributions, and upgrade.
> The switch doesn't really matter for the SSH problem, because recent
> Debian and recent RedHat releases both use a new ssh, the OpenBSD
> ssh implementation.
> Apparently, it isn't entirely backwards compatible to old versions of
> F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This
> might or might not be your problem; if it is, you should get a decent
> message that gives a relatively clear message such as 'cypher type 'idea'
> not supported'.
> ... [and quite a bit more] ...

I hope you're feeling better today. "The problem" was the one the
warning msg spelled out: "It is also possible that the host key has just
been changed.". SF changed keys. That's the whole banana right there.
Deleting the sourceforge keys from known_hosts fixed it (== convinced
ssh to install new SF keys the next time I connected).
Java elevated that last one to a compile-time error, via its "definite
assignment" rules: you not only have to make sure a local is bound
before reference, you have to make it *obvious* to the compiler that
it's bound before reference. I think this is a Good Thing, because with
intense training, people can learn to think like a compiler too.

Seriously, in several of the cases where gcc warned about "maybe used
before set" in the Python implementation, the warnings were bogus but it
was non-trivial to deduce that. Such code is very brittle under
modification, and the definite assignment rules make that path to error
a non-starter. Example:

def f(N):
    if N > 0:
        for i in range(N):
            if i == 0:
                j = 42
            else:
                f2(i)
    elif N <= 0:
        j = 24
    return j

It's a Crime Against Humanity to make the code reader *deduce* that j is
always bound by the time "return" is executed.

From guido@python.org  Tue Dec 19 06:08:14 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 01:08:14 -0500
Subject: [Python-Dev] Error: syncmail script missing
Message-ID: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>

I just checked in the documentation for the warnings module. (Check it
out!)

When I ran "cvs commit" in the Doc directory, it said, amongst other
things:

    sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory

I suppose this may be a side effect of the transition to new hardware
of the SourceForge CVS archive. (Which, by the way, has dramatically
improved the performance of typical CVS operations -- I am no longer
afraid to do a cvs diff or cvs log in Emacs, or to do a cvs update just
to be sure.)

Could some of the Powers That Be (Fred or Barry :-) check into what
happened to the syncmail script?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake@acm.org  Tue Dec 19 06:10:04 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 19 Dec 2000 01:10:04 -0500 (EST) Subject: [Python-Dev] Error: syncmail script missing In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> Message-ID: <14910.64444.662460.48236@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > Could some of the Powers That Be (Fred or Barry :-) check into what > happened to the syncmail script? We've seen this before, but I'm not sure what it was. Barry, do you recall? Had the Python interpreter landed in a different directory? Or perhaps the location of the CVS repository is different, so syncmail isn't where loginfo says. Tomorrow... scp to SF appears broken as well. ;( -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From tim.one@home.com Tue Dec 19 06:16:15 2000 From: tim.one@home.com (Tim Peters) Date: Tue, 19 Dec 2000 01:16:15 -0500 Subject: [Python-Dev] Error: syncmail script missing In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > I just checked in the documentation for the warnings module. (Check > it out!) Everyone should note that this means Guido will be taking his traditional post-release vacation almost immediately . > When I ran "cvs commit" in the Doc directory, it said, amongst other > things: > > sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory > > I suppose this may be a side effect of the transition to new hardware > of the SourceForge CVS archive. The lack of checkin mail was first noted on a Jython list. Finn wisely replied that he'd just sit back and wait for the CPython people to figure out how to fix it. > ... > Could some of the Powers That Be (Fred or Barry :-) check into what > happened to the syncmail script? Don't worry, I'll do my part by nagging them in your absence . Bon holiday voyage! 
From cgw@fnal.gov  Tue Dec 19 06:32:15 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Tue, 19 Dec 2000 00:32:15 -0600 (CST)
Subject: [Python-Dev] cycle-GC question
Message-ID: <14911.239.12288.546710@buffalo.fnal.gov>

The following program:

    import rexec
    while 1:
        x = rexec.RExec()
        del x

leaks memory at a fantastic rate.

It seems clear (?) that this is due to the call to "set_rexec" at
rexec.py:140, which creates a circular reference between the `rexec'
and `hooks' objects. (There's even a nice comment to that effect).

I'm curious however as to why the spiffy new cyclic-garbage collector
doesn't pick this up?

Just-wondering-ly y'rs,
cgw

From tim_one@email.msn.com  Tue Dec 19 09:24:18 2000
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 19 Dec 2000 04:24:18 -0500
Subject: [Python-Dev] RE: The Dictionary Gem is polished!
In-Reply-To: <3A3EBF23.750CF761@tismer.com>
Message-ID:

[Christian Tismer]
> ...
> For short strings, this prime has a bad influence on the low bits,
> making it perform suboptimally for small dicts.
> See the new2 algo, which funnily corrects for that.
> The reason is obvious: Just look at the bit pattern
> of 1000003: '0xf4243'
>
> Without giving proof, this smells like bad bit distribution on small
> strings to me. You smell it too, right?
> ...

[Tim]
> As is, string hashes are a lot like integer hashes, in that
> "consecutive" strings
>
>     J001
>     J002
>     J003
>     J004
>     ...
>
> yield hashes very close together in value.

[back to Christian]
> A bad generator in that case. I'll look for a better one.

Not necessarily!  It's for that same reason "consecutive strings" can
have "better than random" behavior today.  And consecutive strings--
like consecutive ints --are a common case.
Here are the numbers for the synthesized string cases:

N=1000
trips for strings old=293 new=302 new2=221
trips for bin strings old=0 new=0 new2=0
N=2000
trips for strings old=1093 new=1109 new2=786
trips for bin strings old=0 new=0 new2=0
N=3000
trips for strings old=810 new=839 new2=609
trips for bin strings old=0 new=0 new2=0
N=4000
trips for strings old=1850 new=1843 new2=1375
trips for bin strings old=0 new=0 new2=0

Here they are again, after doing nothing except changing the "^" to "+"
in the string hash, i.e. replacing

    x = (1000003*x) ^ *p++;

by

    x = (1000003*x) + *p++;

N=1000
trips for strings old=140 new=127 new2=108
trips for bin strings old=0 new=0 new2=0
N=2000
trips for strings old=480 new=434 new2=411
trips for bin strings old=0 new=0 new2=0
N=3000
trips for strings old=821 new=857 new2=631
trips for bin strings old=0 new=0 new2=0
N=4000
trips for strings old=1892 new=1852 new2=1476
trips for bin strings old=0 new=0 new2=0

The first two sizes are dramatically better, the last two a wash. If
you want to see a real disaster, replace the "+" with "*":

N=1000
trips for strings old=71429 new=6449 new2=2048
trips for bin strings old=81187 new=41117 new2=41584
N=2000
trips for strings old=26882 new=9300 new2=6103
trips for bin strings old=96018 new=46932 new2=42408

I got tired of waiting at that point ...

suspecting-a-better-string-hash-is-hard-to-find-ly y'rs  - tim

From martin@loewis.home.cs.tu-berlin.de  Tue Dec 19 11:58:17 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 12:58:17 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>

> I agree it would be useful to define these terms, although those
> particular definitions appear to be missing the most important point
> from the user's POV (not a one says "going away someday").

PEP 4 says

# Usage of a module may be `deprecated', which means that it may be
# removed from a future Python release.
Proposals for better wording are welcome (and yes, I still have to get
the comments that I got into the document).

Regards,
Martin

From guido@python.org  Tue Dec 19 14:48:47 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 09:48:47 -0500
Subject: [Python-Dev] cycle-GC question
In-Reply-To: Your message of "Tue, 19 Dec 2000 00:32:15 CST."
             <14911.239.12288.546710@buffalo.fnal.gov>
References: <14911.239.12288.546710@buffalo.fnal.gov>
Message-ID: <200012191448.JAA28737@cj20424-a.reston1.va.home.com>

> The following program:
>
>     import rexec
>     while 1:
>         x = rexec.RExec()
>         del x
>
> leaks memory at a fantastic rate.
>
> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects. (There's even a nice comment to that effect).
>
> I'm curious however as to why the spiffy new cyclic-garbage collector
> doesn't pick this up?

Me too. I turned on gc debugging (gc.set_debug(077) :-) and got
messages suggesting that it is not collecting everything. The output
looks like this:

    .
    .
    .
    gc: collecting generation 0...
    gc: objects in each generation: 764 6726 89174
    gc: done.
    gc: collecting generation 1...
    gc: objects in each generation: 0 8179 89174
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 764 0 97235
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 757 747 97184
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 764 1386 97184
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 757 2082 97184
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 764 2721 97184
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 757 3417 97184
    gc: done.
    gc: collecting generation 0...
    gc: objects in each generation: 764 4056 97184
    gc: done.
    .
    .
    .

With the third number growing each time a "generation 1" collection is
done.
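(For reference, the rexec/hooks cycle itself is easy to reproduce in miniature. A minimal sketch, not the real rexec classes; in a current CPython the collector does reclaim a plain cycle like this, and gc.collect() reports the unreachable objects it found.)

```python
import gc

class Hooks:
    pass

class RExec:
    def __init__(self):
        self.hooks = Hooks()
        self.hooks.rexec = self  # the rexec <-> hooks circular reference

gc.collect()          # start from a clean slate
x = RExec()
del x                 # refcounts stay nonzero: the cycle keeps both alive
found = gc.collect()  # a full collection breaks the cycle
assert found >= 2     # at least the RExec and Hooks instances
```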
Maybe Neil can shed some light? The gc.garbage list is empty.

This is about as much as I know about the GC stuff...

--Guido van Rossum (home page: http://www.python.org/~guido/)

From petrilli@amber.org  Tue Dec 19 15:25:18 2000
From: petrilli@amber.org (Christopher Petrilli)
Date: Tue, 19 Dec 2000 10:25:18 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 12:58:17PM +0100
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>
Message-ID: <20001219102518.A14288@trump.amber.org>

So I was thinking about this whole thing, and wondering why it was that
seeing things like:

    " ".join(aList)

bugged me to no end, while:

    aString.lower()

didn't seem to look wrong. I finally put my finger on it, and I haven't
seen anyone mention it, so I guess I'll do so. To me, the concept of
"join" on a string is just not quite kosher; instead it should be
something like this:

    aList.join(" ")

or, if you want it without the indirection:

    ['item', 'item', 'item'].join(" ")

Now *THAT* looks right to me. The example of a join method on a string
just doesn't quite gel in my head, and I did some thinking and digging,
and well, when I pulled up my Smalltalk browser, things like join are
done on Collections, not on Strings. You're joining the collection, not
the string.

Perhaps in a rush to move some things that were "string related" in the
string module into methods on the strings themselves (something I
whole-heartedly support), we moved a few too many things there---things
that semantically don't really belong as methods on a string object.

How this gets resolved, I don't know... but I know a lot of people have
looked at the string methods---and they each keep coming back to 1 or 2
that bug them... and I think it's those that really aren't methods of a
string, but instead something that operates with strings, but expects
other things.
Chris
--
| Christopher Petrilli
| petrilli@amber.org

From guido@python.org Tue Dec 19 15:37:15 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 10:37:15 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: Your message of "Tue, 19 Dec 2000 10:25:18 EST." <20001219102518.A14288@trump.amber.org>
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org>
Message-ID: <200012191537.KAA28909@cj20424-a.reston1.va.home.com>

> So I was thinking about this whole thing, and wondering why it was
> that seeing things like:
>
>     " ".join(aList)
>
> bugged me to no end, while:
>
>     aString.lower()
>
> didn't seem to look wrong.  I finally put my finger on it, and I
> haven't seen anyone mention it, so I guess I'll do so.  To me, the
> concept of "join" on a string is just not quite kosher, instead it
> should be something like this:
>
>     aList.join(" ")
>
> or if you want it without the indirection:
>
>     ['item', 'item', 'item'].join(" ")
>
> Now *THAT* looks right to me.  The example of a join method on a
> string just doesn't quite gel in my head, and I did some thinking and
> digging, and well, when I pulled up my Smalltalk browser, things like
> join are done on Collections, not on Strings.  You're joining the
> collection, not the string.
>
> Perhaps in a rush to move some things that were "string related" in
> the string module into methods on the strings themselves (something I
> whole-heartedly support), we moved a few too many things
> there---things that semantically don't really belong as methods on a
> string object.
>
> How this gets resolved, I don't know... but I know a lot of people
> have looked at the string methods---and they each keep coming back to
> 1 or 2 that bug them... and I think it's those that really aren't
> methods of a string, but instead something that operates with strings,
> but expects other things.
Boy, are you stirring up a can of worms that we've been through many
times before!  Nothing you say hasn't been said at least a hundred
times before, on this list as well as on c.l.py.

The problem is that if you want to make this a method on lists, you'll
also have to make it a method on tuples, and on arrays, and on NumPy
arrays, and on any user-defined type that implements the sequence
protocol...  That's just not reasonable to expect.

There really seem to be only two possibilities that don't have this
problem: (1) make it a built-in, or (2) make it a method on strings.
We chose (2) for uniformity, and to avoid the potential confusion with
os.path.join(), which is sometimes imported as a local.

If " ".join(L) bugs you, try this:

    space = " "  # This could be a global
    .
    .
    .
    s = space.join(L)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry@digicool.com Tue Dec 19 15:46:55 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 10:46:55 -0500
Subject: [Python-Dev] Death to string functions!
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org>
Message-ID: <14911.33519.764029.306876@anthem.concentric.net>

>>>>> "CP" == Christopher Petrilli writes:

CP> So I was thinking about this whole thing, and wondering why it
CP> was that seeing things like:

CP> " ".join(aList)

CP> bugged me to no end, while:

CP> aString.lower()

CP> didn't seem to look wrong.  I finally put my finger on it, and
CP> I haven't seen anyone mention it, so I guess I'll do so.

Actually, it has been debated to death. ;)  This looks better:

    SPACE = ' '
    SPACE.join(aList)

That reads well to me ("space-join this list") and that's how I always
write it.

That said, there are certainly lots of people who agree with you.
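Guido's point that every sequence-protocol type would need its own join() can be made concrete with a minimal sketch; MySeq below is a made-up illustration class, not something from the thread:

```python
class MySeq:
    # A minimal user-defined sequence -- one of the many types that
    # would each need their own join() if join lived on sequences.
    def __init__(self, items):
        self.items = items

    def __len__(self):
        return len(self.items)

    def __getitem__(self, i):
        return self.items[i]

s = MySeq(['a', 'b', 'c'])
print(' '.join(s))          # the string method already handles it
print(hasattr(s, 'join'))   # False -- a hypothetical list.join would not
```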
You can't put join() on sequences though, until you have builtin
base-classes, or interfaces, or protocols or some such construct,
because otherwise you'd have to add it to EVERY sequence, including
classes that act like sequences.

One idea that I believe has merit is to consider adding join() to the
builtins, probably with a signature like:

    join(aList, aString) -> aString

This horse has been whacked pretty good too, but I don't remember
seeing a patch or a pronouncement.

-Barry

From nas@arctrix.com Tue Dec 19 08:53:36 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 19 Dec 2000 00:53:36 -0800
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <200012191448.JAA28737@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Dec 19, 2000 at 09:48:47AM -0500
References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com>
Message-ID: <20001219005336.A303@glacier.fnational.com>

On Tue, Dec 19, 2000 at 09:48:47AM -0500, Guido van Rossum wrote:
> > import rexec
> > while 1:
> >     x = rexec.RExec()
> >     del x
> >
> > leaks memory at a fantastic rate.
> >
> > It seems clear (?) that this is due to the call to "set_rexec" at
> > rexec.py:140, which creates a circular reference between the `rexec'
> > and `hooks' objects.  (There's even a nice comment to that effect).

Line 140 is not the only place a circular reference is created.
There is another one which is trickier to find:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        m.__builtins__ = self.modules['__builtin__']
        return m

If the module being added is __builtin__ then m.__builtins__ = m.
The GC currently doesn't track modules.  I guess it should.  It
might be possible to avoid this circular reference but I don't
know enough about how RExec works.
Would something like:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        if mname != '__builtin__':
            m.__builtins__ = self.modules['__builtin__']
        return m

do the trick?

Neil

From fredrik@effbot.org Tue Dec 19 15:39:49 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 19 Dec 2000 16:39:49 +0100
Subject: [Python-Dev] Death to string functions!
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org>
Message-ID: <008301c069d3$76560a20$3c6340d5@hagrid>

"Christopher Petrilli" wrote:
> didn't seem to look wrong.  I finally put my finger on it, and I
> haven't seen anyone mention it, so I guess I'll do so.  To me, the
> concept of "join" on a string is just not quite kosher, instead it
> should be something like this:
>
>     aList.join(" ")
>
> or if you want it without the indirection:
>
>     ['item', 'item', 'item'].join(" ")
>
> Now *THAT* looks right to me.

why do we keep coming back to this?

aString.join can do anything string.join can do, but
aList.join cannot.  if you don't understand why, check
the archives.

From martin@loewis.home.cs.tu-berlin.de Tue Dec 19 15:44:48 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 16:44:48 +0100
Subject: [Python-Dev] cycle-GC question
Message-ID: <200012191544.QAA11408@loewis.home.cs.tu-berlin.de>

> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects.  (There's even a nice comment to that effect).

It's not all that clear that *this* is the cycle.  In fact, it is not.

> I'm curious however as to why the spiffy new cyclic-garbage
> collector doesn't pick this up?

It's an interesting problem, so I spent this afternoon investigating
it.
I soon found that I need a tool, so I introduced a new function
gc.getreferents which, when given an object, returns a list of objects
referring to that object.  The patch for that feature is in

http://sourceforge.net/patch/?func=detailpatch&patch_id=102925&group_id=5470

Applying that function recursively, I can get an output that looks
like this:

dictionary 0x81f4f24
dictionary 0x81f4f24 (seen)
dictionary 0x81f4f24 (seen)
dictionary 0x8213bc4
dictionary 0x820869c
dictionary 0x820866c (seen)
dictionary 0x8213bf4
dictionary 0x820866c (seen)
dictionary 0x8214144
dictionary 0x820866c (seen)

Each indentation level shows the objects which refer to the outer-next
object, e.g. the dictionary 0x820869c refers to the RExec instance,
and the RHooks instance refers to that dictionary.  Clearly, the
dictionary 0x820869c is the RHooks' __dict__, and the reference
belongs to the 'rexec' key in that dictionary.

The recursion stops only when an object has been seen before (so it's
a cycle, or other non-tree graph), or if there are no referents (the
lists created to do the iteration are ignored).

So it appears that the r_import method is referenced from some
dictionary, but that dictionary is not referenced anywhere???
Checking the actual structures shows that rexec creates a __builtin__
module, which has a dictionary that has an __import__ key.  So the
reference to the method comes from the __builtin__ module, which in
turn is referenced as the RExec's .modules attribute, giving another
cycle.

However, module objects don't participate in garbage collection.
Therefore, gc.getreferents cannot traverse a module, and the garbage
collector won't find a cycle involving a garbage module.  I just
submitted a bug report,

http://sourceforge.net/bugs/?func=detailbug&bug_id=126345&group_id=5470

which suggests that modules should also participate in garbage
collection.
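Martin's gc.getreferents was only a proposed patch at this point; today's CPython exposes the same idea as gc.get_referrers() (with gc.get_referents() for the opposite direction). A minimal sketch of the technique, not Martin's actual code:

```python
import gc

d = {'key': 'value'}
holder = [d]                   # the list refers to the dict

# Ask the collector which tracked objects refer to d; `holder`
# (and the module's globals dict) show up in the answer.
referrers = gc.get_referrers(d)
print(any(r is holder for r in referrers))   # True
```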
Regards,
Martin

From guido@python.org Tue Dec 19 16:01:46 2000
From: guido@python.org (Guido van Rossum)
Date: Tue, 19 Dec 2000 11:01:46 -0500
Subject: [Python-Dev] cycle-GC question
In-Reply-To: Your message of "Tue, 19 Dec 2000 00:53:36 PST." <20001219005336.A303@glacier.fnational.com>
References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com> <20001219005336.A303@glacier.fnational.com>
Message-ID: <200012191601.LAA29015@cj20424-a.reston1.va.home.com>

> might be possible to avoid this circular reference but I don't
> know enough about how RExec works.  Would something like:
>
>     def add_module(self, mname):
>         if self.modules.has_key(mname):
>             return self.modules[mname]
>         self.modules[mname] = m = self.hooks.new_module(mname)
>         if mname != '__builtin__':
>             m.__builtins__ = self.modules['__builtin__']
>         return m
>
> do the trick?

That's certainly a good thing to do (__builtin__ has no business
having a __builtins__!), but (in my feeble experiment) it doesn't make
the leaks go away.

Note that almost every module participates heavily in cycles: whenever
you define a function f(), f.func_globals is the module's __dict__,
which also contains a reference to f.  Similar for classes, with an
extra hop via the class object and its __dict__.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From cgw@fnal.gov Tue Dec 19 16:06:06 2000
From: cgw@fnal.gov (Charles G Waldman)
Date: Tue, 19 Dec 2000 10:06:06 -0600 (CST)
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <20001219005336.A303@glacier.fnational.com>
References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com> <20001219005336.A303@glacier.fnational.com>
Message-ID: <14911.34670.664178.418523@buffalo.fnal.gov>

Neil Schemenauer writes:
>
> Line 140 is not the only place a circular reference is created.
> There is another one which is trickier to find:
>
>     def add_module(self, mname):
>         if self.modules.has_key(mname):
>             return self.modules[mname]
>         self.modules[mname] = m = self.hooks.new_module(mname)
>         m.__builtins__ = self.modules['__builtin__']
>         return m
>
> If the module being added is __builtin__ then m.__builtins__ = m.
> The GC currently doesn't track modules.  I guess it should.  It
> might be possible to avoid this circular reference but I don't
> know enough about how RExec works.  Would something like:
>
>     def add_module(self, mname):
>         if self.modules.has_key(mname):
>             return self.modules[mname]
>         self.modules[mname] = m = self.hooks.new_module(mname)
>         if mname != '__builtin__':
>             m.__builtins__ = self.modules['__builtin__']
>         return m
>
> do the trick?

No... if you change "add_module" in exactly the way you suggest
(without worrying about whether it breaks the functionality of rexec!)
and run the test

    while 1:
        rexec.RExec()

you will find that it still leaks memory at a prodigious rate.
So, (unless there is yet another module-level cyclic reference) I
don't think this theory explains the problem.

From martin@loewis.home.cs.tu-berlin.de Tue Dec 19 16:07:04 2000
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 17:07:04 +0100
Subject: [Python-Dev] cycle-GC question
Message-ID: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>

> There is another one which is trickier to find:
[__builtin__.__builtins__ == __builtin__]

> Would something like: [do not add builtins to builtin]
> work?

No, because there is another one that is even trickier to find :-)

>>> print r
>>> print r.modules['__builtin__'].open.im_self

Please see my other message; I think modules should be gc'ed.
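Guido's earlier remark, that defining any module-level function f() closes a cycle through the module's __dict__, is easy to verify. In modern Python the attribute is f.__globals__; in the 2.0 era it was f.func_globals:

```python
def f():
    pass

# f's global namespace is the module dict, and that dict holds f itself:
print(f.__globals__['f'] is f)   # True -- a cycle through the module
```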
Regards,
Martin

From nas@arctrix.com Tue Dec 19 09:24:29 2000
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 19 Dec 2000 01:24:29 -0800
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 05:07:04PM +0100
References: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>
Message-ID: <20001219012429.A520@glacier.fnational.com>

On Tue, Dec 19, 2000 at 05:07:04PM +0100, Martin v. Loewis wrote:
> I think modules should be gc'ed.

I agree.  It's easy to do.  If no one does it over Christmas I will do
it before 2.1 is released.

Neil

From tismer@tismer.com Tue Dec 19 15:48:58 2000
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 17:48:58 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References:
Message-ID: <3A3F836A.DEDF1011@tismer.com>

Tim Peters wrote:
>
> Something else to ponder: my tests show that the current ("old") algorithm
> performs much better (somewhat worse than "new2" == new algorithm + warmup)
> if incr is simply initialized like so instead:
>
>     if mp.oldalg:
>         incr = (_hash & 0xffffffffL) % (mp.ma_size - 1)

Sure.  I did this as well, but didn't consider a division since it is
said to be too slow.  But this is very platform dependent.  On Pentiums
this might not be noticeable.

> That's another way to get all the bits to contribute to the result.  Note
> that a mod by size-1 is analogous to "casting out nines" in decimal: it's
> the same as breaking hash into fixed-sized pieces from the right (10 bits
> each if size=2**10, etc), adding the pieces together, and repeating that
> process until only one piece remains.  IOW, it's a degenerate form of
> division, but works well all the same.  It didn't improve over that when I
> tried a mod by the largest prime less than the table size (which suggests
> we're sucking all we can out of the *probe* sequence given a sometimes-poor
> starting index).

Again I tried this too.
Instead of the largest near prime I used the nearest prime.
Remarkably, the nearest prime is identical to the primitive element
in a lot of cases.  But no improvement over the modulus.

> However, it's subject to the same weak clustering phenomenon as the old
> method due to the ill-advised "~hash" operation in computing the initial
> index.  If ~ is also thrown away, it's as good as new2 (here I've tossed out
> the "windows names", and "old" == existing algorithm except (a) get rid of ~
> when computing index and (b) do mod by size-1 when computing incr):

...

> The new and new2 values differ in minor ways from the ones you posted
> because I got rid of the ~ (the ~ has a bad interaction with "additive"
> means of computing incr, because the ~ tends to move the index in the
> opposite direction, and these moves in opposite directions tend to cancel
> out when computing incr+index the first time).

Remarkable.

> too-bad-mod-is-expensive!-ly y'rs - tim

Yes.  The wheel is cheapest yet.

ciao - chris

--
Christian Tismer             :^)   Mission Impossible 5oftware
Kaunstr. 26                  :     Have a break! Take a ride on Python's
14163 Berlin                 :     *Starship* http://starship.python.net
PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com

From just@letterror.com Tue Dec 19 17:11:55 2000
From: just@letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 18:11:55 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID:

Barry wrote:
>Actually, it has been debated to death. ;)  This looks better:
>
>    SPACE = ' '
>    SPACE.join(aList)
>
>That reads good to me ("space-join this list") and that's how I always
>write it.

I just did a quick scan through the 1.5.2 library, and _most_
occurrences of string.join() are used with a string constant for the
second argument.  There is a whole bunch of one-arg string.join()'s,
too.
Recommending replacing all of these (not to mention all the code "out
there") with named constants seems plain silly.  Sure, " ".join() is
the most "logical" choice for Python as it stands, but it's definitely
not the most intuitive, as evidenced by the number of times this comes
up on c.l.py: to many people it simply "looks wrong".

Maybe this is the deal: joiner.join() makes a whole lot of sense from
an _implementation_ standpoint, but a whole lot less as a public
interface.  It's easy to explain why join() can't be a method of
sequences (in Python), but that alone doesn't justify a string method.
string.join() is not quite unlike map() and friends: map() wouldn't be
so bad as a sequence method, but that isn't practical for exactly the
same reasons: so it's a builtin.  (And not a function method...)

So, making join() a builtin makes a whole lot of sense.  Not doing
this because people sometimes use a local reference to os.path.join
seems awfully backward.

Hm, maybe joiner.join() could become a special method:
joiner.__join__(), that way other objects could define their own
implementation for join().  (Hm, wouldn't be the worst thing for, say,
a file path object...)

Just

From barry@digicool.com Tue Dec 19 17:20:07 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 12:20:07 -0500
Subject: [Python-Dev] Death to string functions!
References:
Message-ID: <14911.39111.710940.342986@anthem.concentric.net>

>>>>> "JvR" == Just van Rossum writes:

JvR> Recommending replacing all of these (not to mention all the
JvR> code "out there") with named constants seems plain silly.

Until there's a tool to do the migration, I don't (personally)
recommend wholesale migration.  For new code I write though, I usually
do it the way I described (which is intuitive to me, but then so is
moving your fingers at a blinding speed up and down 5 long strips of
metal to cause low bowel-tickling rumbly noises).

JvR> So, making join() a builtin makes a whole lot of sense.
JvR> Not doing this because people sometimes use a local reference to
JvR> os.path.join seems awfully backward.

I agree.  Have we agreed on the semantics and signature of builtin
join() though?  Is it just string.join() stuck in builtins?

-Barry

From fredrik@effbot.org Tue Dec 19 17:25:49 2000
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 19 Dec 2000 18:25:49 +0100
Subject: [Python-Dev] Death to string functions!
References: <14911.39111.710940.342986@anthem.concentric.net>
Message-ID: <012901c069e0$bd724fb0$3c6340d5@hagrid>

Barry wrote:
> JvR> So, making join() a builtin makes a whole lot of sense.  Not
> JvR> doing this because people sometimes use a local reference to
> JvR> os.path.join seems awfully backward.
>
> I agree.  Have we agreed on the semantics and signature of builtin
> join() though?  Is it just string.join() stuck in builtins?

+1

(let's leave the __join__ slot and other super-generalized variants
for 2.2)

From thomas@xs4all.net Tue Dec 19 17:54:34 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 19 Dec 2000 18:54:34 +0100
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: ; from tim.one@home.com on Tue, Dec 19, 2000 at 12:50:01AM -0500
References: <20001217220008.D29681@xs4all.nl>
Message-ID: <20001219185434.E29681@xs4all.nl>

On Tue, Dec 19, 2000 at 12:50:01AM -0500, Tim Peters wrote:
> [Thomas Wouters]
> > What sourceforge did was switch Linux distributions, and upgrade.
> > ... [and quite a bit more] ...

> I hope you're feeling better today.  "The problem" was one the wng
> msg spelled out: "It is also possible that the host key has just been
> changed.".  SF changed keys.  That's the whole banana right there.
> Deleting the sourceforge keys from known_hosts fixed it (== convinced
> ssh to install new SF keys the next time I connected).

Well, if you'd read the thread, you'll notice that other people had
problems even after that.  I'm glad you're not one of them, though :)

--
Thomas Wouters

Hi! I'm a .signature virus!
copy me into your .signature file to help me spread!

From barry@digicool.com Tue Dec 19 18:22:19 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:22:19 -0500
Subject: [Python-Dev] Error: syncmail script missing
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com>
Message-ID: <14911.42843.284822.935268@anthem.concentric.net>

Folks,

Python wasn't installed on the new SF CVS machine, which was why
syncmail was broken.  My thanks to the SF guys for quickly remedying
this situation!

Please give it a test.
-Barry

From barry@digicool.com Tue Dec 19 18:23:32 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 13:23:32 -0500
Subject: [Python-Dev] Error: syncmail script missing
References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> <14911.42843.284822.935268@anthem.concentric.net>
Message-ID: <14911.42916.573600.922606@anthem.concentric.net>

>>>>> "BAW" == Barry A Warsaw writes:

BAW> Python wasn't installed on the new SF CVS machine, which was
BAW> why syncmail was broken.  My thanks to the SF guys for
BAW> quickly remedying this situation!

BTW, it's currently Python 1.5.2.

From tismer@tismer.com Tue Dec 19 17:34:14 2000
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 19:34:14 +0200
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References:
Message-ID: <3A3F9C16.562F9D9F@tismer.com>

This is a multi-part message in MIME format.
--------------F2E36624A7D999AC873AD6CE
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Again...

Tim Peters wrote:
>
> Sounds good to me!  It's a very cheap way to get the high bits into play.
...
> [Christian]
> > - The bits used from the string hash are not well distributed
> > - using a "warmup wheel" on the hash to suck all bits in
> >   gives the same quality of hashes like random numbers.
>
> See above and be very cautious:  none of Python's hash functions produce
> well-distributed bits, and-- in effect --that's why Python dicts often
> perform "better than random" on common data.  Even what you've done so far
> appears to provide marginally worse statistics for Guido's favorite kind of
> test case ("worse" in two senses: total number of collisions (a measure of
> amortized lookup cost), and maximum collision chain length (a measure of
> worst-case lookup cost)):
>
>     d = {}
>     for i in range(N):
>         d[repr(i)] = i

I will look into this.

> check-in-one-thing-then-let-it-simmer-ly y'rs - tim

Are you saying I should check the thing in?  Really?

In another reply to this message I was saying

"""
This is why I think to be even more conservative:
Try to use a division wheel, but with the inverses
of the original primitive roots, just in order to
get at Guido's results :-)
"""

This was a religious desire, but such an inverse cannot exist.  Well,
all inverses exist, but it is an error to think that they can produce
similar bit patterns.  Changing the root means changing the whole
system, since we have just a *representation* of a group, via
polynomial coefficients.

A simple example which renders my thought useless is this: there is no
general pattern that can turn a physical right shift into a left
shift, for all bit combinations.

Anyway, how can I produce a nearly complete scheme like today's with
the same "cheaper than random" properties?

Ok, we have to stick with the given polynomials to stay compatible,
and we also have to shift left.  How do we then rotate the random bits
in?  Well, we can in fact do a rotation of the whole index, moving the
highest bit into the lowest.  Too bad that this isn't supported in C.
It is a native machine instruction on X86 machines.

We would then have:

    incr = ROTATE_LEFT(incr, 1)
    if (incr > mask):
        incr = incr ^ mp.ma_poly

The effect is similar to the "old" algorithm; bits are shifted left.
Only if the hash happens to have high bits do they appear in the
modulus.  On the current "faster than random" cases, I assume that
high bits in the hash are less likely than low bits, so it is more
likely that an entry finds its good place in the dict before bits are
rotated in.  Hence the "good" cases would be kept.

I did all tests again, now including maximum trip length, and added a
"rotate-left" version as well:

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings         old=293/9 new=302/7 new2=221/7 rot=278/5
trips for bad integers    old=499500/999 new=13187/31 new2=999/1 rot=16754/31
trips for random integers old=360/8 new=369/8 new2=358/6 rot=356/7
trips for windows names   old=230/5 new=207/7 new2=200/5 rot=225/5
N=2000
trips for strings         old=1093/11 new=1109/10 new2=786/6 rot=1082/8
trips for bad integers    old=0/0 new=26455/32 new2=1999/1 rot=33524/34
trips for random integers old=704/7 new=686/8 new2=685/7 rot=693/7
trips for windows names   old=503/8 new=542/9 new2=564/6 rot=529/7
N=3000
trips for strings         old=810/5 new=839/6 new2=609/5 rot=796/5
trips for bad integers    old=0/0 new=38681/36 new2=2999/1 rot=49828/38
trips for random integers old=708/5 new=723/7 new2=724/5 rot=722/6
trips for windows names   old=712/6 new=711/5 new2=691/5 rot=738/9
N=4000
trips for strings         old=1850/9 new=1843/8 new2=1375/11 rot=1848/10
trips for bad integers    old=0/0 new=52994/39 new2=3999/1 rot=66356/38
trips for random integers old=1395/9 new=1397/8 new2=1435/9 rot=1394/13
trips for windows names   old=1449/8 new=1434/8 new2=1457/11 rot=1513/9
D:\crml_doc\platf\py>

Concerning trip length, rotate is better than old in most cases.
Random integers seem to withstand any of these procedures.  For bad
integers, rot naturally takes more trips than new, since the path to
the bits is longer.

All in all I don't see more than marginal differences between the
approaches, and I tend to stick with "new", since it is cheapest to
implement.
(It does not cost anything and might instead be a little cheaper for
some compilers, since it does not reference the mask variable.)

I'd say let's do the patch --

ciao - chris

--
Christian Tismer             :^)   Mission Impossible 5oftware
Kaunstr. 26                  :     Have a break! Take a ride on Python's
14163 Berlin                 :     *Starship* http://starship.python.net
PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com

--------------F2E36624A7D999AC873AD6CE
Content-Type: text/plain; charset=us-ascii; name="dictest.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="dictest.py"

## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3, 256 + 29,
    512 + 17, 1024 + 9, 2048 + 5, 4096 + 83, 8192 + 27, 16384 + 43,
    32768 + 3, 65536 + 45, 131072 + 9, 262144 + 39, 524288 + 39,
    1048576 + 9, 2097152 + 5, 4194304 + 3, 8388608 + 33, 16777216 + 27,
    33554432 + 9, 67108864 + 71, 134217728 + 39, 268435456 + 9,
    536870912 + 5, 1073741824 + 83,
    0
]
polys = map(long, polys)

class NULL:
    pass

class Dictionary:
    dummy = ""

    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg==2
        mp.rotleft = newalg==3
        mp.trips = 0
        mp.tripmax = 0

    def getTrips(self):
        trips, tripmax = self.trips, self.tripmax
        self.trips = self.tripmax = 0
        return trips, tripmax

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0):
                return ep
            freeslot = NULL
        ###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.
            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffl
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffl ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2>>1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)
        ###### TO HERE
        if (not incr):
            incr = mask
        triplen = 0
        while 1:
            mp.trips = mp.trips+1
            triplen = triplen+1
            if triplen > mp.tripmax:
                mp.tripmax = triplen
            ep = ep0[int((i+incr)&mask)]
            if (ep[me_key] is NULL):
                if (freeslot is not NULL):
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy):
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                  (ep[me_hash] == _hash and
                   cmp(ep[me_key], key) == 0)):
                return ep
            # Cycle through GF(2^n)-{0}
            ###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            elif mp.rotleft:
                if incr & 0x80000000L:
                    incr = (incr << 1) | 1
                else:
                    incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
            ###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL):
            old_value = ep[me_value]
            ep[me_value] = value
        else:
            if (ep[me_key] is NULL):
                mp.ma_fill = mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots
        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused):
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1
        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL
        newtable = map(lambda x, y=_nullentry: y[:], range(newsize))
        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0
        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key], ep[me_hash], ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots
        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
        ## /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2):
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append((_key, _value))
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = str(i) #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = chr(i%256) + chr(i>>8)
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint-1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = d4[i] = i
    return d1, d2, d3, d4

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << 16
        d2[bad] = d3[bad] = d4[bad] = i
        if n <= 1000: d1[bad] = i
    return d1, d2, d3, d4

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n:
        keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = d4[s] = s
    return d1, d2, d3, d4

def do_test(dict):
    keys = dict.keys()
    dict.getTrips() # reset
    test(keys, dict)
    return "%d/%d" % dict.getTrips()

EXTREME = 1

if __name__ == "__main__":

    for N in (1000, 2000, 3000, 4000):

        sdold, sdnew, sdnew2, sdrot = string_dicts(N)
        #idold, idnew, idnew2, idrot = istring_dicts(N)
        bdold, bdnew, bdnew2, bdrot = badnum_dicts(N)
        rdold, rdnew, rdnew2, rdrot = random_dicts(N)
        ndold, ndnew, ndnew2, ndrot = names_dicts(N)

        fmt = "old=%s new=%s new2=%s rot=%s"
        print "N=%d" % N
        print ("trips for strings " + fmt) % tuple(
            map(do_test, (sdold, sdnew, sdnew2, sdrot)))
        #print ("trips for bin strings " + fmt) % tuple(
        #    map(do_test, (idold, idnew, idnew2, idrot)))
        print ("trips for bad integers " + fmt) % tuple(
            map(do_test, (bdold, bdnew, bdnew2, bdrot)))
        print ("trips for random integers " + fmt) % tuple(
            map(do_test, (rdold, rdnew, rdnew2, rdrot)))
        print ("trips for windows names " + fmt) % tuple(
            map(do_test, (ndold, ndnew, ndnew2, ndrot)))

"""
Results with a shift of 10 (EXTREME=0):

D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):

D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""

--------------F2E36624A7D999AC873AD6CE--

From just@letterror.com Tue Dec 19 18:46:18 2000
From: just@letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 19:46:18 +0100
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <14911.39111.710940.342986@anthem.concentric.net> References: Message-ID: At 12:20 PM -0500 19-12-2000, Barry A. Warsaw wrote: >I agree. Have we agreed on the semantics and signature of builtin >join() though? Is it just string.join() stuck in builtins? Yep. I'm with /F that further generalization can be done later. Oh, does this mean that "".join() becomes deprecated? (Nice test case for the warning framework...) Just From barry@digicool.com Tue Dec 19 18:56:45 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Tue, 19 Dec 2000 13:56:45 -0500 Subject: [Python-Dev] Death to string functions! References: Message-ID: <14911.44909.414520.788073@anthem.concentric.net> >>>>> "JvR" == Just van Rossum writes: JvR> Oh, does this mean that "".join() becomes deprecated? Please, no. From guido@python.org Tue Dec 19 18:56:39 2000 From: guido@python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 13:56:39 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Tue, 19 Dec 2000 13:56:45 EST." <14911.44909.414520.788073@anthem.concentric.net> References: <14911.44909.414520.788073@anthem.concentric.net> Message-ID: <200012191856.NAA30524@cj20424-a.reston1.va.home.com> > >>>>> "JvR" == Just van Rossum writes: > > JvR> Oh, does this mean that "".join() becomes deprecated? > > Please, no. No. --Guido van Rossum (home page: http://www.python.org/~guido/) From just@letterror.com Tue Dec 19 19:15:19 2000 From: just@letterror.com (Just van Rossum) Date: Tue, 19 Dec 2000 20:15:19 +0100 Subject: [Python-Dev] Death to string functions! In-Reply-To: <14911.44909.414520.788073@anthem.concentric.net> References: Message-ID: At 1:56 PM -0500 19-12-2000, Barry A. Warsaw wrote: >>>>>> "JvR" == Just van Rossum writes: > > JvR> Oh, does this mean that "".join() becomes deprecated? > >Please, no. And keep two non-deprecated ways to do the same thing? 
I'm not saying it should be removed, just that the powers that be declare that _one_ of them is the preferred way. And-if-that-one-isn't-builtin-join()-I-don't-know-why-to-even-bother y'rs -- Just From greg@cosc.canterbury.ac.nz Tue Dec 19 22:35:05 2000 From: greg@cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 20 Dec 2000 11:35:05 +1300 (NZDT) Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012191537.KAA28909@cj20424-a.reston1.va.home.com> Message-ID: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz> Guido: > Boy, are you stirring up a can of worms that we've been through many > times before! Nothing you say hasn't been said at least a hundred > times before, on this list as well as on c.l.py. And I'll wager you'll continue to hear them said at regular intervals for a long time to come, because you've done something which a lot of people feel very strongly was a mistake, and they have some very rational arguments as to why it was a mistake, whereas you don't seem to have any arguments to the contrary which those people are likely to find convincing. > There really seem to be only two possibilities that don't have this > problem: (1) make it a built-in, or (2) make it a method on strings. False dichotomy. Some other possibilities: (3) Use an operator. (4) Leave it in the string module! Really, I don't see what would be so bad about that. You still need somewhere to put all the string-related constants, so why not keep the string module for those, plus the few functions that don't have any other obvious place? > If " ".join(L) bugs you, try this: > > space = " " # This could be a global > . > . > . > s = space.join(L) Surely you must realise that this completely fails to address Mr. Petrilli's concern? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz          +--------------------------------------+

From akuchlin@mems-exchange.org  Wed Dec 20 14:40:58 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 20 Dec 2000 09:40:58 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: ; from noreply@sourceforge.net on Tue, Dec 19, 2000 at 07:02:05PM -0800
References:
Message-ID: <20001220094058.A17623@kronos.cnri.reston.va.us>

On Tue, Dec 19, 2000 at 07:02:05PM -0800, noreply@sourceforge.net wrote:
>Date: 2000-Dec-19 19:02
>By: tim_one
>
>Unrelated to your patch but in the same area: the other msg, "ord()
>expected string or Unicode character", doesn't read right.  The type
>names in question are "string" and "unicode":
>
>>>> type("")
><type 'string'>
>>>> type(u"")
><type 'unicode'>
>>>>
>
>"character" is out of place, or not in enough places.  Just thought I'd
>mention that, since *you're* so cute!

Is it OK to refer to 8-bit strings under that name?  How about
"expected an 8-bit string or Unicode string", when the object passed
to ord() isn't of the right type.

Similarly, when the value is of the right type but has length>1,
the message is "ord() expected a character, length-%d string found".
Should that be "length-%d (string / unicode) found)"

And should the type names be changed to '8-bit string'/'Unicode
string', maybe?

--amk

From barry@digicool.com  Wed Dec 20 15:39:30 2000
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 20 Dec 2000 10:39:30 -0500
Subject: [Python-Dev] IGNORE - this is only a test
Message-ID: <14912.53938.280864.596141@anthem.concentric.net>

Testing the new MX for python.org...

From fdrake@acm.org  Wed Dec 20 16:57:09 2000
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 20 Dec 2000 11:57:09 -0500 (EST)
Subject: [Python-Dev] scp with SourceForge
Message-ID: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>

  I've not been able to get scp to work with SourceForge since they
upgraded their machines.  ssh works fine.
Is this related to the protocol mismatch problem that was discussed
earlier?  My ssh tells me "SSH Version OpenSSH-1.2.2, protocol version
1.5.", and the remote sshd is sending its version as "Remote protocol
version 1.99, remote software version OpenSSH_2.2.0p1".
  Was there a reasonable way to deal with this?  I'm running
Linux-Mandrake 7.1 with very little customization or extra stuff.

  -Fred

-- 
Fred L. Drake, Jr.
PythonLabs at Digital Creations

From tismer@tismer.com  Wed Dec 20 16:31:00 2000
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 20 Dec 2000 18:31:00 +0200
Subject: [Python-Dev] Re: The Dictionary Gem is polished!
References: <3A3F9C16.562F9D9F@tismer.com>
Message-ID: <3A40DEC4.5F659E8E@tismer.com>

Christian Tismer wrote:
...
When talking about left rotation, an error crept in. Sorry!

> We would then have:
>
>     incr = ROTATE_LEFT(incr, 1)
>     if (incr > mask):
>         incr = incr ^ mp.ma_poly

If incr contains the high bits of the hash, then the above
must be replaced by

    incr = ROTATE_LEFT(incr, 1)
    if (incr & (mask+1)):
        incr = incr ^ mp.ma_poly

or the multiplicative group is not guaranteed to be generated,
obviously. This doesn't change my results, rotating right is
still my choice.
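[The corrected step above is easy to check in isolation. Here is a small stand-alone sketch, with a plain-Python stand-in for the ROTATE_LEFT machine instruction and 32-bit masking made explicit; the mask and polynomial values used below are toy values for illustration, not the ones from dictest.py:]

```python
MASK32 = 0xFFFFFFFF

def rotate_left(x, n=1):
    # plain-Python stand-in for the x86 rotate instruction on a 32-bit word
    x = x & MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32

def next_incr(incr, mask, poly):
    # Christian's corrected update: test the bit just above the table
    # mask (mask+1) instead of "incr > mask", so the reduction fires
    # exactly when the rotated increment leaves the table's bit range.
    incr = rotate_left(incr)
    if incr & (mask + 1):
        incr = incr ^ poly
    return incr
```

[With a toy 8-slot table (mask 7) and an arbitrary poly of 11, an increment of 4 rotates to 8, sets the bit above the mask, and reduces to 8 ^ 11 = 3.]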
ciao - chris

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings          old=293/9  new=302/7  new2=221/7  rot=272/8
trips for bad integers     old=499500/999  new=13187/31  new2=999/1  rot=16982/27
trips for random integers  old=339/9  new=337/7  new2=343/10  rot=342/8
trips for windows names    old=230/5  new=207/7  new2=200/5  rot=225/6
N=2000
trips for strings          old=1093/11  new=1109/10  new2=786/6  rot=1090/9
trips for bad integers     old=0/0  new=26455/32  new2=1999/1  rot=33985/31
trips for random integers  old=747/10  new=733/7  new2=734/7  rot=728/8
trips for windows names    old=503/8  new=542/9  new2=564/6  rot=521/11
N=3000
trips for strings          old=810/5  new=839/6  new2=609/5  rot=820/6
trips for bad integers     old=0/0  new=38681/36  new2=2999/1  rot=50985/26
trips for random integers  old=709/4  new=728/5  new2=767/5  rot=711/6
trips for windows names    old=712/6  new=711/5  new2=691/5  rot=727/7
N=4000
trips for strings          old=1850/9  new=1843/8  new2=1375/11  rot=1861/9
trips for bad integers     old=0/0  new=52994/39  new2=3999/1  rot=67986/26
trips for random integers  old=1584/9  new=1606/8  new2=1505/9  rot=1579/8
trips for windows names    old=1449/8  new=1434/8  new2=1457/11  rot=1476/7

-- 
Christian Tismer             :^)
Mission Impossible 5oftware  :    Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :    PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9  9D15 D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com

From tim.one@home.com  Wed Dec 20 19:52:40 2000
From: tim.one@home.com (Tim Peters)
Date: Wed, 20 Dec 2000 14:52:40 -0500
Subject: [Python-Dev] scp with SourceForge
In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com>
Message-ID:

[Fred L. Drake, Jr.]
> I've not been able to get scp to work with SourceForge since they
> upgraded their machines.  ssh works fine.

Same here.
In particular, I can use ssh to log in to shell.sourceforge.net, but
attempts to scp there act like this (breaking long lines by hand with
\n\t):

> scp -v pep-0042.html
	tim_one@shell.sourceforge.net:/home/groups/python/htdocs/peps
Executing: host shell.sourceforge.net, user tim_one,
	command scp -v -t /home/groups/python/htdocs/peps
SSH Version 1.2.14 [winnt-4.0-x86], protocol version 1.4.
Standard version.  Does not use RSAREF.
ssh_connect: getuid 0 geteuid 0 anon 0
Connecting to shell.sourceforge.net [216.136.171.201] port 22.
Connection established.
Remote protocol version 1.99, remote software version OpenSSH_2.2.0p1
Waiting for server public key.
Received server public key (768 bits) and host key (1024 bits).
Host 'shell.sourceforge.net' is known and matches the host key.
Initializing random; seed file C:\Code/.ssh/random_seed
IDEA not supported, using 3des instead.
Encryption type: 3des
Sent encrypted session key.
Received encrypted confirmation.
Trying RSA authentication with key 'sourceforge'
Server refused our key.
Doing password authentication.
Password: **** here tim enteredth his password ****
Sending command: scp -v -t /home/groups/python/htdocs/peps
Entering interactive session.

And there it sits forever.  Several others report the same symptom on SF
forums, and assorted unresolved SF Support and Bug reports.  We don't
know what your symptom is!

> Is this related to the protocol mismatch problem that was discussed
> earlier?

Doubt it.  Most commentators pin the blame elsewhere.

> ...
> Was there a reasonable way to deal with this?

A new note was added to

http://sourceforge.net/support/?func=detailsupport&support_id=110235&group_id=1

today, including:

"""
Re: Shell server

We're also aware of the number of problems on the shell server with
respect to restricitive permissions on some programs - and sourcing of
shell environments.  We're also aware of the troubles with scp and
transferring files.
As a work around, we recommend either editing files on the shell server, or scping files to the shell server from external hosts to the shell server, whilst logged in to the shell server. """ So there you go: scp files to the shell server from external hosts to the shell server whilst logged in to the shell server . Is scp working for *anyone*??? From fdrake@acm.org Wed Dec 20 20:17:58 2000 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 20 Dec 2000 15:17:58 -0500 (EST) Subject: [Python-Dev] scp with SourceForge In-Reply-To: References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com> Tim Peters writes: > And there it sits forever. Several others report the same symptom on SF > forums, and assorted unresolved SF Support and Bug reports. We don't know > what your symptom is! Exactly the same. > So there you go: scp files to the shell server from external hosts to the > shell server whilst logged in to the shell server . Yeah, that really helps.... NOT! All I want to be able to do is post a new development version of the documentation. ;-( -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From bckfnn@worldonline.dk Wed Dec 20 20:23:33 2000 From: bckfnn@worldonline.dk (Finn Bock) Date: Wed, 20 Dec 2000 20:23:33 GMT Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <3a411449.5247545@smtp.worldonline.dk> [Fred L. Drake] > I've not been able to get scp to work with SourceForge since they >upgraded their machines. ssh works fine. Is this related to the >protocol mismatch problem that was discussed earlier? My ssh tells me >"SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote >sshd is sending it's version as "Remote protocol version 1.99, remote >software version OpenSSH_2.2.0p1". 
> Was there a reasonable way to deal with this? I'm running >Linux-Mandrake 7.1 with very little customization or extra stuff. I managed to update the jython website by logging into the shell machine by ssh and doing a ftp back to my machine (using the IP number). That isn't exactly reasonable, but I was desperate. regards, finn From tim.one@home.com Wed Dec 20 20:42:11 2000 From: tim.one@home.com (Tim Peters) Date: Wed, 20 Dec 2000 15:42:11 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com> Message-ID: [Tim] > So there you go: scp files to the shell server from external > hosts to the shell server whilst logged in to the shell server . [Fred] > Yeah, that really helps.... NOT! All I want to be able to do is > post a new development version of the documentation. ;-( All I want to do is make a measly change to a PEP -- I'm afraid it doesn't ask how trivial your intents are. If some suck^H^H^H^Hdeveloper admits that scp works for them, maybe we can mail them stuff and have *them* copy it over. no-takers-so-far-though-ly y'rs - tim From barry@digicool.com Wed Dec 20 20:49:00 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Wed, 20 Dec 2000 15:49:00 -0500 Subject: [Python-Dev] scp with SourceForge References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <14913.6972.934625.840781@anthem.concentric.net> >>>>> "TP" == Tim Peters writes: TP> So there you go: scp files to the shell server from external TP> hosts to the shell server whilst logged in to the shell server TP> . Psheesh, /that/ was obvious. Did you even have to ask? TP> Is scp working for *anyone*??? Nope, same thing happens to me; it just hangs. 
-Barry From tim.one@home.com Wed Dec 20 20:53:38 2000 From: tim.one@home.com (Tim Peters) Date: Wed, 20 Dec 2000 15:53:38 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14913.6972.934625.840781@anthem.concentric.net> Message-ID: [Tim, quoting a bit of immortal SF support prose] > TP> So there you go: scp files to the shell server from external > TP> hosts to the shell server whilst logged in to the shell server > TP> . [Barry] > Psheesh, /that/ was obvious. Did you even have to ask? Actually, isn't this easy to do on Linux? That is, run an ssh server (whatever) on your home machine, log in to the SF shell (which everyone seems able to do), then scp whatever your_home_IP_address:your_home_path from the SF shell? Heck, I can even get that to work on Windows, except I don't know how to set up anything on my end to accept the connection . > TP> Is scp working for *anyone*??? > Nope, same thing happens to me; it just hangs. That's good to know -- since nobody else mentioned this, Fred probably figured he was unique. not-that-he-isn't-it's-just-that-he's-not-ly y'rs - tim From fdrake@acm.org Wed Dec 20 20:52:10 2000 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 20 Dec 2000 15:52:10 -0500 (EST) Subject: [Python-Dev] scp with SourceForge In-Reply-To: References: <14913.6972.934625.840781@anthem.concentric.net> Message-ID: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com> Tim Peters writes: > Actually, isn't this easy to do on Linux? That is, run an ssh server > (whatever) on your home machine, log in to the SF shell (which everyone > seems able to do), then > > scp whatever your_home_IP_address:your_home_path > > from the SF shell? Heck, I can even get that to work on Windows, except I > don't know how to set up anything on my end to accept the connection . Err, yes, that's easy to do, but... that means putting your private key on SourceForge. They're a great bunch of guys, but they can't have my private key! -Fred -- Fred L. Drake, Jr. 
PythonLabs at Digital Creations From tim.one@home.com Wed Dec 20 21:06:07 2000 From: tim.one@home.com (Tim Peters) Date: Wed, 20 Dec 2000 16:06:07 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com> Message-ID: [Fred] > Err, yes, that's easy to do, but... that means putting your private > key on SourceForge. They're a great bunch of guys, but they can't > have my private key! So generate a unique one-shot key pair for the life of the copy. I can do that for you on Windows if you lack a real OS . From thomas@xs4all.net Wed Dec 20 22:59:49 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Wed, 20 Dec 2000 23:59:49 +0100 Subject: [Python-Dev] scp with SourceForge In-Reply-To: ; from tim.one@home.com on Wed, Dec 20, 2000 at 02:52:40PM -0500 References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <20001220235949.F29681@xs4all.nl> On Wed, Dec 20, 2000 at 02:52:40PM -0500, Tim Peters wrote: > So there you go: scp files to the shell server from external hosts to the > shell server whilst logged in to the shell server . > Is scp working for *anyone*??? Not for me, anyway. And I'm not just saying that to avoid scp-duty :) And I'm using the same ssh version, which works fine on all other machines. It probably has to do with the funky setup Sourceforge uses. (Try looking at 'df' and 'cat /proc/mounts', and comparing the two -- you'll see what I mean :) That also means I'm not tempted to try and reproduce it, obviously :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tim.one@home.com Thu Dec 21 03:24:12 2000 From: tim.one@home.com (Tim Peters) Date: Wed, 20 Dec 2000 22:24:12 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz> Message-ID: [Guido] >> Boy, are you stirring up a can of worms that we've been through many >> times before! 
Nothing you say hasn't been said at least a hundred >> times before, on this list as well as on c.l.py. [Greg Ewing] > And I'll wager you'll continue to hear them said at regular intervals > for a long time to come, because you've done something which a lot of > people feel very strongly was a mistake, and they have some very > rational arguments as to why it was a mistake, whereas you don't seem > to have any arguments to the contrary which those people are likely to > find convincing. Then it's a wash: Guido doesn't find their arguments convincing either, and ties favor the status quo even in the absence of BDFLness. >> There really seem to be only two possibilities that don't have this >> problem: (1) make it a built-in, or (2) make it a method on strings. > False dichotomy. Some other possibilities: > > (3) Use an operator. Oh, that's likely . > (4) Leave it in the string module! Really, I don't see what > would be so bad about that. You still need somewhere to put > all the string-related constants, so why not keep the string > module for those, plus the few functions that don't have > any other obvious place? Guido said he wants to deprecate the entire string module, so that Python can eventually warn on the mere presence of "import string". That's what he said when I earlier ranted in favor of keeping the string module around. My guess is that making it a builtin is the only alternative that stands any chance at this point. >> If " ".join(L) bugs you, try this: >> >> space = " " # This could be a global >> . >> . >> . >> s = space.join(L) > Surely you must realise that this completely fails to > address Mr. Petrilli's concern? Don't know about Guido, but I don't realize that, and we haven't heard back from Charles. His objections were raised the first day " ".join was suggested, space.join was suggested almost immediately after, and that latter suggestion did seem to pacify at least several objectors. 
Don't know whether it makes Charles happier, but since it *has* made others happier in the past, it's not unreasonable to imagine that Charles might like it too. if-we're-to-be-swayed-by-his-continued-outrage-afraid-it-will- have-to-come-from-him-ly y'rs - tim From tim.one@home.com Thu Dec 21 07:44:19 2000 From: tim.one@home.com (Tim Peters) Date: Thu, 21 Dec 2000 02:44:19 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: <20001220094058.A17623@kronos.cnri.reston.va.us> Message-ID: [Andrew Kuchling] > Is it OK to refer to 8-bit strings under that name? > How about "expected an 8-bit string or Unicode string", when the > object passed to ord() isn't of the right type. > > Similarly, when the value is of the right type but has length>1, > the message is "ord() expected a character, length-%d string found". > Should that be "length-%d (string / unicode) found)" > > And should the type names be changed to '8-bit string'/'Unicode > string', maybe? Actually, upon reflection I think it was a mistake to add all these "or Unicode" clauses to the error msgs to begin with. Python used to have only one string type, we're saying that's also a hope for the future, and in the meantime I know I'd have no trouble understanding "string" as including both 8-bit strings and Unicode strings. So we should say "8-bit string" or "Unicode string" when *only* one of those is allowable. So "ord() expected string ..." instead of (even a repaired version of) "ord() expected string or Unicode character ..." but-i'm-not-even-motivated-enough-to-finish-this-sig- From tim.one@home.com Thu Dec 21 08:52:54 2000 From: tim.one@home.com (Tim Peters) Date: Thu, 21 Dec 2000 03:52:54 -0500 Subject: [Python-Dev] RE: The Dictionary Gem is polished! In-Reply-To: <3A3F9C16.562F9D9F@tismer.com> Message-ID: [Christian Tismer] > Are you saying I should check the thing in? Really? Of course. 
The first thing you talked about showed a major improvement in some bad cases, did no harm in the others, and both results were more than just plausible -- they made compelling sense and were backed by simulation. So why not check it in? It's a clear net win! Stuff since then has been a spattering of maybe-good maybe-bad maybe-neutral ideas that hasn't gotten anywhere conclusive. What I want to avoid is another "Unicode compression" scenario, where we avoid grabbing a clear win for months just because it may not be the best possible of all conceivable compression schemes -- and then mistakes get made in a last-second rush to get *any* improvement. Checking in a clear improvement today does not preclude checking in a better one next week . > ... > Ok, we have to stick with the given polymomials to stay > compatible, Na, feel free to explore that too, if you like. It really should get some study! The polys there now are utterly arbitrary: of all polys that happen to be irreducible and that have x as a primitive root in the induced multiplicative group, these are simply the smallest when viewed as binary integers. That's because they were *found* by trying all odd binary ints with odd parity (even ints and ints with even parity necessarily correspond to reducible polys), starting with 2**N+3 and going up until finding the first one that was both irreducible and had x as a primitive root. There's no theory at all that I know of to say that any such poly is any better for this purpose than any other. And they weren't tested for that either -- they're simply the first ones "that worked at all" in a brute search. Besides, Python's "better than random" dict behavior-- when it obtains! --is almost entirely due to that its hash functions produce distinct starting indices more often than a random hash function would. 
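[The brute search Tim describes above can be reconstructed in a few lines. The trick below checks irreducibility and primitivity of x in one go: repeatedly multiply by x modulo the candidate polynomial and see whether the orbit of 1 has the full length 2**N - 1. This is a reconstruction for illustration only, not the program that actually generated the polys table in dictobject.c:]

```python
def x_cycle_length(poly, nbits):
    # Multiply v by x (i.e. shift left) and reduce modulo poly.  The
    # orbit of 1 has length 2**nbits - 1 exactly when poly is
    # irreducible over GF(2) with x as a primitive root.
    top = 1 << nbits
    v = 1
    count = 0
    while 1:
        v = v << 1
        if v & top:
            v = v ^ poly
        count = count + 1
        if v == 1:
            return count

def first_good_poly(nbits):
    # the search as described: odd candidates starting at 2**nbits + 3,
    # smallest winner (viewed as a binary integer) taken
    period = (1 << nbits) - 1
    cand = (1 << nbits) + 3
    while x_cycle_length(cand, nbits) != period:
        cand = cand + 2
    return cand
```

[For N=4 this yields 19, i.e. x**4 + x + 1, whose orbit visits all 15 nonzero residues.]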
The contribution of the GF-based probe sequence in case of collision is to avoid the terrible behavior most other forms of probe sequence would cause given that Python's hash functions also tend to fill solid contiguous slices of the table more often than would a random hash function. [stuff about rotation] > ... > Too bad that this isn't supported in C. It is a native > machine instruction on X86 machines. Guido long ago rejected hash functions based on rotation for this reason; he's not likely to approve of rotations more in the probe sequence . A similar frustration is that almost modern CPUs have a fast instruction to get at the high 32 bits of a 32x32->64 bit multiply: another way to get the high bits of the hash code into play is to multiply the 32-bit hash code by a 32-bit constant (see Knuth for "Fibonacci hashing" details), and take the least-significant N bits of the *upper* 32 bits of the 64-bit product as the initial table index. If the constant is chosen correctly, this defines a permutation on the space of 32-bit unsigned ints, and can be very effective at "scrambling" arithmetic progressions (which Python's hash functions often produce). But C doesn't give a decent way to get at that either. > ... > On the current "faster than random" cases, I assume that > high bits in the hash are less likely than low bits, I'm not sure what this means. As the comment in dictobject.c says, it's common for Python's hash functions to return a result with lots of leading zeroes. But the lookup currently applies ~ to those first (which is a bad idea -- see earlier msgs), so the actual hash that gets *used* often has lots of leading ones. > so it is more likely that an entry finds its good place in the dict, > before bits are rotated in. hence the "good" cases would be kept. I can agree with this easily if I read the above as asserting that in the very good cases today, the low bits of hashes (whether or not ~ is applied) vary more than the high bits. > ... 
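[Tim's multiply-high idea above is awkward in C but trivial to sketch in Python, where integers don't truncate. The multiplier is the usual golden-ratio constant from Knuth; the function name and constant are illustrative, nothing here is taken from dictobject.c:]

```python
MULT = 2654435769  # about 2**32 / golden_ratio; odd, so h -> h*MULT is a
                   # permutation of the 32-bit ints modulo 2**32

def fib_index(h, nbits):
    # 32x32 -> 64 bit product, then the least-significant nbits of the
    # *upper* 32 bits of the product become the initial table index
    prod = (h & 0xFFFFFFFF) * MULT
    upper = prod >> 32      # the high half that C makes hard to reach
    return upper & ((1 << nbits) - 1)
```

[The upper half of the product mixes in every bit of h, which is what smears out the arithmetic progressions Python's hash functions tend to produce.]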
> Random integers seem to withstand any of these procedures. If you wanted to, you could *define* random this way . > ... > I'd say let's do the patch -- ciao - chris full-circle-ly y'rs - tim From mal@lemburg.com Thu Dec 21 11:16:27 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 21 Dec 2000 12:16:27 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix References: Message-ID: <3A41E68B.6B12CD71@lemburg.com> Tim Peters wrote: > > [Andrew Kuchling] > > Is it OK to refer to 8-bit strings under that name? > > How about "expected an 8-bit string or Unicode string", when the > > object passed to ord() isn't of the right type. > > > > Similarly, when the value is of the right type but has length>1, > > the message is "ord() expected a character, length-%d string found". > > Should that be "length-%d (string / unicode) found)" > > > > And should the type names be changed to '8-bit string'/'Unicode > > string', maybe? > > Actually, upon reflection I think it was a mistake to add all these "or > Unicode" clauses to the error msgs to begin with. Python used to have only > one string type, we're saying that's also a hope for the future, and in the > meantime I know I'd have no trouble understanding "string" as including both > 8-bit strings and Unicode strings. > > So we should say "8-bit string" or "Unicode string" when *only* one of those > is allowable. So > > "ord() expected string ..." > > instead of (even a repaired version of) > > "ord() expected string or Unicode character ..." I think this has to do with understanding that there are two string types in Python 2.0 -- a novice won't notice this until she sees the error message. My understanding is similar to yours, "string" should mean "any string object" and in cases where the difference between 8-bit string and Unicode matters, these should be referred to as "8-bit string" and "Unicode string". 
Still, I think it is a good idea to make people aware of the possibility
of passing Unicode objects to these functions, so perhaps the idea of
adding both possibilities to error messages is not such a bad idea
for 2.1.

The next phases would be converting all messages back to "string"
and then convert all strings to Unicode ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/

From akuchlin@mems-exchange.org  Thu Dec 21 18:37:19 2000
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 21 Dec 2000 13:37:19 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix
In-Reply-To: ; from tim.one@home.com on Thu, Dec 21, 2000 at 02:44:19AM -0500
References: <20001220094058.A17623@kronos.cnri.reston.va.us>
Message-ID: <20001221133719.B11880@kronos.cnri.reston.va.us>

On Thu, Dec 21, 2000 at 02:44:19AM -0500, Tim Peters wrote:
>So we should say "8-bit string" or "Unicode string" when *only* one of those
>is allowable.  So

OK... how about this patch?
Index: bltinmodule.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v
retrieving revision 2.185
diff -u -r2.185 bltinmodule.c
--- bltinmodule.c	2000/12/20 15:07:34	2.185
+++ bltinmodule.c	2000/12/21 18:36:54
@@ -1524,13 +1524,14 @@
 		}
 	}
 	else {
 		PyErr_Format(PyExc_TypeError,
-			     "ord() expected string or Unicode character, " \
+			     "ord() expected string of length 1, but " \
 			     "%.200s found",
 			     obj->ob_type->tp_name);
 		return NULL;
 	}

 	PyErr_Format(PyExc_TypeError,
-		     "ord() expected a character, length-%d string found",
+		     "ord() expected a character, "
+		     "but string of length %d found",
 		     size);
 	return NULL;
 }

From thomas@xs4all.net  Fri Dec 22 15:21:43 2000
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 22 Dec 2000 16:21:43 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support
In-Reply-To: ; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800
References:
Message-ID: <20001222162143.A5515@xs4all.nl>

On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply@sourceforge.net wrote:

> * Guido-style: 8-column hard-tab indents.
> * New style: 4-column space-only indents.

Hm, I must have missed this... Is 'new style' the preferred style, as its
Is 'new style' the preferred style, as its > name suggests, or is Guido mounting a rebellion to adhere to the One True > Style (or rather his own version of it, which just has the * in pointer > type declarations wrong ? :) Guido has grudgingly granted that new code in the "New style" is acceptable, mostly because many people complain that "Guido style" causes too much code to get scrunched up on the right margin. The "New style" is more like the recommendations for Python code as well, so it's easier for Python programmers to read (Tabs are hard to read clearly! ;). -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From cgw@fnal.gov Fri Dec 22 15:43:45 2000 From: cgw@fnal.gov (Charles G Waldman) Date: Fri, 22 Dec 2000 09:43:45 -0600 (CST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> Message-ID: <14915.30385.201343.360880@buffalo.fnal.gov> Fred L. Drake, Jr. writes: > > Guido has grudgingly granted that new code in the "New style" is > acceptable, mostly because many people complain that "Guido style" > causes too much code to get scrunched up on the right margin. I am reminded of Linus Torvalds comments on this subject (see /usr/src/linux/Documentation/CodingStyle): Now, some people will claim that having 8-character indentations makes the code move too far to the right, and makes it hard to read on a 80-character terminal screen. The answer to that is that if you need more than 3 levels of indentation, you're screwed anyway, and should fix your program. From fdrake@acm.org Fri Dec 22 15:58:56 2000 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Fri, 22 Dec 2000 10:58:56 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.30385.201343.360880@buffalo.fnal.gov> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> Message-ID: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> Charles G Waldman writes: > I am reminded of Linus Torvalds comments on this subject (see > /usr/src/linux/Documentation/CodingStyle): The catch, of course, is Python/ceval.c, where breaking it up can hurt performance. People scream when you do things like that.... -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From cgw@fnal.gov Fri Dec 22 16:07:47 2000 From: cgw@fnal.gov (Charles G Waldman) Date: Fri, 22 Dec 2000 10:07:47 -0600 (CST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> Message-ID: <14915.31827.250987.283364@buffalo.fnal.gov> Fred L. Drake, Jr. writes: > > The catch, of course, is Python/ceval.c, where breaking it up can > hurt performance. People scream when you do things like that.... Quoting again from the same source: Use helper functions with descriptive names (you can ask the compiler to in-line them if you think it's performance-critical, and it will probably do a better job of it than you would have done). But I should have pointed out that I was quoting the great Linus mostly for entertainment/cultural value, and was not really trying to add fuel to the fire. In other words, a message that I thought was amusing, but probably shouldn't have sent ;-) From fdrake@acm.org Fri Dec 22 16:20:52 2000 From: fdrake@acm.org (Fred L.
Drake, Jr.) Date: Fri, 22 Dec 2000 11:20:52 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.31827.250987.283364@buffalo.fnal.gov> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> <14915.31827.250987.283364@buffalo.fnal.gov> Message-ID: <14915.32612.252115.562296@cj42289-a.reston1.va.home.com> Charles G Waldman writes: > But I should have pointed out that I was quoting the great Linus > mostly for entertainment/cultural value, and was not really trying to > add fuel to the fire. In other words, a message that I thought was > amusing, but probably shouldn't have sent ;-) I understood the intent; I think he's really got a point. There are a few places in Python where it would really help to break things up! -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From fredrik@effbot.org Fri Dec 22 16:33:37 2000 From: fredrik@effbot.org (Fredrik Lundh) Date: Fri, 22 Dec 2000 17:33:37 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support References: <20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov> <14915.32612.252115.562296@cj42289-a.reston1.va.home.com> Message-ID: <004b01c06c34$f08151c0$e46940d5@hagrid> Fred wrote: > I understood the intent; I think he's really got a point. There are > a few places in Python where it would really help to break things up! if that's what you want, maybe you could start by putting the INLINE stuff back again? (if C/C++ compatibility is a problem, put it inside a cplusplus ifdef, and mark it as "for internal use only. 
don't use inline on public interfaces") From fdrake@acm.org Fri Dec 22 16:36:15 2000 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 22 Dec 2000 11:36:15 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <004b01c06c34$f08151c0$e46940d5@hagrid> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> <14915.31827.250987.283364@buffalo.fnal.gov> <14915.32612.252115.562296@cj42289-a.reston1.va.home.com> <004b01c06c34$f08151c0$e46940d5@hagrid> Message-ID: <14915.33535.520957.215310@cj42289-a.reston1.va.home.com> Fredrik Lundh writes: > if that's what you want, maybe you could start by > putting the INLINE stuff back again? I could not see the value in the inline stuff that configure was setting up, and still don't. > (if C/C++ compatibility is a problem, put it inside a > cplusplus ifdef, and mark it as "for internal use only. > don't use inline on public interfaces") We should be able to come up with something reasonable, but I don't have time right now, and my head isn't currently wrapped around C compilers. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From akuchlin@mems-exchange.org Fri Dec 22 18:01:43 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Fri, 22 Dec 2000 13:01:43 -0500 Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: ; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800 References: Message-ID: <20001222130143.B7127@newcnri.cnri.reston.va.us> On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply@sourceforge.net wrote: > * Guido-style: 8-column hard-tab indents. > * New style: 4-column space-only indents. > * _curses style: 2 column indents. > >I'd prefer "New style", myself. New style it is. 
(Barry, is the "python" style in cc-mode.el going to be changed to new style, or a "python2" style added?) I've been wanting to reformat _cursesmodule.c to match the Python style for some time. Probably I'll do that a little while after the panel module has settled down a bit. Fred, did you look at the use of the CObject for exposing the API? Did that look reasonable? Also, should py_curses.h go in the Include/ subdirectory instead of Modules/? --amk From fredrik@effbot.org Fri Dec 22 18:03:43 2000 From: fredrik@effbot.org (Fredrik Lundh) Date: Fri, 22 Dec 2000 19:03:43 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support References: <20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov><14915.32612.252115.562296@cj42289-a.reston1.va.home.com><004b01c06c34$f08151c0$e46940d5@hagrid> <14915.33535.520957.215310@cj42289-a.reston1.va.home.com> Message-ID: <006701c06c41$896a1a00$e46940d5@hagrid> Fred wrote: > > if that's what you want, maybe you could start by > > putting the INLINE stuff back again? > > I could not see the value in the inline stuff that configure was > setting up, and still don't. the INLINE stuff guarantees that "inline" is defined to be whatever directive the compiler uses for explicit inlining. quoting the autoconf docs: If the C compiler supports the keyword inline, do nothing. Otherwise define inline to __inline__ or __inline if it accepts one of those, otherwise define inline to be empty as a result, you can always use "inline" in your code, and have it do the right thing on all compilers that support explicit inlining (all modern C compilers, in practice).
::: to deal with people compiling Python with a C compiler, but linking it with a C++ compiler, the config.h.in file could be written as: /* Define "inline" to be whatever the C compiler calls it. To avoid problems when mixing C and C++, make sure to only use "inline" for internal interfaces. */ #ifndef __cplusplus #undef inline #endif From akuchlin@mems-exchange.org Fri Dec 22 19:40:15 2000 From: akuchlin@mems-exchange.org (A.M. Kuchling) Date: Fri, 22 Dec 2000 14:40:15 -0500 Subject: [Python-Dev] PEP 222 draft Message-ID: <200012221940.OAA01936@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com> I've completed a draft of PEP 222 (sort of -- note the XXX comments in the text for things that still need to be resolved). This is being posted to python-dev, python-web-modules, and python-list/comp.lang.python, to get comments on the proposed interface. I'm on all three lists, but would prefer to see followups on python-list/comp.lang.python, so if you can reply there, please do so. --amk Abstract This PEP proposes a set of enhancements to the CGI development facilities in the Python standard library. Enhancements might be new features, new modules for tasks such as cookie support, or removal of obsolete code. The intent is to incorporate the proposals emerging from this document into Python 2.1, due to be released in the first half of 2001. Open Issues This section lists changes that have been suggested, but about which no firm decision has yet been made. In the final version of this PEP, this section should be empty, as all the changes should be classified as accepted or rejected. cgi.py: We should not be told to create our own subclass just so we can handle file uploads. As a practical matter, I have yet to find the time to do this right, so I end up reading cgi.py's temp file into, at best, another file. Some of our legacy code actually reads it into a second temp file, then into a final destination! 
And even if we did, that would mean creating yet another object with its __init__ call and associated overhead. cgi.py: Currently, query data with no `=' are ignored. Even if keep_blank_values is set, queries like `...?value=&...' are returned with blank values but queries like `...?value&...' are completely lost. It would be great if such data were made available through the FieldStorage interface, either as entries with None as values, or in a separate list. Utility function: build a query string from a list of 2-tuples Dictionary-related utility classes: NoKeyErrors (returns an empty string, never a KeyError), PartialStringSubstitution (returns the original key string, never a KeyError) New Modules This section lists details about entire new packages or modules that should be added to the Python standard library. * fcgi.py : A new module adding support for the FastCGI protocol. Robin Dunn's code needs to be ported to Windows, though. Major Changes to Existing Modules This section lists details of major changes to existing modules, whether in implementation or in interface. The changes in this section therefore carry greater degrees of risk, either in introducing bugs or a backward incompatibility. The cgi.py module would be deprecated. (XXX A new module or package name hasn't been chosen yet: 'web'? 'cgilib'?) Minor Changes to Existing Modules This section lists details of minor changes to existing modules. These changes should have relatively small implementations, and have little risk of introducing incompatibilities with previous versions. Rejected Changes The changes listed in this section were proposed for Python 2.1, but were rejected as unsuitable. For each rejected change, a rationale is given describing why the change was deemed inappropriate. * An HTML generation module is not part of this PEP. Several such modules exist, ranging from HTMLgen's purely programming interface to ASP-inspired simple templating to DTML's complex templating. 
There's no indication of which templating module to enshrine in the standard library, and that probably means that no module should be so chosen. * cgi.py: Allowing a combination of query data and POST data. This doesn't seem to be standard at all, and therefore is dubious practice. Proposed Interface XXX open issues: naming convention (studlycaps or underline-separated?); need to look at the cgi.parse*() functions and see if they can be simplified, too. Parsing functions: carry over most of the parse* functions from cgi.py # The Response class borrows most of its methods from Zope's # HTTPResponse class. class Response: """ Attributes: status: HTTP status code to return headers: dictionary of response headers body: string containing the body of the HTTP response """ def __init__(self, status=200, headers={}, body=""): pass def setStatus(self, status, reason=None): "Set the numeric HTTP response code" pass def setHeader(self, name, value): "Set an HTTP header" pass def setBody(self, body): "Set the body of the response" pass def setCookie(self, name, value, path = '/', comment = None, domain = None, max_age = None, expires = None, secure = 0 ): "Set a cookie" pass def expireCookie(self, name): "Remove a cookie from the user" pass def redirect(self, url): "Redirect the browser to another URL" pass def __str__(self): "Convert entire response to a string" pass def dump(self): "Return a string representation useful for debugging" pass # XXX methods for specific classes of error: serverError, badRequest, etc.? class Request: """ Attributes: XXX should these be dictionaries, or dictionary-like objects?
.headers : dictionary containing HTTP headers .cookies : dictionary of cookies .fields : data from the form .env : environment dictionary """ def __init__(self, environ=os.environ, stdin=sys.stdin, keep_blank_values=1, strict_parsing=0): """Initialize the request object, using the provided environment and standard input.""" pass # Should people just use the dictionaries directly? def getHeader(self, name, default=None): pass def getCookie(self, name, default=None): pass def getField(self, name, default=None): "Return field's value as a string (even if it's an uploaded file)" pass def getUploadedFile(self, name): """Returns a file object that can be read to obtain the contents of an uploaded file. XXX should this report an error if the field isn't actually an uploaded file? Or should it wrap a StringIO around simple fields for consistency? """ def getURL(self, n=0, query_string=0): """Return the URL of the current request, chopping off 'n' path components from the right. Eg. if the URL is "http://foo.com/bar/baz/quux", n=2 would return "http://foo.com/bar". Does not include the query string (if any) """ def getBaseURL(self, n=0): """Return the base URL of the current request, adding 'n' path components to the end to recreate more of the whole URL. Eg. if the request URL is "http://foo.com/q/bar/baz/qux", n=0 would return "http://foo.com/", and n=2 "http://foo.com/q/bar". Returned URL does not include the query string, if any. """ def dump(self): "String representation suitable for debugging output" pass # Possibilities? I don't know if these are worth doing in the # basic objects. def getBrowser(self): "Returns Mozilla/IE/Lynx/Opera/whatever" def isSecure(self): "Return true if this is an SSLified request" # Module-level function def wrapper(func, logfile=sys.stderr): """ Calls the function 'func', passing it the arguments (request, response, logfile). Exceptions are trapped and sent to the file 'logfile'. 
""" # This wrapper will detect if it's being called from the command-line, # and if so, it will run in a debugging mode; name=value pairs # can be entered on standard input to set field values. # (XXX how to do file uploads in this syntax?) Copyright This document has been placed in the public domain. From tim.one@home.com Fri Dec 22 19:31:07 2000 From: tim.one@home.com (Tim Peters) Date: Fri, 22 Dec 2000 14:31:07 -0500 Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support) In-Reply-To: <20001222162143.A5515@xs4all.nl> Message-ID: [Thomas Wouters] >> * Guido-style: 8-column hard-tab indents. >> * New style: 4-column space-only indents. > > Hm, I must have missed this... Is 'new style' the preferred style, as > its name suggests, or is Guido mounting a rebellion to adhere to the > One True Style (or rather his own version of it, which just has > the * in pointer type declarations wrong ? :) Every time this comes up wrt C code, 1. Fred repeats that he thinks Guido caved in (but doesn't supply a reference to anything saying so). 2. Guido repeats that he prefers old-style (but in a wishy-washy way that leaves it uncertain (*)). 3. Fredrik and/or I repeat a request for a BDFL Pronouncement. 4. And there the thread ends. It's *very* hard to find this history in the Python-Dev archives because these threads always have subject lines like this one originally had ("RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support"). Fred already did the #1 bit in this thread. You can consider this msg the repeat of #3. Since Guido is out of town, we can skip #2 and go straight to #4 early . (*) Two examples of #2 from this year: Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/ Modules mmapmodule.c,2.1,2.2 From: Guido van Rossum Date: Fri, 31 Mar 2000 07:10:45 -0500 > Can we change the 8-space-tab rule for all new C code that goes in? 
I > know that we can't practically change existing code right now, but for > new C code, I propose we use no tab characters, and we use a 4-space > block indentation. Actually, this one was formatted for 8-space indents but using 4-space tabs, so in my editor it looked like 16-space indents! Given that we don't want to change existing code, I'd prefer to stick with 1-tab 8-space indents. Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules linuxaudiodev.c,2.2,2.3 From: Guido van Rossum Date: Sat, 08 Jul 2000 09:39:51 -0500 > Aren't tabs preferred as C-source indents, instead of 4-spaces ? At > least, that's what I see in Python/*.c and Object/*.c, but I only > vaguely recall it from the style document... Yes, you're right. From fredrik@effbot.org Fri Dec 22 20:37:35 2000 From: fredrik@effbot.org (Fredrik Lundh) Date: Fri, 22 Dec 2000 21:37:35 +0100 Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support) References: Message-ID: <00e201c06c57$052fff00$e46940d5@hagrid> > 3. Fredrik and/or I repeat a request for a BDFL Pronouncement. and. From akuchlin@mems-exchange.org Fri Dec 22 21:09:47 2000 From: akuchlin@mems-exchange.org (A.M. Kuchling) Date: Fri, 22 Dec 2000 16:09:47 -0500 Subject: [Python-Dev] Reviving the bookstore Message-ID: <200012222109.QAA02737@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com> Since the PSA isn't doing anything for us any longer, I've been working on reviving the bookstore at a new location with a new affiliate code. A draft version is up at its new home, http://www.kuchling.com/bookstore/ . Please take a look and offer comments. Book authors, please take a look at the entry for your book and let me know about any corrections. Links to reviews of books would also be really welcomed. 
I'd like to abolish having book listings with no description or review, so if you notice a book that you've read has no description, please feel free to submit a description and/or review. --amk From tim.one@home.com Sat Dec 23 07:15:59 2000 From: tim.one@home.com (Tim Peters) Date: Sat, 23 Dec 2000 02:15:59 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: <3A41E68B.6B12CD71@lemburg.com> Message-ID: [Tim] > ... > So we should say "8-bit string" or "Unicode string" when *only* > one of those is allowable. So > > "ord() expected string ..." > > instead of (even a repaired version of) > > "ord() expected string or Unicode character ..." [MAL] > I think this has to do with understanding that there are two > string types in Python 2.0 -- a novice won't notice this until > she sees the error message. Except that this error msg has nothing to do with how many string types there are: they didn't pass *any* flavor of string when they get this msg. At the time they pass (say) a float to ord(), that there are currently two flavors of string is more information than they need to know. > My understanding is similar to yours, "string" should mean > "any string object" and in cases where the difference between > 8-bit string and Unicode matters, these should be referred to > as "8-bit string" and "Unicode string". In that happy case of universal harmony, the msg above should say just "string" and leave it at that. > Still, I think it is a good idea to make people aware of the > possibility of passing Unicode objects to these functions, Me too. > so perhaps the idea of adding both possibilities to error messages > is not such a bad idea for 2.1. But not that. The user is trying to track down their problem. Advertising an irrelevant (to their problem) distinction at that time of crisis is simply spam. TypeError: ord() requires an 8-bit string or a Unicode string.
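[Editorial note: the wording Andrew's patch settled on is essentially what later CPython releases still use. A quick check of current Python 3 behavior — not the 2.0 code under discussion; the helper function name is invented for the demonstration:]

```python
# Check the two ord() messages from Andrew's patch against a modern
# CPython.  ord_error() is an invented helper, not part of any API.
def ord_error(obj):
    """Return the TypeError message ord() raises for obj, or None."""
    try:
        ord(obj)
    except TypeError as exc:
        return str(exc)
    return None

print(ord_error({}))    # wrong type entirely
print(ord_error("ab"))  # right type, wrong length
```

Note that both messages say only "string" — exactly the simplification Tim argues for above.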
On the other hand, you'd be surprised to discover all the things you can pass to chr(): it's not just ints. Long ints are also accepted, by design, and due to an obscure bug in the Python internals, you can also pass floats, which get truncated to ints. > The next phases would be converting all messages back to "string" > and then convert all strings to Unicode ;-) Then we'll save a lot of work by skipping the need for the first half of that -- unless you're volunteering to do all of it . From tim.one@home.com Sat Dec 23 07:16:29 2000 From: tim.one@home.com (Tim Peters) Date: Sat, 23 Dec 2000 02:16:29 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: <20001221133719.B11880@kronos.cnri.reston.va.us> Message-ID: [Tim] > So we should say "8-bit string" or "Unicode string" when *only* > one of those is allowable. [Andrew] > OK... how about this patch? +1 from me. And maybe if you offer to send a royalty to Marc-Andre each time it's printed, he'll back down from wanting to use the error msgs as a billboard . > Index: bltinmodule.c > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v > retrieving revision 2.185 > diff -u -r2.185 bltinmodule.c > --- bltinmodule.c 2000/12/20 15:07:34 2.185 > +++ bltinmodule.c 2000/12/21 18:36:54 > @@ -1524,13 +1524,14 @@ > } > } else { > PyErr_Format(PyExc_TypeError, > - "ord() expected string or Unicode character, " \ > + "ord() expected string of length 1, but " \ > "%.200s found", obj->ob_type->tp_name); > return NULL; > } > > PyErr_Format(PyExc_TypeError, > - "ord() expected a character, length-%d string found", > + "ord() expected a character, " > + "but string of length %d found", > size); > return NULL; > } From barry@digicool.com Sat Dec 23 16:43:37 2000 From: barry@digicool.com (Barry A. 
Warsaw) Date: Sat, 23 Dec 2000 11:43:37 -0500 Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support References: <20001222130143.B7127@newcnri.cnri.reston.va.us> Message-ID: <14916.54841.418495.194558@anthem.concentric.net> >>>>> "AK" == Andrew Kuchling writes: AK> New style it is. (Barry, is the "python" style in cc-mode.el AK> going to be changed to new style, or a "python2" style added?) There should probably be a second style added to cc-mode.el. I haven't maintained that package in a long time, but I'll work out a patch and send it to the current maintainer. Let's call it "python2". -Barry From cgw@fnal.gov Sat Dec 23 17:09:57 2000 From: cgw@fnal.gov (Charles G Waldman) Date: Sat, 23 Dec 2000 11:09:57 -0600 (CST) Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14916.54841.418495.194558@anthem.concentric.net> References: <20001222130143.B7127@newcnri.cnri.reston.va.us> <14916.54841.418495.194558@anthem.concentric.net> Message-ID: <14916.56421.370499.762023@buffalo.fnal.gov> Barry A. Warsaw writes: > There should probably be a second style added to cc-mode.el. I > haven't maintained that package in a long time, but I'll work out a > patch and send it to the current maintainer. Let's call it > "python2". Maybe we should wait for the BDFL's pronouncement? From barry@digicool.com Sat Dec 23 19:24:42 2000 From: barry@digicool.com (Barry A. Warsaw) Date: Sat, 23 Dec 2000 14:24:42 -0500 Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support References: <20001222130143.B7127@newcnri.cnri.reston.va.us> <14916.54841.418495.194558@anthem.concentric.net> <14916.56421.370499.762023@buffalo.fnal.gov> Message-ID: <14916.64506.56351.443287@anthem.concentric.net> >>>>> "CGW" == Charles G Waldman writes: CGW> Barry A. Warsaw writes: >> There should probably be a second style added to cc-mode.el. 
I >> haven't maintained that package in a long time, but I'll work >> out a patch and send it to the current maintainer. Let's call >> it "python2". CGW> Maybe we should wait for the BDFL's pronouncement? Sure, at least before submitting a patch. Here's the simple one-liner you can add to your .emacs file to play with the new style in the meantime. -Barry (c-add-style "python2" '("python" (c-basic-offset . 4))) From tim.one@home.com Sun Dec 24 04:04:47 2000 From: tim.one@home.com (Tim Peters) Date: Sat, 23 Dec 2000 23:04:47 -0500 Subject: [Python-Dev] PEP 208 and __coerce__ In-Reply-To: <20001209033006.A3737@glacier.fnational.com> Message-ID: [Neil Schemenauer Saturday, December 09, 2000 6:30 AM] > While working on the implementation of PEP 208, I discovered that > __coerce__ has some surprising properties. Initially I > implemented __coerce__ so that the numeric operation currently > being performed was called on the values returned by __coerce__. > This caused test_class to blow up due to code like this: > > class Test: > def __coerce__(self, other): > return (self, other) > > The 2.0 "solves" this by not calling __coerce__ again if the > objects returned by __coerce__ are instances. If C.__coerce__ doesn't *know* it can do the full job, it should return None. This is what's documented, too: a coerce method should return a pair consisting of objects of the same type, or return None. It's always going to be somewhat clumsy since what you really want is double (or, in the case of pow, sometimes triple) dispatch. Now there's a deliberate cheat that may not have gotten documented comprehensibly: when __coerce__ returns a pair, Python does not check to verify both elements are of the same class. That's because "a pair consisting of objects of the same type" is often not what you *want* from coerce. For example, if I've got a matrix class M, then in M() + 42 I really don't want M.__coerce__ "promoting" 42 to a multi-gigabyte matrix matching the shape and size of M().
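[Editorial note: Python 3 later dropped __coerce__ entirely, but the contract Tim cites — return a pair the operation can be retried on, or None on failure — can be sketched as a plain helper. Everything below (Fraction2, coerced_add) is invented for illustration and is not the actual 2.0 dispatch code:]

```python
# Sketch of the documented __coerce__ contract: return a pair to retry
# the operation on, or None if the full job can't be done.  All names
# here are invented; Python 3 has no __coerce__ machinery.
class Fraction2:
    def __init__(self, num, den=1):
        self.num, self.den = num, den

    def __coerce__(self, other):
        if isinstance(other, int):        # promote int -> Fraction2
            return self, Fraction2(other)
        return None                       # signal: can't handle this type

    def add(self, other):
        return Fraction2(self.num * other.den + other.num * self.den,
                         self.den * other.den)

def coerced_add(a, b):
    """Mimic the dispatch rule: coerce first, then retry the operation."""
    pair = a.__coerce__(b)
    if pair is None:
        raise TypeError("unsupported operand types")
    x, y = pair
    return x.add(y)

r = coerced_add(Fraction2(1, 2), 1)       # 1/2 + 1
print(r.num, r.den)
```

Returning `(self, other)` unchanged, as in Neil's Test class, is exactly what this contract forbids when the pair isn't actually ready for the operation — hence the repeated-coercion hazard that 2.0 papers over.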
M.__add__ can deal with that much more efficiently if it gets 42 directly. OTOH, M.__coerce__ may want to coerce types other than scalar numbers to conform to the shape and size of self, or fiddle self to conform to some other type. What Python accepts back from __coerce__ has to be flexible enough to allow all of those without further interference from the interpreter (just ask MAL : the *real* problem in practice is making coerce more of a help than a burden to the end user; outside of int->long->float->complex (which is itself partly broken, because long->float can lose precision or even fail outright), "coercion to a common type" is almost never quite right; note that C99 introduces distinct imaginary and complex types, because even auto-conversion of imaginary->complex can be a royal PITA!). > This has the effect of making code like: > > class A: > def __coerce__(self, other): > return B(), other > > class B: > def __coerce__(self, other): > return 1, other > > A() + 1 > > fail to work in the expected way. I have no idea how you expected that to work. Neither coerce() method looks reasonable: they don't follow the rules for coerce methods. If A thinks it needs to create a B() and have coercion "start over from scratch" with that, then it should do so explicitly: class A: def __coerce__(self, other): return coerce(B(), other) > The question is: how should __coerce__ work? This can't be answered by a one-liner: the intended behavior is documented by a complex set of rules at the bottom of Lang Ref 3.3.6 ("Emulating numeric types"). Alternatives should be written up as a diff against those rules, which Guido worked hard on in years past -- more than once, too . From esr@thyrsus.com Mon Dec 25 09:17:23 2000 From: esr@thyrsus.com (Eric S. Raymond) Date: Mon, 25 Dec 2000 04:17:23 -0500 Subject: [Python-Dev] Tkinter support under RH 7.0? Message-ID: <20001225041723.A9567@thyrsus.com> I just upgraded to Red Hat 7.0 and installed Python 2.0. 
Anybody have a recipe for making Tkinter support work in this environment? -- Eric S. Raymond "Government is not reason, it is not eloquence, it is force; like fire, a troublesome servant and a fearful master. Never for a moment should it be left to irresponsible action." -- George Washington, in a speech of January 7, 1790 From thomas@xs4all.net Mon Dec 25 10:59:45 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Mon, 25 Dec 2000 11:59:45 +0100 Subject: [Python-Dev] Tkinter support under RH 7.0? In-Reply-To: <20001225041723.A9567@thyrsus.com>; from esr@thyrsus.com on Mon, Dec 25, 2000 at 04:17:23AM -0500 References: <20001225041723.A9567@thyrsus.com> Message-ID: <20001225115945.A25820@xs4all.nl> On Mon, Dec 25, 2000 at 04:17:23AM -0500, Eric S. Raymond wrote: > I just upgraded to Red Hat 7.0 and installed Python 2.0. Anybody have > a recipe for making Tkinter support work in this environment? I installed Python 2.0 + Tkinter both from the BeOpen rpms and later from source (for various reasons) and both were a breeze. I didn't really use the 2.0+tkinter rpm version until I needed Numpy and various other things and had to revert to the self-compiled version, but it seemed to work fine. As far as I can recall, there's only two things you have to keep in mind: the tcl/tk version that comes with RedHat 7.0 is 8.3, so you have to adjust the Tkinter section of Modules/Setup accordingly, and some of the RedHat-supplied scripts stop working because they use deprecated modules (at least 'rand') and use the socket.socket call wrong. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From esr@thyrsus.com Wed Dec 27 19:37:50 2000 From: esr@thyrsus.com (Eric S. Raymond) Date: Wed, 27 Dec 2000 14:37:50 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues Message-ID: <20001227143750.A26894@thyrsus.com> I have 2.0 up and running on RH7.0, compiled from sources. In the process, I discovered a couple of issues: 1. 
The curses module is commented out in the default Modules/Setup file. This is not good, as it may lead careless distribution builders to ship Python 2.0s that will not be able to support the curses front end in CML2. Supporting CML2 (and thus getting Python the "design win" of being involved in the Linux kernel build) was the major point of integrating the curses module into the Python core. It is possible that one little "#" may have blown that. 2. The default Modules/Setup file assumes that various Tkinter-related libraries are in /usr/local. But /usr would be a more appropriate choice under most circumstances. Most Linux users now install their Tcl/Tk stuff from RPMs or .deb packages that place the binaries and libraries under /usr. Under most other Unixes (e.g. Solaris) they were there to begin with. 3. The configure machinery could be made to deduce more about the contents of Modules/Setup than it does now. In particular, it's silly that the person building Python has to fill in the locations of X libraries when configure is in principle perfectly capable of finding them. -- Eric S. Raymond Our society won't be truly free until "None of the Above" is always an option. From guido@digicool.com Wed Dec 27 21:04:27 2000 From: guido@digicool.com (Guido van Rossum) Date: Wed, 27 Dec 2000 16:04:27 -0500 Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support) In-Reply-To: Your message of "Fri, 22 Dec 2000 14:31:07 EST." References: Message-ID: <200012272104.QAA22278@cj20424-a.reston1.va.home.com> > 2. Guido repeats that he prefers old-style (but in a wishy-washy way that > leaves it uncertain (*)). OK, since a pronouncement is obviously needed, here goes: Python C source code should be indented using tabs only. Exceptions: (1) If 3rd party code is already written using a different style, it can stay that way, especially if it's a large volume that would be hard to reformat.
But only if it is consistent within a file or set of files (e.g. a 3rd party patch will have to conform to the prevailing style in the patched file). (2) Occasionally (e.g. in ceval.c) there is code that's very deeply nested. I will allow 4-space indents for the innermost nesting levels here. Other C whitespace nits:

- Always place spaces around assignment operators, comparisons, &&, ||.
- No space between function name and left parenthesis.
- Always a space between a keyword ('if', 'for' etc.) and left paren.
- No space inside parentheses, brackets etc.
- No space before a comma or semicolon.
- Always a space after a comma (and semicolon, if not at end of line).
- Use ``return x;'' instead of ``return(x)''.

--Guido van Rossum (home page: http://www.python.org/~guido/) From cgw@fnal.gov Wed Dec 27 22:17:31 2000 From: cgw@fnal.gov (Charles G Waldman) Date: Wed, 27 Dec 2000 16:17:31 -0600 (CST) Subject: [Python-Dev] sourceforge: problems with bug list? Message-ID: <14922.27259.456364.750295@buffalo.fnal.gov> Is it just me, or is anybody else getting this error when trying to access the bug list? > An error occured in the logger. ERROR: pg_atoi: error in "5470/": > can't parse "/" From akuchlin@mems-exchange.org Wed Dec 27 22:39:35 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 27 Dec 2000 17:39:35 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500 References: <20001227143750.A26894@thyrsus.com> Message-ID: <20001227173935.A25605@kronos.cnri.reston.va.us> On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote: >1. The curses module is commented out in the default Modules/Setup >file. This is not good, as it may lead careless distribution builders It always has been commented out.
Good distributions ship with most of the available modules enabled; I can't say if RH7.0 counts as a good distribution or not (still on 6.2). >3. The configure machinery could be made to deduce more about the contents >of Modules/Setup than it does now. In particular, it's silly that the person This is the point of PEP 229 and patch #102588, which uses a setup.py script to build extension modules. (I need to upload an updated version of the patch which actually includes setup.py -- thought I did that, but apparently not...) The patch is still extremely green, but I think it's the best course; witness the tissue of hackery required to get the bsddb module automatically detected and built. --amk From guido@digicool.com Wed Dec 27 22:54:26 2000 From: guido@digicool.com (Guido van Rossum) Date: Wed, 27 Dec 2000 17:54:26 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: Your message of "Fri, 22 Dec 2000 10:58:56 EST." <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> Message-ID: <200012272254.RAA22931@cj20424-a.reston1.va.home.com> > Charles G Waldman writes: > > I am reminded of Linus Torvalds comments on this subject (see > > /usr/src/linux/Documentation/CodingStyle): Fred replied: > The catch, of course, is Python/ceval.c, where breaking it up can > hurt performance. People scream when you do things like that.... Funny, Jeremy is doing just that, and it doesn't seem to be hurting performance at all. See http://sourceforge.net/patch/?func=detailpatch&patch_id=102337&group_id=5470 (though this is not quite finished). --Guido van Rossum (home page: http://www.python.org/~guido/) From esr@thyrsus.com Wed Dec 27 23:05:46 2000 From: esr@thyrsus.com (Eric S.
Raymond) Date: Wed, 27 Dec 2000 18:05:46 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001227173935.A25605@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Dec 27, 2000 at 05:39:35PM -0500 References: <20001227143750.A26894@thyrsus.com> <20001227173935.A25605@kronos.cnri.reston.va.us> Message-ID: <20001227180546.A4365@thyrsus.com> Andrew Kuchling : > >1. The curses module is commented out in the default Modules/Setup > >file. This is not good, as it may lead careless distribution builders > > It always has been commented out. Good distributions ship with most > of the available modules enabled; I can't say if RH7.0 counts as a > good distribution or not (still on 6.2). I think this needs to change. If curses is a core facility now, the default build should treat it as one. -- Eric S. Raymond If a thousand men were not to pay their tax-bills this year, that would ... [be] the definition of a peaceable revolution, if any such is possible. -- Henry David Thoreau From tim.one@home.com Thu Dec 28 00:44:29 2000 From: tim.one@home.com (Tim Peters) Date: Wed, 27 Dec 2000 19:44:29 -0500 Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109 In-Reply-To: Message-ID: [Barry Warsaw] > Modified Files: > python-mode.el > Log Message: > (python-font-lock-keywords): Add highlighting of `as' as a keyword, > but only in "import foo as bar" statements (including optional > preceding `from' clause). Oh, that's right, try to make IDLE look bad, will you? I've got half a mind to take up the challenge. Unfortunately, I only have half a mind in total, so you may get away with this backstabbing for a while .
From thomas@xs4all.net Thu Dec 28 09:53:31 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 10:53:31 +0100 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500 References: <20001227143750.A26894@thyrsus.com> Message-ID: <20001228105331.A6042@xs4all.nl> On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote: > I have 2.0 up and running on RH7.0, compiled from sources. In the process, > I discovered a couple of issues: > 1. The curses module is commented out in the default Modules/Setup > file. This is not good, as it may lead careless distribution builders > to ship Python 2.0s that will not be able to support the curses front > end in CML2. Supporting CML2 (and thus getting Python the "design > win" of being involved in the Linux kernel build) was the major point > of integrating the curses module into the Python core. It is possible > that one little "#" may have blown that. Note that Tkinter is off by default too. And readline. And ssl. And the use of shared libraries. We *can't* enable the cursesmodule by default, because we don't know what the system's curses library is called. We'd have to auto-detect that before we can enable it (and lots of other modules) automatically, and that's a lot of work. I personally favour autoconf for the job, but since amk is already busy on using distutils, I'm not going to work on that. > 2.The default Modules/Setup file assumes that various Tkinter-related libraries > are in /usr/local. But /usr would be a more appropriate choice under most > circumstances. Most Linux users now install their Tcl/Tk stuff from RPMs > or .deb packages that place the binaries and libraries under /usr. Under > most other Unixes (e.g. Solaris) they were there to begin with. This is nonsense. The line above it specifically states 'edit to reflect where your Tcl/Tk headers are'. 
And leaving aside the issue of whether they are usually found in /usr (I don't believe so, not even on Solaris, though 'my' Solaris box doesn't even have tcl/tk), /usr/local is a perfectly sane choice, since /usr is already included in the include-path, but /usr/local usually is not. > > 3. The configure machinery could be made to deduce more about the contents > > of Modules/Setup than it does now. In particular, it's silly that the > > person building Python has to fill in the locations of X libraries when > > configure is in principle perfectly capable of finding them. In principle, I agree. It's a lot of work, though. For instance, Debian stores the Tcl/Tk headers in /usr/include/tcl, which means you can compile for more than one tcl version, by just changing your include path and the library you link with. And there are undoubtedly several other variants out there. Should we really make the Setup file default to Linux, and leave other operating systems in the dark about what it might be on their system? I think people with Linux and without clue are the least likely people to compile their own Python, since Linux distributions already come with a decent enough Python. And, please, let's assume the people assembling those know how to read? Maybe we just need a HOWTO document covering Setup? (Besides, won't this all be fixed when CML2 comes with a distribution, Eric? They'll *have* to have working curses/tkinter then :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From MarkH@ActiveState.com Thu Dec 28 12:34:09 2000 From: MarkH@ActiveState.com (Mark Hammond) Date: Thu, 28 Dec 2000 23:34:09 +1100 Subject: [Python-Dev] Fwd: try...else Message-ID: <3A4B3341.5010707@ActiveState.com> Spotted on c.l.python. Although Pythonwin is mentioned, python.exe gives the same results - as does 1.5.2. Seems a reasonable question... [Also, if Robin hasn't been invited to join us here, I think it could make some sense...] Mark.
-------- Original Message -------- Subject: try...else Date: Fri, 22 Dec 2000 18:02:27 +0000 From: Robin Becker Newsgroups: comp.lang.python I had expected that in try: except: else the else clause always got executed, but it seems not for return PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32.Portions Copyright 1994-2000 Mark Hammond (MarkH@ActiveState.com) - see 'Help/About PythonWin' for further copyright information. >>> def bang(): .... try: .... return 'return value' .... except: .... print 'bang failed' .... else: .... print 'bang succeeded' .... >>> bang() 'return value' >>> is this a 'feature' or bug. The 2.0 docs seem not to mention return/continue except for try finally. -- Robin Becker From mal@lemburg.com Thu Dec 28 14:45:49 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 15:45:49 +0100 Subject: [Python-Dev] Fwd: try...else References: <3A4B3341.5010707@ActiveState.com> Message-ID: <3A4B521D.4372224A@lemburg.com> Mark Hammond wrote: > > Spotted on c.l.python. Although Pythonwin is mentioned, python.exe > gives the same results - as does 1.5.2. > > Seems a reasonable question... > > [Also, if Robin hasn't been invited to join us here, I think it could > make some sense...] > > Mark. > -------- Original Message -------- > Subject: try...else > Date: Fri, 22 Dec 2000 18:02:27 +0000 > From: Robin Becker > Newsgroups: comp.lang.python > > I had expected that in try: except: else > the else clause always got executed, but it seems not for return I think Robin mixed up try...finally with try...except...else. The finally clause is executed even in case an exception occurred. He does have a point however that 'return' will bypass try...else and try...finally clauses. I don't think we can change that behaviour, though, as it would break code. 
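The behavior MAL is analyzing can be checked directly; a small sketch in modern Python syntax (and, as the rest of the thread establishes, `finally` is in fact not bypassed by `return` -- only the `else` clause is skipped):

```python
def bang():
    # Mirrors Robin's example: the 'return' leaves the function before
    # the 'else' clause can run, because 'else' only executes when the
    # try block completes without an exception AND without transferring
    # control out of the block.
    try:
        return 'return value'
    except Exception:
        return 'bang failed'
    else:
        return 'bang succeeded'  # never reached

def bang_finally(log):
    # By contrast, 'finally' DOES run even when the try block returns.
    try:
        return 'return value'
    finally:
        log.append('cleanup ran')

events = []
assert bang() == 'return value'                # 'else' was skipped
assert bang_finally(events) == 'return value'  # return value intact
assert events == ['cleanup ran']               # 'finally' still executed
```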
> PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on > win32.Portions Copyright 1994-2000 Mark Hammond (MarkH@ActiveState.com) > - see 'Help/About PythonWin' for further copyright information. > >>> def bang(): > .... try: > .... return 'return value' > .... except: > .... print 'bang failed' > .... else: > .... print 'bang succeeded' > .... > >>> bang() > 'return value' > >>> > > is this a 'feature' or bug. The 2.0 docs seem not to mention > return/continue except for try finally. > -- > Robin Becker > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://www.python.org/mailman/listinfo/python-dev -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido@digicool.com Thu Dec 28 15:04:23 2000 From: guido@digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 10:04:23 -0500 Subject: [Python-Dev] chomp()? Message-ID: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Someone just posted a patch to implement s.chomp() as a string method: http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470 Pseudo code (for those not aware of the Perl function by that name):

def chomp(s):
    if s[-2:] == '\r\n':
        return s[:-2]
    if s[-1:] == '\r' or s[-1:] == '\n':
        return s[:-1]
    return s

I.e. it removes a trailing \r\n, \r, or \n. Any comments? Is this needed given that we have s.rstrip() already? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido@digicool.com Thu Dec 28 15:30:57 2000 From: guido@digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 10:30:57 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: Your message of "Wed, 27 Dec 2000 14:37:50 EST."
<20001227143750.A26894@thyrsus.com> References: <20001227143750.A26894@thyrsus.com> Message-ID: <200012281530.KAA26049@cj20424-a.reston1.va.home.com> Eric, I think your recent posts have shown a worldview that's a bit too Eric-centered. :-) Not all the world is Linux. CML2 isn't the only Python application that matters. Python world domination is not a goal. There is no Eric conspiracy! :-) That said, I think that the future is bright: Andrew is already working on a much more intelligent configuration manager. I believe it would be a mistake to enable curses by default using the current approach to module configuration: it doesn't compile out of the box on every platform, and you wouldn't believe how much email I get from clueless Unix users trying to build Python when there's a problem like that in the distribution. So I'd rather wait for Andrew's work. You could do worse than help him with that, to further your goal! --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake@acm.org Thu Dec 28 15:41:23 2000 From: fdrake@acm.org (Fred L. Drake) Date: Thu, 28 Dec 2000 10:41:23 -0500 Subject: [Python-Dev] chomp()? In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: On Thu, 28 Dec 2000 10:04:23 -0500, Guido wrote: > Someone just posted a patch to implement s.chomp() as a > string method: ... > Any comments? Is this needed given that we have > s.rstrip() already? I've always considered this a different operation from rstrip(). When you intend to be as surgical in your changes as possible, it is important *not* to use rstrip(). I don't feel strongly that it needs to be implemented in C, though I imagine people who do a lot of string processing feel otherwise. It's just hard to beat the performance difference if you are doing this a lot. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From barry@digicool.com Thu Dec 28 16:00:36 2000 From: barry@digicool.com (Barry A.
Warsaw) Date: Thu, 28 Dec 2000 11:00:36 -0500 Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109 References: Message-ID: <14923.25508.668453.186209@anthem.concentric.net> >>>>> "TP" == Tim Peters writes: TP> [Barry Warsaw] >> Modified Files: python-mode.el Log Message: >> (python-font-lock-keywords): Add highlighting of `as' as a >> keyword, but only in "import foo as bar" statements (including >> optional preceding `from' clause). TP> Oh, that's right, try to make IDLE look bad, will you? I've TP> got half a mind to take up the challenge. Unfortunately, I TP> only have half a mind in total, so you may get away with this TP> backstabbing for a while . With my current network (un)connectivity, I feel like a nuclear sub which can only surface once a month to receive low frequency orders from some remote antenna farm out in New Brunswick. Just think of me as a rogue commander who tries to do as much damage as possible when he's not joyriding in the draft-wake of giant squids. rehoming-all-remaining-missiles-at-the-Kingdom-of-Timbotia-ly y'rs, -Barry From esr@thyrsus.com Thu Dec 28 16:01:54 2000 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 11:01:54 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <200012281530.KAA26049@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 10:30:57AM -0500 References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> Message-ID: <20001228110154.D32394@thyrsus.com> Guido van Rossum : > Not all the world is Linux. CML2 isn't the only Python application > that matters. Python world domination is not a goal. There is no > Eric conspiracy! :-) Perhaps I misunderstood you, then. I thought you considered CML2 a potentially important design win, and that was why curses didn't get dropped from the core. Have you changed your mind about this?
If Python world domination is not a goal then I can only conclude that you haven't had your morning coffee yet :-). There's a more general question here about what it means for something to be in the core language. Developers need to have a clear, bright-line picture of what they can count on to be present. To me this implies that it's the job of the Python maintainers to make sure that a facility declared "core" by its presence in the standard library documentation is always present, for maximum "batteries are included" effect. Yes, dealing with cross-platform variations in linking curses is a pain -- but dealing with that kind of pain so the Python user doesn't have to is precisely our job. Or so I understand it, anyway. -- Eric S. Raymond Conservatism is the blind and fear-filled worship of dead radicals. From moshez@zadka.site.co.il Thu Dec 28 16:51:32 2000 From: moshez@zadka.site.co.il (Moshe Zadka) Date: 28 Dec 2000 16:51:32 -0000 Subject: [Python-Dev] chomp()? In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <20001228165132.8025.qmail@stimpy.scso.com> On Thu, 28 Dec 2000, Guido van Rossum wrote: > Someone just posted a patch to implement s.chomp() as a string method: ... > Any comments? Is this needed given that we have s.rstrip() already? Yes.

import fileinput

i = 0
for line in fileinput.input():
    print '%d: %s' % (i, line.chomp())
    i += 1

I want that operation to be invertible by sed 's/^[0-9]*: //' From guido@digicool.com Thu Dec 28 17:08:18 2000 From: guido@digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 12:08:18 -0500 Subject: [Python-Dev] scp to sourceforge Message-ID: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> I've seen a thread on this but there was no conclusive answer, so I'm reopening this. I can't SCP updated PEPs to the SourceForge machine. The "pep2html.py -i" command just hangs. I can ssh into shell.sourceforge.net just fine, but scp just hangs.
"scp -v" prints a bunch of things suggesting that it can authenticate itself just fine, ending with these three lines: cj20424-a.reston1.va.home.com: RSA authentication accepted by server. cj20424-a.reston1.va.home.com: Sending command: scp -v -t . cj20424-a.reston1.va.home.com: Entering interactive session. and then nothing. It just sits there. Would somebody please figure out a way to update the PEPs? It's kind of pathetic to see the website not have the latest versions... --Guido van Rossum (home page: http://www.python.org/~guido/) From moshez@zadka.site.co.il Thu Dec 28 16:28:07 2000 From: moshez@zadka.site.co.il (Moshe Zadka) Date: 28 Dec 2000 16:28:07 -0000 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B521D.4372224A@lemburg.com> References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> Message-ID: <20001228162807.7229.qmail@stimpy.scso.com> On Thu, 28 Dec 2000, "M.-A. Lemburg" wrote: > He does have a point however that 'return' will bypass > try...else and try...finally clauses. I don't think we can change > that behaviour, though, as it would break code. It doesn't bypass try..finally: >>> def foo(): ... try: ... print "hello" ... return ... finally: ... print "goodbye" ... >>> foo() hello goodbye From guido@digicool.com Thu Dec 28 16:43:26 2000 From: guido@digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 11:43:26 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: Your message of "Thu, 28 Dec 2000 11:01:54 EST." <20001228110154.D32394@thyrsus.com> References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> <20001228110154.D32394@thyrsus.com> Message-ID: <200012281643.LAA26687@cj20424-a.reston1.va.home.com> > Guido van Rossum : > > Not all the world is Linux. CML2 isn't the only Python application > > that matters. Python world domination is not a goal. There is no > > Eric conspiracy! :-) > > Perhaps I misunderstood you, then. 
I thought you considered CML2 a > potentially important design win, and that was why curses didn't get > dropped from the core. Have you changed your mind about this? Supporting CML2 was one of the reasons to keep curses in the core, but not the only one. Linux kernel configuration is so far removed from my daily use of computers that I don't have a good way to judge its importance in the grand scheme of things. Since you obviously consider it very important, and since I generally trust your judgement (except on the issue of firearms :-), your plea for keeping, and improving, curses support in the Python core made a difference in my decision. And don't worry, I don't expect to change that decision -- though I personally still find it curious that curses is so important. I find curses-style user interfaces pretty pathetic, and wished that Linux migrated to a real GUI for configuration. (And the linuxconf approach does *not* qualify as a real GUI. :-) > If Python world domination is not a goal then I can only conclude that > you haven't had your morning coffee yet :-). Sorry to disappoint you, Eric. I gave up coffee years ago. :-) I was totally serious though: my personal satisfaction doesn't come from Python world domination. Others seem to have that goal, and if it doesn't inconvenience me too much I'll play along, but in the end I've got some goals in my personal life that are much more important. > There's a more general question here about what it means for something > to be in the core language. Developers need to have a clear, > bright-line picture of what they can count on to be present. To me > this implies that it's the job of the Python maintainers to make sure > that a facility declared "core" by its presence in the standard > library documentation is always present, for maximum "batteries are > included" effect. We do the best we can. Using the current module configuration system, it's a one-character edit to enable curses if you need it.
With Andrew's new scheme, it will be automatic. > Yes, dealing with cross-platform variations in linking curses is a > pain -- but dealing with that kind of pain so the Python user doesn't > have to is precisely our job. Or so I understand it, anyway. So help Andrew: http://python.sourceforge.net/peps/pep-0229.html --Guido van Rossum (home page: http://www.python.org/~guido/) From mal@lemburg.com Thu Dec 28 16:52:36 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 17:52:36 +0100 Subject: [Python-Dev] Fwd: try...else References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> Message-ID: <3A4B6FD3.9B576E9A@lemburg.com> Moshe Zadka wrote: > > On Thu, 28 Dec 2000, "M.-A. Lemburg" wrote: > > > He does have a point however that 'return' will bypass > > try...else and try...finally clauses. I don't think we can change > > that behaviour, though, as it would break code. > > It doesn't bypass try..finally: > > >>> def foo(): > ... try: > ... print "hello" > ... return > ... finally: > ... print "goodbye" > ... > >>> foo() > hello > goodbye Hmm, that must have changed between Python 1.5 and more recent versions: Python 1.5: >>> def f(): ... try: ... return 1 ... finally: ... print 'finally' ... >>> f() 1 >>> Python 2.0: >>> def f(): ... try: ... return 1 ... finally: ... print 'finally' ... 
>>> f() finally 1 >>> -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From moshez@stimpy.scso.com Thu Dec 28 16:59:32 2000 From: moshez@stimpy.scso.com (Moshe Zadka) Date: 28 Dec 2000 16:59:32 -0000 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B6FD3.9B576E9A@lemburg.com> References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> Message-ID: <20001228165932.8143.qmail@stimpy.scso.com> On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" wrote: [about try..finally not playing well with return] > Hmm, that must have changed between Python 1.5 and more recent > versions: I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2? From esr@thyrsus.com Thu Dec 28 17:20:48 2000 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 12:20:48 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001228105331.A6042@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 10:53:31AM +0100 References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> Message-ID: <20001228122048.A1381@thyrsus.com> Thomas Wouters : > > 1. The curses module is commented out in the default Modules/Setup > > file. This is not good, as it may lead careless distribution builders > > to ship Python 2.0s that will not be able to support the curses front > > end in CML2. Supporting CML2 (and thus getting Python the "design > > win" of being involved in the Linux kernel build) was the major point > > of integrating the curses module into the Python core. It is possible > > that one little "#" may have blown that. > > Note that Tkinter is off by default too. And readline. And ssl. And the use > of shared libraries. 
IMO ssl isn't an issue because it's not documented as being in the standard module set. Readline is a minor issue because raw_input()'s functionality changes somewhat if it's not linked, but I think we can live with this -- the change isn't visible to calling programs. Hm. It appears tkinter isn't documented in the standard set of modules either. Interesting. Technically this means I don't have a problem with it not being built in by default, but I think there is a problem here... My more general point is that right now Python has three classes of modules:

1. Documented as being in the core and built in by default.
2. Not documented as being in the core and not built in by default.
3. Documented as being in the core but not built in by default.

My more general claim is that the existence of class 3 is a problem, because it compromises the "batteries are included" effect -- it means Python users don't have a bright-line test for what will be present in every Python (or at least every Python on an operating system theoretically feature-compatible with theirs). My struggle to get CML2 adopted brings this problem into particularly sharp focus because the kernel group is allergic to big footprints or having to download extension modules to do a build. But the issue is really broader than that. I think we ought to be migrating stuff out of class 3 into class 1 where possible and to class 2 only where unavoidable. > We *can't* enable the cursesmodule by default, because > we don't know what the system's curses library is called. We'd have to > auto-detect that before we can enable it (and lots of other modules) > automatically, and that's a lot of work. I personally favour autoconf for > the job, but since amk is already busy on using distutils, I'm not going to > work on that. Yes, we need to do a lot more autodetection -- this is a large part of my point.
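The kind of autodetection being argued for here is roughly what PEP 229's setup.py approach later provides. A toy sketch of library detection in Python (the helper name and search paths are illustrative assumptions, not taken from amk's actual patch):

```python
import os

def find_library(name, dirs=('/usr/lib', '/usr/local/lib')):
    """Return the first directory that contains lib<name>.*, else None."""
    prefix = 'lib' + name + '.'
    for d in dirs:
        if not os.path.isdir(d):
            continue
        for entry in os.listdir(d):
            if entry.startswith(prefix):
                return d
    return None

# A build script could then enable e.g. the curses module only when the
# library is actually present, instead of relying on a hand-edited
# Modules/Setup line.
```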
I have nothing against distutils, but I don't see how it solves this problem unless we assume that we'll always have Python already available on any platform where we're building Python. I'm willing to put my effort where my mouth is on this. I have a lot of experience with autoconf; I'm willing to write some of these nasty config tests. > > 2.The default Modules/Setup file assumes that various Tkinter-related libraries > > are in /usr/local. But /usr would be a more appropriate choice under most > > circumstances. Most Linux users now install their Tcl/Tk stuff from RPMs > > or .deb packages that place the binaries and libraries under /usr. Under > > most other Unixes (e.g. Solaris) they were there to begin with. > > This is nonsense. The line above it specifically states 'edit to reflect > where your Tcl/Tk headers are'. And besides from the issue whether they are > usually found in /usr (I don't believe so, not even on Solaris, but 'my' > Solaris box doesn't even have tcl/tk,) /usr/local is a perfectly sane > choice, since /usr is already included in the include-path, but /usr/local > usually is not. Is it? That is not clear from the comment. Perhaps this is just a documentation problem. I'll look again. > > 3. The configure machinery could be made to deduce more about the contents > > of Modules/Setup than it does now. In particular, it's silly that the > > person building Python has to fill in the locations of X librasries when > > configure is in principle perfectly capable of finding them. > > In principle, I agree. It's a lot of work, though. For instance, Debian > stores the Tcl/Tk headers in /usr/include/tcl, which means you can > compile for more than one tcl version, by just changing your include path > and the library you link with. And there are undoubtedly several other > variants out there. As I said to Guido, I think it is exactly our job to deal with this sort of grottiness. 
One of Python's major selling points is supposed to be cross-platform consistency of the API. If we fail to do what you're describing, we're failing to meet Python users' reasonable expectations for the language. > Should we really make the Setup file default to Linux, and leave other > operating systems in the dark about what it might be on their system ? I > think people with Linux and without clue are the least likely people to > compile their own Python, since Linux distributions already come with a > decent enough Python. And, please, let's assume the people assembling those > know how to read ? Please note that I am specifically *not* advocating making the build defaults Linux-centric. That's not my point at all. > Maybe we just need a HOWTO document covering Setup ? That would be a good idea. > (Besides, won't this all be fixed when CML2 comes with a distribution, Eric ? > They'll *have* to have working curses/tkinter then :-) I'm concerned that it will work the other way around, that CML2 won't happen if the core does not reliably include these facilities. In itself CML2 not happening wouldn't be the end of the world of course, but I'm pushing on this because I think the larger issue of class 3 modules is actually important to the health of Python and needs to be attacked seriously. -- Eric S. Raymond The Bible is not my book, and Christianity is not my religion. I could never give assent to the long, complicated statements of Christian dogma. -- Abraham Lincoln
> > Any comments? Is this needed given that we have s.rstrip() already? -1 from me. P=NP (Python is not Perl). "Chomp" is an excessively cute name. And like you said, this is too much like "rstrip" to merit a separate method. From esr@thyrsus.com Thu Dec 28 17:41:17 2000 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 12:41:17 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <200012281643.LAA26687@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 11:43:26AM -0500 References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> <20001228110154.D32394@thyrsus.com> <200012281643.LAA26687@cj20424-a.reston1.va.home.com> Message-ID: <20001228124117.B1381@thyrsus.com> Guido van Rossum : > Supporting CML2 was one of the reasons to keep curses in the core, but > not the only one. Linux kernel configuration is so far removed from > my daily use of computers that I don't have a good way to judge its > importance in the grand scheme of things. Since you obviously > consider it very important, and since I generally trust your judgement > (except on the issue of firearms :-), your plea for keeping, and > improving, curses support in the Python core made a difference in my > decision. And don't worry, I don't expect to change that decision > -- though I personally still find it curious that curses is so important. > I find curses-style user interfaces pretty pathetic, and wished that > Linux migrated to a real GUI for configuration. (And the linuxconf > approach does *not* qualify as a real GUI. :-) Thank you, that makes your priorities much clearer. Actually I agree with you that curses interfaces are mostly pretty pathetic. A lot of people still like them, though, because they tend to be fast and lightweight. Then, too, a really well-designed curses interface can in fact be good enough that the usability gain from GUIizing is marginal.
My favorite examples of this are mutt and slrn. The fact that GUI programs have failed to make much headway against this is not simply due to user conservatism, it's genuinely hard to see how a GUI interface could be made significantly better. And unfortunately, there is a niche where it is still important to support curses interfacing independently of anyone's preferences in interface style -- early in the system-configuration process before one has bootstrapped to the point where X is reliably available. I hasten to add that this is not just *my* problem -- one of your more important Python constituencies in a practical sense is the guys who maintain Red Hat's installer. > I was totally serious though: my personal satisfaction doesn't come > from Python world domination. Others seem have that goal, and if it > doesn't inconvenience me too much I'll play along, but in the end I've > got some goals in my personal life that are much more important. There speaks the new husband :-). OK. So what *do* you want from Python? Personally, BTW, my goal is not exactly Python world domination either -- it's that the world should be dominated by the language that has the least tendency to produce grotty fragile code (remember that I tend to obsess about the software-quality problem :-)). Right now that's Python. -- Eric S. Raymond The people of the various provinces are strictly forbidden to have in their possession any swords, short swords, bows, spears, firearms, or other types of arms. The possession of unnecessary implements makes difficult the collection of taxes and dues and tends to foment uprisings. -- Toyotomi Hideyoshi, dictator of Japan, August 1588 From mal@lemburg.com Thu Dec 28 17:43:13 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 18:43:13 +0100 Subject: [Python-Dev] chomp()? 
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <3A4B7BB1.F09660ED@lemburg.com> Guido van Rossum wrote: > > Someone just posted a patch to implement s.chomp() as a string method: > > http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470 > > Pseudo code (for those not aware of the Perl function by that name): > > def chomp(s): > if s[-2:] == '\r\n': > return s[:-2] > if s[-1:] == '\r' or s[-1:] == '\n': > return s[:-1] > return s > > I.e. it removes a trailing \r\n, \r, or \n. > > Any comments? Is this needed given that we have s.rstrip() already? We already have .splitlines() which does the above (remove line breaks) not only for a single line, but for many lines at once. Even better: .splitlines() also does the right thing for Unicode. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Thu Dec 28 19:06:33 2000 From: mal@lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 20:06:33 +0100 Subject: [Python-Dev] Fwd: try...else References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> <20001228165932.8143.qmail@stimpy.scso.com> Message-ID: <3A4B8F39.58C64EFB@lemburg.com> Moshe Zadka wrote: > > On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" wrote: > > [about try..finally not playing well with return] > > Hmm, that must have changed between Python 1.5 and more recent > > versions: > > I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2? Sorry, false alarm: there was a bug in my patched 1.5 version. The original 1.5 version does not show the described behaviour. 
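Returning to the chomp() thread: a short sketch contrasting the proposed behaviour with rstrip() and splitlines(). The chomp() function below is hypothetical -- it just transcribes Guido's pseudocode and was never added as a string method:

```python
def chomp(s):
    # Guido's pseudocode: strip exactly one trailing line break, nothing else.
    if s[-2:] == '\r\n':
        return s[:-2]
    if s[-1:] == '\r' or s[-1:] == '\n':
        return s[:-1]
    return s

print(repr(chomp('spam\r\n')))            # -> 'spam'
print(repr('spam  \n'.rstrip()))          # -> 'spam' (rstrip also eats the spaces)
print(repr(chomp('spam  \n')))            # -> 'spam  ' (chomp keeps them)
print('one\ntwo\r\nthree'.splitlines())   # -> ['one', 'two', 'three']
```

So rstrip() differs only when trailing whitespace matters, and splitlines() already covers the remove-line-breaks case, Unicode included.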
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From thomas@xs4all.net Thu Dec 28 20:21:15 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 21:21:15 +0100 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B521D.4372224A@lemburg.com>; from mal@lemburg.com on Thu, Dec 28, 2000 at 03:45:49PM +0100 References: <3A4B3341.5010707@ActiveState.com> <3A4B521D.4372224A@lemburg.com> Message-ID: <20001228212115.C1811@xs4all.nl> On Thu, Dec 28, 2000 at 03:45:49PM +0100, M.-A. Lemburg wrote: > > I had expected that in try: except: else > > the else clause always got executed, but it seems not for return > I think Robin mixed up try...finally with try...except...else. > The finally clause is executed even in case an exception occurred. (MAL and I already discussed this in private mail: Robin did mean try/except/else, and 'finally' already executes when returning directly from the 'try' block, even in Python 1.5) > He does have a point however that 'return' will bypass > try...else and try...finally clauses. I don't think we can change > that behaviour, though, as it would break code. This code: try: return except: pass else: print "returning" will indeed not print 'returning', but I believe it's by design. I'm against changing it, in any case, and not just because it'd break code :) If you want something that always executes, use a 'finally'. Or don't return from the 'try', but return in the 'else' clause. The 'except' clause is documented to execute if a matching exception occurs, and 'else' if no exception occurs. Maybe the intent of the 'else' clause would be clearer if it was documented to 'execute if the try: clause finishes without an exception being raised' ? The 'else' clause isn't executed when you 'break' or (after applying my continue-in-try patch ;) 'continue' out of the 'try', either. 
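The control flow Thomas describes can be seen in a few lines (a sketch in modern syntax, not code from the thread): 'else' is skipped when the 'try' block returns, while 'finally' still runs.

```python
def with_else():
    try:
        return 'from try'       # leaving via return: the else clause is skipped
    except ValueError:
        return 'from except'
    else:
        return 'from else'      # runs only if the try block falls off the end

def with_finally(log):
    try:
        return 'from try'
    finally:
        log.append('finally ran')   # executes even though the try block returned

events = []
print(with_else())                   # -> from try
print(with_finally(events), events)  # -> from try ['finally ran']
```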
Robin... Did I already reply this, on python-list or to you directly ? I distinctly remember writing that post, but I'm not sure if it arrived. Maybe I didn't send it after all, or maybe something on mail.python.org is detaining it ? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From thomas@xs4all.net Thu Dec 28 18:19:06 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 19:19:06 +0100 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001228122048.A1381@thyrsus.com>; from esr@thyrsus.com on Thu, Dec 28, 2000 at 12:20:48PM -0500 References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com> Message-ID: <20001228191906.F1281@xs4all.nl> On Thu, Dec 28, 2000 at 12:20:48PM -0500, Eric S. Raymond wrote: > My more general point is that right now Pyjthon has three classes of > modules: > 1. Documented as being in the core and built in by default. > 2. Not documented as being in the core and not built in by default. > 3. Documented as being in the core but not built in by default. > My more general claim is that the existence of class 3 is a problem, > because it compromises the "batteries are included" effect -- it means > Python users don't have a bright-line test for what will be present in > every Python (or at least every Python on an operating system > theoretically feature-compatible with theirs). It depends on your definition of 'being in the core'. Some of the things that are 'in the core' are simply not possible on all platforms. So if you want really portable code, you don't want to use them. Other features are available on all systems that matter [to you], so you don't really care about it, just use them, and at best document that they need feature X. There is also the subtle difference between a Python user and a Python compiler/assembler (excuse my overloading of the terms, but you know what I mean). 
People who choose to compile their own Python should realize that they might disable or misconfigure some parts of it. I personally trust most people that assemble OS distributions to compile a proper Python binary + modules, but I think a HOWTO isn't a bad idea -- unless we autoconf everything. > I think we ought to be migrating stuff out > of class 3 into class 1 where possible and to class 2 only where > unavoidable. [ and ] > I'm willing to put my effort where my mouth is on this. I have a lot > of experience with autoconf; I'm willing to write some of these nasty > config tests. [ and ] > As I said to Guido, I think it is exactly our job to deal with this sort > of grottiness. One of Python's major selling points is supposed to be > cross-platform consistency of the API. If we fail to do what you're > describing, we're failing to meet Python users' reasonable expectations > for the language. [ and ] > Please note that I am specifically *not* advocating making the build defaults > Linux-centric. That's not my point at all. I apologize for the tone of my previous post, and the above snippet. I'm not trying to block progress here ;) I'm actually all for autodetecting as much as possible, and more than willing to put effort into it as well (as long as it's deemed useful, and isn't supplanted by a distutils variant weeks later.) And I personally have my doubts about the distutils variant, too, but that's partly because I have little experience with distutils. If we can work out a deal where both autoconf and distutils are an option, I'm happy to write a few, if not all, autoconf tests for the currently disabled modules. So, Eric, let's split the work. I'll do Tkinter if you do curses. :) However, I'm also keeping those oddball platforms that just don't support some features in mind. If you want truly portable code, you have to work at it. 
I think it's perfectly okay to say "your Python needs to have the curses module or the tkinter module compiled in -- contact your administrator if it has neither". There will still be platforms that don't have curses, or syslog, or crypt(), though hopefully none of them will be Linux. Oh, and I also apologize for possibly duplicating what has already been said by others. I haven't seen anything but this post (which was CC'd to me directly) since I posted my reply to Eric, due to the ululating bouts of delay on mail.python.org. Maybe DC should hire some *real* sysadmins, instead of those silly programmer-kniggits ? >:-> -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mwh21@cam.ac.uk Thu Dec 28 18:27:48 2000 From: mwh21@cam.ac.uk (Michael Hudson) Date: Thu, 28 Dec 2000 18:27:48 +0000 (GMT) Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B521D.4372224A@lemburg.com> Message-ID: On Thu, 28 Dec 2000, M.-A. Lemburg wrote: > I think Robin mixed up try...finally with try...except...else. I think so too. > The finally clause is executed even in case an exception occurred. > > He does have a point however that 'return' will bypass > try...else and try...finally clauses. I don't think we can change > that behaviour, though, as it would break code. return does not skip finally clauses[1]. In my not especially humble opinion, the current behaviour is the Right Thing. I'd have to think for a moment before saying what Robin's example would print, but I think the alternative would disturb me far more. Cheers, M. [1] In fact the flow of control on return is very similar to that of an exception - ooh, look at that implementation... From esr@thyrsus.com Thu Dec 28 19:17:51 2000 From: esr@thyrsus.com (Eric S. 
Raymond) Date: Thu, 28 Dec 2000 14:17:51 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001228191906.F1281@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 07:19:06PM +0100 References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com> <20001228191906.F1281@xs4all.nl> Message-ID: <20001228141751.B2528@thyrsus.com> Thomas Wouters : > > My more general claim is that the existence of class 3 is a problem, > > because it compromises the "batteries are included" effect -- it means > > Python users don't have a bright-line test for what will be present in > > every Python (or at least every Python on an operating system > > theoretically feature-compatible with theirs). > > It depends on your definition of 'being in the core'. Some of the things > that are 'in the core' are simply not possible on all platforms. So if you > want really portable code, you don't want to use them. Other features are > available on all systems that matter [to you], so you don't really care > about it, just use them, and at best document that they need feature X. I understand. We can't, for example, guarantee to duplicate the Windows- specific stuff in the Unix port (nor would we want to in most cases :-)). However, I think "we build in curses/Tkinter everywhere the corresponding libraries exist" is a guarantee we can and should make. Similarly for other modules presently in class 3. > There is also the subtle difference between a Python user and a Python > compiler/assembler (excuse my overloading of the terms, but you know what I > mean). Yes. We have three categories here: 1. People who use python for applications (what I've been calling users) 2. People who configure Python binary packages for distribution (what you call a "compiler/assembler" and I think of as a "builder"). 3. People who hack Python itself. Problem is that "developer" is very ambiguous in this context... 
> People who choose to compile their own Python should realize that > they might disable or misconfigure some parts of it. I personally trust most > people that assemble OS distributions to compile a proper Python binary + > modules, but I think a HOWTO isn't a bad idea -- unless we autoconf > everything. I'd like to see both things happen (HOWTO and autoconfing) and am willing to work on both. > I apologize for the tone of my previous post, and the above snippet. No offense taken at all, I assure you. > I'm not > trying to block progress here ;) I'm actually all for autodetecting as much > as possible, and more than willing to put effort into it as well (as long as > it's deemed useful, and isn't supplanted by a distutils variant weeks > later.) And I personally have my doubts about the distutils variant, too, > but that's partly because I have little experience with distutils. If we can > work out a deal where both autoconf and distutils are an option, I'm happy > to write a few, if not all, autoconf tests for the currently disabled > modules. I admit I'm not very clear on the scope of what distutils is supposed to handle, and how. Perhaps amk can enlighten us? > So, Eric, let's split the work. I'll do Tkinter if you do curses. :) You've got a deal. I'll start looking at the autoconf code. I've already got a fair idea how to do this. -- Eric S. Raymond No one who's seen it in action can say the phrase "government help" without either laughing or crying. From tim.one@home.com Fri Dec 29 02:59:53 2000 From: tim.one@home.com (Tim Peters) Date: Thu, 28 Dec 2000 21:59:53 -0500 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > I've seen a thread on this but there was no conclusive answer, so I'm > reopening this. 
It hasn't budged an inch since then: your "Entering interactive session" problem is the same one everyone has; it gets reported on SF's bug and/or support managers at least daily; SF has not fixed it yet; these days they don't even respond to scp bug reports anymore; the cause appears to be SF's custom sfshell, and only SF can change that; the only known workarounds are to (a) modify files on SF directly (they suggest vi ), or (b) initiate scp *from* SF, using your local machine as a server (if you can do that -- I cannot, or at least haven't succeeded). From martin@loewis.home.cs.tu-berlin.de Thu Dec 28 22:52:02 2000 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Thu, 28 Dec 2000 23:52:02 +0100 Subject: [Python-Dev] curses in the core? Message-ID: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de> > If curses is a core facility now, the default build should treat it > as one. ... > IMO ssl isn't an issue because it's not documented as being in the > standard module set. ... > 3. Documented as being in the core but not built in by default. > My more general claim is that the existence of class 3 is a problem In the case of curses, I believe there is a documentation error in the 2.0 documentation. The curses package is listed under "Generic Operating System Services". I believe this is wrong, it should be listed as "Unix Specific Services". Unless I'm mistaken, the curses module is not available on the Mac and on Windows. With that change, the curses module would then fall into Eric's category 2 (Not documented as being in the core and not built in by default). That documentation change should be carried out even if curses is autoconfigured; autoconf is used on Unix only, anyway. Regards, Martin P.S. The "Python Library Reference" content page does not mention the word "core" at all, except as part of asyncore...
From thomas@xs4all.net Thu Dec 28 22:58:25 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 23:58:25 +0100 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 12:08:18PM -0500 References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> Message-ID: <20001228235824.E1811@xs4all.nl> On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote: > I've seen a thread on this but there was no conclusive answer, so I'm > reopening this. Actually there was: it's all SourceForge's fault. (At least that's my professional opinion ;) They honestly have a strange setup, though how strange and to what end I cannot tell. > Would somebody please figure out a way to update the PEPs? It's kind > of pathetic to see the website not have the latest versions... The way to update the peps is by ssh'ing into shell.sourceforge.net, and then scp'ing the files from your work repository to the htdocs/peps directory. That is, until SF fixes the scp problem. This method works (I just updated all PEPs to up-to-date CVS versions) but it's a bit cumbersome. And it only works if you have ssh access to your work environment. And it's damned hard to script; I tried playing with a single ssh command that did all the work, but between shell weirdness, scp weirdness and a genuine bash bug I couldn't figure it out. I assume that SF is aware of the severity of this problem, and is working on something akin to a fix or workaround. Until then, I can do an occasional update of the PEPs, for those that can't themselves. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From thomas@xs4all.net Thu Dec 28 23:05:28 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 29 Dec 2000 00:05:28 +0100 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <20001228235824.E1811@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 11:58:25PM +0100 References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> <20001228235824.E1811@xs4all.nl> Message-ID: <20001229000528.F1811@xs4all.nl> On Thu, Dec 28, 2000 at 11:58:25PM +0100, Thomas Wouters wrote: > On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote: > > Would somebody please figure out a way to update the PEPs? It's kind > > of pathetic to see the website not have the latest versions... > > The way to update the peps is by ssh'ing into shell.sourceforge.net, and > then scp'ing the files from your work repository to the htdocs/peps [ blah blah ] And then they fixed it ! At least, for me, direct scp now works fine. (I should've tested that before posting my blah blah, sorry.) Anybody else, like people using F-secure ssh (unix or windows) experience the same ? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From MarkH@ActiveState.com Thu Dec 28 23:15:01 2000 From: MarkH@ActiveState.com (Mark Hammond) Date: Fri, 29 Dec 2000 10:15:01 +1100 Subject: [Python-Dev] chomp()? In-Reply-To: <14923.31238.65155.496546@buffalo.fnal.gov> Message-ID: > -1 from me. P=NP (Python is not Perl). "Chomp" is an > excessively cute name. > And like you said, this is too much like "rstrip" to merit a separate > method. My thoughts exactly. I can't remember _ever_ wanting to chomp() when rstrip() wasnt perfectly suitable. I'm sure it happens, but not often enough to introduce an ambiguous new function purely for "feature parity" with Perl. Mark. From esr@thyrsus.com Thu Dec 28 23:25:28 2000 From: esr@thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 18:25:28 -0500 Subject: [Python-Dev] Re: curses in the core? 
In-Reply-To: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Thu, Dec 28, 2000 at 11:52:02PM +0100 References: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de> Message-ID: <20001228182528.A10743@thyrsus.com> Martin v. Loewis : > In the case of curses, I believe there is a documentation error in the > 2.0 documentation. The curses packages is listed under "Generic > Operating System Services". I believe this is wrong, it should be listed > as "Unix Specific Services". I agree that this is an error and should be fixed. > Unless I'm mistaken, the curses module is not available on the Mac and > on Windows. With that change, the curses module would then fall into > Eric's category 2 (Not documented as being in the core and not built > in by default). Well...that's a definitional question that is part of the larger issue here. What does being in the Python core mean? There are two potential definitions: 1. Documentation says it's available on all platforms. 2. Documentation restricts it to one of the three platform groups (Unix/Windows/Mac) but implies that it will be available on any OS in that group. I think the second one is closer to what application programmers thinking about which batteries are included expect. But I could be persuaded otherwise by a good argument. -- Eric S. Raymond The difference between death and taxes is death doesn't get worse every time Congress meets -- Will Rogers From akuchlin@mems-exchange.org Fri Dec 29 00:33:36 2000 From: akuchlin@mems-exchange.org (A.M. Kuchling) Date: Thu, 28 Dec 2000 19:33:36 -0500 Subject: [Python-Dev] Bookstore completed Message-ID: <200012290033.TAA01295@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> OK, I think I'm ready to declare the Python bookstore complete enough to go public. Before I set up redirects from www.python.org, please take another look. (More book descriptions would be helpful...) 
http://www.amk.ca/bookstore/ --amk From akuchlin@mems-exchange.org Fri Dec 29 00:46:16 2000 From: akuchlin@mems-exchange.org (A.M. Kuchling) Date: Thu, 28 Dec 2000 19:46:16 -0500 Subject: [Python-Dev] Help wanted with setup.py script Message-ID: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> Want to help with the laudable goal of automating the Python build process? It'll need lots of testing on many different platforms, and I'd like to start the process now. First, download the setup.py script from http://www.amk.ca/files/python/setup.py Next, drop it in the root directory of your Python source tree and run "python setup.py build". If it dies with an exception, let me know. (Replies to this list are OK.) If it runs to completion, look in the Modules/build/lib.<plat> directory to see which modules got built. (On my system, <plat> is "linux-i686-2.0", but of course this will depend on your platform.) Is anything missing that should have been built? (_tkinter.so is the prime candidate; the autodetection code is far too simple at the moment and assumes one particular version of Tcl and Tk.) Did an attempt at building a module fail? These indicate problems autodetecting something, so if you can figure out how to find the required library or include file, let me know what to do. --amk
PythonLabs at Digital Creations From tim.one@home.com Fri Dec 29 04:25:44 2000 From: tim.one@home.com (Tim Peters) Date: Thu, 28 Dec 2000 23:25:44 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <20001228212115.C1811@xs4all.nl> Message-ID: [Fred, suggested doc change near the end] [Thomas Wouters] > (MAL and I already discussed this in private mail: Robin did mean > try/except/else, and 'finally' already executes when returning > directly from the 'try' block, even in Python 1.5) > > This code: > > try: > return > except: > pass > else: > print "returning" > > will indeed not print 'returning', but I believe it's by design. > I'm against changing it, in any case, and not just because it'd > break code :) If you want something that always executes, use a > 'finally'. Or don't return from the 'try', but return in the > 'else' clause. Guido's out of town again, so I'll channel him: Thomas is correct on all counts. In try/else, the "else" clause should execute if and only if control "falls off the end" of the "try" block. IOW, consider: try: arbitrary stuff x = 1 An "else" clause added to that "try" should execute when and only when the code as written executes the "x = 1" after the block. When "arbitrary stuff" == "return", control does not fall off the end, so "else" shouldn't trigger. Same thing if "arbitrary stuff" == "break" and we're inside a loop, or "continue" after Thomas's patch gets accepted. > The 'except' clause is documented to execute if a matching > exception occurs, and 'else' if no exception occurs. Yup, and that's imprecise: the same words are used to describe (part of) when 'finally' executes, but they weren't intended to be the same. > Maybe the intent of the 'else' clause would be clearer if it > was documented to 'execute if the try: clause finishes without > an exception being raised' ? Sorry, I don't find that any clearer. 
Let's be explicit: The optional 'else' clause is executed when the 'try' clause terminates by any means other than an exception or executing a 'return', 'continue' or 'break' statement. Exceptions in the 'else' clause are not handled by the preceding 'except' clauses. > The 'else' clause isn't executed when you 'break' or (after > applying my continue-in-try patch ;) 'continue' out of the > 'try', either. Hey, now you're channeling me ! Be afraid -- be very afraid. From moshez@zadka.site.co.il Fri Dec 29 14:42:44 2000 From: moshez@zadka.site.co.il (Moshe Zadka) Date: Fri, 29 Dec 2000 16:42:44 +0200 (IST) Subject: [Python-Dev] chomp()? In-Reply-To: <3A4B7BB1.F09660ED@lemburg.com> References: <3A4B7BB1.F09660ED@lemburg.com>, <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <20001229144244.D5AD0A84F@darjeeling.zadka.site.co.il> On Thu, 28 Dec 2000, "M.-A. Lemburg" wrote: [about chomp] > We already have .splitlines() which does the above (remove > line breaks) not only for a single line, but for many lines at once. > > Even better: .splitlines() also does the right thing for Unicode. OK, I retract my earlier +1, and instead I move that this be added to the FAQ. Where is the FAQ maintained nowadays? The grail link doesn't work anymore. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From loewis@informatik.hu-berlin.de Fri Dec 29 16:52:13 2000 From: loewis@informatik.hu-berlin.de (Martin von Loewis) Date: Fri, 29 Dec 2000 17:52:13 +0100 (MET) Subject: [Python-Dev] Re: [Patch #103002] Fix for #116285: Properly raise UnicodeErrors Message-ID: <200012291652.RAA20251@pandora.informatik.hu-berlin.de> [resent since python.org ran out of disk space] > My only problem with it is your copyright notice. AFAIK, patches to > the Python core cannot contain copyright notices without proper > license information. OTOH, I don't think that these minor changes > really warrant adding a complete license paragraph. 
I'd like to get an "official" clarification on this question. Is it the case that patches containing copyright notices are only accepted if they are accompanied with license information? I agree that the changes are minor, I also believe that I hold the copyright to the changes whether I attach a notice or not (at least according to our local copyright law). What concerns me is that without such a notice, gencodec.py looks as if CNRI holds the copyright to it. I'm not willing to assign the copyright of my changes to CNRI, and I'd like to avoid the impression of doing so. What is even more concerning is that CNRI also holds the copyright to the generated files, even though they are derived from information made available by the Unicode consortium! Regards, Martin From tim.one@home.com Fri Dec 29 19:56:36 2000 From: tim.one@home.com (Tim Peters) Date: Fri, 29 Dec 2000 14:56:36 -0500 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <20001229000528.F1811@xs4all.nl> Message-ID: [Thomas Wouters] > And then they fixed it ! At least, for me, direct scp now works > fine. (I should've tested that before posting my blah blah, sorry.) I tried it immediately before posting my blah-blah yesterday, and it was still hanging. > Anybody else, like people using F-secure ssh (unix or windows) > experience the same ? Same here: I tried it again just now (under Win98 cmdline ssh/scp) and it worked fine! We're in business again. Thanks for fixing it, Thomas . now-if-only-we-could-get-python-dev-email-on-an-approximation-to-the-same-day-it's-sent-ly y'rs - tim From tim.one@home.com Fri Dec 29 20:27:40 2000 From: tim.one@home.com (Tim Peters) Date: Fri, 29 Dec 2000 15:27:40 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: Message-ID: [Robin Becker] > The 2.0 docs clearly state 'The optional else clause is executed when no > exception occurs in the try clause.' This makes it sound as though it > gets executed on the 'way out'. Of course.
That's not what the docs meant, though, and Guido is not going to change the implementation now because that would break code that relies on how Python has always *worked* in these cases. The way Python works is also the way Guido intended it to work (I'm allowed to channel him when he's on vacation <0.9 wink>). Indeed, that's why I suggested a specific doc change. If your friend would also be confused by that, then we still have a problem; else we don't. From tim.one@home.com Fri Dec 29 20:37:08 2000 From: tim.one@home.com (Tim Peters) Date: Fri, 29 Dec 2000 15:37:08 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: Message-ID: [Fred] > This can certainly be clarified in the documentation -- > please file a bug report at http://sourceforge.net/projects/python/. Here you go: https://sourceforge.net/bugs/?func=detailbug&bug_id=127098&group_id=5470 From thomas@xs4all.net Fri Dec 29 20:59:16 2000 From: thomas@xs4all.net (Thomas Wouters) Date: Fri, 29 Dec 2000 21:59:16 +0100 Subject: [Python-Dev] Fwd: try...else In-Reply-To: ; from tim.one@home.com on Fri, Dec 29, 2000 at 03:27:40PM -0500 References: Message-ID: <20001229215915.L1281@xs4all.nl> On Fri, Dec 29, 2000 at 03:27:40PM -0500, Tim Peters wrote: > Indeed, that's why I suggested a specific doc change. If your friend would > also be confused by that, then we still have a problem; else we don't. Note that I already uploaded a patch to fix the docs, assigned to fdrake, using Tim's wording exactly. (patch #103045) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From moshez@zadka.site.co.il Sun Dec 31 00:33:30 2000 From: moshez@zadka.site.co.il (Moshe Zadka) Date: Sun, 31 Dec 2000 02:33:30 +0200 (IST) Subject: [Python-Dev] FAQ Horribly Out Of Date Message-ID: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Hi! The current FAQ is horribly out of date.
I think the FAQ-Wizard method has proven itself not very efficient (for example, apparently no one noticed until now that it's not working <0.2 wink>). Is there any hope of putting the FAQ in Misc/, having a script which scp's it to the SF page, and making that the official FAQ? On a related note, what is the current status of the PSA? Is it officially dead? -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tim.one@home.com Sat Dec 30 20:48:08 2000 From: tim.one@home.com (Tim Peters) Date: Sat, 30 Dec 2000 15:48:08 -0500 Subject: [Python-Dev] Most everything is busted Message-ID: Add this error to the pot:

"""
http://www.python.org/cgi-bin/moinmoin

Proxy Error

The proxy server received an invalid response from an upstream server. The proxy server could not handle the request GET /cgi-bin/moinmoin.

Reason: Document contains no data
-------------------------------------------------------------------
Apache/1.3.9 Server at www.python.org Port 80
"""

Also, as far as I can tell:

+ news->mail for c.l.py hasn't delivered anything for well over 24 hours.
+ No mail to Python-Dev has showed up in the archives (let alone been delivered) since Fri, 29 Dec 2000 16:42:44 +0200 (IST).
+ The other Python mailing lists appear equally dead.

time-for-a-new-year!-ly y'rs - tim From barry@wooz.org Sun Dec 31 01:06:23 2000 From: barry@wooz.org (Barry A. Warsaw) Date: Sat, 30 Dec 2000 20:06:23 -0500 Subject: [Python-Dev] Re: Most everything is busted References: Message-ID: <14926.34447.60988.553140@anthem.concentric.net> >>>>> "TP" == Tim Peters writes:

TP> + news->mail for c.l.py hasn't delivered anything for well
TP> over 24 hours.

TP> + No mail to Python-Dev has showed up in the archives (let
TP> alone been delivered) since Fri, 29 Dec 2000 16:42:44 +0200
TP> (IST).

TP> + The other Python mailing lists appear equally dead.
There's a stupid, stupid bug in Mailman 2.0, which I've just fixed and (hopefully) unjammed things on the Mailman end[1]. We're still probably subject to the Postfix delays unfortunately; I think those are DNS related, and I've gotten a few other reports of DNS oddities, which I've forwarded off to the DC sysadmins. I don't think that particular problem will be fixed until after the New Year. relax-and-enjoy-the-quiet-ly y'rs, -Barry [1] For those who care: there's a resource throttle in qrunner which limits the number of files any single qrunner process will handle. qrunner does a listdir() on the qfiles directory and ignores any .msg file it finds (it only does the bulk of the processing on the corresponding .db files). But it performs the throttle check on every file in listdir() so depending on the order that listdir() returns and the number of files in the qfiles directory, the throttle check might get triggered before any .db file is seen. Wedge city. This is serious enough to warrant a Mailman 2.0.1 release, probably mid-next week. From gstein@lyra.org Sun Dec 31 10:19:50 2000 From: gstein@lyra.org (Greg Stein) Date: Sun, 31 Dec 2000 02:19:50 -0800 Subject: [Python-Dev] FAQ Horribly Out Of Date In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200 References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Message-ID: <20001231021950.M28628@lyra.org> On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote: >... > On a related note, what is the current status of the PSA? Is it officially > dead? The PSA was always kind of a (legal) fiction with the basic intent to help provide some funding for Python development. Since that isn't occurring at CNRI any more, the PSA is a bit moot. There was always some idea that maybe the PSA would be the "sponsor" (and possibly the beneficiary) of the conferences. That wasn't ever really formalized either. 
From the Consortium meeting back in July, when we spoke with Bob Kahn and Al Vezza, I recall that they agreed the PSA was moot now. So while I can't say it is "officially dead", it is fair to say that it is dead for all intents and purposes. There is little motivation or purpose for it at this point in time. Cheers, -g -- Greg Stein, http://www.lyra.org/ From akuchlin@mems-exchange.org Sun Dec 31 15:58:12 2000 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Sun, 31 Dec 2000 10:58:12 -0500 Subject: [Python-Dev] FAQ Horribly Out Of Date In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200 References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Message-ID: <20001231105812.A12168@newcnri.cnri.reston.va.us> On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote: >The current FAQ is horribly out of date. I think the FAQ-Wizard method >has proven itself not very efficient (for example, apparently no one >noticed until now that it's not working <0.2 wink>). Is there any It also leads to one section of the FAQ (#3, I think) having something like 60 questions jumbled together. IMHO the FAQ should be a text file, perhaps in the PEP format so it can be converted to HTML, and it should have an editor who'll arrange it into smaller sections. Any volunteers? (Must ... resist ... urge to volunteer myself... help me, Spock...) --amk From skip@mojam.com (Skip Montanaro) Sun Dec 31 19:25:18 2000 From: skip@mojam.com (Skip Montanaro) (Skip Montanaro) Date: Sun, 31 Dec 2000 13:25:18 -0600 (CST) Subject: [Python-Dev] plz test bsddb using shared linkage Message-ID: <14927.34846.153117.764547@beluga.mojam.com> A bug was filed on SF contending that the default linkage for bsddb should be shared instead of static because some Linux systems ship multiple versions of libdb.
Would those of you who can and do build bsddb (probably only unixoids of some variety) please give this simple test a try? Uncomment the *shared* line in Modules/Setup.config.in, re-run configure, build Python and then try:

    import bsddb
    db = bsddb.btopen("/tmp/dbtest.db", "c")
    db["1"] = "1"
    print db["1"]
    db.close()
    del db

If this doesn't fail for anyone I'll check the change in and close the bug report, otherwise I'll add a(nother) comment to the bug report that *shared* breaks bsddb for others and close the bug report. Thx, Skip
I'm more concerned that the whole of python.org has barely been updated since March; huge chunks of the FAQ are still relevant, but, e.g., the Job Board hasn't been touched in over 3 months; the News got so out of date Guido deleted the whole section; etc. > On a related note, what is the current status of the PSA? Is it > officially dead? It appears that CNRI can only think about one thing at a time <0.5 wink>. For the last 6 months, that thing has been the license. If they ever resolve the GPL compatibility issue, maybe they can be persuaded to think about the PSA. In the meantime, I'd suggest you not renew . From tim.one@home.com Sun Dec 31 22:12:43 2000 From: tim.one@home.com (Tim Peters) Date: Sun, 31 Dec 2000 17:12:43 -0500 Subject: [Python-Dev] plz test bsddb using shared linkage In-Reply-To: <14927.34846.153117.764547@beluga.mojam.com> Message-ID: [Skip Montanaro] > ... > Would those of you who can and do build bsddb (probably only > unixoids of some variety) please give this simple test a try? Just noting that bsddb already ships with the Windows installer as a (shared) DLL. But it's an old (1.85?) Windows port from Sam Rushing. From gward at mems-exchange.org Fri Dec 1 00:14:39 2000 From: gward at mems-exchange.org (Greg Ward) Date: Thu, 30 Nov 2000 18:14:39 -0500 Subject: [Python-Dev] PEP 229 and 222 In-Reply-To: <20001128215748.A22105@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Tue, Nov 28, 2000 at 09:57:48PM -0500 References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> Message-ID: <20001130181438.A21596@ludwig.cnri.reston.va.us> On 28 November 2000, Andrew Kuchling said: > On Tue, Nov 28, 2000 at 06:01:38PM -0500, Guido van Rossum wrote: > >- Always shared libs. What about Unixish systems that don't have > > shared libs? 
What if you just want something to be hardcoded as > > statically linked, e.g. for security reasons? (On the other hand > Beats me. I'm not even sure if the Distutils offers a way to compile > a static Python binary. (GPW: well, does it?) It's in the CCompiler interface, but hasn't been exposed to the outside world. (IOW, it's mainly a question of designing the right setup script/command line interface: the implementation should be fairly straightforward, assuming the existing CCompiler classes do the right thing for generating binary executables.) Greg From gward at mems-exchange.org Fri Dec 1 00:19:38 2000 From: gward at mems-exchange.org (Greg Ward) Date: Thu, 30 Nov 2000 18:19:38 -0500 Subject: [Python-Dev] A house upon the sand In-Reply-To: ; from tim.one@home.com on Wed, Nov 29, 2000 at 01:23:10AM -0500 References: <200011281510.KAA03475@cj20424-a.reston1.va.home.com> Message-ID: <20001130181937.B21596@ludwig.cnri.reston.va.us> On 29 November 2000, Tim Peters said: > [Guido] > > ... > > Because of its importance, the deprecation time of the string module > > will be longer than that of most deprecated modules. I expect it > > won't be removed until Python 3000. > > I see nothing in the 2.0 docs, code, or "what's new" web pages saying that > it's deprecated. So I don't think you can even start the clock on this one > before 2.1 (a fuzzy stmt on the web page for the unused 1.6 release doesn't > count ...). FWIW, I would argue against *ever* removing (much less "deprecating", ie. threatening to remove) the string module. To a rough approximation, every piece of Python code in existence prior to Python 1.6 depends on the string module. I for one do not want to have to change all occurrences of string.foo(x) to x.foo() -- it just doesn't buy enough to make it worth changing all that code.
Not only does the amount of code to change mean the change would be non-trivial, it's not always the right thing, especially if you happen to be one of the people who dislikes the "delim.join(list)" idiom. (I'm still undecided.) Greg From ping at lfw.org Fri Dec 1 11:23:56 2000 From: ping at lfw.org (Ka-Ping Yee) Date: Fri, 1 Dec 2000 02:23:56 -0800 (PST) Subject: [Python-Dev] Cryptic error messages Message-ID: An attempt to use sockets for the first time yesterday left a friend of mine bewildered:

    >>> import socket
    >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    >>> s.connect('localhost:234')
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: 2-sequence, 13-sequence
    >>>

"What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked. I see in getargs.c (line 275) that this type of message is documented: /* Convert a tuple argument. [...] If the argument is invalid: [...]
*msgbuf contains an error message, whose format is: "<typename1>, <typename2>", where: <typename1> is the name of the expected type, and <typename2> is the name of the actual type, (so you can surround it by "expected ... found"), and msgbuf is returned. */ It's clear that the socketmodule is not prepending "expected" and appending "found", as the author of converttuple intended. But when i grepped through the source code, i couldn't find anyone applying this "expected %s found" % msgbuf convention outside of getargs.c. Is it really in use? Could we just change getargs.c so that converttuple() returns a message like "expected ..., got ..." instead of seterror()? Additionally it would be nice to say '13-character string' instead of '13-sequence'... -- ?!ng "All models are wrong; some models are useful." -- George Box From mwh21 at cam.ac.uk Fri Dec 1 12:20:23 2000 From: mwh21 at cam.ac.uk (Michael Hudson) Date: 01 Dec 2000 11:20:23 +0000 Subject: [Python-Dev] Cryptic error messages In-Reply-To: Ka-Ping Yee's message of "Fri, 1 Dec 2000 02:23:56 -0800 (PST)" References: Message-ID: Ka-Ping Yee writes: > An attempt to use sockets for the first time yesterday left a > friend of mine bewildered:
>
> >>> import socket
> >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> >>> s.connect('localhost:234')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: 2-sequence, 13-sequence
> >>>
>
> "What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked. > I'm not sure about the general case, but in this case you could do something like: http://sourceforge.net/patch/?func=detailpatch&patch_id=102599&group_id=5470 Now you get an error message like: TypeError: getsockaddrarg: AF_INET address must be tuple, not string Cheers, M. -- I have gathered a posie of other men's flowers, and nothing but the thread that binds them is my own.
-- Montaigne From guido at python.org Fri Dec 1 14:02:02 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 01 Dec 2000 08:02:02 -0500 Subject: [Python-Dev] TypeError: foo, bar In-Reply-To: Your message of "Fri, 01 Dec 2000 07:39:57 +0100." <008f01c05b61$877263b0$3c6340d5@hagrid> References: <008f01c05b61$877263b0$3c6340d5@hagrid> Message-ID: <200012011302.IAA31609@cj20424-a.reston1.va.home.com> > just stumbled upon yet another (high-profile) python newbie > confused a "TypeError: read-only character buffer, dictionary" > message. > > how about changing "read-only character buffer" to "string > or read-only character buffer", and the "foo, bar" format to > "expected foo, found bar", so we get: > > "TypeError: expected string or read-only character > buffer, found dictionary" The first was easy, and I've done it. The second one, for some reason, is hard. I forget why. Sorry. --Guido van Rossum (home page: http://www.python.org/~guido/) From cgw at fnal.gov Fri Dec 1 14:41:04 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Fri, 1 Dec 2000 07:41:04 -0600 (CST) Subject: [Python-Dev] TypeError: foo, bar In-Reply-To: <008f01c05b61$877263b0$3c6340d5@hagrid> References: <008f01c05b61$877263b0$3c6340d5@hagrid> Message-ID: <14887.43632.812342.414156@buffalo.fnal.gov> Fredrik Lundh writes: > how about changing "read-only character buffer" to "string > or read-only character buffer", and the "foo, bar" format to > "expected foo, found bar", so we get: > > "TypeError: expected string or read-only character > buffer, found dictionary" +100. Recently, I've been teaching Python to some beginners and they find this message absolutely inscrutable. Also agree with Tim about "found" vs. "got", but this is of secondary importance. 
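The mistake behind Ka-Ping's report is still easy to reproduce; a minimal sketch in modern Python 3 follows (the exact TypeError wording varies by version, and no network traffic occurs because the address check fails before any connection attempt):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # One "host:port" string where a (host, port) tuple is expected --
    # the call that once produced "TypeError: 2-sequence, 13-sequence".
    s.connect("localhost:234")
    err = None
except TypeError as exc:
    err = exc  # e.g. an "AF_INET address must be tuple ..." message
finally:
    s.close()

print(type(err).__name__, err)

# The intended call spells the address out as an explicit tuple:
#     s.connect(("localhost", 234))
```

Today's message names the expected address form directly, which is the style of improvement the thread was asking for.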
From moshez at math.huji.ac.il Fri Dec 1 15:26:03 2000 From: moshez at math.huji.ac.il (Moshe Zadka) Date: Fri, 1 Dec 2000 16:26:03 +0200 (IST) Subject: [Python-Dev] [OT] Change of Address Message-ID: I'm sorry to bother you all with this, but from time to time you might need to reach me by e-mail... 30 days from now, this e-mail address will no longer be valid. Please use anything at zadka.site.co.il to reach me. Thank you for your time. -- Moshe Zadka -- 95855124 http://advogato.org/person/moshez From gward at mems-exchange.org Fri Dec 1 16:14:53 2000 From: gward at mems-exchange.org (Greg Ward) Date: Fri, 1 Dec 2000 10:14:53 -0500 Subject: [Python-Dev] PEP 229 and 222 In-Reply-To: <014301c05b6e$269716a0$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Fri, Dec 01, 2000 at 09:10:21AM +0100 References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> <20001130181438.A21596@ludwig.cnri.reston.va.us> <014301c05b6e$269716a0$e000a8c0@thomasnotebook> Message-ID: <20001201101452.A26074@ludwig.cnri.reston.va.us> On 01 December 2000, Thomas Heller said: > Distutils currently only supports build_*** commands for > C-libraries and Python extensions. > > Shouldn't there also be build commands for shared libraries, > executable programs and static Python binaries? Andrew and I talked about this a bit yesterday, and the proposed interface is as follows: python setup.py build_ext --static will compile all extensions in the current module distribution, but instead of creating a .so (.pyd) file for each one, will create a new python binary in build/bin.. Issue to be resolved: what to call the new python binary, especially when installing it (presumably we *don't* want to clobber the stock binary, but supplement it with (eg.) "foopython").
Note that there is no provision for selectively building some extensions as shared. This means that Andrew's Distutil-ization of the standard library will have to override the build_ext command and have some extra way to select extensions for shared/static. Neither of us considered this a problem. > BTW: Distutils-sig seems pretty dead these days... Yeah, that's a combination of me playing on other things and python.net email being dead for over a week. I'll cc the sig on this and see if this interface proposal gets anyone's attention. Greg From jeremy at alum.mit.edu Fri Dec 1 20:27:14 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Dec 2000 14:27:14 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test Message-ID: <14887.64402.88530.714821@bitdiddle.concentric.net> There was recently some idle chatter in Guido's living room about using a unit testing framework (like PyUnit) for the Python regression test suite. We're also writing tests for some DC projects, and need to decide what framework to use. Does anyone have opinions on test frameworks? A quick web search turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver that implements xUnit-style unit tests. Are there other tools we should consider? Is anyone else interested in migrating the current test suite to a new framework? I hope the new framework will allow us to improve the test suite in a number of ways:

- run an entire test suite to completion instead of stopping on the first failure
- clearer reporting of what went wrong
- better support for conditional tests, e.g. write a test for httplib that only runs if the network is up. This is tied into better error reporting, since the current test suite could only report that httplib succeeded or failed.

Jeremy From fdrake at acm.org Fri Dec 1 20:24:46 2000 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 1 Dec 2000 14:24:46 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net> References: <14887.64402.88530.714821@bitdiddle.concentric.net> Message-ID: <14887.64254.399477.935828@cj42289-a.reston1.va.home.com> Jeremy Hylton writes: > - better support for conditional tests, e.g. write a test for > httplib that only runs if the network is up. This is tied into > better error reporting, since the current test suite could only > report that httplib succeeded or failed. There is a TestSkipped exception that can be raised with an explanation of why. It's used in the largefile test (at least). I think it is documented in the README. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From akuchlin at mems-exchange.org Fri Dec 1 20:58:27 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Fri, 1 Dec 2000 14:58:27 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 02:27:14PM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> Message-ID: <20001201145827.D16751@kronos.cnri.reston.va.us> On Fri, Dec 01, 2000 at 02:27:14PM -0500, Jeremy Hylton wrote: >There was recently some idle chatter in Guido's living room about >using a unit testing framework (like PyUnit) for the Python regression >test suite. We're also writing tests for some DC projects, and need Someone remembered my post of 23 Nov, I see... The only other test framework I know of is the unittest.py inside Quixote, written because we thought PyUnit was kind of clunky. Greg Ward, who primarily wrote it, used more sneaky interpreter tricks to make the interface more natural, though it still worked with Jython last time we checked (some time ago, though). No GUI, but it can optionally show the code coverage of a test suite, too. 
See http://x63.deja.com/=usenet/getdoc.xp?AN=683946404 for some notes on using it. Obviously I think the Quixote unittest.py is the best choice for the stdlib. --amk From jeremy at alum.mit.edu Fri Dec 1 21:55:28 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Dec 2000 15:55:28 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us> References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> Message-ID: <14888.4160.838336.537708@bitdiddle.concentric.net> Is there any documentation for the Quixote unittest tool? The Example page is helpful, but it feels like there are some details that are not explained. Jeremy From akuchlin at mems-exchange.org Fri Dec 1 22:12:12 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Fri, 1 Dec 2000 16:12:12 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14888.4160.838336.537708@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 03:55:28PM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> Message-ID: <20001201161212.A12372@kronos.cnri.reston.va.us> On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote: >Is there any documentation for the Quixote unittest tool? The Example >page is helpful, but it feels like there are some details that are not >explained. I don't believe we've written docs at all for internal use. What details seem to be missing? 
--amk From jeremy at alum.mit.edu Fri Dec 1 22:21:27 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Dec 2000 16:21:27 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <20001201161212.A12372@kronos.cnri.reston.va.us> References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> <20001201161212.A12372@kronos.cnri.reston.va.us> Message-ID: <14888.5719.844387.435471@bitdiddle.concentric.net> >>>>> "AMK" == Andrew Kuchling writes: AMK> On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote: >> Is there any documentation for the Quixote unittest tool? The >> Example page is helpful, but it feels like there are some details >> that are not explained. AMK> I don't believe we've written docs at all for internal use. AMK> What details seem to be missing? Details:

- I assume setup/shutdown are equivalent to setUp/tearDown
- Is it possible to override constructor for TestScenario?
- Is there something equivalent to PyUnit self.assert_
- What does parse_args() do?
- What does run_scenarios() do?
- If I have multiple scenarios, how do I get them to run?
Jeremy From akuchlin at mems-exchange.org Fri Dec 1 22:34:30 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Fri, 1 Dec 2000 16:34:30 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14888.5719.844387.435471@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 04:21:27PM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> <20001201161212.A12372@kronos.cnri.reston.va.us> <14888.5719.844387.435471@bitdiddle.concentric.net> Message-ID: <20001201163430.A12417@kronos.cnri.reston.va.us> On Fri, Dec 01, 2000 at 04:21:27PM -0500, Jeremy Hylton wrote: > - I assume setup/shutdown are equivalent to setUp/tearDown Correct. > - Is it possible to override constructor for TestScenario? Beats me; I see no reason why you couldn't, though. > - Is there something equivalent to PyUnit self.assert_ Probably test_bool(), I guess: self.test_bool('self.run.is_draft()') asserts that self.run.is_draft() will return true. Or does self.assert_() do something more? > - What does parse_args() do? > - What does run_scenarios() do? > - If I have multiple scenarios, how do I get them to run? These 3 questions are all related, really. At the bottom of our test scripts, we have the following stereotyped code:

    if __name__ == "__main__":
        (scenarios, options) = parse_args()
        run_scenarios (scenarios, options)

parse_args() ensures consistent arguments to test scripts; -c measures code coverage, -v is verbose, etc.
It also looks in the __main__ module and finds all subclasses of TestScenario, so you can do: python test_process_run.py # Runs all N scenarios python test_process_run.py ProcessRunTest # Runs all cases for 1 scenario python test_process_run.py ProcessRunTest:check_access # Runs one test case # in one scenario class --amk From tim.one at home.com Fri Dec 1 22:47:54 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 1 Dec 2000 16:47:54 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net> Message-ID: [Jeremy Hylton] > There was recently some idle chatter in Guido's living room about > using a unit testing framework (like PyUnit) for the Python regression > test suite. We're also writing tests for some DC projects, and need > to decide what framework to use. > > Does anyone have opinions on test frameworks? A quick web search > turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver > that implements xUnit-style unit tests. Are there other tools > we should consider? My own doctest is loved by people other than just me, but is aimed at ensuring that examples in docstrings work exactly as shown (which is why it starts with "doc" instead of "test"). > Is anyone else interested in migrating the current test suite to a new > framework? Yes. > I hope the new framework will allow us to improve the test > suite in a number of ways: > > - run an entire test suite to completion instead of stopping on > the first failure doctest does that. > - clearer reporting of what went wrong Ditto. > - better support for conditional tests, e.g. write a test for > httplib that only runs if the network is up. This is tied into > better error reporting, since the current test suite could only > report that httplib succeeded or failed. A doctest test is simply an interactive Python session pasted into a docstring (or more than one session, and/or interspersed with prose).
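[Tim's description is easy to make concrete. A minimal sketch using today's doctest API — the function and names below are invented for illustration, not taken from the thread:]

```python
import doctest

def square(x):
    """Return x squared.

    The interactive session below is documentation and test at once:

    >>> square(3)
    9
    >>> square(-2)
    4
    """
    return x * x

# Collect and run the examples embedded in the docstring.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(square, "square", module=False,
                        globs={"square": square}):
    runner.run(test)

assert runner.tries == 2 and runner.failures == 0
```

Nothing here looks like a tacked-on test: the docstring examples are the test cases, exactly as Tim describes.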
If you can write an example in the interactive shell, doctest will verify it still works as advertised. This allows for embedding unit tests into the docs for each function, method and class. Nothing about them "looks like" an artificial test tacked on: the examples in the docs *are* the test cases. I need to try the other frameworks. I dare say doctest is ideal for computational functions, where the intended input->output relationship can be clearly explicated via examples. It's useless for GUIs. Usefulness varies accordingly between those extremes (doctest is natural exactly to the extent that a captured interactive session is helpful for documentation purposes). testing-ain't-easy-ly y'rs - tim From barry at digicool.com Sat Dec 2 04:52:29 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Fri, 1 Dec 2000 22:52:29 -0500 Subject: [Python-Dev] PEP 231, __findattr__() Message-ID: <14888.29181.355023.669030@anthem.concentric.net> I've just uploaded PEP 231, which describes a new hook in the instance access mechanism, called __findattr__() after a similar mechanism that exists in Jython (but is not exposed at the Python layer). You can do all kinds of interesting things with __findattr__(), including implement the __of__() protocol of ExtensionClass, and thus implicit and explicit acquisitions, in pure Python. You can also do Java Bean-like interfaces and C++-like access control. The PEP contains sample implementations of all of these, although the latter isn't as clean as I'd like, due to other restrictions in Python. My hope is that __findattr__() would eliminate most, if not all, the need for ExtensionClass, at least within the Zope and ZODB contexts. I haven't tried to implement Persistent using it though. Since it's a long PEP, I won't include it here. You can read about it at this URL http://python.sourceforge.net/peps/pep-0231.html It includes a link to the patch implementing this feature on SourceForge. 
Enjoy, -Barry From moshez at math.huji.ac.il Sat Dec 2 10:11:50 2000 From: moshez at math.huji.ac.il (Moshe Zadka) Date: Sat, 2 Dec 2000 11:11:50 +0200 (IST) Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: On Fri, 1 Dec 2000, Barry A. Warsaw wrote: > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). There's one thing that bothers me about this: what exactly is "the call stack"? Let me clarify: what happens when you have threads. Both machine-level threads and stackless threads confuse the issues here, not to mention stackless continuations. Can you add a few words to the PEP about dealing with those? From mal at lemburg.com Sat Dec 2 11:03:11 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Sat, 02 Dec 2000 11:03:11 +0100 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: <3A28C8DF.E430484F@lemburg.com> "Barry A. Warsaw" wrote: > > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). > > You can do all kinds of interesting things with __findattr__(), > including implement the __of__() protocol of ExtensionClass, and thus > implicit and explicit acquisitions, in pure Python. You can also do > Java Bean-like interfaces and C++-like access control. The PEP > contains sample implementations of all of these, although the latter > isn't as clean as I'd like, due to other restrictions in Python. > > My hope is that __findattr__() would eliminate most, if not all, the > need for ExtensionClass, at least within the Zope and ZODB contexts. > I haven't tried to implement Persistent using it though.
The PEP does define when and how __findattr__() is called, but makes no statement about what it should do or return... Here's a slightly different idea: Given the name, I would expect it to go look for an attribute and then return the attribute and its container (this doesn't seem to be what you have in mind here, though). An alternative approach given the semantics above would then be to first try a __getattr__() lookup and revert to __findattr__() in case this fails. I don't think there is any need to overload __setattr__() in such a way, because you cannot be sure which object actually gets the new attribute. By exposing the functionality using a new builtin, findattr(), this could be used for all the examples you give too. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From barry at digicool.com Sat Dec 2 17:50:02 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Sat, 2 Dec 2000 11:50:02 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> <3A28C8DF.E430484F@lemburg.com> Message-ID: <14889.10298.621133.961677@anthem.concentric.net> >>>>> "M" == M writes: M> The PEP does define when and how __findattr__() is called, M> but makes no statement about what it should do or return... Good point. I've clarified that in the PEP. M> Here's a slightly different idea: M> Given the name, I would expect it to go look for an attribute M> and then return the attribute and its container (this doesn't M> seem to be what you have in mind here, though). No, because some applications won't need a wrapped object. E.g. in the Java bean example, it just returns the attribute (which is stored with a slightly different name). 
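[Barry's bean example can be approximated without the proposed hook. A sketch using __getattribute__/__setattr__ — the hooks that in today's Python give a class first crack at every access; the class and attribute names are hypothetical, and this is not the PEP's sample code:]

```python
class Bean:
    """Bean-style interposition: public names are served from privately
    stored slots, so the class mediates every attribute access.
    (Hypothetical sketch; PEP 231 proposed __findattr__ for this.)"""

    def __init__(self, x):
        self.x = x                          # routed through __setattr__

    def __getattribute__(self, name):
        if name.startswith('_'):
            # internal names bypass the interposition
            return object.__getattribute__(self, name)
        # a public name like 'x' is fetched from the private slot '_x'
        return object.__getattribute__(self, '_' + name)

    def __setattr__(self, name, value):
        # store public attributes under a mangled private name
        self.__dict__['_' + name] = value

b = Bean(6)
```

The caller still uses plain dot notation (`b.x`), which is the property Barry insists on above: the interposition is invisible at the call site.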
M> An alternative approach given the semantics above would then be M> to first try a __getattr__() lookup and revert to M> __findattr__() in case this fails. I don't think this is as useful. What would that buy you that you can't already do today? The key concept here is that you want to give the class first crack to interpose on every attribute access. You want this hook to get called before anybody else can get at, or set, your attributes. That gives you (the class) total control to implement whatever policy is useful. M> I don't think there is any need to overload __setattr__() in M> such a way, because you cannot be sure which object actually M> gets the new attribute. M> By exposing the functionality using a new builtin, findattr(), M> this could be used for all the examples you give too. No, because then people couldn't use the object in the normal dot-notational way. -Barry From tismer at tismer.com Sat Dec 2 17:27:33 2000 From: tismer at tismer.com (Christian Tismer) Date: Sat, 02 Dec 2000 18:27:33 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: <3A2922F5.C2E0D10@tismer.com> Hi Barry, "Barry A. Warsaw" wrote: > > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). > > You can do all kinds of interesting things with __findattr__(), > including implement the __of__() protocol of ExtensionClass, and thus > implicit and explicit acquisitions, in pure Python. You can also do > Java Bean-like interfaces and C++-like access control. The PEP > contains sample implementations of all of these, although the latter > isn't as clean as I'd like, due to other restrictions in Python. > > My hope is that __findattr__() would eliminate most, if not all, the > need for ExtensionClass, at least within the Zope and ZODB contexts. 
> I haven't tried to implement Persistent using it though. I have been using ExtensionClass for quite a long time, and I have to say that you indeed eliminate most of its need through this super-elegant idea. Congratulations! Besides acquisition and persistency interception, wrapping plain C objects and giving them Class-like behavior while retaining fast access to internal properties but being able to override methods by Python methods was my other use of ExtensionClass. I assume this is the other "20%" part you mention, which is much harder to achieve? But that part also looks easier to implement now, by the support of the __findattr__ method. > Since it's a long PEP, I won't include it here. You can read about it > at this URL > > http://python.sourceforge.net/peps/pep-0231.html Great. I had to read it twice, but it was fun. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From tismer at tismer.com Sat Dec 2 17:55:21 2000 From: tismer at tismer.com (Christian Tismer) Date: Sat, 02 Dec 2000 18:55:21 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: Message-ID: <3A292979.60BB1731@tismer.com> Moshe Zadka wrote: > > On Fri, 1 Dec 2000, Barry A. Warsaw wrote: > > > I've just uploaded PEP 231, which describes a new hook in the instance > > access mechanism, called __findattr__() after a similar mechanism that > > exists in Jython (but is not exposed at the Python layer). > > There's one thing that bothers me about this: what exactly is "the > call stack"? Let me clarify: what happens when you have threads. > Either machine-level threads and stackless threads confuse the issues > here, not to talk about stackless continuations.
Can you add a few > words to the PEP about dealing with those? As far as I understood the patch (just skimmed), there is no stack involved directly, but the instance increments and decrements a variable infindattr. + if (v != NULL && !inst->infindaddr && + (func = inst->in_class->cl_findattr)) + { + PyObject *args, *res; + args = Py_BuildValue("(OOO)", inst, name, v); + if (args == NULL) + return -1; + ++inst->infindaddr; + res = PyEval_CallObject(func, args); + --inst->infindaddr; This is: The call modifies the instance's state, while calling the findattr method. You are right: I see a serious problem with this. It doesn't even need continuations to get things messed up. Guido's proposed coroutines, together with uThread-Switching, might be able to enter the same instance twice with ease. Barry, on second thought, I feel this can become a problem in the future. This infindattr attribute only works correctly if we are guaranteed to use strict stack order of execution. What you're *intending* to do is to tell the PyEval_CallObject that it should not find the __findattr__ attribute. But this should be done only for this call and all of its descendants, not for a *fresh* access from elsewhere. The hard way to get out of this would be to stop scheduling in that case. Maybe this is very cheap, but quite inelegant. We have a quite peculiar system state here: A function call acts like an escape, to make all subsequent calls behave differently, until this call is finished. Without blocking microthreads, a clean way to do this would be a search up the frame chain, to see if there is a running __findattr__ method of this object. Fairly expensive. Well, the problem also exists with real threads, if they are allowed to switch in such a context. I fear it is necessary to either block this stuff until it is ready, or to maintain some thread-wise structure for the state of this object. Ok, after thinking some more, I'll start an extra message to Barry on this topic.
cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From tismer at tismer.com Sat Dec 2 18:21:18 2000 From: tismer at tismer.com (Christian Tismer) Date: Sat, 02 Dec 2000 19:21:18 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: <3A292F8D.7C616449@tismer.com> "Barry A. Warsaw" wrote: > > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). Ok, as I announced already, here are some thoughts on __findattr__, system state, and how it could work. Looking at your patch, I realize that you are blocking __findattr__ for your whole instance, until this call ends. This is not what you want to do, I guess. It ends up affecting the whole system state when threads are involved. Also you cannot use __findattr__ on any other attribute during this call. What you most probably want is this: __findattr__ should not be invoked again for this instance, with this attribute name, for this "thread", until you are done. The correct way to find out whether __findattr__ is active or not would be to look upwards the frame chain and inspect it. Moshe also asked about continuations: I think this would resolve quite fine. However we jump around, the current chain of frames dictates the semantics of __findattr__. It even applies to Guido's tamed coroutines, given that an explicit switch were allowed in the context of __findattr__. In a sense, we get some kind of dynamic context here, since we need to do a lookup for something in the dynamic call chain.
I guess this would be quite messy to implement, and inefficient. Isn't there a way to accomplish the desired effect without modifying the instance? In the context of __findattr__, *we* know that we don't want to get a recursive call. Let's assume __getattr__ and __setattr__ had yet another optional parameter: infindattr, defaulting to 0. We would then have to pass a positive value in this context, which would tell object.c not to try to invoke __findattr__ again. With explicit passing of state, no problems with threads can occur. Readability might improve as well. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From moshez at math.huji.ac.il Sun Dec 3 14:14:43 2000 From: moshez at math.huji.ac.il (Moshe Zadka) Date: Sun, 3 Dec 2000 15:14:43 +0200 (IST) Subject: [Python-Dev] Another Python Developer Missing Message-ID: Gordon McMillan is not a possible assignee in the assign_to field. -- Moshe Zadka -- 95855124 http://moshez.org From tim.one at home.com Sun Dec 3 18:35:36 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 3 Dec 2000 12:35:36 -0500 Subject: [Python-Dev] Another Python Developer Missing In-Reply-To: Message-ID: [Moshe Zadka] > Gordon McMillan is not a possible assignee in the assign_to field. We almost never add people as Python developers unless they ask for that, since it comes with responsibility as well as riches beyond the dreams of avarice. If Gordon would like to apply, we won't charge him any interest until 2001. From mal at lemburg.com Sun Dec 3 20:21:11 2000 From: mal at lemburg.com (M.-A.
Lemburg) Date: Sun, 03 Dec 2000 20:21:11 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib urllib.py,1.107,1.108 References: <200012031830.KAA30620@slayer.i.sourceforge.net> Message-ID: <3A2A9D27.AF43D665@lemburg.com> "Martin v. Löwis" wrote: > > Update of /cvsroot/python/python/dist/src/Lib > In directory slayer.i.sourceforge.net:/tmp/cvs-serv30506 > > Modified Files: > urllib.py > Log Message: > Convert Unicode strings to byte strings before passing them into specific > protocols. Closes bug #119822. > > ... > + > + def toBytes(url): > + """toBytes(u"URL") --> 'URL'.""" > + # Most URL schemes require ASCII. If that changes, the conversion > + # can be relaxed > + if type(url) is types.UnicodeType: > + try: > + url = url.encode("ASCII") You should make this: 'ascii' -- encoding names are lower case per convention (and the implementation has a short-cut to speed up conversion to 'ascii' -- not for 'ASCII'). > + except UnicodeError: > + raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters") Would it be better to use a simple ValueError here ? (UnicodeError is a subclass of ValueError, but the error doesn't really have something to do with Unicode conversions...)
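[The helper under review is easy to restate for readers following along. A sketch in modern spelling — to_bytes is a stand-in name, not the actual urllib code:]

```python
def to_bytes(url):
    # Encode a str URL as ASCII bytes, as the patched toBytes() does;
    # anything that is not a str passes through unchanged.
    if isinstance(url, str):
        try:
            url = url.encode('ascii')
        except UnicodeError:
            raise UnicodeError("URL " + repr(url) +
                               " contains non-ASCII characters")
    return url
```

Since UnicodeError subclasses ValueError, callers catching ValueError already catch this error too — which is why both positions in the exchange are defensible.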
> + return url > > def unwrap(url): -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tismer at tismer.com Sun Dec 3 21:01:07 2000 From: tismer at tismer.com (Christian Tismer) Date: Sun, 03 Dec 2000 22:01:07 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib filecmp.py,1.6,1.7 References: <200012032048.MAA10353@slayer.i.sourceforge.net> Message-ID: <3A2AA683.3840AA8A@tismer.com> Moshe Zadka wrote: > > Update of /cvsroot/python/python/dist/src/Lib > In directory slayer.i.sourceforge.net:/tmp/cvs-serv9465 > > Modified Files: > filecmp.py > Log Message: > Call of _cmp had wrong number of paramereters. > Fixed definition of _cmp. ... > ! return not abs(cmp(a, b, sh, st)) > except os.error: > return 2 Ugh! Wouldn't that be a fine chance to rename the cmp function in this module? Overriding a built-in is really not nice to have in a library. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From moshez at math.huji.ac.il Sun Dec 3 22:01:07 2000 From: moshez at math.huji.ac.il (Moshe Zadka) Date: Sun, 3 Dec 2000 23:01:07 +0200 (IST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib filecmp.py,1.6,1.7 In-Reply-To: <3A2AA683.3840AA8A@tismer.com> Message-ID: On Sun, 3 Dec 2000, Christian Tismer wrote: > Ugh! Wouldn't that be a fine chance to rename the cmp > function in this module? Overriding a built-in > is really not nice to have in a library. The fine chance was when we moved cmp.py->filecmp.py. Now it would just break backwards compatibility.
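[Christian's rename can be had without breaking the public API: define the function under a private name and alias it once at the end of the module. A sketch of a hypothetical module, not the real filecmp:]

```python
import os
import tempfile

def _cmp_contents(a, b):
    # Internal, non-shadowing name; Python 2's builtin cmp() would
    # remain usable everywhere in the module body.
    with open(a, 'rb') as fa, open(b, 'rb') as fb:
        return fa.read() == fb.read()

cmp = _cmp_contents  # public alias, assigned once at the very end

# quick demonstration with two identical files
d = tempfile.mkdtemp()
p1 = os.path.join(d, 'a')
p2 = os.path.join(d, 'b')
with open(p1, 'wb') as f:
    f.write(b'spam')
with open(p2, 'wb') as f:
    f.write(b'spam')
same = cmp(p1, p2)
```

Old callers keep using the module's `cmp` name, while the builtin is only shadowed at the one aliasing line.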
-- Moshe Zadka -- 95855124 http://moshez.org From tismer at tismer.com Sun Dec 3 21:12:15 2000 From: tismer at tismer.com (Christian Tismer) Date: Sun, 03 Dec 2000 22:12:15 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Libfilecmp.py,1.6,1.7 References: Message-ID: <3A2AA91F.843E2BAE@tismer.com> Moshe Zadka wrote: > > On Sun, 3 Dec 2000, Christian Tismer wrote: > > > Ugh! Wouldn't that be a fine chance to rename the cmp > > function in this module? Overriding a built-in > > is really not nice to have in a library. > > The fine chance was when we moved cmp.py->filecmp.py. > Now it would just break backwards compatability. Yes, I see. cmp belongs to the module's interface. Maybe it could be renamed anyway, and be assigned to cmp at the very end of the file, but not using cmp anywhere in the code. My first reaction on reading the patch was "juck!" since I didn't know this module. python-dev/null - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From martin at loewis.home.cs.tu-berlin.de Sun Dec 3 22:56:44 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 3 Dec 2000 22:56:44 +0100 Subject: [Python-Dev] PEP 231, __findattr__() Message-ID: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> > Isn't there a way to accomplish the desired effect without modifying > the instance? In the context of __findattr__, *we* know that we > don't want to get a recursive call. Let's assume __getattr__ and > __setattr__ had yet another optional parameter: infindattr, > defaulting to 0. We would than have to pass a positive value in > this context, which would object.c tell to not try to invoke > __findattr__ again. Who is "we" here? 
The Python code implementing __findattr__? How would it pass a value to __setattr__? It doesn't call __setattr__, instead it has "self.__myfoo = x"... I agree that the current implementation is not thread-safe. To solve that, you'd need to associate with each instance not a single "infindattr" attribute, but a whole set of them - one per "thread of execution" (which would be a thread-id in most threading systems). Of course, that would need some cooperation from any thread scheme (including uthreads), which would need to provide an identification for a "calling context". Regards, Martin From martin at loewis.home.cs.tu-berlin.de Sun Dec 3 23:07:17 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 3 Dec 2000 23:07:17 +0100 Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108 Message-ID: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de> > You should make this: 'ascii' -- encoding names are lower case per > convention (and the implementation has a short-cut to speed up > conversion to 'ascii' -- not for 'ASCII'). With conventions, it is a difficult story. I'm pretty certain that users typically see that particular american standard as ASCII (to the extent of calling it "a s c two"), not ascii. As for speed - feel free to change the code if you think it matters. > + raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters") > Would it be better to use a simple ValueError here ? (UnicodeError is a subclass of ValueError, but the error doesn't really have something to do with Unicode conversions...) Why does it not have to do with Unicode conversion? A conversion from Unicode to ASCII was attempted, and failed. I guess I would be more open to suggested changes if you had put them into the patch manager at the time you reviewed the patch...
Regards, Martin From tismer at tismer.com Sun Dec 3 22:38:11 2000 From: tismer at tismer.com (Christian Tismer) Date: Sun, 03 Dec 2000 23:38:11 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> Message-ID: <3A2ABD43.AB56BD60@tismer.com> "Martin v. Loewis" wrote: > > > Isn't there a way to accomplish the desired effect without modifying > > the instance? In the context of __findattr__, *we* know that we > > don't want to get a recursive call. Let's assume __getattr__ and > > __setattr__ had yet another optional parameter: infindattr, > > defaulting to 0. We would than have to pass a positive value in > > this context, which would object.c tell to not try to invoke > > __findattr__ again. > > Who is "we" here? The Python code implementing __findattr__? How would > it pass a value to __setattr__? It doesn't call __setattr__, instead > it has "self.__myfoo = x"... Ouch - right! Sorry :) > I agree that the current implementation is not thread-safe. To solve > that, you'd need to associate with each instance not a single > "infindattr" attribute, but a whole set of them - one per "thread of > execution" (which would be a thread-id in most threading systems). Of > course, that would need some cooperation from the any thread scheme > (including uthreads), which would need to provide an identification > for a "calling context". Right, that is one possible way to do it. I also thought about some alternatives, but they all sound too complicated to justify them. Also I don't think this is only thread-related, since mess can happen even with an explicit coroutine jmp. Furthermore, how to deal with multiple attribute names? The function works wrong if __findattr__ tries to inspect another attribute. IMO, the state of the current interpreter changes here (or should do so), and this changed state needs to be carried down with all subsequent function calls. 
confused - ly chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From mal at lemburg.com Sun Dec 3 23:51:10 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Sun, 03 Dec 2000 23:51:10 +0100 Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108 References: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de> Message-ID: <3A2ACE5E.A9F860A8@lemburg.com> "Martin v. Loewis" wrote: > > > You should make this: 'ascii' -- encoding names are lower case per > > convention (and the implementation has a short-cut to speed up > > conversion to 'ascii' -- not for 'ASCII'). > > With conventions, it is a difficult story. I'm pretty certain that > users typically see that particular american standard as ASCII (to the > extend of calling it "a s c two"), not ascii. It's a convention in the codec registry design and used as such in the Unicode implementation. > As for speed - feel free to change the code if you think it matters. Hey... this was just a suggestion. I thought that you didn't know of the internal short-cut and wanted to hint at it. > > + raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters") > > > Would it be better to use a simple ValueError here ? (UnicodeError > > is a subclass of ValueError, but the error doesn't really have > > something to do with Unicode conversions...) > > Why does it not have to do with Unicode conversion? A conversion from > Unicode to ASCII was attempted, and failed. Sure, but the fact that URLs have to be ASCII is not something that is enforced by the Unicode implementation. > I guess I would be more open to suggested changes if you had put them > into the patch manager at the time you've reviewed the patch... 
I didn't review the patch, only the summary... Don't have much time to look into these things closely right now, so all I can do is comment. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From barry at scottb.demon.co.uk Mon Dec 4 01:55:32 2000 From: barry at scottb.demon.co.uk (Barry Scott) Date: Mon, 4 Dec 2000 00:55:32 -0000 Subject: [Python-Dev] A house upon the sand In-Reply-To: <20001130181937.B21596@ludwig.cnri.reston.va.us> Message-ID: <000201c05d8c$e7a15b10$060210ac@private> I fully support Greg Ward's view. If string was removed I'd not update the old code but add in my own string module. Given the effort you guys went to to keep the C extension protocol the same (in the context of crashing on importing a 1.5 dll into 2.0) I'm amazed you think that string could be removed... Could you split the lib into blessed and backward compatibility sections? Then by some suitable mechanism I can choose the compatibility I need? Oh and as for join obviously a method of a list... ['thats','better'].join(' ') Barry From fredrik at pythonware.com Mon Dec 4 11:37:18 2000 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 4 Dec 2000 11:37:18 +0100 Subject: [Python-Dev] unit testing and Python regression test References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> Message-ID: <00e701c05dde$2d77c240$0900a8c0@SPIFF> andrew kuchling wrote: > Someone remembered my post of 23 Nov, I see... The only other test > framework I know of is the unittest.py inside Quixote, written because > we thought PyUnit was kind of clunky. the pythonware teams agree -- we've been using an internal reimplementation of Kent Beck's original Smalltalk work, but we're switching to unittest.py. > Obviously I think the Quixote unittest.py is the best choice for the stdlib. +1 from here.
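[For the record, the join question was settled the other way: Python 2.0 made join a method of the separator string, not of the list:]

```python
words = ['thats', 'better']

# The spelling Barry Scott wanted never happened:
#     ['thats', 'better'].join(' ')   # AttributeError on a list
# What Python 2.0 actually added is the string method:
joined = ' '.join(words)
```

One reason for the chosen spelling: the string method works on any sequence of strings, not just lists.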
From mal at lemburg.com Mon Dec 4 12:14:20 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 04 Dec 2000 12:14:20 +0100 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> <3A28C8DF.E430484F@lemburg.com> <14889.10298.621133.961677@anthem.concentric.net> Message-ID: <3A2B7C8C.D6B889EE@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "M" == M writes: > > M> The PEP does define when and how __findattr__() is called, > M> but makes no statement about what it should do or return... > > Good point. I've clarified that in the PEP. > > M> Here's a slightly different idea: > > M> Given the name, I would expect it to go look for an attribute > M> and then return the attribute and its container (this doesn't > M> seem to be what you have in mind here, though). > > No, because some applications won't need a wrapped object. E.g. in > the Java bean example, it just returns the attribute (which is stored > with a slightly different name). I was thinking of a standardised helper which could then be used for all kinds of attribute retrieval techniques. Acquisition would be easy to do, access control too. In most cases __findattr__ would simply return (self, self.attrname). > M> An alternative approach given the semantics above would then be > M> to first try a __getattr__() lookup and revert to > M> __findattr__() in case this fails. > > I don't think this is as useful. What would that buy you that you > can't already do today? Forget that idea... *always* calling __findattr__ is the more useful way, just like you intended. > The key concept here is that you want to give the class first crack to > interpose on every attribute access. You want this hook to get called > before anybody else can get at, or set, your attributes. That gives > you (the class) total control to implement whatever policy is useful. Right. 
> M> I don't think there is any need to overload __setattr__() in > M> such a way, because you cannot be sure which object actually > M> gets the new attribute. > > M> By exposing the functionality using a new builtin, findattr(), > M> this could be used for all the examples you give too. > > No, because then people couldn't use the object in the normal > dot-notational way. Uhm, why not ? -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gvwilson at nevex.com Mon Dec 4 15:40:58 2000 From: gvwilson at nevex.com (Greg Wilson) Date: Mon, 4 Dec 2000 09:40:58 -0500 Subject: [Python-Dev] Q: Python standard library re-org plans/schedule? In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us> Message-ID: Hi, everyone. A potential customer has asked whether there are any plans to re-organize and rationalize the Python standard library. If there are any firm plans, and a schedule (however tentative), I'd be grateful for a pointer. Thanks, Greg From barry at digicool.com Mon Dec 4 16:13:23 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Mon, 4 Dec 2000 10:13:23 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> Message-ID: <14891.46227.785856.307437@anthem.concentric.net> >>>>> "MvL" == Martin v Loewis writes: MvL> I agree that the current implementation is not MvL> thread-safe. To solve that, you'd need to associate with each MvL> instance not a single "infindattr" attribute, but a whole set MvL> of them - one per "thread of execution" (which would be a MvL> thread-id in most threading systems). Of course, that would MvL> need some cooperation from any thread scheme (including MvL> uthreads), which would need to provide an identification for MvL> a "calling context". I'm still catching up on several hundred emails over the weekend.
I had a sneaking suspicion that infindattr wasn't thread-safe, so I'm convinced this is a bug in the implementation. One approach might be to store the info in the thread state object (isn't that how the recursive repr stop flag is stored?) That would also save having to allocate an extra int for every instance (yuck) but might impose a bit more of a performance overhead. I'll work more on this later today. -Barry From jeremy at alum.mit.edu Mon Dec 4 16:23:10 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 4 Dec 2000 10:23:10 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <00e701c05dde$2d77c240$0900a8c0@SPIFF> References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <00e701c05dde$2d77c240$0900a8c0@SPIFF> Message-ID: <14891.46814.359333.76720@bitdiddle.concentric.net> >>>>> "FL" == Fredrik Lundh writes: FL> andrew kuchling wrote: >> Someone remembered my post of 23 Nov, I see... The only other >> test framework I know of is the unittest.py inside Quixote, >> written because we thought PyUnit was kind of clunky. FL> the pythonware teams agree -- we've been using an internal FL> reimplementation of Kent Beck's original Smalltalk work, but FL> we're switching to unittest.py. Can you provide any specifics about what you like about unittest.py (perhaps as opposed to PyUnit)? Jeremy From guido at python.org Mon Dec 4 16:20:11 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 04 Dec 2000 10:20:11 -0500 Subject: [Python-Dev] Q: Python standard library re-org plans/schedule? In-Reply-To: Your message of "Mon, 04 Dec 2000 09:40:58 EST." References: Message-ID: <200012041520.KAA20979@cj20424-a.reston1.va.home.com> > Hi, everyone. A potential customer has asked whether there are any > plans to re-organize and rationalize the Python standard library. > If there are any firm plans, and a schedule (however tentative), > I'd be grateful for a pointer.
Alas, none that I know of except the ineffable Python 3000 schedule. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Mon Dec 4 16:46:53 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Mon, 4 Dec 2000 10:46:53 -0500 Subject: [Python-Dev] Quixote unit testing docs (Was: unit testing) In-Reply-To: <14891.46814.359333.76720@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Mon, Dec 04, 2000 at 10:23:10AM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <00e701c05dde$2d77c240$0900a8c0@SPIFF> <14891.46814.359333.76720@bitdiddle.concentric.net> Message-ID: <20001204104653.A19387@kronos.cnri.reston.va.us> Prodded by Jeremy, I went and actually wrote some documentation for the Quixote unittest.py; please see . The HTML is from a manually hacked Library Reference, so ignore the broken image links and other formatting goofyness. In case anyone needs it, the LaTeX is in /files/python/. The plain text version comes out to around 290 lines; I can post it to this list if that's desired. --amk From pf at artcom-gmbh.de Mon Dec 4 18:59:54 2000 From: pf at artcom-gmbh.de (Peter Funk) Date: Mon, 4 Dec 2000 18:59:54 +0100 (MET) Subject: Tim Peter's doctest compared to Quixote unit testing (was Re: [Python-Dev] Quixote unit testing docs) In-Reply-To: <20001204104653.A19387@kronos.cnri.reston.va.us> from Andrew Kuchling at "Dec 4, 2000 10:46:53 am" Message-ID: Hi all, Andrew Kuchling: > ... I ... actually wrote some documentation for > the Quixote unittest.py; please see > . [...] > comes out to around 290 lines; I can post it to this list if that's > desired. After reading Andrews docs, I think Quixote basically offers three additional features if compared with Tim Peters 'doctest': 1. integration of Skip Montanaro's code coverage analysis. 2. 
the idea of Scenario objects useful to share the setup needed to test related functions or methods of a class (same start condition). 3. Some useful functions to check whether the result returned by some test fulfills certain properties without having to be as explicit as a cut'n'paste from an interactive interpreter session would have been. As I've pointed out before in private mail to Jeremy, I've used Tim Peters' 'doctest.py' to accomplish all testing of Python apps in our company. In doctest each doc string is an independent unit, which starts fresh. Sometimes this leads to duplicated setup stuff, which is needed to test each method of a set of related methods from a class. This is distracting, if you intend the test cases to take their double role of being at the same time useful documentation examples for the intended use of the provided API. Tim_one: Do you read this? What do you think about the idea to add something like the following two functions to 'doctest': use_module_scenario() -- imports all objects created and preserved during execution of the module doc string examples. use_class_scenario() -- imports all objects created and preserved during the execution of doc string examples of a class. Only allowed in doc string examples of methods. This would make it easy to provide the same setup scenario to a group of related test cases. As far as I understand, doctest handles test-shutdown automatically, iff the doc string test examples leave no persistent resources behind. Regards, Peter From moshez at zadka.site.co.il Tue Dec 5 04:31:18 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Tue, 05 Dec 2000 05:31:18 +0200 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) of "Mon, 04 Dec 2000 10:13:23 EST."
<14891.46227.785856.307437@anthem.concentric.net> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> Message-ID: <20001205033118.9135CA817@darjeeling.zadka.site.co.il> > I'm still catching up on several hundred emails over the weekend. I > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm > convinced this is a bug in the implementation. One approach might be > to store the info in the thread state object I don't think this is a good idea -- continuations and coroutines might mess it up. Maybe the right thing is to mess with the *compilation* of __findattr__ so that it would call __setattr__ and __getattr__ with special flags that stop them from calling __findattr__? This is ugly, but I can't think of a better way. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tismer at tismer.com Mon Dec 4 19:35:19 2000 From: tismer at tismer.com (Christian Tismer) Date: Mon, 04 Dec 2000 20:35:19 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> Message-ID: <3A2BE3E7.60A8E220@tismer.com> Moshe Zadka wrote: > > > I'm still catching up on several hundred emails over the weekend. I > > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm > > convinced this is a bug in the implementation. One approach might be > > to store the info in the thread state object > > I don't think this is a good idea -- continuations and coroutines might > mess it up. Maybe the right thing is to mess with the *compilation* of > __findattr__ so that it would call __setattr__ and __getattr__ with > special flags that stop them from calling __findattr__? This is > ugly, but I can't think of a better way. 
Yeah, this is what I tried to say by "different machine state"; compiling different behavior in the case of a special method is an interesting idea. It is limited somewhat, since the changed system state is not inherited by called functions. But if __findattr__ performs its one, single task in its body alone, we are fine. still-thinking-of-alternatives - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From tismer at tismer.com Mon Dec 4 19:52:43 2000 From: tismer at tismer.com (Christian Tismer) Date: Mon, 04 Dec 2000 20:52:43 +0200 Subject: [Python-Dev] A house upon the sand References: <000201c05d8c$e7a15b10$060210ac@private> Message-ID: <3A2BE7FB.831F2F93@tismer.com> Barry Scott wrote: > > I fully support Greg Wards view. If string was removed I'd not > update the old code but add in my own string module. > > Given the effort you guys went to to keep the C extension protocol the > same (in the context of crashing on importing a 1.5 dll into 2.0) I > amazed you think that string could be removed... > > Could you split the lib into blessed and backward compatibility sections? > Then by some suitable mechanism I can choose the compatibility I need? > > Oh and as for join obviously a method of a list... > > ['thats','better'].join(' ') The above is the way as it is defined for JavaScript. But in JavaScript, the list join method performs an implicit str() on the list elements. As has been discussed some time ago, Python's lists are too versatile to justify a string-centric method. Marc André pointed out that one could do a reduction with the semantics of the "+" operator, but Guido said that he wouldn't like to see [2, 3, 5].join(7) being reduced to 2+7+3+7+5 == 24.
That could only be avoided if there were a way to distinguish numeric addition from concatenation. but-I-could-live-with-it - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From barry at digicool.com Mon Dec 4 22:23:00 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Mon, 4 Dec 2000 16:23:00 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> Message-ID: <14892.2868.982013.313562@anthem.concentric.net> >>>>> "CT" == Christian Tismer writes: CT> You want most probably do this: __findattr__ should not be CT> invoked again for this instance, with this attribute name, for CT> this "thread", until you are done. First, I think the rule should be "__findattr__ should not be invoked again for this instance, in this thread, until you are done". I.e. once in __findattr__, you want all subsequent attribute references to bypass findattr, because presumably, your instance now has complete control for all accesses in this thread. You don't want to limit it to just the currently named attribute. Second, if "this thread" is defined as _PyThreadState_Current, then we have a simple solution, as I mapped out earlier. We do a PyThreadState_GetDict() and store the instance in that dict on entry to __findattr__ and remove it on exit from __findattr__. If the instance can be found in the current thread's dict, we bypass __findattr__. >>>>> "MZ" == Moshe Zadka writes: MZ> I don't think this is a good idea -- continuations and MZ> coroutines might mess it up. You might be right, but I'm not sure. 
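The per-thread bypass Barry sketches above can be emulated in plain Python: __findattr__ itself never shipped, and the real patch would work in C on the thread state, so this stand-in uses a __getattribute__ hook with threading.local playing the role of PyThreadState_GetDict(). All class and variable names here are invented for illustration.

```python
import threading

# Per-thread record of which instances are currently inside their hook;
# a *set* of ids rather than a single flag, since several objects may be
# mid-hook at once in one thread (Martin's point).
_in_hook = threading.local()

class Traced:
    """Logs attribute reads, but only for the outermost access."""

    def __init__(self):
        self.log = []
        self.value = 42

    def __getattribute__(self, name):
        busy = getattr(_in_hook, "ids", None)
        if busy is None:
            busy = _in_hook.ids = set()
        if id(self) in busy:
            # hook already active for this instance in this thread:
            # fall straight through, the bypass rule under discussion
            return object.__getattribute__(self, name)
        busy.add(id(self))
        try:
            # the hook body may use plain dotted access (self.log)
            # without recursing into itself
            self.log.append(name)
            return object.__getattribute__(self, name)
        finally:
            busy.discard(id(self))
```

Because the guard lives in thread-local storage, two threads touching the same instance each get their own "am I in the hook?" answer, which is exactly what a flag stored on the instance cannot provide.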
If we make __findattr__ thread safe according to the definition above, and if uthread/coroutine/continuation safety can be accomplished by the __findattr__ programmer's discipline, then I think that is enough. IOW, if we can tell the __findattr__ author to not relinquish the uthread explicitly during the __findattr__ call, we're cool. Oh, and as long as we're not somehow substantially reducing the utility of __findattr__ by making that restriction. What I worry about is re-entrancy that isn't under the programmer's control, like the Real Thread-safety problem. -Barry From barry at digicool.com Mon Dec 4 23:58:33 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Mon, 4 Dec 2000 17:58:33 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <3A2C0E0D.E042D026@tismer.com> Message-ID: <14892.8601.41178.81475@anthem.concentric.net> >>>>> "CT" == Christian Tismer writes: CT> Hmm. WHat do you think about Moshe's idea to change compiling CT> of the method? It has the nice advantage that there are no CT> Thread-safety problems by design. The only drawback is that CT> the contract of not-calling-myself only holds for this CT> function. I'm not sure I understand what Moshe was proposing. Moshe: are you saying that we should change the way the compiler works, so that it somehow recognizes this special case? I'm not sure I like that approach. I think I want something more runtime-y, but I'm not sure why (maybe just because I'm more comfortable mucking about in the run-time than in the compiler). -Barry From guido at python.org Tue Dec 5 00:16:17 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 04 Dec 2000 18:16:17 -0500 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: Your message of "Mon, 04 Dec 2000 16:23:00 EST." 
<14892.2868.982013.313562@anthem.concentric.net> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> Message-ID: <200012042316.SAA23081@cj20424-a.reston1.va.home.com> I'm unconvinced by the __findattr__ proposal as it now stands. - Do you really think that JimF would do away with ExtensionClasses if __findattr__ was introduced? I kinda doubt it. See [*footnote]. It seems that *using* __findattr__ is expensive (even if *not* using is cheap :-). - Why is deletion not supported? What if you want to enforce a policy on deletions too? - It's ugly to use the same call for get and set. The examples indicate that it's not such a great idea: every example has *two* tests whether it's get or set. To share a policy, the proper thing to do is to write a method that either get or set can use. - I think it would be sufficient to *only* use __findattr__ for getattr -- __setattr__ and __delattr__ already have full control. The "one routine to implement the policy" argument doesn't really hold, I think. - The PEP says that the "in-findattr" flag is set on the instance. We've already determined that this is not thread-safe. This is not just a bug in the implementation -- it's a bug in the specification. I also find it ugly. But if we decide to do this, it can go in the thread-state -- if we ever add coroutines, we have to decide on what stuff to move from the thread state to the coroutine state anyway. - It's also easy to conceive situations where recursive __findattr__ calls on the same instance in the same thread/coroutine are perfectly desirable -- e.g. when __findattr__ ends up calling a method that uses a lot of internal machinery of the class. You don't want all the machinery to have to be aware of the fact that it may be called with __findattr__ on the stack and without it.
So perhaps it may be better to only treat the body of __findattr__ itself special, as Moshe suggested. What does Jython do here? - The code examples require a *lot* of effort to understand. These are complicated issues! (I rewrote the Bean example using __getattr__ and __setattr__ and found no need for __findattr__; the __getattr__ version is simpler and easier to understand. I'm still studying the other __findattr__ examples.) - The PEP really isn't that long, except for the code examples. I recommend reading the patch first -- the patch is probably shorter than any specification of the feature can be. --Guido van Rossum (home page: http://www.python.org/~guido/) [*footnote] There's an easy way (that few people seem to know) to cause __getattr__ to be called for virtually all attribute accesses: put *all* (user-visible) attributes in a separate dictionary. If you want to prevent access to this dictionary too (for Zope security enforcement), make it a global indexed by id() -- a destructor (__del__) can take care of deleting entries here. From martin at loewis.home.cs.tu-berlin.de Tue Dec 5 00:10:43 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 5 Dec 2000 00:10:43 +0100 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <14891.46227.785856.307437@anthem.concentric.net> (barry@digicool.com) References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> Message-ID: <200012042310.AAA00786@loewis.home.cs.tu-berlin.de> > I'm still catching up on several hundred emails over the weekend. I > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm > convinced this is a bug in the implementation. One approach might be > to store the info in the thread state object (isn't that how the > recursive repr stop flag is stored?) Whether this works depends on how exactly the info is stored.
A single flag won't be sufficient, since multiple objects may have __findattr__ in progress in a given thread. With a set of instances, it would work, though. Regards, Martin From martin at loewis.home.cs.tu-berlin.de Tue Dec 5 00:13:15 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 5 Dec 2000 00:13:15 +0100 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <20001205033118.9135CA817@darjeeling.zadka.site.co.il> (message from Moshe Zadka on Tue, 05 Dec 2000 05:31:18 +0200) References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> Message-ID: <200012042313.AAA00832@loewis.home.cs.tu-berlin.de> > I don't think this is a good idea -- continuations and coroutines > might mess it up. If coroutines and continuations operate preemptively, then they should present themselves as an implementation of the thread API; perhaps the thread API needs to be extended to allow for such a feature. If yielding control is in the hands of the implementation, it would be easy to rule out a context switch while findattr is in progress. Regards, Martin From martin at loewis.home.cs.tu-berlin.de Tue Dec 5 00:19:37 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 5 Dec 2000 00:19:37 +0100 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <14892.8601.41178.81475@anthem.concentric.net> (barry@digicool.com) References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <3A2C0E0D.E042D026@tismer.com> <14892.8601.41178.81475@anthem.concentric.net> Message-ID: <200012042319.AAA00877@loewis.home.cs.tu-berlin.de> > I'm not sure I understand what Moshe was proposing.
Moshe: are you > saying that we should change the way the compiler works, so that it > somehow recognizes this special case? I'm not sure I like that > approach. I think I want something more runtime-y, but I'm not sure > why (maybe just because I'm more comfortable mucking about in the > run-time than in the compiler). I guess you are also uncomfortable with the problem that the compile-time analysis cannot "see" through levels of indirection. E.g. if findattr were simply

    return self.compute_attribute(real_attribute)

then compile-time analysis could figure out to call compute_attribute directly. However, that method may be implemented as

    def compute_attribute(self, name):
        return self.mapping[name]

where the access to mapping could not be detected statically. Regards, Martin From tismer at tismer.com Mon Dec 4 22:35:09 2000 From: tismer at tismer.com (Christian Tismer) Date: Mon, 04 Dec 2000 23:35:09 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> Message-ID: <3A2C0E0D.E042D026@tismer.com> "Barry A. Warsaw" wrote: > > >>>>> "CT" == Christian Tismer writes: > > CT> You want most probably do this: __findattr__ should not be > CT> invoked again for this instance, with this attribute name, for > CT> this "thread", until you are done. > > First, I think the rule should be "__findattr__ should not be invoked > again for this instance, in this thread, until you are done". Maybe this is better. Surely easier. :) [ThreadState solution - well fine so far] > MZ> I don't think this is a good idea -- continuations and > MZ> coroutines might mess it up. > > You might be right, but I'm not sure.
> > If we make __findattr__ thread safe according to the definition above, > and if uthread/coroutine/continuation safety can be accomplished by > the __findattr__ programmer's discipline, then I think that is enough. > IOW, if we can tell the __findattr__ author to not relinquish the > uthread explicitly during the __findattr__ call, we're cool. Oh, and > as long as we're not somehow substantially reducing the utility of > __findattr__ by making that restriction. > > What I worry about is re-entrancy that isn't under the programmer's > control, like the Real Thread-safety problem. Hmm. What do you think about Moshe's idea to change compiling of the method? It has the nice advantage that there are no Thread-safety problems by design. The only drawback is that the contract of not-calling-myself only holds for this function. I don't know how the thread state scales up when more things like these are invented. Well, for the moment, the simple solution with Stackless would just be to let the interpreter recurse in this call, the same as it happens during __init__ and anything else that isn't easily turned into tail-recursion. It just blocks :-) ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From barry at digicool.com Tue Dec 5 03:54:23 2000 From: barry at digicool.com (Barry A.
Warsaw) Date: Mon, 4 Dec 2000 21:54:23 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> Message-ID: <14892.22751.921264.156010@anthem.concentric.net> >>>>> "GvR" == Guido van Rossum writes: GvR> - Do you really think that JimF would do away with GvR> ExtensionClasses if __findattr__ was intruduced? I kinda GvR> doubt it. See [*footnote]. It seems that *using* GvR> __findattr__ is expensive (even if *not* using is cheap :-). That's not even the real reason why JimF wouldn't stop using ExtensionClass. He's already got too much code invested in EC. However EC can be a big pill to swallow for some applications because it's a C extension (and because it has some surprising non-Pythonic side effects). In those situations, a pure Python approach, even though slower, is useful. GvR> - Why is deletion not supported? What if you want to enforce GvR> a policy on deletions too? It could be, without much work. GvR> - It's ugly to use the same call for get and set. The GvR> examples indicate that it's not such a great idea: every GvR> example has *two* tests whether it's get or set. To share a GvR> policy, the proper thing to do is to write a method that GvR> either get or set can use. I don't have strong feelings either way. GvR> - I think it would be sufficient to *only* use __findattr__ GvR> for getattr -- __setattr__ and __delattr__ already have full GvR> control. The "one routine to implement the policy" argument GvR> doesn't really hold, I think. What about the ability to use "normal" x.name attribute access syntax inside the hook? Let me guess your answer. :) GvR> - The PEP says that the "in-findattr" flag is set on the GvR> instance. We've already determined that this is not GvR> thread-safe. 
This is not just a bug in the implementation -- GvR> it's a bug in the specification. I also find it ugly. But GvR> if we decide to do this, it can go in the thread-state -- if GvR> we ever add coroutines, we have to decide on what stuff to GvR> move from the thread state to the coroutine state anyway. Right. That's where we've ended up in subsequent messages on this thread. GvR> - It's also easy to conceive situations where recursive GvR> __findattr__ calls on the same instance in the same GvR> thread/coroutine are perfectly desirable -- e.g. when GvR> __findattr__ ends up calling a method that uses a lot of GvR> internal machinery of the class. You don't want all the GvR> machinery to have to be aware of the fact that it may be GvR> called with __findattr__ on the stack and without it. Hmm, okay, I don't really understand your example. I suppose I'm envisioning __findattr__ as a way to provide an interface to clients of the class. Maybe it's a bean interface, maybe it's an acquisition interface or an access control interface. The internal machinery has to know something about how that interface is implemented, so whether __findattr__ is recursive or not doesn't seem to enter into it. And also, allowing __findattr__ to be recursive will just impose different constraints on the internal machinery methods, just like __setattr__ currently does. I.e. you better know that you're in __setattr__ and not do self.name type things, or you'll recurse forever. GvR> So perhaps it may be better to only treat the body of GvR> __findattr__ itself special, as Moshe suggested. Maybe I'm being dense, but I'm not sure exactly what this means, or how you would do this. GvR> What does Jython do here? It's not exactly equivalent, because Jython's __findattr__ can't call back into Python. GvR> - The code examples require a *lot* of effort to understand. GvR> These are complicated issues! 
(I rewrote the Bean example GvR> using __getattr__ and __setattr__ and found no need for GvR> __findattr__; the __getattr__ version is simpler and easier GvR> to understand. I'm still studying the other __findattr__ GvR> examples.) Is it simpler because you separated out the set and get behavior? If __findattr__ only did getting, I think it would be a lot simpler too (but I'd still be interested in seeing your __getattr__ only example). The acquisition examples are complicated because I wanted to support the same interface that EC's acquisition classes support. All that detail isn't necessary for example code. GvR> - The PEP really isn't that long, except for the code GvR> examples. I recommend reading the patch first -- the patch GvR> is probably shorter than any specification of the feature can GvR> be. Would it be more helpful to remove the examples? If so, where would you put them? It's certainly useful to have examples someplace I think. GvR> There's an easy way (that few people seem to know) to cause GvR> __getattr__ to be called for virtually all attribute GvR> accesses: put *all* (user-visible) attributes in a separate GvR> dictionary. If you want to prevent access to this dictionary GvR> too (for Zope security enforcement), make it a global indexed GvR> by id() -- a destructor (__del__) can take care of deleting GvR> entries here. Presumably that'd be a module global, right? Maybe within Zope that could be protected, but outside of that, that global's always going to be accessible. So are methods, even if given private names. And I don't think that such code would be any more readable since instead of self.name you'd see stuff like

    def __getattr__(self, name):
        global instdict
        mydict = instdict[id(self)]
        obj = mydict[name]
        ...

    def __setattr__(self, name, val):
        global instdict
        mydict = instdict[id(self)]
        mydict[name] = val
        ...

and that /might/ be a problem with Jython currently, because id()'s may be reused.
And relying on __del__ may have unfortunate side effects when viewed in conjunction with garbage collection. You're probably still unconvinced, but are you dead-set against it? I can try implementing __findattr__() as a pre-__getattr__ hook only. Then we can live with the current __setattr__() restrictions and see what the examples look like in that situation. -Barry From guido at python.org Tue Dec 5 13:54:20 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 05 Dec 2000 07:54:20 -0500 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: Your message of "Mon, 04 Dec 2000 21:54:23 EST." <14892.22751.921264.156010@anthem.concentric.net> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> <14892.22751.921264.156010@anthem.concentric.net> Message-ID: <200012051254.HAA25502@cj20424-a.reston1.va.home.com> > >>>>> "GvR" == Guido van Rossum writes: > > GvR> - Do you really think that JimF would do away with > GvR> ExtensionClasses if __findattr__ was introduced? I kinda > GvR> doubt it. See [*footnote]. It seems that *using* > GvR> __findattr__ is expensive (even if *not* using is cheap :-). > > That's not even the real reason why JimF wouldn't stop using > ExtensionClass. He's already got too much code invested in EC. > However EC can be a big pill to swallow for some applications because > it's a C extension (and because it has some surprising non-Pythonic > side effects). In those situations, a pure Python approach, even > though slower, is useful. Agreed. But I'm still hoping to find the silver bullet that lets Jim (and everybody else) do what ExtensionClass does without needing another extension. > GvR> - Why is deletion not supported? What if you want to enforce > GvR> a policy on deletions too? > > It could be, without much work.
Then it should be -- except I prefer to do only getattr anyway, see below. > GvR> - It's ugly to use the same call for get and set. The > GvR> examples indicate that it's not such a great idea: every > GvR> example has *two* tests whether it's get or set. To share a > GvR> policy, the proper thing to do is to write a method that > GvR> either get or set can use. > > I don't have strong feelings either way. What does Jython do? I thought it only did set (hence the name :-). I think there's no *need* for findattr to catch the setattr operation, because __setattr__ *already* gets invoked on each set, not just ones where the attr doesn't yet exist. > GvR> - I think it would be sufficient to *only* use __findattr__ > GvR> for getattr -- __setattr__ and __delattr__ already have full > GvR> control. The "one routine to implement the policy" argument > GvR> doesn't really hold, I think. > > What about the ability to use "normal" x.name attribute access syntax > inside the hook? Let me guess your answer. :) Aha! You got me there. Clearly the REAL reason for wanting __findattr__ is the no-recursive-calls rule -- which is also the most uncooked feature... Traditional getattr hooks don't need this as much because they don't get called when the attribute already exists; traditional setattr hooks deal with it by switching on the attribute name. The no-recursive-calls rule certainly SEEMS an attractive way around this. But I'm not sure that it really is... I need to get my head around this more. (The only reason I'm still posting this reply is to test the new mailing lists setup via mail.python.org.) > GvR> - The PEP says that the "in-findattr" flag is set on the > GvR> instance. We've already determined that this is not > GvR> thread-safe. This is not just a bug in the implementation -- > GvR> it's a bug in the specification. I also find it ugly.
But > GvR> if we decide to do this, it can go in the thread-state -- if > GvR> we ever add coroutines, we have to decide on what stuff to > GvR> move from the thread state to the coroutine state anyway. > > Right. That's where we've ended up in subsequent messages on this thread. > > GvR> - It's also easy to conceive situations where recursive > GvR> __findattr__ calls on the same instance in the same > GvR> thread/coroutine are perfectly desirable -- e.g. when > GvR> __findattr__ ends up calling a method that uses a lot of > GvR> internal machinery of the class. You don't want all the > GvR> machinery to have to be aware of the fact that it may be > GvR> called with __findattr__ on the stack and without it. > > Hmm, okay, I don't really understand your example. I suppose I'm > envisioning __findattr__ as a way to provide an interface to clients > of the class. Maybe it's a bean interface, maybe it's an acquisition > interface or an access control interface. The internal machinery has > to know something about how that interface is implemented, so whether > __findattr__ is recursive or not doesn't seem to enter into it. But the class is also a client of itself, and not all cases where it is a client of itself are inside a findattr call. Take your bean example. Suppose your bean class also has a spam() method. The findattr code needs to account for this, e.g.:

    def __findattr__(self, name, *args):
        if name == "spam" and not args:
            return self.spam
        ...original body here...

Or you have to add a _get_spam() method:

    def _get_spam(self):
        return self.spam

Either solution gets tedious if there are a lot of methods; instead, findattr could check if the attr is defined on the class, and then return that:

    def __findattr__(self, name, *args):
        if not args and name[0] != '_' and hasattr(self.__class__, name):
            return getattr(self, name)
        ...original body here...

Anyway, let's go back to the spam method. Suppose it references self.foo. The findattr machinery will access it. Fine.
But now consider another attribute (bar) with _set_bar() and _get_bar() methods that do a little more. Maybe bar is really calculated from the value of self.foo. Then _get_bar cannot use self.foo (because it's inside findattr so findattr won't resolve it, and self.foo doesn't actually exist on the instance) so it has to use self.__myfoo. Fine -- after all this is inside a _get_* handler, which knows it's being called from findattr. But what if, instead of needing self.foo, _get_bar wants to call self.spam() in order? Then self.spam() is being called from inside findattr, so when it access self.foo, findattr isn't used -- and it fails with an AttributeError! Sorry for the long detour, but *that's* the problem I was referring to. I think the scenario is quite realistic. > And also, allowing __findattr__ to be recursive will just impose > different constraints on the internal machinery methods, just like > __setattr__ currently does. I.e. you better know that you're in > __setattr__ and not do self.name type things, or you'll recurse > forever. Actually, this is usually solved by having __setattr__ check for specific names only, and for others do self.__dict__[name] = value; that way, recursive __setattr__ calls are okay. Similar for __getattr__ (which has to raise AttributeError for unrecognized names). > GvR> So perhaps it may be better to only treat the body of > GvR> __findattr__ itself special, as Moshe suggested. > > Maybe I'm being dense, but I'm not sure exactly what this means, or > how you would do this. Read Moshe's messages (and Martin's replies) again. I don't care that much for it so I won't explain it again. > GvR> What does Jython do here? > > It's not exactly equivalent, because Jython's __findattr__ can't call > back into Python. I'd say that Jython's __findattr__ is an entirely different beast than what we have here. 
Its main purpose in life appears to be to be a getattr equivalent that returns NULL instead of raising an exception when the attribute isn't found -- which is reasonable because from within Java, testing for null is much cheaper than checking for an exception, and you often need to check whether a given attribute exists and do some default action if not. (In fact, I'd say that CPython could also use a findattr of this kind...) This is really too bad. Based on the name similarity and things I thought you'd said in private before, I thought that they would be similar. Then the experience with Jython would be a good argument for adding a findattr hook to CPython. But now that they are totally different beasts it doesn't help at all. > GvR> - The code examples require a *lot* of effort to understand. > GvR> These are complicated issues! (I rewrote the Bean example > GvR> using __getattr__ and __setattr__ and found no need for > GvR> __findattr__; the __getattr__ version is simpler and easier > GvR> to understand. I'm still studying the other __findattr__ > GvR> examples.) > > Is it simpler because you separated out the set and get behavior? If > __findattr__ only did getting, I think it would be a lot simpler too > (but I'd still be interested in seeing your __getattr__-only > example). Here's my getattr example. It's more lines of code, but cleaner IMHO:

    class Bean:
        def __init__(self, x):
            self.__myfoo = x

        def __isprivate(self, name):
            return name.startswith('_')

        def __getattr__(self, name):
            if self.__isprivate(name):
                raise AttributeError, name
            return getattr(self, "_get_" + name)()

        def __setattr__(self, name, value):
            if self.__isprivate(name):
                self.__dict__[name] = value
            else:
                return getattr(self, "_set_" + name)(value)

        def _set_foo(self, x):
            self.__myfoo = x

        def _get_foo(self):
            return self.__myfoo

    b = Bean(3)
    print b.foo
    b.foo = 9
    print b.foo

> The acquisition examples are complicated because I wanted > to support the same interface that EC's acquisition classes support.
> All that detail isn't necessary for example code. I *still* have to study the examples... :-( Will do next. > GvR> - The PEP really isn't that long, except for the code > GvR> examples. I recommend reading the patch first -- the patch > GvR> is probably shorter than any specification of the feature can > GvR> be. > > Would it be more helpful to remove the examples? If so, where would > you put them? It's certainly useful to have examples someplace I > think. No, my point is that the examples need more explanation. Right now the EC example is over 200 lines of brain-exploding code! :-) > GvR> There's an easy way (that few people seem to know) to cause > GvR> __getattr__ to be called for virtually all attribute > GvR> accesses: put *all* (user-visible) attributes in a separate > GvR> dictionary. If you want to prevent access to this dictionary > GvR> too (for Zope security enforcement), make it a global indexed > GvR> by id() -- a destructor (__del__) can take care of deleting > GvR> entries here. > > Presumably that'd be a module global, right? Maybe within Zope that > could be protected, Yes. > but outside of that, that global's always going to > be accessible. So are methods, even if given private names. Aha! Another thing that I expect has been on your agenda for a long time, but which isn't explicit in the PEP (AFAICT): findattr gives *total* control over attribute access, unlike __getattr__ and __setattr__ and private name mangling, which can all be defeated. And this may be one of the things that Jim is after with ExtensionClasses in Zope. Although I believe that in DTML, he doesn't trust this: he uses source-level (or bytecode-level) transformations to turn all X.Y operations into a call into a security manager. So I'm not sure that the argument is very strong.
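As a small, hedged illustration of the "can all be defeated" point (class and attribute names here are invented for the example): a __getattr__ guard only fires when normal lookup fails, so anything actually stored in the instance dict remains reachable directly.

```python
# Illustrative sketch: a __getattr__ "guard" is bypassed for attributes
# that really exist, because __getattr__ is only consulted when normal
# lookup fails; name mangling is likewise just a renaming convention.
class Protected:
    def __init__(self, secret):
        # store directly in the instance dict
        self.__dict__['_secret'] = secret

    def __getattr__(self, name):
        # pretend to deny access to private names
        if name.startswith('_'):
            raise AttributeError("private: " + name)
        return self.__dict__['_secret']

p = Protected(42)
assert p.anything == 42              # guard consulted, access granted
assert p._secret == 42               # guard never fires: found in __dict__
assert p.__dict__['_secret'] == 42   # and __dict__ itself is reachable
```

A findattr hook that intercepted *every* access, including existing attributes and __dict__, would close exactly these holes.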
> And I > don't think that such code would be any more readable since instead of > self.name you'd see stuff like
>
>     def __getattr__(self, name):
>         global instdict
>         mydict = instdict[id(self)]
>         obj = mydict[name]
>         ...
>
>     def __setattr__(self, name, val):
>         global instdict
>         mydict = instdict[id(self)]
>         mydict[name] = val
>         ...
>
> and that /might/ be a problem with Jython currently, because id()'s > may be reused. And relying on __del__ may have unfortunate side > effects when viewed in conjunction with garbage collection. Fair enough. I withdraw the suggestion, and propose restricted execution instead. There, you can use Bastions -- which have problems of their own, but you do get total control. > You're probably still unconvinced, but are you dead-set against > it? I can try implementing __findattr__() as a pre-__getattr__ hook > only. Then we can live with the current __setattr__() restrictions > and see what the examples look like in that situation. I am dead-set against introducing a feature that I don't fully understand. Let's continue this discussion. --Guido van Rossum (home page: http://www.python.org/~guido/) From bckfnn at worldonline.dk Tue Dec 5 16:40:10 2000 From: bckfnn at worldonline.dk (Finn Bock) Date: Tue, 05 Dec 2000 15:40:10 GMT Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> <14892.22751.921264.156010@anthem.concentric.net> <200012051254.HAA25502@cj20424-a.reston1.va.home.com> Message-ID: <3a2d0c29.242749@smtp.worldonline.dk> On Tue, 05 Dec 2000 07:54:20 -0500, you wrote: >> GvR> What does Jython do here? >> >> It's not exactly equivalent, because Jython's __findattr__ can't call >> back into Python.
> >I'd say that Jython's __findattr__ is an entirely different beast than >what we have here. Its main purpose in life appears to be to be a >getattr equivalent that returns NULL instead of raising an exception >when the attribute isn't found -- which is reasonable because from >within Java, testing for null is much cheaper than checking for an >exception, and you often need to check whether a given attribute exists >and do some default action if not. Correct. It is also the method to override when making a new builtin type and it will be called on such a type subclass regardless of the presence of any __getattr__ hook and __dict__ content. So I think it has some of the properties which Barry wants. regards, finn From greg at cosc.canterbury.ac.nz Wed Dec 6 00:07:06 2000 From: greg at cosc.canterbury.ac.nz (greg at cosc.canterbury.ac.nz) Date: Wed, 06 Dec 2000 12:07:06 +1300 (NZDT) Subject: Are you all mad? (Re: [Python-Dev] PEP 231, __findattr__()) In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com> Message-ID: <200012052307.MAA01082@s454.cosc.canterbury.ac.nz> I can't believe you're even considering a magic dynamically-scoped flag that invisibly changes the semantics of fundamental operations. To me the idea is utterly insane! If I understand correctly, the problem is that if you do something like

    def __findattr__(self, name):
        if name == 'spam':
            return self.__dict__['spam']

then self.__dict__ is going to trigger a recursive __findattr__ call. It seems to me that if you're going to have some sort of hook that is always called on any x.y reference, you need some way of explicitly bypassing it and getting at the underlying machinery. I can think of a couple of ways:

1) Make the __dict__ attribute special, so that accessing it always bypasses __findattr__.

2) Provide some other way of getting direct access to the attributes of an object, e.g. new builtins called peekattr() and pokeattr().
This assumes that you always know when you write a particular access whether you want it to be a "normal" or "special" one, so that you can use the appropriate mechanism. Are there any cases where this is not true? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From barry at digicool.com Wed Dec 6 03:20:40 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 5 Dec 2000 21:20:40 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> <14892.22751.921264.156010@anthem.concentric.net> <200012051254.HAA25502@cj20424-a.reston1.va.home.com> <3a2d0c29.242749@smtp.worldonline.dk> Message-ID: <14893.41592.701128.58110@anthem.concentric.net> >>>>> "FB" == Finn Bock writes: FB> Correct. It is also the method to override when making a new FB> builtin type and it will be called on such a type subclass FB> regardless of the presence of any __getattr__ hook and FB> __dict__ content. So I think it has some of the properties FB> which Barry wants. We had a discussion about this PEP at our group meeting today. Rather than write it all twice, I'm going to try to update the PEP and patch tonight. I think what we came up with will solve most of the problems raised, and will be implementable in Jython (I'll try to work up a Jython patch too, if I don't fall asleep first :) -Barry From barry at digicool.com Wed Dec 6 03:54:36 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 5 Dec 2000 21:54:36 -0500 Subject: Are you all mad?
(Re: [Python-Dev] PEP 231, __findattr__()) References: <200012051254.HAA25502@cj20424-a.reston1.va.home.com> <200012052307.MAA01082@s454.cosc.canterbury.ac.nz> Message-ID: <14893.43628.61063.905227@anthem.concentric.net> >>>>> "greg" == writes: | 1) Make the __dict__ attribute special, so that accessing | it always bypasses __findattr__. You're not far from what I came up with right after our delicious lunch. We're going to invent a new protocol which passes __dict__ into the method as an argument. That way self.__dict__ doesn't need to be special cased at all because you can get at all the attributes via a local! So no recursion stop hack is necessary. More in the updated PEP and patch. -Barry From dgoodger at bigfoot.com Thu Dec 7 05:33:33 2000 From: dgoodger at bigfoot.com (David Goodger) Date: Wed, 06 Dec 2000 23:33:33 -0500 Subject: [Python-Dev] unit testing and Python regression test Message-ID: There is another unit testing implementation out there, OmPyUnit, available from: http://www.objectmentor.com/freeware/downloads.html -- David Goodger dgoodger at bigfoot.com Open-source projects: - The Go Tools Project: http://gotools.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net (soon!) From fdrake at users.sourceforge.net Thu Dec 7 07:26:54 2000 From: fdrake at users.sourceforge.net (Fred L. 
Drake) Date: Wed, 6 Dec 2000 22:26:54 -0800 Subject: [Python-Dev] [development doc updates] Message-ID: <200012070626.WAA22103@orbital.p.sourceforge.net> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Lots of small changes, but most important, more DOM documentation: http://python.sourceforge.net/devel-docs/lib/module-xml.dom.html From guido at python.org Thu Dec 7 18:48:53 2000 From: guido at python.org (Guido van Rossum) Date: Thu, 07 Dec 2000 12:48:53 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons Message-ID: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> After perusing David Ascher's proposal, several versions of his patches, and hundreds of emails exchanged on this subject (almost all of this dated April or May of 1998), I've produced a reasonable semblance of PEP 207. Get it from CVS or here on the web: http://python.sourceforge.net/peps/pep-0207.html I'd like to hear your comments, praise, and criticisms! The PEP still needs work; in particular, the minority point of view back then (that comparisons should return only Boolean results) is not adequately represented (but I *did* work in a reference to tabnanny, to ensure Tim's support :-). I'd like to work on a patch next, but I think there will be interference with Neil's coercion patch. I'm not sure how to resolve that yet; maybe I'll just wait until Neil's coercion patch is checked in. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Dec 7 18:54:51 2000 From: guido at python.org (Guido van Rossum) Date: Thu, 07 Dec 2000 12:54:51 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) Message-ID: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> I'm maybe about three quarters of the way with writing PEP 230 -- far enough along to be asking for comments.
Get it from CVS or go to: http://python.sourceforge.net/peps/pep-0230.html A prototype implementation in Python is included in the PEP; I think this shows that the implementation is not too complex (Paul Prescod's fear about my proposal). This is pretty close to what I proposed earlier (Nov 5), except that I have added warning category classes (inspired by Paul's proposal). This class also serves as the exception to be raised when warnings are turned into exceptions. Do I need to include a discussion of Paul's counter-proposal and why I rejected it? --Guido van Rossum (home page: http://www.python.org/~guido/) From Barrett at stsci.edu Thu Dec 7 23:49:02 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Thu, 7 Dec 2000 17:49:02 -0500 (EST) Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays Message-ID: <14896.1191.240597.632888@nem-srvr.stsci.edu> What is the status of PEP 209? I see David Ascher is the champion of this PEP, but nothing has been written up. Is the intention of this PEP to make the current Numeric a built-in feature of Python or to re-implement and replace the current Numeric module? The reason that I ask these questions is because I'm working on a prototype of a new N-dimensional Array module which I call Numeric 2. This new module will be much more extensible than the current Numeric. For example, new array types and universal functions can be loaded or imported on demand. We also intend to implement a record (or C-structure) type, because 1-D arrays or lists of records are a common data structure for storing photon events in astronomy and related fields. The current Numeric does not handle record types efficiently, particularly when the data type is not aligned and is in non-native endian format. To handle such data, temporary arrays must be created and alignment and byte-swapping done on them. Numeric 2 does such pre- and post-processing inside the inner-most loop which is more efficient in both time and memory. 
It also does type conversion at this level, which is consistent with that proposed for PEP 208. Since many scientific users would like direct access to the array data via C pointers, we have investigated using the buffer object. We have not had much success with it, because of its implementation. I have scanned the python-dev mailing list for discussions of this issue and found that it now appears to be deprecated. My opinion on this is that a new _fundamental_ built-in type should be created for memory allocation with features and an interface similar to the _mmap_ object. I'll call this a _malloc_ object. This would allow Numeric 2 to use either object interchangeably depending on the circumstance. The _string_ type could also benefit from this new object by using a read-only version of it. Since it's an object, its memory area should be safe from inadvertent deletion. Because of these and other new features in Numeric 2, I have a keen interest in the status of PEPs 207, 208, 211, 225, and 228; and also in the proposed buffer object. I'm willing to implement this new _malloc_ object if members of the python-dev list are in agreement. Actually I see no alternative, given the current design of Numeric 2, since the Array class will initially be written completely in Python and will need a mutable memory buffer, while the _string_ type is meant to be a read-only object. All comments welcome. -- Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From DavidA at ActiveState.com Fri Dec 8 02:13:04 2000 From: DavidA at ActiveState.com (David Ascher) Date: Thu, 7 Dec 2000 17:13:04 -0800 (Pacific Standard Time) Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays In-Reply-To: <14896.1191.240597.632888@nem-srvr.stsci.edu> Message-ID: On Thu, 7 Dec 2000, Paul Barrett wrote: > What is the status of PEP 209?
I see David Ascher is the champion of > this PEP, but nothing has been written up. Is the intention of this I put my name on the PEP just to make sure it wasn't forgotten. If someone wants to champion it, their name should go on it. --david From guido at python.org Fri Dec 8 17:10:50 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 08 Dec 2000 11:10:50 -0500 Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays In-Reply-To: Your message of "Thu, 07 Dec 2000 17:49:02 EST." <14896.1191.240597.632888@nem-srvr.stsci.edu> References: <14896.1191.240597.632888@nem-srvr.stsci.edu> Message-ID: <200012081610.LAA30679@cj20424-a.reston1.va.home.com> > What is the status of PEP 209? I see David Ascher is the champion of > this PEP, but nothing has been written up. Is the intention of this > PEP to make the current Numeric a built-in feature of Python or to > re-implement and replace the current Numeric module? David has already explained why his name is on it -- basically, David's name is on several PEPs but he doesn't currently have any time to work on these, so other volunteers are most welcome to join. It is my understanding that the current Numeric is sufficiently messy in implementation and controversial in semantics that it would not be a good basis to start from. However, I do think that a basic multi-dimensional array object would be a welcome addition to core Python. > The reason that I ask these questions is because I'm working on a > prototype of a new N-dimensional Array module which I call Numeric 2. > This new module will be much more extensible than the current Numeric. > For example, new array types and universal functions can be loaded or > imported on demand. We also intend to implement a record (or > C-structure) type, because 1-D arrays or lists of records are a common > data structure for storing photon events in astronomy and related > fields. 
I'm not familiar with the use of computers in astronomy and related fields, so I'll take your word for that! :-) > The current Numeric does not handle record types efficiently, > particularly when the data type is not aligned and is in non-native > endian format. To handle such data, temporary arrays must be created > and alignment and byte-swapping done on them. Numeric 2 does such > pre- and post-processing inside the inner-most loop which is more > efficient in both time and memory. It also does type conversion at > this level which is consistent with that proposed for PEP 208. > > Since many scientific users would like direct access to the array data > via C pointers, we have investigated using the buffer object. We have > not had much success with it, because of its implementation. I have > scanned the python-dev mailing list for discussions of this issue and > found that it now appears to be deprecated. Indeed. I think it's best to leave the buffer object out of your implementation plans. There are several problems with it, and one of the backburner projects is to redesign it to be much more to the point (providing less, not more functionality). > My opinion on this is that a new _fundamental_ built-in type should be > created for memory allocation with features and an interface similar > to the _mmap_ object. I'll call this a _malloc_ object. This would > allow Numeric 2 to use either object interchangeably depending on the > circumstance. The _string_ type could also benefit from this new > object by using a read-only version of it. Since its an object, it's > memory area should be safe from inadvertent deletion. Interesting. I'm actually not sufficiently familiar with mmap to comment. But would the existing array module's array object be at all useful? You can get to the raw bytes in C (using the C buffer API, which is not deprecated) and it is extensible. 
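To make the pointer at the array module concrete, here is a small sketch of what it already offers for raw, typed memory (method names are the modern ones; in Python 2.0 tobytes() was spelled tostring()):

```python
# Sketch: the existing array module exposes contiguous raw memory with
# a typecode, a known item size, and in-place byte swapping -- some of
# what a _malloc_-style object would provide.
import array

a = array.array('d', [1.0, 2.0, 3.0])   # contiguous C doubles
assert a.itemsize == 8                   # bytes per element
raw = a.tobytes()                        # copy of the raw bytes
assert len(raw) == 3 * a.itemsize
a.byteswap()                             # in-place endian conversion
a.byteswap()                             # swap back; contents restored
assert list(a) == [1.0, 2.0, 3.0]
```

The byteswap() call is the piece relevant to the non-native-endian record data discussed earlier in the thread.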
> Because of these and other new features in Numeric 2, I have a keen > interest in the status of PEPs 207, 208, 211, 225, and 228; and also > in the proposed buffer object. Here are some quick comments on the mentioned PEPs.

207: Rich Comparisons. This will go into Python 2.1. (I just finished the first draft of the PEP, please read it and comment.)

208: Reworking the Coercion Model. This will go into Python 2.1. Neil Schemenauer has mostly finished the patches already. Please comment.

211: Adding New Linear Algebra Operators (Greg Wilson). This is unlikely to go into Python 2.1. I don't like the idea much. If you disagree, please let me know! (Also, a choice has to be made between 211 and 225; I don't want to accept both, so until 225 is rejected, 211 is in limbo.)

225: Elementwise/Objectwise Operators (Zhu, Lielens). This will definitely not go into Python 2.1. It adds too many new operators.

228: Reworking Python's Numeric Model. This is a total pie-in-the-sky PEP, and this kind of change is not likely to happen before Python 3000.

> I'm willing to implement this new _malloc_ object if members of the > python-dev list are in agreement. Actually I see no alternative, > given the current design of Numeric 2, since the Array class will > initially be written completely in Python and will need a mutable > memory buffer, while the _string_ type is meant to be a read-only > object. Would you be willing to take over authorship of PEP 209? David Ascher and the Numeric Python community will thank you. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Dec 8 19:43:39 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 08 Dec 2000 13:43:39 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: Your message of "Thu, 30 Nov 2000 17:46:52 EST."
References: Message-ID: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> After the last round of discussion, I was left with the idea that the best thing we could do to help destructive iteration is to introduce a {}.popitem() that returns an arbitrary (key, value) pair and deletes it. I wrote about this: > > One more concern: if you repeatedly remove the *first* item, the hash > > table will start looking lopsided. Since we don't resize the hash > > table on deletes, maybe picking an item at random (but not using an > > expensive random generator!) would be better. and Tim replied: > Which is the reason SETL doesn't specify *which* set item is removed: if > you always start looking at "the front" of a dict that's being consumed, the > dict fills with turds without shrinking, you skip over them again and again, > and consuming the entire dict is still quadratic time. > > Unfortunately, while using a random start point is almost always quicker > than that, the expected time for consuming the whole dict remains quadratic. > > The clearest way around that is to save a per-dict search finger, recording > where the last search left off. Start from its current value. Failure if > it wraps around. This is linear time in non-pathological cases (a > pathological case is one in which it isn't linear time ). I've implemented this, except I use a static variable for the finger instead of a per-dict finger. I'm concerned about adding 4-8 extra bytes to each dict object for a feature that most dictionaries never need. So, instead, I use a single shared finger. This works just as well as long as this is used for a single dictionary. For multiple dictionaries (either used by the same thread or in different threads), it'll work almost as well, although it's possible to make up a pathological example that would work quadratically. An easy example of such a pathological case is to call popitem() for two identical dictionaries in lock step. Comments please!
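For reference, the destructive-consumption loop this feature is meant to serve looks like this (a sketch using dict.popitem() as it exists today, which removes and returns one pair just as described):

```python
# Sketch of the destructive iteration pattern popitem() is designed
# for: consume a dict one (key, value) pair at a time, leaving it
# empty, without building an intermediate list of items.
d = {"a": 1, "b": 2, "c": 3}
consumed = {}
while d:
    key, value = d.popitem()   # removes and returns one pair
    consumed[key] = value

assert d == {}
assert consumed == {"a": 1, "b": 2, "c": 3}
```

Whether this loop stays linear in the size of the dict is exactly what the finger discussion above is about.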
We could:

- Live with the pathological cases.

- Forget the whole thing; and then also forget about firstkey() etc., which has the same problem only worse.

- Fix the algorithm. Maybe jumping criss-cross through the hash table like lookdict does would improve that; but I don't understand the math used for that ("Cycle through GF(2^n)-{0}" ???).

I've placed a patch on SourceForge: http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470 The algorithm is:

    static PyObject *
    dict_popitem(dictobject *mp, PyObject *args)
    {
        static int finger = 0;
        int i;
        dictentry *ep;
        PyObject *res;

        if (!PyArg_NoArgs(args))
            return NULL;
        if (mp->ma_used == 0) {
            PyErr_SetString(PyExc_KeyError,
                            "popitem(): dictionary is empty");
            return NULL;
        }
        i = finger;
        if (i >= mp->ma_size)
            i = 0;
        while ((ep = &mp->ma_table[i])->me_value == NULL) {
            i++;
            if (i >= mp->ma_size)
                i = 0;
        }
        finger = i+1;
        res = PyTuple_New(2);
        if (res != NULL) {
            PyTuple_SET_ITEM(res, 0, ep->me_key);
            PyTuple_SET_ITEM(res, 1, ep->me_value);
            Py_INCREF(dummy);
            ep->me_key = dummy;
            ep->me_value = NULL;
            mp->ma_used--;
        }
        return res;
    }

--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Dec 8 19:51:49 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 08 Dec 2000 13:51:49 -0500 Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use Message-ID: <200012081851.NAA32254@cj20424-a.reston1.va.home.com> Moshe proposes to add an overridable function sys.displayhook(obj) which will be called by the interpreter for the PRINT_EXPR opcode, instead of hardcoding the behavior. The default implementation will of course have the current behavior, but this makes it much simpler to experiment with alternatives, e.g. using str() instead of repr() (or to choose between str() and repr() based on the type). Moshe has asked me to pronounce on this PEP. I've thought about it, and I'm now all for it.
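A sketch of the kind of experiment this hook enables (the replacement hook here is hypothetical; the real default hook also binds the result to the '_' builtin):

```python
# Sketch: replacing sys.displayhook so interactive expression results
# are shown with str() instead of repr().  The interpreter calls the
# hook for PRINT_EXPR; here we call it directly to demonstrate.
import sys

def str_displayhook(obj):
    if obj is None:            # the default hook also suppresses None
        return
    sys.stdout.write(str(obj) + "\n")

sys.displayhook = str_displayhook
sys.displayhook("spam")        # shows: spam   (repr() would show 'spam')
sys.displayhook(None)          # shows nothing
```

Outside an interactive session the assignment has no visible effect, which is why the demonstration calls the hook by hand.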
Moshe (or anyone else), please submit a patch to SF that shows the complete implementation! --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at home.com Fri Dec 8 20:06:50 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 8 Dec 2000 14:06:50 -0500 Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: [Guido, on sharing a search finger and getting worse-than-linear behavior in a simple test case] See my reply on SourceForge (crossed in the mails). I predict that fixing this in an acceptable way (not bulletproof, but linear-time for all predictably common cases) is a two-character change. Surprise, although maybe I'm hallucinating (would someone please confirm?): when I went to the SF patch manager page to look for your patch (using the Open Patches view), I couldn't find it. My guess is that if there are "too many" patches to fit on one screen, then unlike the SF *bug* manager, you don't get any indication that more patches exist or any control to go to the next page. From barry at digicool.com Fri Dec 8 20:18:26 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Fri, 8 Dec 2000 14:18:26 -0500 Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: <14897.13314.469255.853298@anthem.concentric.net> >>>>> "TP" == Tim Peters writes: TP> Surprise, although maybe I'm hallucinating (would someone TP> please confirm?): when I went to the SF patch manager page to TP> look for your patch (using the Open Patches view), I couldn't TP> find it. My guess is that if there are "too many" patches to TP> fit on one screen, then unlike the SF *bug* manager, you don't TP> get any indication that more patches exist or any control to TP> go to the next page. I haven't checked recently, but this was definitely true a few weeks ago. 
I think I even submitted an admin request on it, but I don't remember for sure. -Barry From Barrett at stsci.edu Fri Dec 8 22:22:39 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Fri, 8 Dec 2000 16:22:39 -0500 (EST) Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays In-Reply-To: <200012081610.LAA30679@cj20424-a.reston1.va.home.com> References: <14896.1191.240597.632888@nem-srvr.stsci.edu> <200012081610.LAA30679@cj20424-a.reston1.va.home.com> Message-ID: <14897.10309.686024.254701@nem-srvr.stsci.edu> Guido van Rossum writes: > > What is the status of PEP 209? I see David Ascher is the champion of > > this PEP, but nothing has been written up. Is the intention of this > > PEP to make the current Numeric a built-in feature of Python or to > > re-implement and replace the current Numeric module? > > David has already explained why his name is on it -- basically, > David's name is on several PEPs but he doesn't currently have any time > to work on these, so other volunteers are most welcome to join. > > It is my understanding that the current Numeric is sufficiently messy > in implementation and controversial in semantics that it would not be > a good basis to start from. That is our (Rick, Perry, and I) belief also. > However, I do think that a basic multi-dimensional array object would > be a welcome addition to core Python. That's re-assuring. > Indeed. I think it's best to leave the buffer object out of your > implementation plans. There are several problems with it, and one of > the backburner projects is to redesign it to be much more to the point > (providing less, not more functionality). I agree and have already made the decision to leave it out. > > My opinion on this is that a new _fundamental_ built-in type should be > > created for memory allocation with features and an interface similar > > to the _mmap_ object. I'll call this a _malloc_ object. 
This would > > allow Numeric 2 to use either object interchangeably depending on the > > circumstance. The _string_ type could also benefit from this new > > object by using a read-only version of it. Since it's an object, its > > memory area should be safe from inadvertent deletion. > > Interesting. I'm actually not sufficiently familiar with mmap to > comment. But would the existing array module's array object be at all > useful? You can get to the raw bytes in C (using the C buffer API, > which is not deprecated) and it is extensible. I tried using this but had problems. I'll look into it again. > > Because of these and other new features in Numeric 2, I have a keen > > interest in the status of PEPs 207, 208, 211, 225, and 228; and also > > in the proposed buffer object. > > Here are some quick comments on the mentioned PEPs. I've got these PEPs on my desk and will comment on them when I can. > > I'm willing to implement this new _malloc_ object if members of the > > python-dev list are in agreement. Actually I see no alternative, > > given the current design of Numeric 2, since the Array class will > > initially be written completely in Python and will need a mutable > > memory buffer, while the _string_ type is meant to be a read-only > > object. > > Would you be willing to take over authorship of PEP 209? David Ascher > and the Numeric Python community will thank you. Yes, I'd gladly wield vast and inconsiderate power over unsuspecting pythoneers. ;-) -- Paul From guido at python.org Fri Dec 8 23:58:03 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 08 Dec 2000 17:58:03 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Thu, 07 Dec 2000 12:54:51 EST." <200012071754.MAA26557@cj20424-a.reston1.va.home.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> Message-ID: <200012082258.RAA02389@cj20424-a.reston1.va.home.com> Nobody seems to care much about the warnings PEP so far. What's up? 
Are you all too busy buying presents for the holidays? Then get me some too, please? :-) > http://python.sourceforge.net/peps/pep-0230.html I've now produced a prototype implementation for the C code: http://sourceforge.net/patch/?func=detailpatch&patch_id=102715&group_id=5470 Issues: - This defines a C API PyErr_Warn(category, message) instead of Py_Warn(message, category) as the PEP proposes. I actually like this better: it's consistent with PyErr_SetString() etc. rather than with the Python warn(message[, category]) function. - This calls the Python module from C. We'll have to see if this is fast enough. I wish I could postpone the import of warnings.py until the first call to PyErr_Warn(), but unfortunately the warning category classes must be initialized first (so they can be passed into PyErr_Warn()). The current version of warnings.py imports rather a lot of other modules (e.g. re and getopt); this can be reduced by placing those imports inside the functions that use them. - All the issues listed in the PEP. Please comment! BTW: somebody overwrote the PEP on SourceForge with an older version. Please remember to do a "cvs update" before running "make install" in the peps directory! --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Sat Dec 9 00:26:51 2000 From: gstein at lyra.org (Greg Stein) Date: Fri, 8 Dec 2000 15:26:51 -0800 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Dec 08, 2000 at 01:43:39PM -0500 References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: <20001208152651.H30644@lyra.org> On Fri, Dec 08, 2000 at 01:43:39PM -0500, Guido van Rossum wrote: >... > Comments please! We could: > > - Live with the pathological cases. I agree: live with it. The typical case will operate just fine. > - Forget the whole thing; and then also forget about firstkey() > etc. 
which has the same problem only worse. No opinion. > - Fix the algorithm. Maybe jumping criss-cross through the hash table > like lookdict does would improve that; but I don't understand the > math used for that ("Cycle through GF(2^n)-{0}" ???). No need. The keys were inserted randomly, so sequencing through is effectively random. :-) >... > static PyObject * > dict_popitem(dictobject *mp, PyObject *args) > { > static int finger = 0; > int i; > dictentry *ep; > PyObject *res; > > if (!PyArg_NoArgs(args)) > return NULL; > if (mp->ma_used == 0) { > PyErr_SetString(PyExc_KeyError, > "popitem(): dictionary is empty"); > return NULL; > } > i = finger; > if (i >= mp->ma_size) > ir = 0; Should be "i = 0" Cheers, -g -- Greg Stein, http://www.lyra.org/ From tismer at tismer.com Sat Dec 9 17:44:14 2000 From: tismer at tismer.com (Christian Tismer) Date: Sat, 09 Dec 2000 18:44:14 +0200 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: <3A32615E.D39B68D2@tismer.com> Guido van Rossum wrote: > > After the last round of discussion, I was left with the idea that the > best thing we could do to help destructive iteration is to introduce a > {}.popitem() that returns an arbitrary (key, value) pair and deletes > it. I wrote about this: > > > > One more concern: if you repeatedly remove the *first* item, the hash > > > table will start looking lobsided. Since we don't resize the hash > > > table on deletes, maybe picking an item at random (but not using an > > > expensive random generator!) would be better. > > and Tim replied: > > > Which is the reason SETL doesn't specify *which* set item is removed: if > > you always start looking at "the front" of a dict that's being consumed, the > > dict fills with turds without shrinking, you skip over them again and again, > > and consuming the entire dict is still quadratic time. 
> > > > Unfortunately, while using a random start point is almost always quicker > > than that, the expected time for consuming the whole dict remains quadratic. > > > > The clearest way around that is to save a per-dict search finger, recording > > where the last search left off. Start from its current value. Failure if > > it wraps around. This is linear time in non-pathological cases (a > > pathological case is one in which it isn't linear time ). > > I've implemented this, except I use a static variable for the finger > intead of a per-dict finger. I'm concerned about adding 4-8 extra > bytes to each dict object for a feature that most dictionaries never > need. So, instead, I use a single shared finger. This works just as > well as long as this is used for a single dictionary. For multiple > dictionaries (either used by the same thread or in different threads), > it'll work almost as well, although it's possible to make up a > pathological example that would work qadratically. > > An easy example of such a pathological example is to call popitem() > for two identical dictionaries in lock step. > > Comments please! We could: > > - Live with the pathological cases. > > - Forget the whole thing; and then also forget about firstkey() > etc. which has the same problem only worse. > > - Fix the algorithm. Maybe jumping criss-cross through the hash table > like lookdict does would improve that; but I don't understand the > math used for that ("Cycle through GF(2^n)-{0}" ???). That algorithm is really a gem which you should know, so let me try to explain it. Intro: A little story about finite field theory (very basic). ------------------------------------------------------------- For every prime p and every power p^n, there exists a Galois Field ( GF(p^n) ), which is a finite field. The additive group is called "elementary Abelian", it is commutative, and it looks a little like a vector space, since addition works in cycles modulo p for every p cell. 
The multiplicative group is cyclic, and it never touches 0. Cyclic groups are generated by a single primitive element. The powers of that element make up all the other elements. For all elements x of the multiplicative group GF(p^n)*, the equality x^(p^n - 1) == 1 holds. A generator element is therefore a primitive (p^n-1)th root of unity. From nas at arctrix.com Sat Dec 9 12:30:06 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Sat, 9 Dec 2000 03:30:06 -0800 Subject: [Python-Dev] PEP 208 and __coerce__ Message-ID: <20001209033006.A3737@glacier.fnational.com> While working on the implementation of PEP 208, I discovered that __coerce__ has some surprising properties. Initially I implemented __coerce__ so that the numeric operation currently being performed was called on the values returned by __coerce__. This caused test_class to blow up due to code like this: class Test: def __coerce__(self, other): return (self, other) The 2.0 "solves" this by not calling __coerce__ again if the objects returned by __coerce__ are instances. This has the effect of making code like: class A: def __coerce__(self, other): return B(), other class B: def __coerce__(self, other): return 1, other A() + 1 fail to work in the expected way. The question is: how should __coerce__ work? One option is to leave it working the way it does in 2.0. Alternatively, I could change it so that if coerce returns (self, *) then __coerce__ is not called again. Neil From mal at lemburg.com Sat Dec 9 19:49:29 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Sat, 09 Dec 2000 19:49:29 +0100 Subject: [Python-Dev] PEP 208 and __coerce__ References: <20001209033006.A3737@glacier.fnational.com> Message-ID: <3A327EB9.BD2CA3CC@lemburg.com> Neil Schemenauer wrote: > > While working on the implementation of PEP 208, I discovered that > __coerce__ has some surprising properties. 
Initially I > implemented __coerce__ so that the numberic operation currently > being performed was called on the values returned by __coerce__. > This caused test_class to blow up due to code like this: > > class Test: > def __coerce__(self, other): > return (self, other) > > The 2.0 "solves" this by not calling __coerce__ again if the > objects returned by __coerce__ are instances. This has the > effect of making code like: > > class A: > def __coerce__(self, other): > return B(), other > > class B: > def __coerce__(self, other): > return 1, other > > A() + 1 > > fail to work in the expected way. The question is: how should > __coerce__ work? One option is to leave it work the way it does > in 2.0. Alternatively, I could change it so that if coerce > returns (self, *) then __coerce__ is not called again. +0 -- the idea behind the PEP 208 is to get rid off the centralized coercion mechanism, so fixing it to allow yet more obscure variants should be carefully considered. I see __coerce__ et al. as old style mechanisms -- operator methods have much more information available to do the right thing than the single bottelneck __coerce__. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tim.one at home.com Sat Dec 9 21:49:04 2000 From: tim.one at home.com (Tim Peters) Date: Sat, 9 Dec 2000 15:49:04 -0500 Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > ... > I've implemented this, except I use a static variable for the finger > intead of a per-dict finger. I'm concerned about adding 4-8 extra > bytes to each dict object for a feature that most dictionaries never > need. It's a bit ironic that dicts are guaranteed to be at least 1/3 wasted space . 
Let's pick on Christian's idea to reclaim a few bytes of that. > So, instead, I use a single shared finger. This works just as > well as long as this is used for a single dictionary. For multiple > dictionaries (either used by the same thread or in different threads), > it'll work almost as well, although it's possible to make up a > pathological example that would work qadratically. > > An easy example of such a pathological example is to call popitem() > for two identical dictionaries in lock step. Please see my later comments attached to the patch: http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470 In short, for me (truly) identical dicts perform well with or without my suggestion, while dicts cloned via dict.copy() perform horribly with or without my suggestion (their internal structures differ); still curious as to whether that's also true for you (am I looking at a Windows bug? I don't see how, but it's possible ...). In any case, my suggestion turned out to be worthless on my box. Playing around via simulations suggests that a shared finger is going to be disastrous when consuming more than one dict unless they have identical internal structure (not just compare equal). As soon as they get a little out of synch, it just gets worse with each succeeding probe. > Comments please! We could: > > - Live with the pathological cases. How boring . > - Forget the whole thing; and then also forget about firstkey() > etc. which has the same problem only worse. I don't know that this is an important idea for dicts in general (it is important for sets) -- it's akin to an xrange for dicts. But then I've had more than one real-life program that built giant dicts then ran out of memory trying to iterate over them! I'd like to fix that. > - Fix the algorithm. Maybe jumping criss-cross through the hash table > like lookdict does would improve that; but I don't understand the > math used for that ("Cycle through GF(2^n)-{0}" ???). 
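(An editorial sketch of the "Cycle through GF(2^n)-{0}" trick being asked about: the probe step amounts to a linear-feedback shift register, i.e. repeatedly multiplying by x modulo a primitive polynomial over GF(2), which per Christian's explanation visits every nonzero n-bit value exactly once before repeating. Shown here with n=5 and the primitive polynomial x^5 + x^2 + 1; the parameters are illustrative, not the ones dictobject.c uses.)

```python
def gf_cycle(nbits=5, poly=0b100101):
    """Visit every element of GF(2**nbits) - {0} once, starting from 1.

    poly must be a primitive polynomial of degree nbits over GF(2);
    x^5 + x^2 + 1 (0b100101) is primitive, so the cycle has length
    2**5 - 1 = 31.
    """
    x = 1
    out = []
    while True:
        out.append(x)
        x <<= 1              # multiply by x ...
        if x >> nbits:
            x ^= poly        # ... and reduce modulo the polynomial
        if x == 1:           # back at the start: full cycle completed
            break
    return out

cycle = gf_cycle()
# Every nonzero 5-bit value appears exactly once before the sequence repeats:
assert sorted(cycle) == list(range(1, 32))
```
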
Christian explained that well (thanks!). However, I still don't see any point to doing that business in .popitem(): when inserting keys, the jitterbug probe sequence has the crucial benefit of preventing primary clustering when keys collide. But when we consume a dict, we just want to visit every slot as quickly as possible. [Christian] > Appendix, on the use of finger: > ------------------------------- > > Instead of using a global finger variable, you can do the > following (involving a cast from object to int): > > - if the 0'th slot of the dict is non-empty: > return this element and insert the dummy element > as key. Set the value field to what the Dictionary Algorithm > would give for the removed object's hash. This is the > next finger. > - else: > treat the value field of the 0'th slot as the last finger. > If it is zero, initialize it with 2^n-1. > Repetitively use the DA until you find an entry. Save > the finger in slot 0 again. > > This doesn't cost an extra slot, and even when the dictionary > is written between removals, the chance to lose the finger > is just 1:(2^n-1) on every insertion. I like that, except: 1) As above, I don't believe the GF business buys anything over a straightforward search when consuming a dict. 2) Overloading the value field bristles with problems, in part because it breaks the invariant that a slot is unused if and only if the value field is NULL, in part because C doesn't guarantee that you can get away with casting an arbitrary int to a pointer and back again. None of the problems in #2 arise if we abuse the me_hash field instead, so the attached does that. 
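(In outline, the build-then-consume-twins loop being timed is something like the sketch below; the sizes and timer calls are illustrative, not Guido's exact test script.)

```python
import time

def time_twins(log2size):
    """Build twin dicts of 2**log2size items, then consume them in lock
    step with popitem() -- the case a shared search finger handles worst."""
    size = 1 << log2size
    d1 = {i: i for i in range(size)}
    d2 = {i: i for i in range(size)}
    t0 = time.perf_counter()
    while d1:
        d1.popitem()
        d2.popitem()
    return (time.perf_counter() - t0) / size  # seconds per item consumed

for log2size in (10, 12, 14):
    usec = time_twins(log2size) * 1e6
    print("log2size = %d  size = %d  %.1f usec per item to destroy twins"
          % (log2size, 1 << log2size, usec))
```
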
Here's a typical run of Guido's test case using this (on an 866MHz machine w/ 256Mb RAM -- the early values jump all over the place from run to run): run = 0 log2size = 10 size = 1024 7.4 usec per item to build (total 0.008 sec) 3.4 usec per item to destroy twins (total 0.003 sec) log2size = 11 size = 2048 6.7 usec per item to build (total 0.014 sec) 3.4 usec per item to destroy twins (total 0.007 sec) log2size = 12 size = 4096 7.0 usec per item to build (total 0.029 sec) 3.7 usec per item to destroy twins (total 0.015 sec) log2size = 13 size = 8192 7.1 usec per item to build (total 0.058 sec) 5.9 usec per item to destroy twins (total 0.048 sec) log2size = 14 size = 16384 14.7 usec per item to build (total 0.241 sec) 6.4 usec per item to destroy twins (total 0.105 sec) log2size = 15 size = 32768 12.2 usec per item to build (total 0.401 sec) 3.9 usec per item to destroy twins (total 0.128 sec) log2size = 16 size = 65536 7.8 usec per item to build (total 0.509 sec) 4.0 usec per item to destroy twins (total 0.265 sec) log2size = 17 size = 131072 7.9 usec per item to build (total 1.031 sec) 4.1 usec per item to destroy twins (total 0.543 sec) The last one is over 100 usec per item using the original patch (with or without my first suggestion). if-i-were-a-betting-man-i'd-say-"bingo"-ly y'rs - tim Drop-in replacement for the popitem in the patch: static PyObject * dict_popitem(dictobject *mp, PyObject *args) { int i = 0; dictentry *ep; PyObject *res; if (!PyArg_NoArgs(args)) return NULL; if (mp->ma_used == 0) { PyErr_SetString(PyExc_KeyError, "popitem(): dictionary is empty"); return NULL; } /* Set ep to "the first" dict entry with a value. We abuse the hash * field of slot 0 to hold a search finger: * If slot 0 has a value, use slot 0. * Else slot 0 is being used to hold a search finger, * and we use its hash value as the first index to look. 
*/ ep = &mp->ma_table[0]; if (ep->me_value == NULL) { i = (int)ep->me_hash; /* The hash field may be uninitialized trash, or it * may be a real hash value, or it may be a legit * search finger, or it may be a once-legit search * finger that's out of bounds now because it * wrapped around or the table shrunk -- simply * make sure it's in bounds now. */ if (i >= mp->ma_size || i < 1) i = 1; /* skip slot 0 */ while ((ep = &mp->ma_table[i])->me_value == NULL) { i++; if (i >= mp->ma_size) i = 1; } } res = PyTuple_New(2); if (res != NULL) { PyTuple_SET_ITEM(res, 0, ep->me_key); PyTuple_SET_ITEM(res, 1, ep->me_value); Py_INCREF(dummy); ep->me_key = dummy; ep->me_value = NULL; mp->ma_used--; } assert(mp->ma_table[0].me_value == NULL); mp->ma_table[0].me_hash = i + 1; /* next place to start */ return res; } From tim.one at home.com Sat Dec 9 22:09:30 2000 From: tim.one at home.com (Tim Peters) Date: Sat, 9 Dec 2000 16:09:30 -0500 Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: Message-ID: > assert(mp->ma_table[0].me_value == NULL); > mp->ma_table[0].me_hash = i + 1; /* next place to start */ Ack, those two lines should move up into the "if (res != NULL)" block. errors-are-error-prone-ly y'rs - tim From gvwilson at nevex.com Sun Dec 10 17:11:09 2000 From: gvwilson at nevex.com (Greg Wilson) Date: Sun, 10 Dec 2000 11:11:09 -0500 Subject: [Python-Dev] re: So You Want to Write About Python? Message-ID: Hi, folks. Jon Erickson (Doctor Dobb's Journal), Frank Willison (O'Reilly), and I (professional loose cannon) are doing a workshop at IPC on writing books and magazine articles about Python. It would be great to have a few articles (in various stages of their lives) and/or book proposals from people on this list to use as examples. So, if you think the world oughta know about the things you're doing, and would like to use this to help get yourself motivated to start writing, please drop me a line. 
I'm particularly interested in: - the real-world issues involved in moving to Unicode - non-trivial XML processing using SAX and DOM (where "non-trivial" means "including namespaces, entity references, error handling, and all that") - the theory and practice of stackless, generators, and continuations - the real-world tradeoffs between the various memory management schemes that are now available for Python - feature comparisons of various Foobars that can be used with Python (where "Foobar" could be "GUI toolkit", "IDE", "web scripting toolkit", or just about anything else) - performance analysis and tuning of Python itself (as an example of how you speed up real applications --- this is something that matters a lot in the real world, but tends to get forgotten in school) - just about anything else that you wish someone had written for you before you started your last big project Thanks, Greg From paul at prescod.net Sun Dec 10 19:02:27 2000 From: paul at prescod.net (Paul Prescod) Date: Sun, 10 Dec 2000 10:02:27 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> Message-ID: <3A33C533.ABA27C7C@prescod.net> Guido van Rossum wrote: > > Nobody seems to care much about the warnings PEP so far. What's up? > Are you all too busy buying presents for the holidays? Then get me > some too, please? :-) My opinions: * it should be a built-in or keyword, not a function in "sys". Warning is supposed to be as easy as possible so people will do it often. sys.argv and sys.stdout annoy me as it is. * the term "level" applied to warnings typically means "warning level" as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something. * this level idea gives rise to another question. What if I want to see the full stack context of a warning? Do I have to implement a whole new warning output hook? 
It seems like I should be able to specify this as a command line option alongside the action. * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out. * should there be a sys.formatwarning? What if I want to redirect warnings to a socket -- I'd like to use the standard formatting machinery. Or vice versa, I might want to change the formatting but not override the destination. * there should be a "RuntimeWarning" -- base category for warnings about dubious runtime behaviors (e.g. integer division truncated value) * it should be possible to strip warnings as an optimization step. That may require interpreter and syntax support. * warnings will usually be tied to tests which the user will want to be able to optimize out also. (e.g. if __debug__ and type(foo)==StringType: warn "Should be Unicode!") I propose: >>> warn conditional, message[, category] to be very parallel with >>> assert conditional, message I'm not proposing the use of the assert keyword anymore, but I am trying to reuse the syntax for familiarity. Perhaps -Wstrip would strip warnings out of the bytecode. Paul Prescod From nas at arctrix.com Sun Dec 10 14:46:46 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Sun, 10 Dec 2000 05:46:46 -0800 Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) Message-ID: <20001210054646.A5219@glacier.fnational.com> Sourceforge uploads are not working. The latest version of the patch for PEP 208 is here: http://arctrix.com/nas/python/coerce-6.0.diff Operations on instances now call __coerce__ if it exists. I think the patch is now complete. Converting other builtin types to "new style numbers" can be done with a separate patch. Neil From guido at python.org Sun Dec 10 23:17:08 2000 From: guido at python.org (Guido van Rossum) Date: Sun, 10 Dec 2000 17:17:08 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Sun, 10 Dec 2000 10:02:27 PST." 
<3A33C533.ABA27C7C@prescod.net> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> Message-ID: <200012102217.RAA12550@cj20424-a.reston1.va.home.com> > My opinions: > > * it should be a built-in or keyword, not a function in "sys". Warning > is supposed to be as easy as possible so people will do it often. Disagree. Warnings are there mostly for the Python system to warn the Python programmer. The most heavy use will come from the standard library, not from user code. > sys.argv and sys.stdout annoy me as it is. Too bad. > * the term "level" applied to warnings typically means "warning level" > as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something. Good point. > * this level idea gives rise to another question. What if I want to see > the full stack context of a warning? Do I have to implement a whole new > warning output hook? It seems like I should be able to specify this as a > command line option alongside the action. Turn warnings into errors and you'll get a full traceback. If you really want a full traceback without exiting, some creative use of sys._getframe() and the traceback module will probably suit you well. > * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out. I don't. > * should there be a sys.formatwarning? What if I want to redirect > warnings to a socket -- I'd like to use the standard formatting > machinery. Or vice versa, I might want to change the formatting but not > override the destination. Good point. I'm changing this to: def showwarning(message, category, filename, lineno, file=None): """Hook to write a warning to a file; replace if you like.""" and def formatwarning(message, category, filename, lineno): """Hook to format a warning the standard way.""" > * there should be a "RuntimeWarning" -- base category for warnings > about dubious runtime behaviors (e.g. integer division truncated value) OK. 
> * it should be possible to strip warnings as an optimization step. That > may require interpreter and syntax support. I don't see the point of this. I think this comes from our different views on who should issue warnings. > * warnings will usually be tied to tests which the user will want to be > able to optimize out also. (e.g. if __debug__ and type(foo)==StringType: > warn "Should be Unicode!") > > I propose: > > >>> warn conditional, message[, category] Sorry, this is not worth a new keyword. > to be very parallel with > > >>> assert conditional, message > > I'm not proposing the use of the assert keyword anymore, but I am trying > to reuse the syntax for familiarity. Perhaps -Wstrip would strip > warnings out of the bytecode. Why? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at effbot.org Mon Dec 11 01:16:25 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Mon, 11 Dec 2000 01:16:25 +0100 Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com> Message-ID: <000901c06307$9a814d60$3c6340d5@hagrid> Guido wrote: > Moshe proposes to add an overridable function sys.displayhook(obj) > which will be called by the interpreter for the PRINT_EXPR opcode, > instead of hardcoding the behavior. The default implementation will > of course have the current behavior, but this makes it much simpler to > experiment with alternatives, e.g. using str() instead of repr() (or > to choose between str() and repr() based on the type). hmm. instead of patching here and there, what's stopping us from doing it the right way? 
I'd prefer something like: import code class myCLI(code.InteractiveConsole): def displayhook(self, data): # non-standard display hook print str(data) sys.setcli(myCLI()) (in other words, why not move the *entire* command line interface over to Python code) From guido at python.org Mon Dec 11 03:24:20 2000 From: guido at python.org (Guido van Rossum) Date: Sun, 10 Dec 2000 21:24:20 -0500 Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use In-Reply-To: Your message of "Mon, 11 Dec 2000 01:16:25 +0100." <000901c06307$9a814d60$3c6340d5@hagrid> References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com> <000901c06307$9a814d60$3c6340d5@hagrid> Message-ID: <200012110224.VAA12844@cj20424-a.reston1.va.home.com> > Guido wrote: > > Moshe proposes to add an overridable function sys.displayhook(obj) > > which will be called by the interpreter for the PRINT_EXPR opcode, > > instead of hardcoding the behavior. The default implementation will > > of course have the current behavior, but this makes it much simpler to > > experiment with alternatives, e.g. using str() instead of repr() (or > > to choose between str() and repr() based on the type). Effbot regurgitates: > hmm. instead of patching here and there, what's stopping us > from doing it the right way? I'd prefer something like: > > import code > > class myCLI(code.InteractiveConsole): > def displayhook(self, data): > # non-standard display hook > print str(data) > > sys.setcli(myCLI()) > > (in other words, why not move the *entire* command line interface > over to Python code) Indeed, this is why I've been hesitant to bless Moshe's hack. I finally decided to go for it because I don't see this redesign of the CLI happening anytime soon. In order to do it right, it would require a redesign of the parser input handling, which is probably the oldest code in Python (short of the long integer math, which predates Python by several years). 
The current code module is a hack, alas, and doesn't always get it right the same way as the *real* CLI does things.

So, rather than wait forever for the perfect solution, I think it's okay to settle for less sooner.  "Now is better than never."

--Guido van Rossum (home page: http://www.python.org/~guido/)

From paulp at ActiveState.com Mon Dec 11 07:59:29 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 10 Dec 2000 22:59:29 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com>
Message-ID: <3A347B51.ADB3F12C@ActiveState.com>

Guido van Rossum wrote:
> >...
>
> Disagree.  Warnings are there mostly for the Python system to warn the
> Python programmer.  The most heavy use will come from the standard
> library, not from user code.

Most Python code is part of some library or another.  It may not be the standard library but it's still a library.  Perl and Java both make warnings (especially about deprecation) very easy *for user code*.

> > * it should be possible to strip warnings as an optimization step. That
> > may require interpreter and syntax support.
>
> I don't see the point of this.  I think this comes from our different
> views on who should issue warnings.

Everyone who creates a reusable library will want to issue warnings.  That is to say, most serious Python programmers.

Anyhow, let's presume that it is only the standard library that issues warnings (for argument's sake).  What if I have a speed-critical module that triggers warnings in an inner loop.  Turning off the warning doesn't turn off the overhead of the warning infrastructure.  I should be able to turn off the overhead easily -- ideally from the Python command line.  And I still feel that part of that "overhead" is in the code that tests to determine whether to issue the warnings.
There should be a way to turn off that overhead also.

 Paul

From paulp at ActiveState.com Mon Dec 11 08:23:17 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 10 Dec 2000 23:23:17 -0800
Subject: [Python-Dev] Online help PEP
Message-ID: <3A3480E5.C2577AE6@ActiveState.com>

PEP: ???
Title: Python Online Help
Version: $Revision: 1.0 $
Author: paul at prescod.net, paulp at activestate.com (Paul Prescod)
Status: Draft
Type: Standards Track
Python-Version: 2.1
Status: Incomplete

Abstract

    This PEP describes a command-line driven online help facility
    for Python.  The facility should be able to build on existing
    documentation facilities such as the Python documentation
    and docstrings.  It should also be extensible for new types and
    modules.

Interactive use:

    Simply typing "help" describes the help function (through repr
    overloading).

    "help" can also be used as a function:

    The function takes the following forms of input:

        help( "string" ) -- built-in topic or global
        help( ) -- docstring from object or type
        help( "doc:filename" ) -- filename from Python documentation

    If you ask for a global, it can be a fully-qualified name such as
    help("xml.dom").

    You can also use the facility from a command-line

        python --help if

    In either situation, the output does paging similar to the "more"
    command.

Implementation

    The help function is implemented in an onlinehelp module which is
    demand-loaded.

    There should be options for fetching help information from
    environments other than the command line through the onlinehelp
    module:

        onlinehelp.gethelp(object_or_string) -> string

    It should also be possible to override the help display function
    by assigning to onlinehelp.displayhelp(object_or_string).

    The module should be able to extract module information from
    either the HTML or LaTeX versions of the Python documentation.
    Links should be accommodated in a "lynx-like" manner.
    Over time, it should also be able to recognize when docstrings
    are in "special" syntaxes like structured text, HTML and LaTeX
    and decode them appropriately.

    A prototype implementation is available with the Python source
    distribution as nondist/sandbox/doctools/onlinehelp.py.

Built-in Topics

    help( "intro" )      - What is Python? Read this first!
    help( "keywords" )   - What are the keywords?
    help( "syntax" )     - What is the overall syntax?
    help( "operators" )  - What operators are available?
    help( "builtins" )   - What functions, types, etc. are built-in?
    help( "modules" )    - What modules are in the standard library?
    help( "copyright" )  - Who owns Python?
    help( "moreinfo" )   - Where is there more information?
    help( "changes" )    - What changed in Python 2.0?
    help( "extensions" ) - What extensions are installed?
    help( "faq" )        - What questions are frequently asked?
    help( "ack" )        - Who has done work on Python lately?

Security Issues

    This module will attempt to import modules with the same names as
    requested topics.  Don't use the modules if you are not confident
    that everything in your pythonpath is from a trusted source.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:

From tim.one at home.com Mon Dec 11 08:36:57 2000
From: tim.one at home.com (Tim Peters)
Date: Mon, 11 Dec 2000 02:36:57 -0500
Subject: [Python-Dev] FW: [Python-Help] indentation
Message-ID: 

While we're talking about pluggable CLIs, I share this fellow's confusion over IDLE's CLI variant: block code doesn't "look right" under IDLE because sys.ps2 doesn't exist under IDLE.  Some days you can't make *anybody* happy .

-----Original Message-----
...
Subject: [Python-Help] indentation
Sent: Sunday, December 10, 2000 7:32 AM
...

My Problem has to do with indentation: I put the following to idle:

    >>> if not 1:
            print 'Hallo'
    else:
    SyntaxError: invalid syntax

I get the Message above.  I know that else must be 4 spaces to the left, but idle doesn't let me do this.  I have only the alternative to put it to the most left point.
But then I disturb the block structure and I get again the error message.  I want to have it like this:

    >>> if not 1:
            print 'Hallo'
        else:

Can you help me?
...

From fredrik at pythonware.com Mon Dec 11 12:36:53 2000
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 11 Dec 2000 12:36:53 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com>
Message-ID: <033701c06366$ab746580$0900a8c0@SPIFF>

christian wrote:
> That algorithm is really a gem which you should know,
> so let me try to explain it.

I think someone just won the "brain exploder 2000" award ;-)

to paraphrase Bertrand Russell,

    "Mathematics may be defined as the subject where I never
    know what you are talking about, nor whether what you are
    saying is true"

cheers /F

From thomas at xs4all.net Mon Dec 11 13:12:09 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 13:12:09 +0100
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF>; from fredrik@pythonware.com on Mon, Dec 11, 2000 at 12:36:53PM +0100
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
Message-ID: <20001211131208.G4396@xs4all.nl>

On Mon, Dec 11, 2000 at 12:36:53PM +0100, Fredrik Lundh wrote:

> christian wrote:
> > That algorithm is really a gem which you should know,
> > so let me try to explain it.

> I think someone just won the "brain exploder 2000" award ;-)

By acclamation, I'd expect.  I know it was the best laugh I had since last week's Have I Got News For You, even though trying to understand it made me glad I had boring meetings to recuperate in ;)

Highschool-dropout-ly y'rs,
-- 
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
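(For readers outside the thread: the method being discussed, dict.popitem(), removes and returns an arbitrary key/value pair, and the algorithm Christian explains concerns which slot gets handed out next. Its observable contract is simple:)

```python
d = {'a': 1, 'b': 2, 'c': 3}

# popitem() destructively hands back one (key, value) pair at a time;
# which pair comes out first is an implementation detail.
pairs = []
while d:
    pairs.append(d.popitem())

assert d == {}
assert sorted(pairs) == [('a', 1), ('b', 2), ('c', 3)]
```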
From mal at lemburg.com Mon Dec 11 13:33:18 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 11 Dec 2000 13:33:18 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: <3A34C98E.7C42FD24@lemburg.com> Fredrik Lundh wrote: > > christian wrote: > > That algorithm is really a gem which you should know, > > so let me try to explain it. > > I think someone just won the "brain exploder 2000" award ;-) > > to paraphrase Bertrand Russell, > > "Mathematics may be defined as the subject where I never > know what you are talking about, nor whether what you are > saying is true" Hmm, I must have missed that one... care to repost ? -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tismer at tismer.com Mon Dec 11 14:49:48 2000 From: tismer at tismer.com (Christian Tismer) Date: Mon, 11 Dec 2000 15:49:48 +0200 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: <3A34DB7C.FF7E82CE@tismer.com> Fredrik Lundh wrote: > > christian wrote: > > That algorithm is really a gem which you should know, > > so let me try to explain it. > > I think someone just won the "brain exploder 2000" award ;-) > > to paraphrase Bertrand Russell, > > "Mathematics may be defined as the subject where I never > know what you are talking about, nor whether what you are > saying is true" :-)) Well, I was primarily targeting Guido, who said that he came from math, and one cannot study math without standing a basic algebra course, I think. 
I tried my best to explain it for those who know at least how groups, fields, rings and automorphisms work. Going into more details of the theory would be off-topic for python-dev, but I will try it in an upcoming DDJ article. As you might have guessed, I didn't do this just for fun. It is the old game of explaining what is there, convincing everybody that you at least know what you are talking about, and then three days later coming up with an improved application of the theory. Today is Monday, 2 days left. :-) ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From guido at python.org Mon Dec 11 16:12:24 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 10:12:24 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: Your message of "Mon, 11 Dec 2000 15:49:48 +0200." <3A34DB7C.FF7E82CE@tismer.com> References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> Message-ID: <200012111512.KAA23622@cj20424-a.reston1.va.home.com> > Fredrik Lundh wrote: > > > > christian wrote: > > > That algorithm is really a gem which you should know, > > > so let me try to explain it. > > > > I think someone just won the "brain exploder 2000" award ;-) > > > > to paraphrase Bertrand Russell, > > > > "Mathematics may be defined as the subject where I never > > know what you are talking about, nor whether what you are > > saying is true" > > :-)) > > Well, I was primarily targeting Guido, who said that he > came from math, and one cannot study math without standing > a basic algebra course, I think. 
I tried my best to explain > it for those who know at least how groups, fields, rings > and automorphisms work. Going into more details of the > theory would be off-topic for python-dev, but I will try > it in an upcoming DDJ article. I do have a math degree, but it is 18 years old and I had to give up after the first paragraph of your explanation. It made me vividly recall the first and only class on Galois Theory that I ever took -- after one hour I realized that this was not for me and I didn't have a math brain after all. I went back to the basement where the software development lab was (i.e. a row of card punches :-). > As you might have guessed, I didn't do this just for fun. > It is the old game of explaining what is there, convincing > everybody that you at least know what you are talking about, > and then three days later coming up with an improved > application of the theory. > > Today is Monday, 2 days left. :-) I'm very impressed. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Dec 11 16:15:02 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 10:15:02 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Sun, 10 Dec 2000 22:59:29 PST." <3A347B51.ADB3F12C@ActiveState.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com> <3A347B51.ADB3F12C@ActiveState.com> Message-ID: <200012111515.KAA23764@cj20424-a.reston1.va.home.com> [me] > > Disagree. Warnings are there mostly for the Python system to warn the > > Python programmer. The most heavy use will come from the standard > > library, not from user code. [Paul Prescod] > Most Python code is part of some library or another. It may not be the > standard library but its still a library. 
Perl and Java both make > warnings (especially about deprecation) very easy *for user code*. Hey. I'm not making it impossible to use warnings. I'm making it very easy. All you have to do is put "from warnings import warn" at the top of your library module. Requesting a built-in or even a new statement is simply excessive. > > > * it should be possible to strip warnings as an optimization step. That > > > may require interpreter and syntax support. > > > > I don't see the point of this. I think this comes from our different > > views on who should issue warnings. > > Everyone who creates a reusable library will want to issue warnings. > That is to say, most serious Python programmers. > > Anyhow, let's presume that it is only the standard library that issues > warnings (for arguments sake). What if I have a speed-critical module > that triggers warnings in an inner loop. Turning off the warning doesn't > turn off the overhead of the warning infrastructure. I should be able to > turn off the overhead easily -- ideally from the Python command line. > And I still feel that part of that "overhead" is in the code that tests > to determine whether to issue the warnings. There should be a way to > turn off that overhead also. So rewrite your code so that it doesn't trigger the warning. When you get a warning, you're doing something that could be done in a better way. So don't whine about the performance. It's a quality of implementation issue whether C code that tests for issues that deserve warnings can do the test without slowing down code that doesn't deserve a warning. Ditto for standard library code. Here's an example. I expect there will eventually (not in 2.1 yet!) warnings in the deprecated string module. If you get such a warning in a time-critical piece of code, the solution is to use string methods -- not to while about the performance of the backwards compatibility code. 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry at digicool.com Mon Dec 11 17:02:29 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 11 Dec 2000 11:02:29 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net>
Message-ID: <14900.64149.910989.998348@anthem.concentric.net>

Some of my thoughts after reading the PEP and Paul/Guido's exchange.

- A function in the warn module is better than one in the sys module.
  "from warnings import warn" is good enough to not warrant a
  built-in.  I get the sense that the PEP description is behind
  Guido's current implementation here.

- When PyErr_Warn() returns 1, does that mean a warning has been
  transmuted into an exception, or some other exception occurred
  during the setting of the warning?  (I think I know, but the PEP
  could be clearer here).

- It would be nice if lineno can be a range specification.  Other
  matches are based on regexps -- think of this as a line number
  regexp.

- Why not do setupwarnings() in site.py?

- Regexp matching on messages should be case insensitive.

- The second argument to sys.warn() or PyErr_Warn() can be any class,
  right?  If so, it's easy for me to have my own warning classes.
  What if I want to set up my own warnings filters?  Maybe if `action'
  could be a callable as well as a string.  Then in my IDE, I could
  set that to "mygui.popupWarningsDialog".

-Barry

From guido at python.org Mon Dec 11 16:57:33 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 11 Dec 2000 10:57:33 -0500
Subject: [Python-Dev] Online help PEP
In-Reply-To: Your message of "Sun, 10 Dec 2000 23:23:17 PST." <3A3480E5.C2577AE6@ActiveState.com>
References: <3A3480E5.C2577AE6@ActiveState.com>
Message-ID: <200012111557.KAA24266@cj20424-a.reston1.va.home.com>

I approve of the general idea.  Barry, please assign a PEP number.

> PEP: ???
> Title: Python Online Help
> Version: $Revision: 1.0 $
> Author: paul at prescod.net, paulp at activestate.com (Paul Prescod)
> Status: Draft
> Type: Standards Track
> Python-Version: 2.1
> Status: Incomplete
>
> Abstract
>
>     This PEP describes a command-line driven online help facility
>     for Python.  The facility should be able to build on existing
>     documentation facilities such as the Python documentation
>     and docstrings.  It should also be extensible for new types and
>     modules.
>
> Interactive use:
>
>     Simply typing "help" describes the help function (through repr
>     overloading).

Cute -- like license, copyright, credits I suppose.

>     "help" can also be used as a function:
>
>     The function takes the following forms of input:
>
>         help( "string" ) -- built-in topic or global

Why does a global require string quotes?

>         help( ) -- docstring from object or type
>         help( "doc:filename" ) -- filename from Python documentation

I'm missing

    help( ) -- table of contents

I'm not sure if the table of contents should be printed by the repr output.

>     If you ask for a global, it can be a fully-qualified name such as
>     help("xml.dom").

Why are the string quotes needed?  When are they useful?

>     You can also use the facility from a command-line
>
>         python --help if

Is this really useful?  Sounds like Perlism to me.

>     In either situation, the output does paging similar to the "more"
>     command.

Agreed.  But how to implement paging in a platform-dependent manner?  On Unix, os.system("more") or "$PAGER" is likely to work.  On Windows, I suppose we could use its MORE, although that's pretty braindead.  On the Mac?  Also, inside IDLE or Pythonwin, invoking the system pager isn't a good idea.

> Implementation
>
>     The help function is implemented in an onlinehelp module which is
>     demand-loaded.

What does "demand-loaded" mean in a Python context?
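(In Python practice, "demand-loaded" usually just means the import statement lives inside the function, so the module is loaded on first call rather than at interpreter startup. A minimal sketch -- this help() is hypothetical, not the built-in that later shipped, and inspect merely stands in for the PEP's onlinehelp module:)

```python
def help(obj):
    # Demand-loading: the supporting machinery is imported only when
    # help() is first called, so startup doesn't pay for it.
    # (inspect stands in for the PEP's hypothetical onlinehelp module.)
    import inspect
    doc = inspect.getdoc(obj)
    return doc if doc is not None else "(no docstring)"
```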
>     There should be options for fetching help information from
>     environments other than the command line through the onlinehelp
>     module:
>
>         onlinehelp.gethelp(object_or_string) -> string

Good idea.

>     It should also be possible to override the help display function by
>     assigning to onlinehelp.displayhelp(object_or_string).

Good idea.  Pythonwin and IDLE could use this.  But I'd like it to work at least "okay" if they don't.

>     The module should be able to extract module information from either
>     the HTML or LaTeX versions of the Python documentation.  Links should
>     be accommodated in a "lynx-like" manner.

I think this is beyond the scope.  The LaTeX isn't installed anywhere (and processing would be too much work).  The HTML is installed only on Windows, where there already is a way to get it to pop up in your browser (actually two: it's in the Start menu, and also in IDLE's Help menu).

>     Over time, it should also be able to recognize when docstrings are
>     in "special" syntaxes like structured text, HTML and LaTeX and
>     decode them appropriately.

A standard syntax for docstrings is under development, PEP 216.  I don't agree with the proposal there, but in any case the help PEP should not attempt to legalize a different format than PEP 216.

>     A prototype implementation is available with the Python source
>     distribution as nondist/sandbox/doctools/onlinehelp.py.

Neat.  I noticed that in a 24-line screen, the pagesize must be set to 21 to avoid stuff scrolling off the screen.  Maybe there's an off-by-3 error somewhere?

I also noticed that it always prints '1' when invoked as a function.  The new license pager in site.py avoids this problem.

help("operators") and several others raise an AttributeError('handledocrl').

The "lynx-like" links don't work.

> Built-in Topics
>
>     help( "intro" )      - What is Python? Read this first!
>     help( "keywords" )   - What are the keywords?
>     help( "syntax" )     - What is the overall syntax?
>     help( "operators" )  - What operators are available?
> help( "builtins" ) - What functions, types, etc. are built-in? > help( "modules" ) - What modules are in the standard library? > help( "copyright" ) - Who owns Python? > help( "moreinfo" ) - Where is there more information? > help( "changes" ) - What changed in Python 2.0? > help( "extensions" ) - What extensions are installed? > help( "faq" ) - What questions are frequently asked? > help( "ack" ) - Who has done work on Python lately? I think it's naive to expect this help facility to replace browsing the website or the full documentation package. There should be one entry that says to point your browser there (giving the local filesystem URL if available), and that's it. The rest of the online help facility should be concerned with exposing doc strings. > Security Issues > > This module will attempt to import modules with the same names as > requested topics. Don't use the modules if you are not confident > that everything in your pythonpath is from a trusted source. Yikes! Another reason to avoid the "string" -> global variable option. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Dec 11 17:53:37 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 11:53:37 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 11:02:29 EST." <14900.64149.910989.998348@anthem.concentric.net> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> Message-ID: <200012111653.LAA24545@cj20424-a.reston1.va.home.com> > Some of my thoughts after reading the PEP and Paul/Guido's exchange. > > - A function in the warn module is better than one in the sys module. > "from warnings import warn" is good enough to not warrant a > built-in. I get the sense that the PEP description is behind > Guido's currently implementation here. Yes. 
I've updated the PEP to match the (2nd) implementation.

> - When PyErr_Warn() returns 1, does that mean a warning has been
>   transmuted into an exception, or some other exception occurred
>   during the setting of the warning?  (I think I know, but the PEP
>   could be clearer here).

I've clarified this now: it returns 1 in either case.  You have to do exception handling in either case.  I'm not telling why -- you don't need to know.  The caller of PyErr_Warn() should not attempt to catch the exception -- if that's your intent, you shouldn't be calling PyErr_Warn().  And PyErr_Warn() is complicated enough that it has to allow raising an exception.

> - It would be nice if lineno can be a range specification.  Other
>   matches are based on regexps -- think of this as a line number
>   regexp.

Too much complexity already.

> - Why not do setupwarnings() in site.py?

See the PEP and the current implementation.
--Guido van Rossum (home page: http://www.python.org/~guido/)

From thomas at xs4all.net Mon Dec 11 17:58:39 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 11 Dec 2000 17:58:39 +0100
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: <14900.64149.910989.998348@anthem.concentric.net>; from barry@digicool.com on Mon, Dec 11, 2000 at 11:02:29AM -0500
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net>
Message-ID: <20001211175839.H4396@xs4all.nl>

On Mon, Dec 11, 2000 at 11:02:29AM -0500, Barry A. Warsaw wrote:

> - A function in the warn module is better than one in the sys module.
>   "from warnings import warn" is good enough to not warrant a
>   built-in.  I get the sense that the PEP description is behind
>   Guido's current implementation here.

+1 on this.  I have a response to Guido's first posted PEP on my laptop, but due to a weekend in Germany wasn't able to post it before he updated the PEP.  I guess I can delete the arguments for this, now ;) but let's just say I think 'sys' is being a bit overused, and the case of a function in sys and its data in another module is just plain silly.

> - When PyErr_Warn() returns 1, does that mean a warning has been
>   transmuted into an exception, or some other exception occurred
>   during the setting of the warning?  (I think I know, but the PEP
>   could be clearer here).

How about returning 1 for 'warning turned into exception' and -1 for 'normal exception'?  It would be slightly more similar to other functions if '-1' meant 'exception', and it would be easy to put in an if statement -- and still allow C code to ignore the produced error, if it wanted to.

> - It would be nice if lineno can be a range specification.  Other
>   matches are based on regexps -- think of this as a line number
>   regexp.

+0 on this...
I'm not sure if such fine-grained control is really necessary. I liked the hint at 'per function' granularity, but I realise it's tricky to do right, what with naming issues and all that. > - Regexp matching on messages should be case insensitive. How about being able to pass in compiled regexp objects as well as strings ? I haven't looked at the implementation at all, so I'm not sure how expensive it would be, but it might also be nice to have users (= programmers) pass in an object with its own 'match' method, so you can 'interactively' decide whether or not to raise an exception, popup a window, and what not. Sort of like letting 'action' be a callable, which I think is a good idea as well. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido at python.org Mon Dec 11 18:11:02 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 12:11:02 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 17:58:39 +0100." <20001211175839.H4396@xs4all.nl> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> Message-ID: <200012111711.MAA24818@cj20424-a.reston1.va.home.com> > > - When PyErr_Warn() returns 1, does that mean a warning has been > > transmuted into an exception, or some other exception occurred > > during the setting of the warning? (I think I know, but the PEP > > could be clearer here). > > How about returning 1 for 'warning turned into exception' and -1 for 'normal > exception' ? It would be slightly more similar to other functions if '-1' > meant 'exception', and it would be easy to put in an if statement -- and > still allow C code to ignore the produced error, if it wanted to. Why would you want this? The user clearly said that they wanted the exception! 
--Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at effbot.org Mon Dec 11 18:13:10 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Mon, 11 Dec 2000 18:13:10 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com> Message-ID: <009a01c06395$a9da3220$3c6340d5@hagrid> > Hmm, I must have missed that one... care to repost ? doesn't everyone here read the daily URL? here's a link: http://mail.python.org/pipermail/python-dev/2000-December/010913.html From barry at digicool.com Mon Dec 11 18:18:04 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Mon, 11 Dec 2000 12:18:04 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> Message-ID: <14901.3149.109401.151742@anthem.concentric.net> >>>>> "GvR" == Guido van Rossum writes: GvR> I've clarified this now: it returns 1 in either case. You GvR> have to do exception handling in either case. I'm not GvR> telling why -- you don't need to know. The caller of GvR> PyErr_Warn() should not attempt to catch the exception -- if GvR> that's your intent, you shouldn't be calling PyErr_Warn(). GvR> And PyErr_Warn() is complicated enough that it has to allow GvR> raising an exception. Makes sense. >> - It would be nice if lineno can be a range specification. >> Other matches are based on regexps -- think of this as a line >> number regexp. GvR> Too much complexity already. Okay, no biggie I think. >> - Why not do setupwarnings() in site.py? GvR> See the PEP and the current implementation. 
The GvR> delayed-loading of the warnings module means that we have to GvR> save the -W options as sys.warnoptions. (This also makes GvR> them work when multiple interpreters are used -- they all get GvR> the -W options.) Cool. >> - Regexp matching on messages should be case insensitive. GvR> Good point! Done in my version of the code. Cool. >> - The second argument to sys.warn() or PyErr_Warn() can be any >> class, right? GvR> Almost. It must be derived from __builtin__.Warning. __builtin__.Warning == exceptions.Warning, right? >> If so, it's easy for me to have my own warning classes. What >> if I want to set up my own warnings filters? Maybe if `action' >> could be a callable as well as a string. Then in my IDE, I >> could set that to "mygui.popupWarningsDialog". GvR> No, for that purpose you would override GvR> warnings.showwarning(). Cool. Looks good. -Barry From thomas at xs4all.net Mon Dec 11 19:04:56 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Mon, 11 Dec 2000 19:04:56 +0100 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012111711.MAA24818@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 12:11:02PM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com> Message-ID: <20001211190455.I4396@xs4all.nl> On Mon, Dec 11, 2000 at 12:11:02PM -0500, Guido van Rossum wrote: > > How about returning 1 for 'warning turned into exception' and -1 for 'normal > > exception' ? It would be slightly more similar to other functions if '-1' > > meant 'exception', and it would be easy to put in an if statement -- and > > still allow C code to ignore the produced error, if it wanted to. > Why would you want this? The user clearly said that they wanted the > exception! 
The difference is that in one case, the user will see the original warning-turned-exception, and in the other she won't -- the warning will be lost. At best she'll see (by looking at the traceback) the code intended to give a warning (that might or might not have been turned into an exception) and failed. The warning code might decide to do something additional to notify the user of the thing it intended to warn about, which ended up as a 'real' exception because of something else. It's no biggy, obviously, except that if you change your mind it will be hard to add it without breaking code. Even if you explicitly state the return value should be tested for boolean value, not greater-than-zero value. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido at python.org Mon Dec 11 19:16:58 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 13:16:58 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 19:04:56 +0100." <20001211190455.I4396@xs4all.nl> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com> <20001211190455.I4396@xs4all.nl> Message-ID: <200012111816.NAA25214@cj20424-a.reston1.va.home.com> > > > How about returning 1 for 'warning turned into exception' and -1 for 'normal > > > exception' ? It would be slightly more similar to other functions if '-1' > > > meant 'exception', and it would be easy to put in an if statement -- and > > > still allow C code to ignore the produced error, if it wanted to. > > > Why would you want this? The user clearly said that they wanted the > > exception!
> > The difference is that in one case, the user will see the original > warning-turned-exception, and in the other she won't -- the warning will be > lost. At best she'll see (by looking at the traceback) the code intended to > give a warning (that might or might not have been turned into an exception) > and failed. Yes -- this is a standard convention in Python. if there's a bug in code that is used to raise or handle an exception, you get a traceback from that bug. > The warning code might decide to do something aditional to > notify the user of the thing it intended to warn about, which ended up as a > 'real' exception because of something else. Nah. The warning code shouldn't worry about that. If there's a bug in PyErr_Warn(), that should get top priority until it's fixed. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Mon Dec 11 19:12:56 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 11 Dec 2000 19:12:56 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com> <009a01c06395$a9da3220$3c6340d5@hagrid> Message-ID: <3A351928.3A41C970@lemburg.com> Fredrik Lundh wrote: > > > Hmm, I must have missed that one... care to repost ? > > doesn't everyone here read the daily URL? No time for pull logic... only push logic ;-) > here's a link: > http://mail.python.org/pipermail/python-dev/2000-December/010913.html Thanks. A very nice introduction indeed. The only thing which didn't come through in the first reading: why do we need GF(p^n)'s in the first place ? The second reading then made this clear: we need to assure that by iterating through the set of possible coefficients we can actually reach all slots in the dictionary... a gem indeed. 
Now if we could only figure out an equally simple way of producing perfect hash functions on-the-fly we could eliminate the need for the PyObject_Compare()s... ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tim.one at home.com Mon Dec 11 21:22:55 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 11 Dec 2000 15:22:55 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: [/F, on Christian's GF tutorial] > I think someone just won the "brain exploder 2000" award ;-) Well, anyone can play. When keys collide, what we need is a function f(i) such that repeating i = f(i) visits every int in (0, 2**N) exactly once before setting i back to its initial value, for a fixed N and where the first i is in (0, 2**N). This is the quickest:

    def f(i):
        i -= 1
        if i == 0:
            i = 2**N-1
        return i

Unfortunately, this leads to performance-destroying "primary collisions" (see Knuth, or any other text w/ a section on hashing). Other *good* possibilities include a pseudo-random number generator of maximal period, or viewing the ints in (0, 2**N) as bit vectors indicating set membership and generating all subsets of an N-element set in a Gray code order. The *form* of the function dictobject.c actually uses is:

    def f(i):
        i <<= 1
        if i >= 2**N:
            i ^= MAGIC_CONSTANT_DEPENDING_ON_N
        return i

which is suitably non-linear and as fast as the naive method. Given the form of the function, you don't need any theory at all to find a value for MAGIC_CONSTANT_DEPENDING_ON_N that simply works. In fact, I verified all the magic values in dictobject.c via brute force, because the person who contributed the original code botched the theory slightly and gave us some values that didn't work.
I'll rely on the theory if and only if we have to extend this to 64-bit machines someday: I'm too old to wait for a brute search of a space with 2**64 elements . mathematics-is-a-battle-against-mortality-ly y'rs - tim From greg at cosc.canterbury.ac.nz Mon Dec 11 22:46:11 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 12 Dec 2000 10:46:11 +1300 (NZDT) Subject: [Python-Dev] Online help PEP In-Reply-To: <200012111557.KAA24266@cj20424-a.reston1.va.home.com> Message-ID: <200012112146.KAA01771@s454.cosc.canterbury.ac.nz> Guido: > Paul Prescod: > > In either situation, the output does paging similar to the "more" > > command. > Agreed. Only if it can be turned off! I usually prefer to use the scrolling capabilities of whatever shell window I'm using rather than having some program's own idea of how to do paging forced upon me when I don't want it. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From moshez at zadka.site.co.il Tue Dec 12 07:33:02 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Tue, 12 Dec 2000 08:33:02 +0200 (IST) Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) Message-ID: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il> On Mon, 11 Dec 2000 15:22:55 -0500, "Tim Peters" wrote: > Well, anyone can play. When keys collide, what we need is a function f(i) > such that repeating > i = f(i) > visits every int in (0, 2**N) exactly once before setting i back to its > initial value, for a fixed N and where the first i is in (0, 2**N). OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)? Did 0 harm you in your childhood, and you're trying to get back? <0 wink>. If we had an affine operation, instead of a linear one, we could have [0, 2**n). 
I won't repeat the proof here but changing

> def f(i):
>     i <<= 1
      i^=1  # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i

Makes you waltz all over [0, 2**n) if the original made you comple (0, 2**n). if-i'm-wrong-then-someone-should-shoot-me-to-save-me-the-embarrassment-ly y'rs, Z. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tim.one at home.com Mon Dec 11 23:38:56 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 11 Dec 2000 17:38:56 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il> Message-ID: [Tim] > Well, anyone can play. When keys collide, what we need is a > function f(i) such that repeating > i = f(i) > visits every int in (0, 2**N) exactly once before setting i back to its > initial value, for a fixed N and where the first i is in (0, 2**N). [Moshe Zadka] > OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)? > Did 0 harm you in your childhood, and you're trying to get > back? <0 wink>. We don't need f at all unless we've already determined there's a collision at some index h. The i sequence is used to offset h (mod 2**N). An increment of 0 would waste time (h+0 == h, but we've already done a full compare on the h'th table entry and already determined it wasn't equal to what we're looking for). IOW, there are only 2**N-1 slots still of interest by the time f is needed.

> If we had an affine operation, instead of a linear one, we could have
> [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1  # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you comple
> (0, 2**n).

But, Moshe! The proof would have been the most interesting part .
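The brute-force verification Tim mentions (checking that each magic constant really makes the shift-xor recurrence sweep all of (0, 2**N) exactly once) can be sketched in a few lines. This is modern Python written for illustration, not the dictobject.c source; the function names are invented here:

```python
# Brute-force check of the probe recurrence f(i) = (i << 1), xor'd with a
# magic constant whenever the shift overflows N bits.

def probe_cycle(N, magic, start=1):
    """Return the cycle i, f(i), f(f(i)), ... until the sequence repeats."""
    seen = []
    i = start
    while i not in seen:
        seen.append(i)
        i <<= 1
        if i >= 2 ** N:
            i ^= magic
    return seen

def is_full_cycle(N, magic):
    """True if the recurrence visits every int in (0, 2**N) exactly once."""
    return sorted(probe_cycle(N, magic)) == list(range(1, 2 ** N))

def find_magics(N):
    """All workable constants (each must have bit N set, so the xor shrinks i)."""
    return [m for m in range(2 ** N, 2 ** (N + 1)) if is_full_cycle(N, m)]
```

For N=3 the survivors are exactly 11 and 13, i.e. the two primitive polynomials x^3+x+1 and x^3+x^2+1 from Christian's GF tutorial, and the constants in dictobject.c's table (such as 19 for N=4 and 37 for N=5) pass the same check.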
From gstein at lyra.org Tue Dec 12 01:15:50 2000 From: gstein at lyra.org (Greg Stein) Date: Mon, 11 Dec 2000 16:15:50 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012111653.LAA24545@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 11:53:37AM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> Message-ID: <20001211161550.Y7732@lyra.org> On Mon, Dec 11, 2000 at 11:53:37AM -0500, Guido van Rossum wrote: >... > > - The second argument to sys.warn() or PyErr_Warn() can be any class, > > right? > > Almost. It must be derived from __builtin__.Warning. Since you must do "from warnings import warn" before using the warnings, then I think it makes sense to put the Warning classes into the warnings module. (e.g. why increase the size of the builtins?) Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at python.org Tue Dec 12 01:39:31 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 19:39:31 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 16:15:50 PST." <20001211161550.Y7732@lyra.org> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> Message-ID: <200012120039.TAA02983@cj20424-a.reston1.va.home.com> > Since you must do "from warnings import warn" before using the warnings, > then I think it makes sense to put the Warning classes into the warnings > module. (e.g. why increase the size of the builtins?) 
I don't particularly care whether the Warning category classes are builtins, but I can't declare them in the warnings module. Typical use from C is: if (PyErr_Warn(PyExc_DeprecationWarning, "the strop module is deprecated")) return NULL; PyErr_Warn() imports the warnings module on its first call. But the value of PyExc_DeprecationWarning c.s. must be available *before* the first call, so they can't be imported from the warnings module! My first version imported warnings at the start of the program, but this almost doubled the start-up time, hence the design where the module is imported only when needed. The most convenient place to create the Warning category classes is in the _exceptions module; doing it the easiest way there means that they are automatically exported to __builtin__. This doesn't bother me enough to try and hide them. --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Tue Dec 12 02:11:02 2000 From: gstein at lyra.org (Greg Stein) Date: Mon, 11 Dec 2000 17:11:02 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012120039.TAA02983@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 07:39:31PM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> Message-ID: <20001211171102.C7732@lyra.org> On Mon, Dec 11, 2000 at 07:39:31PM -0500, Guido van Rossum wrote: > > Since you must do "from warnings import warn" before using the warnings, > > then I think it makes sense to put the Warning classes into the warnings > > module. (e.g. why increase the size of the builtins?) 
> > I don't particularly care whether the Warning category classes are > builtins, but I can't declare them in the warnings module. Typical > use from C is: > > if (PyErr_Warn(PyExc_DeprecationWarning, > "the strop module is deprecated")) > return NULL; > > PyErr_Warn() imports the warnings module on its first call. But the > value of PyExc_DeprecationWarning c.s. must be available *before* the > first call, so they can't be imported from the warnings module! Do the following: pywarn.h or pyerrors.h: #define PyWARN_DEPRECATION "DeprecationWarning" ... if (PyErr_Warn(PyWARN_DEPRECATION, "the strop module is deprecated")) return NULL; The PyErr_Warn would then use the string to dynamically look up / bind to the correct value from the warnings module. By using the symbolic constant, you will catch typos in the C code (e.g. if people passed raw strings, then a typo won't be found until runtime; using symbols will catch the problem at compile time). The above strategy will allow for fully-delayed loading, and for all the warnings to be located in the "warnings" module. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at python.org Tue Dec 12 02:21:41 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 20:21:41 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 17:11:02 PST." <20001211171102.C7732@lyra.org> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> <20001211171102.C7732@lyra.org> Message-ID: <200012120121.UAA04576@cj20424-a.reston1.va.home.com> > > PyErr_Warn() imports the warnings module on its first call. But the > > value of PyExc_DeprecationWarning c.s. 
must be available *before* the > > first call, so they can't be imported from the warnings module! > > Do the following: > > pywarn.h or pyerrors.h: > > #define PyWARN_DEPRECATION "DeprecationWarning" > > ... > if (PyErr_Warn(PyWARN_DEPRECATION, > "the strop module is deprecated")) > return NULL; > > The PyErr_Warn would then use the string to dynamically look up / bind to > the correct value from the warnings module. By using the symbolic constant, > you will catch typos in the C code (e.g. if people passed raw strings, then > a typo won't be found until runtime; using symbols will catch the problem at > compile time). > > The above strategy will allow for fully-delayed loading, and for all the > warnings to be located in the "warnings" module. Yeah, that would be a possibility, if it was deemed evil that the warnings appear in __builtin__. I don't see what's so evil about that. (There's also the problem that the C code must be able to create new warning categories, as long as they are derived from the Warning base class. Your approach above doesn't support this. I'm sure you can figure a way around that too. But I prefer to hear why you think it's needed first.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Tue Dec 12 02:26:00 2000 From: gstein at lyra.org (Greg Stein) Date: Mon, 11 Dec 2000 17:26:00 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012120121.UAA04576@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 08:21:41PM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> <20001211171102.C7732@lyra.org> <200012120121.UAA04576@cj20424-a.reston1.va.home.com> Message-ID: <20001211172600.E7732@lyra.org> On Mon, Dec 11, 2000 at 08:21:41PM -0500, Guido van Rossum wrote: >... > > The above strategy will allow for fully-delayed loading, and for all the > > warnings to be located in the "warnings" module. > > Yeah, that would be a possibility, if it was deemed evil that the > warnings appear in __builtin__. I don't see what's so evil about > that. > > (There's also the problem that the C code must be able to create new > warning categories, as long as they are derived from the Warning base > class. Your approach above doesn't support this. I'm sure you can > figure a way around that too. But I prefer to hear why you think it's > needed first.) I'm just attempting to avoid dumping more names into __builtins__ is all. 
I don't believe there is anything intrinsically bad about putting more names in there, but avoiding the kitchen-sink metaphor for __builtins__ has got to be a Good Thing :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at python.org Tue Dec 12 14:43:59 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Dec 2000 08:43:59 -0500 Subject: [Python-Dev] Request review of gdbm patch Message-ID: <200012121343.IAA06713@cj20424-a.reston1.va.home.com> I'm asking for a review of the patch to gdbm at http://sourceforge.net/patch/?func=detailpatch&patch_id=102638&group_id=5470 I asked the author for clarification and this is what I got. Can anybody suggest what to do? His mail doesn't give me much confidence in the patch. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message Date: Tue, 12 Dec 2000 13:24:13 +0100 From: Damjan To: Guido van Rossum Subject: Re: your gdbm patch for Python On Mon, Dec 11, 2000 at 03:51:03PM -0500, Guido van Rossum wrote: > I'm looking at your patch at SourceForge: First, I'm sorry it was such a mess of a patch, but I couldn't figure out how to send a more elaborate comment. (But then again, I wouldn't have an email from Guido van Rossum in my mail-box, to show off to my friends :) > and wondering two things: > > (1) what does the patch do? > > (2) why does the patch remove the 'f' / GDBM_FAST option? From the gdbm info page: ...The following may also be logically or'd into the database flags: GDBM_SYNC, which causes all database operations to be synchronized to the disk, and GDBM_NOLOCK, which prevents the library from performing any locking on the database file. The option GDBM_FAST is now obsolete, since `gdbm' defaults to no-sync mode... ^^^^^^^^ (1) My patch adds two options to the gdbm.open(..) function. These are 'u' for GDBM_NOLOCK, and 's' for GDBM_SYNC. (2) GDBM_FAST is obsolete because gdbm defaults to GDBM_FAST, so it's removed.
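The flag letters the patch describes ('s' for GDBM_SYNC, 'u' for GDBM_NOLOCK, alongside the older 'f') did end up in the Python standard library: today's dbm.gnu module accepts exactly these characters appended to the open mode. A minimal sketch, assuming the GNU gdbm bindings are built on this platform:

```python
import os
import tempfile

try:
    import dbm.gnu as gdbm
except ImportError:
    gdbm = None  # GNU gdbm bindings not compiled into this Python

if gdbm is not None:
    path = os.path.join(tempfile.mkdtemp(), "demo.gdbm")

    db = gdbm.open(path, "cs")   # create; 's' = synchronize writes (GDBM_SYNC)
    db[b"answer"] = b"42"
    db.close()

    db = gdbm.open(path, "ru")   # read-only; 'u' = skip file locking (GDBM_NOLOCK)
    assert db[b"answer"] == b"42"
    db.close()
```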
I'm also thinking about adding lock and unlock methods to the gdbm object, but it seems that a gdbm database can only be locked and not unlocked. - -- Damjan Georgievski | Skopje, Macedonia ------- End of Forwarded Message From mal at lemburg.com Tue Dec 12 14:49:40 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Tue, 12 Dec 2000 14:49:40 +0100 Subject: [Python-Dev] Codec aliasing and naming Message-ID: <3A362CF4.2082A606@lemburg.com> I just wanted to inform you of a change I plan for the standard encodings search function to enable better support for aliasing of encoding names. The current implementation caches the aliases returned from the codecs .getaliases() function in the encodings lookup cache rather than in the alias cache. As a consequence, the hyphen to underscore mapping is not applied to the aliases. A codec would have to return a list of all combinations of names with hyphens and underscores in order to emulate the standard lookup behaviour. I have a patch which fixes this and also assures that aliases cannot be overwritten by codecs which register at some later point in time. This assures that we won't run into situations where a codec import suddenly overrides behaviour of previously active codecs. I would also like to propose the use of a new naming scheme for codecs which enables drop-in installation. As discussed on the i18n-sig list, people would like to install codecs without requiring users to call a codec registration function or to touch site.py. The standard search function in the encodings package has a nice property (which I only noticed after the fact ;) which allows using Python package names in the encoding names, e.g.
you can install a package 'japanese' and then access the codecs in that package using 'japanese.shiftjis' without having to bother registering a new codec search function for the package -- the encodings package search function will redirect the lookup to the 'japanese' package. Using package names in the encoding name has several advantages:

* you know where the codec comes from
* you can have multiple codecs for the same encoding
* drop-in installation without registration is possible
* the need for a non-default encoding package is visible in the source code
* you no longer need to drop new codecs into the Python standard lib

Perhaps someone could add a note about this possibility to the codec docs ?! If no one objects, I'll apply the patch for the enhanced alias support later today. Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at python.org Tue Dec 12 14:57:01 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Dec 2000 08:57:01 -0500 Subject: [Python-Dev] Codec aliasing and naming In-Reply-To: Your message of "Tue, 12 Dec 2000 14:49:40 +0100." <3A362CF4.2082A606@lemburg.com> References: <3A362CF4.2082A606@lemburg.com> Message-ID: <200012121357.IAA06846@cj20424-a.reston1.va.home.com> > Perhaps someone could add a note about this possibility > to the codec docs ?! You can check it in yourself or mail it to Fred or submit it to SF... I don't expect anyone else will jump in and document this properly. > If no one objects, I'll apply the patch for the enhanced alias > support later today. Fine with me (but I don't use codecs -- where's the Dutch language support? :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Tue Dec 12 15:38:20 2000 From: mal at lemburg.com (M.-A.
Lemburg) Date: Tue, 12 Dec 2000 15:38:20 +0100 Subject: [Python-Dev] Codec aliasing and naming References: <3A362CF4.2082A606@lemburg.com> <200012121357.IAA06846@cj20424-a.reston1.va.home.com> Message-ID: <3A36385C.60C7F2B@lemburg.com> Guido van Rossum wrote: > > > Perhaps someone could add a note about this possibility > > to the codec docs ?! > > You can check it in yourself or mail it to Fred or submit it to SF... > I don't expect anyone else will jump in and document this properly. I'll submit a bug report so that this doesn't get lost in the archives. Don't have time for it myself... alas, no one really does seem to have time these days ;-) > > If no one objects, I'll apply the patch for the enhanced alias > > support later today. > > Fine with me (but I don't use codecs -- where's the Dutch language > support? :-). OK. About the Dutch language support: this would make a nice Christmas fun-project... a new standard module which interfaces to babel.altavista.com (hmm, they don't list Dutch as a supported language yet, but maybe if we bug them enough... ;). -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From paulp at ActiveState.com Tue Dec 12 19:11:13 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Tue, 12 Dec 2000 10:11:13 -0800 Subject: [Python-Dev] Online help PEP References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> Message-ID: <3A366A41.1A14EFD4@ActiveState.com> Guido van Rossum wrote: > >... > > help( "string" ) -- built-in topic or global > > Why does a global require string quotes? It doesn't, but if you happen to say help( "dir" ) instead of help( dir ), I think it should do the right thing. > I'm missing > > help() -- table of contents > > I'm not sure if the table of contents should be printed by the repr > output. I don't see any benefit in having different behaviors for help and help().
I don't see any benefit in having different behaviors for help and help(). > > If you ask for a global, it can be a fully-qualfied name such as > > help("xml.dom"). > > Why are the string quotes needed? When are they useful? When you haven't imported the thing you are asking about. Or when the string comes from another UI like an editor window, command line or web form. > > You can also use the facility from a command-line > > > > python --help if > > Is this really useful? Sounds like Perlism to me. I'm just trying to make it easy to quickly get answers to Python questions. I could totally see someone writing code in VIM switching to a bash window to type: python --help os.path.dirname That's alot easier than: $ python Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32 Type "copyright", "credits" or "license" for more information. >>> import os >>> help(os.path.dirname) And what does it hurt? > > In either situation, the output does paging similar to the "more" > > command. > > Agreed. But how to implement paging in a platform-dependent manner? > On Unix, os.system("more") or "$PAGER" is likely to work. On Windows, > I suppose we could use its MORE, although that's pretty braindead. On > the Mac? Also, inside IDLE or Pythonwin, invoking the system pager > isn't a good idea. The current implementation does paging internally. You could override it to use the system pager (or no pager). > What does "demand-loaded" mean in a Python context? When you "touch" the help object, it loads the onlinehelp module which has the real implementation. The thing in __builtins__ is just a lightweight proxy. > > It should also be possible to override the help display function by > > assigning to onlinehelp.displayhelp(object_or_string). > > Good idea. Pythonwin and IDLE could use this. But I'd like it to > work at least "okay" if they don't. Agreed. 
> > The module should be able to extract module information from either > > the HTML or LaTeX versions of the Python documentation. Links should > > be accommodated in a "lynx-like" manner. > > I think this is beyond the scope. Well, we have to do one of:

* re-write a subset of the docs in a form that can be accessed from the command line
* access the existing docs in a form that's installed
* auto-convert the docs into a form that's compatible

I've already implemented HTML parsing and LaTeX parsing is actually not that far off. I just need impetus to finish a LaTeX-parsing project I started on my last vacation. The reason that LaTeX is interesting is because it would be nice to be able to move documentation from existing LaTeX files into docstrings. > The LaTeX isn't installed anywhere > (and processing would be too much work). > The HTML is installed only > on Windows, where there already is a way to get it to pop up in your > browser (actually two: it's in the Start menu, and also in IDLE's Help > menu). If the documentation becomes an integral part of the Python code, then it will be installed. It's ridiculous that it isn't already. ActivePython does install the docs on all platforms. > A standard syntax for docstrings is under development, PEP 216. I > don't agree with the proposal there, but in any case the help PEP > should not attempt to legalize a different format than PEP 216. I won't hold my breath for a standard Python docstring format. I've gone out of my way to make the code format independent. > Neat. I noticed that in a 24-line screen, the pagesize must be set to > 21 to avoid stuff scrolling off the screen. Maybe there's an off-by-3 > error somewhere? Yes. > I also noticed that it always prints '1' when invoked as a function. > The new license pager in site.py avoids this problem. Okay. > help("operators") and several others raise an > AttributeError('handledocrl'). Fixed. > The "lynx-like links" don't work. I don't think that's implemented yet.
> I think it's naive to expect this help facility to replace browsing > the website or the full documentation package. There should be one > entry that says to point your browser there (giving the local > filesystem URL if available), and that's it. The rest of the online > help facility should be concerned with exposing doc strings. I don't want to replace the documentation. But there is no reason we should set out to make it incomplete. If it's integrated with the HTML then people can choose whatever access mechanism is easiest for them. Right now I'm trying hard not to be "naive". Realistically, nobody is going to write a million docstrings between now and Python 2.1. It is much more feasible to leverage the existing documentation that Fred and others have spent months on. > > Security Issues > > > > This module will attempt to import modules with the same names as > > requested topics. Don't use the modules if you are not confident > > that everything in your pythonpath is from a trusted source. > Yikes! Another reason to avoid the "string" -> global variable > option. I don't think we should lose that option. People will want to look up information from non-executable environments like command lines, GUIs and web pages. Perhaps you can point me to techniques for extracting information from Python modules and packages without executing them. Paul From guido at python.org Tue Dec 12 21:46:09 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Dec 2000 15:46:09 -0500 Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE Message-ID: <200012122046.PAA16915@cj20424-a.reston1.va.home.com> ------- Forwarded Message Date: Tue, 12 Dec 2000 12:38:20 -0800 From: noreply at sourceforge.net To: noreply at sourceforge.net Subject: SourceForge: PROJECT DOWNTIME NOTICE ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS This update is being sent to project administrators only and contains important information regarding your project. Please read it in its entirety.
INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION As noted in the sitewide email sent this week, the SourceForge.net infrastructure is being upgraded (and relocated). As part of this project, plans are underway to further increase capacity and responsiveness. We are scheduling the relocation of the systems serving project subdomain web pages. IMPORTANT: This move will affect you in the following ways: 1. Service and availability of SourceForge.net and the development tools provided will continue uninterrupted. 2. Project page webservers hosting subdomains (yourprojectname.sourceforge.net) will be down Friday December 15 from 9PM PST (12AM EST) until 3AM PST. 3. CVS will be unavailable (read only part of the time) from 7PM until 3AM PST 4. Mailing lists and mail aliases will be unavailable until 3AM PST - --------------------- This email was sent from sourceforge.net. To change your email receipt preferences, please visit the site and edit your account via the "Account Maintenance" link. Direct any questions to admin at sourceforge.net, or reply to this email. ------- End of Forwarded Message From greg at cosc.canterbury.ac.nz Tue Dec 12 23:42:01 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 13 Dec 2000 11:42:01 +1300 (NZDT) Subject: [Python-Dev] Online help PEP In-Reply-To: <3A366A41.1A14EFD4@ActiveState.com> Message-ID: <200012122242.LAA01902@s454.cosc.canterbury.ac.nz> Paul Prescod: > Guido: > > Why are the string quotes needed? When are they useful? > When you haven't imported the thing you are asking about. It would be interesting if the quoted form allowed you to extract doc info from a module *without* having the side effect of importing it. This could no doubt be done for pure Python modules. Would be rather tricky for extension modules, though, I expect.
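For pure Python source, extracting a docstring without importing comes down to parsing instead of executing. A sketch using the ast module (which postdates this thread; the function name here is invented):

```python
import ast

def docstring_without_import(path):
    """Return a module's docstring by parsing its source, never executing it."""
    with open(path) as f:
        tree = ast.parse(f.read())
    return ast.get_docstring(tree)  # None if the module has no docstring
```

Extension modules would indeed need a different mechanism, since their docstrings live in compiled code rather than in parseable source.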
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From barry at digicool.com Wed Dec 13 03:21:36 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 12 Dec 2000 21:21:36 -0500 Subject: [Python-Dev] Two new PEPs, 232 & 233 Message-ID: <14902.56624.20961.768525@anthem.concentric.net> I've just uploaded two new PEPs. 232 is a revision of my pre-PEP era function attribute proposal. 233 is Paul Prescod's proposal for an on-line help facility. http://python.sourceforge.net/peps/pep-0232.html http://python.sourceforge.net/peps/pep-0233.html Let the games begin, -Barry From tim.one at home.com Wed Dec 13 04:34:35 2000 From: tim.one at home.com (Tim Peters) Date: Tue, 12 Dec 2000 22:34:35 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: Message-ID: [Moshe Zadka]
> If we had an affine operation, instead of a linear one, we could have
> [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you cover
> (0, 2**n).
[Tim] > But, Moshe! The proof would have been the most interesting part . Turns out the proof would have been intensely interesting, as you can see by running the attached with and without the new line commented out. don't-ever-trust-a-theoretician-ly y'rs - tim

N = 2
MAGIC_CONSTANT_DEPENDING_ON_N = 7

def f(i):
    i <<= 1
    # i^=1 # This is the line I added
    if i >= 2**N:
        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

i = 1
for nothing in range(4):
    print i,
    i = f(i)
print i

From amk at mira.erols.com Wed Dec 13 04:55:33 2000 From: amk at mira.erols.com (A.M.
Kuchling) Date: Tue, 12 Dec 2000 22:55:33 -0500 Subject: [Python-Dev] Splitting up _cursesmodule Message-ID: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> At 2502 lines, _cursesmodule.c is cumbersomely large. I've just received a patch from Thomas Gellekum that adds support for the panel library that will add another 500 lines. I'd like to split the C file into several subfiles (_curses_panel.c, _curses_window.c, etc.) that get #included from the master _cursesmodule.c file. Do the powers that be approve of this idea? --amk From tim.one at home.com Wed Dec 13 04:54:20 2000 From: tim.one at home.com (Tim Peters) Date: Tue, 12 Dec 2000 22:54:20 -0500 Subject: FW: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE Message-ID: FYI, looks like SourceForge is scheduled to be unusable in a span covering late Friday thru early Saturday (OTT -- One True Time, defined by the clocks in Guido's house). -----Original Message----- From: python-dev-admin at python.org [mailto:python-dev-admin at python.org]On Behalf Of Guido van Rossum Sent: Tuesday, December 12, 2000 3:46 PM To: python-dev at python.org Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE ------- Forwarded Message Date: Tue, 12 Dec 2000 12:38:20 -0800 From: noreply at sourceforge.net To: noreply at sourceforge.net Subject: SourceForge: PROJECT DOWNTIME NOTICE ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS This update is being sent to project administrators only and contains important information regarding your project. Please read it in its entirety. INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION As noted in the sitewide email sent this week, the SourceForge.net infrastructure is being upgraded (and relocated). As part of this project, plans are underway to further increase capacity and responsiveness. We are scheduling the relocation of the systems serving project subdomain web pages. IMPORTANT: This move will affect you in the following ways: 1.
Service and availability of SourceForge.net and the development tools provided will continue uninterrupted. 2. Project page webservers hosting subdomains (yourprojectname.sourceforge.net) will be down Friday December 15 from 9PM PST (12AM EST) until 3AM PST. 3. CVS will be unavailable (read only part of the time) from 7PM until 3AM PST 4. Mailing lists and mail aliases will be unavailable until 3AM PST --------------------- This email was sent from sourceforge.net. To change your email receipt preferences, please visit the site and edit your account via the "Account Maintenance" link. Direct any questions to admin at sourceforge.net, or reply to this email. ------- End of Forwarded Message _______________________________________________ Python-Dev mailing list Python-Dev at python.org http://www.python.org/mailman/listinfo/python-dev From esr at thyrsus.com Wed Dec 13 05:29:17 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 12 Dec 2000 23:29:17 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> Message-ID: <20001212232917.A22839@thyrsus.com> A.M. Kuchling : > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > received a patch from Thomas Gellekum that adds support for the panel > library that will add another 500 lines. I'd like to split the C file > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > get #included from the master _cursesmodule.c file. > > Do the powers that be approve of this idea? I doubt I qualify as a power that be, but I'm certainly +1 on panel support. -- Eric S. Raymond The biggest hypocrites on gun control are those who live in upscale developments with armed security guards -- and who want to keep other people from having guns to defend themselves.
But what about lower-income people living in high-crime, inner city neighborhoods? Should such people be kept unarmed and helpless, so that limousine liberals can 'make a statement' by adding to the thousands of gun laws already on the books?" --Thomas Sowell From fdrake at acm.org Wed Dec 13 07:24:01 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 13 Dec 2000 01:24:01 -0500 (EST) Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> Message-ID: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> A.M. Kuchling writes: > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > received a patch from Thomas Gellekum that adds support for the panel > library that will add another 500 lines. I'd like to split the C file > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > get #included from the master _cursesmodule.c file. Would it be reasonable to add panel support as a second extension module? Is there really a need for them to be in the same module, since the panel library is a separate library? -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From gstein at lyra.org Wed Dec 13 08:58:38 2000 From: gstein at lyra.org (Greg Stein) Date: Tue, 12 Dec 2000 23:58:38 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> Message-ID: <20001212235838.T8951@lyra.org> On Tue, Dec 12, 2000 at 10:55:33PM -0500, A.M. Kuchling wrote: > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > received a patch from Thomas Gellekum that adds support for the panel > library that will add another 500 lines. 
I'd like to split the C file > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > get #included from the master _cursesmodule.c file. Why should they be #included? I thought that we can build multiple .c files into a module... Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Wed Dec 13 09:05:05 2000 From: gstein at lyra.org (Greg Stein) Date: Wed, 13 Dec 2000 00:05:05 -0800 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects dictobject.c,2.68,2.69 In-Reply-To: <200012130102.RAA31828@slayer.i.sourceforge.net>; from tim_one@users.sourceforge.net on Tue, Dec 12, 2000 at 05:02:49PM -0800 References: <200012130102.RAA31828@slayer.i.sourceforge.net> Message-ID: <20001213000505.U8951@lyra.org> On Tue, Dec 12, 2000 at 05:02:49PM -0800, Tim Peters wrote: > Update of /cvsroot/python/python/dist/src/Objects > In directory slayer.i.sourceforge.net:/tmp/cvs-serv31776/python/dist/src/objects > > Modified Files: > dictobject.c > Log Message: > Bring comments up to date (e.g., they still said the table had to be > a prime size, which is in fact never true anymore ...). >... > --- 55,78 ---- > > /* > ! There are three kinds of slots in the table: > ! > ! 1. Unused. me_key == me_value == NULL > ! Does not hold an active (key, value) pair now and never did. Unused can > ! transition to Active upon key insertion. This is the only case in which > ! me_key is NULL, and is each slot's initial state. > ! > ! 2. Active. me_key != NULL and me_key != dummy and me_value != NULL > ! Holds an active (key, value) pair. Active can transition to Dummy upon > ! key deletion. This is the only case in which me_value != NULL. > ! > ! 3. Dummy. me_key == dummy && me_value == NULL > ! Previously held an active (key, value) pair, but that was deleted and an > ! active pair has not yet overwritten the slot. Dummy can transition to > ! Active upon key insertion. Dummy slots cannot be made Unused again > ! 
(cannot have me_key set to NULL), else the probe sequence in case of > ! collision would have no way to know they were once active. 4. The popitem finger. :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From moshez at zadka.site.co.il Wed Dec 13 20:19:53 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Wed, 13 Dec 2000 21:19:53 +0200 (IST) Subject: [Python-Dev] Splitting up _cursesmodule Message-ID: <20001213191953.7208DA82E@darjeeling.zadka.site.co.il> On Tue, 12 Dec 2000 23:29:17 -0500, "Eric S. Raymond" wrote: > A.M. Kuchling : > > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > > received a patch from Thomas Gellekum that adds support for the panel > > library that will add another 500 lines. I'd like to split the C file > > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > > get #included from the master _cursesmodule.c file. > > > > Do the powers that be approve of this idea? > > I doubt I qualify as a power that be, but I'm certainly +1 on panel support. I'm +1 on panel support, but that seems the wrong solution. Why not have several C modules (_curses_panel,...) and manage a more unified namespace with the Python wrapper modules? /curses/panel.py -- from _curses_panel import * etc. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From akuchlin at mems-exchange.org Wed Dec 13 13:44:23 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Wed, 13 Dec 2000 07:44:23 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 01:24:01AM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> Message-ID: <20001213074423.A30348@kronos.cnri.reston.va.us> [CC'ing Thomas Gellekum ] On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr.
wrote: > Would it be reasonable to add panel support as a second extension >module? Is there really a need for them to be in the same module, >since the panel library is a separate library? Quite possibly, though the patch isn't structured that way. The panel module will need access to the type object for the curses window object, so it'll have to ensure that _curses is already imported, but that's no problem. Thomas, do you feel capable of implementing it as a separate module, or should I work on it? Probably a _cursesmodule.h header will have to be created to make various definitions available to external users of the basic objects in _curses. (Bonus: this means that the menu and form libraries, if they ever get wrapped, can be separate modules, too.) --amk From tg at melaten.rwth-aachen.de Wed Dec 13 15:00:46 2000 From: tg at melaten.rwth-aachen.de (Thomas Gellekum) Date: 13 Dec 2000 15:00:46 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 07:44:23 -0500" References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> Message-ID: Andrew Kuchling writes: > [CC'ing Thomas Gellekum ] > > On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr. wrote: > > Would it be reasonable to add panel support as a second extension > >module? Is there really a need for them to be in the same module, > >since the panel library is a separate library? > > Quite possibly, though the patch isn't structured that way. The panel > module will need access to the type object for the curses window > object, so it'll have to ensure that _curses is already imported, but > that's no problem. You mean as separate modules like

import curses
import panel

? Hm. A panel object is associated with a window object, so it's created from a window method.
This means you'd need to add window.new_panel() to PyCursesWindow_Methods[] and curses.update_panels(), curses.panel_above() and curses.panel_below() (or whatever they're called after we're through discussing this ;-)) to PyCurses_Methods[]. Also, the curses.panel_{above,below}() wrappers need access to the list_of_panels via find_po(). > Thomas, do you feel capable of implementing it as a separate module, > or should I work on it? It's probably finished a lot sooner when you do it; OTOH, it would be fun to try it. Let's carry this discussion a bit further. > Probably a _cursesmodule.h header will have > to be created to make various definitions available to external > users of the basic objects in _curses. That's easy. The problem is that we want to extend those basic objects in _curses. > (Bonus: this means that the > menu and form libraries, if they ever get wrapped, can be separate > modules, too.) Sure, if we solve this for panel, the others are a SMOP. :-) tg From guido at python.org Wed Dec 13 15:31:52 2000 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 09:31:52 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src README,1.106,1.107 In-Reply-To: Your message of "Wed, 13 Dec 2000 06:14:35 PST." <200012131414.GAA20849@slayer.i.sourceforge.net> References: <200012131414.GAA20849@slayer.i.sourceforge.net> Message-ID: <200012131431.JAA21243@cj20424-a.reston1.va.home.com> > + --with-cxx=<compiler>: Some C++ compilers require that main() is > + compiled with the C++ compiler if there is any C++ code in the application. > + Specifically, g++ on a.out systems may require that to support > + construction of global objects. With this option, the main() function > + of Python will be compiled with <compiler>; use that only if you > + plan to use C++ extension modules, and if your compiler requires > + compilation of main() as a C++ program. Thanks for documenting this; see my continued reservation in the (reopened) bug report.
Another question remains regarding the docs though: why is it bad to always compile main.c with a C++ compiler? --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Wed Dec 13 16:19:01 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 13 Dec 2000 10:19:01 -0500 (EST) Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> Message-ID: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> Thomas Gellekum writes: > You mean as separate modules like > > import curses > import panel Or better yet:

import curses
import curses.panel

> ? Hm. A panel object is associated with a window object, so it's > created from a window method. This means you'd need to add > window.new_panel() to PyCursesWindow_Methods[] and > curses.update_panels(), curses.panel_above() and curses.panel_below() > (or whatever they're called after we're through discussing this ;-)) > to PyCurses_Methods[]. Do these new functions have to be methods on the window objects, or can they be functions in the new module that take a window as a parameter? The underlying window object can certainly provide slots for the use of the panel (forms, ..., etc.) bindings, and simply initialize them to NULL (or whatever) for newly created windows. > Also, the curses.panel_{above,below}() wrappers need access to the > list_of_panels via find_po(). There's no reason that underlying utilities can't be provided by _curses using a CObject. The Extending & Embedding manual has a section on using CObjects to provide a C API to a module without having to link to it directly. > That's easy. The problem is that we want to extend those basic objects > in _curses. Again, I'm curious about the necessity of this. I suspect it can be avoided.
I think the approach I've hinted at above will allow you to avoid this, and will allow the panel (forms, ...) support to be added simply by adding additional modules as they are written and the underlying libraries are installed on the host. I know the question of including these modules in the core distribution has come up before, but the resurgence in interest in these makes me want to bring it up again: Does the curses package (and the associated C extension(s)) belong in the standard library, or does it make sense to spin out a distutils-based package? I've no objection to them being in the core, but it seems that the release cycle may want to diverge from Python's. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From guido at python.org Wed Dec 13 16:48:50 2000 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 10:48:50 -0500 Subject: [Python-Dev] Online help PEP In-Reply-To: Your message of "Tue, 12 Dec 2000 10:11:13 PST." <3A366A41.1A14EFD4@ActiveState.com> References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> <3A366A41.1A14EFD4@ActiveState.com> Message-ID: <200012131548.KAA21344@cj20424-a.reston1.va.home.com> [Paul's PEP] > > > help( "string" ) -- built-in topic or global [me] > > Why does a global require string quotes? [Paul] > It doesn't, but if you happen to say > > help( "dir" ) instead of help( dir ), I think it should do the right > thing. Fair enough. > > I'm missing > > > > help() -- table of contents > > > > I'm not sure if the table of contents should be printed by the repr > > output. > > I don't see any benefit in having different behaviors for help and > help(). Having the repr() overloading invoke the pager is dangerous. The beta version of the license command did this, and it caused some strange side effects, e.g. vars(__builtins__) would start reading from input and confuse the users. 
The new version's repr() returns the desired string if it's less than a page, and 'Type license() to see the full license text' if the pager would need to be invoked. > > > If you ask for a global, it can be a fully-qualified name such as > > > help("xml.dom"). > > > > Why are the string quotes needed? When are they useful? > > When you haven't imported the thing you are asking about. Or when the > string comes from another UI like an editor window, command line or web > form. The implied import is a major liability. If you can do this without importing (e.g. by source code inspection), fine. Otherwise, you might issue some kind of message like "you must first import XXX.YYY". > > > You can also use the facility from a command-line > > > > > > python --help if > > > > Is this really useful? Sounds like Perlism to me. > > I'm just trying to make it easy to quickly get answers to Python > questions. I could totally see someone writing code in VIM switching to > a bash window to type:
>
> python --help os.path.dirname
>
> That's a lot easier than:
>
> $ python
> Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
> Type "copyright", "credits" or "license" for more information.
> >>> import os
> >>> help(os.path.dirname)
>
> And what does it hurt? The hurt is code bloat in the interpreter and creeping featurism. If you need command line access to the docs (which may be a reasonable thing to ask for, although to me it sounds backwards :-), it's better to provide a separate command, e.g. pythondoc. (Analogous to perldoc.) > > > In either situation, the output does paging similar to the "more" > > > command. > > > > Agreed. But how to implement paging in a platform-dependent manner? > > On Unix, os.system("more") or "$PAGER" is likely to work. On Windows, > > I suppose we could use its MORE, although that's pretty braindead. On > > the Mac? Also, inside IDLE or Pythonwin, invoking the system pager > > isn't a good idea.
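One way to sidestep the platform pagers discussed above is to paginate in pure Python. A toy sketch of just the page-splitting step (an editorial illustration, not the PEP's actual code; 21 lines per page leaves a 24-line terminal room for a prompt):

```python
def paginate(text, pagesize=21):
    """Split text into pages of at most pagesize lines each.

    A real pager would print one page, wait for a keypress, and
    repeat; keeping the split separate makes that loop easy to test.
    """
    lines = text.splitlines()
    return [lines[i:i + pagesize] for i in range(0, len(lines), pagesize)]
```

Fifty lines of output, for instance, come back as pages of 21, 21 and 8 lines.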
> > The current implementation does paging internally. You could override it > to use the system pager (or no pager). Yes. Please add that option to the PEP. > > What does "demand-loaded" mean in a Python context? > > When you "touch" the help object, it loads the onlinehelp module which > has the real implementation. The thing in __builtins__ is just a > lightweight proxy. Please suggest an implementation. > > > It should also be possible to override the help display function by > > > assigning to onlinehelp.displayhelp(object_or_string). > > > > Good idea. Pythonwin and IDLE could use this. But I'd like it to > > work at least "okay" if they don't. > > Agreed. Glad you're so agreeable. :) > > > The module should be able to extract module information from either > > > the HTML or LaTeX versions of the Python documentation. Links should > > > be accommodated in a "lynx-like" manner. > > > > I think this is beyond the scope. > > Well, we have to do one of: > > * Re-write a subset of the docs in a form that can be accessed from the > command line > * Access the existing docs in a form that's installed > * Auto-convert the docs into a form that's compatible I really don't think that this tool should attempt to do everything. If someone *really* wants to browse the existing (large) doc set in a terminal emulation window, let them use lynx and point it to the documentation set. (I agree that the HTML docs should be installed, by the way.) > I've already implemented HTML parsing and LaTeX parsing is actually not > that far off. I just need impetus to finish a LaTeX-parsing project I > started on my last vacation. A LaTeX parser would be most welcome -- if it could replace latex2html! That old Perl program is really ready for retirement. (Ask Fred.) > The reason that LaTeX is interesting is because it would be nice to be > able to move documentation from existing LaTeX files into docstrings. That's what some people think.
I disagree that it would be either feasible or a good idea to put all documentation for a typical module in its doc strings. > > The LaTeX isn't installed anywhere > > (and processing would be too much work). > > The HTML is installed only > > on Windows, where there already is a way to get it to pop up in your > > browser (actually two: it's in the Start menu, and also in IDLE's Help > > menu). > If the documentation becomes an integral part of the Python code, then > it will be installed. It's ridiculous that it isn't already. Why is that ridiculous? It's just as easy to access them through the web for most people. If it's not, they are available in easily downloadable tarballs supporting a variety of formats. That's just too much to be included in the standard RPMs. (Also, latex2html requires so much hand-holding, and is so slow, that it's really not a good idea to let "make install" install the HTML by default.) > ActivePython does install the docs on all platforms. Great. More power to you. > > A standard syntax for docstrings is under development, PEP 216. I > don't agree with the proposal there, but in any case the help PEP > should not attempt to legalize a different format than PEP 216. > > I won't hold my breath for a standard Python docstring format. I've gone > out of my way to make the code format independent.. To tell you the truth, I'm not holding my breath either. :-) So your code should just dump the doc string on stdout without interpreting it in any way (except for paging). > > Neat. I noticed that in a 24-line screen, the pagesize must be set to > > 21 to avoid stuff scrolling off the screen. Maybe there's an off-by-3 > > error somewhere? > > Yes. It's buggier than just that. The output of the pager prints an extra "| " at the start of each page except for the first, and the first page is a line longer than subsequent pages. BTW, another bug: try help(cgi).
It's nice that it gives the default value for arguments, but the defaults for FieldStorage.__init__ happen to include os.environ. Its entire value is dumped -- which causes the pager to be off (it wraps over about 20 lines for me). I think you may have to truncate long values a bit, e.g. by using the repr module. > > I also noticed that it always prints '1' when invoked as a function. > The new license pager in site.py avoids this problem. > Okay. Where's the check-in? :-) > > help("operators") and several others raise an > > AttributeError('handledocrl'). > Fixed. > > > The "lynx-like links" don't work. > I don't think that's implemented yet. I'm not sure what you intended to implement there. I prefer to see the raw URLs, then I can do whatever I normally do to paste them into my preferred webbrowser (which is *not* lynx :-). > > I think it's naive to expect this help facility to replace browsing > > the website or the full documentation package. There should be one > > entry that says to point your browser there (giving the local > > filesystem URL if available), and that's it. The rest of the online > > help facility should be concerned with exposing doc strings. > > I don't want to replace the documentation. But there is no reason we > should set out to make it incomplete. If it's integrated with the HTML > then people can choose whatever access mechanism is easiest for them > right now > > I'm trying hard not to be "naive". Realistically, nobody is going to > write a million docstrings between now and Python 2.1. It is much more > feasible to leverage the existing documentation that Fred and others > have spent months on. I said above, and I'll say it again: I think the majority of people would prefer to use their standard web browser to read the standard docs. It's not worth the effort to try to make those accessible through help().
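The "point your browser there" entry Guido describes is nearly a one-liner with the webbrowser module that shipped in Python 2.0. A hedged sketch (the documentation URL here is an assumption; a local file:// URL would work the same way):

```python
import webbrowser

DOC_URL = 'http://www.python.org/doc/'  # assumed location of the full docs

def show_docs(url=DOC_URL, open_browser=True):
    """Hand the documentation URL to the user's preferred browser."""
    if open_browser:
        webbrowser.open(url)  # falls back through whatever browsers exist
    return url
```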
In fact, I'd encourage the development of a command-line-invoked help facility that shows doc strings in the user's preferred web browser -- the webbrowser module makes this trivial. > > > Security Issues > > > > > > This module will attempt to import modules with the same names as > > > requested topics. Don't use the modules if you are not confident > > > that everything in your pythonpath is from a trusted source. > > Yikes! Another reason to avoid the "string" -> global variable > > option. > > I don't think we should lose that option. People will want to look up > information from non-executable environments like command lines, GUIs > and web pages. Perhaps you can point me to techniques for extracting > information from Python modules and packages without executing them. I don't know specific tools, but any serious docstring processing tool ends up parsing the source code for this very reason, so there's probably plenty of prior art. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Wed Dec 13 17:07:22 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 13 Dec 2000 11:07:22 -0500 (EST) Subject: [Python-Dev] Online help PEP In-Reply-To: <200012131548.KAA21344@cj20424-a.reston1.va.home.com> References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> <3A366A41.1A14EFD4@ActiveState.com> <200012131548.KAA21344@cj20424-a.reston1.va.home.com> Message-ID: <14903.40634.569192.704368@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > A LaTeX parser would be most welcome -- if it could replace > latex2html! That old Perl program is really ready for retirement. > (Ask Fred.) Note that Doc/tools/sgmlconv/latex2esis.py already includes a moderate start at a LaTeX parser. Paragraph marking is done as a separate step in Doc/tools/sgmlconv/docfixer.py, but I'd like to push that down into the LaTeX handler.
(Note that these tools are mostly broken at the moment, except for latex2esis.py, which does most of what I need other than paragraph marking.) -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From Barrett at stsci.edu Wed Dec 13 17:34:40 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Wed, 13 Dec 2000 11:34:40 -0500 (EST) Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) In-Reply-To: <20001210054646.A5219@glacier.fnational.com> References: <20001210054646.A5219@glacier.fnational.com> Message-ID: <14903.41669.883591.420446@nem-srvr.stsci.edu> Neil Schemenauer writes: > Sourceforge uploads are not working. The latest version of the > patch for PEP 208 is here: > > http://arctrix.com/nas/python/coerce-6.0.diff > > Operations on instances now call __coerce__ if it exists. I > think the patch is now complete. Converting other builtin types > to "new style numbers" can be done with a separate patch. My one concern about this patch is whether the non-commutativity of operators is preserved. This issue being important for matrix operations (not to be confused with element-wise array operations). -- Paul From guido at python.org Wed Dec 13 17:45:12 2000 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 11:45:12 -0500 Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) In-Reply-To: Your message of "Wed, 13 Dec 2000 11:34:40 EST." <14903.41669.883591.420446@nem-srvr.stsci.edu> References: <20001210054646.A5219@glacier.fnational.com> <14903.41669.883591.420446@nem-srvr.stsci.edu> Message-ID: <200012131645.LAA21719@cj20424-a.reston1.va.home.com> > Neil Schemenauer writes: > > Sourceforge uploads are not working. The latest version of the > > patch for PEP 208 is here: > > > > http://arctrix.com/nas/python/coerce-6.0.diff > > > > Operations on instances now call __coerce__ if it exists. I > > think the patch is now complete.
Converting other builtin types > > to "new style numbers" can be done with a separate patch. > > My one concern about this patch is whether the non-commutativity of > operators is preserved. This issue being important for matrix > operations (not to be confused with element-wise array operations). Yes, this is preserved. (I'm spending most of my waking hours understanding this patch -- it is a true piece of wizardry.) --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Wed Dec 13 18:38:00 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Wed, 13 Dec 2000 18:38:00 +0100 Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) References: <20001210054646.A5219@glacier.fnational.com> <14903.41669.883591.420446@nem-srvr.stsci.edu> <200012131645.LAA21719@cj20424-a.reston1.va.home.com> Message-ID: <3A37B3F7.5640FAFC@lemburg.com> Guido van Rossum wrote: > > > Neil Schemenauer writes: > > > Sourceforge uploads are not working. The latest version of the > > > patch for PEP 208 is here: > > > > > > http://arctrix.com/nas/python/coerce-6.0.diff > > > > > > Operations on instances now call __coerce__ if it exists. I > > > think the patch is now complete. Converting other builtin types > > > to "new style numbers" can be done with a separate patch. > > > > My one concern about this patch is whether the non-commutativity of > > operators is preserved. This issue being important for matrix > > operations (not to be confused with element-wise array operations). > > Yes, this is preserved. (I'm spending most of my waking hours > understanding this patch -- it is a true piece of wizardry.) The fact that coercion didn't allow detection of parameter order was the initial cause for my try at fixing it back then. I was confronted with the fact that at C level there was no way to tell whether the operands were in the order left, right or right, left -- as a result I used a gross hack in mxDateTime to still make this work...
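At the Python level, the operand-order information MAL describes is carried by the reflected special methods; a toy sketch of order-sensitive multiplication (an editorial illustration — labels stand in for real matrix data, so the result merely records which operand came first):

```python
class Mat:
    """Toy non-commutative operand: results record the operand order."""

    def __init__(self, label):
        self.label = label

    def __mul__(self, other):   # self * other: self is the LEFT operand
        rhs = other.label if isinstance(other, Mat) else repr(other)
        return Mat('(%s * %s)' % (self.label, rhs))

    def __rmul__(self, other):  # other * self: self is the RIGHT operand
        return Mat('(%s * %s)' % (repr(other), self.label))
```

Mat('A') * Mat('B') and Mat('B') * Mat('A') produce distinct results, and 2 * Mat('A') keeps the scalar on the left — exactly the distinction the old C-level coercion protocol could not express.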
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From esr at thyrsus.com Wed Dec 13 22:01:46 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 13 Dec 2000 16:01:46 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> Message-ID: <20001213160146.A24753@thyrsus.com> Fred L. Drake, Jr. : > I know the question of including these modules in the core > distribution has come up before, but the resurgence in interest in > these makes me want to bring it up again: Does the curses package > (and the associated C extension(s)) belong in the standard library, or > does it make sense to spin out a distutils-based package? I've no > objection to them being in the core, but it seems that the release > cycle may want to diverge from Python's. Curses needs to be in the core for political reasons. Specifically, to support CML2 without requiring any extra packages or downloads beyond the stock Python interpreter. And what makes CML2 so constrained and so important? It's my bid to replace the Linux kernel's configuration machinery. It has many advantages over the existing config system, but the linux developers are *very* resistant to adding things to the kernel's minimum build kit. Python alone may prove too much for them to swallow (though there are hopeful signs they will); Python plus a separately downloadable curses module would definitely be too much. 
Guido attaches sufficient importance to getting Python into the kernel build machinery that he approved adding ncurses to the standard modules on that basis. This would be a huge design win for us, raising Python's visibility considerably. So curses must stay in the core. I don't have a requirement for panels; my present curses front end simulates them. But if panels were integrated into the core I could simplify the front-end code significantly. Every line I can remove from my stuff (even if it, in effect, is just migrating into the Python core) makes it easier to sell CML2 into the kernel. -- Eric S. Raymond "Experience should teach us to be most on our guard to protect liberty when the government's purposes are beneficent... The greatest dangers to liberty lurk in insidious encroachment by men of zeal, well meaning but without understanding." -- Supreme Court Justice Louis Brandeis From jheintz at isogen.com Wed Dec 13 22:10:32 2000 From: jheintz at isogen.com (John D. Heintz) Date: Wed, 13 Dec 2000 15:10:32 -0600 Subject: [Python-Dev] Announcing ZODB-Corba code release Message-ID: <3A37E5C8.7000800@isogen.com> Here is the first release of code that exposes a ZODB database through CORBA (omniORB). The code is functioning, the docs are sparse, and it should work on your machines. ;-) I am only going to be in town for the next two days, then I will be unavailable until Jan 1. See http://www.zope.org/Members/jheintz/ZODB_CORBA_Connection to download the code. It's not perfect, but it works for me. Enjoy, John -- . . . . . . . . . . . . . . . . . . . . . . . . John D. Heintz | Senior Engineer 1016 La Posada Dr. | Suite 240 | Austin TX 78752 T 512.633.1198 | jheintz at isogen.com w w w . d a t a c h a n n e l . c o m From guido at python.org Wed Dec 13 22:19:01 2000 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 16:19:01 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Your message of "Wed, 13 Dec 2000 16:01:46 EST."
<20001213160146.A24753@thyrsus.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> Message-ID: <200012132119.QAA11060@cj20424-a.reston1.va.home.com> > So curses must stay in the core. I don't have a requirement for > panels; my present curses front end simulates them. But if panels were > integrated into the core I could simplify the front-end code > significantly. Every line I can remove from my stuff (even if it, in > effect, is just migrating into the Python core) makes it easier to > sell CML2 into the kernel. On the other hand you may want to be conservative. You already have to require Python 2.0 (I presume). The panel stuff will be available in 2.1 at the earliest. You probably shouldn't throw out your panel emulation until your code has already been accepted... --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at loewis.home.cs.tu-berlin.de Wed Dec 13 22:56:27 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 13 Dec 2000 22:56:27 +0100 Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107 Message-ID: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de> > Another question remains regarding the docs though: why is it bad to > always compile main.c with a C++ compiler? For the whole thing to work, it may also be necessary to link the entire application with a C++ compiler; that in turn may bind to the C++ library. Linking with the system's C++ library means that the Python executable cannot be as easily exchanged between installations of the operating system - you'd also need to have the right version of the C++ library to run it. If the C++ library is static, that may also increase the size of the executable. 
I can't really point to a specific problem that would occur on a specific system I use if main() was compiled with a C++ compiler. However, on the systems I use (Windows, Solaris, Linux), you can build C++ extension modules even if Python was not compiled as a C++ application. On Solaris and Windows, you'd also have to choose the C++ compiler you want to use (MSVC++, SunPro CC, or g++); in turn, different C++ runtime systems would be linked into the application. Regards, Martin From esr at thyrsus.com Wed Dec 13 23:03:59 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 13 Dec 2000 17:03:59 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132119.QAA11060@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 04:19:01PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> Message-ID: <20001213170359.A24915@thyrsus.com> Guido van Rossum : > > So curses must stay in the core. I don't have a requirement for > > panels; my present curses front end simulates them. But if panels were > > integrated into the core I could simplify the front-end code > > significantly. Every line I can remove from my stuff (even if it, in > > effect, is just migrating into the Python core) makes it easier to > > sell CML2 into the kernel. > > On the other hand you may want to be conservative. You already have > to require Python 2.0 (I presume). The panel stuff will be available > in 2.1 at the earliest. You probably shouldn't throw out your panel > emulation until your code has already been accepted... Yes, that's how I am currently expecting it to play out -- but if the 2.4.0 kernel is delayed another six months, I'd change my mind.
I'll explain this, because python-dev people should grok what the surrounding politics and timing are. I actually debated staying with 1.5.2 as a base version. What changed my mind was two things. One: by going to 2.0 I could drop close to 600 lines and three entire support modules from CML2, slimming down its footprint in the kernel tree significantly (by more than 10% of the entire code volume, actually). Second: CML2 is not going to be seriously evaluated until 2.4.0 final is out. Linus made this clear when I demoed it for him at LWE. My best guess about when that will happen is late January into February. By the time Red Hat issues its next distro after that (probably May or thenabouts) it's a safe bet 2.0 will be on it, and everywhere else. But if the 2.4.0 kernel slips another six months yet again, and our 2.1 comes out relatively quickly (like, just before the 9th Python Conference :-)) then we *might* have time to get 2.1 into the distros before CML2 gets the imprimatur. So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel will be delayed yet again :-). -- Eric S. Raymond Ideology, politics and journalism, which luxuriate in failure, are impotent in the face of hope and joy. -- P. J. O'Rourke From nas at arctrix.com Wed Dec 13 16:37:45 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 07:37:45 -0800 Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107 In-Reply-To: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Wed, Dec 13, 2000 at 10:56:27PM +0100 References: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de> Message-ID: <20001213073745.C17148@glacier.fnational.com> These are issues to consider for Python 3000 as well. AFAIK, C++ ABIs are a nightmare. Neil From fdrake at acm.org Wed Dec 13 23:29:25 2000 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 17:29:25 -0500 (EST) Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213170359.A24915@thyrsus.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> Message-ID: <14903.63557.282592.796169@cj42289-a.reston1.va.home.com> Eric S. Raymond writes: > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel > will be delayed yet again :-). Politics aside, I think development of curses-related extensions like panels and forms doesn't need to be delayed. I've posted what I think are relevant technical comments already, and leave it up to the developers of any new modules to get them written -- I don't know enough curses to offer any help there. Regardless of how the curses package is distributed and deployed, I don't see any reason to delay development in its existing location in the Python CVS repository. -Fred -- Fred L. Drake, Jr.
PythonLabs at Digital Creations From nas at arctrix.com Wed Dec 13 16:41:54 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 07:41:54 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> Message-ID: <20001213074154.D17148@glacier.fnational.com> On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote: > CML2 is not going to be seriously evaluated until 2.4.0 final > is out. Linus made this clear when I demoed it for him at LWE. > My best guess about when that will happen is late January into > Februrary. By the time Red Hat issues its next distro after > that (probably May or thenabouts) it's a safe bet 2.0 will be > on it, and everywhere else. I don't think that is a very safe bet. Python 2.0 missed the Debian Potato boat. I have no idea when Woody is expected to be released but I expect it may take longer than that if history is any indication. Neil From guido at python.org Thu Dec 14 00:03:31 2000 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 18:03:31 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Your message of "Wed, 13 Dec 2000 07:41:54 PST." 
<20001213074154.D17148@glacier.fnational.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> Message-ID: <200012132303.SAA12434@cj20424-a.reston1.va.home.com> > I don't think that is a very safe bet. Python 2.0 missed the > Debian Potato boat. This may have had to do more with the unresolved GPL issues. I recently received a mail from Stallman indicating that an agreement with CNRI has been reached; they have agreed (in principle, at least) to specific changes to the CNRI license that will defuse the choice-of-law clause when it is combined with GPL-licensed code "in a non-separable way". A glitch here is that the BeOpen license probably has to be changed too, but I believe that that's all doable. > I have no idea when Woody is expected to be > released but I expect it may take longer than that if history is > any indication. And who or what is Woody? 
Feeling-left-out, --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Thu Dec 14 00:16:09 2000 From: gstein at lyra.org (Greg Stein) Date: Wed, 13 Dec 2000 15:16:09 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> Message-ID: <20001213151609.E8951@lyra.org> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: >... > > I have no idea when Woody is expected to be > > released but I expect it may take longer than that if history is > > any indication. > > And who or what is Woody? One of the Debian releases. Dunno if it is the "next" release, but there ya go. 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Thu Dec 14 00:18:34 2000 From: gstein at lyra.org (Greg Stein) Date: Wed, 13 Dec 2000 15:18:34 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> Message-ID: <20001213151834.F8951@lyra.org> On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote: >... > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel > will be delayed yet again :-). The kernel is not going to be delayed that much. Linus wants it to go out this month. Worst case, I could see January. But no way on six months. But as Fred said: that should not change panels going into the curses support at all. You can always have a "compat.py" module in CML2 that provides functionality for prior-to-2.1 releases of Python. I'd also be up for a separate _curses_panels module, loaded into the curses package. Cheers, -g -- Greg Stein, http://www.lyra.org/ From esr at thyrsus.com Thu Dec 14 00:33:02 2000 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 13 Dec 2000 18:33:02 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213151834.F8951@lyra.org>; from gstein@lyra.org on Wed, Dec 13, 2000 at 03:18:34PM -0800 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213151834.F8951@lyra.org> Message-ID: <20001213183302.A25160@thyrsus.com> Greg Stein : > On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote: > >... > > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel > > will be delayed yet again :-). > > The kernel is not going to be delayed that much. Linus wants it to go out > this month. Worst case, I could see January. But no way on six months. I know what Linus wants. That's why I'm estimating end of January or early February -- the man's error curve on these estimates has a certain, er, *consistency* about it. -- Eric S. Raymond Alcohol still kills more people every year than all `illegal' drugs put together, and Prohibition only made it worse. Oppose the War On Some Drugs!
From nas at arctrix.com Wed Dec 13 18:18:48 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 09:18:48 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> Message-ID: <20001213091848.A17326@glacier.fnational.com> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: > > I don't think that is a very safe bet. Python 2.0 missed the > > Debian Potato boat. > > This may have had to do more with the unresolved GPL issues. I can't remember the exact dates but I think Debian Potato was frozen before Python 2.0 was released. Once a Debian release is frozen packages are not upgraded except under unusual circumstances. > I recently received a mail from Stallman indicating that an > agreement with CNRI has been reached; they have agreed (in > principle, at least) to specific changes to the CNRI license > that will defuse the choice-of-law clause when it is combined > with GPL-licensed code "in a non-separable way". A glitch here > is that the BeOpen license probably has to be changed too, but > I believe that that's all doable. This is great news. > > I have no idea when Woody is expected to be > > released but I expect it may take longer than that if history is > > any indication. > > And who or what is Woody? Woody would be another character from the Pixar movie "Toy Story" (just like Rex, Bo, Potato, Slink, and Hamm). 
I believe Bruce Perens used to work at Pixar. Debian uses a code name for the development release until a release number is assigned. This avoids some problems but has the disadvantage of confusing people who are not familiar with Debian. I should have said "the next stable release of Debian". Neil (aka nas at debian.org) From akuchlin at mems-exchange.org Thu Dec 14 01:26:32 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Wed, 13 Dec 2000 19:26:32 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> Message-ID: <20001213192632.A30585@kronos.cnri.reston.va.us> On Wed, Dec 13, 2000 at 10:19:01AM -0500, Fred L. Drake, Jr. wrote: > Do these new functions have to be methods on the window objects, or >can they be functions in the new module that take a window as a >parameter? The underlying window object can certainly provide slots Panels and windows have a 1-1 association, but they're separate objects. The window.new_panel function could become just a function which takes a window as its first argument; it would only need the TypeObject for PyCursesWindow, in order to do typechecking. > > Also, the curses.panel_{above,below}() wrappers need access to the > > list_of_panels via find_po(). The list_of_panels is used only in the curses.panel module, so it could be private to that module, since only panel-related functions care about it. I'm ambivalent about the list_of_panels. It's a linked list storing (PyWindow, PyPanel) pairs. Probably it should use a dictionary instead of implementing a little list, just to reduce the amount of code.
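The dictionary idea is easy to picture in Python terms. This is only an illustrative sketch — the real list_of_panels and find_po() are C code inside the curses wrapper, and the names below just mirror them — but it shows the window-to-panel mapping a dict would provide:

```python
class Panel:
    """Stand-in for a curses panel: each panel wraps exactly one window."""
    def __init__(self, window):
        self.window = window

# Replaces the hand-rolled linked list of (PyWindow, PyPanel) pairs.
# Keyed by id() because the toy window objects here need not be hashable.
_panels = {}

def new_panel(window):
    panel = Panel(window)
    _panels[id(window)] = panel
    return panel

def find_po(window):
    """Return the panel associated with a window, or None -- cf. the C find_po()."""
    return _panels.get(id(window))
```

The lookup and deletion code that a linked list needs (walk, compare, relink) collapses into single dict operations, which is the "less code" Andrew is after.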
>does it make sense to spin out a distutils-based package? I've no >objection to them being in the core, but it seems that the release >cycle may want to diverge from Python's. Consensus seemed to be to leave it in; I'd have no objection to removing it, but either course is fine with me. So, I suggest we create _curses_panel.c, which would be available as curses.panel. (A panel.py module could then add any convenience functions that are required.) Thomas, do you want to work on this, or should I? --amk From nas at arctrix.com Wed Dec 13 18:43:06 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 09:43:06 -0800 Subject: [Python-Dev] OT: Debian and Python In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100 References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> Message-ID: <20001213094306.C17326@glacier.fnational.com> On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote: > Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0. > Someone created a separate set of 2.0-packages, but they didn't include > readline and gdbm support because of the licencing issues. (Posted on c.l.py > sometime this week.) I've had Python packages for Debian stable for a while. I guess I should have posted a link: http://arctrix.com/nas/python/debian/ Most useful modules are enabled. > I'm *almost* tempted enough to learn enough about > dpkg/.deb files to build my own licence-be-damned set It's quite easy. Debian source packages are basically a diff.
Applying the diff will create a "debian" directory and in that directory will be a makefile called "rules". Use the target "binary" to create new binary packages. Good things to know are that you must be in the source directory when you run the makefile (ie. ./debian/rules binary). You should be running a shell under fakeroot to get the install permissions right (running "fakeroot" will do). You need to have the Debian developer tools installed. There is a list somewhere on debian.org. "apt-get source <package>" will get, extract and patch a package ready for tweaking and building (handy for getting stuff from unstable to run on stable). This is too off topic for python-dev. If anyone needs more info they can email me directly. Neil From thomas at xs4all.net Thu Dec 14 01:05:34 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 14 Dec 2000 01:05:34 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> Message-ID: <20001214010534.M4396@xs4all.nl> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: > > I don't think that is a very safe bet. Python 2.0 missed the Debian > > Potato boat. > > This may have had to do more with the unresolved GPL issues. This is very likely. Debian is very licence -- or at least GPL -- aware.
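Condensed into commands, the recipe described above looks like this ("somepackage" is a placeholder for a real package name; it assumes a Debian system with the developer tools installed):

```shell
# Fetch, extract, and patch the source; this creates the debian/ directory
apt-get source somepackage
cd somepackage-*/

# Build binary .debs; fakeroot makes the install permissions come out right
fakeroot ./debian/rules binary
```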
Which is a pity, really, because I already prefer it over RedHat in all other cases (and RedHat is also pretty licence aware, just less piously, devoutly, beyond-practicality-IMHO dedicated to the GPL.) > > I have no idea when Woody is expected to be released but I expect it may > > take longer than that if history is any indication. BTW, I believe Debian uses a fairly steady release schedule, something like an unstable->stable switch every year or 6 months or so ? I seem to recall seeing something like that on the debian website, but can't check right now. > And who or what is Woody? Woody is Debian's current development branch, the current bearer of the alias 'unstable'. It'll become Debian 2.3 (I believe, I don't pay attention to version numbers, I just run unstable :) once it's stabilized. 'potato' is the previous development branch, and currently the 'stable' branch. You can compare them with 'rawhide' and 'redhat-7.0', respectively :) (With the enormous difference that you can upgrade your debian install to a new version (even the devel version, or update your machine to the latest devel snapshot) while you are using it, without having to reboot ;) Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0. Someone created a separate set of 2.0-packages, but they didn't include readline and gdbm support because of the licencing issues. (Posted on c.l.py sometime this week.) I'm *almost* tempted enough to learn enough about dpkg/.deb files to build my own licence-be-damned set, but it'd be a lot of work to mirror the current debian 1.5.2 set of packages (which include numeric, imaging, mxTools, GTK/GNOME, and a shitload of 3rd party modules) in 2.0. Ponder, maybe it could be done semi-automatically, from the src-deb's of those packages. By the way, in woody, there are 52 packages with 'python' in the name, and 32 with 'perl' in the name... 
Pity all of my perl-hugging hippy-friends are still blindly using RedHat, and refuse to listen to my calls from the Debian/Python-dark-side :-) Oh, and the names 'woody' and 'potato' came from the movie Toy Story, in case you wondered ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From esr at snark.thyrsus.com Thu Dec 14 01:46:37 2000 From: esr at snark.thyrsus.com (Eric S. Raymond) Date: Wed, 13 Dec 2000 19:46:37 -0500 Subject: [Python-Dev] Business related to the upcoming Python conference Message-ID: <200012140046.TAA25289@snark.thyrsus.com> I'm sending this to python-dev because I believe most or all of the reviewers for my PC9 paper are on this list. Paul, would you please forward to any who were not? First, my humble apologies for not having got my PC9 reviews in on time. I diligently read my assigned papers early, but I couldn't do the reviews early because of technical problems with my Foretec account -- and then I couldn't do them late because the pre-deadline crunch happened while I was on a ten-day speaking and business trip in Japan and California, with mostly poor or nonexistent Internet access. Matters were not helped by a nasty four-month-old problem in my personal life coming to a head right in the middle of the trip. Nor by the fact that the trip included the VA Linux Systems annual stockholders' meeting and the toughest Board of Directors' meeting in my tenure. We had to hammer out a strategic theory of what to do now that the dot-com companies who used to be our best customers aren't getting funded any more. Unfortunately, it's at times like this that Board members earn their stock options. Management oversight. Fiduciary responsibility. Mumble... Second, the feedback I received on the paper was *excellent*, and I will be making many of the recommended changes. I've already extended the discussion of "Why Python?" including addressing the weaknesses of Scheme and Prolog for this application.
I have said more about uses of CML2 beyond the Linux kernel. I am working on a discussion of the politics of CML2 adoption, but may save that for the stand-up talk rather than the written paper. I will try to trim the CML2 language reference for the final version. (The reviewer who complained about the lack of references on the SAT problem should be pleased to hear that URLs to relevant papers are in fact included in the masters. I hope they show in the final version as rendered for publication.) -- Eric S. Raymond The Constitution is not neutral. It was designed to take the government off the backs of the people. -- Justice William O. Douglas From moshez at zadka.site.co.il Thu Dec 14 13:22:24 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Thu, 14 Dec 2000 14:22:24 +0200 (IST) Subject: [Python-Dev] Splitting up _cursesmodule Message-ID: <20001214122224.739EEA82E@darjeeling.zadka.site.co.il> On Wed, 13 Dec 2000 07:41:54 -0800, Neil Schemenauer wrote: > I don't think that is a very safe bet. Python 2.0 missed the > Debian Potato boat. By a long time -- potato was frozen for a few months when 2.0 came out. > I have no idea when Woody is expected to be > released but I expect it may take longer than that if history is > any indication. My bet is that woody starts freezing as soon as 2.4.0 is out. Note that once it starts freezing, 2.1 doesn't have a shot of getting in, regardless of how long it takes to freeze. OTOH, since in woody time there's a good chance for the "testing" distribution, a lot more people would be running something that *can* and *will* upgrade to 2.1 almost as soon as it is out. (For the record, most of the Debian users I know run woody on their server) -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses!
From jeremy at alum.mit.edu Thu Dec 14 06:04:43 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 14 Dec 2000 00:04:43 -0500 (EST) Subject: [Python-Dev] new draft of PEP 227 Message-ID: <14904.21739.804346.650062@bitdiddle.concentric.net> I've got a new draft of PEP 227. The terminology and wording are more convoluted than they need to be. I'll do at least one revision just to say things more clearly, but I'd appreciate comments on the proposed spec if you can read the current draft. Jeremy From cgw at fnal.gov Thu Dec 14 07:03:01 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Thu, 14 Dec 2000 00:03:01 -0600 (CST) Subject: [Python-Dev] Memory leaks in tupleobject.c Message-ID: <14904.25237.654143.861733@buffalo.fnal.gov> I've been running a set of memory-leak tests against the latest Python and have found that running "test_extcall" leaks memory. This gave me a strange sense of deja vu, having fixed this once before... From nas at arctrix.com Thu Dec 14 00:43:43 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 15:43:43 -0800 Subject: [Python-Dev] Memory leaks in tupleobject.c In-Reply-To: <14904.25237.654143.861733@buffalo.fnal.gov>; from cgw@fnal.gov on Thu, Dec 14, 2000 at 12:03:01AM -0600 References: <14904.25237.654143.861733@buffalo.fnal.gov> Message-ID: <20001213154343.A18303@glacier.fnational.com> On Thu, Dec 14, 2000 at 12:03:01AM -0600, Charles G Waldman wrote: > date: 2000/10/05 19:36:49; author: nascheme; state: Exp; lines: +24 -86 > Simplify _PyTuple_Resize by not using the tuple free list and dropping > support for the last_is_sticky flag. A few hard to find bugs may be > fixed by this patch since the old code was buggy. > > The 2.47 patch seems to have re-introduced the memory leak which was > fixed in 2.31. Maybe the old code was buggy, but the "right thing" > would have been to fix it, not to throw it away.... if _PyTuple_Resize > simply ignores the tuple free list, memory will be leaked. Guilty as charged. 
Can you explain how the current code is leaking memory? I can see one problem with deallocating size=0 tuples. Are there any more leaks? Neil From cgw at fnal.gov Thu Dec 14 07:57:05 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Thu, 14 Dec 2000 00:57:05 -0600 (CST) Subject: [Python-Dev] Memory leaks in tupleobject.c In-Reply-To: <20001213154343.A18303@glacier.fnational.com> References: <14904.25237.654143.861733@buffalo.fnal.gov> <20001213154343.A18303@glacier.fnational.com> Message-ID: <14904.28481.292539.354303@buffalo.fnal.gov> Neil Schemenauer writes: > Guilty as charged. Can you explain how the current code is > leaking memory? I can see one problem with deallocating size=0 > tuples. Are there any more leaks? Actually, I think I may have spoken too hastily - it's late and I'm tired and I should be sleeping rather than staring at the screen (like I've been doing since 8:30 this morning) - I jumped to conclusions - I'm not really sure that it was your patch that caused the leak; all I can say with 100% certainty is that if you run "test_extcall" in a loop, memory usage goes through the ceiling.... It's not just the cyclic garbage caused by the "saboteur" function because even with this commented out, the memory leak persists. I'm actually trying to track down a different memory leak, something which is currently causing trouble in one of our production servers (more about this some other time) and just as a sanity check I ran my little "leaktest.py" script over all the test_*.py modules in the distribution, and found that test_extcall triggers leaks... having analyzed and fixed this once before (see the CVS logs for tupleobject.c), I jumped to conclusions about the reason for its return. I'll take a more clear-headed and careful look tomorrow and post something (hopefully) a little more conclusive. It may have been some other change that caused this memory leak to re-appear. 
If you feel inclined to investigate, just do "reload(test.test_extcall)" in a loop and watch the memory usage with ps or top or what-have-you... -C From paulp at ActiveState.com Thu Dec 14 08:00:21 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Wed, 13 Dec 2000 23:00:21 -0800 Subject: [Python-Dev] new draft of PEP 227 References: <14904.21739.804346.650062@bitdiddle.concentric.net> Message-ID: <3A387005.6725DAAE@ActiveState.com> Jeremy Hylton wrote: > > I've got a new draft of PEP 227. The terminology and wording are more > convoluted than they need to be. I'll do at least one revision just > to say things more clearly, but I'd appreciate comments on the > proposed spec if you can read the current draft. It set me to thinking: Python should never require declarations. But would it necessarily be a problem for Python to have a variable declaration syntax? Might not the existence of declarations simplify some aspects of the proposal and of backwards compatibility? Along the same lines, might a new rule make Python code more robust? We could say that a local can only shadow a global if the local is formally declared. It's pretty rare that there is a good reason to shadow a global and Python makes it too easy to do accidentally. Paul Prescod From paulp at ActiveState.com Thu Dec 14 08:29:35 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Wed, 13 Dec 2000 23:29:35 -0800 Subject: [Python-Dev] Online help scope Message-ID: <3A3876DF.5554080C@ActiveState.com> I think Guido and I are pretty far apart on the scope and requirements of this online help thing so I'd like some clarification and opinions from the peanut gallery. Consider these scenarios:

a) Signature

>>> help( dir )
dir([object]) -> list of strings

b) Usage hint

>>> help( dir )
dir([object]) -> list of strings

Return an alphabetized list of names comprising (some of) the attributes of the given object. Without an argument, the names in the current scope are listed.
With an instance argument, only the instance attributes are returned. With a class argument, attributes of the base class are not returned. For other types or arguments, this may list members or methods.

c) Complete documentation, paged (man-style)

>>> help( dir )
dir([object]) -> list of strings

Without arguments, return the list of names in the current local symbol table. With an argument, attempts to return a list of valid attributes for that object. This information is gleaned from the object's __dict__, __methods__ and __members__ attributes, if defined. The list is not necessarily complete; e.g., for classes, attributes defined in base classes are not included, and for class instances, methods are not included. The resulting list is sorted alphabetically. For example:

>>> import sys
>>> dir()
['sys']
>>> dir(sys)
['argv', 'exit', 'modules', 'path', 'stderr', 'stdin', 'stdout']

d) Complete documentation in a user-chosen hypertext window

>>> help( dir )
(Netscape or lynx pops up)

I'm thinking that maybe we need two functions:

* help
* pythondoc

pythondoc("dir") would launch the Python documentation for the "dir" command. > That'S What Some People Think. I Disagree That It Would Be Either > Feasible Or A Good Idea To Put All Documentation For A Typical Module > In Its Doc Strings. Java and Perl people do it regularly. I think that in the greater world of software development, the inline model has won (or is winning) and I don't see a compelling reason to fight the tide. There will always be out-of-line tutorials, discussions, books etc. The canonical module documentation could be inline. That improves the likelihood of it being maintained. The LaTeX documentation is a major bottleneck and moving to XML or SGML will not help. Programmers do not want to learn documentation systems or syntaxes. They want to write code and comments.
> I said above, and I'll say it again: I think the majority of people > would prefer to use their standard web browser to read the standard > docs. It's not worth the effort to try to make those accessible > through help(). No matter what we decide on the issue above, reusing the standard documentation is the only practical way of populating the help system in the short-term. Right now, today, there is a ton of documentation that exists only in LaTeX and HTML. Tons of modules have no docstrings. Keywords have no docstrings. Compare the docstring for urllib.urlretrieve to the HTML documentation. In fact, you've given me a good idea: if the HTML is not available locally, I can access it over the web. Paul Prescod From paulp at ActiveState.com Thu Dec 14 08:29:53 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Wed, 13 Dec 2000 23:29:53 -0800 Subject: [Python-Dev] Online help PEP References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> <3A366A41.1A14EFD4@ActiveState.com> <200012131548.KAA21344@cj20424-a.reston1.va.home.com> Message-ID: <3A3876F1.D3E65E90@ActiveState.com> Guido van Rossum wrote: > > Having the repr() overloading invoke the pager is dangerous. The beta > version of the license command did this, and it caused some strange > side effects, e.g. vars(__builtins__) would start reading from input > and confuse the users. The new version's repr() returns the desired > string if it's less than a page, and 'Type license() to see the full > license text' if the pager would need to be invoked. I'll add this to the PEP. > The implied import is a major liability. If you can do this without > importing (e.g. by source code inspection), fine. Otherwise, you > might issue some kind of message like "you must first import XXX.YYY". Okay, I'll add to the PEP that an open issue is what strategy to use, but that we want to avoid implicit import. > The hurt is code bloat in the interpreter and creeping featurism. 
If > you need command line access to the docs (which may be a reasonable > thing to ask for, although to me it sounds backwards :-), it's better > to provide a separate command, e.g. pythondoc. (Analog to perldoc.) Okay, I'll add a pythondoc proposal to the PEP. > Yes. Please add that option to the PEP. Done. > > > What does "demand-loaded" mean in a Python context? > > > > When you "touch" the help object, it loads the onlinehelp module which > > has the real implementation. The thing in __builtins__ is just a > > lightweight proxy. > > Please suggest an implementation. In the PEP. > Glad You'Re So Agreeable. :) What happened to your capitalization? elisp gone awry? > ... > To Tell You The Truth, I'M Not Holding My Breath Either. :-) So your > code should just dump the doc string on stdout without interpreting it > in any way (except for paging). I'll do this for the first version. > It's buggier than just that. The output of the pager prints an extra > "| " at the start of each page except for the first, and the first > page is a line longer than subsequent pages. For some reason that I now forget, that code is pretty hairy. > BTW, another bug: try help(cgi). It's nice that it gives the default > value for arguments, but the defaults for FieldStorage.__init__ happen > to include os.environ. Its entire value is dumped -- which causes the > pager to be off (it wraps over about 20 lines for me). I think you > may have to truncate long values a bit, e.g. by using the repr module. Okay. There are a lot of little things we need to figure out. Such as whether we should print out docstrings for private methods etc. >... > I don't know specific tools, but any serious docstring processing tool > ends up parsing the source code for this very reason, so there's > probably plenty of prior art. Okay, I'll look into it.
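For what it's worth, the a/b-versus-c/d split in Paul's scenarios reduces to two small accessors over `__doc__`; a minimal sketch (the names `brief` and `full` are invented here, not part of any proposal):

```python
def brief(obj):
    """First line of obj's docstring: the signature/usage-hint view."""
    doc = getattr(obj, "__doc__", None) or ""
    return doc.strip().split("\n")[0]

def full(obj):
    """The whole docstring: the man-page view, or a stub if none exists."""
    doc = getattr(obj, "__doc__", None)
    return doc if doc else "(no documentation available)"
```

A pager or browser front end would then be layered on top of these two views rather than baked into them.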
Paul From tim.one at home.com Thu Dec 14 08:35:00 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 14 Dec 2000 02:35:00 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A387005.6725DAAE@ActiveState.com> Message-ID: [Paul Prescod] > ... > Along the same lines, might a new rule make Python code more robust? > We could say that a local can only shadow a global if the local is > formally declared. It's pretty rare that there is a good reason to > shadow a global and Python makes it too easy to do accidentally. I've rarely seen problems due to shadowing a global, but have often seen problems due to shadowing a builtin. Alas, if this rule were extended to builtins too-- where it would do the most good --then the names of builtins would effectively become reserved words (any code shadowing them today would be broken until declarations were added, and any code working today may break tomorrow if a new builtin were introduced that happened to have the same name as a local). From pf at artcom-gmbh.de Thu Dec 14 08:42:59 2000 From: pf at artcom-gmbh.de (Peter Funk) Date: Thu, 14 Dec 2000 08:42:59 +0100 (MET) Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: <200012132039.MAA07496@slayer.i.sourceforge.net> from Moshe Zadka at "Dec 13, 2000 12:39:24 pm" Message-ID: Hi, I think the following change is incompatible and will break applications. At least I have some server type applications that rely on 'allow_reuse_address' defaulting to 0, because they use the 'address already in use' exception, to make sure, that exactly one server process is running on this port. One of these applications, which is BTW build on top of Fredrik Lundhs 'xmlrpclib' fails to work, if I change this default in SocketServer.py. Would you please explain the reasoning behind this change? 
Moshe Zadka:
> *** SocketServer.py	2000/09/01 03:25:14	1.19
> --- SocketServer.py	2000/12/13 20:39:17	1.20
> ***************
> *** 158,162 ****
>       request_queue_size = 5
>
> !     allow_reuse_address = 0
>
>       def __init__(self, server_address, RequestHandlerClass):
> --- 158,162 ----
>       request_queue_size = 5
>
> !     allow_reuse_address = 1
>
>       def __init__(self, server_address, RequestHandlerClass):

Regards, Peter -- Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260 office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen) From paul at prescod.net Thu Dec 14 08:57:30 2000 From: paul at prescod.net (Paul Prescod) Date: Wed, 13 Dec 2000 23:57:30 -0800 Subject: [Python-Dev] new draft of PEP 227 References: Message-ID: <3A387D6A.782E6A3B@prescod.net> Tim Peters wrote:
> > ...
>
> I've rarely seen problems due to shadowing a global, but have often seen
> problems due to shadowing a builtin.

Really? I think that there are two different issues here. One is consciously choosing to create a new variable but not understanding that there already exists a variable by that name. (i.e. str, list). Another is trying to assign to a global but actually shadowing it. There is no way that anyone coming from another language is going to consider this transcript reasonable:

>>> a=5
>>> def show():
...     print a
...
>>> def set(val):
...     a=val
...
>>> a
5
>>> show()
5
>>> set(10)
>>> show()
5

It doesn't seem to make any sense. My solution is to make the assignment in "set" illegal unless you add a declaration that says: "No, really. I mean it. Override that sucker." As the PEP points out, overriding is seldom a good idea so the requirement to declare would be rarely invoked. Actually, one could argue that there is no good reason to even *allow* the shadowing of globals. You can always add an underscore to the end of the variable name to disambiguate.
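Paul's transcript behaves the same when written as a plain script; a sketch in modern print() syntax (the helper name is invented — and note that calling it `set`, as in the transcript, would itself shadow a builtin):

```python
a = 5

def show():
    return a              # no assignment to "a" here, so this reads the global

def rebind_wannabe(val):
    a = val               # binds a brand-new local; the global is untouched

rebind_wannabe(10)
print(show())             # still 5, exactly as in the transcript
```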
> Alas, if this rule were extended to > builtins too-- where it would do the most good --then the names of builtins > would effectively become reserved words (any code shadowing them today would > be broken until declarations were added, and any code working today may > break tomorrow if a new builtin were introduced that happened to have the > same name as a local).

I have no good solutions to the shadowing-builtins-accidentally problem. But I will say that those sorts of problems are typically less subtle:

str = "abcdef"
...
str(5) # You'll get a pretty good error message here!

The "right answer" in terms of namespace theory is to consistently refer to builtins with a prefix (whether "__builtins__" or "$") but that's pretty unpalatable from an aesthetic point of view. Paul Prescod From tim.one at home.com Thu Dec 14 09:41:19 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 14 Dec 2000 03:41:19 -0500 Subject: [Python-Dev] Online help scope In-Reply-To: <3A3876DF.5554080C@ActiveState.com> Message-ID: [Paul Prescod] > I think Guido and I are pretty far apart on the scope and requirements > of this online help thing so I'd like some clarification and opinions > from the peanut gallery. > > Consider these scenarios > > a) Signature > ... > b) Usage hint > ... > c) Complete documentation, paged(man-style) > ... > d) Complete documentation in a user-chosen hypertext window > ... Guido's style guide has a lot to say about docstrings, suggesting that they were intended to support two scenarios: #a+#b together (the first line of a multi-line docstring), and #c+#d together (the entire docstring). In this respect I think Guido was (consciously or not) aping elisp's conventions, up to but not including the elisp convention for naming the arguments in the first line of a docstring. The elisp conventions were very successful (simple, and useful in practice), so aping them is a good thing.
We've had stalemate ever since: there isn't a single style of writing docstrings in practice because no single docstring processor has been blessed, while no docstring processor can gain momentum before being blessed. Every attempt to date has erred by trying to do too much, thus attracting so much complaint that it can't ever become blessed. The current argument over PEP 233 appears more of the same. The way to break the stalemate is to err on the side of simplicity: just cater to the two obvious (first-line vs whole-string) cases, and for existing docstrings only. HTML vs plain text is fluff. Paging vs non-paging is fluff. Dumping to stdout vs displaying in a browser is fluff. Jumping through hoops for functions and modules whose authors didn't bother to write docstrings is fluff. Etc. People fight over fluff until it fills the air and everyone chokes to death on it <0.9 wink>. Something dirt simple can get blessed, and once *anything* is blessed, a million docstrings will bloom. [Guido] > That'S What Some People Think. I Disagree That It Would Be Either > Feasible Or A Good Idea To Put All Documentation For A Typical Module > In Its Doc Strings. I'm with Paul on this one: that's what module.__doc__ is for, IMO (Javadoc is great, Eiffel's embedded doc tools are great, Perl POD is great, even REBOL's interactive help is great). All Java, Eiffel, Perl and REBOL have in common that Python lacks is *a* blessed system, no matter how crude. [back to Paul] > ... > No matter what we decide on the issue above, reusing the standard > documentation is the only practical way of populating the help system > in the short-term. Right now, today, there is a ton of documentation > that exists only in LaTeX and HTML. Tons of modules have no docstrings. Then write tools to automatically create docstrings from the LaTeX and HTML, but *check in* the results (i.e., add the docstrings so created to the codebase), and keep the help system simple. > Keywords have no docstrings. 
Neither do integers, but they're obvious too . From thomas at xs4all.net Thu Dec 14 10:13:49 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 14 Dec 2000 10:13:49 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100 References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> Message-ID: <20001214101348.N4396@xs4all.nl> On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote: > By the way, in woody, there are 52 packages with 'python' in the name, and > 32 with 'perl' in the name... Ah, not true, sorry. I shouldn't have posted off-topic stuff after being awoken by machine-down-alarms ;) That was just what my reasonably-default install had installed. Debian has what looks like most CPAN modules as packages, too, so it's closer to a 110/410 spread (python/perl.) Still, not a bad number :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal at lemburg.com Thu Dec 14 11:32:58 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 14 Dec 2000 11:32:58 +0100 Subject: [Python-Dev] new draft of PEP 227 References: <14904.21739.804346.650062@bitdiddle.concentric.net> Message-ID: <3A38A1DA.7EC49149@lemburg.com> Jeremy Hylton wrote: > > I've got a new draft of PEP 227. The terminology and wording are more > convoluted than they need to be. I'll do at least one revision just > to say things more clearly, but I'd appreciate comments on the > proposed spec if you can read the current draft. 
The PEP doesn't mention the problems I pointed out about breaking the lookup schemes w/r to symbols in methods, classes and globals. Please add a comment about this to the PEP + maybe the example I gave in one the posts to python-dev about it. I consider the problem serious enough to limit the nested scoping to lambda functions (or functions in general) only if that's possible. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Thu Dec 14 11:55:38 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 14 Dec 2000 11:55:38 +0100 Subject: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule) References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> Message-ID: <3A38A72A.4011B5BD@lemburg.com> Thomas Wouters wrote: > > On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: > > > I don't think that is a very safe bet. Python 2.0 missed the Debian > > > Potato boat. > > > > This may have had to do more with the unresolved GPL issues. > > This is very likely. Debian is very licence -- or at least GPL -- aware. > Which is a pity, really, because I already prefer it over RedHat in all > other cases (and RedHat is also pretty licence aware, just less piously, > devoutly, beyond-practicality-IMHO dedicated to the GPL.) 
About the GPL issue: as I understood Guido's post, RMS still regards the choice of law clause as being incompatible to the GPL (heck, doesn't this guy ever think about international trade terms, the United Nations Convention on International Sale of Goods or local law in one of the 200+ countries where you could deploy GPLed software... is the GPL only meant for US programmers ?). I am currently rewriting my open source licenses as well and among other things I chose to integrate a choice of law clause as well. Seeing RMS' view of things, I guess that my license will be regarded as incompatible to the GPL which is sad even though I'm in good company... e.g. the Apache license, the Zope license, etc. Dual licensing is not possible as it would reopen the loopholes in the GPL I tried to fix in my license. Any idea on how to proceed ? Another issue: since Python doesn't link Python scripts, is it still true that if one (pure) Python package is covered by the GPL, then all other packages needed by that application will also fall under GPL ?
Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein at lyra.org Thu Dec 14 12:57:43 2000 From: gstein at lyra.org (Greg Stein) Date: Thu, 14 Dec 2000 03:57:43 -0800 Subject: (offtopic) Re: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <3A38A72A.4011B5BD@lemburg.com>; from mal@lemburg.com on Thu, Dec 14, 2000 at 11:55:38AM +0100 References: <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> <3A38A72A.4011B5BD@lemburg.com> Message-ID: <20001214035742.Z8951@lyra.org> On Thu, Dec 14, 2000 at 11:55:38AM +0100, M.-A. Lemburg wrote: >... > I am currently rewriting my open source licenses as well and among > other things I chose to integrate a choice of law clause as well. > Seeing RMS' view of things, I guess that my license will be regarded > as incompatible to the GPL which is sad even though I'm in good > company... e.g. the Apache license, the Zope license, etc. Dual > licensing is not possible as it would reopen the loop-wholes in the > GPL I tried to fix in my license. Any idea on how to proceed ? Only RMS is under the belief that the Apache license is incompatible. It is either clause 4 or 5 (I forget which) where we state that certain names (e.g. "Apache") cannot be used in derived products' names and promo materials. RMS views this as an "additional restriction on redistribution", which is apparently not allowed by the GPL. We (the ASF) generally feel he is being a royal pain in the ass with this. 
We've sent him a big, long email asking for clarification / resolution, but haven't heard back (we sent it a month or so ago). Basically, his FUD creates views such as yours ("the Apache license is incompatible with the GPL") because people just take his word for it. We plan to put together a web page to outline our own thoughts and licensing beliefs/philosophy. We're also planning to rev our license to rephrase/alter the particular clause, but for logistic purposes (putting the project name in there ties it to the particular project; we want a generic ASF license that can be applied to all of the projects without a search/replace). At this point, the ASF is taking the position of ignoring him and his controlling attitude(*) and beliefs. There is the outstanding letter to him, but that doesn't really change our point of view. Cheers, -g (*) for a person espousing freedom, it is rather ironic just how much of a control freak he is (stemming from a no-compromise position to guarantee peoples' freedoms, he always wants things done his way) -- Greg Stein, http://www.lyra.org/ From tg at melaten.rwth-aachen.de Thu Dec 14 14:07:12 2000 From: tg at melaten.rwth-aachen.de (Thomas Gellekum) Date: 14 Dec 2000 14:07:12 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 19:26:32 -0500" References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213192632.A30585@kronos.cnri.reston.va.us> Message-ID: Andrew Kuchling writes: > I'm ambivalent about the list_of_panels. It's a linked list storing > (PyWindow, PyPanel) pairs. Probably it should use a dictionary > instead of implementing a little list, just to reduce the amount of > code. I don't like it either, so feel free to shred it. 
As I said, this is the first (piece of an) extension module I've written and I thought it would be easier to implement a little list than to manage a Python list or such in C. > So, I suggest we create _curses_panel.c, which would be available as > curses.panel. (A panel.py module could then add any convenience > functions that are required.) > > Thomas, do you want to work on this, or should I? Just do it. I'll try to add more examples in the meantime. tg From fredrik at pythonware.com Thu Dec 14 14:19:08 2000 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 14 Dec 2000 14:19:08 +0100 Subject: [Python-Dev] fuzzy logic? Message-ID: <015101c065d0$717d1680$0900a8c0@SPIFF> here's a simple (but somewhat strange) test program:

def spam():
    a = 1
    if (0):
        global a
        print "global a"
    a = 2

def egg():
    b = 1
    if 0:
        global b
        print "global b"
    b = 2

egg()
spam()

print a
print b

if I run this under 1.5.2, I get:

2
Traceback (innermost last):
  File "", line 19, in ?
NameError: b

From gstein at lyra.org Thu Dec 14 14:42:11 2000 From: gstein at lyra.org (Greg Stein) Date: Thu, 14 Dec 2000 05:42:11 -0800 Subject: [Python-Dev] fuzzy logic? In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>; from fredrik@pythonware.com on Thu, Dec 14, 2000 at 02:19:08PM +0100 References: <015101c065d0$717d1680$0900a8c0@SPIFF> Message-ID: <20001214054210.G8951@lyra.org> I would take a guess that the "if 0:" is optimized away *before* the inspection for a "global" statement. But the compiler doesn't know how to optimize away "if (0):", so the global statement remains. Ah. Just checked. Look at compile.c::com_if_stmt(). There is a call to "is_constant_false()" in there. Heh. Looks like is_constant_false() could be made a bit smarter. But the point is valid: you can make is_constant_false() as smart as you want, and you'll still end up with "funny" global behavior.
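In today's CPython the asymmetry Greg describes is gone: the symbol-table pass records `global` declarations before any dead code is dropped, so both spellings behave like Fredrik's spam(). A sketch of that modern behavior (an observation about current CPython, not about the 1.5.2 compiler discussed here):

```python
def spam():
    if (0):
        global a          # dead code, but the declaration is still registered

    a = 2                 # therefore binds the module-level a

def egg():
    if 0:
        global b          # same story for the bare-constant spelling

    b = 2

spam()
egg()
print(a, b)               # 2 2: both functions wrote to module globals
```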
Cheers, -g

On Thu, Dec 14, 2000 at 02:19:08PM +0100, Fredrik Lundh wrote:
> here's a simple (but somewhat strange) test program:
>
> def spam():
>     a = 1
>     if (0):
>         global a
>         print "global a"
>     a = 2
>
> def egg():
>     b = 1
>     if 0:
>         global b
>         print "global b"
>     b = 2
>
> egg()
> spam()
>
> print a
> print b
>
> if I run this under 1.5.2, I get:
>
> 2
> Traceback (innermost last):
>   File "", line 19, in ?
> NameError: b
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev

-- Greg Stein, http://www.lyra.org/

From mwh21 at cam.ac.uk Thu Dec 14 14:58:24 2000 From: mwh21 at cam.ac.uk (Michael Hudson) Date: 14 Dec 2000 13:58:24 +0000 Subject: [Python-Dev] fuzzy logic? In-Reply-To: "Fredrik Lundh"'s message of "Thu, 14 Dec 2000 14:19:08 +0100" References: <015101c065d0$717d1680$0900a8c0@SPIFF> Message-ID:

1) Is there anything in the standard library that does the equivalent of

import symbol, token

def decode_ast(ast):
    if token.ISTERMINAL(ast[0]):
        return (token.tok_name[ast[0]], ast[1])
    else:
        return (symbol.sym_name[ast[0]],) + tuple(map(decode_ast, ast[1:]))

so that, eg:

>>> pprint.pprint(decode.decode_ast(parser.expr("0").totuple()))
('eval_input',
 ('testlist',
  ('test',
   ('and_test',
    ('not_test',
     ('comparison',
      ('expr',
       ('xor_expr',
        ('and_expr',
         ('shift_expr',
          ('arith_expr',
           ('term',
            ('factor', ('power', ('atom', ('NUMBER', '0'))))))))))))))),
 ('NEWLINE', ''),
 ('ENDMARKER', ''))

? Should there be? (Especially if it was a bit better written).

... and Greg's just said everything else I wanted to!

Cheers, M. -- please realize that the Common Lisp community is more than 40 years old. collectively, the community has already been where every clueless newbie will be going for the next three years. so relax, please.
-- Erik Naggum, comp.lang.lisp

From guido at python.org Thu Dec 14 15:51:26 2000 From: guido at python.org (Guido van Rossum) Date: Thu, 14 Dec 2000 09:51:26 -0500 Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: Your message of "Thu, 14 Dec 2000 08:42:59 +0100." References: Message-ID: <200012141451.JAA15637@cj20424-a.reston1.va.home.com>

> I think the following change is incompatible and will break applications.
>
> At least I have some server type applications that rely on
> 'allow_reuse_address' defaulting to 0, because they use
> the 'address already in use' exception, to make sure, that exactly one
> server process is running on this port. One of these applications,
> which is BTW build on top of Fredrik Lundhs 'xmlrpclib' fails to work,
> if I change this default in SocketServer.py.
>
> Would you please explain the reasoning behind this change?

The reason for the patch is that without this, if you kill a TCP server and restart it right away, you'll get a 'port in use' error -- TCP has some kind of strange wait period after a connection is closed before it can be reused. The patch avoids this error. As far as I know, with TCP, code using SO_REUSEADDR still cannot bind to the port when another process is already using it, but for UDP, the semantics may be different. Is your server using UDP? Try this patch if your problem is indeed related to UDP:

*** SocketServer.py	2000/12/13 20:39:17	1.20
--- SocketServer.py	2000/12/14 14:48:16
***************
*** 268,273 ****
--- 268,275 ----
      """UDP server class."""

+     allow_reuse_address = 0
+
      socket_type = socket.SOCK_DGRAM

      max_packet_size = 8192

If this works for you, I'll check it in, of course.
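The flag being debated maps onto a single class attribute; a sketch with the module's modern spelling (`socketserver` — in the 2.0-era library it was `SocketServer`, and the flag took 0/1):

```python
import socketserver

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        self.request.sendall(b"ok")

class RestartableTCPServer(socketserver.TCPServer):
    # When true, the constructor calls
    # setsockopt(SOL_SOCKET, SO_REUSEADDR, 1) before bind(), so a freshly
    # restarted server can rebind without waiting out TCP's TIME_WAIT.
    allow_reuse_address = True

server = RestartableTCPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
print(server.server_address)
server.server_close()
```

Peter's one-instance-per-port trick relies on the opposite setting: leaving the flag false keeps the 'address already in use' error as a crude lock.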
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Thu Dec 14 15:52:37 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 14 Dec 2000 09:52:37 -0500 (EST) Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A38A1DA.7EC49149@lemburg.com> References: <14904.21739.804346.650062@bitdiddle.concentric.net> <3A38A1DA.7EC49149@lemburg.com> Message-ID: <14904.57013.371474.691948@bitdiddle.concentric.net> >>>>> "MAL" == M -A Lemburg writes: MAL> Jeremy Hylton wrote: >> >> I've got a new draft of PEP 227. The terminology and wording are >> more convoluted than they need to be. I'll do at least one >> revision just to say things more clearly, but I'd appreciate >> comments on the proposed spec if you can read the current draft. MAL> The PEP doesn't mention the problems I pointed out about MAL> breaking the lookup schemes w/r to symbols in methods, classes MAL> and globals. I believe it does. There was some discussion on python-dev and with others in private email about how classes should be handled. The relevant section of the specification is: If a name is used within a code block, but it is not bound there and is not declared global, the use is treated as a reference to the nearest enclosing function region. (Note: If a region is contained within a class definition, the name bindings that occur in the class block are not visible to enclosed functions.) MAL> Please add a comment about this to the PEP + maybe the example MAL> I gave in one the posts to python-dev about it. I consider the MAL> problem serious enough to limit the nested scoping to lambda MAL> functions (or functions in general) only if that's possible. If there was some other concern you had, then I don't know what it was. I recall that you had a longish example that raised a NameError immediately :-). Jeremy From mal at lemburg.com Thu Dec 14 16:02:33 2000 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Thu, 14 Dec 2000 16:02:33 +0100 Subject: [Python-Dev] new draft of PEP 227 References: <14904.21739.804346.650062@bitdiddle.concentric.net> <3A38A1DA.7EC49149@lemburg.com> <14904.57013.371474.691948@bitdiddle.concentric.net> Message-ID: <3A38E109.54C07565@lemburg.com> Jeremy Hylton wrote: > > >>>>> "MAL" == M -A Lemburg writes: > > MAL> Jeremy Hylton wrote: > >> > >> I've got a new draft of PEP 227. The terminology and wording are > >> more convoluted than they need to be. I'll do at least one > >> revision just to say things more clearly, but I'd appreciate > >> comments on the proposed spec if you can read the current draft. > > MAL> The PEP doesn't mention the problems I pointed out about > MAL> breaking the lookup schemes w/r to symbols in methods, classes > MAL> and globals. > > I believe it does. There was some discussion on python-dev and > with others in private email about how classes should be handled. > > The relevant section of the specification is: > > If a name is used within a code block, but it is not bound there > and is not declared global, the use is treated as a reference to > the nearest enclosing function region. (Note: If a region is > contained within a class definition, the name bindings that occur > in the class block are not visible to enclosed functions.) Well hidden ;-) Honestly, I think that you should either make this specific case more visible to readers of the PEP since this single detail would produce most of the problems with nested scopes. BTW, what about nested classes ? AFAIR, the PEP only talks about nested functions. > MAL> Please add a comment about this to the PEP + maybe the example > MAL> I gave in one the posts to python-dev about it. I consider the > MAL> problem serious enough to limit the nested scoping to lambda > MAL> functions (or functions in general) only if that's possible. > > If there was some other concern you had, then I don't know what it > was. 
> I recall that you had a longish example that raised a NameError
> immediately :-).

The idea behind the example should have been clear, though:

    x = 1

    class C:
        x = 2
        def test(self):
            print x

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:      http://www.egenix.com/
Consulting:   http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From fdrake at acm.org  Thu Dec 14 16:09:57 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 14 Dec 2000 10:09:57 -0500 (EST)
Subject: [Python-Dev] fuzzy logic?
In-Reply-To:
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <14904.58053.282537.260186@cj42289-a.reston1.va.home.com>

Michael Hudson writes:
 > 1) Is there anything is the standard library that does the equivalent
 > of

No, but I have a chunk of code that does in a different way.  Where in
the library do you think it belongs?  The compiler package sounds like
the best place, but that's not installed by default.  (Jeremy, is that
likely to change soon?)

  -Fred

-- 
Fred L. Drake, Jr.
PythonLabs at Digital Creations

From mwh21 at cam.ac.uk  Thu Dec 14 16:47:33 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 14 Dec 2000 15:47:33 +0000
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: "Fred L. Drake, Jr."'s message of "Thu, 14 Dec 2000 10:09:57 -0500 (EST)"
References: <015101c065d0$717d1680$0900a8c0@SPIFF> <14904.58053.282537.260186@cj42289-a.reston1.va.home.com>
Message-ID:

"Fred L. Drake, Jr." writes:

> Michael Hudson writes:
>  > 1) Is there anything is the standard library that does the equivalent
>  > of
>
> No, but I have a chunk of code that does in a different way.

I'm guessing everyone who's played with the parser much does, hence
the suggestion.  I agree my implementation is probably not optimal -
I just threw it together as quickly as I could!

> Where in the library do you think it belongs?  The compiler package
> sounds like the best place, but that's not installed by default.
> (Jeremy, is that likely to change soon?) Actually, I'd have thought the parser module would be most natural, but that would probably mean doing the _module.c trick, and it's probably not worth the bother. OTOH, it seems that wrapping any given extension module in a python module is becoming if anything the norm, so maybe it is. Cheers, M. -- I don't remember any dirty green trousers. -- Ian Jackson, ucam.chat From nowonder at nowonder.de Thu Dec 14 16:50:10 2000 From: nowonder at nowonder.de (Peter Schneider-Kamp) Date: Thu, 14 Dec 2000 16:50:10 +0100 Subject: [Python-Dev] [PEP-212] new draft Message-ID: <3A38EC32.210BD1A2@nowonder.de> In an attempt to revive PEP 212 - Loop counter iteration I have updated the draft. The HTML version can be found at: http://python.sourceforge.net/peps/pep-0212.html I will appreciate any form of comments and/or criticisms. Peter P.S.: Now I have posted it - should I update the Post-History? Or is that for posts to c.l.py? From pf at artcom-gmbh.de Thu Dec 14 16:56:08 2000 From: pf at artcom-gmbh.de (Peter Funk) Date: Thu, 14 Dec 2000 16:56:08 +0100 (MET) Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: <200012141451.JAA15637@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 14, 2000 9:51:26 am" Message-ID: Hi, Moshes checkin indeed makes a lot of sense. Sorry for the irritation. Guido van Rossum: > The reason for the patch is that without this, if you kill a TCP server > and restart it right away, you'll get a 'port in use" error -- TCP has > some kind of strange wait period after a connection is closed before > it can be reused. The patch avoids this error. > > As far as I know, with TCP, code using SO_REUSEADDR still cannot bind > to the port when another process is already using it, but for UDP, the > semantics may be different. > > Is your server using UDP? 
No, and I must admit that I didn't test carefully enough: from a quick
look at my process listing I assumed there were indeed two server
processes running concurrently, which would have broken the needed
mutual exclusion.  But the second process had gone into a
sleep-and-retry-to-connect loop which I simply forgot about.  This
loop was initially built into my server to wait until the "strange
wait period" you mentioned above was over, or until a certain number
of retries had been exceeded.

I guess I can take this ugly work-around out with Python 2.0 and
newer, since the BaseHTTPServer.py shipped with Python 2.0 already
contains the allow_reuse_address = 1 default in the HTTPServer class.

BTW: I took my old W. Richard Stevens "Unix Network Programming" from
the shelf.  After rereading the rather terse paragraph about
SO_REUSEADDR, I guess the wait period is necessary to make sure that
there is no connect pending from an outside client on this TCP port.
I can't find anything about UDP and REUSE.

Regards, Peter

From guido at python.org  Thu Dec 14 17:17:27 2000
From: guido at python.org (Guido van Rossum)
Date: Thu, 14 Dec 2000 11:17:27 -0500
Subject: [Python-Dev] Online help scope
In-Reply-To: Your message of "Wed, 13 Dec 2000 23:29:35 PST." <3A3876DF.5554080C@ActiveState.com>
References: <3A3876DF.5554080C@ActiveState.com>
Message-ID: <200012141617.LAA16179@cj20424-a.reston1.va.home.com>

> I think Guido and I are pretty far apart on the scope and requirements
> of this online help thing so I'd like some clarification and opinions
> from the peanut gallery.

I started replying but I think Tim's said it all.  Let's do something
dead simple.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry at digicool.com  Thu Dec 14 18:14:01 2000
From: barry at digicool.com (Barry A.
Warsaw)
Date: Thu, 14 Dec 2000 12:14:01 -0500
Subject: [Python-Dev] [PEP-212] new draft
References: <3A38EC32.210BD1A2@nowonder.de>
Message-ID: <14904.65497.940293.975775@anthem.concentric.net>

>>>>> "PS" == Peter Schneider-Kamp writes:

    PS> P.S.: Now I have posted it - should I update the Post-History?
    PS> Or is that for posts to c.l.py?

Originally, I'd thought of it as tracking the posting history to
c.l.py.  I'm not sure how useful that header is after all -- maybe in
just giving a start into the python-list archives...

-Barry

From tim.one at home.com  Thu Dec 14 18:33:41 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 12:33:41 -0500
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID:

Note that the behavior of both functions is undefined ("Names listed
in a global statement must not be used in the same code block textually
preceding that global statement", from the Lang Ref, and "if" does not
introduce a new code block in Python's terminology).  But you'll get
the same outcome via these trivial variants, which sidestep that
problem:

def spam():
    if (0):
        global a
        print "global a"
    a = 2

def egg():
    if 0:
        global b
        print "global b"
    b = 2

*Now* you can complain .

> -----Original Message-----
> From: python-dev-admin at python.org [mailto:python-dev-admin at python.org]On
> Behalf Of Fredrik Lundh
> Sent: Thursday, December 14, 2000 8:19 AM
> To: python-dev at python.org
> Subject: [Python-Dev] fuzzy logic?
>
>
> here's a simple (but somewhat strange) test program:
>
> def spam():
>     a = 1
>     if (0):
>         global a
>         print "global a"
>     a = 2
>
> def egg():
>     b = 1
>     if 0:
>         global b
>         print "global b"
>     b = 2
>
> egg()
> spam()
>
> print a
> print b
>
> if I run this under 1.5.2, I get:
>
> 2
> Traceback (innermost last):
>   File "", line 19, in ?
> NameError: b > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://www.python.org/mailman/listinfo/python-dev From tim.one at home.com Thu Dec 14 19:46:09 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 14 Dec 2000 13:46:09 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule) In-Reply-To: <3A38A72A.4011B5BD@lemburg.com> Message-ID: [MAL] > About the GPL issue: as I understood Guido's post, RMS still regards > the choice of law clause as being incompatible to the GPL Yes. Actually, I don't know what RMS really thinks -- his public opinions on legal issues appear to be echoes of what Eben Moglen tells him. Like his views or not, Moglen is a tenured law professor > (heck, doesn't this guy ever think about international trade terms, > the United Nations Convention on International Sale of Goods > or local law in one of the 200+ countries where you could deploy > GPLed software... Yes. > is the GPL only meant for US programmers ?). No. Indeed, that's why the GPL is grounded in copyright law, because copyright law is the most uniform (across countries) body of law we've got. Most commentary I've seen suggests that the GPL has its *weakest* legal legs in the US! > I am currently rewriting my open source licenses as well and among > other things I chose to integrate a choice of law clause as well. > Seeing RMS' view of things, I guess that my license will be regarded > as incompatible to the GPL Yes. > which is sad even though I'm in good company... e.g. the Apache > license, the Zope license, etc. Dual licensing is not possible as > it would reopen the loop-wholes in the GPL I tried to fix in my > license. Any idea on how to proceed ? 
You can wait to see how the CNRI license turns out, then copy it if it's successful; you can approach the FSF directly; you can stop trying to do it yourself and reuse some license that's already been blessed by the FSF; or you can give up on GPL compatibility (according to the FSF). I don't see any other choices. > Another issue: since Python doesn't link Python scripts, is it > still true that if one (pure) Python package is covered by the GPL, > then all other packages needed by that application will also fall > under GPL ? Sorry, couldn't make sense of the question. Just as well, since you should ask about it on a GNU forum anyway . From mal at lemburg.com Thu Dec 14 21:02:05 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 14 Dec 2000 21:02:05 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: Message-ID: <3A39273D.4AE24920@lemburg.com> Tim Peters wrote: > > [MAL] > > About the GPL issue: as I understood Guido's post, RMS still regards > > the choice of law clause as being incompatible to the GPL > > Yes. Actually, I don't know what RMS really thinks -- his public opinions > on legal issues appear to be echoes of what Eben Moglen tells him. Like his > views or not, Moglen is a tenured law professor But it's his piece of work, isn't it ? He's the one who can change it. > > (heck, doesn't this guy ever think about international trade terms, > > the United Nations Convention on International Sale of Goods > > or local law in one of the 200+ countries where you could deploy > > GPLed software... > > Yes. Strange, then how come he sees the choice of law clause as a problem: without explicitely ruling out the applicability of the UN CISC, this clause is waived by it anyway... at least according to a specialist on software law here in Germany. > > is the GPL only meant for US programmers ?). > > No. 
Indeed, that's why the GPL is grounded in copyright law, because > copyright law is the most uniform (across countries) body of law we've got. > Most commentary I've seen suggests that the GPL has its *weakest* legal legs > in the US! Huh ? Just an example: in Germany customer rights assure a 6 month warranty on everything you buy or obtain in some other way. Liability is another issue: there are some very unpleasant laws which render most of the "no liability" paragraphs in licenses useless in Germany. Even better: since the license itself is written in English a German party could simply consider the license non-binding, since he or she hasn't agreed to accept contract in foreign languages. France has similar interpretations. > > I am currently rewriting my open source licenses as well and among > > other things I chose to integrate a choice of law clause as well. > > Seeing RMS' view of things, I guess that my license will be regarded > > as incompatible to the GPL > > Yes. > > > which is sad even though I'm in good company... e.g. the Apache > > license, the Zope license, etc. Dual licensing is not possible as > > it would reopen the loop-wholes in the GPL I tried to fix in my > > license. Any idea on how to proceed ? > > You can wait to see how the CNRI license turns out, then copy it if it's > successful; you can approach the FSF directly; you can stop trying to do it > yourself and reuse some license that's already been blessed by the FSF; or > you can give up on GPL compatibility (according to the FSF). I don't see > any other choices. I guess I'll go with the latter. > > Another issue: since Python doesn't link Python scripts, is it > > still true that if one (pure) Python package is covered by the GPL, > > then all other packages needed by that application will also fall > > under GPL ? > > Sorry, couldn't make sense of the question. Just as well, since you should > ask about it on a GNU forum anyway . 
Isn't this question (whether the GPL virus applies to byte-code
as well) important to Python programmers as well ?

Oh well, nevermind... it's still nice to hear that CNRI and RMS
have finally made up their minds to render Python GPL-compatible --
whatever this means ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:      http://www.egenix.com/
Consulting:   http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From cgw at fnal.gov  Thu Dec 14 22:06:43 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 15:06:43 -0600 (CST)
Subject: [Python-Dev] memory leaks
Message-ID: <14905.13923.659879.100243@buffalo.fnal.gov>

The following code (extracted from test_extcall.py) leaks memory:

    class Foo:
        def method(self, arg1, arg2):
            return arg1 + arg2

    def f():
        err = None
        try:
            Foo.method(*(1, 2, 3))
        except TypeError, err:
            pass
        del err

One-line fix (also posted to Sourceforge):

--- Python/ceval.c	2000/10/30 17:15:19	2.213
+++ Python/ceval.c	2000/12/14 20:54:02
@@ -1905,8 +1905,7 @@
 				     class))) {
 					PyErr_SetString(PyExc_TypeError,
 	"unbound method must be called with instance as first argument");
-					x = NULL;
-					break;
+					goto extcall_fail;
 				}
 			}
 		}

I think that there are a bunch more memory leaks lurking around...
this only fixes one of them.  I'll send more info as I find out
what's going on.

From tim.one at home.com  Thu Dec 14 22:28:09 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 16:28:09 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A39273D.4AE24920@lemburg.com>
Message-ID:

I'm not going to argue about the GPL.  Take it up with the FSF!  I will
say that if you do get the FSF's attention, Moglen will have an instant
counter to any objection you're likely to raise -- he's been thinking
about this for 10 years, and he's heard it all.  And in our experience,
RMS won't commit to anything before running it past Moglen.
[MAL] > But it's his [RMS's] piece of work, isn't it ? He's the one who can > change it. Akin to saying Python is Guido's piece of work. Yes, no, kinda, more true at some times than others, ditto respects. RMS has consistently said that any changes for the next version of the GPL will take at least a year, due to extensive legal review required first. Would be more clearly true to say that the first version of the GPL was RMS's alone -- but version 2 came out in 1991. > ... > Strange, then how come he sees the choice of law clause as a problem: > without explicitely ruling out the applicability of the UN CISC, > this clause is waived by it anyway... at least according to a > specialist on software law here in Germany. > ... [and other "who knows?" objections] ... Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re: [License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who dismissed it almost offhandedly as "layman's commentary". You'll have to ask him why: MAL, we're not lawyers. We're incompetent to have this discussion -- or at least I am, and Moglen thinks you are too . >>> Another issue: since Python doesn't link Python scripts, is it >>> still true that if one (pure) Python package is covered by the GPL, >>> then all other packages needed by that application will also fall >>> under GPL ? [Tim] >> Sorry, couldn't make sense of the question. Just as well, >> since you should ask about it on a GNU forum anyway . [MAL] > Isn't this question (whether the GPL virus applies to byte-code > as well) important to Python programmers as well ? I don't know -- like I said, I couldn't make sense of the question, i.e. I couldn't figure out what it is you're asking. I *suspect* it's based on a misunderstanding of the GPL; for example, gcc is a GPL'ed application that requires stuff from the OS in order to do its job of compiling, but that doesn't mean that every OS it runs on falls under the GPL. 
The GPL contains no restrictions on *use*, it restricts only copying, modifying and distributing (the specific rights granted by copyright law). I don't see any way to read the GPL as restricting your ability to distribute a GPL'ed program P on its own, no matter what the status of the packages that P may rely upon for operation. The GPL is also not viral in the sense that it cannot infect an unwitting victim. Nothing whatsoever you do or don't do can make *any* other program Q "fall under" the GPL -- only Q's owner can set the license for Q. The GPL purportedly can prevent you from distributing (but not from using) a program that links with a GPL'ed program, but that doesn't appear to be what you're asking about. Or is it? If you were to put, say, mxDateTime, under the GPL, then yes, I believe the FSF would claim I could not distribute my program T that uses mxDateTime unless T were also under the GPL or a GPL-compatible license. But if mxDateTime is not under the GPL, then nothing I do with T can magically change the mxDateTime license to the GPL (although if your mxDateTime license allows me to redistribute mxDateTime under a different license, then it allows me to ship a copy of mxDateTime under the GPL). That said, the whole theory of GPL linking is muddy to me, especially since the word "link" (and its variants) doesn't appear in the GPL. > Oh well, nevermind... it's still nice to hear that CNRI and RMS > have finally made up their minds to render Python GPL-compatible -- > whatever this means ;-) I'm not sure it means anything yet. CNRI and the FSF believed they reached agreement before, but that didn't last after Moglen and Kahn each figured out what the other was really suggesting. From mal at lemburg.com Thu Dec 14 23:25:31 2000 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Thu, 14 Dec 2000 23:25:31 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: Message-ID: <3A3948DB.9165E404@lemburg.com> Tim Peters wrote: > > I'm not going to argue about the GPL. Take it up with the FSF! Sorry, I got a bit carried away -- I don't want to take it up with the FSF, simply because I couldn't care less. What's bugging me is that this one guy is splitting the OSS world in two even though both halfs actually want the same thing: software which you can use for free with full source code. I find that a very poor situation. > I will say > that if you do get the FSF's attention, Moglen will have an instant counter > to any objection you're likely to raise -- he's been thinking about this for > 10 years, and he's heard it all. And in our experience, RMS won't commit to > anything before running it past Moglen. > > [MAL] > > But it's his [RMS's] piece of work, isn't it ? He's the one who can > > change it. > > Akin to saying Python is Guido's piece of work. Yes, no, kinda, more true > at some times than others, ditto respects. RMS has consistently said that > any changes for the next version of the GPL will take at least a year, due > to extensive legal review required first. Would be more clearly true to say > that the first version of the GPL was RMS's alone -- but version 2 came out > in 1991. Point taken. > > ... > > Strange, then how come he sees the choice of law clause as a problem: > > without explicitely ruling out the applicability of the UN CISC, > > this clause is waived by it anyway... at least according to a > > specialist on software law here in Germany. > > ... [and other "who knows?" objections] ... > > Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re: > [License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who > dismissed it almost offhandedly as "layman's commentary". You'll have to > ask him why: MAL, we're not lawyers. 
> We're incompetent to have this
> discussion -- or at least I am, and Moglen thinks you are too .

I'm not a lawyer either, but I am able to apply common sense and know
about German trade laws.

Anyway, here is a reference which covers all the controversial
subjects.  It's in German, but these guys qualify as lawyers ;-) ...

    http://www.ifross.de/ifross_html/index.html

There's also a book on the subject in German which covers all aspects
of software licensing.  Here's the reference in case anyone cares:

    Jochen Marly, Softwareüberlassungsverträge
    C.H. Beck, München, 2000

> >>> Another issue: since Python doesn't link Python scripts, is it
> >>> still true that if one (pure) Python package is covered by the GPL,
> >>> then all other packages needed by that application will also fall
> >>> under GPL ?
>
> [Tim]
> >> Sorry, couldn't make sense of the question.  Just as well,
> >> since you should ask about it on a GNU forum anyway .
>
> [MAL]
> > Isn't this question (whether the GPL virus applies to byte-code
> > as well) important to Python programmers as well ?
>
> I don't know -- like I said, I couldn't make sense of the question, i.e. I
> couldn't figure out what it is you're asking.  I *suspect* it's based on a
> misunderstanding of the GPL; for example, gcc is a GPL'ed application that
> requires stuff from the OS in order to do its job of compiling, but that
> doesn't mean that every OS it runs on falls under the GPL.  The GPL contains
> no restrictions on *use*, it restricts only copying, modifying and
> distributing (the specific rights granted by copyright law).  I don't see
> any way to read the GPL as restricting your ability to distribute a GPL'ed
> program P on its own, no matter what the status of the packages that P may
> rely upon for operation.

This is very controversial: if an application Q needs a GPLed
library P to work, then P and Q form a new whole in the sense of
the GPL.  And this even though P wasn't even distributed together
with Q.
Don't ask me why, but that's how RMS and folks look at it. It can be argued that the dynamic linker actually integrates P into Q, but is the same argument valid for a Python program Q which relies on a GPLed package P ? (The relationship between Q and P is one of providing interfaces -- there is no call address patching required for the setup to work.) > The GPL is also not viral in the sense that it cannot infect an unwitting > victim. Nothing whatsoever you do or don't do can make *any* other program > Q "fall under" the GPL -- only Q's owner can set the license for Q. The GPL > purportedly can prevent you from distributing (but not from using) a program > that links with a GPL'ed program, but that doesn't appear to be what you're > asking about. Or is it? No. What's viral about the GPL is that you can turn an application into a GPLed one by merely linking the two together -- that's why e.g. the libc is distributed under the LGPL which doesn't have this viral property. > If you were to put, say, mxDateTime, under the GPL, then yes, I believe the > FSF would claim I could not distribute my program T that uses mxDateTime > unless T were also under the GPL or a GPL-compatible license. But if > mxDateTime is not under the GPL, then nothing I do with T can magically > change the mxDateTime license to the GPL (although if your mxDateTime > license allows me to redistribute mxDateTime under a different license, then > it allows me to ship a copy of mxDateTime under the GPL). > > That said, the whole theory of GPL linking is muddy to me, especially since > the word "link" (and its variants) doesn't appear in the GPL. True. > > Oh well, nevermind... it's still nice to hear that CNRI and RMS > > have finally made up their minds to render Python GPL-compatible -- > > whatever this means ;-) > > I'm not sure it means anything yet. 
CNRI and the FSF believed they reached > agreement before, but that didn't last after Moglen and Kahn each figured > out what the other was really suggesting. Oh boy... -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From greg at cosc.canterbury.ac.nz Fri Dec 15 00:19:09 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 15 Dec 2000 12:19:09 +1300 (NZDT) Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <3A3948DB.9165E404@lemburg.com> Message-ID: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> "M.-A. Lemburg" : > if an application Q needs a GPLed > library P to work, then P and Q form a new whole in the sense of > the GPL. I don't see how Q can *need* any particular library P to work. The most it can need is some library with an API which is compatible with P's. So I don't buy that argument. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Fri Dec 15 00:58:24 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 15 Dec 2000 12:58:24 +1300 (NZDT) Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A387005.6725DAAE@ActiveState.com> Message-ID: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> Paul Prescod : > We could say that a local can only shadow a global > if the local is formally declared. How do you intend to enforce that? Seems like it would require a test on every assignment to a local, to make sure nobody has snuck in a new global since the function was compiled. > Actually, one could argue that there is no good reason to > even *allow* the shadowing of globals. 
If shadowing were completely disallowed, it would make it impossible
to write a completely self-contained function whose source could be
moved from one environment to another without danger of it breaking.
I wouldn't like the language to have a characteristic like that.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz     +--------------------------------------+

From greg at cosc.canterbury.ac.nz  Fri Dec 15 01:06:12 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 15 Dec 2000 13:06:12 +1300 (NZDT)
Subject: [Python-Dev] Online help scope
In-Reply-To:
Message-ID: <200012150006.NAA02154@s454.cosc.canterbury.ac.nz>

Tim Peters :

> [Paul Prescod]
> > Keywords have no docstrings.
> Neither do integers, but they're obvious too .

Oh, I don't know, it could be useful.

    >>> help(2)
    The first prime number.

    >>> help(2147483647)
    sys.maxint, the largest Python small integer.

    >>> help(42)
    The answer to the ultimate question of life, the universe
    and everything.  See also: ultimate_question.

    >>> help("ultimate_question")
    [Importing research.mice.earth]
    [Calling earth.find_ultimate_question]
    This may take about 10 million years, please be patient...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz     +--------------------------------------+

From barry at digicool.com  Fri Dec 15 01:33:16 2000
From: barry at digicool.com (Barry A.
Warsaw) Date: Thu, 14 Dec 2000 19:33:16 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <3A3948DB.9165E404@lemburg.com> <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> Message-ID: <14905.26316.407495.981198@anthem.concentric.net> >>>>> "GE" == Greg Ewing writes: GE> I don't see how Q can *need* any particular library P to GE> work. The most it can need is some library with an API which GE> is compatible with P's. So I don't buy that argument. It's been my understanding that the FSF's position on this is as follows. If the only functional implementation of the API is GPL'd software then simply writing your code against that API is tantamount to linking with that software. Their reasoning is that the clear intent of the programmer (shut up, Chad) is to combine the program with GPL code. As soon as there is a second, non-GPL implementation of the API, you're fine because while you may not distribute your program with the GPL'd software linked in, those who receive your software wouldn't be forced to combine GPL and non-GPL code. -Barry From tim.one at home.com Fri Dec 15 04:01:36 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 14 Dec 2000 22:01:36 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <3A3948DB.9165E404@lemburg.com> Message-ID: [MAL] > Sorry, I got a bit carried away -- I don't want to take it up > with the FSF, simply because I couldn't care less. Well, nobody else is able to Pronounce on what the FSF believes or will do. Which tells me that you're not really interested in playing along with the FSF here after all -- which we both knew from the start anyway . > What's bugging me is that this one guy is splitting the OSS world There are many people on the FSF bandwagon. I'm not one of them, but I can count. > in two even though both halfs actually want the same thing: software > which you can use for free with full source code. I find that a very > poor situation. 
RMS would not agree that both halves want the same thing; to the contrary, he's openly contemptuous of the Open Source movement -- which you also knew from the start.

> [stuff about German law I won't touch with 12-foot schnitzel]

OTOH, a German FSF advocate assured me: I also tend to forget that the system of the law works different in the US as in Germany. In Germany something that most people will believe (called "common grounds") play a role in the court. So if you knew, because it is widely known what the GPL means, than it is harder to attack that in court. In the US, when something gets to court it doesn't matter at all what people believed about it. Heck, we'll let mass murderers go free if a comma was in the wrong place in a 1592 statute, or send a kid to jail for life for using crack cocaine instead of the flavor favored by stockbrokers. I hope the US is unique in that respect, but it does make the GPL weaker here because even if *everyone* in our country believed the GPL means what RMS says it means, a US court would give that no weight in its logic-chopping.

>>> Another issue: since Python doesn't link Python scripts, is it
>>> still true that if one (pure) Python package is covered by the GPL,
>>> then all other packages needed by that application will also fall
>>> under GPL ?

> This is very controversial: if an application Q needs a GPLed
> library P to work, then P and Q form a new whole in the sense of
> the GPL. And this even though P wasn't even distributed together
> with Q. Don't ask me why, but that's how RMS and folks look at it.

Understood, but have you reread your question above, which I've said twice I can't make sense of? That's not what you were asking about. Your question above asks, if anything, the opposite: the *application* Q is GPL'ed, and the question above asks whether that means the *Ps* it depends on must also be GPL'ed.
To the best of my ability, I've answered "NO" to that one, and "YES" to the question it appears you meant to ask. > It can be argued that the dynamic linker actually integrates > P into Q, but is the same argument valid for a Python program Q > which relies on a GPLed package P ? (The relationship between > Q and P is one of providing interfaces -- there is no call address > patching required for the setup to work.) As before, I believe the FSF will say YES. Unless there's also a non-GPL'ed implementation of the same interface that people could use just as well. See my extended mxDateTime example too. > ... > No. What's viral about the GPL is that you can turn an application > into a GPLed one by merely linking the two together No, you cannot. You can link them together all day without any hassle. What you cannot do is *distribute* it unless the aggregate is first placed under the GPL (or a GPL-compatible license) too. If you distribute it without taking that step, that doesn't turn it into a GPL'ed application either -- in that case you've simply (& supposedly) violated the license on P, so your distribution was simply (& supposedly) illegal. And that is in fact the end result that people who knowingly use the GPL want (granting that it appears most people who use the GPL do so unknowing of its consequences). > -- that's why e.g. the libc is distributed under the LGPL which > doesn't have this viral property. You should read RMS on why glibc is under the LGPL: http://www.fsf.org/philosophy/why-not-lgpl.html It will at least disabuse you of the notion that RMS and you are after the same thing . 
From paulp at ActiveState.com Fri Dec 15 05:02:08 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Thu, 14 Dec 2000 20:02:08 -0800 Subject: [Python-Dev] new draft of PEP 227 References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> Message-ID: <3A3997C0.F977AF51@ActiveState.com> Greg Ewing wrote: > > Paul Prescod : > > > We could say that a local can only shadow a global > > if the local is formally declared. > > How do you intend to enforce that? Seems like it would > require a test on every assignment to a local, to make > sure nobody has snuck in a new global since the function > was compiled. I would expect that all of the checks would be at compile-time. Except for __dict__ hackery, I think it is doable. Python already keeps track of all assignments to locals and all assignments to globals in a function scope. The only addition is keeping track of assignments at a global scope. > > Actually, one could argue that there is no good reason to > > even *allow* the shadowing of globals. > > If shadowing were completely disallowed, it would make it > impossible to write a completely self-contained function > whose source could be moved from one environment to another > without danger of it breaking. I wouldn't like the language > to have a characteristic like that. That seems like a very esoteric requirement. How often do you have functions that do not rely *at all* on their environment (other functions, import statements, global variables). When you move code you have to do some rewriting or customizing of the environment in 94% of the cases. How much effort do you want to spend on the other 6%? Also, there are tools that are designed to help you move code without breaking programs (refactoring editors). They can just as easily handle renaming local variables as adding import statements and fixing up function calls. Paul Prescod From mal at lemburg.com Fri Dec 15 11:05:59 2000 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Fri, 15 Dec 2000 11:05:59 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> Message-ID: <3A39ED07.6B3EE68E@lemburg.com> Greg Ewing wrote: > > "M.-A. Lemburg" : > > if an application Q needs a GPLed > > library P to work, then P and Q form a new whole in the sense of > > the GPL. > > I don't see how Q can *need* any particular library P > to work. The most it can need is some library with > an API which is compatible with P's. So I don't > buy that argument. It's the view of the FSF, AFAIK. You can't distribute an application in binary which dynamically links against libreadline (which is GPLed) on the user's machine, since even though you don't distribute libreadline the application running on the user's machine is considered the "whole" in terms of the GPL. FWIW, I don't agree with that view either, but that's probably because I'm a programmer and not a lawyer :) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 15 11:25:12 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 11:25:12 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: Message-ID: <3A39F188.E366B481@lemburg.com> Tim Peters wrote: > > [Tim and MAL talking about the FSF and their views] > > [Tim and MAL showing off as hobby advocates ;-)] > > >>> Another issue: since Python doesn't link Python scripts, is it > >>> still true that if one (pure) Python package is covered by the GPL, > >>> then all other packages needed by that application will also fall > >>> under GPL ? > > > This is very controversial: if an application Q needs a GPLed > > library P to work, then P and Q form a new whole in the sense of > > the GPL. 
And this even though P wasn't even distributed together
> > with Q. Don't ask me why, but that's how RMS and folks look at it.
>
> Understood, but have you reread your question above, which I've said twice I
> can't make sense of?

I know, it was backwards. Take an example: I have a program which wants to process MP3 files in some way. Now because of some stroke of luck, all Python MP3 modules out there are covered by the GPL. Now I could write an application which uses a certain interface and then tell the user to install the MP3 module separately. As Barry mentioned, this setup will cause distribution of my application to be illegal because I could have only done so by putting the application under the GPL.

> You should read RMS on why glibc is under the LGPL:
>
> http://www.fsf.org/philosophy/why-not-lgpl.html
>
> It will at least disabuse you of the notion that RMS and you are after the
> same thing.

:-) Let's stop this discussion and get back to those cheerful things like Christmas Bells and Santa Claus... :-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From amk at mira.erols.com Fri Dec 15 14:27:24 2000 From: amk at mira.erols.com (A.M. Kuchling) Date: Fri, 15 Dec 2000 08:27:24 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE Message-ID: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com>

unicodeobject.c contains this code:

    PyErr_Format(PyExc_ValueError,
                 "unsupported format character '%c' (0x%x) "
                 "at index %i",
                 c, c, fmt - 1 - PyUnicode_AS_UNICODE(uformat));

c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits, so '%\u3000' % 1 results in an error message containing "'\000' (0x3000)". Is this worth fixing? I'd say no, since the hex value is more useful for Unicode strings anyway.
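The mismatch is easy to see in a few lines (a Python sketch of the same low-8-bits masking, using the '\u3000' value from the example; the variable names are invented for illustration):

```python
# C's %c formats an int as a single character, keeping only the low
# 8 bits of the 16-bit Py_UNICODE value, while %x reports the full
# code point -- which is how "'\000'" and "(0x3000)" end up paired
# in the same error message.
c = 0x3000              # IDEOGRAPHIC SPACE, as in '%\u3000' % 1
low_byte = c & 0xFF     # all that C's %c gets to see
hex_part = "0x%x" % c   # what the %x conversion prints
assert low_byte == 0    # hence the '\000' in the message
assert hex_part == "0x3000"
```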
(I still wanted to mention this little buglet, since I just touched this bit of code.) --amk From jack at oratrix.nl Fri Dec 15 15:26:15 2000 From: jack at oratrix.nl (Jack Jansen) Date: Fri, 15 Dec 2000 15:26:15 +0100 Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: Message by Guido van Rossum , Thu, 14 Dec 2000 09:51:26 -0500 , <200012141451.JAA15637@cj20424-a.reston1.va.home.com> Message-ID: <20001215142616.705993B9B44@snelboot.oratrix.nl> > The reason for the patch is that without this, if you kill a TCP server > and restart it right away, you'll get a 'port in use" error -- TCP has > some kind of strange wait period after a connection is closed before > it can be reused. The patch avoids this error. Well, actually there's a pretty good reason for the "port in use" behaviour: the TCP standard more-or-less requires it. A srchost/srcport/dsthost/dstport combination should not be reused until the maximum TTL has passed, because there may still be "old" retransmissions around. Especially the "open" packets are potentially dangerous. Setting the reuse bit while you're debugging is fine, but setting it in general is not a very good idea... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido at python.org Fri Dec 15 15:31:19 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 09:31:19 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: Your message of "Thu, 14 Dec 2000 20:02:08 PST." 
<3A3997C0.F977AF51@ActiveState.com> References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> Message-ID: <200012151431.JAA19799@cj20424-a.reston1.va.home.com> > Greg Ewing wrote: > > > > Paul Prescod : > > > > > We could say that a local can only shadow a global > > > if the local is formally declared. > > > > How do you intend to enforce that? Seems like it would > > require a test on every assignment to a local, to make > > sure nobody has snuck in a new global since the function > > was compiled. > > I would expect that all of the checks would be at compile-time. Except > for __dict__ hackery, I think it is doable. Python already keeps track > of all assignments to locals and all assignments to globals in a > function scope. The only addition is keeping track of assignments at a > global scope. > > > > Actually, one could argue that there is no good reason to > > > even *allow* the shadowing of globals. > > > > If shadowing were completely disallowed, it would make it > > impossible to write a completely self-contained function > > whose source could be moved from one environment to another > > without danger of it breaking. I wouldn't like the language > > to have a characteristic like that. > > That seems like a very esoteric requirement. How often do you have > functions that do not rely *at all* on their environment (other > functions, import statements, global variables). > > When you move code you have to do some rewriting or customizing of the > environment in 94% of the cases. How much effort do you want to spend on > the other 6%? Also, there are tools that are designed to help you move > code without breaking programs (refactoring editors). They can just as > easily handle renaming local variables as adding import statements and > fixing up function calls. Can we cut this out please? Paul is misguided. There's no reason to forbid a local shadowing a global. All languages with nested scopes allow this. 
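The rule in question, concretely (a minimal sketch of CPython's actual scoping behavior; the function names are invented for illustration):

```python
x = "global"

def reader():
    return x          # no assignment to x in the body, so x is the global

def shadower():
    x = "local"       # assignment makes x local for the whole function,
    return x          # silently shadowing the global

def surpriser():
    try:
        return x      # raises UnboundLocalError: the assignment below
    except NameError: # already made x local throughout the function
        return "unbound"
    x = "local"       # never executed, yet it still affects the line above

assert reader() == "global"
assert shadower() == "local"
assert surpriser() == "unbound"
assert x == "global"  # the global itself is untouched
```

The third function is the case Paul objects to: the assignment acts as a declaration for the entire function body, even before it runs.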
--Guido van Rossum (home page: http://www.python.org/~guido/) From barry at digicool.com Fri Dec 15 17:17:08 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Fri, 15 Dec 2000 11:17:08 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> Message-ID: <14906.17412.221040.895357@anthem.concentric.net> >>>>> "M" == M writes: M> It's the view of the FSF, AFAIK. You can't distribute an M> application in binary which dynamically links against M> libreadline (which is GPLed) on the user's machine, since even M> though you don't distribute libreadline the application running M> on the user's machine is considered the "whole" in terms of the M> GPL. M> FWIW, I don't agree with that view either, but that's probably M> because I'm a programmer and not a lawyer :) I'm not sure I agree with that view either, but mostly because there is a non-GPL replacement for parts of the readline API: http://www.cstr.ed.ac.uk/downloads/editline.html Don't know anything about it, so it may not be featureful enough for Python's needs, but if licensing is really a problem, it might be worth looking into. -Barry From paulp at ActiveState.com Fri Dec 15 17:16:37 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Fri, 15 Dec 2000 08:16:37 -0800 Subject: [Python-Dev] new draft of PEP 227 References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com> Message-ID: <3A3A43E5.347AAF6C@ActiveState.com> Guido van Rossum wrote: > > ... > > Can we cut this out please? Paul is misguided. There's no reason to > forbid a local shadowing a global. All languages with nested scopes > allow this. Python is the only one I know of that implicitly shadows without requiring some form of declaration. JavaScript has it right: reading and writing of globals are symmetrical. 
In the rare case that you explicitly want to shadow, you need a declaration. Python's rule is confusing, implicit and error causing. In my opinion, of course. If you are dead-set against explicit declarations then I would say that disallowing the ambiguous construct is better than silently treating it as a declaration. Paul Prescod From guido at python.org Fri Dec 15 17:23:07 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 11:23:07 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: Your message of "Fri, 15 Dec 2000 08:16:37 PST." <3A3A43E5.347AAF6C@ActiveState.com> References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com> <3A3A43E5.347AAF6C@ActiveState.com> Message-ID: <200012151623.LAA27630@cj20424-a.reston1.va.home.com> > Python is the only one I know of that implicitly shadows without > requiring some form of declaration. JavaScript has it right: reading and > writing of globals are symmetrical. In the rare case that you explicitly > want to shadow, you need a declaration. Python's rule is confusing, > implicit and error causing. In my opinion, of course. If you are > dead-set against explicit declarations then I would say that disallowing > the ambiguous construct is better than silently treating it as a > declaration. Let's agree to differ. This will never change. In Python, assignment is declaration. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Dec 15 18:01:33 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 12:01:33 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE In-Reply-To: Your message of "Fri, 15 Dec 2000 08:27:24 EST." 
<200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> Message-ID: <200012151701.MAA28058@cj20424-a.reston1.va.home.com>

> unicodeobject.c contains this code:
>
>     PyErr_Format(PyExc_ValueError,
>                  "unsupported format character '%c' (0x%x) "
>                  "at index %i",
>                  c, c, fmt - 1 - PyUnicode_AS_UNICODE(uformat));
>
> c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits,
> so '%\u3000' % 1 results in an error message containing "'\000'
> (0x3000)". Is this worth fixing? I'd say no, since the hex value is
> more useful for Unicode strings anyway. (I still wanted to mention
> this little buglet, since I just touched this bit of code.)

Sounds like the '%c' should just be deleted. --Guido van Rossum (home page: http://www.python.org/~guido/) From bckfnn at worldonline.dk Fri Dec 15 18:05:42 2000 From: bckfnn at worldonline.dk (Finn Bock) Date: Fri, 15 Dec 2000 17:05:42 GMT Subject: [Python-Dev] CWD in sys.path. Message-ID: <3a3a480b.28490597@smtp.worldonline.dk>

Hi, I'm trying to understand the initialization of sys.path and especially if CWD is supposed to be included in sys.path by default. (I understand the purpose of sys.path[0], that is not the focus of my question). My setup is Python2.0 on Win2000, no PYTHONHOME or PYTHONPATH envvars. In this setup, an empty string exists as sys.path[1], but I'm unsure if this is by careful design or some freak accident. The empty entry is added because HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath does *not* have any subkey. There is a default value, but that value appears to be ignored. If I add a subkey "foo": HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath\foo with a default value of "d:\foo", the CWD is no longer in sys.path.

    i:\java\jython.cvs\org\python\util>d:\Python20\python.exe -S
    Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
    Type "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.path
    ['', 'd:\\foo', 'D:\\PYTHON20\\DLLs', 'D:\\PYTHON20\\lib',
     'D:\\PYTHON20\\lib\\plat-win', 'D:\\PYTHON20\\lib\\lib-tk', 'D:\\PYTHON20']
    >>>

I noticed that some of the PYTHONPATH macros in PC/config.h include the '.', others do not. So, to put it as a question (for jython): Should CWD be included in sys.path? Are there some situations (like embedding) where CWD shouldn't be in sys.path? regards, finn From guido at python.org Fri Dec 15 18:12:03 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 12:12:03 -0500 Subject: [Python-Dev] CWD in sys.path. In-Reply-To: Your message of "Fri, 15 Dec 2000 17:05:42 GMT." <3a3a480b.28490597@smtp.worldonline.dk> References: <3a3a480b.28490597@smtp.worldonline.dk> Message-ID: <200012151712.MAA02544@cj20424-a.reston1.va.home.com> On Unix, CWD is not in sys.path unless as sys.path[0]. --Guido van Rossum (home page: http://www.python.org/~guido/) From moshez at zadka.site.co.il Sat Dec 16 02:43:41 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Sat, 16 Dec 2000 03:43:41 +0200 (IST) Subject: [Python-Dev] new draft of PEP 227 Message-ID: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il> On Fri, 15 Dec 2000 08:16:37 -0800, Paul Prescod wrote: > Python is the only one I know of that implicitly shadows without > requiring some form of declaration. Perl and Scheme permit implicit shadowing too. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tismer at tismer.com Fri Dec 15 17:42:18 2000 From: tismer at tismer.com (Christian Tismer) Date: Fri, 15 Dec 2000 18:42:18 +0200 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule) References: Message-ID: <3A3A49EA.5D9418E@tismer.com> Tim Peters wrote: ...
> > Another issue: since Python doesn't link Python scripts, is it > > still true that if one (pure) Python package is covered by the GPL, > > then all other packages needed by that application will also fall > > under GPL ? > > Sorry, couldn't make sense of the question. Just as well, since you should > ask about it on a GNU forum anyway . The GNU license is transitive. It automatically extends on other parts of a project, unless they are identifiable, independent developments. As soon as a couple of modules is published together, based upon one GPL-ed module, this propagates. I think this is what MAL meant? Anyway, I'd be interested to hear what the GNU forum says. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From amk at mira.erols.com Fri Dec 15 19:10:34 2000 From: amk at mira.erols.com (A.M. Kuchling) Date: Fri, 15 Dec 2000 13:10:34 -0500 Subject: [Python-Dev] What to do about PEP 229? Message-ID: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> I began writing the fabled fancy setup script described in PEP 229, and then realized there was duplication going on here. The code in setup.py would need to know what libraries, #defines, &c., are needed by each module in order to check if they're needed and set them. But if Modules/Setup can be used to override setup.py's behaviour, then much of this information would need to be in that file, too; the details of compiling a module are in two places. Possibilities: 1) Setup contains fully-loaded module descriptions, and the setup script drops unneeded bits. For example, the socket module requires -lnsl on some platforms. 
The Setup file would contain "socket socketmodule.c -lnsl" on all platforms, and setup.py would check for an nsl library and only use it if it's there. This seems dodgy to me; what if -ldbm is needed on one platform and -lndbm on another?

2) Drop setup completely and just maintain setup.py, with some different overriding mechanism. This is more radical. Adding a new module is then not just a matter of editing a simple text file; you'd have to modify setup.py, making it more like maintaining an autoconf script. Remember, the underlying goal of PEP 229 is to have the out-of-the-box Python installation you get from "./configure;make" contain many more useful modules; right now you wouldn't get zlib, syslog, resource, any of the DBM modules, PyExpat, &c. I'm not wedded to using Distutils to get that, but think that's the only practical way; witness the hackery required to get the DB module automatically compiled. You can also wave your hands in the direction of packagers such as ActiveState or Red Hat, and say "let them make the effort to compile everything". But this problem actually inconveniences *me*, since I always build Python myself and have to extensively edit Setup, so I'd like to fix the problem. Thoughts? --amk From nas at arctrix.com Fri Dec 15 13:03:04 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Fri, 15 Dec 2000 04:03:04 -0800 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <14906.17412.221040.895357@anthem.concentric.net>; from barry@digicool.com on Fri, Dec 15, 2000 at 11:17:08AM -0500 References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> Message-ID: <20001215040304.A22056@glacier.fnational.com> On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A.
Warsaw wrote: > I'm not sure I agree with that view either, but mostly because there > is a non-GPL replacement for parts of the readline API: > > http://www.cstr.ed.ac.uk/downloads/editline.html It doesn't work with the current readline module. It is much smaller than readline and works just as well in my experience. Would there be any interest in including a copy with the standard distribution? The license is quite nice (X11 type). Neil From nas at arctrix.com Fri Dec 15 13:14:50 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Fri, 15 Dec 2000 04:14:50 -0800 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012151509.HAA18093@slayer.i.sourceforge.net>; from gvanrossum@users.sourceforge.net on Fri, Dec 15, 2000 at 07:09:46AM -0800 References: <200012151509.HAA18093@slayer.i.sourceforge.net> Message-ID: <20001215041450.B22056@glacier.fnational.com> On Fri, Dec 15, 2000 at 07:09:46AM -0800, Guido van Rossum wrote: > Update of /cvsroot/python/python/dist/src/Lib > In directory slayer.i.sourceforge.net:/tmp/cvs-serv18082 > > Modified Files: > httplib.py > Log Message: > Get rid of string functions. Can you explain the logic behind this recent interest in removing string functions from the standard library? It it performance? Some unicode issue? I don't have a great attachment to string.py but I also don't see the justification for the amount of work it requires. Neil From guido at python.org Fri Dec 15 20:29:37 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 14:29:37 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Fri, 15 Dec 2000 04:14:50 PST." <20001215041450.B22056@glacier.fnational.com> References: <200012151509.HAA18093@slayer.i.sourceforge.net> <20001215041450.B22056@glacier.fnational.com> Message-ID: <200012151929.OAA03073@cj20424-a.reston1.va.home.com> > Can you explain the logic behind this recent interest in removing > string functions from the standard library? 
> Is it performance? Some unicode issue? I don't have a great
> attachment to string.py but I also don't see the justification
> for the amount of work it requires.

I figure that at *some* point we should start putting our money where our mouth is, deprecate most uses of the string module, and start warning about it. Not in 2.1 probably, given my experience below. As a realistic test of the warnings module I played with some warnings about the string module, and then found that, say, most of the std library modules use it, triggering an extraordinary amount of warnings. I then decided to experiment with the conversion. I quickly found out it's too much work to do manually, so I'll hold off until someone comes up with a tool that does 99% of the work. (The selection of std library modules to convert manually was triggered by something pretty random -- I decided to silence a particular cron job I was running. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From Barrett at stsci.edu Fri Dec 15 20:32:10 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Fri, 15 Dec 2000 14:32:10 -0500 (EST) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> Message-ID: <14906.17712.830224.481130@nem-srvr.stsci.edu>

Guido, Here are my comments on PEP 207. (I've also gone back and read most of the 1998 discussion. What a tedious, in terms of time, but enlightening, in terms of content, discussion that was.)

| - New function:
|
|   PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op)
|
|   This performs the requested rich comparison, returning a Python
|   object or raising an exception. The 3rd argument must be one of
|   LT, LE, EQ, NE, GT or GE.

I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc.
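For orientation: the per-operator special methods under discussion (later adopted in Python) let a comparison return an arbitrary object rather than a single truth value. A sketch in present-day syntax, with an invented Vec class for illustration:

```python
class Vec:
    """Toy sequence whose comparisons are elementwise, returning a
    list of booleans rather than one truth value -- the kind of use
    rich comparisons were designed to permit."""
    def __init__(self, items):
        self.items = list(items)
    def __lt__(self, other):     # one special method per operator:
        return [a < b for a, b in zip(self.items, other.items)]
    def __eq__(self, other):     # __lt__, __le__, __eq__, __ne__,
        return [a == b for a, b in zip(self.items, other.items)]
    def __ge__(self, other):     # __gt__, __ge__
        return [a >= b for a, b in zip(self.items, other.items)]

v, w = Vec([1, 5, 3]), Vec([2, 4, 3])
assert (v < w) == [True, False, False]
assert (v == w) == [False, False, True]
assert (v >= w) == [False, True, True]
```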
| Classes
|
| - Classes can define new special methods __lt__, __le__, __gt__,
|   __ge__, __eq__, __ne__ to override the corresponding operators.
|   (You gotta love the Fortran heritage.) If a class overrides
|   __cmp__ as well, it is only used by PyObject_Compare().

Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to __lt__, __le__, __eq__, etc. I'm not keen on the FORTRAN derived symbolism. I also find it contrary to Python's heritage of being clear and concise. I don't mind typing __lessequal__ (or __less_equal__) once per class for the additional clarity.

| - Should we even bother upgrading the existing types?

Isn't this question partly related to the coercion issue and which type of comparison takes precedence? And if so, then I would think the answer would be 'yes'. Or better still see below my suggestion of adding poor and rich comparison operators along with matrix-type operators.

- If so, how should comparisons on container types be defined? Suppose we have a list whose items define rich comparisons. How should the itemwise comparisons be done? For example:

    def __lt__(a, b): # a<b for lists
        for i in range(min(len(a), len(b))):
            ai, bi = a[i], b[i]
            if ai < bi: return 1
            if ai == bi: continue
            if ai > bi: return 0
            raise TypeError, "incomparable item types"
        return len(a) < len(b)

This uses the same sequence of comparisons as cmp(), so it may as well use cmp() instead:

    def __lt__(a, b): # a<b for lists
        for i in range(min(len(a), len(b))):
            c = cmp(a[i], b[i])
            if c < 0: return 1
            if c == 0: continue
            if c > 0: return 0
            assert 0 # unreachable
        return len(a) < len(b)

And now there's not really a reason to change lists to rich comparisons.

I don't understand this example. If a[i] and b[i] define rich comparisons, then 'a[i] < b[i]' is likely to return a non-boolean value. Yet the 'if' statement expects a boolean value. I don't see how the above example will work. This example also makes me think that the proposals for new operators (ie. PEP 211 and 225) are a good idea. The discussion of rich comparisons in 1998 also lends some support to this.
I can see many uses for two types of comparison operators (as well as the proposed matrix-type operators), one set for poor or boolean comparisons and one for rich or non-boolean comparisons. For example, numeric arrays can define both. Rich comparison operators would return an array of boolean values, while poor comparison operators return a boolean value by performing an implied 'and.reduce' operation. These operators provide clarity and conciseness, without much change to current Python behavior. -- Paul From guido at python.org Fri Dec 15 20:51:04 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 14:51:04 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Fri, 15 Dec 2000 14:32:10 EST." <14906.17712.830224.481130@nem-srvr.stsci.edu> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> Message-ID: <200012151951.OAA03219@cj20424-a.reston1.va.home.com> > Here are my comments on PEP 207. (I've also gone back and read most > of the 1998 discussion. What a tedious, in terms of time, but > enlightening, in terms of content, discussion that was.) > > | - New function: > | > | PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op) > | > | This performs the requested rich comparison, returning a Python > | object or raising an exception. The 3rd argument must be one of > | LT, LE, EQ, NE, GT or GE. > > I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc. This is only at the C level. Having to do a string compare is too slow. Since some of these are multi-character symbols, a character constant doesn't suffice (multi-character character constants are not portable). > | Classes > | > | - Classes can define new special methods __lt__, __le__, __gt__, > | __ge__, __eq__, __ne__ to override the corresponding operators. > | (You gotta love the Fortran heritage.) If a class overrides > | __cmp__ as well, it is only used by PyObject_Compare(). 
> Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to
> __lt__, __le__, __eq__, etc. I'm not keen on the FORTRAN derived
> symbolism. I also find it contrary to Python's heritage of being
> clear and concise. I don't mind typing __lessequal__ (or
> __less_equal__) once per class for the additional clarity.

I don't care about Fortran, but you just showed why I think the short operator names are better: there's less guessing or disagreement about how they are to be spelled. E.g. should it be __lessthan__ or __less_than__ or __less__?

> | - Should we even bother upgrading the existing types?
>
> Isn't this question partly related to the coercion issue and which
> type of comparison takes precedence? And if so, then I would think
> the answer would be 'yes'.

It wouldn't make much of a difference -- comparisons between different types of numbers would get the same outcome either way.

> Or better still see below my suggestion of
> adding poor and rich comparison operators along with matrix-type
> operators.
>
> - If so, how should comparisons on container types be defined?
> Suppose we have a list whose items define rich comparisons. How
> should the itemwise comparisons be done? For example:
>
>     def __lt__(a, b): # a<b for lists
>         for i in range(min(len(a), len(b))):
>             ai, bi = a[i], b[i]
>             if ai < bi: return 1
>             if ai == bi: continue
>             if ai > bi: return 0
>             raise TypeError, "incomparable item types"
>         return len(a) < len(b)
>
> This uses the same sequence of comparisons as cmp(), so it may
> as well use cmp() instead:
>
>     def __lt__(a, b): # a<b for lists
>         for i in range(min(len(a), len(b))):
>             c = cmp(a[i], b[i])
>             if c < 0: return 1
>             if c == 0: continue
>             if c > 0: return 0
>             assert 0 # unreachable
>         return len(a) < len(b)
>
> And now there's not really a reason to change lists to rich
> comparisons.
>
> I don't understand this example. If a[i] and b[i] define rich
> comparisons, then 'a[i] < b[i]' is likely to return a non-boolean
> value.
Yet the 'if' statement expects > a boolean value. I don't see > how the above example will work. Sorry. I was thinking of list items that contain objects that respond to the new overloading protocol, but still return Boolean outcomes. My conclusion is that __cmp__ is just as well. > This example also makes me think that the proposals for new operators > (i.e. PEP 211 and 225) are a good idea. The discussion of rich > comparisons in 1998 also lends some support to this. I can see many > uses for two types of comparison operators (as well as the proposed > matrix-type operators), one set for poor or boolean comparisons and > one for rich or non-boolean comparisons. For example, numeric arrays > can define both. Rich comparison operators would return an array of > boolean values, while poor comparison operators return a boolean value > by performing an implied 'and.reduce' operation. These operators > provide clarity and conciseness, without much change to current Python > behavior. Maybe. That can still be decided later. Right now, adding operators is not on the table for 2.1 (if only because there are two conflicting PEPs); adding rich comparisons *is* on the table because it doesn't change the parser (and because the rich comparisons idea was already pretty much worked out two years ago). --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at home.com Fri Dec 15 22:08:02 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 15 Dec 2000 16:08:02 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012151929.OAA03073@cj20424-a.reston1.va.home.com> Message-ID: [Neil Schemenauer] > Can you explain the logic behind this recent interest in removing > string functions from the standard library? Is it performance? > Some unicode issue? I don't have a great attachment to string.py > but I also don't see the justification for the amount of work it > requires.
[Guido] > I figure that at *some* point we should start putting our money where > our mouth is, deprecate most uses of the string module, and start > warning about it. Not in 2.1 probably, given my experience below. I think this begs Neil's questions: *is* our mouth there, and if so, why? The only public notice of impending string module deprecation anyone came up with was a vague note on the 1.6 web page, and one not repeated in any of the 2.0 release material. "string" is right up there with "os" and "sys" as a FIM (Frequently Imported Module), so the required code changes will be massive. As a user, I don't see what's in it for me to endure that pain: the string module functions work fine! Neither are they warts in the language, any more than that we say sin(pi) instead of pi.sin(). Keeping the functions around doesn't hurt anybody that I can see. > As a realistic test of the warnings module I played with some warnings > about the string module, and then found that, say, most of the std > library modules use it, triggering an extraordinary amount of > warnings. I then decided to experiment with the conversion. I > quickly found out it's too much work to do manually, so I'll hold off > until someone comes up with a tool that does 99% of the work. Ah, so that's the *easy* way to kill this crusade -- forget I said anything. From Barrett at stsci.edu Fri Dec 15 22:20:20 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Fri, 15 Dec 2000 16:20:20 -0500 (EST) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <200012151951.OAA03219@cj20424-a.reston1.va.home.com> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> <200012151951.OAA03219@cj20424-a.reston1.va.home.com> Message-ID: <14906.33325.5784.118110@nem-srvr.stsci.edu> >> This example also makes me think that the proposals for new operators >> (i.e. PEP 211 and 225) are a good idea.
The discussion of rich >> comparisons in 1998 also lends some support to this. I can see many >> uses for two types of comparison operators (as well as the proposed >> matrix-type operators), one set for poor or boolean comparisons and >> one for rich or non-boolean comparisons. For example, numeric arrays >> can define both. Rich comparison operators would return an array of >> boolean values, while poor comparison operators return a boolean value >> by performing an implied 'and.reduce' operation. These operators >> provide clarity and conciseness, without much change to current Python >> behavior. > > Maybe. That can still be decided later. Right now, adding operators > is not on the table for 2.1 (if only because there are two conflicting > PEPs); adding rich comparisons *is* on the table because it doesn't > change the parser (and because the rich comparisons idea was already > pretty much worked out two years ago). Yes, it was worked out previously _assuming_ rich comparisons do not use any new operators. But let's stop for a moment and contemplate adding rich comparisons along with new comparison operators. What do we gain? 1. The current boolean operator behavior does not have to change, and hence will be backward compatible. 2. It eliminates the need to decide whether or not rich comparisons take precedence over boolean comparisons. 3. The new operators add additional behavior without directly impacting current behavior and the use of them is unambiguous, at least in relation to current Python behavior. You know by the operator what type of comparison will be returned. This should appease Jim Fulton, based on his arguments in 1998 about comparison operators always returning a boolean value. 4. Compound objects, such as lists, could implement both rich and boolean comparisons. The boolean comparison would remain as is, while the rich comparison would return a list of boolean values.
Current behavior doesn't change; just a new feature, which you may or may not choose to use, is added. If we go one step further and add the matrix-style operators along with the comparison operators, we can provide a consistent user interface to array/complex operations without changing current Python behavior. If a user has no need for these new operators, he doesn't have to use them or even know about them. All we've done is made Python richer, but, I believe, without making it more complex. For example, all element-wise operations could have a ':' appended to them, e.g. '+:', '<:', etc.; and will define element-wise addition, element-wise less-than, etc. The traditional '*', '/', etc. operators can then be used for matrix operations, which will appease the Matlab people. Therefore, I don't think rich comparisons and matrix-type operators should be considered separable. I really think you should consider this suggestion. It appeases many groups while providing a consistent and clear user interface, without greatly impacting current Python behavior. Always-causing-havoc-at-the-last-moment-ly Yours, Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From guido at python.org Fri Dec 15 22:23:46 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 16:23:46 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Fri, 15 Dec 2000 16:08:02 EST." References: Message-ID: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> > "string" is right up there with "os" and "sys" as a FIM (Frequently > Imported Module), so the required code changes will be massive. As > a user, I don't see what's in it for me to endure that pain: the > string module functions work fine! Neither are they warts in the > language, any more than that we say sin(pi) instead of pi.sin(). > Keeping the functions around doesn't hurt anybody that I can see. Hm.
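[The two spellings at issue, for illustration; shown with string methods only, since the module-level wrappers (string.lower etc.) were later removed in Python 3, and the sample value here is invented:]

```python
# The method-chaining style under discussion, applied to an invented
# sample value; in Python 2 the same result came from
# string.lower(string.strip(fields[0])).
fields = ["  Content-Type: TEXT/PLAIN  "]
cleaned = fields[0].strip().lower()
print(cleaned)  # content-type: text/plain
```

The method form reads left to right in the order the operations happen, which is part of the argument made for it in this thread.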
I'm not saying that this one will be easy. But I don't like having "two ways to do it". It means more learning, etc. (you know the drill). We could have chosen to make the strop module support Unicode; instead, we chose to give string objects methods and promote the use of those methods instead of the string module. (And in a generous mood, we also supported Unicode in the string module -- by providing wrappers that invoke string methods.) If you're saying that we should give users ample time for the transition, I'm with you. If you're saying that you think the string module is too prominent to ever start deprecating its use, I'm afraid we have a problem. I'd also like to note that using the string module's wrappers incurs the overhead of a Python function call -- using string methods is faster. Finally, I like the look of fields[i].strip().lower() much better than that of string.lower(string.strip(fields[i])) -- an actual example from mimetools.py. Ideally, I would like to deprecate the entire string module, so that I can place a single warning at its top. This will cause a single warning to be issued for programs that still use it (no matter how many times it is imported). Unfortunately, there are a couple of things that still need it: string.letters etc., and string.maketrans(). --Guido van Rossum (home page: http://www.python.org/~guido/) From gvwilson at nevex.com Fri Dec 15 22:43:47 2000 From: gvwilson at nevex.com (Greg Wilson) Date: Fri, 15 Dec 2000 16:43:47 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <14906.33325.5784.118110@nem-srvr.stsci.edu> Message-ID: <002901c066e0$1b3f13c0$770a0a0a@nevex.com> Hi, Paul; thanks for your mail. W.r.t. adding matrix operators to Python, you may want to take a look at the counter-arguments in PEP 0211 (attached). Basically, I spoke with the authors of GNU Octave (a GPL'd clone of MATLAB) about what users really used. 
They felt that the only matrix operator that really mattered was matrix-matrix multiply; other operators (including the left and right division operators that even experienced MATLAB users often mix up) were second order at best, and were better handled with methods or functions. Thanks, Greg p.s. PEP 0225 (also attached) is an alternative to PEP 0211 which would add most of the MATLAB-ish operators to Python. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pep-0211.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pep-0225.txt URL: From guido at python.org Fri Dec 15 22:55:46 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 16:55:46 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Fri, 15 Dec 2000 16:20:20 EST." <14906.33325.5784.118110@nem-srvr.stsci.edu> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> <200012151951.OAA03219@cj20424-a.reston1.va.home.com> <14906.33325.5784.118110@nem-srvr.stsci.edu> Message-ID: <200012152155.QAA03879@cj20424-a.reston1.va.home.com> > > Maybe. That can still be decided later. Right now, adding operators > > is not on the table for 2.1 (if only because there are two conflicting > > PEPs); adding rich comparisons *is* on the table because it doesn't > > change the parser (and because the rich comparisons idea was already > > pretty much worked out two years ago). > > Yes, it was worked out previously _assuming_ rich comparisons do not > use any new operators. > > But let's stop for a moment and contemplate adding rich comparisons > along with new comparison operators. What do we gain? > > 1. The current boolean operator behavior does not have to change, and > hence will be backward compatible. What incompatibility do you see in the current proposal? > 2. 
It eliminates the need to decide whether or not rich comparisons > take precedence over boolean comparisons. Only if you want different semantics -- that's only an issue for NumPy. > 3. The new operators add additional behavior without directly impacting > current behavior and the use of them is unambiguous, at least in > relation to current Python behavior. You know by the operator what > type of comparison will be returned. This should appease Jim > Fulton, based on his arguments in 1998 about comparison operators > always returning a boolean value. As you know, I'm now pretty close to Jim. :-) He seemed pretty mellow about this now. > 4. Compound objects, such as lists, could implement both rich > and boolean comparisons. The boolean comparison would remain as > is, while the rich comparison would return a list of boolean > values. Current behavior doesn't change; just a new feature, which > you may or may not choose to use, is added. > > If we go one step further and add the matrix-style operators along > with the comparison operators, we can provide a consistent user > interface to array/complex operations without changing current Python > behavior. If a user has no need for these new operators, he doesn't > have to use them or even know about them. All we've done is made > Python richer, but, I believe, without making it more complex. For > example, all element-wise operations could have a ':' appended to > them, e.g. '+:', '<:', etc.; and will define element-wise addition, > element-wise less-than, etc. The traditional '*', '/', etc. operators > can then be used for matrix operations, which will appease the Matlab > people. > > Therefore, I don't think rich comparisons and matrix-type operators > should be considered separable. I really think you should consider > this suggestion. It appeases many groups while providing a consistent > and clear user interface, without greatly impacting current Python > behavior.
> > Always-causing-havoc-at-the-last-moment-ly Yours, I think you misunderstand. Rich comparisons are mostly about allowing the separate overloading of <, <=, ==, !=, >, and >=. This is useful in its own right. If you don't want to use this overloading facility for elementwise comparisons in NumPy, that's fine with me. Nobody says you have to -- it's just that you *could*. Read my lips: there won't be *any* new operators in 2.1. There will be a better way to overload the existing Boolean operators, and they will be able to return non-Boolean results. That's useful in other situations besides NumPy. Feel free to lobby for elementwise operators -- but based on the discussion about this subject so far, I don't give it much of a chance even past Python 2.1. They would add a lot of baggage to the language (e.g. the table of operators in all Python books would be about twice as long) and by far most users don't care about them. (Read the intro to 211 for some of the concerns -- this PEP tries to make the addition palatable by adding exactly *one* new operator.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Dec 15 23:16:34 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 17:16:34 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Fri, 08 Dec 2000 17:58:03 EST." <200012082258.RAA02389@cj20424-a.reston1.va.home.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> Message-ID: <200012152216.RAA11098@cj20424-a.reston1.va.home.com> I've checked in the essential parts of the warnings PEP, and closed the SF patch. I haven't checked in the examples in the patch -- it's too early for that. But I figured that it's easier to revise the code once it's checked in. I'm pretty confident that it works as advertised.
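[A minimal use of the framework, as a sketch; warnings.warn is the API from the PEP, while catch_warnings/simplefilter postdate this thread and are used here only to capture the output for demonstration:]

```python
import warnings

def old_api():
    # Deprecated entry point: emit a warning but keep working.
    warnings.warn("old_api() is deprecated", DeprecationWarning)
    return 42

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # report every occurrence
    result = old_api()

print(result)                        # 42
print(caught[0].category.__name__)   # DeprecationWarning
```

By default each unique (message, category, location) triple is reported only once per run, which is what makes a single module-level warning practical.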
Still missing is documentation: the warnings module, the new API functions, and the new command line option should all be documented. I'll work on that over the holidays. I consider the PEP done. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Fri Dec 15 23:21:24 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:21:24 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> Message-ID: <3A3A9964.A6B3DD11@lemburg.com> Neil Schemenauer wrote: > > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote: > > I'm not sure I agree with that view either, but mostly because there > > is a non-GPL replacement for parts of the readline API: > > > > http://www.cstr.ed.ac.uk/downloads/editline.html > > It doesn't work with the current readline module. It is much > smaller than readline and works just as well in my experience. > Would there be any interest in including a copy with the standard > distribution? The license is quite nice (X11 type). +1 from here -- line editing is simply a very important part of an interactive prompt and readline is not only big, slow and full of strange surprises, but also GPLed ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 15 23:24:34 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:24:34 +0100 Subject: [Python-Dev] Use of %c and Py_UNICODE References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> Message-ID: <3A3A9A22.E9BA9551@lemburg.com> "A.M. 
Kuchling" wrote: > > unicodeobject.c contains this code: > > PyErr_Format(PyExc_ValueError, > "unsupported format character '%c' (0x%x) " > "at index %i", > c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat)); > > c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits, > so '%\u3000' % 1 results in an error message containing "'\000' > (0x3000)". Is this worth fixing? I'd say no, since the hex value is > more useful for Unicode strings anyway. (I still wanted to mention > this little buglet, since I just touched this bit of code.) Why would you want to fix it ? Format characters will always be ASCII and thus 7-bit -- there's really no need to expand the set of possibilities beyond 8 bits ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fdrake at acm.org Fri Dec 15 23:22:34 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 15 Dec 2000 17:22:34 -0500 (EST) Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012152216.RAA11098@cj20424-a.reston1.va.home.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <200012152216.RAA11098@cj20424-a.reston1.va.home.com> Message-ID: <14906.39338.795843.947683@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > Still missing is documentation: the warnings module, the new API > functions, and the new command line option should all be documented. > I'll work on that over the holidays. I've assigned a bug to you in case you forget. I've given it a "show-stopper" priority level, so I'll feel good ripping the code out if you don't get docs written in time. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From mal at lemburg.com Fri Dec 15 23:39:18 2000 From: mal at lemburg.com (M.-A.
Lemburg) Date: Fri, 15 Dec 2000 23:39:18 +0100 Subject: [Python-Dev] What to do about PEP 229? References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> Message-ID: <3A3A9D96.80781D61@lemburg.com> "A.M. Kuchling" wrote: > > I began writing the fabled fancy setup script described in PEP 229, > and then realized there was duplication going on here. The code in > setup.py would need to know what libraries, #defines, &c., are needed > by each module in order to check if they're needed and set them. But > if Modules/Setup can be used to override setup.py's behaviour, then > much of this information would need to be in that file, too; the > details of compiling a module are in two places. > > Possibilities: > > 1) Setup contains fully-loaded module descriptions, and the setup > script drops unneeded bits. For example, the socket module > requires -lnsl on some platforms. The Setup file would contain > "socket socketmodule.c -lnsl" on all platforms, and setup.py would > check for an nsl library and only use it if it's there. > > This seems dodgy to me; what if -ldbm is needed on one platform and > -lndbm on another? Can't distutils try both and then settle for the working combination ? [distutils isn't really ready for auto-configure yet, but Greg has already provided most of the needed functionality -- it's just not well integrated into the rest of the build process in version 1.0.1 ... BTW, where is Greg ? I haven't heard from him in quite a while.] > 2) Drop setup completely and just maintain setup.py, with some > different overriding mechanism. This is more radical. Adding a > new module is then not just a matter of editing a simple text file; > you'd have to modify setup.py, making it more like maintaining an > autoconf script. Why not parse Setup and use it as input to distutils setup.py ?
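[A rough sketch of such a parser; the function name and dict layout are invented here, and the real Modules/Setup grammar has many more cases (variable definitions, the *shared* marker, -I/-D flags, and so on):]

```python
def parse_setup_line(line):
    # Split one Modules/Setup-style line into a module name, C sources,
    # and -l libraries.  Illustrative only; it ignores the rest of the
    # real Setup grammar.
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    if not line:
        return None
    name, *rest = line.split()
    return {
        "name": name,
        "sources": [tok for tok in rest if tok.endswith(".c")],
        "libraries": [tok[2:] for tok in rest if tok.startswith("-l")],
    }

print(parse_setup_line("socket socketmodule.c -lnsl"))
# {'name': 'socket', 'sources': ['socketmodule.c'], 'libraries': ['nsl']}
```

The resulting dict maps naturally onto the sources/libraries arguments that a distutils Extension takes, which is the point of the suggestion.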
> Remember, the underlying goal of PEP 229 is to have the out-of-the-box > Python installation you get from "./configure;make" contain many more > useful modules; right now you wouldn't get zlib, syslog, resource, any > of the DBM modules, PyExpat, &c. I'm not wedded to using Distutils to > get that, but think that's the only practical way; witness the hackery > required to get the DB module automatically compiled. > > You can also wave your hands in the direction of packagers such as > ActiveState or Red Hat, and say "let them make sure everything compiles". > But this problem actually inconveniences *me*, since I always build > Python myself and have to extensively edit Setup, so I'd like to fix > the problem. > > Thoughts? Nice idea :-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 15 23:44:15 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:44:15 +0100 Subject: [Python-Dev] Death to string functions! References: <200012151509.HAA18093@slayer.i.sourceforge.net> <20001215041450.B22056@glacier.fnational.com> <200012151929.OAA03073@cj20424-a.reston1.va.home.com> Message-ID: <3A3A9EBF.3F9306B6@lemburg.com> Guido van Rossum wrote: > > > Can you explain the logic behind this recent interest in removing > > string functions from the standard library? Is it performance? > > Some unicode issue? I don't have a great attachment to string.py > > but I also don't see the justification for the amount of work it > > requires. > > I figure that at *some* point we should start putting our money where > our mouth is, deprecate most uses of the string module, and start > warning about it. Not in 2.1 probably, given my experience below.
> > As a realistic test of the warnings module I played with some warnings > about the string module, and then found that, say, most of the std > library modules use it, triggering an extraordinary amount of > warnings. I then decided to experiment with the conversion. I > quickly found out it's too much work to do manually, so I'll hold off > until someone comes up with a tool that does 99% of the work. This would also help a lot of programmers out there who are stuck with 100k LOCs of Python code using string.py ;) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 15 23:49:01 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:49:01 +0100 Subject: [Python-Dev] Death to string functions! References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> Message-ID: <3A3A9FDD.E6F021AF@lemburg.com> Guido van Rossum wrote: > > Ideally, I would like to deprecate the entire string module, so that I > can place a single warning at its top. This will cause a single > warning to be issued for programs that still use it (no matter how > many times it is imported). Unfortunately, there are a couple of > things that still need it: string.letters etc., and > string.maketrans(). Can't we come up with a module similar to unicodedata[.py] ? string.py could then still provide the interfaces, but the implementation would live in stringdata.py [Perhaps we won't need stringdata by then...
Unicode will have taken over and the discussion be moot ;-)] -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From thomas at xs4all.net Fri Dec 15 23:54:25 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 15 Dec 2000 23:54:25 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <20001215040304.A22056@glacier.fnational.com>; from nas@arctrix.com on Fri, Dec 15, 2000 at 04:03:04AM -0800 References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> Message-ID: <20001215235425.A29681@xs4all.nl> On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote: > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote: > > I'm not sure I agree with that view either, but mostly because there > > is a non-GPL replacement for parts of the readline API: > > > > http://www.cstr.ed.ac.uk/downloads/editline.html > > It doesn't work with the current readline module. It is much > smaller than readline and works just as well in my experience. > Would there be any interest in including a copy with the standard > distribution? The license is quite nice (X11 type). Definitely +1 from here. Readline reminds me of the cold war, for some reason. (Actually, multiple reasons ;) I don't have time to do it myself, unfortunately, or I would. (Looking at editline has been on my TODO list for a while... :P) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From martin at loewis.home.cs.tu-berlin.de Sat Dec 16 13:32:30 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v.
Loewis) Date: Sat, 16 Dec 2000 13:32:30 +0100 Subject: [Python-Dev] PEP 226 Message-ID: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> I remember earlier discussions on the Python 2.1 release schedule, and never managed to comment on those. I believe that Python contributors and maintainers did an enormous job in releasing Python 2, which took quite some time from everybody's life. I think it is unrealistic to expect the same amount of commitment for the next release, especially if that release appears just a few months after the previous release (that is, one month from now). So I'd like to ask the release manager to take that into account. I'm not quite sure what kind of action I expect; possible alternatives are: - declare 2.1 a pure bug fix release only, with a minimal set of new features. In particular, don't push for completion of PEPs; everybody should then accept that most features that are currently discussed will appear in Python 2.2. - move the schedule for Python 2.1 back (or is it forward?) by, say, a few months. This will give people some time to do the things that did not get the right amount of attention during the 2.0 release, and will still allow work on new and interesting features. Just my 0.02EUR, Martin From guido at python.org Sat Dec 16 17:38:28 2000 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Dec 2000 11:38:28 -0500 Subject: [Python-Dev] PEP 226 In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100." <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> Message-ID: <200012161638.LAA13888@cj20424-a.reston1.va.home.com> > I remember earlier discussions on the Python 2.1 release schedule, and > never managed to comment on those. > > I believe that Python contributors and maintainers did an enormous > job in releasing Python 2, which took quite some time from everybody's
I think it is unrealistic to expect the same amount of > commitment for the next release, especially if that release appears > just a few months after the previous release (that is, one month from > now). > > So I'd like to ask the release manager to take that into > account. I'm not quite sure what kind of action I expect; possible > alternatives are: > - declare 2.1 a pure bug fix release only; with a minimal set of new > features. In particular, don't push for completion of PEPs; everybody > should then accept that most features that are currently discussed > will appear in Python 2.2. > > - move the schedule for Python 2.1 back (or is it forward?) by, say, a > few month. This will people give some time to do the things that did > not get the right amount of attention during 2.0 release, and will > still allow to work on new and interesting features. > > Just my 0.02EUR, You're right -- 2.0 (including 1.6) was a monumental effort, and I'm grateful to all who contributed. I don't expect that 2.1 will be anywhere near the same amount of work! Let's look at what's on the table. 0042 Small Feature Requests Hylton SD 205 pep-0205.txt Weak References Drake S 207 pep-0207.txt Rich Comparisons Lemburg, van Rossum S 208 pep-0208.txt Reworking the Coercion Model Schemenauer S 217 pep-0217.txt Display Hook for Interactive Use Zadka S 222 pep-0222.txt Web Library Enhancements Kuchling I 226 pep-0226.txt Python 2.1 Release Schedule Hylton S 227 pep-0227.txt Statically Nested Scopes Hylton S 230 pep-0230.txt Warning Framework van Rossum S 232 pep-0232.txt Function Attributes Warsaw S 233 pep-0233.txt Python Online Help Prescod From guido at python.org Sat Dec 16 17:46:32 2000 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Dec 2000 11:46:32 -0500 Subject: [Python-Dev] PEP 226 In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100." 
<200012161232.NAA01779@loewis.home.cs.tu-berlin.de> References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> Message-ID: <200012161646.LAA13947@cj20424-a.reston1.va.home.com> [Oops, I posted a partial edit of this message by mistake before.] > I remember earlier discussions on the Python 2.1 release schedule, and > never managed to comment on those. > > I believe that Python contributors and maintainers did an enormous > job in releasing Python 2, which took quite some time from everybody's > life. I think it is unrealistic to expect the same amount of > commitment for the next release, especially if that release appears > just a few months after the previous release (that is, one month from > now). > > So I'd like to ask the release manager to take that into > account. I'm not quite sure what kind of action I expect; possible > alternatives are: > - declare 2.1 a pure bug fix release only, with a minimal set of new > features. In particular, don't push for completion of PEPs; everybody > should then accept that most features that are currently discussed > will appear in Python 2.2. > > - move the schedule for Python 2.1 back (or is it forward?) by, say, a > few months. This will give people some time to do the things that did > not get the right amount of attention during the 2.0 release, and will > still allow work on new and interesting features. > > Just my 0.02EUR, You're right -- 2.0 (including 1.6) was a monumental effort, and I'm grateful to all who contributed. I don't expect that 2.1 will be anywhere near the same amount of work! Let's look at what's on the table. These are listed as Active PEPs -- under serious consideration for Python 2.1:

> 0042 Small Feature Requests Hylton
We can do some of these or leave them.
> 0205 Weak References Drake
This one's open.
> 0207 Rich Comparisons Lemburg, van Rossum
This is really not that much work -- I would've done it already if I weren't distracted by the next one.
> 0208 Reworking the Coercion Model Schemenauer

Neil has most of this under control. I don't doubt for a second that it will be finished.

> 0217 Display Hook for Interactive Use Zadka

Probably a 20-line fix.

> 0222 Web Library Enhancements Kuchling

Up to Andrew. If he doesn't get to it, no big deal.

> 0226 Python 2.1 Release Schedule Hylton

I still think this is realistic -- a release before the conference seems doable!

> 0227 Statically Nested Scopes Hylton

This one's got a 50% chance at least. Jeremy seems motivated to do it.

> 0230 Warning Framework van Rossum

Done except for documentation.

> 0232 Function Attributes Warsaw

We need to discuss this more, but it's not much work to implement.

> 0233 Python Online Help Prescod

If Paul can control his urge to want to solve everything at once, I see no reason why this one couldn't find its way into 2.1. Now, officially the PEP deadline is closed today: the schedule says "16-Dec-2000: 2.1 PEPs ready for review". That means that no new PEPs will be considered for inclusion in 2.1, and PEPs not in the active list won't be considered either. But the PEPs in the list above are all ready for review, even if we don't agree with all of them. I'm actually more worried about the ever-growing number of bug reports and submitted patches. But that's for another time. --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Sun Dec 17 01:09:28 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Sat, 16 Dec 2000 19:09:28 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:24:34PM +0100 References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> <3A3A9A22.E9BA9551@lemburg.com> Message-ID: <20001216190928.A6703@kronos.cnri.reston.va.us> On Fri, Dec 15, 2000 at 11:24:34PM +0100, M.-A. Lemburg wrote: >Why would you want to fix it ?
Format characters will always >be ASCII and thus 7-bit -- there's really no need to expand the >set of possibilities beyond 8 bits ;-) This message is for characters that aren't format characters, which therefore includes all characters >127. --amk From akuchlin at mems-exchange.org Sun Dec 17 01:17:39 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Sat, 16 Dec 2000 19:17:39 -0500 Subject: [Python-Dev] What to do about PEP 229? In-Reply-To: <3A3A9D96.80781D61@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:39:18PM +0100 References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com> Message-ID: <20001216191739.B6703@kronos.cnri.reston.va.us> On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote: >Can't distutils try both and then settle for the working combination ? I'm worried about subtle problems; what if an unneeded -lfoo drags in a customized malloc, or has symbols which conflict with some other library? >... BTW, where is Greg ? I haven't heard from him in quite a while.] Still around; he just hasn't been posting much these days. >Why not parse Setup and use it as input to distutils setup.py ? That was option 1. The existing Setup format doesn't really contain enough intelligence, though; the intelligence is usually in comments such as "Uncomment the following line for Solaris". So either the Setup format is modified (bad, since we'd break existing 3rd-party packages that still use a Makefile.pre.in), or I give up and just do everything in a setup.py. --amk From guido at python.org Sun Dec 17 03:38:01 2000 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Dec 2000 21:38:01 -0500 Subject: [Python-Dev] What to do about PEP 229? In-Reply-To: Your message of "Sat, 16 Dec 2000 19:17:39 EST."
<20001216191739.B6703@kronos.cnri.reston.va.us> References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com> <20001216191739.B6703@kronos.cnri.reston.va.us> Message-ID: <200012170238.VAA14466@cj20424-a.reston1.va.home.com> > >Why not parse Setup and use it as input to distutils setup.py ? > > That was option 1. The existing Setup format doesn't really contain > enough intelligence, though; the intelligence is usually in comments > such as "Uncomment the following line for Solaris". So either the > Setup format is modified (bad, since we'd break existing 3rd-party > packages that still use a Makefile.pre.in), or I give up and just do > everything in a setup.py. Forget Setup. Convert it and be done with it. There really isn't enough there to hang on to. We'll support Setup format (through the makesetup script and the Misc/Makefile.pre.in file) for 3rd party b/w compatibility, but we won't need to use it ourselves. (Too bad for 3rd party documentation that describes the Setup format. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at home.com Sun Dec 17 08:34:27 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 17 Dec 2000 02:34:27 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE In-Reply-To: <20001216190928.A6703@kronos.cnri.reston.va.us> Message-ID: [MAL] > Why would you want to fix it ? Format characters will always > be ASCII and thus 7-bit -- there's really no need to expand the > set of possibilities beyond 8 bits ;-) [AMK] > This message is for characters that aren't format characters, which > therefore includes all characters >127. I'm with the wise man who suggested dropping the %c in this case and just displaying the hex value. Although it would be more readable to drop the %c if and only if the bogus format character isn't printable 7-bit ASCII. Which is obvious, yes? A new if/else isn't going to hurt anything.
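Tim's if-and-only-if rule comes down to a few lines. Here is a sketch in Python rather than C, with invented message wording (the actual text lives in Python/getargs.c): show the offending character itself only when it is printable 7-bit ASCII, and fall back to the hex value alone otherwise.

```python
def bad_format_char_message(ch):
    # ch is the ordinal of the unrecognized format character.
    # Printable 7-bit ASCII: show the character plus its hex value.
    if 32 <= ch <= 126:
        return 'bad format char: "%c" (0x%02x)' % (ch, ch)
    # Anything else (including bytes > 127): hex value only.
    return 'bad format char: 0x%02x' % ch

assert bad_format_char_message(ord('q')) == 'bad format char: "q" (0x71)'
assert bad_format_char_message(0xe9) == 'bad format char: 0xe9'
```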
From tim.one at home.com Sun Dec 17 08:57:01 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 17 Dec 2000 02:57:01 -0500 Subject: [Python-Dev] PEP 226 In-Reply-To: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> Message-ID: [Martin v. Loewis] > ... > - move the schedule for Python 2.1 back (or is it forward?) by, say, a > few months. This will give people some time to do the things that did > not get the right amount of attention during the 2.0 release, and will > still allow work on new and interesting features. Just a stab in the dark, but is one of your real concerns the spotty state of Unicode support in the std libraries? If so, nobody working on the PEPs Guido identified would be likely to work on improving Unicode support even if the PEPs vanished. I don't know how Unicode support is going to improve, but in the absence of visible work in that direction-- or even A Plan to get some --I doubt we're going to hold up 2.1 waiting for magic. no-feature-is-ever-done-ly y'rs - tim From tim.one at home.com Sun Dec 17 09:30:24 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 17 Dec 2000 03:30:24 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A387D6A.782E6A3B@prescod.net> Message-ID: [Tim] >> I've rarely seen problems due to shadowing a global, but have often >> seen problems due to shadowing a builtin. [Paul Prescod] > Really? Yes. > I think that there are two different issues here. One is consciously > choosing to create a new variable but not understanding that there > already exists a variable by that name. (i.e. str, list). Yes, and that's what I've often seen, typically long after the original code is written: someone sticks in some debugging output, or makes a small change to the implementation, and introduces e.g. str = some_preexisting_var + ":" yadda(str) "Suddenly" the program misbehaves in baffling ways.
They're "baffling" because the errors do not occur on the lines where the changes were made, and are almost never related to the programmer's intent when making the changes. > Another is trying to assign to a global but actually shadowing it. I've rarely seen that. > There is no way that anyone coming from another language is going > to consider this transcript reasonable: True, but I don't really care: everyone gets burned once, the better ones eventually learn to use classes instead of mutating globals, and even the dull get over it. It is not, in my experience, an on-going problem for anyone. But I still get burned regularly by shadowing builtins. The burns are not fatal, however, and I can't think of an ointment less painful than the blisters. > >>> a=5 > >>> def show(): > ... print a > ... > >>> def set(val): > ... a=val > ... > >>> a > 5 > >>> show() > 5 > >>> set(10) > >>> show() > 5 > > It doesn't seem to make any sense. My solution is to make the assignment > in "set" illegal unless you add a declaration that says: "No, really. I > mean it. Override that sucker." As the PEP points out, overriding is > seldom a good idea so the requirement to declare would be rarely > invoked. I expect it would do less harm to introduce a compile-time warning for locals that are never referenced (such as the "a" in "set"). > ... > The "right answer" in terms of namespace theory is to consistently refer > to builtins with a prefix (whether "__builtins__" or "$") but that's > pretty unpalatable from an aesthetic point of view. Right, that's one of the ointments I won't apply to my own code, so wouldn't think of asking others to either. WRT mutable globals, people who feel they have to use them would be well served to adopt a naming convention. For example, begin each name with "g" and capitalize the second letter. This can make global-rich code much easier to follow (I've done-- and very happily --similar things in Javascript and C++). 
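Tim's debugging-change scenario can be reproduced in a few lines (the names here are invented for illustration): the new binding shadows the builtin `str`, and the failure surfaces on a later, unrelated line rather than at the assignment itself.

```python
def yadda(s):
    # stand-in for whatever the debugging output did
    return s.upper()

def report(some_preexisting_var, values):
    str = some_preexisting_var + ":"   # the "small change" that shadows the builtin
    yadda(str)
    # ...much later, this unrelated line still believes str is the builtin:
    return [str(v) for v in values]

try:
    report("totals", [1, 2, 3])
except TypeError:
    pass  # the error appears far from the shadowing assignment
else:
    raise AssertionError("expected the shadowed builtin to fail")
```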
From pf at artcom-gmbh.de Sun Dec 17 10:59:11 2000 From: pf at artcom-gmbh.de (Peter Funk) Date: Sun, 17 Dec 2000 10:59:11 +0100 (MET) Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 15, 2000 4:23:46 pm" Message-ID: Hi, Guido van Rossum: > If you're saying that you think the string module is too prominent to > ever start deprecating its use, I'm afraid we have a problem. I strongly believe the string module is too prominent. > I'd also like to note that using the string module's wrappers incurs > the overhead of a Python function call -- using string methods is > faster. I think most care more about readability than about run time performance. For people without much OOP experience, the method syntax hurts readability. > Finally, I like the look of fields[i].strip().lower() much better than > that of string.lower(string.strip(fields[i])) -- an actual example > from mimetools.py. Hmmmm.... Maybe this is just a matter of taste? Like my preference for '<>' instead of '!='? Personally I still like the old-fashioned form more. Especially if string.join() or string.split() are involved. Since Python 1.5.2 will stay around for several years, keeping backward compatibility in our Python coding is still a major issue for us. So we won't change our Python coding style soon, if ever. > Ideally, I would like to deprecate the entire string module, so that I [...] I share Mark Lutz's and Tim Peters' opinion that this crusade will do more harm than good to the Python community. IMO this is a really bad idea. Just my $0.02, Peter From martin at loewis.home.cs.tu-berlin.de Sun Dec 17 12:13:09 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v.
Loewis) Date: Sun, 17 Dec 2000 12:13:09 +0100 Subject: [Python-Dev] PEP 226 In-Reply-To: References: Message-ID: <200012171113.MAA00733@loewis.home.cs.tu-berlin.de> > Just a stab in the dark, but is one of your real concerns the spotty state > of Unicode support in the std libraries? Not at all. I really responded to amk's message # All the PEPs for 2.1 are supposed to be complete for Dec. 16, and # some of those PEPs are pretty complicated. I'm a bit worried that # it's been so quiet on python-dev lately, especially after the # previous two weeks of lively discussion. I just thought that something was wrong here - contributing to a free software project ought to be fun for contributors, not a cause for worries. There-are-other-things-but-i18n-although-they-are-not-that-interesting y'rs, Martin From guido at python.org Sun Dec 17 15:38:07 2000 From: guido at python.org (Guido van Rossum) Date: Sun, 17 Dec 2000 09:38:07 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: Your message of "Sun, 17 Dec 2000 03:30:24 EST." References: Message-ID: <200012171438.JAA21603@cj20424-a.reston1.va.home.com> > I expect it would do less harm to introduce a compile-time warning for > locals that are never referenced (such as the "a" in "set"). Another warning that would be quite useful (and trap similar cases) would be "local variable used before set". --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Dec 17 15:40:40 2000 From: guido at python.org (Guido van Rossum) Date: Sun, 17 Dec 2000 09:40:40 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Sun, 17 Dec 2000 10:59:11 +0100." References: Message-ID: <200012171440.JAA21620@cj20424-a.reston1.va.home.com> > I think most care more about readbility than about run time performance. > For people without much OOP experience, the method syntax hurts > readability. I don't believe one bit of this. 
By that standard, we would do better to define a new module "list" and start writing list.append(L, x) for L.append(x). > I share Mark Lutz's and Tim Peters' opinion that this crusade will do > more harm than good to the Python community. IMO this is a really bad > idea. You are entitled to your opinion, but given that your arguments seem very weak I will continue to ignore it (except to argue with you :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at digicool.com Sun Dec 17 17:17:12 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Sun, 17 Dec 2000 11:17:12 -0500 Subject: [Python-Dev] Death to string functions! References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> Message-ID: <14908.59144.321167.419762@anthem.concentric.net> >>>>> "PF" == Peter Funk writes: PF> Hmmmm.... Maybe this is just a matter of taste? Like my PF> preference for '<>' instead of '!='? Personally I still like PF> the old-fashioned form more. Especially if string.join() or PF> string.split() are involved. Hey cool! I prefer <> over != too, but I also (not surprisingly) strongly prefer string methods over string module functions. TOOWTDI-MA-ly y'rs, -Barry From gvwilson at nevex.com Sun Dec 17 17:25:17 2000 From: gvwilson at nevex.com (Greg Wilson) Date: Sun, 17 Dec 2000 11:25:17 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <14908.59144.321167.419762@anthem.concentric.net> Message-ID: <000201c06845$f1afdb40$770a0a0a@nevex.com> +1 on deprecating string functions. Every Python book and tutorial (including mine) emphasizes Python's simplicity and lack of Perl-ish redundancy; the more we practice what we preach, the more persuasive this argument is. Greg (who admittedly only has a few thousand lines of Python to maintain) From pf at artcom-gmbh.de Sun Dec 17 18:40:06 2000 From: pf at artcom-gmbh.de (Peter Funk) Date: Sun, 17 Dec 2000 18:40:06 +0100 (MET) Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012171440.JAA21620@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 17, 2000 9:40:40 am" Message-ID: [string.function(S, ...) vs. S.method(...)] Guido van Rossum: > I don't believe one bit of this. By that standard, we would do better > to define a new module "list" and start writing list.append(L, x) for > L.append(x). List objects have only very few methods. Strings have many methods, and some of them have names that clash easily with the method names of other kinds of objects. Since there are no type declarations in Python, looking at the code in isolation and seeing a line i = string.index(some_parameter) tells at first glance that some_parameter should be a string object, even if the doc string of this function is too terse. However, in i = some_parameter.index() it could be a list, a database or whatever. > You are entitled to your opinion, but given that your arguments seem > very weak I will continue to ignore it (except to argue with you :-). I see. But given the time frame that the string module wouldn't go away any time soon, I guess I have a lot of time to either think about some stronger arguments or to finally get accustomed to that new style of coding. But since we have to keep compatibility with Python 1.5.2 for at least the next two years, chances for the latter are bad.
Regards and have a nice vacation, Peter From mwh21 at cam.ac.uk Sun Dec 17 19:18:24 2000 From: mwh21 at cam.ac.uk (Michael Hudson) Date: 17 Dec 2000 18:18:24 +0000 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: Thomas Wouters's message of "Fri, 15 Dec 2000 23:54:25 +0100" References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> <20001215235425.A29681@xs4all.nl> Message-ID: Thomas Wouters writes: > On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote: > > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote: > > > I'm not sure I agree with that view either, but mostly because there > > > is a non-GPL replacement for parts of the readline API: > > > > > > http://www.cstr.ed.ac.uk/downloads/editline.html > > > > It doesn't work with the current readline module. It is much > > smaller than readline and works just as well in my experience. > > Would there be any interest in including a copy with the standard > > distribution? The license is quite nice (X11 type). > > Definately +1 from here. Readline reminds me of the cold war, for > some reason. (Actually, multiple reasons ;) I don't have time to do > it myself, unfortunately, or I would. (Looking at editline has been > on my TODO list for a while... :P) It wouldn't be particularly hard to rewrite editline in Python (we have termios & the terminal handling functions in curses - and even ioctl if we get really keen). I've been hacking on my own Python line reader on and off for a while; it's still pretty buggy, but if you're feeling brave you could look at: http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.0.0.tar.gz To try it out, unpack it, cd into the ./pyrl directory and try: >>> import foo # sorry >>> foo.test_loop() It sort of imitates the Python command prompt, except that it doesn't actually execute the code you type. 
You need a recent _cursesmodule.c for it to work. Cheers, M. -- 41. Some programming languages manage to absorb change, but withstand progress. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From thomas at xs4all.net Sun Dec 17 19:30:38 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Sun, 17 Dec 2000 19:30:38 +0100 Subject: [Python-Dev] Death to string functions! In-Reply-To: <000201c06845$f1afdb40$770a0a0a@nevex.com>; from gvwilson@nevex.com on Sun, Dec 17, 2000 at 11:25:17AM -0500 References: <14908.59144.321167.419762@anthem.concentric.net> <000201c06845$f1afdb40$770a0a0a@nevex.com> Message-ID: <20001217193038.C29681@xs4all.nl> On Sun, Dec 17, 2000 at 11:25:17AM -0500, Greg Wilson wrote: > +1 on deprecating string functions. How wonderfully ambiguous ! Do you mean string methods, or the string module? :) FWIW, I agree that in time, the string module should be deprecated. But I also think that 'in time' should be a considerable timespan. Don't deprecate it before everything it provides is available though some other means. Wait a bit longer than that, even, before calling it deprecated -- that scares people off. And then keep it for practically forever (until Py3K) just to support old code. And don't forget to document it 'deprecated' everywhere, not just one minor release note. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tismer at tismer.com Sun Dec 17 18:38:31 2000 From: tismer at tismer.com (Christian Tismer) Date: Sun, 17 Dec 2000 19:38:31 +0200 Subject: [Python-Dev] The Dictionary Gem is polished! References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> Message-ID: <3A3CFA17.ED26F51A@tismer.com> Old topic: {}.popitem() (was Re: {}.first[key,value,item] ...) 
Christian Tismer wrote: > > Fredrik Lundh wrote: > > > > christian wrote: > > > That algorithm is really a gem which you should know, > > > so let me try to explain it. > > > > I think someone just won the "brain exploder 2000" award ;-) > As you might have guessed, I didn't do this just for fun. > It is the old game of explaining what is there, convincing > everybody that you at least know what you are talking about, > and then three days later coming up with an improved > application of the theory. > > Today is Monday, 2 days left. :-) Ok, today is Sunday, I had no time to finish this. But now it is here. =========================== ===== Claim: ===== =========================== - Dictionary access time can be improved with a minimal change - On the hash() function: All Objects are supposed to provide a hash function which is as good as possible. Good means to provide a wide range of different keys for different values. Problem: There are hash functions which are "good" in this sense, but they do not spread their randomness uniformly over the 32 bits. Example: Integers use their own value as hash. This is ok, as far as the integers are uniformly distributed. But if they all contain a high power of two, for instance, the low bits give a very bad hash function. Take a dictionary with integers range(1000) as keys and access all entries. Then use a dictionay with the integers shifted left by 16. Access time is slowed down by a factor of 100, since every access is a linear search now. This is not an urgent problem, although applications exist where this can play a role (memory addresses for instance can have high factors of two when people do statistics on page accesses...) While this is not a big problem, it is ugly enough to think of a solution. Solution 1: ------------- Try to involve more bits of the hash value by doing extra shuffling, either a) in the dictlook function, or b) in the hash generation itself. 
I believe both can't really be justified for a rare problem. But how about changing the existing solution in a way that an improvement is gained without extra cost?

Solution 2: (*the* solution)
----------------------------
Some people may remember what I wrote about re-hashing functions through the multiplicative group GF(2^n)*, and I don't want to repeat this here. The simple idea can be summarized quickly: The original algorithm uses multiplication by polynomials, and it is guaranteed that these re-hash values are jittering through all possible nonzero patterns of the n bits. Observation: We are using an operation of a finite field. This means that the inverse of multiplication also exists!

Old algorithm (multiplication):
    shift the index left by 1
    if index > mask:
        xor the index with the generator polynomial

New algorithm (division):
    if low bit of index set:
        xor the index with the generator polynomial
    shift the index right by 1

What does this mean? Not so much, we are just cycling through our bit patterns in reverse order. But now for the big difference. First change: We change from multiplication to division. Second change: We do not mask the hash value before! The second change is what I was after: By not masking the hash value when computing the initial index, all the existing bits in the hash come into play. This can be seen like a polynomial division, but the initial remainder (the hash value) was not normalized. After a number of steps, all the extra bits are wheeled into our index, but not wasted by masking them off. That gives our re-hash some more randomness. When all the extra bits are sucked in, the guaranteed single-visit cycle begins. There cannot be more than 27 extra cycles in the worst case (dict size = 32, so there are 27 bits to consume). I do not expect any bad effect from this modification.
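Christian's two update rules can be checked standalone. The sketch below is mine, not his attached script; it uses the n=5 entry from the polynomial table (32 + 5 = 37) and verifies that both rules visit every nonzero 5-bit pattern, with the division simply walking the same cycle in reverse.

```python
POLY = 37   # 32 + 5: x^5 + x^2 + 1, taken from the polys table
MASK = 31   # table size 32, so nonzero increments are 1..31

def rehash_mul(incr):
    # old algorithm: multiply by x in GF(2^5)*
    incr = incr << 1
    if incr > MASK:
        incr = incr ^ POLY
    return incr

def rehash_div(incr):
    # new algorithm: divide by x (the inverse field operation)
    if incr & 1:
        incr = incr ^ POLY
    return incr >> 1

def cycle(step, start=1):
    # collect the orbit of `start` under repeated application of `step`
    seen = [start]
    incr = step(start)
    while incr != start:
        seen.append(incr)
        incr = step(incr)
    return seen

mul, div = cycle(rehash_mul), cycle(rehash_div)
# both rules walk all 31 nonzero patterns ...
assert sorted(mul) == sorted(div) == list(range(1, 32))
# ... and the division visits them in reverse order
assert div == [mul[0]] + mul[:0:-1]
```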
Here are some results; dictionaries have 1000 entries:

    timing for strings             old=  5.097  new= 5.088
    timing for bad integers (<<10) old=101.540  new=12.610
    timing for bad integers (<<16) old=571.210  new=19.220

On strings, both algorithms behave the same. On numbers, they differ dramatically. While the current algorithm is 110 times slower on a worst case dict (quadratic behavior), the new algorithm accounts a little for the extra cycle, but is only 4 times slower.

Alternative implementation: The above approach is conservative in the sense that it tries not to slow down the current implementation in any way. An alternative would be to consume all of the extra bits at once. But this would add an extra "warmup" loop like this to the algorithm:

    while index > mask:
        if low bit of index set:
            xor the index with the generator polynomial
        shift the index right by 1

This is of course a very good digest of the higher bits, since it is a polynomial division and not just some bit xor-ing which might give quite predictable cancellations, therefore it is "the right way" in my sense. It might be cheap, but would add over 20 cycles to every small dict. I therefore don't think it is worth doing. Personally, I prefer the solution to merge the bits during the actual lookup, since it suffices to get access time from quadratic down to logarithmic. Attached is a direct translation of the relevant parts of dictobject.c into Python, with both algorithms implemented. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today?
http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3, 256 + 29,
    512 + 17, 1024 + 9, 2048 + 5, 4096 + 83, 8192 + 27,
    16384 + 43, 32768 + 3, 65536 + 45, 131072 + 9, 262144 + 39,
    524288 + 39, 1048576 + 9, 2097152 + 5, 4194304 + 3,
    8388608 + 33, 16777216 + 27, 33554432 + 9, 67108864 + 71,
    134217728 + 39, 268435456 + 9, 536870912 + 5, 1073741824 + 83,
    0
]

class NULL:
    pass

class Dictionary:
    dummy = ""

    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0):
                return ep
            freeslot = NULL

        ###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
        ###### TO HERE

        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i+incr)&mask]
            if (ep[me_key] is NULL):
                if (freeslot != NULL):
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy):
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                  (ep[me_hash] == _hash and
                   cmp(ep[me_key], key) == 0)):
                return ep
            # Cycle through GF(2^n)-{0}
            ###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
            ###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL):
            old_value = ep[me_value]
            ep[me_value] = value
        else:
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots
        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused):
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1
        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL
        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))
        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0
        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots
        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
        ## /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2):
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords):
    import time
    time=time.time
    appl=apply
    if args is None: args = ()
    if type(args) != type(()): args=(args,)
    rep=range(n)
    dummyarg = ("",)
    dummykw = {}
    dummyfunc = len
    if keywords:
        before=time()
        for i in rep: res=appl(dummyfunc, dummyarg, dummykw)
        empty = time()-before
        before=time()
        for i in rep: res=appl(func, args, keywords)
    else:
        before=time()
        for i in rep: res=appl(dummyfunc, dummyarg)
        empty = time()-before
        before=time()
        for i in rep: res=appl(func, args)
    after = time()
    return round(after-before-empty,4), res

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts():
    d1 = Dictionary()  # original
    d2 = Dictionary(1) # other rehash
    for i in range(1000):
        s = str(i) * 5
        d1[s] = d2[s] = i
    return d1, d2

def badnum_dicts():
    d1 = Dictionary()  # original
    d2 = Dictionary(1) # other rehash
    for i in range(1000):
        bad = i << 16
        d1[bad] = d2[bad] = i
    return d1, d2

def do_test(dict, keys, n):
    t0 = timing(nulltest, (keys, dict), n)[0]
    t1 = timing(test, (keys, dict), n)[0]
    return t1-t0

if __name__ == "__main__":
    sdold, sdnew = string_dicts()
    bdold, bdnew = badnum_dicts()
    print "timing for strings old=%.3f new=%.3f" % (
        do_test(sdold, sdold.keys(), 100),
        do_test(sdnew, sdnew.keys(), 100) )
    print "timing for bad integers old=%.3f new=%.3f" % (
        do_test(bdold, bdold.keys(), 10) *10,
        do_test(bdnew, bdnew.keys(), 10) *10)

"""
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610
"""

From fdrake at acm.org Sun Dec 17 19:49:58 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun, 17 Dec 2000 13:49:58 -0500 (EST) Subject: [Python-Dev] Death to string functions! In-Reply-To: <20001217193038.C29681@xs4all.nl> References: <14908.59144.321167.419762@anthem.concentric.net> <000201c06845$f1afdb40$770a0a0a@nevex.com> <20001217193038.C29681@xs4all.nl> Message-ID: <14909.2774.158973.760077@cj42289-a.reston1.va.home.com> Thomas Wouters writes: > FWIW, I agree that in time, the string module should be deprecated. But I > also think that 'in time' should be a considerable timespan. Don't deprecate *If* most functions in the string module are going to be deprecated, that should be done *now*, so that the documentation will include the appropriate warning to users. When they should actually be removed is another matter, and I think Guido is sufficiently aware of their widespread use and won't remove them too quickly -- his creation of Python isn't the reason he's *accepted* as BDFL, it just made it a possibility. He's had to actually *earn* the BDFL position, I think. With regard to converting the standard library to string methods: that needs to be done as part of the deprecation. The code in the library is commonly used as example code, and should be good example code wherever possible. > support old code. And don't forget to document it 'deprecated' everywhere, > not just one minor release note. When Guido tells me exactly what is deprecated, the documentation will be updated with proper deprecation notices in the appropriate places. -Fred -- Fred L. Drake, Jr.
PythonLabs at Digital Creations

From tismer at tismer.com  Sun Dec 17 19:10:07 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 17 Dec 2000 20:10:07 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
	<3A32615E.D39B68D2@tismer.com>
	<033701c06366$ab746580$0900a8c0@SPIFF>
	<3A34DB7C.FF7E82CE@tismer.com>
	<3A3CFA17.ED26F51A@tismer.com>
Message-ID: <3A3D017F.62AD599F@tismer.com>

Christian Tismer wrote:
...
(my timings)

Attached is the updated script with the timings mentioned in the last posting. Sorry, I posted an older version before.

--
Christian Tismer             :^)
Mission Impossible 5oftware  :    Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :    PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9  9D15 D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3,
    256 + 29, 512 + 17, 1024 + 9, 2048 + 5, 4096 + 83,
    8192 + 27, 16384 + 43, 32768 + 3, 65536 + 45,
    131072 + 9, 262144 + 39, 524288 + 39, 1048576 + 9,
    2097152 + 5, 4194304 + 3, 8388608 + 33, 16777216 + 27,
    33554432 + 9, 67108864 + 71, 134217728 + 39, 268435456 + 9,
    536870912 + 5, 1073741824 + 83,
    0
]

class NULL: pass

class Dictionary:
    dummy = ""

    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0) :
                return ep
            freeslot = NULL

        ###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
        ###### TO HERE
        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i+incr)&mask]
            if (ep[me_key] is NULL) :
                if (freeslot != NULL) :
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy) :
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                  (ep[me_hash] == _hash and
                   cmp(ep[me_key], key) == 0)) :
                return ep
            # Cycle through GF(2^n)-{0}
            ###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
            ###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL) :
            old_value = ep[me_value]
            ep[me_value] = value
        else :
            if (ep[me_key] is NULL):
                mp.ma_fill=mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots
        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused) :
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1
        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL
        newtable = map(lambda x,y=_nullentry:y[:], range(newsize))
        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0
        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key],ep[me_hash],ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots
        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
        ## /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2) :
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key,
                      _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( _value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append( (_key, _value) )
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords) :
    import time
    time=time.time
    appl=apply
    if args is None: args = ()
    if type(args) != type(()) : args=(args,)
    rep=range(n)
    dummyarg = ("",)
    dummykw = {}
    dummyfunc = len
    if keywords:
        before=time()
        for i in rep: res=appl(dummyfunc, dummyarg, dummykw)
        empty = time()-before
        before=time()
        for i in rep: res=appl(func, args, keywords)
    else:
        before=time()
        for i in rep: res=appl(dummyfunc, dummyarg)
        empty = time()-before
        before=time()
        for i in rep: res=appl(func, args)
    after = time()
    return round(after-before-empty,4), res

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts():
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    for i in range(1000):
        s = str(i) * 5
        d1[s] = d2[s] = i
    return d1, d2

def badnum_dicts():
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(1000):
        bad = i << shift
        d1[bad] = d2[bad] = i
    return d1, d2

def do_test(dict, keys, n):
    t0 = timing(nulltest, (keys, dict), n)[0]
    t1 = timing(test, (keys, dict), n)[0]
    return t1-t0

EXTREME=1

if __name__ == "__main__":
    sdold, sdnew = string_dicts()
    bdold, bdnew = badnum_dicts()
    print "timing for strings old=%.3f
new=%.3f" % (
        do_test(sdold, sdold.keys(), 100),
        do_test(sdnew, sdnew.keys(), 100) )
    print "timing for bad integers old=%.3f new=%.3f" % (
        do_test(bdold, bdold.keys(), 10) *10,
        do_test(bdnew, bdnew.keys(), 10) *10)

"""
Results with a shift of 10 (EXTREME=0):

D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):

D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""

From lutz at rmi.net  Sun Dec 17 20:09:47 2000
From: lutz at rmi.net (Mark Lutz)
Date: Sun, 17 Dec 2000 12:09:47 -0700
Subject: [Python-Dev] Death to string functions!
References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <001f01c0685c$ef555200$7bdb5da6@vaio>

As a longstanding Python advocate and user, I find this thread disturbing, and feel compelled to add a few words:

> > [Tim wrote:]
> > "string" is right up there with "os" and "sys" as a FIM (Frequently
> > Imported Module), so the required code changes will be massive. As
> > a user, I don't see what's in it for me to endure that pain: the
> > string module functions work fine! Neither are they warts in the
> > language, any more than that we say sin(pi) instead of pi.sin().
> > Keeping the functions around doesn't hurt anybody that I can see.
>
> [Guido wrote:]
> Hm. I'm not saying that this one will be easy. But I don't like
> having "two ways to do it". It means more learning, etc. (you know
> the drill).

But with all due respect, there are already _lots_ of places in Python that provide at least two ways to do something already. Why be so strict on this one alone? Consider lambda and def; tuples and lists; map and for loops; the loop else and boolean exit flags; and so on. The notion of Python forcing a single solution is largely a myth.
And as someone who makes a living teaching this stuff, I can tell you that none of the existing redundancies prevent anyone from learning Python.

More to the point, many of those shiny new features added to 2.0 fall squarely into this category too, and are completely redundant with other tools. Consider list comprehensions and simple loops; extended print statements and sys.std* assignments; augmented assignment statements and simpler ones. Eliminating redundancy at a time when we're also busy introducing it seems a tough goal to sell. I understand the virtues of aesthetics too, but removing the string module seems an incredibly arbitrary application of it.

> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.
>
> [...]
> Ideally, I'd like to deprecate the entire string module, so that I
> can place a single warning at its top. This will cause a single
> warning to be issued for programs that still use it (no matter how
> many times it is imported).

And to me, this seems the real crux of the matter. For a decade now, the string module _has_ been the right way to do it. And today, half a million Python developers absolutely rely on it as an essential staple in their toolbox. What could possibly be wrong with keeping it around for backward compatibility, albeit as a less recommended option?

If almost every Python program ever written suddenly starts issuing warning messages, then I think we do have a problem indeed. Frankly, a Python that changes without regard to its user base seems an ominous thing to me. And keep in mind that I like Python; others will look much less generously upon a tool that seems inclined to rip the rug out from under its users. Trust me on this; I've already heard the rumblings out there.

So please: can we keep string around? Like it or not, we're way past the point of removing such core modules at this point.
Such a radical change might pass in a future non-backward-compatible Python mutation; I'm not sure such a different system will still be "Python", but that's a topic for another day.

All IMHO, of course,

--Mark Lutz  (http://www.rmi.net/~lutz)

From tim.one at home.com  Sun Dec 17 20:50:55 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 17 Dec 2000 14:50:55 -0500
Subject: [Python-Dev] SourceForge SSH silliness
Message-ID:

Starting last night, I get this msg whenever I update Python code w/ CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python:

"""
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the host key has just been changed.
Please contact your system administrator.
Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message.
Password authentication is disabled to avoid trojan horses.
"""

This is SourceForge's doing, and is permanent (they've changed keys on their end). Here's a link to a thread that may or may not make sense to you:

http://sourceforge.net/forum/forum.php?forum_id=52867

Deleting the sourceforge entries from my .ssh/known_hosts file worked for me. But everyone in the thread above who tried it says that they haven't been able to get scp working again (I haven't tried it yet ...).

From paulp at ActiveState.com  Sun Dec 17 21:04:27 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 17 Dec 2000 12:04:27 -0800
Subject: [Python-Dev] Pragmas and warnings
Message-ID: <3A3D1C4B.8F08A744@ActiveState.com>

A couple of other threads started me to thinking that there are a couple of things missing from our warnings framework.

Many languages have pragmas that allow you to turn warnings on and off in code.
For instance, I should be able to put a pragma at the top of a module that uses string functions to say: "I know that this module doesn't adhere to the latest Python conventions. Please don't warn me about it." I should also be able to put a declaration that says: "I'm really paranoid about shadowing globals and builtins. Please warn me when I do that."

Batch and visual linters could also use the declarations to customize their behaviors.

And of course we have a stack of other features that could use pragmas:

 * type signatures
 * Unicode syntax declarations
 * external object model language binding hints
 * ...

A case could be made that warning pragmas could use a totally different syntax from "user-defined" pragmas. I don't care much.

 Paul

From thomas at xs4all.net  Sun Dec 17 22:00:08 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 17 Dec 2000 22:00:08 +0100
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: ; from tim.one@home.com on Sun, Dec 17, 2000 at 02:50:55PM -0500
References:
Message-ID: <20001217220008.D29681@xs4all.nl>

On Sun, Dec 17, 2000 at 02:50:55PM -0500, Tim Peters wrote:

> Starting last night, I get this msg whenever I update Python code w/
> CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python:
> """
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> Someone could be eavesdropping on you right now (man-in-the-middle attack)!
> It is also possible that the host key has just been changed.
> Please contact your system administrator.
> Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message.
> Password authentication is disabled to avoid trojan horses.
> """
> This is SourceForge's doing, and is permanent (they've changed keys on their
> end).
> Here's a link to a thread that may or may not make sense to you:
> http://sourceforge.net/forum/forum.php?forum_id=52867
> Deleting the sourceforge entries from my .ssh/known_hosts file worked for
> me. But everyone in the thread above who tried it says that they haven't
> been able to get scp working again (I haven't tried it yet ...).

What sourceforge did was switch Linux distributions, and upgrade. The switch doesn't really matter for the SSH problem, because recent Debian and recent RedHat releases both use a new ssh, the OpenBSD ssh implementation. Apparently, it isn't entirely backwards compatible to old versions of F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This might or might not be your problem; if it is, you should get a decent message that gives a relatively clear message such as 'cypher type 'idea' not supported'. You should be able to pass the '-c' option to scp/ssh to use a different cypher, like 3des (aka triple-des.) Or maybe the windows versions have a menu to configure that kind of thing :)

Another possible problem is that it might not have good support for older protocol versions. The 'current' protocol version, at least for 'ssh1', is 1.5. The one message on the sourceforge thread above that actually mentions a version in the *cough* bugreport is using an older ssh that only supports protocol version 1.4. Since that particular version of F-secure ssh has known problems (why else would they release 16 more versions ?) I'd suggest anyone with problems first try a newer version. I hope that doesn't break WinCVS, but it would suck if it did :P

If that doesn't work, which is entirely possible, it might be an honest bug in the OpenBSD ssh that Sourceforge is using. If anyone cared, we could do a bit of experimenting with the openssh-2.0 betas installed by Debian woody (unstable) to see if the problem occurs there as well.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
From greg at cosc.canterbury.ac.nz  Mon Dec 18 00:05:41 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Dec 2000 12:05:41 +1300 (NZDT)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il>
Message-ID: <200012172305.MAA02512@s454.cosc.canterbury.ac.nz>

Moshe Zadka:

> Perl and Scheme permit implicit shadowing too.

But Scheme always requires declarations!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz     +--------------------------------------+

From martin at loewis.home.cs.tu-berlin.de  Mon Dec 18 00:45:56 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 18 Dec 2000 00:45:56 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de>

> But with all due respect, there are already _lots_ of places in
> Python that provide at least two ways to do something already.

Exactly. My favourite one here is string exceptions, which have quite some analogy to the string module. At some time, there were only string exceptions. Then, instance exceptions were added, some releases later they were considered the better choice, so the standard library was converted to use them. Still, there is no sign whatsoever that anybody plans to deprecate string exceptions.

I believe the string module will get less importance over time. Comparing it with string exceptions, that may well be 5 years. It seems there are two ways of "deprecation": a loud "we will remove that, change your code", and a silent "strings have methods" (i.e. don't mention the module when educating people). The latter approach requires educators to agree that the module is "uninteresting", and people to really not use it once they find out it exists.
I think deprecation should only be attempted once there is a clear sign that people don't use it massively for new code anymore. Removal should only occur if keeping the module around is more pain than it is worth.

Regards,
Martin

From skip at mojam.com  Mon Dec 18 00:55:10 2000
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 17 Dec 2000 17:55:10 -0600 (CST)
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
Message-ID: <14909.21086.92774.940814@beluga.mojam.com>

I executed cvs update today (removing the sourceforge machines from .ssh/known_hosts worked fine for me, btw) followed by a configure and a make clean. The last step failed with this output:

    ...
    make[1]: Entering directory `/home/beluga/skip/src/python/dist/src/Modules'
    Makefile.pre.in:20: *** missing separator.  Stop.
    make[1]: Leaving directory `/home/beluga/skip/src/python/dist/src/Modules'
    make: [clean] Error 2 (ignored)

I found the following at line 20 of Modules/Makefile.pre.in:

    @SET_CXX@

I then tried a cvs annotate on that file but saw that line 20 had been there since rev 1.60 (16-Dec-99). I then checked the top-level Makefile.in thinking something must have changed in the clean target recently, but cvs annotate shows no recent changes there either:

    1.1  (guido 24-Dec-93): clean: localclean
    1.1  (guido 24-Dec-93): 	-for i in $(SUBDIRS); do \
    1.74 (guido 19-May-98): 		if test -d $$i; then \
    1.24 (guido 20-Jun-96): 		(echo making clean in subdirectory $$i; cd $$i; \
    1.4  (guido 01-Aug-94): 		if test -f Makefile; \
    1.4  (guido 01-Aug-94): 		then $(MAKE) clean; \
    1.4  (guido 01-Aug-94): 		else $(MAKE) -f Makefile.*in clean; \
    1.4  (guido 01-Aug-94): 		fi); \
    1.74 (guido 19-May-98): 	else true; fi; \
    1.1  (guido 24-Dec-93): done

Make distclean succeeded so I tried the following:

    make distclean
    ./configure
    make clean

but the last step still failed. Any idea why make clean is now failing (for me)? Can anyone else reproduce this problem?
Skip

From greg at cosc.canterbury.ac.nz  Mon Dec 18 01:02:32 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Dec 2000 13:02:32 +1300 (NZDT)
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>
Message-ID: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>

"M.-A. Lemburg":

> Format characters will always
> be ASCII and thus 7-bit -- there's really no need to expand the
> set of possibilities beyond 8 bits ;-)

But the error message is being produced because the character is NOT a valid format character. One of the reasons for that might be because it's not in the 7-bit range!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz     +--------------------------------------+

From MarkH at ActiveState.com  Mon Dec 18 07:02:27 2000
From: MarkH at ActiveState.com (Mark Hammond)
Date: Mon, 18 Dec 2000 17:02:27 +1100
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
In-Reply-To: <14909.21086.92774.940814@beluga.mojam.com>
Message-ID:

> I found the following at line 20 of Modules/Makefile.pre.in:
>
> @SET_CXX@

I don't have time to investigate this specific problem, but I definitely had problems with SET_CXX around 6 months back. This was trying to build an external C++ application, so may be different. My message and other followups at the time implied no one really knew and everyone agreed it was likely SET_CXX was broken :-(

I even referenced the CVS checkin that I thought broke it.

Mark.

From mal at lemburg.com  Mon Dec 18 10:58:37 2000
From: mal at lemburg.com (M.-A.
Lemburg)
Date: Mon, 18 Dec 2000 10:58:37 +0100
Subject: [Python-Dev] Pragmas and warnings
References: <3A3D1C4B.8F08A744@ActiveState.com>
Message-ID: <3A3DDFCD.34AB05B2@lemburg.com>

Paul Prescod wrote:
>
> A couple of other threads started me to thinking that there are a couple
> of things missing from our warnings framework.
>
> Many languages have pragmas that allow you to turn warnings on and off in
> code. For instance, I should be able to put a pragma at the top of a
> module that uses string functions to say: "I know that this module
> doesn't adhere to the latest Python conventions. Please don't warn me
> about it." I should also be able to put a declaration that says: "I'm
> really paranoid about shadowing globals and builtins. Please warn me
> when I do that."
>
> Batch and visual linters could also use the declarations to customize
> their behaviors.
>
> And of course we have a stack of other features that could use pragmas:
>
> * type signatures
> * Unicode syntax declarations
> * external object model language binding hints
> * ...
>
> A case could be made that warning pragmas could use a totally different
> syntax from "user-defined" pragmas. I don't care much.

There was a long thread about this some months ago. We agreed to add a new keyword to the language (I think it was "define") which then uses a very simple syntax which can be interpreted at compile time to modify the behaviour of the compiler, e.g.

	define <symbol> = <constant>

There was also a discussion about allowing limited forms of expressions instead of the constant literal.

	define source_encoding = "utf-8"

was the original motivation for this, but (as always ;) the usefulness for other application areas was quickly recognized, e.g. to enable compilation in optimization mode on a per module basis.

PS: "define" is perhaps not obscure enough as keyword...
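A compile-time `define` of this kind could be recognized by a pre-pass over the module source. The following is a hypothetical illustration only, written in present-day Python: the `scan_pragmas` helper and the exact `define <symbol> = <constant>` grammar are assumptions, not anything the compiler actually supports.

```python
import re

# Hypothetical pragma scanner: collect "define <symbol> = <constant>"
# lines from the top of a module, stopping at the first real statement,
# so the settings are known before the rest of the file is compiled.
PRAGMA = re.compile(r'^define\s+([A-Za-z_]\w*)\s*=\s*(.+)$')

def scan_pragmas(source):
    pragmas = {}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue        # comments and blank lines may precede pragmas
        m = PRAGMA.match(stripped)
        if m is None:
            break           # first non-pragma statement ends the section
        name, literal = m.groups()
        pragmas[name] = eval(literal, {}, {})   # constant literals only
    return pragmas

print(scan_pragmas('# demo module\ndefine source_encoding = "utf-8"\nx = 1\n'))
# -> {'source_encoding': 'utf-8'}
```

Because the scan stops at the first ordinary statement, a later `define`-shaped line inside real code would simply be ignored.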
--
Marc-Andre Lemburg
______________________________________________________________________
Company:         http://www.egenix.com/
Consulting:      http://www.lemburg.com/
Python Pages:    http://www.lemburg.com/python/

From mal at lemburg.com  Mon Dec 18 11:04:08 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:04:08 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>
Message-ID: <3A3DE118.3355896D@lemburg.com>

Greg Ewing wrote:
>
> "M.-A. Lemburg":
>
> > Format characters will always
> > be ASCII and thus 7-bit -- there's really no need to expand the
> > set of possibilities beyond 8 bits ;-)
>
> But the error message is being produced because the
> character is NOT a valid format character. One of the
> reasons for that might be because it's not in the
> 7-bit range!

True. I think removing %c completely in that case is the right solution (in case you don't want to convert the Unicode char using the default encoding to a string first).

--
Marc-Andre Lemburg
______________________________________________________________________
Company:         http://www.egenix.com/
Consulting:      http://www.lemburg.com/
Python Pages:    http://www.lemburg.com/python/

From mal at lemburg.com  Mon Dec 18 11:09:16 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:09:16 +0100
Subject: [Python-Dev] What to do about PEP 229?
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com>
	<3A3A9D96.80781D61@lemburg.com>
	<20001216191739.B6703@kronos.cnri.reston.va.us>
Message-ID: <3A3DE24C.DA0B2F6C@lemburg.com>

Andrew Kuchling wrote:
>
> On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote:
> >Can't distutils try both and then settle for the working combination ?
>
> I'm worried about subtle problems; what if an unneeded -lfoo drags in
> a customized malloc, or has symbols which conflict with some other
> library.

In that case, I think the user will have to decide.
setup.py should then default to not integrating the module in question and issue a warning telling the user what to look for and how to call setup.py in order to add the right combination of libs.

> >... BTW, where is Greg ? I haven't heard from him in quite a while.]
>
> Still around; he just hasn't been posting much these days.

Good to know :)

> >Why not parse Setup and use it as input to distutils setup.py ?
>
> That was option 1. The existing Setup format doesn't really contain
> enough intelligence, though; the intelligence is usually in comments
> such as "Uncomment the following line for Solaris". So either the
> Setup format is modified (bad, since we'd break existing 3rd-party
> packages that still use a Makefile.pre.in), or I give up and just do
> everything in a setup.py.

I would still like a simple input to setup.py -- one that doesn't require hacking setup.py just to enable a few more modules.

--
Marc-Andre Lemburg
______________________________________________________________________
Company:         http://www.egenix.com/
Consulting:      http://www.lemburg.com/
Python Pages:    http://www.lemburg.com/python/

From fredrik at effbot.org  Mon Dec 18 11:15:26 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 18 Dec 2000 11:15:26 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>
	<3A3DE118.3355896D@lemburg.com>
Message-ID: <004a01c068db$72403170$3c6340d5@hagrid>

mal wrote:

> > But the error message is being produced because the
> > character is NOT a valid format character. One of the
> > reasons for that might be because it's not in the
> > 7-bit range!
>
> True.
>
> I think removing %c completely in that case is the right
> solution (in case you don't want to convert the Unicode char
> using the default encoding to a string first).

how likely is it that a human programmer will use a bad formatting character that's not in the ASCII range?
-1 on removing it -- people shouldn't have to learn the octal ASCII table just to be able to fix trivial typos.

+1 on mapping the character back to a string in the same way as "repr" -- that is, print ASCII characters as is, map anything else to an octal escape.

+0 on leaving it as it is, or mapping non-printables to "?".

From mal at lemburg.com  Mon Dec 18 11:34:02 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:34:02 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
	<3A32615E.D39B68D2@tismer.com>
	<033701c06366$ab746580$0900a8c0@SPIFF>
	<3A34DB7C.FF7E82CE@tismer.com>
	<3A3CFA17.ED26F51A@tismer.com>
Message-ID: <3A3DE81A.4B825D89@lemburg.com>

> Here some results, dictionaries have 1000 entries:
>
> timing for strings old= 5.097 new= 5.088
> timing for bad integers (<<10) old=101.540 new=12.610
> timing for bad integers (<<16) old=571.210 new=19.220

Even though I think concentrating on string keys would provide more performance boost for Python in general, I think you have a point there. +1 from here.

BTW, would changing the hash function on strings from the simple XOR scheme to something a little smarter help improve the performance too (e.g. most strings used in programming never use the 8-th bit) ?

I also think that we could inline the string compare function in dictobject:lookdict_string to achieve even better performance. Currently it uses a function which doesn't trigger compiler inlining.

And finally: I think a generic PyString_Compare() API would be useful in a lot of places where strings are being compared (e.g. dictionaries and keyword parameters). Unicode already has such an API (along with dozens of other useful APIs which are not available for strings).
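The "bad integer" timings quoted above have a simple cause that can be checked directly: with a power-of-two table size, the home slot is just the hash masked down to its low bits, and hashes of the form i << 16 carry no information there. A small illustration in present-day Python (the `slot` helper is a stand-in for the masking step in lookdict, and hash(i) is taken to be i):

```python
def slot(h, size):
    # open-addressed tables with power-of-two size pick the home slot
    # by masking off all but the low bits of the hash
    return h & (size - 1)

size = 1024
good = {slot(i, size) for i in range(1000)}       # well-spread hashes
bad = {slot(i << 16, size) for i in range(1000)}  # high-power-of-two hashes

print(len(good))  # 1000 distinct home slots
print(len(bad))   # 1 -- every key starts its probe from slot 0
```

With every key probing from the same slot, each lookup degenerates into the linear search the timings show.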
--
Marc-Andre Lemburg
______________________________________________________________________
Company:         http://www.egenix.com/
Consulting:      http://www.lemburg.com/
Python Pages:    http://www.lemburg.com/python/

From mal at lemburg.com  Mon Dec 18 11:41:38 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:41:38 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>
	<3A3DE118.3355896D@lemburg.com>
	<004a01c068db$72403170$3c6340d5@hagrid>
Message-ID: <3A3DE9E2.77FF0FA9@lemburg.com>

Fredrik Lundh wrote:
>
> mal wrote:
>
> > > But the error message is being produced because the
> > > character is NOT a valid format character. One of the
> > > reasons for that might be because it's not in the
> > > 7-bit range!
> >
> > True.
> >
> > I think removing %c completely in that case is the right
> > solution (in case you don't want to convert the Unicode char
> > using the default encoding to a string first).
>
> how likely is it that a human programmer will use a bad formatting
> character that's not in the ASCII range?

Not very likely... the most common case of this error is probably the use of % as percent sign in a formatting string. The next character in those cases is usually whitespace.

> -1 on removing it -- people shouldn't have to learn the octal ASCII
> table just to be able to fix trivial typos.
>
> +1 on mapping the character back to a string in the same way as
> "repr" -- that is, print ASCII characters as is, map anything else to
> an octal escape.
>
> +0 on leaving it as it is, or mapping non-printables to "?".

Agreed.
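Fredrik's "+1" option, mapping the offending character back to a printable form the way repr does, might look like the sketch below in present-day Python. The function name and the exact message wording are illustrative assumptions, not the code that went into the interpreter:

```python
def bad_format_error(ch):
    # show printable ASCII as-is; map anything else to a backslash
    # escape so nobody has to consult an octal ASCII table
    if ' ' <= ch <= '~':
        shown = ch
    else:
        shown = ch.encode('unicode_escape').decode('ascii')
    return "unsupported format character '%s' (0x%x)" % (shown, ord(ch))

print(bad_format_error('?'))
# -> unsupported format character '?' (0x3f)
print(bad_format_error('\u20ac'))
# -> unsupported format character '\u20ac' (0x20ac)
```

Either way the hex code is included, so the message stays unambiguous even when the shown character is ordinary-looking whitespace.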
--
Marc-Andre Lemburg
______________________________________________________________________
Company:         http://www.egenix.com/
Consulting:      http://www.lemburg.com/
Python Pages:    http://www.lemburg.com/python/

From tismer at tismer.com  Mon Dec 18 12:08:34 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 18 Dec 2000 13:08:34 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
	<3A32615E.D39B68D2@tismer.com>
	<033701c06366$ab746580$0900a8c0@SPIFF>
	<3A34DB7C.FF7E82CE@tismer.com>
	<3A3CFA17.ED26F51A@tismer.com>
	<3A3DE81A.4B825D89@lemburg.com>
Message-ID: <3A3DF032.5F86AD15@tismer.com>

"M.-A. Lemburg" wrote:
>
> > Here some results, dictionaries have 1000 entries:
> >
> > timing for strings old= 5.097 new= 5.088
> > timing for bad integers (<<10) old=101.540 new=12.610
> > timing for bad integers (<<16) old=571.210 new=19.220
>
> Even though I think concentrating on string keys would provide more
> performance boost for Python in general, I think you have a point
> there. +1 from here.
>
> BTW, would changing the hash function on strings from the simple
> XOR scheme to something a little smarter help improve the performance
> too (e.g. most strings used in programming never use the 8-th
> bit) ?

Yes, it would. I spent the rest of last night doing more accurate tests, also refined the implementation (using longs for the shifts etc), and turned from timing over to trip counting, i.e. a dict counts every round through the re-hash. That showed two things:

- The bits used from the string hash are not well distributed
- using a "warmup wheel" on the hash to suck all bits in gives the same quality of hashes as random numbers.

I will publish some results later today.

> I also think that we could inline the string compare function
> in dictobject:lookdict_string to achieve even better performance.
> Currently it uses a function which doesn't trigger compiler inlining.

Sure!

ciao - chris

--
Christian Tismer             :^)
Mission Impossible 5oftware  :    Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :    PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9  9D15 D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com

From guido at python.org  Mon Dec 18 15:20:22 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:20:22 -0500
Subject: [Python-Dev] The Dictionary Gem is polished!
In-Reply-To: Your message of "Sun, 17 Dec 2000 19:38:31 +0200." <3A3CFA17.ED26F51A@tismer.com>
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
	<3A32615E.D39B68D2@tismer.com>
	<033701c06366$ab746580$0900a8c0@SPIFF>
	<3A34DB7C.FF7E82CE@tismer.com>
	<3A3CFA17.ED26F51A@tismer.com>
Message-ID: <200012181420.JAA25063@cj20424-a.reston1.va.home.com>

> Problem: There are hash functions which are "good" in this sense,
> but they do not spread their randomness uniformly over the
> 32 bits.
>
> Example: Integers use their own value as hash.
> This is ok, as far as the integers are uniformly distributed.
> But if they all contain a high power of two, for instance,
> the low bits give a very bad hash function.
>
> Take a dictionary with integers range(1000) as keys and access
> all entries. Then use a dictionary with the integers shifted
> left by 16.
> Access time is slowed down by a factor of 100, since every
> access is a linear search now.

Ai. I think what happened is this: long ago, the hash table sizes were primes, or at least not powers of two!

I'll leave it to the more mathematically-inclined to judge your solution...

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Dec 18 15:52:35 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:52:35 -0500
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
In-Reply-To: Your message of "Sun, 17 Dec 2000 17:55:10 CST." <14909.21086.92774.940814@beluga.mojam.com>
References: <14909.21086.92774.940814@beluga.mojam.com>
Message-ID: <200012181452.JAA04372@cj20424-a.reston1.va.home.com>

> Make distclean succeeded so I tried the following:
>
>     make distclean
>     ./configure
>     make clean
>
> but the last step still failed. Any idea why make clean is now failing (for
> me)? Can anyone else reproduce this problem?

Yes. I don't understand it, but this takes care of it:

    make distclean
    ./configure
    make Makefiles    # <--------- !!!
    make clean

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Dec 18 15:54:20 2000
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Dec 2000 09:54:20 -0500
Subject: [Python-Dev] Pragmas and warnings
In-Reply-To: Your message of "Mon, 18 Dec 2000 10:58:37 +0100." <3A3DDFCD.34AB05B2@lemburg.com>
References: <3A3D1C4B.8F08A744@ActiveState.com>
	<3A3DDFCD.34AB05B2@lemburg.com>
Message-ID: <200012181454.JAA04394@cj20424-a.reston1.va.home.com>

> There was a long thread about this some months ago. We agreed
> to add a new keyword to the language (I think it was "define")

I don't recall agreeing. :-)

This is PEP material. For 2.2, please!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From mal at lemburg.com  Mon Dec 18 15:56:33 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 15:56:33 +0100
Subject: [Python-Dev] Pragmas and warnings
References: <3A3D1C4B.8F08A744@ActiveState.com>
	<3A3DDFCD.34AB05B2@lemburg.com>
	<200012181454.JAA04394@cj20424-a.reston1.va.home.com>
Message-ID: <3A3E25A1.DFD2BDBF@lemburg.com>

Guido van Rossum wrote:
>
> > There was a long thread about this some months ago. We agreed
> > to add a new keyword to the language (I think it was "define")
>
> I don't recall agreeing. :-)

Well, maybe it was a misinterpretation on my part... you said something like "add a new keyword and live with the consequences".
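Guido's historical note about prime table sizes is easy to verify: a prime modulus folds the high bits of the hash into the slot index, so the shifted keys that collapse under a power-of-two mask stay spread out. A quick check in present-day Python (the sizes 1024 and 1021 are arbitrary, 1021 being prime, and hash(i) is taken to be i):

```python
def spread(hashes, size, prime):
    # distinct home slots under a power-of-two mask vs. a prime modulus
    masked = len({h & (size - 1) for h in hashes})
    modded = len({h % prime for h in hashes})
    return masked, modded

hashes = [i << 16 for i in range(1000)]   # the "bad integer" keys
masked, modded = spread(hashes, 1024, 1021)
print(masked, modded)  # -> 1 1000
```

The trade-off, of course, is that a modulo by a prime costs a division on every lookup, where the mask is a single AND, which is why the thread looks for a probe sequence that repairs the mask's blind spot instead.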
ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From guido at python.org Mon Dec 18 15:20:22 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 09:20:22 -0500 Subject: [Python-Dev] The Dictionary Gem is polished! In-Reply-To: Your message of "Sun, 17 Dec 2000 19:38:31 +0200." <3A3CFA17.ED26F51A@tismer.com> References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> Message-ID: <200012181420.JAA25063@cj20424-a.reston1.va.home.com> > Problem: There are hash functions which are "good" in this sense, > but they do not spread their randomness uniformly over the > 32 bits. > > Example: Integers use their own value as hash. > This is ok, as far as the integers are uniformly distributed. > But if they all contain a high power of two, for instance, > the low bits give a very bad hash function. > > Take a dictionary with integers range(1000) as keys and access > all entries. Then use a dictionay with the integers shifted > left by 16. > Access time is slowed down by a factor of 100, since every > access is a linear search now. Ai. I think what happened is this: long ago, the hash table sizes were primes, or at least not powers of two! I'll leave it to the more mathematically-inclined to judge your solution... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Dec 18 15:52:35 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 09:52:35 -0500 Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in? 
In-Reply-To: Your message of "Sun, 17 Dec 2000 17:55:10 CST." <14909.21086.92774.940814@beluga.mojam.com> References: <14909.21086.92774.940814@beluga.mojam.com> Message-ID: <200012181452.JAA04372@cj20424-a.reston1.va.home.com> > Make distclean succeeded so I tried the following: > > make distclean > ./configure > make clean > > but the last step still failed. Any idea why make clean is now failing (for > me)? Can anyone else reproduce this problem? Yes. I don't understand it, but this takes care of it: make distclean ./configure make Makefiles # <--------- !!! make clean --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Dec 18 15:54:20 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 09:54:20 -0500 Subject: [Python-Dev] Pragmas and warnings In-Reply-To: Your message of "Mon, 18 Dec 2000 10:58:37 +0100." <3A3DDFCD.34AB05B2@lemburg.com> References: <3A3D1C4B.8F08A744@ActiveState.com> <3A3DDFCD.34AB05B2@lemburg.com> Message-ID: <200012181454.JAA04394@cj20424-a.reston1.va.home.com> > There was a long thread about this some months ago. We agreed > to add a new keyword to the language (I think it was "define") I don't recall agreeing. :-) This is PEP material. For 2.2, please! --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Mon Dec 18 15:56:33 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 18 Dec 2000 15:56:33 +0100 Subject: [Python-Dev] Pragmas and warnings References: <3A3D1C4B.8F08A744@ActiveState.com> <3A3DDFCD.34AB05B2@lemburg.com> <200012181454.JAA04394@cj20424-a.reston1.va.home.com> Message-ID: <3A3E25A1.DFD2BDBF@lemburg.com> Guido van Rossum wrote: > > > There was a long thread about this some months ago. We agreed > > to add a new keyword to the language (I think it was "define") > > I don't recall agreeing. :-) Well, maybe it was a misinterpretation on my part... you said something like "add a new keyword and live with the consequences". 
AFAIR, of course :-) > This is PEP material. For 2.2, please! Ok. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at python.org Mon Dec 18 16:15:26 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 10:15:26 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Sun, 17 Dec 2000 12:09:47 MST." <001f01c0685c$ef555200$7bdb5da6@vaio> References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> <001f01c0685c$ef555200$7bdb5da6@vaio> Message-ID: <200012181515.KAA04571@cj20424-a.reston1.va.home.com> [Mark Lutz] > So please: can we keep string around? Like it or not, we're > way past the point of removing such core modules at this point. Of course we're keeping string around. I already said that for backwards compatibility reasons it would not disappear before Py3K. I think there's a misunderstanding about the meaning of deprecation, too. That word doesn't mean to remove a feature. It doesn't even necessarily mean to warn every time a feature is used. It just means (to me) that at some point in the future the feature will change or disappear, there's a new and better way to do it, and that we encourage users to start using the new way, to save them from work later. In my mind, there's no reason to start emitting warnings about every deprecated feature. The warnings are only needed late in the deprecation cycle. PEP 5 says "There must be at least a one-year transition period between the release of the transitional version of Python and the release of the backwards incompatible version." Can we now stop getting all bent out of shape over this? String methods *are* recommended over equivalent string functions. Those string functions *are* already deprecated, in the informal sense (i.e. just that it is recommended to use string methods instead). 
This *should* (take notice, Fred!) be documented per 2.1. We won't however be issuing run-time warnings about the use of string functions until much later. (Lint-style tools may start warning sooner -- that's up to the author of the lint tool to decide.) Note that I believe Java makes a useful distinction that PEP 5 misses: it defines both deprecated features and obsolete features. *Deprecated* features are simply features for which a better alternative exists. *Obsolete* features are features that are only being kept around for backwards compatibility. Deprecated features may also be (and usually are) *obsolescent*, meaning they will become obsolete in the future. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Dec 18 16:22:09 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 10:22:09 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Mon, 18 Dec 2000 00:45:56 +0100." <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> References: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> Message-ID: <200012181522.KAA04597@cj20424-a.reston1.va.home.com> > At some time, there were only string exceptions. Then, instance > exceptions were added, some releases later they were considered the > better choice, so the standard library was converted to use them. > Still, there is no sign whatsoever that anybody plans to deprecate > string exceptions. Now there is: I hereby state that I officially deprecate string exceptions. Py3K won't support them, and it *may* even require that all exception classes are derived from Exception. > I believe the string module will get less importance over > time. Comparing it with string exception, that may be well 5 years. > It seems there are two ways of "deprecation": a loud "we will remove > that, change your code", and a silent "strings have methods" > (i.e. don't mention the module when educating people). 
The latter > approach requires educators to agree that the module is > "uninteresting", and people to really not use once they find out it > exists. Exactly. This is what I hope will happen. I certainly hope that Mark Lutz has already started teaching string methods! > I think deprecation should be only attempted once there is a clear > sign that people don't use it massively for new code anymore. Right. So now we're on the first step: get the word out! > Removal should only occur if keeping the module [is] less pain than > maintaining it. Exactly. Guess where the string module falls today. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From Barrett at stsci.edu Mon Dec 18 17:50:49 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Mon, 18 Dec 2000 11:50:49 -0500 (EST) Subject: [Python-Dev] PEP 207 -- Rich Comparisons Message-ID: <14910.16431.554136.374725@nem-srvr.stsci.edu> Guido van Rossum writes: > > > > 1. The current boolean operator behavior does not have to change, and > > hence will be backward compatible. > > What incompatibility do you see in the current proposal? You have to choose between using rich comparisons or boolean comparisons. You can't use both for the same (rich/complex) object. > > 2. It eliminates the need to decide whether or not rich comparisons > > takes precedence over boolean comparisons. > > Only if you want different semantics -- that's only an issue for NumPy. No. I think NumPy is the tip of the iceberg, when discussing new semantics. Most users don't consider these broader semantic issues, because Python doesn't give them the opportunity to do so. I can see possible scenarios of using both boolean and non-boolean comparisons for Python lists and dictionaries in addition to NumPy. I chose to use Python because it provides a richer framework than other languages. When Python fails to provide such benefits, I'll move to another language. 
I moved from PERL to Python because the multi-dimensional array syntax is vastly better in Python than PERL, though as a novice I don't have to know that it exists. What I'm proposing here is in a similar vein. > > 3. The new operators add additional behavior without directly impacting > > current behavior and the use of them is unambigous, at least in > > relation to current Python behavior. You know by the operator what > > type of comparison will be returned. This should appease Jim > > Fulton, based on his arguments in 1998 about comparison operators > > always returning a boolean value. > > As you know, I'm now pretty close to Jim. :-) He seemed pretty mellow > about this now. Yes, I would hope so! It appears though that you misunderstand me. My point was that I tend to agree with Jim Fulton's arguments for a limited interpretation of the current comparison operators. I too expect them to return a boolean result. I have never felt comfortable using such comparison operators in an array context, e.g. as in the array language, IDL. It just looks wrong. So my suggestion is to create new ones whose implicit meaning is to provide element-wise or rich comparison behavior. And to add similar behavior for the other operators for consistency. Can someone provide an example in mathematics where comparison operators are used in a non-boolean, ie. rich comparison, context. If so, this might shut me up! > > 4. Compound objects, such as lists, could implement both rich > > and boolean comparisons. The boolean comparison would remain as > > is, while the rich comparison would return a list of boolean > > values. Current behavior doesn't change; just a new feature, which > > you may or may not choose to use, is added. > > > > If we go one step further and add the matrix-style operators along > > with the comparison operators, we can provide a consistent user > > interface to array/complex operations without changing current Python > > behavior. 
If a user has no need for these new operators, he doesn't > > have to use them or even know about them. All we've done is made > > Python richer, but I believe with making it more complex. For Phrase should be: "but I believe without making it more complex.". ------- > > example, all element-wise operations could have a ':' appended to > > them, e.g. '+:', '<:', etc.; and will define element-wise addition, > > element-wise less-than, etc. The traditional '*', '/', etc. operators > > can then be used for matrix operations, which will appease the Matlab > > people. > > > > Therefore, I don't think rich comparisons and matrix-type operators > > should be considered separable. I really think you should consider > > this suggestion. It appeases many groups while providing a consistent > > and clear user interface, while greatly impacting current Python > > behavior. The last phrase should read: "while not greatly impacting current --- Python behavior." > > > > Always-causing-havoc-at-the-last-moment-ly Yours, > > I think you misunderstand. Rich comparisons are mostly about allowing > the separate overloading of <, <=, ==, !=, >, and >=. This is useful > in its own light. No, I do understand. I've read most of the early discussions on this issue and one of those issues was about having to choose between boolean and rich comparisons and what should take precedence, when both may be appropriate. I'm suggesting an alternative here. > If you don't want to use this overloading facility for elementwise > comparisons in NumPy, that's fine with me. Nobody says you have to -- > it's just that you *could*. Yes, I understand. > Red my lips: there won't be *any* new operators in 2.1. OK, I didn't expect this to make it into 2.1. > There will a better way to overload the existing Boolean operators, > and they will be able to return non-Boolean results. That's useful in > other situations besides NumPy. Yes, I agree, this should be done anyway. 
I'm just not sure that the implicit meaning that these comparison operators are being given is the best one. I'm just looking for ways to incorporate rich comparisons into a broader framework, numpy just currently happens to be the primary example of this proposal. Assuming the current comparison operator overloading is already implemented and has been used to implement rich comparisons for some objects, then my rich comparison proposal would cause confusion. This is what I'm trying to avoid. > Feel free to lobby for elementwise operators -- but based on the > discussion about this subject so far, I don't give it much of a chance > even past Python 2.1. They would add a lot of baggage to the language > (e.g. the table of operators in all Python books would be about twice > as long) and by far the most users don't care about them. (Read the > intro to 211 for some of the concerns -- this PEP tries to make the > addition palatable by adding exactly *one* new operator.) So! Introductory books don't have to discuss these additional operators. I don't have to know about XML and socket modules to start using Python effectively, nor do I have to know about 'zip' or list comprehensions. These additions decrease the code size and increase efficiency, but don't really add any new expressive power that can't already be done by a 'for' loop. I'll try to convince myself that this suggestion is crazy and not bother you with this issue for awhile. Cheers, Paul From guido at python.org Mon Dec 18 18:18:11 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 12:18:11 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Mon, 18 Dec 2000 11:50:49 EST." <14910.16431.554136.374725@nem-srvr.stsci.edu> References: <14910.16431.554136.374725@nem-srvr.stsci.edu> Message-ID: <200012181718.MAA14030@cj20424-a.reston1.va.home.com> Paul Barret: > > > 1. 
The current boolean operator behavior does not have to change, and > > > hence will be backward compatible. Guido van Rossum: > > What incompatibility do you see in the current proposal? Paul Barret: > You have to choose between using rich comparisons or boolean > comparisons. You can't use both for the same (rich/complex) object. Sure. I thought that the NumPy folks were happy with this. Certainly two years ago they seemed to be. > > > 2. It eliminates the need to decide whether or not rich comparisons > > > takes precedence over boolean comparisons. > > > > Only if you want different semantics -- that's only an issue for NumPy. > > No. I think NumPy is the tip of the iceberg, when discussing new > semantics. Most users don't consider these broader semantic issues, > because Python doesn't give them the opportunity to do so. I can see > possible scenarios of using both boolean and non-boolean comparisons > for Python lists and dictionaries in addition to NumPy. That's the same argument that has been made for new operators all along. I've explained already why they are not on the table for 2.1. > I chose to use Python because it provides a richer framework than > other languages. When Python fails to provide such benefits, I'll > move to another language. I moved from PERL to Python because the > multi-dimensional array syntax is vastly better in Python than PERL, > though as a novice I don't have to know that it exists. What I'm > proposing here is in a similar vein. > > > > 3. The new operators add additional behavior without directly impacting > > > current behavior and the use of them is unambigous, at least in > > > relation to current Python behavior. You know by the operator what > > > type of comparison will be returned. This should appease Jim > > > Fulton, based on his arguments in 1998 about comparison operators > > > always returning a boolean value. > > > > As you know, I'm now pretty close to Jim. :-) He seemed pretty mellow > > about this now. 
> > Yes, I would hope so! > > It appears though that you misunderstand me. My point was that I tend > to agree with Jim Fulton's arguments for a limited interpretation of > the current comparison operators. I too expect them to return a > boolean result. I have never felt comfortable using such comparison > operators in an array context, e.g. as in the array language, IDL. It > just looks wrong. So my suggestion is to create new ones whose > implicit meaning is to provide element-wise or rich comparison > behavior. And to add similar behavior for the other operators for > consistency. > > Can someone provide an example in mathematics where comparison > operators are used in a non-boolean, ie. rich comparison, context. > If so, this might shut me up! Not me (I no longer consider myself a mathematician :-). Why are you requiring an example from math though? Again, you will be able to make this argument to the NumPy folks when they are ready to change the meaning of A<B. > > > 4. Compound objects, such as lists, could implement both rich > > > and boolean comparisons. The boolean comparison would remain as > > > is, while the rich comparison would return a list of boolean > > > values. Current behavior doesn't change; just a new feature, which > > > you may or may not choose to use, is added. > > > > > > If we go one step further and add the matrix-style operators along > > > with the comparison operators, we can provide a consistent user > > > interface to array/complex operations without changing current Python > > > behavior. If a user has no need for these new operators, he doesn't > > > have to use them or even know about them. All we've done is made > > > Python richer, but I believe with making it more complex. For > > Phrase should be: "but I believe without making it more complex.". > ------- > > > > example, all element-wise operations could have a ':' appended to > > > them, e.g.
'+:', '<:', etc.; and will define element-wise addition, > > > element-wise less-than, etc. The traditional '*', '/', etc. operators > > > can then be used for matrix operations, which will appease the Matlab > > > people. > > > > > > Therefore, I don't think rich comparisons and matrix-type operators > > > should be considered separable. I really think you should consider > > > this suggestion. It appeases many groups while providing a consistent > > > and clear user interface, while greatly impacting current Python > > > behavior. > > The last phrase should read: "while not greatly impacting current > --- > Python behavior." I don't see any argument for elementwise operators here that I haven't heard before, and AFAIK it's all in the two PEPs. > > > Always-causing-havoc-at-the-last-moment-ly Yours, > > > > I think you misunderstand. Rich comparisons are mostly about allowing > > the separate overloading of <, <=, ==, !=, >, and >=. This is useful > > in its own light. > > No, I do understand. I've read most of the early discussions on this > issue and one of those issues was about having to choose between > boolean and rich comparisons and what should take precedence, when > both may be appropriate. I'm suggesting an alternative here. Note that Python doesn't decide which should take precedent. The implementer of an individual extension type decides what his comparison operators will return. > > If you don't want to use this overloading facility for elementwise > > comparisons in NumPy, that's fine with me. Nobody says you have to -- > > it's just that you *could*. > > Yes, I understand. > > > Red my lips: there won't be *any* new operators in 2.1. > > OK, I didn't expect this to make it into 2.1. > > > There will a better way to overload the existing Boolean operators, > > and they will be able to return non-Boolean results. That's useful in > > other situations besides NumPy. > > Yes, I agree, this should be done anyway. 
I'm just not sure that the > implicit meaning that these comparison operators are being given is > the best one. I'm just looking for ways to incorporate rich > comparisons into a broader framework, numpy just currently happens to > be the primary example of this proposal. > > Assuming the current comparison operator overloading is already > implemented and has been used to implement rich comparisons for some > objects, then my rich comparison proposal would cause confusion. This > is what I'm trying to avoid. AFAIK, rich comparisons haven't been used anywhere to return non-Boolean results. > > Feel free to lobby for elementwise operators -- but based on the > > discussion about this subject so far, I don't give it much of a chance > > even past Python 2.1. They would add a lot of baggage to the language > > (e.g. the table of operators in all Python books would be about twice > > as long) and by far the most users don't care about them. (Read the > > intro to 211 for some of the concerns -- this PEP tries to make the > > addition palatable by adding exactly *one* new operator.) > > So! Introductory books don't have to discuss these additional > operators. I don't have to know about XML and socket modules to start > using Python effectively, nor do I have to know about 'zip' or list > comprehensions. These additions decrease the code size and increase > efficiency, but don't really add any new expressive power that can't > already be done by a 'for' loop. > > I'll try to convince myself that this suggestion is crazy and not > bother you with this issue for awhile. Happy holidays nevertheless. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at home.com Mon Dec 18 19:38:13 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 18 Dec 2000 13:38:13 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <14910.16431.554136.374725@nem-srvr.stsci.edu> Message-ID: [Paul Barrett] > ... 
> Can someone provide an example in mathematics where comparison > operators are used in a non-boolean, ie. rich comparison, context. > If so, this might shut me up! By my informal accounting, over the years there have been more requests for three-outcome comparison operators than for elementwise ones, although the three-outcome lobby isn't organized so is less visible. It's a natural request for anyone working with partial orderings (a < b -> one of {yes, no, unordered}). Another large group of requests comes from people working with variants of fuzzy logic, where it's desired that the comparison operators be definable to return floats (intuitively corresponding to the probability that the stated relation "is true"). Another desire comes from the symbolic math camp, which would like to be able to-- as is possible for "+", "*", etc --define "<" so that e.g. "x < y" return an object capturing that somebody *asked* for "x < y"; they're not interested in numeric or Boolean results so much as symbolic expressions. "<" is used for all these things in the literature too. Whatever. "<" and friends are just collections of pixels. Add 300 new operator symbols, and people will want to redefine all of them at will too. draw-a-line-in-the-sand-and-the-wind-blows-it-away-ly y'rs - tim From tim.one at home.com Mon Dec 18 21:37:13 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 18 Dec 2000 15:37:13 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > ... > If you're saying that we should give users ample time for the > transition, I'm with you. Then we're with each other, for suitably large values of "ample" . > If you're saying that you think the string module is too prominent to > ever start deprecating its use, I'm afraid we have a problem. We may. Time will tell. It needs a conversion tool, else I think it's unsellable. > ... 
> I'd also like to note that using the string module's wrappers incurs > the overhead of a Python function call -- using string methods is > faster. > > Finally, I like the look of fields[i].strip().lower() much better than > that of string.lower(string.strip(fields[i])) -- an actual example > from mimetools.py. I happen to like string methods better myself; I don't think that's at issue (except that loads of people apparently don't like "join" as a string method -- idiots ). The issue to me is purely breaking old code someday -- "string" is in very heavy use, and unlike as when deprecating regex in favor of re (either pre or especially sre), string methods aren't orders of magnitude better than the old way; and also unlike regex-vs-re it's not the case that the string module has become unmaintainable (to the contrary, string.py has become trivial). IOW, this one would be unprecedented fiddling. > ... > Note that I believe Java makes a useful distinction that PEP 5 misses: > it defines both deprecated features and obsolete features. > *Deprecated* features are simply features for which a better > alternative exists. *Obsolete* features are features that are only > being kept around for backwards compatibility. Deprecated features > may also be (and usually are) *obsolescent*, meaning they will become > obsolete in the future. I agree it would be useful to define these terms, although those particular definitions appear to be missing the most important point from the user's POV (not a one says "going away someday"). A Google search on "java obsolete obsolescent deprecated" doesn't turn up anything useful, so I doubt the usages you have in mind come from Java (it has "deprecated", but doesn't appear to have any well-defined meaning for the others). In keeping with the religious nature of the battle-- and religion offers precise terms for degrees of damnation! 
--I suggest: struggling -- a supported feature; the initial state of all features; may transition to Anathematized anathematized -- this feature is now cursed, but is supported; may transition to Condemned or Struggling; intimacy with Anathematized features is perilous condemned -- a feature scheduled for crucifixion; may transition to Crucified, Anathematized (this transition is called "a pardon"), or Struggling (this transition is called "a miracle"); intimacy with Condemned features is suicidal crucified -- a feature that is no longer supported; may transition to Resurrected resurrected -- a once-Crucified feature that is again supported; may transition to Condemned, Anathematized or Struggling; although since Resurrection is a state of grace, there may be no point in human time at which a feature is identifiably Resurrected (i.e., it may *appear*, to the unenlightened, that a feature moved directly from Crucified to Anathematized or Struggling or Condemned -- although saying so out loud is heresy). From tismer at tismer.com Mon Dec 18 23:58:03 2000 From: tismer at tismer.com (Christian Tismer) Date: Mon, 18 Dec 2000 23:58:03 +0100 Subject: [Python-Dev] The Dictionary Gem is polished! References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> <200012181420.JAA25063@cj20424-a.reston1.va.home.com> Message-ID: <3A3E967B.BE404114@tismer.com> Guido van Rossum wrote: [me, expanding on hashes, integers,and how to tame them cheaply] > Ai. I think what happened is this: long ago, the hash table sizes > were primes, or at least not powers of two! At some time I will wake up and they tell me that I'm reducible :-) > I'll leave it to the more mathematically-inclined to judge your > solution... I love small lists! - ciao - chris +1 (being a member, hopefully) -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From greg at cosc.canterbury.ac.nz Tue Dec 19 00:04:42 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Dec 2000 12:04:42 +1300 (NZDT) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Message-ID: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz> [Paul Barrett] > ... > Can someone provide an example in mathematics where comparison > operators are used in a non-boolean, ie. rich comparison, context. > If so, this might shut me up! Not exactly mathematical, but some day I'd like to create a database access module which lets you say things like mydb = OpenDB("inventory") parts = mydb.parts tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42) Of course, to really make this work I need to be able to overload "and" and "or" as well, but that's a whole 'nother PEP... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Tue Dec 19 00:32:51 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 18:32:51 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Tue, 19 Dec 2000 12:04:42 +1300." 
<200012182304.MAA02642@s454.cosc.canterbury.ac.nz> References: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz> Message-ID: <200012182332.SAA18456@cj20424-a.reston1.va.home.com> > Not exactly mathematical, but some day I'd like to create > a database access module which lets you say things like > > mydb = OpenDB("inventory") > parts = mydb.parts > tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42) > > Of course, to really make this work I need to be able > to overload "and" and "or" as well, but that's a whole > 'nother PEP... Believe it or not, in 1998 we already had a suggestion for overloading these too. This is hinted at in David Ascher's proposal (the Appendix of PEP 208) where objects could define __boolean_and__ to overload x and y. Message-ID: Sounds good to me! It's a very cheap way to get the high bits into play. > i = (~_hash) & mask The ~ here seems like pure superstition to me (and the comments in the C code don't justify it at all -- I added a nag of my own about that the last time I checked in dictobject.c -- and see below for a bad consequence of doing ~). > # note that we do not mask! > # even the shifting may not be worth it. > incr = _hash ^ (_hash >> 3) The shifting was just another cheap trick to get *some* influence from the high bits. It's very limited, though. Toss it (it appears to be from the "random operations yield random results" matchbook school of design). [MAL] > BTW, would changing the hash function on strings from the simple > XOR scheme to something a little smarter help improve the performance > too (e.g. most strings used in programming never use the 8-th > bit) ? Don't understand -- the string hash uses multiplication: x = (1000003*x) ^ *p++; in a loop. Replacing "^" there by "+" should yield slightly better results. As is, string hashes are a lot like integer hashes, in that "consecutive" strings J001 J002 J003 J004 ... yield hashes very close together in value.
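Tim's point about consecutive strings is easy to check. The sketch below keeps only the multiplicative loop he quotes (the real C function also seeds the hash with the first character and XORs in the length at the end, omitted here):

```python
def string_hash_loop(s):
    """Model of the 2.0-era string hash inner loop,
    x = (1000003*x) ^ *p++, truncated to 32 bits.
    Sketch only: the real function also mixes in the first
    character and the string length."""
    x = 0
    for ch in s:
        x = ((1000003 * x) ^ ord(ch)) & 0xFFFFFFFF
    return x

# Consecutive keys share a prefix, so they agree until the last
# character, which is only XORed into the low bits: the two hashes
# differ in just two bits (ord('1') ^ ord('2') == 3).
print(string_hash_loop("J001") ^ string_hash_loop("J002"))  # 3
```

So with a power-of-two table that uses only the low hash bits, such keys cluster in adjacent slots, which is exactly why they behave like consecutive integers.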
But, because the current dict algorithm uses ~ on the full hash but does not use ~ on the initial increment, (~hash)+incr too often yields the same result for distinct hashes (i.e., there's a systematic (but weak) form of clustering). Note that Python is doing something very unusual: hashes are *usually* designed to yield an approximation to randomness across all bits. But Python's hashes never achieve that. This drives theoreticians mad (like the fellow who originally came up with the GF idea), but tends to work "better than random" in practice (e.g., a truly random hash function would almost certainly produce many collisions when fed a fat range of consecutive integers but still less than half the table size; but Python's trivial "identity" integer hash produces no collisions in that common case). [Christian] > - The bits used from the string hash are not well distributed > - using a "warmup wheel" on the hash to suck all bits in > gives the same quality of hashes like random numbers. See above and be very cautious: none of Python's hash functions produce well-distributed bits, and-- in effect --that's why Python dicts often perform "better than random" on common data. Even what you've done so far appears to provide marginally worse statistics for Guido's favorite kind of test case ("worse" in two senses: total number of collisions (a measure of amortized lookup cost), and maximum collision chain length (a measure of worst-case lookup cost)): d = {} for i in range(N): d[repr(i)] = i check-in-one-thing-then-let-it-simmer-ly y'rs - tim From tismer at tismer.com Tue Dec 19 02:16:27 2000 From: tismer at tismer.com (Christian Tismer) Date: Tue, 19 Dec 2000 02:16:27 +0100 Subject: [Python-Dev] The Dictionary Gem is polished! References: Message-ID: <3A3EB6EB.C79A3896@tismer.com> Greg Wilson wrote: > > > > > Here some results, dictionaries have 1000 entries: > > I will publish some results later today. > > In Doctor Dobb's Journal, right? 
:-) We'd *really* like this article... Well, the results are not so bad: I stopped testing computation time for the Python dictionary implementation, in favor of "trips". How many trips does the re-hash take in a dictionary? Tests were run for dictionaries of size 1000, 2000, 3000, 4000. Dictionary 1 consists of i, formatted as string. Dictionary 2 consists of strings containing the binary of i. Dictionary 3 consists of random numbers. Dictionary 4 consists of i << 16. Algorithms: old is the original dictionary algorithm implemented in Python (probably quite correct now, using longs :-) new is the proposed incremental bits-suck-in-division algorithm. new2 is a version of new, where all extra bits of the hash function are wheeled in in advance. The computation time of this is not negligible, so please use this result for reference only. Here are the results: (bad integers(old) not computed for n>1000 ) """ D:\crml_doc\platf\py>python dictest.py N=1000 trips for strings old=293 new=302 new2=221 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=499500 new=13187 new2=999 trips for random integers old=377 new=371 new2=393 trips for windows names old=230 new=207 new2=200 N=2000 trips for strings old=1093 new=1109 new2=786 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=0 new=26455 new2=1999 trips for random integers old=691 new=710 new2=741 trips for windows names old=503 new=542 new2=564 N=3000 trips for strings old=810 new=839 new2=609 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=0 new=38681 new2=2999 trips for random integers old=762 new=740 new2=735 trips for windows names old=712 new=711 new2=691 N=4000 trips for strings old=1850 new=1843 new2=1375 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=0 new=52994 new2=3999 trips for random integers old=1440 new=1450 new2=1414 trips for windows names old=1449 new=1434 new2=1457 D:\crml_doc\platf\py> """ Interpretation: --------------- Short
numeric strings show a slightly too high trip number. This means that the hash() function could be enhanced. But the effect would be below 10 percent compared to random hashes, therefore not worth it. Binary representations of numbers as strings still create perfect hash numbers. Bad integers (complete hash clash due to high power of 2) are handled fairly well by the new algorithm. "new2" shows that they can be brought down to nearly perfect hashes just by applying the "hash melting wheel". Windows names are almost upper case, and almost verbose. They appear to perform nearly as well as random numbers. This means: The Python string hash function is very good for a wide area of applications. In Summary: I would try to modify the string hash function slightly for short strings, but only if this does not negatively affect the results above. Summary of summary: There is no really low hanging fruit in string hashing. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com -------------- next part -------------- ## dictest.py ## Test of a new rehash algorithm ## Chris Tismer ## 2000-12-17 ## Mission Impossible 5oftware Team # The following is a partial re-implementation of # Python dictionaries in Python. # The original algorithm was literally turned # into Python code. ##/* ##Table of irreducible polynomials to efficiently cycle through ##GF(2^n)-{0}, 2<=n<=30.
##*/ polys = [ 4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3, 256 + 29, 512 + 17, 1024 + 9, 2048 + 5, 4096 + 83, 8192 + 27, 16384 + 43, 32768 + 3, 65536 + 45, 131072 + 9, 262144 + 39, 524288 + 39, 1048576 + 9, 2097152 + 5, 4194304 + 3, 8388608 + 33, 16777216 + 27, 33554432 + 9, 67108864 + 71, 134217728 + 39, 268435456 + 9, 536870912 + 5, 1073741824 + 83, 0 ] polys = map(long, polys) class NULL: pass class Dictionary: dummy = "" def __init__(mp, newalg=0): mp.ma_size = 0 mp.ma_poly = 0 mp.ma_table = [] mp.ma_fill = 0 mp.ma_used = 0 mp.oldalg = not newalg mp.warmup = newalg>1 mp.trips = 0 def getTrips(self): trips = self.trips self.trips = 0 return trips def lookdict(mp, key, _hash): me_hash, me_key, me_value = range(3) # rec slots dummy = mp.dummy mask = mp.ma_size-1 ep0 = mp.ma_table i = (~_hash) & mask ep = ep0[i] if ep[me_key] is NULL or ep[me_key] == key: return ep if ep[me_key] == dummy: freeslot = ep else: if (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0) : return ep freeslot = NULL ###### FROM HERE if mp.oldalg: incr = (_hash ^ (_hash >> 3)) & mask else: # note that we do not mask! # the shifting is worth it in the incremental case. ## added after posting to python-dev: uhash = _hash & 0xffffffffl if mp.warmup: incr = uhash mask2 = 0xffffffffl ^ mask while mask2 > mask: if (incr & 1): incr = incr ^ mp.ma_poly incr = incr >> 1 mask2 = mask2>>1 # this loop *can* be sped up by tables # with precomputed multiple shifts. # But I'm not sure if it is worth it at all. 
else: incr = uhash ^ (uhash >> 3) ###### TO HERE if (not incr): incr = mask while 1: mp.trips = mp.trips+1 ep = ep0[int((i+incr)&mask)] if (ep[me_key] is NULL) : if (freeslot is not NULL) : return freeslot else: return ep if (ep[me_key] == dummy) : if (freeslot == NULL): freeslot = ep elif (ep[me_key] == key or (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0)) : return ep # Cycle through GF(2^n)-{0} ###### FROM HERE if mp.oldalg: incr = incr << 1 if (incr > mask): incr = incr ^ mp.ma_poly else: # new algorithm: do a division if (incr & 1): incr = incr ^ mp.ma_poly incr = incr >> 1 ###### TO HERE def insertdict(mp, key, _hash, value): me_hash, me_key, me_value = range(3) # rec slots ep = mp.lookdict(key, _hash) if (ep[me_value] is not NULL) : old_value = ep[me_value] ep[me_value] = value else : if (ep[me_key] is NULL): mp.ma_fill=mp.ma_fill+1 ep[me_key] = key ep[me_hash] = _hash ep[me_value] = value mp.ma_used = mp.ma_used+1 def dictresize(mp, minused): me_hash, me_key, me_value = range(3) # rec slots oldsize = mp.ma_size oldtable = mp.ma_table MINSIZE = 4 newsize = MINSIZE for i in range(len(polys)): if (newsize > minused) : newpoly = polys[i] break newsize = newsize << 1 else: return -1 _nullentry = range(3) _nullentry[me_hash] = 0 _nullentry[me_key] = NULL _nullentry[me_value] = NULL newtable = map(lambda x,y=_nullentry:y[:], range(newsize)) mp.ma_size = newsize mp.ma_poly = newpoly mp.ma_table = newtable mp.ma_fill = 0 mp.ma_used = 0 for ep in oldtable: if (ep[me_value] is not NULL): mp.insertdict(ep[me_key],ep[me_hash],ep[me_value]) return 0 # PyDict_GetItem def __getitem__(op, key): me_hash, me_key, me_value = range(3) # rec slots if not op.ma_table: raise KeyError, key _hash = hash(key) return op.lookdict(key, _hash)[me_value] # PyDict_SetItem def __setitem__(op, key, value): mp = op _hash = hash(key) ## /* if fill >= 2/3 size, double in size */ if (mp.ma_fill*3 >= mp.ma_size*2) : if (mp.dictresize(mp.ma_used*2) != 0): if (mp.ma_fill+1 > mp.ma_size): 
raise MemoryError mp.insertdict(key, _hash, value) # more interface functions def keys(self): me_hash, me_key, me_value = range(3) # rec slots res = [] for _hash, _key, _value in self.ma_table: if _value is not NULL: res.append( _key) return res def values(self): me_hash, me_key, me_value = range(3) # rec slots res = [] for _hash, _key, _value in self.ma_table: if _value is not NULL: res.append( _value) return res def items(self): me_hash, me_key, me_value = range(3) # rec slots res = [] for _hash, _key, _value in self.ma_table: if _value is not NULL: res.append( (_key, _value) ) return res def __cmp__(self, other): mine = self.items() others = other.items() mine.sort() others.sort() return cmp(mine, others) ###################################################### ## tests def test(lis, dic): for key in lis: dic[key] def nulltest(lis, dic): for key in lis: dic def string_dicts(n): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash d3 = Dictionary(2) # with warmup for i in range(n): s = str(i) #* 5 #s = chr(i%256) + chr(i>>8)## d1[s] = d2[s] = d3[s] = i return d1, d2, d3 def istring_dicts(n): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash d3 = Dictionary(2) # with warmup for i in range(n): s = chr(i%256) + chr(i>>8) d1[s] = d2[s] = d3[s] = i return d1, d2, d3 def random_dicts(n): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash d3 = Dictionary(2) # with warmup from whrandom import randint import sys keys = [] for i in range(n): keys.append(randint(0, sys.maxint-1)) for i in keys: d1[i] = d2[i] = d3[i] = i return d1, d2, d3 def badnum_dicts(n): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash d3 = Dictionary(2) # with warmup shift = 10 if EXTREME: shift = 16 for i in range(n): bad = i << 16 d2[bad] = d3[bad] = i if n <= 1000: d1[bad] = i return d1, d2, d3 def names_dicts(n): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash d3 = Dictionary(2) # with warmup import win32con keys = 
win32con.__dict__.keys() if len(keys) < n: keys = [] for s in keys[:n]: d1[s] = d2[s] = d3[s] = s return d1, d2, d3 def do_test(dict): keys = dict.keys() dict.getTrips() # reset test(keys, dict) return dict.getTrips() EXTREME=1 if __name__ == "__main__": for N in (1000,2000,3000,4000): sdold, sdnew, sdnew2 = string_dicts(N) idold, idnew, idnew2 = istring_dicts(N) bdold, bdnew, bdnew2 = badnum_dicts(N) rdold, rdnew, rdnew2 = random_dicts(N) ndold, ndnew, ndnew2 = names_dicts(N) print "N=%d" %N print "trips for strings old=%d new=%d new2=%d" % tuple( map(do_test, (sdold, sdnew, sdnew2)) ) print "trips for bin strings old=%d new=%d new2=%d" % tuple( map(do_test, (idold, idnew, idnew2)) ) print "trips for bad integers old=%d new=%d new2=%d" % tuple( map(do_test, (bdold, bdnew, bdnew2))) print "trips for random integers old=%d new=%d new2=%d" % tuple( map(do_test, (rdold, rdnew, rdnew2))) print "trips for windows names old=%d new=%d new2=%d" % tuple( map(do_test, (ndold, ndnew, ndnew2))) """ Results with a shift of 10 (EXTREME=0): D:\crml_doc\platf\py>python dictest.py timing for strings old=5.097 new=5.088 timing for bad integers old=101.540 new=12.610 Results with a shift of 16 (EXTREME=1): D:\crml_doc\platf\py>python dictest.py timing for strings old=5.218 new=5.147 timing for bad integers old=571.210 new=19.220 """ From tismer at tismer.com Tue Dec 19 02:51:32 2000 From: tismer at tismer.com (Christian Tismer) Date: Tue, 19 Dec 2000 02:51:32 +0100 Subject: [Python-Dev] Re: The Dictionary Gem is polished! References: Message-ID: <3A3EBF23.750CF761@tismer.com> Tim Peters wrote: > > Sounds good to me! It's a very cheap way to get the high bits into play. That's what I wanted to hear. It's also the reason why I try to stay conservative: Just do an obviously useful bit, but do not break any of the inherent benefits, like those "better than random" amenities. 
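The "division wheel" at the heart of the patch can be isolated into a few lines. This sketch uses a size-16 table, for which dictest.py pairs the polynomial 16 + 3 = 19 (x^4 + x + 1); stepping the increment by one GF(2^4) division per probe visits every nonzero increment before repeating:

```python
def probe_cycle(start, poly=19, n_bits=4):
    """Return the increments generated by the GF(2^n) division step."""
    size = 1 << n_bits
    incr = start
    seen = []
    for _ in range(size - 1):      # the cycle has 2**n - 1 elements
        seen.append(incr)
        if incr & 1:               # divide by x in GF(2^n):
            incr ^= poly           #   add the polynomial if the low bit is set,
        incr >>= 1                 #   then shift right
    return seen

cycle = probe_cycle(1)
assert sorted(cycle) == list(range(1, 16))   # every nonzero value, once each
```

Because the polynomial is primitive, division by x is the inverse of a full-period multiplication, so any nonzero starting increment walks the same 15-element cycle.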
Python's dictionary algorithm appears to be "near perfect" and of "never touch but very carefully or redo it completely". I tried the tightrope walk of just adding a tiny topping. > > i = (~_hash) & mask Yes that stuff was 2 hours last nite :-) I just decided to not touch it. Arbitrary crap! Although an XOR with hash >> number of mask bits would perform much better (in many cases but not all). Anyway, simple shifting cannot solve general bit distribution problems. Nor can I :-) > The ~ here seems like pure superstition to me (and the comments in the C > code don't justify it at all -- I added a nag of my own about that the last > time I checked in dictobject.c -- and see below for a bad consequence of > doing ~). > > > # note that we do not mask! > > # even the shifting may not be worth it. > > incr = _hash ^ (_hash >> 3) > > The shifting was just another cheap trick to get *some* influence from the > high bits. It's very limited, though. Toss it (it appears to be from the > "random operations yield random results" matchbook school of > design). Now, comment it out, and you see my new algorithm perform much worse. I just kept it since it had an advantage on "my case". (bad guy I know). And I wanted to have an argument for my change to get accepted. "No cost, just profit, nearly the same" was what I tried to sell. > [MAL] > > BTW, would changing the hash function on strings from the simple > > XOR scheme to something a little smarter help improve the performance > > too (e.g. most strings used in programming never use the 8-th > > bit) ? > > Don't understand -- the string hash uses multiplication: > > x = (1000003*x) ^ *p++; > > in a loop. Replacing "^" there by "+" should yield slightly better results. For short strings, this prime has bad influence on the low bits, making it perform suboptimally for small dicts. See the new2 algo which funnily corrects for that.
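The low-bit weakness for short strings can be seen directly: the multiplier acts on `c << 7`, whose seven low bits are zero, so for a one-character string the multiplication contributes nothing below bit 7 (a from-memory model of the stringobject.c loop, 32-bit arithmetic assumed):

```python
def string_hash20(s):
    """Model of the Python 2.0 string hash; assumes s is non-empty."""
    MASK = 0xFFFFFFFF
    x = ord(s[0]) << 7
    for ch in s:
        x = ((1000003 * x) & MASK) ^ ord(ch)
    return (x ^ len(s)) & MASK

# 1000003 * (c << 7) ends in seven zero bits, so the low seven bits
# of the hash of chr(c) are just c ^ len == c ^ 1: the multiplier
# never mixes anything into them.
for c in range(32, 127):
    assert string_hash20(chr(c)) & 0x7F == c ^ 1
```

So in a small table, single-character keys are placed almost by identity, not by any mixing from the prime.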
The reason is obvious: Just look at the bit pattern of 1000003: '0xf4243' Without giving proof, this smells like bad bit distribution on small strings to me. You smell it too, right? > As is, string hashes are a lot like integer hashes, in that "consecutive" > strings > > J001 > J002 > J003 > J004 > ... > > yield hashes very close together in value. A bad generator in that case. I'll look for a better one. > But, because the current dict > algorithm uses ~ on the full hash but does not use ~ on the initial > increment, (~hash)+incr too often yields the same result for distinct hashes > (i.e., there's a systematic (but weak) form of clustering). You name it. > Note that Python is doing something very unusual: hashes are *usually* > designed to yield an approximation to randomness across all bits. But > Python's hashes never achieve that. This drives theoreticians mad (like the > fellow who originally came up with the GF idea), but tends to work "better > than random" in practice (e.g., a truly random hash function would almost > certainly produce many collisions when fed a fat range of consecutive > integers but still less than half the table size; but Python's trivial > "identity" integer hash produces no collisions in that common case). A good reason to be careful with changes(ahem). > [Christian] > > - The bits used from the string hash are not well distributed > > - using a "warmup wheel" on the hash to suck all bits in > > gives the same quality of hashes like random numbers. > > See above and be very cautious: none of Python's hash functions produce > well-distributed bits, and-- in effect --that's why Python dicts often > perform "better than random" on common data. 
Even what you've done so far > appears to provide marginally worse statistics for Guido's favorite kind of > test case ("worse" in two senses: total number of collisions (a measure of > amortized lookup cost), and maximum collision chain length (a measure of > worst-case lookup cost)): > > d = {} > for i in range(N): > d[repr(i)] = i Nah, I did quite a lot of tests, and the trip number shows a variation of about 10%, without judging old or new for better. This is just the randomness inside. > check-in-one-thing-then-let-it-simmer-ly y'rs - tim This is why I think to be even more conservative: Try to use a division wheel, but with the inverses of the original primitive roots, just in order to get at Guido's results :-) making-perfect-hashes-of-interneds-still-looks-promising - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From greg at cosc.canterbury.ac.nz Tue Dec 19 04:07:56 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Dec 2000 16:07:56 +1300 (NZDT) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <200012182332.SAA18456@cj20424-a.reston1.va.home.com> Message-ID: <200012190307.QAA02663@s454.cosc.canterbury.ac.nz> > The problem I have with this is that the code to evaluate g() has to > be generated twice! I have an idea how to fix that. There need to be two methods, __boolean_and_1__ and __boolean_and_2__. The first operand is evaluated and passed to __boolean_and_1__. If it returns a result, that becomes the result of the expression, and the second operand is short-circuited. 
If __boolean_and_1__ raises a NeedOtherOperand exception (or there is no __boolean_and_1__ method), the second operand is evaluated, and both operands are passed to __boolean_and_2__. The bytecode would look something like BOOLEAN_AND_1 label BOOLEAN_AND_2 label: ... I'll make a PEP out of this one day if I get enthusiastic enough. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From tim.one at home.com Tue Dec 19 05:55:33 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 18 Dec 2000 23:55:33 -0500 Subject: [Python-Dev] The Dictionary Gem is polished! In-Reply-To: <3A3EB6EB.C79A3896@tismer.com> Message-ID: Something else to ponder: my tests show that the current ("old") algorithm performs much better (somewhat worse than "new2" == new algorithm + warmup) if incr is simply initialized like so instead: if mp.oldalg: incr = (_hash & 0xffffffffL) % (mp.ma_size - 1) That's another way to get all the bits to contribute to the result. Note that a mod by size-1 is analogous to "casting out nines" in decimal: it's the same as breaking hash into fixed-sized pieces from the right (10 bits each if size=2**10, etc), adding the pieces together, and repeating that process until only one piece remains. IOW, it's a degenerate form of division, but works well all the same. It didn't improve over that when I tried a mod by the largest prime less than the table size (which suggests we're sucking all we can out of the *probe* sequence given a sometimes-poor starting index). However, it's subject to the same weak clustering phenomenon as the old method due to the ill-advised "~hash" operation in computing the initial index. 
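Tim's "casting out nines" analogy can be made concrete. Since 2**10 is congruent to 1 mod (2**10 - 1), reducing a 32-bit hash modulo size-1 gives the same residue as chopping it into 10-bit pieces and adding them up (a sketch with an arbitrary stand-in hash value):

```python
def fold(h, bits=10):
    """Add up `bits`-wide chunks of h, like summing digits base 2**bits."""
    mask = (1 << bits) - 1      # 1023 for a table of size 1024
    total = 0
    while h:
        total += h & mask
        h >>= bits
    return total

h = 0x9E3779B9                  # stand-in for a hash value
size = 1 << 10
assert fold(h) % (size - 1) == h % (size - 1)
```

Repeating the fold until the sum fits in 10 bits is exactly the degenerate division Tim describes; every chunk of the hash contributes to the result.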
If ~ is also thrown away, it's as good as new2 (here I've tossed out the "windows names", and "old" == existing algorithm except (a) get rid of ~ when computing index and (b) do mod by size-1 when computing incr): N=1000 trips for strings old=230 new=261 new2=239 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=999 new=13212 new2=999 trips for random integers old=399 new=421 new2=410 N=2000 trips for strings old=787 new=1066 new2=827 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=0 new=26481 new2=1999 trips for random integers old=652 new=733 new2=650 N=3000 trips for strings old=547 new=760 new2=617 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=0 new=38701 new2=2999 trips for random integers old=724 new=743 new2=768 N=4000 trips for strings old=1311 new=1657 new2=1315 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=0 new=53014 new2=3999 trips for random integers old=1476 new=1587 new2=1493 The new and new2 values differ in minor ways from the ones you posted because I got rid of the ~ (the ~ has a bad interaction with "additive" means of computing incr, because the ~ tends to move the index in the opposite direction, and these moves in opposite directions tend to cancel out when computing incr+index the first time). too-bad-mod-is-expensive!-ly y'rs - tim From tim.one at home.com Tue Dec 19 06:50:01 2000 From: tim.one at home.com (Tim Peters) Date: Tue, 19 Dec 2000 00:50:01 -0500 Subject: [Python-Dev] SourceForge SSH silliness In-Reply-To: <20001217220008.D29681@xs4all.nl> Message-ID: [Tim] > Starting last night, I get this msg whenever I update Python code w/ > CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python: > > """ > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ > @ WARNING: HOST IDENTIFICATION HAS CHANGED! @ > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ > IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY! 
> Someone could be eavesdropping on you right now > (man-in-the-middle attack)! > It is also possible that the host key has just been changed. > Please contact your system administrator. > Add correct host key in C:\Code/.ssh/known_hosts to get rid of > this message. > Password authentication is disabled to avoid trojan horses. > """ > > This is SourceForge's doing, and is permanent (they've changed > keys on their end). ... [Thomas Wouters] > What sourceforge did was switch Linux distributions, and upgrade. > The switch doesn't really matter for the SSH problem, because recent > Debian and recent RedHat releases both use a new ssh, the OpenBSD > ssh imlementation. > Apparently, it isn't entirely backwards compatible to old versions of > F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This > might or might not be your problem; if it is, you should get a decent > message that gives a relatively clear message such as 'cypher type 'idea' > not supported'. > ... [and quite a bit more] ... I hope you're feeling better today . "The problem" was one the wng msg spelled out: "It is also possible that the host key has just been changed.". SF changed keys. That's the whole banana right there. Deleting the sourceforge keys from known_hosts fixed it (== convinced ssh to install new SF keys the next time I connected). From tim.one at home.com Tue Dec 19 06:58:45 2000 From: tim.one at home.com (Tim Peters) Date: Tue, 19 Dec 2000 00:58:45 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <200012171438.JAA21603@cj20424-a.reston1.va.home.com> Message-ID: [Tim] > I expect it would do less harm to introduce a compile-time warning for > locals that are never referenced (such as the "a" in "set"). [Guido] > Another warning that would be quite useful (and trap similar cases) > would be "local variable used before set". 
Java elevated that last one to a compile-time error, via its "definite assignment" rules: you not only have to make sure a local is bound before reference, you have to make it *obvious* to the compiler that it's bound before reference. I think this is a Good Thing, because with intense training, people can learn to think like a compiler too . Seriously, in several of the cases where gcc warned about "maybe used before set" in the Python implementation, the warnings were bogus but it was non-trivial to deduce that. Such code is very brittle under modification, and the definite assignment rules make that path to error a non-starter. Example: def f(N): if N > 0: for i in range(N): if i == 0: j = 42 else: f2(i) elif N <= 0: j = 24 return j It's a Crime Against Humanity to make the code reader *deduce* that j is always bound by the time "return" is executed. From guido at python.org Tue Dec 19 07:08:14 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 01:08:14 -0500 Subject: [Python-Dev] Error: syncmail script missing Message-ID: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> I just checked in the documentation for the warnings module. (Check it out!) When I ran "cvs commit" in the Doc directory, it said, amongst other things: sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory I suppose this may be a side effect of the transition to new hardware of the SourceForge CVS archive. (Which, by the way, has dramatically improved the performance of typical CVS operations -- I am no longer afraid to do a cvs diff or cvs log in Emacs, or to do a cvs update just to be sure.) Could some of the Powers That Be (Fred or Barry :-) check into what happened to the syncmail script? --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Tue Dec 19 07:10:04 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Tue, 19 Dec 2000 01:10:04 -0500 (EST) Subject: [Python-Dev] Error: syncmail script missing In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> Message-ID: <14910.64444.662460.48236@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > Could some of the Powers That Be (Fred or Barry :-) check into what > happened to the syncmail script? We've seen this before, but I'm not sure what it was. Barry, do you recall? Had the Python interpreter landed in a different directory? Or perhaps the location of the CVS repository is different, so syncmail isn't where loginfo says. Tomorrow... scp to SF appears broken as well. ;( -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From tim.one at home.com Tue Dec 19 07:16:15 2000 From: tim.one at home.com (Tim Peters) Date: Tue, 19 Dec 2000 01:16:15 -0500 Subject: [Python-Dev] Error: syncmail script missing In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > I just checked in the documentation for the warnings module. (Check > it out!) Everyone should note that this means Guido will be taking his traditional post-release vacation almost immediately . > When I ran "cvs commit" in the Doc directory, it said, amongst other > things: > > sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory > > I suppose this may be a side effect of the transition to new hardware > of the SourceForge CVS archive. The lack of checkin mail was first noted on a Jython list. Finn wisely replied that he'd just sit back and wait for the CPython people to figure out how to fix it. > ... > Could some of the Powers That Be (Fred or Barry :-) check into what > happened to the syncmail script? Don't worry, I'll do my part by nagging them in your absence . Bon holiday voyage! 
From cgw at fnal.gov Tue Dec 19 07:32:15 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Tue, 19 Dec 2000 00:32:15 -0600 (CST) Subject: [Python-Dev] cycle-GC question Message-ID: <14911.239.12288.546710@buffalo.fnal.gov> The following program: import rexec while 1: x = rexec.RExec() del x leaks memory at a fantastic rate. It seems clear (?) that this is due to the call to "set_rexec" at rexec.py:140, which creates a circular reference between the `rexec' and `hooks' objects. (There's even a nice comment to that effect). I'm curious however as to why the spiffy new cyclic-garbage collector doesn't pick this up? Just-wondering-ly y'rs, cgw From tim_one at email.msn.com Tue Dec 19 10:24:18 2000 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 19 Dec 2000 04:24:18 -0500 Subject: [Python-Dev] RE: The Dictionary Gem is polished! In-Reply-To: <3A3EBF23.750CF761@tismer.com> Message-ID: [Christian Tismer] > ... > For short strings, this prime has bad influence on the low bits, > making it perform supoptimally for small dicts. > See the new2 algo which funnily corrects for that. > The reason is obvious: Just look at the bit pattern > of 1000003: '0xf4243' > > Without giving proof, this smells like bad bit distribution on small > strings to me. You smell it too, right? > ... [Tim] > As is, string hashes are a lot like integer hashes, in that > "consecutive" strings > > J001 > J002 > J003 > J004 > ... > > yield hashes very close together in value. [back to Christian] > A bad generator in that case. I'll look for a better one. Not necessarily! It's for that same reason "consecutive strings" can have "better than random" behavior today. And consecutive strings-- like consecutive ints --are a common case. 
Here are the numbers for the synthesized string cases: N=1000 trips for strings old=293 new=302 new2=221 trips for bin strings old=0 new=0 new2=0 N=2000 trips for strings old=1093 new=1109 new2=786 trips for bin strings old=0 new=0 new2=0 N=3000 trips for strings old=810 new=839 new2=609 trips for bin strings old=0 new=0 new2=0 N=4000 trips for strings old=1850 new=1843 new2=1375 trips for bin strings old=0 new=0 new2=0 Here they are again, after doing nothing except changing the "^" to "+" in the string hash, i.e. replacing x = (1000003*x) ^ *p++; by x = (1000003*x) + *p++; N=1000 trips for strings old=140 new=127 new2=108 trips for bin strings old=0 new=0 new2=0 N=2000 trips for strings old=480 new=434 new2=411 trips for bin strings old=0 new=0 new2=0 N=3000 trips for strings old=821 new=857 new2=631 trips for bin strings old=0 new=0 new2=0 N=4000 trips for strings old=1892 new=1852 new2=1476 trips for bin strings old=0 new=0 new2=0 The first two sizes are dramatically better, the last two a wash. If you want to see a real disaster, replace the "+" with "*" : N=1000 trips for strings old=71429 new=6449 new2=2048 trips for bin strings old=81187 new=41117 new2=41584 N=2000 trips for strings old=26882 new=9300 new2=6103 trips for bin strings old=96018 new=46932 new2=42408 I got tired of waiting at that point ... suspecting-a-better-string-hash-is-hard-to-find-ly y'rs - tim From martin at loewis.home.cs.tu-berlin.de Tue Dec 19 12:58:17 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 19 Dec 2000 12:58:17 +0100 Subject: [Python-Dev] Death to string functions! Message-ID: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> > I agree it would be useful to define these terms, although those > particular definitions appear to be missing the most important point > from the user's POV (not a one says "going away someday"). PEP 4 says # Usage of a module may be `deprecated', which means that it may be # removed from a future Python release. 
Proposals for better wording are welcome (and yes, I still have to get the comments that I got into the document). Regards, Martin From guido at python.org Tue Dec 19 15:48:47 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 09:48:47 -0500 Subject: [Python-Dev] cycle-GC question In-Reply-To: Your message of "Tue, 19 Dec 2000 00:32:15 CST." <14911.239.12288.546710@buffalo.fnal.gov> References: <14911.239.12288.546710@buffalo.fnal.gov> Message-ID: <200012191448.JAA28737@cj20424-a.reston1.va.home.com> > The following program: > > import rexec > while 1: > x = rexec.RExec() > del x > > leaks memory at a fantastic rate. > > It seems clear (?) that this is due to the call to "set_rexec" at > rexec.py:140, which creates a circular reference between the `rexec' > and `hooks' objects. (There's even a nice comment to that effect). > > I'm curious however as to why the spiffy new cyclic-garbage collector > doesn't pick this up? Me too. I turned on gc debugging (gc.set_debug(077) :-) and got messages suggesting that it is not collecting everything. The output looks like this: . . . gc: collecting generation 0... gc: objects in each generation: 764 6726 89174 gc: done. gc: collecting generation 1... gc: objects in each generation: 0 8179 89174 gc: done. gc: collecting generation 0... gc: objects in each generation: 764 0 97235 gc: done. gc: collecting generation 0... gc: objects in each generation: 757 747 97184 gc: done. gc: collecting generation 0... gc: objects in each generation: 764 1386 97184 gc: done. gc: collecting generation 0... gc: objects in each generation: 757 2082 97184 gc: done. gc: collecting generation 0... gc: objects in each generation: 764 2721 97184 gc: done. gc: collecting generation 0... gc: objects in each generation: 757 3417 97184 gc: done. gc: collecting generation 0... gc: objects in each generation: 764 4056 97184 gc: done. . . . With the third number growing each time a "generation 1" collection is done. 
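For contrast, the collector has no trouble with a plain two-object cycle and no __del__ methods; a minimal sketch (illustration only, using the current gc module API, not a diagnosis of the rexec case):

```python
import gc

class Hooks:
    pass

class RExecLike:
    def set_hooks(self, hooks):
        self.hooks = hooks
        hooks.rexec = self      # circular reference, as in rexec.py's set_rexec

gc.disable()                    # control exactly when collection runs
gc.collect()                    # start from a clean slate
r = RExecLike()
r.set_hooks(Hooks())
del r                           # only the cycle keeps the pair alive now
found = gc.collect()            # cycle detector finds the unreachable pair
gc.enable()
assert found >= 2               # at least the two instances are reclaimed
```

gc.collect() returns the number of unreachable objects found, so a nonzero result here confirms the simple cycle itself is collectable.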
Maybe Neil can shed some light? The gc.garbage list is empty. This is about as much as I know about the GC stuff...

--Guido van Rossum (home page: http://www.python.org/~guido/)

From petrilli at amber.org Tue Dec 19 16:25:18 2000
From: petrilli at amber.org (Christopher Petrilli)
Date: Tue, 19 Dec 2000 10:25:18 -0500
Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 12:58:17PM +0100
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>
Message-ID: <20001219102518.A14288@trump.amber.org>

So I was thinking about this whole thing, and wondering why it was that seeing things like:

    " ".join(aList)

bugged me to no end, while:

    aString.lower()

didn't seem to look wrong. I finally put my finger on it, and I haven't seen anyone mention it, so I guess I'll do so. To me, the concept of "join" on a string is just not quite kosher, instead it should be something like this:

    aList.join(" ")

or if you want it without the indirection:

    ['item', 'item', 'item'].join(" ")

Now *THAT* looks right to me. The example of a join method on a string just doesn't quite gel in my head, and I did some thinking and digging, and well, when I pulled up my Smalltalk browser, things like join are done on Collections, not on Strings. You're joining the collection, not the string.

Perhaps in a rush to move some things that were "string related" in the string module into methods on the strings themselves (something I whole-heartedly support), we moved a few too many things there---things that semantically don't really belong as methods on a string object.

How this gets resolved, I don't know... but I know a lot of people have looked at the string methods---and they each keep coming back to 1 or 2 that bug them... and I think it's those that really aren't methods of a string, but instead something that operates with strings, but expects other things.
Chris -- | Christopher Petrilli | petrilli at amber.org From guido at python.org Tue Dec 19 16:37:15 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 10:37:15 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Tue, 19 Dec 2000 10:25:18 EST." <20001219102518.A14288@trump.amber.org> References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org> Message-ID: <200012191537.KAA28909@cj20424-a.reston1.va.home.com> > So I was thinking about this whole thing, and wondering why it was > that seeing things like: > > " ".join(aList) > > bugged me to no end, while: > > aString.lower() > > didn't seem to look wrong. I finally put my finger on it, and I > haven't seen anyone mention it, so I guess I'll do so. To me, the > concept of "join" on a string is just not quite kosher, instead it > should be something like this: > > aList.join(" ") > > or if you want it without the indirection: > > ['item', 'item', 'item'].join(" ") > > Now *THAT* looks right to me. The example of a join method on a > string just doesn't quite gel in my head, and I did some thinking and > digging, and well, when I pulled up my Smalltalk browser, things like > join are done on Collections, not on Strings. You're joining the > collection, not the string. > > Perhaps in a rush to move some things that were "string related" in > the string module into methods on the strings themselves (something I > whole-heartedly support), we moved a few too many things > there---things that symantically don't really belong as methods on a > string object. > > How this gets resolved, I don't know... but I know a lot of people > have looked at the string methods---and they each keep coming back to > 1 or 2 that bug them... and I think it's those that really aren't > methods of a string, but instead something that operates with strings, > but expects other things. 
Boy, are you stirring up a can of worms that we've been through many times before! Nothing you say hasn't been said at least a hundred times before, on this list as well as on c.l.py.

The problem is that if you want to make this a method on lists, you'll also have to make it a method on tuples, and on arrays, and on NumPy arrays, and on any user-defined type that implements the sequence protocol... That's just not reasonable to expect. There really seem to be only two possibilities that don't have this problem: (1) make it a built-in, or (2) make it a method on strings. We chose (2) for uniformity, and to avoid the potential confusion with os.path.join(), which is sometimes imported as a local.

If " ".join(L) bugs you, try this:

    space = " "  # This could be a global
    . . .
    s = space.join(L)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry at digicool.com Tue Dec 19 16:46:55 2000
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 19 Dec 2000 10:46:55 -0500
Subject: [Python-Dev] Death to string functions!
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org>
Message-ID: <14911.33519.764029.306876@anthem.concentric.net>

>>>>> "CP" == Christopher Petrilli writes:

    CP> So I was thinking about this whole thing, and wondering why it
    CP> was that seeing things like:

    CP> " ".join(aList)

    CP> bugged me to no end, while:

    CP> aString.lower()

    CP> didn't seem to look wrong. I finally put my finger on it, and
    CP> I haven't seen anyone mention it, so I guess I'll do so.

Actually, it has been debated to death. ;) This looks better:

    SPACE = ' '
    SPACE.join(aList)

That reads good to me ("space-join this list") and that's how I always write it. That said, there are certainly lots of people who agree with you.
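[A quick modern-Python illustration of the uniformity argument: one method on the string type serves every kind of sequence, which no single per-sequence-type join method could.]

```python
SPACE = " "

# One method on str covers every sequence of strings ...
print(SPACE.join(["a", "b", "c"]))     # list          -> a b c
print(SPACE.join(("a", "b", "c")))     # tuple         -> a b c
print(SPACE.join(c for c in "abc"))    # any iterable  -> a b c

# ... whereas a hypothetical aList.join(" ") would have to be
# re-implemented on tuples, arrays, and every user-defined sequence.
```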
You can't put join() on sequences though, until you have builtin base-classes, or interfaces, or protocols or some such construct, because otherwise you'd have to add it to EVERY sequence, including classes that act like sequences.

One idea that I believe has merit is to consider adding join() to the builtins, probably with a signature like:

    join(aList, aString) -> aString

This horse has been whacked pretty good too, but I don't remember seeing a patch or a pronouncement.

-Barry

From nas at arctrix.com Tue Dec 19 09:53:36 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 19 Dec 2000 00:53:36 -0800
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <200012191448.JAA28737@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Dec 19, 2000 at 09:48:47AM -0500
References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com>
Message-ID: <20001219005336.A303@glacier.fnational.com>

On Tue, Dec 19, 2000 at 09:48:47AM -0500, Guido van Rossum wrote:
> > import rexec
> > while 1:
> >     x = rexec.RExec()
> >     del x
> >
> > leaks memory at a fantastic rate.
> >
> > It seems clear (?) that this is due to the call to "set_rexec" at
> > rexec.py:140, which creates a circular reference between the `rexec'
> > and `hooks' objects. (There's even a nice comment to that effect).

Line 140 is not the only place a circular reference is created. There is another one which is trickier to find:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        m.__builtins__ = self.modules['__builtin__']
        return m

If the module being added is __builtin__ then m.__builtins__ = m. The GC currently doesn't track modules. I guess it should. It might be possible to avoid this circular reference but I don't know enough about how RExec works.
Would something like:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        if mname != '__builtin__':
            m.__builtins__ = self.modules['__builtin__']
        return m

do the trick?

Neil

From fredrik at effbot.org Tue Dec 19 16:39:49 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 19 Dec 2000 16:39:49 +0100
Subject: [Python-Dev] Death to string functions!
References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org>
Message-ID: <008301c069d3$76560a20$3c6340d5@hagrid>

"Christopher Petrilli" wrote:
> didn't seem to look wrong. I finally put my finger on it, and I
> haven't seen anyone mention it, so I guess I'll do so. To me, the
> concept of "join" on a string is just not quite kosher, instead it
> should be something like this:
>
>     aList.join(" ")
>
> or if you want it without the indirection:
>
>     ['item', 'item', 'item'].join(" ")
>
> Now *THAT* looks right to me.

why do we keep coming back to this?

aString.join can do anything string.join can do, but aList.join cannot. if you don't understand why, check the archives.

From martin at loewis.home.cs.tu-berlin.de Tue Dec 19 16:44:48 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 16:44:48 +0100
Subject: [Python-Dev] cycle-GC question
Message-ID: <200012191544.QAA11408@loewis.home.cs.tu-berlin.de>

> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects. (There's even a nice comment to that effect).

It's not all that clear that *this* is the cycle. In fact, it is not.

> I'm curious however as to why the spiffy new cyclic-garbage
> collector doesn't pick this up?

It's an interesting problem, so I spent this afternoon investigating it.
I soon found that I need a tool, so I introduced a new function gc.getreferents which, when given an object, returns a list of objects referring to that object. The patch for that feature is in

http://sourceforge.net/patch/?func=detailpatch&patch_id=102925&group_id=5470

Applying that function recursively, I can get an output that looks like this:

dictionary 0x81f4f24
dictionary 0x81f4f24 (seen)
dictionary 0x81f4f24 (seen)
dictionary 0x8213bc4
dictionary 0x820869c
dictionary 0x820866c (seen)
dictionary 0x8213bf4
dictionary 0x820866c (seen)
dictionary 0x8214144
dictionary 0x820866c (seen)

Each indentation level shows the objects which refer to the outer-next object, e.g. the dictionary 0x820869c refers to the RExec instance, and the RHooks instance refers to that dictionary. Clearly, the dictionary 0x820869c is the RHooks' __dict__, and the reference belongs to the 'rexec' key in that dictionary. The recursion stops only when an object has been seen before (so it's a cycle, or other non-tree graph), or if there are no referents (the lists created to do the iteration are ignored).

So it appears that the r_import method is referenced from some dictionary, but that dictionary is not referenced anywhere??? Checking the actual structures shows that rexec creates a __builtin__ module, which has a dictionary that has an __import__ key. So the reference to the method comes from the __builtin__ module, which in turn is referenced as the RExec's .modules attribute, giving another cycle.

However, module objects don't participate in garbage collection. Therefore, gc.getreferents cannot traverse a module, and the garbage collector won't find a cycle involving a garbage module. I just submitted a bug report,

http://sourceforge.net/bugs/?func=detailbug&bug_id=126345&group_id=5470

which suggests that modules should also participate in garbage collection.
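[The tool described above corresponds to gc.get_referrers() in today's Pythons — note the naming flip: modern gc.get_referents() goes the *other* direction. A minimal sketch, with a hypothetical target object standing in for the RHooks dict:]

```python
import gc

# gc.get_referrers(obj) returns the objects that refer to obj -- the same
# information the experimental gc.getreferents patch exposed.
target = {"rexec": "placeholder"}   # hypothetical stand-in object
holder = [target]                   # one referrer we created on purpose

referrers = gc.get_referrers(target)
print(any(r is holder for r in referrers))   # -> True
```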
Regards, Martin From guido at python.org Tue Dec 19 17:01:46 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 11:01:46 -0500 Subject: [Python-Dev] cycle-GC question In-Reply-To: Your message of "Tue, 19 Dec 2000 00:53:36 PST." <20001219005336.A303@glacier.fnational.com> References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com> <20001219005336.A303@glacier.fnational.com> Message-ID: <200012191601.LAA29015@cj20424-a.reston1.va.home.com> > might be possible to avoid this circular reference but I don't > know enough about how RExec works. Would something like: > > def add_module(self, mname): > if self.modules.has_key(mname): > return self.modules[mname] > self.modules[mname] = m = self.hooks.new_module(mname) > if mname != '__builtin__': > m.__builtins__ = self.modules['__builtin__'] > return m > > do the trick? That's certainly a good thing to do (__builtin__ has no business having a __builtins__!), but (in my feeble experiment) it doesn't make the leaks go away. Note that almost every module participates heavily in cycles: whenever you define a function f(), f.func_globals is the module's __dict__, which also contains a reference to f. Similar for classes, with an extra hop via the class object and its __dict__. --Guido van Rossum (home page: http://www.python.org/~guido/) From cgw at fnal.gov Tue Dec 19 17:06:06 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Tue, 19 Dec 2000 10:06:06 -0600 (CST) Subject: [Python-Dev] cycle-GC question In-Reply-To: <20001219005336.A303@glacier.fnational.com> References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com> <20001219005336.A303@glacier.fnational.com> Message-ID: <14911.34670.664178.418523@buffalo.fnal.gov> Neil Schemenauer writes: > > Line 140 is not the only place a circular reference is created. 
> There is another one which is trickier to find:
>
>     def add_module(self, mname):
>         if self.modules.has_key(mname):
>             return self.modules[mname]
>         self.modules[mname] = m = self.hooks.new_module(mname)
>         m.__builtins__ = self.modules['__builtin__']
>         return m
>
> If the module being added is __builtin__ then m.__builtins__ = m.
> The GC currently doesn't track modules. I guess it should. It
> might be possible to avoid this circular reference but I don't
> know enough about how RExec works. Would something like:
>
>     def add_module(self, mname):
>         if self.modules.has_key(mname):
>             return self.modules[mname]
>         self.modules[mname] = m = self.hooks.new_module(mname)
>         if mname != '__builtin__':
>             m.__builtins__ = self.modules['__builtin__']
>         return m
>
> do the trick?

No... if you change "add_module" in exactly the way you suggest (without worrying about whether it breaks the functionality of rexec!) and run the test

    while 1:
        rexec.RExec()

you will find that it still leaks memory at a prodigious rate. So, (unless there is yet another module-level cyclic reference) I don't think this theory explains the problem.

From martin at loewis.home.cs.tu-berlin.de Tue Dec 19 17:07:04 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 19 Dec 2000 17:07:04 +0100
Subject: [Python-Dev] cycle-GC question
Message-ID: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>

> There is another one that is trickier to find:
[__builtin__.__builtins__ == __builtin__]

> Would something like:
[do not add __builtins__ to __builtin__]
> work?

No, because there is another one that is even trickier to find :-)

>>> print r
>>> print r.modules['__builtin__'].open.im_self

Please see my other message; I think modules should be gc'ed.
Regards,
Martin

From nas at arctrix.com Tue Dec 19 10:24:29 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 19 Dec 2000 01:24:29 -0800
Subject: [Python-Dev] cycle-GC question
In-Reply-To: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 05:07:04PM +0100
References: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>
Message-ID: <20001219012429.A520@glacier.fnational.com>

On Tue, Dec 19, 2000 at 05:07:04PM +0100, Martin v. Loewis wrote:
> I think modules should be gc'ed.

I agree. It's easy to do. If no one does over Christmas I will do it before 2.1 is released.

Neil

From tismer at tismer.com Tue Dec 19 16:48:58 2000
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 19 Dec 2000 17:48:58 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References:
Message-ID: <3A3F836A.DEDF1011@tismer.com>

Tim Peters wrote:
>
> Something else to ponder: my tests show that the current ("old") algorithm
> performs much better (somewhat worse than "new2" == new algorithm + warmup)
> if incr is simply initialized like so instead:
>
>     if mp.oldalg:
>         incr = (_hash & 0xffffffffL) % (mp.ma_size - 1)

Sure. I did this as well, but didn't consider a division since it is said to be too slow. But this is very platform dependent. On Pentiums this might not be noticeable.

> That's another way to get all the bits to contribute to the result. Note
> that a mod by size-1 is analogous to "casting out nines" in decimal: it's
> the same as breaking hash into fixed-sized pieces from the right (10 bits
> each if size=2**10, etc), adding the pieces together, and repeating that
> process until only one piece remains. IOW, it's a degenerate form of
> division, but works well all the same. It didn't improve over that when I
> tried a mod by the largest prime less than the table size (which suggests
> we're sucking all we can out of the *probe* sequence given a sometimes-poor
> starting index).

Again I tried this too.
Instead of the largest near prime I used the nearest prime. Remarkably the nearest prime is identical to the primitive element in a lot of cases. But no improvement over the modulus.

> However, it's subject to the same weak clustering phenomenon as the old
> method due to the ill-advised "~hash" operation in computing the initial
> index. If ~ is also thrown away, it's as good as new2 (here I've tossed out
> the "windows names", and "old" == existing algorithm except (a) get rid of ~
> when computing index and (b) do mod by size-1 when computing incr):

...

> The new and new2 values differ in minor ways from the ones you posted
> because I got rid of the ~ (the ~ has a bad interaction with "additive"
> means of computing incr, because the ~ tends to move the index in the
> opposite direction, and these moves in opposite directions tend to cancel
> out when computing incr+index the first time).

Remarkable.

> too-bad-mod-is-expensive!-ly y'rs - tim

Yes. The wheel is cheapest yet.

ciao - chris

-- 
Christian Tismer :^)
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com

From just at letterror.com Tue Dec 19 18:11:55 2000
From: just at letterror.com (Just van Rossum)
Date: Tue, 19 Dec 2000 18:11:55 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID:

Barry wrote:
>Actually, it has been debated to death. ;) This looks better:
>
>    SPACE = ' '
>    SPACE.join(aList)
>
>That reads good to me ("space-join this list") and that's how I always
>write it.

I just did a quick scan through the 1.5.2 library, and _most_ occurrences of string.join() are used with a string constant for the second argument. There is a whole bunch of one-arg string.join()'s, too.
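[For reference, the one-arg string.join() form defaulted the separator to a single space; a sketch of the equivalence, using a hypothetical join() helper in the shape of the builtin signature proposed earlier in the thread:]

```python
def join(words, sep=" "):
    # Emulation of the 1.5.2 string.join() signature: the one-arg form
    # defaults the separator to a single space.  (Sketch only -- the
    # real string module function is long gone from modern Pythons.)
    return sep.join(words)

words = ["ham", "spam", "eggs"]
print(join(words))         # one-arg form        -> ham spam eggs
print(join(words, "-"))    # explicit separator  -> ham-spam-eggs
print(" ".join(words))     # the method spelling -> ham spam eggs
```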
Recommending replacing all of these (not to mention all the code "out there") with named constants seems plain silly. Sure, " ".join() is the most "logical" choice for Python as it stands, but it's definitely not the most intuitive, as evidenced by the number of times this comes up on c.l.py: to many people it simply "looks wrong". Maybe this is the deal: joiner.join() makes a whole lot of sense from an _implementation_ standpoint, but a whole lot less as a public interface. It's easy to explain why join() can't be a method of sequences (in Python), but that alone doesn't justify a string method. string.join() is not quite unlike map() and friends: map() wouldn't be so bad as a sequence method, but that isn't practical for exactly the same reasons: so it's a builtin. (And not a function method...) So, making join() a builtin makes a whole lot of sense. Not doing this because people sometimes use a local reference to os.path.join seems awfully backward. Hm, maybe joiner.join() could become a special method: joiner.__join__(), that way other objects could define their own implementation for join(). (Hm, wouldn't be the worst thing for, say, a file path object...) Just From barry at digicool.com Tue Dec 19 18:20:07 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 19 Dec 2000 12:20:07 -0500 Subject: [Python-Dev] Death to string functions! References: Message-ID: <14911.39111.710940.342986@anthem.concentric.net> >>>>> "JvR" == Just van Rossum writes: JvR> Recommending replacing all of these (not to mention all the JvR> code "out there") with named constants seems plain silly. Until there's a tool to do the migration, I don't (personally) recommend wholesale migration. For new code I write though, I usually do it the way I described (which is intuitive to me, but then so is moving your fingers at a blinding speed up and down 5 long strips of metal to cause low bowel-tickling rumbly noises). JvR> So, making join() a builtin makes a whole lot of sense. 
JvR> Not doing this because people sometimes use a local reference to
JvR> os.path.join seems awfully backward.

I agree. Have we agreed on the semantics and signature of builtin join() though? Is it just string.join() stuck in builtins?

-Barry

From fredrik at effbot.org Tue Dec 19 18:25:49 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 19 Dec 2000 18:25:49 +0100
Subject: [Python-Dev] Death to string functions!
References: <14911.39111.710940.342986@anthem.concentric.net>
Message-ID: <012901c069e0$bd724fb0$3c6340d5@hagrid>

Barry wrote:
> JvR> So, making join() a builtin makes a whole lot of sense. Not
> JvR> doing this because people sometimes use a local reference to
> JvR> os.path.join seems awfully backward.
>
> I agree. Have we agreed on the semantics and signature of builtin
> join() though? Is it just string.join() stuck in builtins?

+1

(let's leave the __join__ slot and other super-generalized variants for 2.2)

From thomas at xs4all.net Tue Dec 19 18:54:34 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 19 Dec 2000 18:54:34 +0100
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: ; from tim.one@home.com on Tue, Dec 19, 2000 at 12:50:01AM -0500
References: <20001217220008.D29681@xs4all.nl>
Message-ID: <20001219185434.E29681@xs4all.nl>

On Tue, Dec 19, 2000 at 12:50:01AM -0500, Tim Peters wrote:
> [Thomas Wouters]
> > What sourceforge did was switch Linux distributions, and upgrade.
> > ... [and quite a bit more] ...

> I hope you're feeling better today . "The problem" was one the wng
> msg spelled out: "It is also possible that the host key has just been
> changed.". SF changed keys. That's the whole banana right there. Deleting
> the sourceforge keys from known_hosts fixed it (== convinced ssh to install
> new SF keys the next time I connected).

Well, if you'd read the thread , you'll notice that other people had problems even after that. I'm glad you're not one of them, though :)

-- 
Thomas Wouters

Hi! I'm a .signature virus!
copy me into your .signature file to help me spread! From barry at digicool.com Tue Dec 19 19:22:19 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 19 Dec 2000 13:22:19 -0500 Subject: [Python-Dev] Error: syncmail script missing References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> Message-ID: <14911.42843.284822.935268@anthem.concentric.net> Folks, Python wasn't installed on the new SF CVS machine, which was why syncmail was broken. My thanks to the SF guys for quickly remedying this situation! Please give it a test. -Barry From barry at digicool.com Tue Dec 19 19:23:32 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 19 Dec 2000 13:23:32 -0500 Subject: [Python-Dev] Error: syncmail script missing References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> <14911.42843.284822.935268@anthem.concentric.net> Message-ID: <14911.42916.573600.922606@anthem.concentric.net> >>>>> "BAW" == Barry A Warsaw writes: BAW> Python wasn't installed on the new SF CVS machine, which was BAW> why syncmail was broken. My thanks to the SF guys for BAW> quickly remedying this situation! BTW, it's currently Python 1.5.2. From tismer at tismer.com Tue Dec 19 18:34:14 2000 From: tismer at tismer.com (Christian Tismer) Date: Tue, 19 Dec 2000 19:34:14 +0200 Subject: [Python-Dev] Re: The Dictionary Gem is polished! References: Message-ID: <3A3F9C16.562F9D9F@tismer.com> Again... Tim Peters wrote: > > Sounds good to me! It's a very cheap way to get the high bits into play. ... > [Christian] > > - The bits used from the string hash are not well distributed > > - using a "warmup wheel" on the hash to suck all bits in > > gives the same quality of hashes like random numbers. > > See above and be very cautious: none of Python's hash functions produce > well-distributed bits, and-- in effect --that's why Python dicts often > perform "better than random" on common data. 
> Even what you've done so far
> appears to provide marginally worse statistics for Guido's favorite kind of
> test case ("worse" in two senses: total number of collisions (a measure of
> amortized lookup cost), and maximum collision chain length (a measure of
> worst-case lookup cost)):
>
>     d = {}
>     for i in range(N):
>         d[repr(i)] = i

I will look into this.

> check-in-one-thing-then-let-it-simmer-ly y'rs - tim

Are you saying I should check the thing in? Really?

In another reply to this message I was saying

"""
This is why I think to be even more conservative:
Try to use a division wheel, but with the inverses
of the original primitive roots, just in order to
get at Guido's results :-)
"""

This was a religious desire, but such an inverse cannot exist. Well, all inverses exist, but it is an error to think that they can produce similar bit patterns. Changing the root means changing the whole system, since we have just a *representation* of a group, via polynomial coefficients. A simple example which renders my thought useless is this: There is no general pattern that can turn a physical right shift into a left shift, for all bit combinations.

Anyway, how can I produce a nearly complete scheme like today with the same "cheaper than random" properties? Ok, we have to stick with the given polynomials to stay compatible, and we also have to shift left. How do we then rotate the random bits in? Well, we can in fact do a rotation of the whole index, moving the highest bit into the lowest. Too bad that this isn't supported in C. It is a native machine instruction on X86 machines. We would then have:

    incr = ROTATE_LEFT(incr, 1)
    if (incr > mask):
        incr = incr ^ mp.ma_poly

The effect is similar to the "old" algorithm, bits are shifted left. Only if the hash happens to have high bits do they appear in the modulus.
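[The rotation described above is easy to spell out in portable arithmetic; a sketch assuming 32-bit values. rotate_left and next_probe are illustrative names, not code from the patch.]

```python
MASK32 = 0xFFFFFFFF

def rotate_left(x, n=1):
    # 32-bit left rotation: the high bit re-enters at bit 0.  This is the
    # single x86 ROL instruction spelled out in portable arithmetic.
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def next_probe(incr, mask, poly):
    # The "rot" variant: rotate the increment, then reduce by the
    # GF(2^n) polynomial when it leaves the table.
    incr = rotate_left(incr)
    if incr > mask:
        incr = incr ^ poly
    return incr

print(hex(rotate_left(0x80000000)))   # -> 0x1 (high bit wraps around)
print(hex(rotate_left(0x00000001)))   # -> 0x2
```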
On the current "faster than random" cases, I assume that high bits in the hash are less likely than low bits, so it is more likely that an entry finds its good place in the dict, before bits are rotated in. Hence the "good" cases would be kept.

I did all tests again, now including maximum trip length, and added a "rotate-left" version as well:

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293/9 new=302/7 new2=221/7 rot=278/5
trips for bad integers old=499500/999 new=13187/31 new2=999/1 rot=16754/31
trips for random integers old=360/8 new=369/8 new2=358/6 rot=356/7
trips for windows names old=230/5 new=207/7 new2=200/5 rot=225/5
N=2000
trips for strings old=1093/11 new=1109/10 new2=786/6 rot=1082/8
trips for bad integers old=0/0 new=26455/32 new2=1999/1 rot=33524/34
trips for random integers old=704/7 new=686/8 new2=685/7 rot=693/7
trips for windows names old=503/8 new=542/9 new2=564/6 rot=529/7
N=3000
trips for strings old=810/5 new=839/6 new2=609/5 rot=796/5
trips for bad integers old=0/0 new=38681/36 new2=2999/1 rot=49828/38
trips for random integers old=708/5 new=723/7 new2=724/5 rot=722/6
trips for windows names old=712/6 new=711/5 new2=691/5 rot=738/9
N=4000
trips for strings old=1850/9 new=1843/8 new2=1375/11 rot=1848/10
trips for bad integers old=0/0 new=52994/39 new2=3999/1 rot=66356/38
trips for random integers old=1395/9 new=1397/8 new2=1435/9 rot=1394/13
trips for windows names old=1449/8 new=1434/8 new2=1457/11 rot=1513/9
D:\crml_doc\platf\py>

Concerning trip length, rotate is better than old in most cases. Random integers seem to withstand any of these procedures. For bad integers, rot takes naturally more trips than new, since the path to the bits is longer.

All in all I don't see more than marginal differences between the approaches, and I tend to stick with "new", since it is cheapest to implement.
(it does not cost anything and might instead be a little cheaper for some compilers, since it does not reference the mask variable).

I'd say let's do the patch --

ciao - chris

-- 
Christian Tismer :^)
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com

-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3, 256 + 29,
    512 + 17, 1024 + 9, 2048 + 5, 4096 + 83, 8192 + 27, 16384 + 43,
    32768 + 3, 65536 + 45, 131072 + 9, 262144 + 39, 524288 + 39,
    1048576 + 9, 2097152 + 5, 4194304 + 3, 8388608 + 33, 16777216 + 27,
    33554432 + 9, 67108864 + 71, 134217728 + 39, 268435456 + 9,
    536870912 + 5, 1073741824 + 83, 0
]
polys = map(long, polys)

class NULL:
    pass

class Dictionary:
    dummy = ""

    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg == 2
        mp.rotleft = newalg == 3
        mp.trips = 0
        mp.tripmax = 0

    def getTrips(self):
        trips, tripmax = self.trips, self.tripmax
        self.trips = self.tripmax = 0
        return trips, tripmax

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3)  # rec slots
        dummy = mp.dummy
        mask = mp.ma_size - 1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0):
                return ep
            freeslot = NULL
        ###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.
            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffl
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffl ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2 >> 1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)
        ###### TO HERE
        if (not incr):
            incr = mask
        triplen = 0
        while 1:
            mp.trips = mp.trips + 1
            triplen = triplen + 1
            if triplen > mp.tripmax:
                mp.tripmax = triplen
            ep = ep0[int((i + incr) & mask)]
            if (ep[me_key] is NULL):
                if (freeslot is not NULL):
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy):
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                  (ep[me_hash] == _hash and
                   cmp(ep[me_key], key) == 0)):
                return ep
            # Cycle through GF(2^n)-{0}
            ###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            elif mp.rotleft:
                if incr & 0x80000000L:
                    incr = (incr << 1) | 1
                else:
                    incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
            ###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3)  # rec slots
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL):
            old_value = ep[me_value]
            ep[me_value] = value
        else:
            if (ep[me_key] is NULL):
                mp.ma_fill = mp.ma_fill + 1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used + 1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3)  # rec slots
        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused):
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1
        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL
        newtable = map(lambda x, y=_nullentry: y[:], range(newsize))
        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0
        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key], ep[me_hash], ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3)  # rec slots
        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
        ## /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill * 3 >= mp.ma_size * 2):
            if (mp.dictresize(mp.ma_used * 2) != 0):
                if (mp.ma_fill + 1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3)  # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3)  # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3)  # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append((_key, _value))
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis:
        dic[key]

def nulltest(lis, dic):
    for key in lis:
        dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = str(i)  #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = chr(i%256) + chr(i>>8)
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint-1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = d4[i] = i
    return d1, d2, d3, d4

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << 16
        d2[bad] = d3[bad] = d4[bad] = i
        if n <= 1000:
            d1[bad] = i
    return d1, d2, d3, d4

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n:
        keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = d4[s] = s
    return d1, d2, d3, d4

def do_test(dict):
    keys = dict.keys()
    dict.getTrips()  # reset
    test(keys, dict)
    return "%d/%d" % dict.getTrips()

EXTREME = 1

if __name__ == "__main__":

    for N in (1000, 2000, 3000, 4000):

        sdold, sdnew, sdnew2, sdrot = string_dicts(N)
        #idold, idnew, idnew2, idrot = istring_dicts(N)
        bdold, bdnew, bdnew2, bdrot = badnum_dicts(N)
        rdold, rdnew, rdnew2, rdrot = random_dicts(N)
        ndold, ndnew, ndnew2, ndrot = names_dicts(N)
        fmt = "old=%s new=%s new2=%s rot=%s"
        print "N=%d" % N
        print ("trips for strings " + fmt) % tuple(
            map(do_test, (sdold, sdnew, sdnew2, sdrot)))
        #print ("trips for bin strings " + fmt) % tuple(
        #    map(do_test, (idold, idnew, idnew2, idrot)))
        print ("trips for bad integers " + fmt) % tuple(
            map(do_test, (bdold, bdnew, bdnew2, bdrot)))
        print ("trips for random integers " + fmt) % tuple(
            map(do_test, (rdold, rdnew, rdnew2, rdrot)))
        print ("trips for windows names " + fmt) % tuple(
            map(do_test, (ndold, ndnew, ndnew2, ndrot)))

"""
Results with a shift of 10 (EXTREME=0):

D:\crml_doc\platf\py>python
dictest.py timing for strings old=5.097 new=5.088 timing for bad integers old=101.540 new=12.610 Results with a shift of 16 (EXTREME=1): D:\crml_doc\platf\py>python dictest.py timing for strings old=5.218 new=5.147 timing for bad integers old=571.210 new=19.220 """ From just at letterror.com Tue Dec 19 19:46:18 2000 From: just at letterror.com (Just van Rossum) Date: Tue, 19 Dec 2000 19:46:18 +0100 Subject: [Python-Dev] Death to string functions! In-Reply-To: <14911.39111.710940.342986@anthem.concentric.net> References: Message-ID: At 12:20 PM -0500 19-12-2000, Barry A. Warsaw wrote: >I agree. Have we agreed on the semantics and signature of builtin >join() though? Is it just string.join() stuck in builtins? Yep. I'm with /F that further generalization can be done later. Oh, does this mean that "".join() becomes deprecated? (Nice test case for the warning framework...) Just From barry at digicool.com Tue Dec 19 19:56:45 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 19 Dec 2000 13:56:45 -0500 Subject: [Python-Dev] Death to string functions! References: Message-ID: <14911.44909.414520.788073@anthem.concentric.net> >>>>> "JvR" == Just van Rossum writes: JvR> Oh, does this mean that "".join() becomes deprecated? Please, no. From guido at python.org Tue Dec 19 19:56:39 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 13:56:39 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Tue, 19 Dec 2000 13:56:45 EST." <14911.44909.414520.788073@anthem.concentric.net> References: <14911.44909.414520.788073@anthem.concentric.net> Message-ID: <200012191856.NAA30524@cj20424-a.reston1.va.home.com> > >>>>> "JvR" == Just van Rossum writes: > > JvR> Oh, does this mean that "".join() becomes deprecated? > > Please, no. No. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From just at letterror.com Tue Dec 19 20:15:19 2000 From: just at letterror.com (Just van Rossum) Date: Tue, 19 Dec 2000 20:15:19 +0100 Subject: [Python-Dev] Death to string functions! In-Reply-To: <14911.44909.414520.788073@anthem.concentric.net> References: Message-ID: At 1:56 PM -0500 19-12-2000, Barry A. Warsaw wrote: >>>>>> "JvR" == Just van Rossum writes: > > JvR> Oh, does this mean that "".join() becomes deprecated? > >Please, no. And keep two non-deprecated ways to do the same thing? I'm not saying it should be removed, just that the powers that be declare that _one_ of them is the preferred way. And-if-that-one-isn't-builtin-join()-I-don't-know-why-to-even-bother y'rs -- Just From greg at cosc.canterbury.ac.nz Tue Dec 19 23:35:05 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 20 Dec 2000 11:35:05 +1300 (NZDT) Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012191537.KAA28909@cj20424-a.reston1.va.home.com> Message-ID: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz> Guido: > Boy, are you stirring up a can of worms that we've been through many > times before! Nothing you say hasn't been said at least a hundred > times before, on this list as well as on c.l.py. And I'll wager you'll continue to hear them said at regular intervals for a long time to come, because you've done something which a lot of people feel very strongly was a mistake, and they have some very rational arguments as to why it was a mistake, whereas you don't seem to have any arguments to the contrary which those people are likely to find convincing. > There really seem to be only two possibilities that don't have this > problem: (1) make it a built-in, or (2) make it a method on strings. False dichotomy. Some other possibilities: (3) Use an operator. (4) Leave it in the string module! Really, I don't see what would be so bad about that. 
You still need somewhere to put all the string-related constants, so why not keep the string module for those, plus the few functions that don't have any other obvious place? > If " ".join(L) bugs you, try this: > > space = " " # This could be a global > . > . > . > s = space.join(L) Surely you must realise that this completely fails to address Mr. Petrilli's concern? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From akuchlin at mems-exchange.org Wed Dec 20 15:40:58 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Wed, 20 Dec 2000 09:40:58 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: ; from noreply@sourceforge.net on Tue, Dec 19, 2000 at 07:02:05PM -0800 References: Message-ID: <20001220094058.A17623@kronos.cnri.reston.va.us> On Tue, Dec 19, 2000 at 07:02:05PM -0800, noreply at sourceforge.net wrote: >Date: 2000-Dec-19 19:02 >By: tim_one >Unrelated to your patch but in the same area: the other msg, "ord() >expected string or Unicode character", doesn't read right. The type >names in question are "string" and "unicode": > >>>> type("") > >>>> type(u"") > >>>> > >"character" is out of place, or not in enough places. Just thought I'd mention that, since *you're* so cute! Is it OK to refer to 8-bit strings under that name? How about "expected an 8-bit string or Unicode string", when the object passed to ord() isn't of the right type. Similarly, when the value is of the right type but has length>1, the message is "ord() expected a character, length-%d string found". Should that be "length-%d (string / unicode) found)" And should the type names be changed to '8-bit string'/'Unicode string', maybe? 
--amk From barry at digicool.com Wed Dec 20 16:39:30 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Wed, 20 Dec 2000 10:39:30 -0500 Subject: [Python-Dev] IGNORE - this is only a test Message-ID: <14912.53938.280864.596141@anthem.concentric.net> Testing the new MX for python.org... From fdrake at acm.org Wed Dec 20 17:57:09 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 20 Dec 2000 11:57:09 -0500 (EST) Subject: [Python-Dev] scp with SourceForge Message-ID: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> I've not been able to get scp to work with SourceForge since they upgraded their machines. ssh works fine. Is this related to the protocol mismatch problem that was discussed earlier? My ssh tells me "SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote sshd is sending its version as "Remote protocol version 1.99, remote software version OpenSSH_2.2.0p1". Was there a reasonable way to deal with this? I'm running Linux-Mandrake 7.1 with very little customization or extra stuff. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From tismer at tismer.com Wed Dec 20 17:31:00 2000 From: tismer at tismer.com (Christian Tismer) Date: Wed, 20 Dec 2000 18:31:00 +0200 Subject: [Python-Dev] Re: The Dictionary Gem is polished! References: <3A3F9C16.562F9D9F@tismer.com> Message-ID: <3A40DEC4.5F659E8E@tismer.com> Christian Tismer wrote: ... When talking about left rotation, an error crept in. Sorry!

> We would then have:
>
>     incr = ROTATE_LEFT(incr, 1)
>     if (incr > mask):
>         incr = incr ^ mp.ma_poly

If incr contains the high bits of the hash, then the above must be replaced by

    incr = ROTATE_LEFT(incr, 1)
    if (incr & (mask+1)):
        incr = incr ^ mp.ma_poly

or the multiplicative group is not guaranteed to be generated, obviously. This doesn't change my results, rotating right is still my choice.
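[Editor's sketch] The corrected step above can be written out as plain Python. This is a minimal sketch, assuming a 32-bit word for the rotation; the names `rotate_left_32` and `probe_step` are mine, not part of Christian's patch:

```python
def rotate_left_32(x, n=1):
    """Rotate a 32-bit unsigned value left by n bits."""
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def probe_step(incr, mask, poly):
    # Corrected step: rotate first, then reduce by the polynomial
    # when the bit just above the table mask (mask+1) is set --
    # not merely when incr > mask.
    incr = rotate_left_32(incr, 1)
    if incr & (mask + 1):
        incr = incr ^ poly
    return incr
```

For example, with an 8-slot table (mask=7, poly=8+3=11), an increment of 4 rotates to 8, which has the mask+1 bit set and reduces to 8^11 = 3.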
ciao - chris

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293/9 new=302/7 new2=221/7 rot=272/8
trips for bad integers old=499500/999 new=13187/31 new2=999/1 rot=16982/27
trips for random integers old=339/9 new=337/7 new2=343/10 rot=342/8
trips for windows names old=230/5 new=207/7 new2=200/5 rot=225/6
N=2000
trips for strings old=1093/11 new=1109/10 new2=786/6 rot=1090/9
trips for bad integers old=0/0 new=26455/32 new2=1999/1 rot=33985/31
trips for random integers old=747/10 new=733/7 new2=734/7 rot=728/8
trips for windows names old=503/8 new=542/9 new2=564/6 rot=521/11
N=3000
trips for strings old=810/5 new=839/6 new2=609/5 rot=820/6
trips for bad integers old=0/0 new=38681/36 new2=2999/1 rot=50985/26
trips for random integers old=709/4 new=728/5 new2=767/5 rot=711/6
trips for windows names old=712/6 new=711/5 new2=691/5 rot=727/7
N=4000
trips for strings old=1850/9 new=1843/8 new2=1375/11 rot=1861/9
trips for bad integers old=0/0 new=52994/39 new2=3999/1 rot=67986/26
trips for random integers old=1584/9 new=1606/8 new2=1505/9 rot=1579/8
trips for windows names old=1449/8 new=1434/8 new2=1457/11 rot=1476/7

-- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From tim.one at home.com Wed Dec 20 20:52:40 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 20 Dec 2000 14:52:40 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: [Fred L. Drake, Jr.] > I've not been able to get scp to work with SourceForge since they > upgraded their machines. ssh works fine. Same here.
In particular, I can use ssh to log in to shell.sourceforge.net, but attempts to scp there act like this (breaking long lines by hand with \n\t):

> scp -v pep-0042.html tim_one at shell.sourceforge.net:/home/groups/python/htdocs/peps
Executing: host shell.sourceforge.net, user tim_one, command scp -v -t
	/home/groups/python/htdocs/peps
SSH Version 1.2.14 [winnt-4.0-x86], protocol version 1.4.
Standard version. Does not use RSAREF.
ssh_connect: getuid 0 geteuid 0 anon 0
Connecting to shell.sourceforge.net [216.136.171.201] port 22.
Connection established.
Remote protocol version 1.99, remote software version OpenSSH_2.2.0p1
Waiting for server public key.
Received server public key (768 bits) and host key (1024 bits).
Host 'shell.sourceforge.net' is known and matches the host key.
Initializing random; seed file C:\Code/.ssh/random_seed
IDEA not supported, using 3des instead.
Encryption type: 3des
Sent encrypted session key.
Received encrypted confirmation.
Trying RSA authentication with key 'sourceforge'
Server refused our key.
Doing password authentication.
Password: **** here tim enteredth his password ****
Sending command: scp -v -t /home/groups/python/htdocs/peps
Entering interactive session.

And there it sits forever. Several others report the same symptom on SF forums, and assorted unresolved SF Support and Bug reports. We don't know what your symptom is! > Is this related to the protocol mismatch problem that was discussed earlier? Doubt it. Most commentators pin the blame elsewhere. > ... > Was there a reasonable way to deal with this? A new note was added to http://sourceforge.net/support/?func=detailsupport&support_id=110235&group_id=1 today, including: """ Re: Shell server We're also aware of the number of problems on the shell server with respect to restricitive permissions on some programs - and sourcing of shell environments. We're also aware of the troubles with scp and transferring files.
As a work around, we recommend either editing files on the shell server, or scping files to the shell server from external hosts to the shell server, whilst logged in to the shell server. """ So there you go: scp files to the shell server from external hosts to the shell server whilst logged in to the shell server . Is scp working for *anyone*??? From fdrake at acm.org Wed Dec 20 21:17:58 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 20 Dec 2000 15:17:58 -0500 (EST) Subject: [Python-Dev] scp with SourceForge In-Reply-To: References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com> Tim Peters writes: > And there it sits forever. Several others report the same symptom on SF > forums, and assorted unresolved SF Support and Bug reports. We don't know > what your symptom is! Exactly the same. > So there you go: scp files to the shell server from external hosts to the > shell server whilst logged in to the shell server . Yeah, that really helps.... NOT! All I want to be able to do is post a new development version of the documentation. ;-( -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From bckfnn at worldonline.dk Wed Dec 20 21:23:33 2000 From: bckfnn at worldonline.dk (Finn Bock) Date: Wed, 20 Dec 2000 20:23:33 GMT Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <3a411449.5247545@smtp.worldonline.dk> [Fred L. Drake] > I've not been able to get scp to work with SourceForge since they >upgraded their machines. ssh works fine. Is this related to the >protocol mismatch problem that was discussed earlier? My ssh tells me >"SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote >sshd is sending it's version as "Remote protocol version 1.99, remote >software version OpenSSH_2.2.0p1". 
> Was there a reasonable way to deal with this? I'm running >Linux-Mandrake 7.1 with very little customization or extra stuff. I managed to update the jython website by logging into the shell machine by ssh and doing a ftp back to my machine (using the IP number). That isn't exactly reasonable, but I was desperate. regards, finn From tim.one at home.com Wed Dec 20 21:42:11 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 20 Dec 2000 15:42:11 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com> Message-ID: [Tim] > So there you go: scp files to the shell server from external > hosts to the shell server whilst logged in to the shell server . [Fred] > Yeah, that really helps.... NOT! All I want to be able to do is > post a new development version of the documentation. ;-( All I want to do is make a measly change to a PEP -- I'm afraid it doesn't ask how trivial your intents are. If some suck^H^H^H^Hdeveloper admits that scp works for them, maybe we can mail them stuff and have *them* copy it over. no-takers-so-far-though-ly y'rs - tim From barry at digicool.com Wed Dec 20 21:49:00 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Wed, 20 Dec 2000 15:49:00 -0500 Subject: [Python-Dev] scp with SourceForge References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <14913.6972.934625.840781@anthem.concentric.net> >>>>> "TP" == Tim Peters writes: TP> So there you go: scp files to the shell server from external TP> hosts to the shell server whilst logged in to the shell server TP> . Psheesh, /that/ was obvious. Did you even have to ask? TP> Is scp working for *anyone*??? Nope, same thing happens to me; it just hangs. 
-Barry From tim.one at home.com Wed Dec 20 21:53:38 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 20 Dec 2000 15:53:38 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14913.6972.934625.840781@anthem.concentric.net> Message-ID: [Tim, quoting a bit of immortal SF support prose] > TP> So there you go: scp files to the shell server from external > TP> hosts to the shell server whilst logged in to the shell server > TP> . [Barry] > Psheesh, /that/ was obvious. Did you even have to ask? Actually, isn't this easy to do on Linux? That is, run an ssh server (whatever) on your home machine, log in to the SF shell (which everyone seems able to do), then scp whatever your_home_IP_address:your_home_path from the SF shell? Heck, I can even get that to work on Windows, except I don't know how to set up anything on my end to accept the connection . > TP> Is scp working for *anyone*??? > Nope, same thing happens to me; it just hangs. That's good to know -- since nobody else mentioned this, Fred probably figured he was unique. not-that-he-isn't-it's-just-that-he's-not-ly y'rs - tim From fdrake at acm.org Wed Dec 20 21:52:10 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 20 Dec 2000 15:52:10 -0500 (EST) Subject: [Python-Dev] scp with SourceForge In-Reply-To: References: <14913.6972.934625.840781@anthem.concentric.net> Message-ID: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com> Tim Peters writes: > Actually, isn't this easy to do on Linux? That is, run an ssh server > (whatever) on your home machine, log in to the SF shell (which everyone > seems able to do), then > > scp whatever your_home_IP_address:your_home_path > > from the SF shell? Heck, I can even get that to work on Windows, except I > don't know how to set up anything on my end to accept the connection . Err, yes, that's easy to do, but... that means putting your private key on SourceForge. They're a great bunch of guys, but they can't have my private key! -Fred -- Fred L. 
Drake, Jr. PythonLabs at Digital Creations From tim.one at home.com Wed Dec 20 22:06:07 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 20 Dec 2000 16:06:07 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com> Message-ID: [Fred] > Err, yes, that's easy to do, but... that means putting your private > key on SourceForge. They're a great bunch of guys, but they can't > have my private key! So generate a unique one-shot key pair for the life of the copy. I can do that for you on Windows if you lack a real OS . From thomas at xs4all.net Wed Dec 20 23:59:49 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 20 Dec 2000 23:59:49 +0100 Subject: [Python-Dev] scp with SourceForge In-Reply-To: ; from tim.one@home.com on Wed, Dec 20, 2000 at 02:52:40PM -0500 References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <20001220235949.F29681@xs4all.nl> On Wed, Dec 20, 2000 at 02:52:40PM -0500, Tim Peters wrote: > So there you go: scp files to the shell server from external hosts to the > shell server whilst logged in to the shell server . > Is scp working for *anyone*??? Not for me, anyway. And I'm not just saying that to avoid scp-duty :) And I'm using the same ssh version, which works fine on all other machines. It probably has to do with the funky setup Sourceforge uses. (Try looking at 'df' and 'cat /proc/mounts', and comparing the two -- you'll see what I mean :) That also means I'm not tempted to try and reproduce it, obviously :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tim.one at home.com Thu Dec 21 04:24:12 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 20 Dec 2000 22:24:12 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz> Message-ID: [Guido] >> Boy, are you stirring up a can of worms that we've been through many >> times before! 
Nothing you say hasn't been said at least a hundred >> times before, on this list as well as on c.l.py. [Greg Ewing] > And I'll wager you'll continue to hear them said at regular intervals > for a long time to come, because you've done something which a lot of > people feel very strongly was a mistake, and they have some very > rational arguments as to why it was a mistake, whereas you don't seem > to have any arguments to the contrary which those people are likely to > find convincing. Then it's a wash: Guido doesn't find their arguments convincing either, and ties favor the status quo even in the absence of BDFLness. >> There really seem to be only two possibilities that don't have this >> problem: (1) make it a built-in, or (2) make it a method on strings. > False dichotomy. Some other possibilities: > > (3) Use an operator. Oh, that's likely . > (4) Leave it in the string module! Really, I don't see what > would be so bad about that. You still need somewhere to put > all the string-related constants, so why not keep the string > module for those, plus the few functions that don't have > any other obvious place? Guido said he wants to deprecate the entire string module, so that Python can eventually warn on the mere presence of "import string". That's what he said when I earlier ranted in favor of keeping the string module around. My guess is that making it a builtin is the only alternative that stands any chance at this point. >> If " ".join(L) bugs you, try this: >> >> space = " " # This could be a global >> . >> . >> . >> s = space.join(L) > Surely you must realise that this completely fails to > address Mr. Petrilli's concern? Don't know about Guido, but I don't realize that, and we haven't heard back from Charles. His objections were raised the first day " ".join was suggested, space.join was suggested almost immediately after, and that latter suggestion did seem to pacify at least several objectors. 
Don't know whether it makes Charles happier, but since it *has* made others happier in the past, it's not unreasonable to imagine that Charles might like it too. if-we're-to-be-swayed-by-his-continued-outrage-afraid-it-will- have-to-come-from-him-ly y'rs - tim From tim.one at home.com Thu Dec 21 08:44:19 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 21 Dec 2000 02:44:19 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: <20001220094058.A17623@kronos.cnri.reston.va.us> Message-ID: [Andrew Kuchling] > Is it OK to refer to 8-bit strings under that name? > How about "expected an 8-bit string or Unicode string", when the > object passed to ord() isn't of the right type. > > Similarly, when the value is of the right type but has length>1, > the message is "ord() expected a character, length-%d string found". > Should that be "length-%d (string / unicode) found)" > > And should the type names be changed to '8-bit string'/'Unicode > string', maybe? Actually, upon reflection I think it was a mistake to add all these "or Unicode" clauses to the error msgs to begin with. Python used to have only one string type, we're saying that's also a hope for the future, and in the meantime I know I'd have no trouble understanding "string" as including both 8-bit strings and Unicode strings. So we should say "8-bit string" or "Unicode string" when *only* one of those is allowable. So "ord() expected string ..." instead of (even a repaired version of) "ord() expected string or Unicode character ..." but-i'm-not-even-motivated-enough-to-finish-this-sig- From tim.one at home.com Thu Dec 21 09:52:54 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 21 Dec 2000 03:52:54 -0500 Subject: [Python-Dev] RE: The Dictionary Gem is polished! In-Reply-To: <3A3F9C16.562F9D9F@tismer.com> Message-ID: [Christian Tismer] > Are you saying I should check the thing in? Really? Of course. 
The first thing you talked about showed a major improvement in some bad cases, did no harm in the others, and both results were more than just plausible -- they made compelling sense and were backed by simulation. So why not check it in? It's a clear net win! Stuff since then has been a spattering of maybe-good maybe-bad maybe-neutral ideas that hasn't gotten anywhere conclusive. What I want to avoid is another "Unicode compression" scenario, where we avoid grabbing a clear win for months just because it may not be the best possible of all conceivable compression schemes -- and then mistakes get made in a last-second rush to get *any* improvement. Checking in a clear improvement today does not preclude checking in a better one next week . > ... > Ok, we have to stick with the given polymomials to stay > compatible, Na, feel free to explore that too, if you like. It really should get some study! The polys there now are utterly arbitrary: of all polys that happen to be irreducible and that have x as a primitive root in the induced multiplicative group, these are simply the smallest when viewed as binary integers. That's because they were *found* by trying all odd binary ints with odd parity (even ints and ints with even parity necessarily correspond to reducible polys), starting with 2**N+3 and going up until finding the first one that was both irreducible and had x as a primitive root. There's no theory at all that I know of to say that any such poly is any better for this purpose than any other. And they weren't tested for that either -- they're simply the first ones "that worked at all" in a brute search. Besides, Python's "better than random" dict behavior-- when it obtains! --is almost entirely due to that its hash functions produce distinct starting indices more often than a random hash function would. 
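[Editor's sketch] The property Tim describes for the table polys -- irreducible, with x a primitive root, so the probe increments cycle through all of GF(2^n)-{0} -- can be checked mechanically. A small sketch in modern Python (`probe_cycle_length` is my name, not from the thread) that runs the "old algorithm" probe step until the increment repeats:

```python
def probe_cycle_length(poly, n):
    """Count probe steps (shift left, XOR with poly on overflow)
    until the increment returns to its starting value."""
    mask = (1 << n) - 1
    incr = 1
    length = 0
    while True:
        # the "old algorithm" step from dictobject.c / dictest.py
        incr = incr << 1
        if incr > mask:
            incr = incr ^ poly
        length += 1
        if incr == 1:
            return length

# For a poly with x as a primitive root, the cycle length is 2**n - 1:
# every nonzero increment is visited before the sequence repeats.
```

With entries from the dictest table (e.g. 16 + 3 for n=4, 1024 + 9 for n=10) this comes back as 2**n - 1; a poly for which x is not a primitive root would yield a shorter cycle.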
The contribution of the GF-based probe sequence in case of collision is to avoid the terrible behavior most other forms of probe sequence would cause given that Python's hash functions also tend to fill solid contiguous slices of the table more often than would a random hash function. [stuff about rotation] > ... > Too bad that this isn't supported in C. It is a native > machine instruction on X86 machines. Guido long ago rejected hash functions based on rotation for this reason; he's not likely to approve of rotations more in the probe sequence . A similar frustration is that almost all modern CPUs have a fast instruction to get at the high 32 bits of a 32x32->64 bit multiply: another way to get the high bits of the hash code into play is to multiply the 32-bit hash code by a 32-bit constant (see Knuth for "Fibonacci hashing" details), and take the least-significant N bits of the *upper* 32 bits of the 64-bit product as the initial table index. If the constant is chosen correctly, this defines a permutation on the space of 32-bit unsigned ints, and can be very effective at "scrambling" arithmetic progressions (which Python's hash functions often produce). But C doesn't give a decent way to get at that either. > ... > On the current "faster than random" cases, I assume that high bits in the hash are less likely than low bits, I'm not sure what this means. As the comment in dictobject.c says, it's common for Python's hash functions to return a result with lots of leading zeroes. But the lookup currently applies ~ to those first (which is a bad idea -- see earlier msgs), so the actual hash that gets *used* often has lots of leading ones. > so it is more likely that an entry finds its good place in the dict, before bits are rotated in. hence the "good" cases would be kept. I can agree with this easily if I read the above as asserting that in the very good cases today, the low bits of hashes (whether or not ~ is applied) vary more than the high bits. > ...
> Random integers seem to withstand any of these procedures. If you wanted to, you could *define* random this way . > ... > I'd say let's do the patch -- ciao - chris full-circle-ly y'rs - tim From mal at lemburg.com Thu Dec 21 12:16:27 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 21 Dec 2000 12:16:27 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix References: Message-ID: <3A41E68B.6B12CD71@lemburg.com> Tim Peters wrote: > > [Andrew Kuchling] > > Is it OK to refer to 8-bit strings under that name? > > How about "expected an 8-bit string or Unicode string", when the > > object passed to ord() isn't of the right type. > > > > Similarly, when the value is of the right type but has length>1, > > the message is "ord() expected a character, length-%d string found". > > Should that be "length-%d (string / unicode) found)" > > > > And should the type names be changed to '8-bit string'/'Unicode > > string', maybe? > > Actually, upon reflection I think it was a mistake to add all these "or > Unicode" clauses to the error msgs to begin with. Python used to have only > one string type, we're saying that's also a hope for the future, and in the > meantime I know I'd have no trouble understanding "string" as including both > 8-bit strings and Unicode strings. > > So we should say "8-bit string" or "Unicode string" when *only* one of those > is allowable. So > > "ord() expected string ..." > > instead of (even a repaired version of) > > "ord() expected string or Unicode character ..." I think this has to do with understanding that there are two string types in Python 2.0 -- a novice won't notice this until she sees the error message. My understanding is similar to yours, "string" should mean "any string object" and in cases where the difference between 8-bit string and Unicode matters, these should be referred to as "8-bit string" and "Unicode string". 
Still, I think it is a good idea to make people aware of the possibility of passing Unicode objects to these functions, so perhaps the idea of adding both possibilities to error messages is not such a bad idea for 2.1. The next phases would be converting all messages back to "string" and then convert all strings to Unicode ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From akuchlin at mems-exchange.org Thu Dec 21 19:37:19 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Thu, 21 Dec 2000 13:37:19 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: ; from tim.one@home.com on Thu, Dec 21, 2000 at 02:44:19AM -0500 References: <20001220094058.A17623@kronos.cnri.reston.va.us> Message-ID: <20001221133719.B11880@kronos.cnri.reston.va.us> On Thu, Dec 21, 2000 at 02:44:19AM -0500, Tim Peters wrote: >So we should say "8-bit string" or "Unicode string" when *only* one of those >is allowable. So OK... how about this patch?
Index: bltinmodule.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v retrieving revision 2.185 diff -u -r2.185 bltinmodule.c --- bltinmodule.c 2000/12/20 15:07:34 2.185 +++ bltinmodule.c 2000/12/21 18:36:54 @@ -1524,13 +1524,14 @@ } } else { PyErr_Format(PyExc_TypeError, - "ord() expected string or Unicode character, " \ + "ord() expected string of length 1, but " \ "%.200s found", obj->ob_type->tp_name); return NULL; } PyErr_Format(PyExc_TypeError, - "ord() expected a character, length-%d string found", + "ord() expected a character, " + "but string of length %d found", size); return NULL; } From thomas at xs4all.net Fri Dec 22 16:21:43 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 22 Dec 2000 16:21:43 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: ; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800 References: Message-ID: <20001222162143.A5515@xs4all.nl> On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply at sourceforge.net wrote: > * Guido-style: 8-column hard-tab indents. > * New style: 4-column space-only indents. Hm, I must have missed this... Is 'new style' the preferred style, as its name suggests, or is Guido mounting a rebellion to adhere to the One True Style (or rather his own version of it, which just has the * in pointer type declarations wrong ? :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From fdrake at acm.org Fri Dec 22 16:31:21 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 22 Dec 2000 10:31:21 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <20001222162143.A5515@xs4all.nl> References: <20001222162143.A5515@xs4all.nl> Message-ID: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> Thomas Wouters writes: > Hm, I must have missed this... 
Is 'new style' the preferred style, as its > name suggests, or is Guido mounting a rebellion to adhere to the One True > Style (or rather his own version of it, which just has the * in pointer > type declarations wrong ? :) Guido has grudgingly granted that new code in the "New style" is acceptable, mostly because many people complain that "Guido style" causes too much code to get scrunched up on the right margin. The "New style" is more like the recommendations for Python code as well, so it's easier for Python programmers to read (Tabs are hard to read clearly! ;). -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From cgw at fnal.gov Fri Dec 22 16:43:45 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Fri, 22 Dec 2000 09:43:45 -0600 (CST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> Message-ID: <14915.30385.201343.360880@buffalo.fnal.gov> Fred L. Drake, Jr. writes: > > Guido has grudgingly granted that new code in the "New style" is > acceptable, mostly because many people complain that "Guido style" > causes too much code to get scrunched up on the right margin. I am reminded of Linus Torvalds comments on this subject (see /usr/src/linux/Documentation/CodingStyle): Now, some people will claim that having 8-character indentations makes the code move too far to the right, and makes it hard to read on a 80-character terminal screen. The answer to that is that if you need more than 3 levels of indentation, you're screwed anyway, and should fix your program. From fdrake at acm.org Fri Dec 22 16:58:56 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Fri, 22 Dec 2000 10:58:56 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.30385.201343.360880@buffalo.fnal.gov> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> Message-ID: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> Charles G Waldman writes: > I am reminded of Linus Torvalds comments on this subject (see > /usr/src/linux/Documentation/CodingStyle): The catch, of course, is Python/ceval.c, where breaking it up can hurt performance. People scream when you do things like that.... -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From cgw at fnal.gov Fri Dec 22 17:07:47 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Fri, 22 Dec 2000 10:07:47 -0600 (CST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> Message-ID: <14915.31827.250987.283364@buffalo.fnal.gov> Fred L. Drake, Jr. writes: > > The catch, of course, is Python/ceval.c, where breaking it up can > hurt performance. People scream when you do things like that.... Quoting again from the same source: Use helper functions with descriptive names (you can ask the compiler to in-line them if you think it's performance-critical, and it will probably do a better job of it than you would have done). But I should have pointed out that I was quoting the great Linus mostly for entertainment/cultural value, and was not really trying to add fuel to the fire.
In other words, a message that I thought was amusing, but probably shouldn't have sent ;-) From fdrake at acm.org Fri Dec 22 17:20:52 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 22 Dec 2000 11:20:52 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.31827.250987.283364@buffalo.fnal.gov> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> <14915.31827.250987.283364@buffalo.fnal.gov> Message-ID: <14915.32612.252115.562296@cj42289-a.reston1.va.home.com> Charles G Waldman writes: > But I should have pointed out that I was quoting the great Linus > mostly for entertainment/cultural value, and was not really trying to > add fuel to the fire. In other words, a message that I thought was > amusing, but probably shouldn't have sent ;-) I understood the intent; I think he's really got a point. There are a few places in Python where it would really help to break things up! -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From fredrik at effbot.org Fri Dec 22 17:33:37 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Fri, 22 Dec 2000 17:33:37 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support References: <20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov> <14915.32612.252115.562296@cj42289-a.reston1.va.home.com> Message-ID: <004b01c06c34$f08151c0$e46940d5@hagrid> Fred wrote: > I understood the intent; I think he's really got a point. There are > a few places in Python where it would really help to break things up! if that's what you want, maybe you could start by putting the INLINE stuff back again? 
(if C/C++ compatibility is a problem, put it inside a cplusplus ifdef, and mark it as "for internal use only. don't use inline on public interfaces") From fdrake at acm.org Fri Dec 22 17:36:15 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 22 Dec 2000 11:36:15 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <004b01c06c34$f08151c0$e46940d5@hagrid> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> <14915.31827.250987.283364@buffalo.fnal.gov> <14915.32612.252115.562296@cj42289-a.reston1.va.home.com> <004b01c06c34$f08151c0$e46940d5@hagrid> Message-ID: <14915.33535.520957.215310@cj42289-a.reston1.va.home.com> Fredrik Lundh writes: > if that's what you want, maybe you could start by > putting the INLINE stuff back again? I could not see the value in the inline stuff that configure was setting up, and still don't. > (if C/C++ compatibility is a problem, put it inside a > cplusplus ifdef, and mark it as "for internal use only. > don't use inline on public interfaces") We should be able to come up with something reasonable, but I don't have time right now, and my head isn't currently wrapped around C compilers. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From akuchlin at cnri.reston.va.us Fri Dec 22 19:01:43 2000 From: akuchlin at cnri.reston.va.us (Andrew Kuchling) Date: Fri, 22 Dec 2000 13:01:43 -0500 Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: ; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800 References: Message-ID: <20001222130143.B7127@newcnri.cnri.reston.va.us> On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply at sourceforge.net wrote: > * Guido-style: 8-column hard-tab indents. > * New style: 4-column space-only indents. > * _curses style: 2 column indents. 
> >I'd prefer "New style", myself. New style it is. (Barry, is the "python" style in cc-mode.el going to be changed to new style, or a "python2" style added?) I've been wanting to reformat _cursesmodule.c to match the Python style for some time. Probably I'll do that a little while after the panel module has settled down a bit. Fred, did you look at the use of the CObject for exposing the API? Did that look reasonable? Also, should py_curses.h go in the Include/ subdirectory instead of Modules/? --amk From fredrik at effbot.org Fri Dec 22 19:03:43 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Fri, 22 Dec 2000 19:03:43 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support References: <20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov><14915.32612.252115.562296@cj42289-a.reston1.va.home.com><004b01c06c34$f08151c0$e46940d5@hagrid> <14915.33535.520957.215310@cj42289-a.reston1.va.home.com> Message-ID: <006701c06c41$896a1a00$e46940d5@hagrid> Fred wrote: > > if that's what you want, maybe you could start by > > putting the INLINE stuff back again? > > I could not see the value in the inline stuff that configure was > setting up, and still don't. the INLINE stuff guarantees that "inline" is defined to be whatever directive the compiler uses for explicit inlining. quoting the autoconf docs: If the C compiler supports the keyword inline, do nothing. Otherwise define inline to __inline__ or __inline if it accepts one of those, otherwise define inline to be empty as a result, you can always use "inline" in your code, and have it do the right thing on all compilers that support explicit inlining (all modern C compilers, in practice).
::: to deal with people compiling Python with a C compiler, but linking it with a C++ compiler, the config.h.in file could be written as: /* Define "inline" to be whatever the C compiler calls it. To avoid problems when mixing C and C++, make sure to only use "inline" for internal interfaces. */ #ifndef __cplusplus #undef inline #endif From akuchlin at mems-exchange.org Fri Dec 22 20:40:15 2000 From: akuchlin at mems-exchange.org (A.M. Kuchling) Date: Fri, 22 Dec 2000 14:40:15 -0500 Subject: [Python-Dev] PEP 222 draft Message-ID: <200012221940.OAA01936@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com> I've completed a draft of PEP 222 (sort of -- note the XXX comments in the text for things that still need to be resolved). This is being posted to python-dev, python-web-modules, and python-list/comp.lang.python, to get comments on the proposed interface. I'm on all three lists, but would prefer to see followups on python-list/comp.lang.python, so if you can reply there, please do so. --amk Abstract This PEP proposes a set of enhancements to the CGI development facilities in the Python standard library. Enhancements might be new features, new modules for tasks such as cookie support, or removal of obsolete code. The intent is to incorporate the proposals emerging from this document into Python 2.1, due to be released in the first half of 2001. Open Issues This section lists changes that have been suggested, but about which no firm decision has yet been made. In the final version of this PEP, this section should be empty, as all the changes should be classified as accepted or rejected. cgi.py: We should not be told to create our own subclass just so we can handle file uploads. As a practical matter, I have yet to find the time to do this right, so I end up reading cgi.py's temp file into, at best, another file. Some of our legacy code actually reads it into a second temp file, then into a final destination!
And even if we did, that would mean creating yet another object with its __init__ call and associated overhead. cgi.py: Currently, query data with no `=' are ignored. Even if keep_blank_values is set, queries like `...?value=&...' are returned with blank values but queries like `...?value&...' are completely lost. It would be great if such data were made available through the FieldStorage interface, either as entries with None as values, or in a separate list. Utility function: build a query string from a list of 2-tuples Dictionary-related utility classes: NoKeyErrors (returns an empty string, never a KeyError), PartialStringSubstitution (returns the original key string, never a KeyError) New Modules This section lists details about entire new packages or modules that should be added to the Python standard library. * fcgi.py : A new module adding support for the FastCGI protocol. Robin Dunn's code needs to be ported to Windows, though. Major Changes to Existing Modules This section lists details of major changes to existing modules, whether in implementation or in interface. The changes in this section therefore carry greater degrees of risk, either in introducing bugs or a backward incompatibility. The cgi.py module would be deprecated. (XXX A new module or package name hasn't been chosen yet: 'web'? 'cgilib'?) Minor Changes to Existing Modules This section lists details of minor changes to existing modules. These changes should have relatively small implementations, and have little risk of introducing incompatibilities with previous versions. Rejected Changes The changes listed in this section were proposed for Python 2.1, but were rejected as unsuitable. For each rejected change, a rationale is given describing why the change was deemed inappropriate. * An HTML generation module is not part of this PEP. Several such modules exist, ranging from HTMLgen's purely programming interface to ASP-inspired simple templating to DTML's complex templating. 
There's no indication of which templating module to enshrine in the standard library, and that probably means that no module should be so chosen. * cgi.py: Allowing a combination of query data and POST data. This doesn't seem to be standard at all, and therefore is dubious practice. Proposed Interface XXX open issues: naming convention (studlycaps or underline-separated?); need to look at the cgi.parse*() functions and see if they can be simplified, too. Parsing functions: carry over most of the parse* functions from cgi.py # The Response class borrows most of its methods from Zope's # HTTPResponse class. class Response: """ Attributes: status: HTTP status code to return headers: dictionary of response headers body: string containing the body of the HTTP response """ def __init__(self, status=200, headers={}, body=""): pass def setStatus(self, status, reason=None): "Set the numeric HTTP response code" pass def setHeader(self, name, value): "Set an HTTP header" pass def setBody(self, body): "Set the body of the response" pass def setCookie(self, name, value, path = '/', comment = None, domain = None, max_age = None, expires = None, secure = 0 ): "Set a cookie" pass def expireCookie(self, name): "Remove a cookie from the user" pass def redirect(self, url): "Redirect the browser to another URL" pass def __str__(self): "Convert entire response to a string" pass def dump(self): "Return a string representation useful for debugging" pass # XXX methods for specific classes of error: serverError, badRequest, etc.? class Request: """ Attributes: XXX should these be dictionaries, or dictionary-like objects?
.headers : dictionary containing HTTP headers .cookies : dictionary of cookies .fields : data from the form .env : environment dictionary """ def __init__(self, environ=os.environ, stdin=sys.stdin, keep_blank_values=1, strict_parsing=0): """Initialize the request object, using the provided environment and standard input.""" pass # Should people just use the dictionaries directly? def getHeader(self, name, default=None): pass def getCookie(self, name, default=None): pass def getField(self, name, default=None): "Return field's value as a string (even if it's an uploaded file)" pass def getUploadedFile(self, name): """Returns a file object that can be read to obtain the contents of an uploaded file. XXX should this report an error if the field isn't actually an uploaded file? Or should it wrap a StringIO around simple fields for consistency? """ def getURL(self, n=0, query_string=0): """Return the URL of the current request, chopping off 'n' path components from the right. Eg. if the URL is "http://foo.com/bar/baz/quux", n=2 would return "http://foo.com/bar". Does not include the query string (if any) """ def getBaseURL(self, n=0): """Return the base URL of the current request, adding 'n' path components to the end to recreate more of the whole URL. Eg. if the request URL is "http://foo.com/q/bar/baz/qux", n=0 would return "http://foo.com/", and n=2 "http://foo.com/q/bar". Returned URL does not include the query string, if any. """ def dump(self): "String representation suitable for debugging output" pass # Possibilities? I don't know if these are worth doing in the # basic objects. def getBrowser(self): "Returns Mozilla/IE/Lynx/Opera/whatever" def isSecure(self): "Return true if this is an SSLified request" # Module-level function def wrapper(func, logfile=sys.stderr): """ Calls the function 'func', passing it the arguments (request, response, logfile). Exceptions are trapped and sent to the file 'logfile'. 
""" # This wrapper will detect if it's being called from the command-line, # and if so, it will run in a debugging mode; name=value pairs # can be entered on standard input to set field values. # (XXX how to do file uploads in this syntax?) Copyright This document has been placed in the public domain. From tim.one at home.com Fri Dec 22 20:31:07 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 22 Dec 2000 14:31:07 -0500 Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support) In-Reply-To: <20001222162143.A5515@xs4all.nl> Message-ID: [Thomas Wouters] >> * Guido-style: 8-column hard-tab indents. >> * New style: 4-column space-only indents. > > Hm, I must have missed this... Is 'new style' the preferred style, as > its name suggests, or is Guido mounting a rebellion to adhere to the > One True Style (or rather his own version of it, which just has > the * in pointer type declarations wrong ? :) Every time this comes up wrt C code, 1. Fred repeats that he thinks Guido caved in (but doesn't supply a reference to anything saying so). 2. Guido repeats that he prefers old-style (but in a wishy-washy way that leaves it uncertain (*)). 3. Fredrik and/or I repeat a request for a BDFL Pronouncement. 4. And there the thread ends. It's *very* hard to find this history in the Python-Dev archives because these threads always have subject lines like this one originally had ("RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support"). Fred already did the #1 bit in this thread. You can consider this msg the repeat of #3. Since Guido is out of town, we can skip #2 and go straight to #4 early . (*) Two examples of #2 from this year: Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/ Modules mmapmodule.c,2.1,2.2 From: Guido van Rossum Date: Fri, 31 Mar 2000 07:10:45 -0500 > Can we change the 8-space-tab rule for all new C code that goes in? 
I > know that we can't practically change existing code right now, but for > new C code, I propose we use no tab characters, and we use a 4-space > block indentation. Actually, this one was formatted for 8-space indents but using 4-space tabs, so in my editor it looked like 16-space indents! Given that we don't want to change existing code, I'd prefer to stick with 1-tab 8-space indents. Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules linuxaudiodev.c,2.2,2.3 From: Guido van Rossum Date: Sat, 08 Jul 2000 09:39:51 -0500 > Aren't tabs preferred as C-source indents, instead of 4-spaces ? At > least, that's what I see in Python/*.c and Object/*.c, but I only > vaguely recall it from the style document... Yes, you're right. From fredrik at effbot.org Fri Dec 22 21:37:35 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Fri, 22 Dec 2000 21:37:35 +0100 Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support) References: Message-ID: <00e201c06c57$052fff00$e46940d5@hagrid> > 3. Fredrik and/or I repeat a request for a BDFL Pronouncement. and. From akuchlin at mems-exchange.org Fri Dec 22 22:09:47 2000 From: akuchlin at mems-exchange.org (A.M. Kuchling) Date: Fri, 22 Dec 2000 16:09:47 -0500 Subject: [Python-Dev] Reviving the bookstore Message-ID: <200012222109.QAA02737@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com> Since the PSA isn't doing anything for us any longer, I've been working on reviving the bookstore at a new location with a new affiliate code. A draft version is up at its new home, http://www.kuchling.com/bookstore/ . Please take a look and offer comments. Book authors, please take a look at the entry for your book and let me know about any corrections. Links to reviews of books would also be really welcomed. 
I'd like to abolish having book listings with no description or review, so if you notice a book that you've read has no description, please feel free to submit a description and/or review. --amk From tim.one at home.com Sat Dec 23 08:15:59 2000 From: tim.one at home.com (Tim Peters) Date: Sat, 23 Dec 2000 02:15:59 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: <3A41E68B.6B12CD71@lemburg.com> Message-ID: [Tim] > ... > So we should say "8-bit string" or "Unicode string" when *only* > one of those is allowable. So > > "ord() expected string ..." > > instead of (even a repaired version of) > > "ord() expected string or Unicode character ..." [MAL] > I think this has to do with understanding that there are two > string types in Python 2.0 -- a novice won't notice this until > she sees the error message. Except that this error msg has nothing to do with how many string types there are: they didn't pass *any* flavor of string when they get this msg. At the time they pass (say) a float to ord(), that there are currently two flavors of string is more information than they need to know. > My understanding is similar to yours, "string" should mean > "any string object" and in cases where the difference between > 8-bit string and Unicode matters, these should be referred to > as "8-bit string" and "Unicode string". In that happy case of universal harmony, the msg above should say just "string" and leave it at that. > Still, I think it is a good idea to make people aware of the > possibility of passing Unicode objects to these functions, Me too. > so perhaps the idea of adding both possibilies to error messages > is not such a bad idea for 2.1. But not that. The user is trying to track down their problem. Advertising an irrelevant (to their problem) distinction at that time of crisis is simply spam. TypeError: ord() requires an 8-bit string or a Unicode string. 
On the other hand, you'd be surprised to discover all the things you can pass to chr(): it's not just ints. Long ints are also accepted, by design, and due to an obscure bug in the Python internals, you can also pass floats, which get truncated to ints. > The next phases would be converting all messages back to "string" > and then convert all strings to Unicode ;-) Then we'll save a lot of work by skipping the need for the first half of that -- unless you're volunteering to do all of it . From tim.one at home.com Sat Dec 23 08:16:29 2000 From: tim.one at home.com (Tim Peters) Date: Sat, 23 Dec 2000 02:16:29 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: <20001221133719.B11880@kronos.cnri.reston.va.us> Message-ID: [Tim] > So we should say "8-bit string" or "Unicode string" when *only* > one of those is allowable. [Andrew] > OK... how about this patch? +1 from me. And maybe if you offer to send a royalty to Marc-Andre each time it's printed, he'll back down from wanting to use the error msgs as a billboard . > Index: bltinmodule.c > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v > retrieving revision 2.185 > diff -u -r2.185 bltinmodule.c > --- bltinmodule.c 2000/12/20 15:07:34 2.185 > +++ bltinmodule.c 2000/12/21 18:36:54 > @@ -1524,13 +1524,14 @@ > } > } else { > PyErr_Format(PyExc_TypeError, > - "ord() expected string or Unicode character, " \ > + "ord() expected string of length 1, but " \ > "%.200s found", obj->ob_type->tp_name); > return NULL; > } > > PyErr_Format(PyExc_TypeError, > - "ord() expected a character, length-%d string found", > + "ord() expected a character, " > + "but string of length %d found", > size); > return NULL; > } From barry at digicool.com Sat Dec 23 17:43:37 2000 From: barry at digicool.com (Barry A. 
Warsaw) Date: Sat, 23 Dec 2000 11:43:37 -0500 Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support References: <20001222130143.B7127@newcnri.cnri.reston.va.us> Message-ID: <14916.54841.418495.194558@anthem.concentric.net> >>>>> "AK" == Andrew Kuchling writes: AK> New style it is. (Barry, is the "python" style in cc-mode.el AK> going to be changed to new style, or a "python2" style added?) There should probably be a second style added to cc-mode.el. I haven't maintained that package in a long time, but I'll work out a patch and send it to the current maintainer. Let's call it "python2". -Barry From cgw at fnal.gov Sat Dec 23 18:09:57 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Sat, 23 Dec 2000 11:09:57 -0600 (CST) Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14916.54841.418495.194558@anthem.concentric.net> References: <20001222130143.B7127@newcnri.cnri.reston.va.us> <14916.54841.418495.194558@anthem.concentric.net> Message-ID: <14916.56421.370499.762023@buffalo.fnal.gov> Barry A. Warsaw writes: > There should probably be a second style added to cc-mode.el. I > haven't maintained that package in a long time, but I'll work out a > patch and send it to the current maintainer. Let's call it > "python2". Maybe we should wait for the BDFL's pronouncement? From barry at digicool.com Sat Dec 23 20:24:42 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Sat, 23 Dec 2000 14:24:42 -0500 Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support References: <20001222130143.B7127@newcnri.cnri.reston.va.us> <14916.54841.418495.194558@anthem.concentric.net> <14916.56421.370499.762023@buffalo.fnal.gov> Message-ID: <14916.64506.56351.443287@anthem.concentric.net> >>>>> "CGW" == Charles G Waldman writes: CGW> Barry A. Warsaw writes: >> There should probably be a second style added to cc-mode.el. 
I >> haven't maintained that package in a long time, but I'll work >> out a patch and send it to the current maintainer. Let's call >> it "python2". CGW> Maybe we should wait for the BDFL's pronouncement? Sure, at least before submitting a patch. Here's the simple one liner you can add to your .emacs file to play with the new style in the meantime. -Barry (c-add-style "python2" '("python" (c-basic-offset . 4))) From tim.one at home.com Sun Dec 24 05:04:47 2000 From: tim.one at home.com (Tim Peters) Date: Sat, 23 Dec 2000 23:04:47 -0500 Subject: [Python-Dev] PEP 208 and __coerce__ In-Reply-To: <20001209033006.A3737@glacier.fnational.com> Message-ID: [Neil Schemenauer Saturday, December 09, 2000 6:30 AM] > While working on the implementation of PEP 208, I discovered that > __coerce__ has some surprising properties. Initially I > implemented __coerce__ so that the numeric operation currently > being performed was called on the values returned by __coerce__. > This caused test_class to blow up due to code like this: > > class Test: > def __coerce__(self, other): > return (self, other) > > The 2.0 "solves" this by not calling __coerce__ again if the > objects returned by __coerce__ are instances. If C.__coerce__ doesn't *know* it can do the full job, it should return None. This is what's documented, too: a coerce method should return a pair consisting of objects of the same type, or return None. It's always going to be somewhat clumsy since what you really want is double (or, in the case of pow, sometimes triple) dispatch. Now there's a deliberate cheat that may not have gotten documented comprehensibly: when __coerce__ returns a pair, Python does not check to verify both elements are of the same class. That's because "a pair consisting of objects of the same type" is often not what you *want* from coerce.
For example, if I've got a matrix class M, then in M() + 42 I really don't want M.__coerce__ "promoting" 42 to a multi-gigabyte matrix matching the shape and size of M(). M.__add__ can deal with that much more efficiently if it gets 42 directly. OTOH, M.__coerce__ may want to coerce types other than scalar numbers to conform to the shape and size of self, or fiddle self to conform to some other type. What Python accepts back from __coerce__ has to be flexible enough to allow all of those without further interference from the interpreter (just ask MAL : the *real* problem in practice is making coerce more of a help than a burden to the end user; outside of int->long->float->complex (which is itself partly broken, because long->float can lose precision or even fail outright), "coercion to a common type" is almost never quite right; note that C99 introduces distinct imaginary and complex types, because even auto-conversion of imaginary->complex can be a royal PITA!). > This has the effect of making code like: > > class A: > def __coerce__(self, other): > return B(), other > > class B: > def __coerce__(self, other): > return 1, other > > A() + 1 > > fail to work in the expected way. I have no idea how you expected that to work. Neither coerce() method looks reasonable: they don't follow the rules for coerce methods. If A thinks it needs to create a B() and have coercion "start over from scratch" with that, then it should do so explicitly: class A: def __coerce__(self, other): return coerce(B(), other) > The question is: how should __coerce__ work? This can't be answered by a one-liner: the intended behavior is documented by a complex set of rules at the bottom of Lang Ref 3.3.6 ("Emulating numeric types"). Alternatives should be written up as a diff against those rules, which Guido worked hard on in years past -- more than once, too . From esr at thyrsus.com Mon Dec 25 10:17:23 2000 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Mon, 25 Dec 2000 04:17:23 -0500 Subject: [Python-Dev] Tkinter support under RH 7.0? Message-ID: <20001225041723.A9567@thyrsus.com> I just upgraded to Red Hat 7.0 and installed Python 2.0. Anybody have a recipe for making Tkinter support work in this environment? -- Eric S. Raymond "Government is not reason, it is not eloquence, it is force; like fire, a troublesome servant and a fearful master. Never for a moment should it be left to irresponsible action." -- George Washington, in a speech of January 7, 1790 From thomas at xs4all.net Mon Dec 25 11:59:45 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Mon, 25 Dec 2000 11:59:45 +0100 Subject: [Python-Dev] Tkinter support under RH 7.0? In-Reply-To: <20001225041723.A9567@thyrsus.com>; from esr@thyrsus.com on Mon, Dec 25, 2000 at 04:17:23AM -0500 References: <20001225041723.A9567@thyrsus.com> Message-ID: <20001225115945.A25820@xs4all.nl> On Mon, Dec 25, 2000 at 04:17:23AM -0500, Eric S. Raymond wrote: > I just upgraded to Red Hat 7.0 and installed Python 2.0. Anybody have > a recipe for making Tkinter support work in this environment? I installed Python 2.0 + Tkinter both from the BeOpen rpms and later from source (for various reasons) and both were a breeze. I didn't really use the 2.0+tkinter rpm version until I needed Numpy and various other things and had to revert to the self-compiled version, but it seemed to work fine. As far as I can recall, there's only two things you have to keep in mind: the tcl/tk version that comes with RedHat 7.0 is 8.3, so you have to adjust the Tkinter section of Modules/Setup accordingly, and some of the RedHat-supplied scripts stop working because they use deprecated modules (at least 'rand') and use the socket.socket call wrong. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From esr at thyrsus.com Wed Dec 27 20:37:50 2000 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 27 Dec 2000 14:37:50 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues Message-ID: <20001227143750.A26894@thyrsus.com> I have 2.0 up and running on RH7.0, compiled from sources. In the process, I discovered a couple of issues: 1. The curses module is commented out in the default Modules/Setup file. This is not good, as it may lead careless distribution builders to ship Python 2.0s that will not be able to support the curses front end in CML2. Supporting CML2 (and thus getting Python the "design win" of being involved in the Linux kernel build) was the major point of integrating the curses module into the Python core. It is possible that one little "#" may have blown that. 2. The default Modules/Setup file assumes that various Tkinter-related libraries are in /usr/local. But /usr would be a more appropriate choice under most circumstances. Most Linux users now install their Tcl/Tk stuff from RPMs or .deb packages that place the binaries and libraries under /usr. Under most other Unixes (e.g. Solaris) they were there to begin with. 3. The configure machinery could be made to deduce more about the contents of Modules/Setup than it does now. In particular, it's silly that the person building Python has to fill in the locations of X libraries when configure is in principle perfectly capable of finding them. -- Eric S. Raymond Our society won't be truly free until "None of the Above" is always an option. From guido at digicool.com Wed Dec 27 22:04:27 2000 From: guido at digicool.com (Guido van Rossum) Date: Wed, 27 Dec 2000 16:04:27 -0500 Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support) In-Reply-To: Your message of "Fri, 22 Dec 2000 14:31:07 EST." References: Message-ID: <200012272104.QAA22278@cj20424-a.reston1.va.home.com> > 2. Guido repeats that he prefers old-style (but in a wishy-washy way that > leaves it uncertain (*)).
OK, since a pronouncement is obviously needed, here goes: Python C source code should be indented using tabs only. Exceptions: (1) If 3rd party code is already written using a different style, it can stay that way, especially if it's a large volume that would be hard to reformat. But only if it is consistent within a file or set of files (e.g. a 3rd party patch will have to conform to the prevailing style in the patched file). (2) Occasionally (e.g. in ceval.c) there is code that's very deeply nested. I will allow 4-space indents for the innermost nesting levels here. Other C whitespace nits: - Always place spaces around assignment operators, comparisons, &&, ||. - No space between function name and left parenthesis. - Always a space between a keyword ('if', 'for' etc.) and left paren. - No space inside parentheses, brackets etc. - No space before a comma or semicolon. - Always a space after a comma (and semicolon, if not at end of line). - Use ``return x;'' instead of ``return(x)''. --Guido van Rossum (home page: http://www.python.org/~guido/) From cgw at fnal.gov Wed Dec 27 23:17:31 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Wed, 27 Dec 2000 16:17:31 -0600 (CST) Subject: [Python-Dev] sourceforge: problems with bug list? Message-ID: <14922.27259.456364.750295@buffalo.fnal.gov> Is it just me, or is anybody else getting this error when trying to access the bug list? > An error occured in the logger. ERROR: pg_atoi: error in "5470/": > can't parse "/" From akuchlin at mems-exchange.org Wed Dec 27 23:39:35 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Wed, 27 Dec 2000 17:39:35 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500 References: <20001227143750.A26894@thyrsus.com> Message-ID: <20001227173935.A25605@kronos.cnri.reston.va.us> On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote: >1. 
The curses module is commented out in the default Modules/Setup >file. This is not good, as it may lead careless distribution builders It always has been commented out. Good distributions ship with most of the available modules enabled; I can't say if RH7.0 counts as a good distribution or not (still on 6.2). >3. The configure machinery could be made to deduce more about the contents >of Modules/Setup than it does now. In particular, it's silly that the person This is the point of PEP 229 and patch #102588, which uses a setup.py script to build extension modules. (I need to upload an updated version of the patch which actually includes setup.py -- thought I did that, but apparently not...) The patch is still extremely green, but I think it's the best course; witness the tissue of hackery required to get the bsddb module automatically detected and built. --amk From guido at digicool.com Wed Dec 27 23:54:26 2000 From: guido at digicool.com (Guido van Rossum) Date: Wed, 27 Dec 2000 17:54:26 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: Your message of "Fri, 22 Dec 2000 10:58:56 EST." <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> Message-ID: <200012272254.RAA22931@cj20424-a.reston1.va.home.com> > Charles G Waldman writes: > > I am reminded of Linus Torvalds' comments on this subject (see > > /usr/src/linux/Documentation/CodingStyle): Fred replied: > The catch, of course, is Python/ceval.c, where breaking it up can > hurt performance. People scream when you do things like that.... Funny, Jeremy is doing just that, and it doesn't seem to be hurting performance at all. See http://sourceforge.net/patch/?func=detailpatch&patch_id=102337&group_id=5470 (though this is not quite finished).
--Guido van Rossum (home page: http://www.python.org/~guido/) From esr at thyrsus.com Thu Dec 28 00:05:46 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Dec 2000 18:05:46 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001227173935.A25605@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Dec 27, 2000 at 05:39:35PM -0500 References: <20001227143750.A26894@thyrsus.com> <20001227173935.A25605@kronos.cnri.reston.va.us> Message-ID: <20001227180546.A4365@thyrsus.com> Andrew Kuchling : > >1. The curses module is commented out in the default Modules/Setup > >file. This is not good, as it may lead careless distribution builders > > It always has been commented out. Good distributions ship with most > of the available modules enabled; I can't say if RH7.0 counts as a > good distribution or not (still on 6.2). I think this needs to change. If curses is a core facility now, the default build should treat it as one. -- Eric S. Raymond If a thousand men were not to pay their tax-bills this year, that would ... [be] the definition of a peaceable revolution, if any such is possible. -- Henry David Thoreau From tim.one at home.com Thu Dec 28 01:44:29 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 27 Dec 2000 19:44:29 -0500 Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109 In-Reply-To: Message-ID: [Barry Warsaw] > Modified Files: > python-mode.el > Log Message: > (python-font-lock-keywords): Add highlighting of `as' as a keyword, > but only in "import foo as bar" statements (including optional > preceding `from' clause). Oh, that's right, try to make IDLE look bad, will you? I've got half a mind to take up the challenge. Unfortunately, I only have half a mind in total, so you may get away with this backstabbing for a while.
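Until curses is enabled in the default build, an application that wants the curses front end has to guard the import and degrade gracefully. A minimal sketch of that defensive pattern (the function names and fallback behaviour are invented for illustration):

```python
# Probe for the optional curses module at import time.
try:
    import curses
except ImportError:
    curses = None

def have_curses():
    """Report whether this Python was built with the curses module."""
    return curses is not None

def run_frontend():
    # Fall back to a line-oriented interface when curses is absent.
    if not have_curses():
        return "line-oriented fallback"
    return "curses interface"
```

This is exactly the per-application boilerplate that building curses into the core by default would make unnecessary.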
From thomas at xs4all.net Thu Dec 28 10:53:31 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 10:53:31 +0100 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500 References: <20001227143750.A26894@thyrsus.com> Message-ID: <20001228105331.A6042@xs4all.nl> On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote: > I have 2.0 up and running on RH7.0, compiled from sources. In the process, > I discovered a couple of issues: > 1. The curses module is commented out in the default Modules/Setup > file. This is not good, as it may lead careless distribution builders > to ship Python 2.0s that will not be able to support the curses front > end in CML2. Supporting CML2 (and thus getting Python the "design > win" of being involved in the Linux kernel build) was the major point > of integrating the curses module into the Python core. It is possible > that one little "#" may have blown that. Note that Tkinter is off by default too. And readline. And ssl. And the use of shared libraries. We *can't* enable the cursesmodule by default, because we don't know what the system's curses library is called. We'd have to auto-detect that before we can enable it (and lots of other modules) automatically, and that's a lot of work. I personally favour autoconf for the job, but since amk is already busy on using distutils, I'm not going to work on that. > 2.The default Modules/Setup file assumes that various Tkinter-related libraries > are in /usr/local. But /usr would be a more appropriate choice under most > circumstances. Most Linux users now install their Tcl/Tk stuff from RPMs > or .deb packages that place the binaries and libraries under /usr. Under > most other Unixes (e.g. Solaris) they were there to begin with. This is nonsense. The line above it specifically states 'edit to reflect where your Tcl/Tk headers are'. 
And besides from the issue whether they are usually found in /usr (I don't believe so, not even on Solaris, but 'my' Solaris box doesn't even have tcl/tk,) /usr/local is a perfectly sane choice, since /usr is already included in the include-path, but /usr/local usually is not. > 3. The configure machinery could be made to deduce more about the contents > of Modules/Setup than it does now. In particular, it's silly that the person > building Python has to fill in the locations of X librasries when > configure is in principle perfectly capable of finding them. In principle, I agree. It's a lot of work, though. For instance, Debian stores the Tcl/Tk headers in /usr/include/tcl, which means you can compile for more than one tcl version, by just changing your include path and the library you link with. And there are undoubtedly several other variants out there. Should we really make the Setup file default to Linux, and leave other operating systems in the dark about what it might be on their system ? I think people with Linux and without clue are the least likely people to compile their own Python, since Linux distributions already come with a decent enough Python. And, please, lets assume the people assembling those know how to read ? Maybe we just need a HOWTO document covering Setup ? (Besides, won't this all be fixed when CML2 comes with a distribution, Eric ? They'll *have* to have working curses/tkinter then :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From MarkH at ActiveState.com Thu Dec 28 13:34:09 2000 From: MarkH at ActiveState.com (Mark Hammond) Date: Thu, 28 Dec 2000 23:34:09 +1100 Subject: [Python-Dev] Fwd: try...else Message-ID: <3A4B3341.5010707@ActiveState.com> Spotted on c.l.python. Although Pythonwin is mentioned, python.exe gives the same results - as does 1.5.2. Seems a reasonable question... [Also, if Robin hasn't been invited to join us here, I think it could make some sense...] Mark. 
-------- Original Message -------- Subject: try...else Date: Fri, 22 Dec 2000 18:02:27 +0000 From: Robin Becker Newsgroups: comp.lang.python I had expected that in try: except: else the else clause always got executed, but it seems not for return PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32.Portions Copyright 1994-2000 Mark Hammond (MarkH at ActiveState.com) - see 'Help/About PythonWin' for further copyright information. >>> def bang(): .... try: .... return 'return value' .... except: .... print 'bang failed' .... else: .... print 'bang succeeded' .... >>> bang() 'return value' >>> is this a 'feature' or bug. The 2.0 docs seem not to mention return/continue except for try finally. -- Robin Becker From mal at lemburg.com Thu Dec 28 15:45:49 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 15:45:49 +0100 Subject: [Python-Dev] Fwd: try...else References: <3A4B3341.5010707@ActiveState.com> Message-ID: <3A4B521D.4372224A@lemburg.com> Mark Hammond wrote: > > Spotted on c.l.python. Although Pythonwin is mentioned, python.exe > gives the same results - as does 1.5.2. > > Seems a reasonable question... > > [Also, if Robin hasn't been invited to join us here, I think it could > make some sense...] > > Mark. > -------- Original Message -------- > Subject: try...else > Date: Fri, 22 Dec 2000 18:02:27 +0000 > From: Robin Becker > Newsgroups: comp.lang.python > > I had expected that in try: except: else > the else clause always got executed, but it seems not for return I think Robin mixed up try...finally with try...except...else. The finally clause is executed even in case an exception occurred. He does have a point however that 'return' will bypass try...else and try...finally clauses. I don't think we can change that behaviour, though, as it would break code. 
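The two behaviours under discussion can be checked side by side: a return from the try suite skips the else clause, while a finally suite still runs. A sketch in current Python syntax rather than the 2.0 session above:

```python
events = []

def bang():
    try:
        return 'return value'
    except Exception:
        events.append('bang failed')
    else:
        events.append('bang succeeded')  # skipped: return exits first

def boom():
    try:
        return 'return value'
    finally:
        events.append('finally ran')     # runs even on return
```

Calling both functions leaves only 'finally ran' in events: the else clause runs only when the try suite falls off the end normally, which a return prevents.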
> PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on > win32.Portions Copyright 1994-2000 Mark Hammond (MarkH at ActiveState.com) > - see 'Help/About PythonWin' for further copyright information. > >>> def bang(): > .... try: > .... return 'return value' > .... except: > .... print 'bang failed' > .... else: > .... print 'bang succeeded' > .... > >>> bang() > 'return value' > >>> > > is this a 'feature' or bug. The 2.0 docs seem not to mention > return/continue except for try finally. > -- > Robin Becker > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://www.python.org/mailman/listinfo/python-dev -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at digicool.com Thu Dec 28 16:04:23 2000 From: guido at digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 10:04:23 -0500 Subject: [Python-Dev] chomp()? Message-ID: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Someone just posted a patch to implement s.chomp() as a string method: http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470 Pseudo code (for those not aware of the Perl function by that name): def chomp(s): if s[-2:] == '\r\n': return s[:-2] if s[-1:] == '\r' or s[-1:] == '\n': return s[:-1] return s I.e. it removes a trailing \r\n, \r, or \n. Any comments? Is this needed given that we have s.rstrip() already? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at digicool.com Thu Dec 28 16:30:57 2000 From: guido at digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 10:30:57 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: Your message of "Wed, 27 Dec 2000 14:37:50 EST." 
<20001227143750.A26894@thyrsus.com> References: <20001227143750.A26894@thyrsus.com> Message-ID: <200012281530.KAA26049@cj20424-a.reston1.va.home.com> Eric, I think your recent posts have shown a worldview that's a bit too Eric-centered. :-) Not all the world is Linux. CML2 isn't the only Python application that matters. Python world domination is not a goal. There is no Eric conspiracy! :-) That said, I think that the future is bright: Andrew is already working on a much more intelligent configuration manager. I believe it would be a mistake to enable curses by default using the current approach to module configuration: it doesn't compile out of the box on every platform, and you wouldn't believe how much email I get from clueless Unix users trying to build Python when there's a problem like that in the distribution. So I'd rather wait for Andrew's work. You could do worse than help him with that, to further your goal! --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Thu Dec 28 16:41:23 2000 From: fdrake at acm.org (Fred L. Drake) Date: Thu, 28 Dec 2000 10:41:23 -0500 Subject: [Python-Dev] chomp()? In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: On Thu, 28 Dec 2000 10:04:23 -0500, Guido wrote: > Someone just posted a patch to implement s.chomp() as a > string method: ... > Any comments? Is this needed given that we have > s.rstrip() already? I've always considered this a different operation from rstrip(). When you intend to be as surgical in your changes as possible, it is important *not* to use rstrip(). I don't feel strongly that it needs to be implemented in C, though I imagine people who do a lot of string processing feel otherwise. It's just hard to beat the performance difference if you are doing this a lot. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From barry at digicool.com Thu Dec 28 17:00:36 2000 From: barry at digicool.com (Barry A.
Warsaw) Date: Thu, 28 Dec 2000 11:00:36 -0500 Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109 References: Message-ID: <14923.25508.668453.186209@anthem.concentric.net> >>>>> "TP" == Tim Peters writes: TP> [Barry Warsaw] >> Modified Files: python-mode.el Log Message: >> (python-font-lock-keywords): Add highlighting of `as' as a >> keyword, but only in "import foo as bar" statements (including >> optional preceding `from' clause). TP> Oh, that's right, try to make IDLE look bad, will you? I've TP> got half a mind to take up the challenge. Unfortunately, I TP> only have half a mind in total, so you may get away with this TP> backstabbing for a while. With my current network (un)connectivity, I feel like a nuclear sub which can only surface once a month to receive low frequency orders from some remote antenna farm out in New Brunswick. Just think of me as a rogue commander who tries to do as much damage as possible when he's not joyriding in the draft-wake of giant squids. rehoming-all-remaining-missiles-at-the-Kingdom-of-Timbotia-ly y'rs, -Barry From esr at thyrsus.com Thu Dec 28 17:01:54 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 11:01:54 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <200012281530.KAA26049@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 10:30:57AM -0500 References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> Message-ID: <20001228110154.D32394@thyrsus.com> Guido van Rossum : > Not all the world is Linux. CML2 isn't the only Python application > that matters. Python world domination is not a goal. There is no > Eric conspiracy! :-) Perhaps I misunderstood you, then. I thought you considered CML2 a potentially important design win, and that was why curses didn't get dropped from the core. Have you changed your mind about this?
If Python world domination is not a goal then I can only conclude that you haven't had your morning coffee yet :-). There's a more general question here about what it means for something to be in the core language. Developers need to have a clear, bright-line picture of what they can count on to be present. To me this implies that it's the job of the Python maintainers to make sure that a facility declared "core" by its presence in the standard library documentation is always present, for maximum "batteries are included" effect. Yes, dealing with cross-platform variations in linking curses is a pain -- but dealing with that kind of pain so the Python user doesn't have to is precisely our job. Or so I understand it, anyway. -- Eric S. Raymond Conservatism is the blind and fear-filled worship of dead radicals. From moshez at zadka.site.co.il Thu Dec 28 17:51:32 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: 28 Dec 2000 16:51:32 -0000 Subject: [Python-Dev] chomp()? In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <20001228165132.8025.qmail@stimpy.scso.com> On Thu, 28 Dec 2000, Guido van Rossum wrote: > Someone just posted a patch to implement s.chomp() as a string method: ... > Any comments? Is this needed given that we have s.rstrip() already? Yes.

    i = 0
    for line in fileinput.input():
        print '%d: %s' % (i, line.chomp())
        i += 1

I want that operation to be invertible by sed 's/^[0-9]*: //' From guido at digicool.com Thu Dec 28 18:08:18 2000 From: guido at digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 12:08:18 -0500 Subject: [Python-Dev] scp to sourceforge Message-ID: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> I've seen a thread on this but there was no conclusive answer, so I'm reopening this. I can't SCP updated PEPs to the SourceForge machine. The "pep2html.py -i" command just hangs.
I can ssh into shell.sourceforge.net just fine, but scp just hangs. "scp -v" prints a bunch of things suggesting that it can authenticate itself just fine, ending with these three lines: cj20424-a.reston1.va.home.com: RSA authentication accepted by server. cj20424-a.reston1.va.home.com: Sending command: scp -v -t . cj20424-a.reston1.va.home.com: Entering interactive session. and then nothing. It just sits there. Would somebody please figure out a way to update the PEPs? It's kind of pathetic to see the website not have the latest versions... --Guido van Rossum (home page: http://www.python.org/~guido/) From moshez at zadka.site.co.il Thu Dec 28 17:28:07 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: 28 Dec 2000 16:28:07 -0000 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B521D.4372224A@lemburg.com> References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> Message-ID: <20001228162807.7229.qmail@stimpy.scso.com> On Thu, 28 Dec 2000, "M.-A. Lemburg" wrote: > He does have a point however that 'return' will bypass > try...else and try...finally clauses. I don't think we can change > that behaviour, though, as it would break code. It doesn't bypass try..finally: >>> def foo(): ... try: ... print "hello" ... return ... finally: ... print "goodbye" ... >>> foo() hello goodbye From guido at digicool.com Thu Dec 28 17:43:26 2000 From: guido at digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 11:43:26 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: Your message of "Thu, 28 Dec 2000 11:01:54 EST." <20001228110154.D32394@thyrsus.com> References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> <20001228110154.D32394@thyrsus.com> Message-ID: <200012281643.LAA26687@cj20424-a.reston1.va.home.com> > Guido van Rossum : > > Not all the world is Linux. CML2 isn't the only Python application > > that matters. Python world domination is not a goal. 
There is no > > Eric conspiracy! :-) > > Perhaps I misunderstood you, then. I thought you considered CML2 an > potentially important design win, and that was why curses didn't get > dropped from the core. Have you changed your mind about this? Supporting CML2 was one of the reasons to keep curses in the core, but not the only one. Linux kernel configuration is so far removed from my daily use of computers that I don't have a good way to judge its importance in the grand scheme of things. Since you obviously consider it very important, and since I generally trust your judgement (except on the issue of firearms :-), your plea for keeping, and improving, curses support in the Python core made a difference in my decision. And don't worry, I don't expect to change that decision -- though I personally still find it curious that curses is so important. I find curses-style user interfaces pretty pathetic, and wished that Linux migrated to a real GUI for configuration. (And the linuxconf approach does *not* qualify as a a real GUI. :-) > If Python world domination is not a goal then I can only conclude that > you haven't had your morning coffee yet :-). Sorry to disappoint you, Eric. I gave up coffee years ago. :-) I was totally serious though: my personal satisfaction doesn't come from Python world domination. Others seem have that goal, and if it doesn't inconvenience me too much I'll play along, but in the end I've got some goals in my personal life that are much more important. > There's a more general question here about what it means for something > to be in the core language. Developers need to have a clear, > bright-line picture of what they can count on to be present. To me > this implies that it's the job of the Python maintainers to make sure > that a facility declared "core" by its presence in the standard > library documentation is always present, for maximum "batteries are > included" effect. We do the best we can. 
Using the current module configuration system, it's a one-character edit to enable curses if you need it. With Andrew's new scheme, it will be automatic. > Yes, dealing with cross-platform variations in linking curses is a > pain -- but dealing with that kind of pain so the Python user doesn't > have to is precisely our job. Or so I understand it, anyway. So help Andrew: http://python.sourceforge.net/peps/pep-0229.html --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Thu Dec 28 17:52:36 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 17:52:36 +0100 Subject: [Python-Dev] Fwd: try...else References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> Message-ID: <3A4B6FD3.9B576E9A@lemburg.com> Moshe Zadka wrote: > > On Thu, 28 Dec 2000, "M.-A. Lemburg" wrote: > > > He does have a point however that 'return' will bypass > > try...else and try...finally clauses. I don't think we can change > > that behaviour, though, as it would break code. > > It doesn't bypass try..finally: > > >>> def foo(): > ... try: > ... print "hello" > ... return > ... finally: > ... print "goodbye" > ... > >>> foo() > hello > goodbye Hmm, that must have changed between Python 1.5 and more recent versions: Python 1.5: >>> def f(): ... try: ... return 1 ... finally: ... print 'finally' ... >>> f() 1 >>> Python 2.0: >>> def f(): ... try: ... return 1 ... finally: ... print 'finally' ... 
>>> f() finally 1 >>> -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From moshez at stimpy.scso.com Thu Dec 28 17:59:32 2000 From: moshez at stimpy.scso.com (Moshe Zadka) Date: 28 Dec 2000 16:59:32 -0000 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B6FD3.9B576E9A@lemburg.com> References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> Message-ID: <20001228165932.8143.qmail@stimpy.scso.com> On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" wrote: [about try..finally not playing well with return] > Hmm, that must have changed between Python 1.5 and more recent > versions: I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2? From esr at thyrsus.com Thu Dec 28 18:20:48 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 12:20:48 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001228105331.A6042@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 10:53:31AM +0100 References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> Message-ID: <20001228122048.A1381@thyrsus.com> Thomas Wouters : > > 1. The curses module is commented out in the default Modules/Setup > > file. This is not good, as it may lead careless distribution builders > > to ship Python 2.0s that will not be able to support the curses front > > end in CML2. Supporting CML2 (and thus getting Python the "design > > win" of being involved in the Linux kernel build) was the major point > > of integrating the curses module into the Python core. It is possible > > that one little "#" may have blown that. > > Note that Tkinter is off by default too. And readline. And ssl. And the use > of shared libraries. 
IMO ssl isn't an issue because it's not documented as being in the standard module set. Readline is a minor issue because raw_input()'s functionality changes somewhat if it's not linked, but I think we can live with this -- the change isn't visible to calling programs. Hm. It appears tkinter isn't documented in the standard set of modules either. Interesting. Technically this means I don't have a problem with it not being built in by default, but I think there is a problem here... My more general point is that right now Python has three classes of modules:

1. Documented as being in the core and built in by default.
2. Not documented as being in the core and not built in by default.
3. Documented as being in the core but not built in by default.

My more general claim is that the existence of class 3 is a problem, because it compromises the "batteries are included" effect -- it means Python users don't have a bright-line test for what will be present in every Python (or at least every Python on an operating system theoretically feature-compatible with theirs). My struggle to get CML2 adopted brings this problem into particularly sharp focus because the kernel group is allergic to big footprints or having to download extension modules to do a build. But the issue is really broader than that. I think we ought to be migrating stuff out of class 3 into class 1 where possible and to class 2 only where unavoidable. > We *can't* enable the cursesmodule by default, because > we don't know what the system's curses library is called. We'd have to > auto-detect that before we can enable it (and lots of other modules) > automatically, and that's a lot of work. I personally favour autoconf for > the job, but since amk is already busy on using distutils, I'm not going to > work on that. Yes, we need to do a lot more autodetection -- this is a large part of my point.
I have nothing against distutils, but I don't see how it solves this problem unless we assume that we'll always have Python already available on any platform where we're building Python. I'm willing to put my effort where my mouth is on this. I have a lot of experience with autoconf; I'm willing to write some of these nasty config tests. > > 2. The default Modules/Setup file assumes that various Tkinter-related libraries > > are in /usr/local. But /usr would be a more appropriate choice under most > > circumstances. Most Linux users now install their Tcl/Tk stuff from RPMs > > or .deb packages that place the binaries and libraries under /usr. Under > > most other Unixes (e.g. Solaris) they were there to begin with. > > This is nonsense. The line above it specifically states 'edit to reflect > where your Tcl/Tk headers are'. And aside from the issue of whether they are > usually found in /usr (I don't believe so, not even on Solaris, but 'my' > Solaris box doesn't even have tcl/tk,) /usr/local is a perfectly sane > choice, since /usr is already included in the include-path, but /usr/local > usually is not. Is it? That is not clear from the comment. Perhaps this is just a documentation problem. I'll look again. > > 3. The configure machinery could be made to deduce more about the contents > > of Modules/Setup than it does now. In particular, it's silly that the > > person building Python has to fill in the locations of X libraries when > > configure is in principle perfectly capable of finding them. > > In principle, I agree. It's a lot of work, though. For instance, Debian > stores the Tcl/Tk headers in /usr/include/tcl, which means you can > compile for more than one tcl version, by just changing your include path > and the library you link with. And there are undoubtedly several other > variants out there. As I said to Guido, I think it is exactly our job to deal with this sort of grottiness.
One of Python's major selling points is supposed to be cross-platform consistency of the API. If we fail to do what you're describing, we're failing to meet Python users' reasonable expectations for the language. > Should we really make the Setup file default to Linux, and leave other > operating systems in the dark about what it might be on their system ? I > think people with Linux and without clue are the least likely people to > compile their own Python, since Linux distributions already come with a > decent enough Python. And, please, let's assume the people assembling those > know how to read ? Please note that I am specifically *not* advocating making the build defaults Linux-centric. That's not my point at all. > Maybe we just need a HOWTO document covering Setup ? That would be a good idea. > (Besides, won't this all be fixed when CML2 comes with a distribution, Eric ? > They'll *have* to have working curses/tkinter then :-) I'm concerned that it will work the other way around, that CML2 won't happen if the core does not reliably include these facilities. In itself CML2 not happening wouldn't be the end of the world of course, but I'm pushing on this because I think the larger issue of class 3 modules is actually important to the health of Python and needs to be attacked seriously. -- Eric S. Raymond The Bible is not my book, and Christianity is not my religion. I could never give assent to the long, complicated statements of Christian dogma. -- Abraham Lincoln From cgw at fnal.gov Thu Dec 28 18:36:06 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Thu, 28 Dec 2000 11:36:06 -0600 (CST) Subject: [Python-Dev] chomp()? In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <14923.31238.65155.496546@buffalo.fnal.gov> Guido van Rossum writes: > Someone just posted a patch to implement s.chomp() as a string method: > I.e. it removes a trailing \r\n, \r, or \n.
> > Any comments? Is this needed given that we have s.rstrip() already? -1 from me. P=NP (Python is not Perl). "Chomp" is an excessively cute name. And like you said, this is too much like "rstrip" to merit a separate method. From esr at thyrsus.com Thu Dec 28 18:41:17 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 12:41:17 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <200012281643.LAA26687@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 11:43:26AM -0500 References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> <20001228110154.D32394@thyrsus.com> <200012281643.LAA26687@cj20424-a.reston1.va.home.com> Message-ID: <20001228124117.B1381@thyrsus.com> Guido van Rossum : > Supporting CML2 was one of the reasons to keep curses in the core, but > not the only one. Linux kernel configuration is so far removed from > my daily use of computers that I don't have a good way to judge its > importance in the grand scheme of things. Since you obviously > consider it very important, and since I generally trust your judgement > (except on the issue of firearms :-), your plea for keeping, and > improving, curses support in the Python core made a difference in my > decision. And don't worry, I don't expect to change that decision > -- though I personally still find it curious that curses is so important. > I find curses-style user interfaces pretty pathetic, and wished that > Linux migrated to a real GUI for configuration. (And the linuxconf > approach does *not* qualify as a real GUI. :-) Thank you, that makes your priorities much clearer. Actually I agree with you that curses interfaces are mostly pretty pathetic. A lot of people still like them, though, because they tend to be fast and lightweight. Then, too, a really well-designed curses interface can in fact be good enough that the usability gain from GUIizing is marginal.
My favorite examples of this are mutt and slrn. The fact that GUI programs have failed to make much headway against this is not simply due to user conservatism, it's genuinely hard to see how a GUI interface could be made significantly better. And unfortunately, there is a niche where it is still important to support curses interfacing independently of anyone's preferences in interface style -- early in the system-configuration process before one has bootstrapped to the point where X is reliably available. I hasten to add that this is not just *my* problem -- one of your more important Python constituencies in a practical sense is the guys who maintain Red Hat's installer. > I was totally serious though: my personal satisfaction doesn't come > from Python world domination. Others seem have that goal, and if it > doesn't inconvenience me too much I'll play along, but in the end I've > got some goals in my personal life that are much more important. There speaks the new husband :-). OK. So what *do* you want from Python? Personally, BTW, my goal is not exactly Python world domination either -- it's that the world should be dominated by the language that has the least tendency to produce grotty fragile code (remember that I tend to obsess about the software-quality problem :-)). Right now that's Python. -- Eric S. Raymond The people of the various provinces are strictly forbidden to have in their possession any swords, short swords, bows, spears, firearms, or other types of arms. The possession of unnecessary implements makes difficult the collection of taxes and dues and tends to foment uprisings. -- Toyotomi Hideyoshi, dictator of Japan, August 1588 From mal at lemburg.com Thu Dec 28 18:43:13 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 18:43:13 +0100 Subject: [Python-Dev] chomp()? 
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <3A4B7BB1.F09660ED@lemburg.com> Guido van Rossum wrote: > > Someone just posted a patch to implement s.chomp() as a string method: > > http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470 > > Pseudo code (for those not aware of the Perl function by that name): > > def chomp(s): > if s[-2:] == '\r\n': > return s[:-2] > if s[-1:] == '\r' or s[-1:] == '\n': > return s[:-1] > return s > > I.e. it removes a trailing \r\n, \r, or \n. > > Any comments? Is this needed given that we have s.rstrip() already? We already have .splitlines() which does the above (remove line breaks) not only for a single line, but for many lines at once. Even better: .splitlines() also does the right thing for Unicode. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Thu Dec 28 20:06:33 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 20:06:33 +0100 Subject: [Python-Dev] Fwd: try...else References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> <20001228165932.8143.qmail@stimpy.scso.com> Message-ID: <3A4B8F39.58C64EFB@lemburg.com> Moshe Zadka wrote: > > On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" wrote: > > [about try..finally not playing well with return] > > Hmm, that must have changed between Python 1.5 and more recent > > versions: > > I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2? Sorry, false alarm: there was a bug in my patched 1.5 version. The original 1.5 version does not show the described behaviour. 
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From thomas at xs4all.net Thu Dec 28 21:21:15 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 21:21:15 +0100 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B521D.4372224A@lemburg.com>; from mal@lemburg.com on Thu, Dec 28, 2000 at 03:45:49PM +0100 References: <3A4B3341.5010707@ActiveState.com> <3A4B521D.4372224A@lemburg.com> Message-ID: <20001228212115.C1811@xs4all.nl> On Thu, Dec 28, 2000 at 03:45:49PM +0100, M.-A. Lemburg wrote: > > I had expected that in try: except: else > > the else clause always got executed, but it seems not for return > I think Robin mixed up try...finally with try...except...else. > The finally clause is executed even in case an exception occurred. (MAL and I already discussed this in private mail: Robin did mean try/except/else, and 'finally' already executes when returning directly from the 'try' block, even in Python 1.5) > He does have a point however that 'return' will bypass > try...else and try...finally clauses. I don't think we can change > that behaviour, though, as it would break code. This code:

    try:
        return
    except:
        pass
    else:
        print "returning"

will indeed not print 'returning', but I believe it's by design. I'm against changing it, in any case, and not just because it'd break code :) If you want something that always executes, use a 'finally'. Or don't return from the 'try', but return in the 'else' clause. The 'except' clause is documented to execute if a matching exception occurs, and 'else' if no exception occurs. Maybe the intent of the 'else' clause would be clearer if it was documented to 'execute if the try: clause finishes without an exception being raised' ?
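To make the difference concrete, here is a small self-contained demonstration (the function names are invented for illustration): a 'return' inside 'try' skips the 'else' clause, while a 'finally' clause still runs.

```python
def with_else():
    log = []
    try:
        log.append('try')
        return log          # leaves the try block via 'return'
    except Exception:
        log.append('except')
    else:
        log.append('else')  # never reached: control did not fall off the end of 'try'

def with_finally():
    log = []
    try:
        log.append('try')
        return log
    finally:
        log.append('finally')  # runs even though the 'try' block returned
```

with_else() returns ['try'], while with_finally() returns ['try', 'finally'] -- the 'finally' clause fires on the way out of the function, the 'else' clause does not.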
The 'else' clause isn't executed when you 'break' or (after applying my continue-in-try patch ;) 'continue' out of the 'try', either. Robin... Did I already reply to this, on python-list or to you directly ? I distinctly remember writing that post, but I'm not sure if it arrived. Maybe I didn't send it after all, or maybe something on mail.python.org is detaining it ? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From thomas at xs4all.net Thu Dec 28 19:19:06 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 19:19:06 +0100 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001228122048.A1381@thyrsus.com>; from esr@thyrsus.com on Thu, Dec 28, 2000 at 12:20:48PM -0500 References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com> Message-ID: <20001228191906.F1281@xs4all.nl> On Thu, Dec 28, 2000 at 12:20:48PM -0500, Eric S. Raymond wrote: > My more general point is that right now Python has three classes of > modules: > 1. Documented as being in the core and built in by default. > 2. Not documented as being in the core and not built in by default. > 3. Documented as being in the core but not built in by default. > My more general claim is that the existence of class 3 is a problem, > because it compromises the "batteries are included" effect -- it means > Python users don't have a bright-line test for what will be present in > every Python (or at least every Python on an operating system > theoretically feature-compatible with theirs). It depends on your definition of 'being in the core'. Some of the things that are 'in the core' are simply not possible on all platforms. So if you want really portable code, you don't want to use them. Other features are available on all systems that matter [to you], so you don't really care about it, just use them, and at best document that they need feature X.
There is also the subtle difference between a Python user and a Python compiler/assembler (excuse my overloading of the terms, but you know what I mean). People who choose to compile their own Python should realize that they might disable or misconfigure some parts of it. I personally trust most people that assemble OS distributions to compile a proper Python binary + modules, but I think a HOWTO isn't a bad idea -- unless we autoconf everything. > I think we ought to be migrating stuff out > of class 3 into class 1 where possible and to class 2 only where > unavoidable. [ and ] > I'm willing to put my effort where my mouth is on this. I have a lot > of experience with autoconf; I'm willing to write some of these nasty > config tests. [ and ] > As I said to Guido, I think it is exactly our job to deal with this sort > of grottiness. One of Python's major selling points is supposed to be > cross-platform consistency of the API. If we fail to do what you're > describing, we're failing to meet Python users' reasonable expectations > for the language. [ and ] > Please note that I am specifically *not* advocating making the build defaults > Linux-centric. That's not my point at all. I apologize for the tone of my previous post, and the above snippet. I'm not trying to block progress here ;) I'm actually all for autodetecting as much as possible, and more than willing to put effort into it as well (as long as it's deemed useful, and isn't supplanted by a distutils variant weeks later.) And I personally have my doubts about the distutils variant, too, but that's partly because I have little experience with distutils. If we can work out a deal where both autoconf and distutils are an option, I'm happy to write a few, if not all, autoconf tests for the currently disabled modules. So, Eric, let's split the work. I'll do Tkinter if you do curses. :) However, I'm also keeping those oddball platforms that just don't support some features in mind. 
If you want truly portable code, you have to work at it. I think it's perfectly okay to say "your Python needs to have the curses module or the tkinter module compiled in -- contact your administrator if it has neither". There will still be platforms that don't have curses, or syslog, or crypt(), though hopefully none of them will be Linux. Oh, and I also apologize for possibly duplicating what has already been said by others. I haven't seen anything but this post (which was CC'd to me directly) since I posted my reply to Eric, due to the ululating bouts of delay on mail.python.org. Maybe DC should hire some *real* sysadmins, instead of those silly programmer-kniggits ? >:-> -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mwh21 at cam.ac.uk Thu Dec 28 19:27:48 2000 From: mwh21 at cam.ac.uk (Michael Hudson) Date: Thu, 28 Dec 2000 18:27:48 +0000 (GMT) Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B521D.4372224A@lemburg.com> Message-ID: On Thu, 28 Dec 2000, M.-A. Lemburg wrote: > I think Robin mixed up try...finally with try...except...else. I think so too. > The finally clause is executed even in case an exception occurred. > > He does have a point however that 'return' will bypass > try...else and try...finally clauses. I don't think we can change > that behaviour, though, as it would break code. return does not skip finally clauses[1]. In my not especially humble opinion, the current behaviour is the Right Thing. I'd have to think for a moment before saying what Robin's example would print, but I think the alternative would disturb me far more. Cheers, M. [1] In fact the flow of control on return is very similar to that of an exception - ooh, look at that implementation... From esr at thyrsus.com Thu Dec 28 20:17:51 2000 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Thu, 28 Dec 2000 14:17:51 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001228191906.F1281@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 07:19:06PM +0100 References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com> <20001228191906.F1281@xs4all.nl> Message-ID: <20001228141751.B2528@thyrsus.com> Thomas Wouters : > > My more general claim is that the existence of class 3 is a problem, > > because it compromises the "batteries are included" effect -- it means > > Python users don't have a bright-line test for what will be present in > > every Python (or at least every Python on an operating system > > theoretically feature-compatible with theirs). > > It depends on your definition of 'being in the core'. Some of the things > that are 'in the core' are simply not possible on all platforms. So if you > want really portable code, you don't want to use them. Other features are > available on all systems that matter [to you], so you don't really care > about it, just use them, and at best document that they need feature X. I understand. We can't, for example, guarantee to duplicate the Windows-specific stuff in the Unix port (nor would we want to in most cases :-)). However, I think "we build in curses/Tkinter everywhere the corresponding libraries exist" is a guarantee we can and should make. Similarly for other modules presently in class 3. > There is also the subtle difference between a Python user and a Python > compiler/assembler (excuse my overloading of the terms, but you know what I > mean). Yes. We have three categories here:

1. People who use Python for applications (what I've been calling users)
2. People who configure Python binary packages for distribution (what you call a "compiler/assembler" and I think of as a "builder").
3. People who hack Python itself.

Problem is that "developer" is very ambiguous in this context...
> People who choose to compile their own Python should realize that > they might disable or misconfigure some parts of it. I personally trust most > people that assemble OS distributions to compile a proper Python binary + > modules, but I think a HOWTO isn't a bad idea -- unless we autoconf > everything. I'd like to see both things happen (HOWTO and autoconfing) and am willing to work on both. > I apologize for the tone of my previous post, and the above snippet. No offense taken at all, I assure you. > I'm not > trying to block progress here ;) I'm actually all for autodetecting as much > as possible, and more than willing to put effort into it as well (as long as > it's deemed useful, and isn't supplanted by a distutils variant weeks > later.) And I personally have my doubts about the distutils variant, too, > but that's partly because I have little experience with distutils. If we can > work out a deal where both autoconf and distutils are an option, I'm happy > to write a few, if not all, autoconf tests for the currently disabled > modules. I admit I'm not very clear on the scope of what distutils is supposed to handle, and how. Perhaps amk can enlighten us? > So, Eric, let's split the work. I'll do Tkinter if you do curses. :) You've got a deal. I'll start looking at the autoconf code. I've already got a fair idea how to do this. -- Eric S. Raymond No one who's seen it in action can say the phrase "government help" without either laughing or crying. From tim.one at home.com Fri Dec 29 03:59:53 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 28 Dec 2000 21:59:53 -0500 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > I've seen a thread on this but there was no conclusive answer, so I'm > reopening this. 
It hasn't budged an inch since then: your "Entering interactive session" problem is the same one everyone has; it gets reported on SF's bug and/or support managers at least daily; SF has not fixed it yet; these days they don't even respond to scp bug reports anymore; the cause appears to be SF's custom sfshell, and only SF can change that; the only known workarounds are to (a) modify files on SF directly (they suggest vi ), or (b) initiate scp *from* SF, using your local machine as a server (if you can do that -- I cannot, or at least haven't succeeded). From martin at loewis.home.cs.tu-berlin.de Thu Dec 28 23:52:02 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Thu, 28 Dec 2000 23:52:02 +0100 Subject: [Python-Dev] curses in the core? Message-ID: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de> > If curses is a core facility now, the default build should treat it > as one. ... > IMO ssl isn't an issue because it's not documented as being in the > standard module set. ... > 3. Documented as being in the core but not built in by default. > My more general claim is that the existence of class 3 is a problem
From thomas at xs4all.net Thu Dec 28 23:58:25 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 23:58:25 +0100 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 12:08:18PM -0500 References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> Message-ID: <20001228235824.E1811@xs4all.nl> On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote: > I've seen a thread on this but there was no conclusive answer, so I'm > reopening this. Actually there was: it's all SourceForge's fault. (At least that's my professional opinion ;) They honestly have a strange setup, though how strange and to what end I cannot tell. > Would somebody please figure out a way to update the PEPs? It's kind > of pathetic to see the website not have the latest versions... The way to update the peps is by ssh'ing into shell.sourceforge.net, and then scp'ing the files from your work repository to the htdocs/peps directory. That is, until SF fixes the scp problem. This method works (I just updated all PEPs to up-to-date CVS versions) but it's a bit cumbersome. And it only works if you have ssh access to your work environment. And it's damned hard to script; I tried playing with a single ssh command that did all the work, but between shell weirdness, scp weirdness and a genuine bash bug I couldn't figure it out. I assume that SF is aware of the severity of this problem, and is working on something akin to a fix or workaround. Until then, I can do an occasional update of the PEPs, for those that can't themselves. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From thomas at xs4all.net Fri Dec 29 00:05:28 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 29 Dec 2000 00:05:28 +0100 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <20001228235824.E1811@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 11:58:25PM +0100 References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> <20001228235824.E1811@xs4all.nl> Message-ID: <20001229000528.F1811@xs4all.nl> On Thu, Dec 28, 2000 at 11:58:25PM +0100, Thomas Wouters wrote: > On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote: > > Would somebody please figure out a way to update the PEPs? It's kind > > of pathetic to see the website not have the latest versions... > > The way to update the peps is by ssh'ing into shell.sourceforge.net, and > then scp'ing the files from your work repository to the htdocs/peps [ blah blah ] And then they fixed it ! At least, for me, direct scp now works fine. (I should've tested that before posting my blah blah, sorry.) Anybody else, like people using F-secure ssh (unix or windows) experience the same ? -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From MarkH at ActiveState.com Fri Dec 29 00:15:01 2000 From: MarkH at ActiveState.com (Mark Hammond) Date: Fri, 29 Dec 2000 10:15:01 +1100 Subject: [Python-Dev] chomp()? In-Reply-To: <14923.31238.65155.496546@buffalo.fnal.gov> Message-ID: > -1 from me. P=NP (Python is not Perl). "Chomp" is an > excessively cute name. > And like you said, this is too much like "rstrip" to merit a separate > method. My thoughts exactly. I can't remember _ever_ wanting to chomp() when rstrip() wasnt perfectly suitable. I'm sure it happens, but not often enough to introduce an ambiguous new function purely for "feature parity" with Perl. Mark. From esr at thyrsus.com Fri Dec 29 00:25:28 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 18:25:28 -0500 Subject: [Python-Dev] Re: curses in the core? 
In-Reply-To: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Thu, Dec 28, 2000 at 11:52:02PM +0100 References: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de> Message-ID: <20001228182528.A10743@thyrsus.com> Martin v. Loewis : > In the case of curses, I believe there is a documentation error in the > 2.0 documentation. The curses packages is listed under "Generic > Operating System Services". I believe this is wrong, it should be listed > as "Unix Specific Services". I agree that this is an error and should be fixed. > Unless I'm mistaken, the curses module is not available on the Mac and > on Windows. With that change, the curses module would then fall into > Eric's category 2 (Not documented as being in the core and not built > in by default). Well...that's a definitional question that is part of the larger issue here. What does being in the Python core mean? There are two potential definitions: 1. Documentation says it's available on all platforms. 2. Documentation restricts it to one of the three platform groups (Unix/Windows/Mac) but implies that it will be available on any OS in that group. I think the second one is closer to what application programmers thinking about which batteries are included expect. But I could be persuaded otherwise by a good argument. -- Eric S. Raymond The difference between death and taxes is death doesn't get worse every time Congress meets -- Will Rogers From akuchlin at mems-exchange.org Fri Dec 29 01:33:36 2000 From: akuchlin at mems-exchange.org (A.M. Kuchling) Date: Thu, 28 Dec 2000 19:33:36 -0500 Subject: [Python-Dev] Bookstore completed Message-ID: <200012290033.TAA01295@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> OK, I think I'm ready to declare the Python bookstore complete enough to go public. Before I set up redirects from www.python.org, please take another look. (More book descriptions would be helpful...) 
http://www.amk.ca/bookstore/ --amk From akuchlin at mems-exchange.org Fri Dec 29 01:46:16 2000 From: akuchlin at mems-exchange.org (A.M. Kuchling) Date: Thu, 28 Dec 2000 19:46:16 -0500 Subject: [Python-Dev] Help wanted with setup.py script Message-ID: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> Want to help with the laudable goal of automating the Python build process? It'll need lots of testing on many different platforms, and I'd like to start the process now. First, download the setup.py script from http://www.amk.ca/files/python/setup.py Next, drop it in the root directory of your Python source tree and run "python setup.py build". If it dies with an exception, let me know. (Replies to this list are OK.) If it runs to completion, look in the Modules/build/lib. directory to see which modules got built. (On my system, is "linux-i686-2.0", but of course this will depend on your platform.) Is anything missing that should have been built? (_tkinter.so is the prime candidate; the autodetection code is far too simple at the moment and assumes one particular version of Tcl and Tk.) Did an attempt at building a module fail? These indicate problems autodetecting something, so if you can figure out how to find the required library or include file, let me know what to do. --amk From fdrake at acm.org Fri Dec 29 05:12:18 2000 From: fdrake at acm.org (Fred L. Drake) Date: Thu, 28 Dec 2000 23:12:18 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <20001228212115.C1811@xs4all.nl> Message-ID: On Thu, 28 Dec 2000 21:21:15 +0100, Thomas Wouters wrote: > The 'except' clause is documented to execute if a > matching exception occurs, > and 'else' if no exception occurs. Maybe the intent of > the 'else' clause This can certainly be clarified in the documentation -- please file a bug report at http://sourceforge.net/projects/python/. Thanks! -Fred -- Fred L. Drake, Jr. 
PythonLabs at Digital Creations From tim.one at home.com Fri Dec 29 05:25:44 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 28 Dec 2000 23:25:44 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <20001228212115.C1811@xs4all.nl> Message-ID: [Fred, suggested doc change near the end] [Thomas Wouters] > (MAL and I already discussed this in private mail: Robin did mean > try/except/else, and 'finally' already executes when returning > directly from the 'try' block, even in Python 1.5) > > This code: > > try: > return > except: > pass > else: > print "returning" > > will indeed not print 'returning', but I believe it's by design. > I'm against changing it, in any case, and not just because it'd > break code :) If you want something that always executes, use a > 'finally'. Or don't return from the 'try', but return in the > 'else' clause. Guido's out of town again, so I'll channel him: Thomas is correct on all counts. In try/else, the "else" clause should execute if and only if control "falls off the end" of the "try" block. IOW, consider: try: arbitrary stuff x = 1 An "else" clause added to that "try" should execute when and only when the code as written executes the "x = 1" after the block. When "arbitrary stuff" == "return", control does not fall off the end, so "else" shouldn't trigger. Same thing if "arbitrary stuff" == "break" and we're inside a loop, or "continue" after Thomas's patch gets accepted. > The 'except' clause is documented to execute if a matching > exception occurs, and 'else' if no exception occurs. Yup, and that's imprecise: the same words are used to describe (part of) when 'finally' executes, but they weren't intended to be the same. > Maybe the intent of the 'else' clause would be clearer if it > was documented to 'execute if the try: clause finishes without > an exception being raised' ? Sorry, I don't find that any clearer. 
Let's be explicit: The optional 'else' clause is executed when the 'try' clause terminates by any means other than an exception or executing a 'return', 'continue' or 'break' statement. Exceptions in the 'else' clause are not handled by the preceding 'except' clauses. > The 'else' clause isn't executed when you 'break' or (after > applying my continue-in-try patch ;) 'continue' out of the > 'try', either. Hey, now you're channeling me ! Be afraid -- be very afraid. From moshez at zadka.site.co.il Fri Dec 29 15:42:44 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Fri, 29 Dec 2000 16:42:44 +0200 (IST) Subject: [Python-Dev] chomp()? In-Reply-To: <3A4B7BB1.F09660ED@lemburg.com> References: <3A4B7BB1.F09660ED@lemburg.com>, <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <20001229144244.D5AD0A84F@darjeeling.zadka.site.co.il> On Thu, 28 Dec 2000, "M.-A. Lemburg" wrote: [about chomp] > We already have .splitlines() which does the above (remove > line breaks) not only for a single line, but for many lines at once. > > Even better: .splitlines() also does the right thing for Unicode. OK, I retract my earlier +1, and instead I move that this be added to the FAQ. Where is the FAQ maintained nowadays? The grail link doesn't work anymore. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From loewis at informatik.hu-berlin.de Fri Dec 29 17:52:13 2000 From: loewis at informatik.hu-berlin.de (Martin von Loewis) Date: Fri, 29 Dec 2000 17:52:13 +0100 (MET) Subject: [Python-Dev] Re: [Patch #103002] Fix for #116285: Properly raise UnicodeErrors Message-ID: <200012291652.RAA20251@pandora.informatik.hu-berlin.de> [resent since python.org ran out of disk space] > My only problem with it is your copyright notice. AFAIK, patches to > the Python core cannot contain copyright notices without proper > license information. OTOH, I don't think that these minor changes > really warrant adding a complete license paragraph. 
I'd like to get an "official" clarification on this question. Is it the case that patches containing copyright notices are only accepted if they are accompanied with license information? I agree that the changes are minor, I also believe that I hold the copyright to the changes whether I attach a notice or not (at least according to our local copyright law). What concerns me that without such a notice, gencodec.py looks as if CNRI holds the copyright to it. I'm not willing to assign the copyright of my changes to CNRI, and I'd like to avoid the impression of doing so. What is even more concerning is that CNRI also holds the copyright to the generated files, even though they are derived from information made available by the Unicode consortium! Regards, Martin From tim.one at home.com Fri Dec 29 20:56:36 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 29 Dec 2000 14:56:36 -0500 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <20001229000528.F1811@xs4all.nl> Message-ID: [Thomas Wouters] > And then they fixed it ! At least, for me, direct scp now works > fine. (I should've tested that before posting my blah blah, sorry.) I tried it immediately before posting my blah-blah yesterday, and it was still hanging. > Anybody else, like people using F-secure ssh (unix or windows) > experience the same ? Same here: I tried it again just now (under Win98 cmdline ssh/scp) and it worked fine! We're in business again. Thanks for fixing it, Thomas . now-if-only-we-could-get-python-dev-email-on-an-approximation-to-the- same-day-it's-sent-ly y'rs - tim From tim.one at home.com Fri Dec 29 21:27:40 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 29 Dec 2000 15:27:40 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: Message-ID: [Robin Becker] > The 2.0 docs clearly state 'The optional else clause is executed when no > exception occurs in the try clause.' This makes it sound as though it > gets executed on the 'way out'. Of course. 
That's not what the docs meant, though, and Guido is not going to change the implementation now because that would break code that relies on how Python has always *worked* in these cases. The way Python works is also the way Guido intended it to work (I'm allowed to channel him when he's on vacation <0.9 wink>). Indeed, that's why I suggested a specific doc change. If your friend would also be confused by that, then we still have a problem; else we don't. From tim.one at home.com Fri Dec 29 21:37:08 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 29 Dec 2000 15:37:08 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: Message-ID: [Fred] > This can certainly be clarified in the documentation -- > please file a bug report at http://sourceforge.net/projects/python/. Here you go: https://sourceforge.net/bugs/?func=detailbug&bug_id=127098&group_id=5470 From thomas at xs4all.net Fri Dec 29 21:59:16 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 29 Dec 2000 21:59:16 +0100 Subject: [Python-Dev] Fwd: try...else In-Reply-To: ; from tim.one@home.com on Fri, Dec 29, 2000 at 03:27:40PM -0500 References: Message-ID: <20001229215915.L1281@xs4all.nl> On Fri, Dec 29, 2000 at 03:27:40PM -0500, Tim Peters wrote: > Indeed, that's why I suggested a specific doc change. If your friend would > also be confused by that, then we still have a problem; else we don't. Note that I already uploaded a patch to fix the docs, assigned to fdrake, using Tim's wording exactly. (patch #103045) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From moshez at zadka.site.co.il Sun Dec 31 01:33:30 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Sun, 31 Dec 2000 02:33:30 +0200 (IST) Subject: [Python-Dev] FAQ Horribly Out Of Date Message-ID: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Hi! The current FAQ is horribly out of date.
I think the FAQ-Wizard method has proven itself not very efficient (for example, apparently no one noticed until now that it's not working <0.2 wink>). Is there any hope putting the FAQ in Misc/, having a script which scp's it to the SF page, and making that the official FAQ? On a related note, what is the current status of the PSA? Is it officially dead? -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tim.one at home.com Sat Dec 30 21:48:08 2000 From: tim.one at home.com (Tim Peters) Date: Sat, 30 Dec 2000 15:48:08 -0500 Subject: [Python-Dev] Most everything is busted Message-ID: Add this error to the pot: """ http://www.python.org/cgi-bin/moinmoin Proxy Error The proxy server received an invalid response from an upstream server. The proxy server could not handle the request GET /cgi-bin/moinmoin. Reason: Document contains no data ------------------------------------------------------------------- Apache/1.3.9 Server at www.python.org Port 80 """ Also, as far as I can tell: + news->mail for c.l.py hasn't delivered anything for well over 24 hours. + No mail to Python-Dev has showed up in the archives (let alone been delivered) since Fri, 29 Dec 2000 16:42:44 +0200 (IST). + The other Python mailing lists appear equally dead. time-for-a-new-year!-ly y'rs - tim From barry at wooz.org Sun Dec 31 02:06:23 2000 From: barry at wooz.org (Barry A. Warsaw) Date: Sat, 30 Dec 2000 20:06:23 -0500 Subject: [Python-Dev] Re: Most everything is busted References: Message-ID: <14926.34447.60988.553140@anthem.concentric.net> >>>>> "TP" == Tim Peters writes: TP> + news->mail for c.l.py hasn't delivered anything for well TP> over 24 hours. TP> + No mail to Python-Dev has showed up in the archives (let TP> alone been delivered) since Fri, 29 Dec 2000 16:42:44 +0200 TP> (IST). TP> + The other Python mailing lists appear equally dead. 
There's a stupid, stupid bug in Mailman 2.0, which I've just fixed and (hopefully) unjammed things on the Mailman end[1]. We're still probably subject to the Postfix delays unfortunately; I think those are DNS related, and I've gotten a few other reports of DNS oddities, which I've forwarded off to the DC sysadmins. I don't think that particular problem will be fixed until after the New Year. relax-and-enjoy-the-quiet-ly y'rs, -Barry [1] For those who care: there's a resource throttle in qrunner which limits the number of files any single qrunner process will handle. qrunner does a listdir() on the qfiles directory and ignores any .msg file it finds (it only does the bulk of the processing on the corresponding .db files). But it performs the throttle check on every file in listdir() so depending on the order that listdir() returns and the number of files in the qfiles directory, the throttle check might get triggered before any .db file is seen. Wedge city. This is serious enough to warrant a Mailman 2.0.1 release, probably mid-next week. From gstein at lyra.org Sun Dec 31 11:19:50 2000 From: gstein at lyra.org (Greg Stein) Date: Sun, 31 Dec 2000 02:19:50 -0800 Subject: [Python-Dev] FAQ Horribly Out Of Date In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200 References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Message-ID: <20001231021950.M28628@lyra.org> On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote: >... > On a related note, what is the current status of the PSA? Is it officially > dead? The PSA was always kind of a (legal) fiction with the basic intent to help provide some funding for Python development. Since that isn't occurring at CNRI any more, the PSA is a bit moot. There was always some idea that maybe the PSA would be the "sponsor" (and possibly the beneficiary) of the conferences. That wasn't ever really formalized either. 
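Stepping back to Barry's qrunner footnote above: the wedge boils down to a filtering-order bug, where the throttle counts every directory entry instead of only the files actually processed. A schematic sketch follows (not Mailman's real code; the names and the limit are invented, and the directory listing is modelled as a plain list so the behaviour is deterministic):

```python
THROTTLE = 5  # hypothetical per-run file limit, not Mailman's real value

def pick_db_files_buggy(entries, limit=THROTTLE):
    # Bug: every entry counts against the throttle, so a run of .msg
    # files can exhaust the limit before any .db file is even seen.
    picked, seen = [], 0
    for name in entries:
        if seen >= limit:
            break
        seen += 1
        if name.endswith(".db"):
            picked.append(name)
    return picked

def pick_db_files_fixed(entries, limit=THROTTLE):
    # Fix: only the files actually handled count against the throttle.
    picked = []
    for name in entries:
        if not name.endswith(".db"):
            continue  # skip the companion .msg files entirely
        picked.append(name)
        if len(picked) >= limit:
            break
    return picked

qfiles = ["%03d.msg" % i for i in range(10)] + ["123.db"]
print(pick_db_files_buggy(qfiles))   # [] -- wedged: throttle spent on .msg files
print(pick_db_files_fixed(qfiles))   # ['123.db']
```

Whether the throttle trips before any .db file depends on the order listdir() happens to return, which is why the wedge looked intermittent.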
From akuchlin at cnri.reston.va.us Sun Dec 31 16:58:12 2000 From: akuchlin at cnri.reston.va.us (Andrew Kuchling) Date: Sun, 31 Dec 2000 10:58:12 -0500 Subject: [Python-Dev] FAQ Horribly Out Of Date In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200 References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Message-ID: <20001231105812.A12168@newcnri.cnri.reston.va.us> On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote: >The current FAQ is horribly out of date. I think the FAQ-Wizard method >has proven itself not very efficient (for example, apparently no one >noticed until now that it's not working <0.2 wink>). Is there any It also leads to one section of the FAQ (#3, I think) having something like 60 questions jumbled together. IMHO the FAQ should be a text file, perhaps in the PEP format so it can be converted to HTML, and it should have an editor who'll arrange it into smaller sections. Any volunteers? (Must ... resist ... urge to volunteer myself... help me, Spock...) --amk From skip at mojam.com Sun Dec 31 20:25:18 2000 From: skip at mojam.com (Skip Montanaro) Date: Sun, 31 Dec 2000 13:25:18 -0600 (CST) Subject: [Python-Dev] plz test bsddb using shared linkage Message-ID: <14927.34846.153117.764547@beluga.mojam.com> A bug was filed on SF contending that the default linkage for bsddb should be shared instead of static because some Linux systems ship multiple versions of libdb. Would those of you who can and do build bsddb (probably only unixoids of some variety) please give this simple test a try? 
Uncomment the *shared* line in Modules/Setup.config.in, re-run configure, build Python and then try:

    import bsddb
    db = bsddb.btopen("/tmp/dbtest.db", "c")
    db["1"] = "1"
    print db["1"]
    db.close()
    del db

If this doesn't fail for anyone I'll check the change in and close the bug report, otherwise I'll add a(nother) comment to the bug report that *shared* breaks bsddb for others and close the bug report. Thx, Skip From skip at mojam.com Sun Dec 31 20:26:16 2000 From: skip at mojam.com (Skip Montanaro) Date: Sun, 31 Dec 2000 13:26:16 -0600 (CST) Subject: [Python-Dev] plz test bsddb using shared linkage Message-ID: <14927.34904.20832.319647@beluga.mojam.com> oops, forgot the bug report is at http://sourceforge.net/bugs/?func=detailbug&bug_id=126564&group_id=5470 for those of you who do not monitor python-bugs-list. S From tim.one at home.com Sun Dec 31 21:28:47 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 31 Dec 2000 15:28:47 -0500 Subject: [Python-Dev] FAQ Horribly Out Of Date In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Message-ID: [Moshe Zadka] > The current FAQ is horribly out of date. The password is Spam. Fix it <wink>. > I think the FAQ-Wizard method has proven itself not very > efficient (for example, apparently no one noticed until now > that it's not working <0.2 wink>). I'm afraid almost nothing on python.org with an active component works today (not searches, not the FAQ Wizard, not the 2.0 Wiki, ...). If history is any clue, these will remain broken until Guido gets back from vacation. > Is there any hope putting the FAQ in Misc/, having a script > which scp's it to the SF page, and making that the official FAQ? Would be OK by me. I'm more concerned that the whole of python.org has barely been updated since March; huge chunks of the FAQ are still relevant, but, e.g., the Job Board hasn't been touched in over 3 months; the News got so out of date Guido deleted the whole section; etc.
> On a related note, what is the current status of the PSA? Is it > officially dead? It appears that CNRI can only think about one thing at a time <0.5 wink>. For the last 6 months, that thing has been the license. If they ever resolve the GPL compatibility issue, maybe they can be persuaded to think about the PSA. In the meantime, I'd suggest you not renew . From tim.one at home.com Sun Dec 31 23:12:43 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 31 Dec 2000 17:12:43 -0500 Subject: [Python-Dev] plz test bsddb using shared linkage In-Reply-To: <14927.34846.153117.764547@beluga.mojam.com> Message-ID: [Skip Montanaro] > ... > Would those of you who can and do build bsddb (probably only > unixoids of some variety) please give this simple test a try? Just noting that bsddb already ships with the Windows installer as a (shared) DLL. But it's an old (1.85?) Windows port from Sam Rushing. From gward at mems-exchange.org Fri Dec 1 00:14:39 2000 From: gward at mems-exchange.org (Greg Ward) Date: Thu, 30 Nov 2000 18:14:39 -0500 Subject: [Python-Dev] PEP 229 and 222 In-Reply-To: <20001128215748.A22105@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Tue, Nov 28, 2000 at 09:57:48PM -0500 References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> Message-ID: <20001130181438.A21596@ludwig.cnri.reston.va.us> On 28 November 2000, Andrew Kuchling said: > On Tue, Nov 28, 2000 at 06:01:38PM -0500, Guido van Rossum wrote: > >- Always shared libs. What about Unixish systems that don't have > > shared libs? What if you just want something to be hardcoded as > > statically linked, e.g. for security reasons? (On the other hand > > Beats me. I'm not even sure if the Distutils offers a way to compile > a static Python binary. (GPW: well, does it?) 
It's in the CCompiler interface, but hasn't been exposed to the outside world. (IOW, it's mainly a question of designing the right setup script/command line interface: the implementation should be fairly straightforward, assuming the existing CCompiler classes do the right thing for generating binary executables.) Greg From gward at mems-exchange.org Fri Dec 1 00:19:38 2000 From: gward at mems-exchange.org (Greg Ward) Date: Thu, 30 Nov 2000 18:19:38 -0500 Subject: [Python-Dev] A house upon the sand In-Reply-To: ; from tim.one@home.com on Wed, Nov 29, 2000 at 01:23:10AM -0500 References: <200011281510.KAA03475@cj20424-a.reston1.va.home.com> Message-ID: <20001130181937.B21596@ludwig.cnri.reston.va.us> On 29 November 2000, Tim Peters said: > [Guido] > > ... > > Because of its importance, the deprecation time of the string module > > will be longer than that of most deprecated modules. I expect it > > won't be removed until Python 3000. > > I see nothing in the 2.0 docs, code, or "what's new" web pages saying that > it's deprecated. So I don't think you can even start the clock on this one > before 2.1 (a fuzzy stmt on the web page for the unused 1.6 release doesn't > count ...). FWIW, I would argue against *ever* removing (much less "deprecating", ie. threatening to remove) the string module. To a rough approximation, every piece of Python code in existence prior to Python 1.6 depends on the string module. I for one do not want to have to change all occurrences of string.foo(x) to x.foo() -- it just doesn't buy enough to make it worth changing all that code. Not only does the amount of code to change mean the change would be non-trivial, it's not always the right thing, especially if you happen to be one of the people who dislikes the "delim.join(list)" idiom. (I'm still undecided.)
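For readers who haven't made the switch, the migration Greg is talking about looks like this (a sketch; the old string-module spellings appear only in comments, since those functions were ultimately removed in Python 3):

```python
words = ["a", "house", "upon", "the", "sand"]

# Pre-1.6 spelling:  string.join(words, " ")
sentence = " ".join(words)
assert sentence == "a house upon the sand"

# Pre-1.6 spelling:  string.split(sentence)
assert sentence.split() == words

# Pre-1.6 spelling:  string.upper(sentence)
assert sentence.upper() == "A HOUSE UPON THE SAND"
```

The "delim.join(list)" spelling is the one Greg mentions people dislike: the delimiter, not the sequence, is the receiver of the method call.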
Greg From fredrik at effbot.org Fri Dec 1 07:39:57 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Fri, 1 Dec 2000 07:39:57 +0100 Subject: [Python-Dev] TypeError: foo, bar Message-ID: <008f01c05b61$877263b0$3c6340d5@hagrid> just stumbled upon yet another (high-profile) python newbie confused by a "TypeError: read-only character buffer, dictionary" message. how about changing "read-only character buffer" to "string or read-only character buffer", and the "foo, bar" format to "expected foo, found bar", so we get: "TypeError: expected string or read-only character buffer, found dictionary" From tim.one at home.com Fri Dec 1 07:58:53 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 1 Dec 2000 01:58:53 -0500 Subject: [Python-Dev] TypeError: foo, bar In-Reply-To: <008f01c05b61$877263b0$3c6340d5@hagrid> Message-ID: [Fredrik Lundh] > just stumbled upon yet another (high-profile) python newbie > confused by a "TypeError: read-only character buffer, dictionary" > message. > > how about changing "read-only character buffer" to "string > or read-only character buffer", and the "foo, bar" format to > "expected foo, found bar", so we get: > > "TypeError: expected string or read-only character > buffer, found dictionary"
> > It's in the CCompiler interface, but hasn't been exposed to the outside > world. (IOW, it's mainly a question of desiging the right setup > script/command line interface: the implementation should be fairly > straightforward, assuming the existing CCompiler classes do the right > thing for generating binary executables.) Distutils currently only supports build_*** commands for C-libraries and Python extensions. Shouldn't there also be build commands for shared libraries, executable programs and static Python binaries? Thomas BTW: Distutils-sig seems pretty dead these days... From ping at lfw.org Fri Dec 1 11:23:56 2000 From: ping at lfw.org (Ka-Ping Yee) Date: Fri, 1 Dec 2000 02:23:56 -0800 (PST) Subject: [Python-Dev] Cryptic error messages Message-ID: An attempt to use sockets for the first time yesterday left a friend of mine bewildered: >>> import socket >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) >>> s.connect('localhost:234') Traceback (most recent call last): File "", line 1, in ? TypeError: 2-sequence, 13-sequence >>> "What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked. I see in getargs.c (line 275) that this type of message is documented: /* Convert a tuple argument. [...] If the argument is invalid: [...] *msgbuf contains an error message, whose format is: ", ", where: is the name of the expected type, and is the name of the actual type, (so you can surround it by "expected ... found"), and msgbuf is returned. */ It's clear that the socketmodule is not prepending "expected" and appending "found", as the author of converttuple intended. But when i grepped through the source code, i couldn't find anyone applying this "expected %s found" % msgbuf convention outside of getargs.c. Is it really in use? Could we just change getargs.c so that converttuple() returns a message like "expected ..., got ..." instead of seterror()? Additionally it would be nice to say '13-character string' instead of '13-sequence'... 
-- ?!ng "All models are wrong; some models are useful." -- George Box From mwh21 at cam.ac.uk Fri Dec 1 12:20:23 2000 From: mwh21 at cam.ac.uk (Michael Hudson) Date: 01 Dec 2000 11:20:23 +0000 Subject: [Python-Dev] Cryptic error messages In-Reply-To: Ka-Ping Yee's message of "Fri, 1 Dec 2000 02:23:56 -0800 (PST)" References: Message-ID: Ka-Ping Yee writes: > An attempt to use sockets for the first time yesterday left a > friend of mine bewildered: > > >>> import socket > >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > >>> s.connect('localhost:234') > Traceback (most recent call last): > File "", line 1, in ? > TypeError: 2-sequence, 13-sequence > >>> > > "What the heck does '2-sequence, 13-sequence' mean?" he rightfully asked. > I'm not sure about the general case, but in this case you could do something like: http://sourceforge.net/patch/?func=detailpatch&patch_id=102599&group_id=5470 Now you get an error message like: TypeError: getsockaddrarg: AF_INET address must be tuple, not string Cheers, M. -- I have gathered a posie of other men's flowers, and nothing but the thread that binds them is my own. -- Montaigne From guido at python.org Fri Dec 1 14:02:02 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 01 Dec 2000 08:02:02 -0500 Subject: [Python-Dev] TypeError: foo, bar In-Reply-To: Your message of "Fri, 01 Dec 2000 07:39:57 +0100." <008f01c05b61$877263b0$3c6340d5@hagrid> References: <008f01c05b61$877263b0$3c6340d5@hagrid> Message-ID: <200012011302.IAA31609@cj20424-a.reston1.va.home.com> > just stumbled upon yet another (high-profile) python newbie > confused a "TypeError: read-only character buffer, dictionary" > message. > > how about changing "read-only character buffer" to "string > or read-only character buffer", and the "foo, bar" format to > "expected foo, found bar", so we get: > > "TypeError: expected string or read-only character > buffer, found dictionary" The first was easy, and I've done it. 
The second one, for some reason, is hard. I forget why. Sorry. --Guido van Rossum (home page: http://www.python.org/~guido/) From cgw at fnal.gov Fri Dec 1 14:41:04 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Fri, 1 Dec 2000 07:41:04 -0600 (CST) Subject: [Python-Dev] TypeError: foo, bar In-Reply-To: <008f01c05b61$877263b0$3c6340d5@hagrid> References: <008f01c05b61$877263b0$3c6340d5@hagrid> Message-ID: <14887.43632.812342.414156@buffalo.fnal.gov> Fredrik Lundh writes: > how about changing "read-only character buffer" to "string > or read-only character buffer", and the "foo, bar" format to > "expected foo, found bar", so we get: > > "TypeError: expected string or read-only character > buffer, found dictionary" +100. Recently, I've been teaching Python to some beginners and they find this message absolutely inscrutable. Also agree with Tim about "found" vs. "got", but this is of secondary importance. From moshez at math.huji.ac.il Fri Dec 1 15:26:03 2000 From: moshez at math.huji.ac.il (Moshe Zadka) Date: Fri, 1 Dec 2000 16:26:03 +0200 (IST) Subject: [Python-Dev] [OT] Change of Address Message-ID: I'm sorry to bother you all with this, but from time to time you might need to reach my by e-mail... 30 days from now, this e-mail address will no longer be valid. Please use anything at zadka.site.co.il to reach me. Thank you for your time. 
-- Moshe Zadka -- 95855124 http://advogato.org/person/moshez From gward at mems-exchange.org Fri Dec 1 16:14:53 2000 From: gward at mems-exchange.org (Greg Ward) Date: Fri, 1 Dec 2000 10:14:53 -0500 Subject: [Python-Dev] PEP 229 and 222 In-Reply-To: <014301c05b6e$269716a0$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Fri, Dec 01, 2000 at 09:10:21AM +0100 References: <200011282213.OAA31146@slayer.i.sourceforge.net> <20001128171735.A21996@kronos.cnri.reston.va.us> <200011282301.SAA03304@cj20424-a.reston1.va.home.com> <20001128215748.A22105@kronos.cnri.reston.va.us> <20001130181438.A21596@ludwig.cnri.reston.va.us> <014301c05b6e$269716a0$e000a8c0@thomasnotebook> Message-ID: <20001201101452.A26074@ludwig.cnri.reston.va.us> On 01 December 2000, Thomas Heller said: > Distutils currently only supports build_*** commands for > C-libraries and Python extensions. > > Shouldn't there also be build commands for shared libraries, > executable programs and static Python binaries? Andrew and I talked about this a bit yesterday, and the proposed interface is as follows: python setup.py build_ext --static will compile all extensions in the current module distribution, but instead of creating a .so (.pyd) file for each one, will create a new python binary in build/bin.. Issue to be resolved: what to call the new python binary, especially when installing it (presumably we *don't* want to clobber the stock binary, but supplement it with (eg.) "foopython"). Note that there is no provision for selectively building some extensions as shared. This means that Andrew's Distutil-ization of the standard library will have to override the build_ext command and have some extra way to select extensions for shared/static. Neither of us considered this a problem. > BTW: Distutils-sig seems pretty dead these days... Yeah, that's a combination of me playing on other things and python.net email being dead for over a week. 
I'll cc the sig on this and see if this interface proposal gets anyone's attention. Greg From jeremy at alum.mit.edu Fri Dec 1 20:27:14 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Dec 2000 14:27:14 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test Message-ID: <14887.64402.88530.714821@bitdiddle.concentric.net> There was recently some idle chatter in Guido's living room about using a unit testing framework (like PyUnit) for the Python regression test suite. We're also writing tests for some DC projects, and need to decide what framework to use. Does anyone have opinions on test frameworks? A quick web search turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver that allows implements xUnit-style unit tests. Are there other tools we should consider? Is anyone else interested in migrating the current test suite to a new framework? I hope the new framework will allow us to improve the test suite in a number of ways: - run an entire test suite to completion instead of stopping on the first failure - clearer reporting of what went wrong - better support for conditional tests, e.g. write a test for httplib that only runs if the network is up. This is tied into better error reporting, since the current test suite could only report that httplib succeeded or failed. Jeremy From fdrake at acm.org Fri Dec 1 20:24:46 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 1 Dec 2000 14:24:46 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net> References: <14887.64402.88530.714821@bitdiddle.concentric.net> Message-ID: <14887.64254.399477.935828@cj42289-a.reston1.va.home.com> Jeremy Hylton writes: > - better support for conditional tests, e.g. write a test for > httplib that only runs if the network is up. This is tied into > better error reporting, since the current test suite could only > report that httplib succeeded or failed. 
There is a TestSkipped exception that can be raised with an explanation of why. It's used in the largefile test (at least). I think it is documented in the README. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From akuchlin at mems-exchange.org Fri Dec 1 20:58:27 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Fri, 1 Dec 2000 14:58:27 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 02:27:14PM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> Message-ID: <20001201145827.D16751@kronos.cnri.reston.va.us> On Fri, Dec 01, 2000 at 02:27:14PM -0500, Jeremy Hylton wrote: >There was recently some idle chatter in Guido's living room about >using a unit testing framework (like PyUnit) for the Python regression >test suite. We're also writing tests for some DC projects, and need Someone remembered my post of 23 Nov, I see... The only other test framework I know of is the unittest.py inside Quixote, written because we thought PyUnit was kind of clunky. Greg Ward, who primarily wrote it, used more sneaky interpreter tricks to make the interface more natural, though it still worked with Jython last time we checked (some time ago, though). No GUI, but it can optionally show the code coverage of a test suite, too. See http://x63.deja.com/=usenet/getdoc.xp?AN=683946404 for some notes on using it. Obviously I think the Quixote unittest.py is the best choice for the stdlib. 
--amk From jeremy at alum.mit.edu Fri Dec 1 21:55:28 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Dec 2000 15:55:28 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us> References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> Message-ID: <14888.4160.838336.537708@bitdiddle.concentric.net> Is there any documentation for the Quixote unittest tool? The Example page is helpful, but it feels like there are some details that are not explained. Jeremy From akuchlin at mems-exchange.org Fri Dec 1 22:12:12 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Fri, 1 Dec 2000 16:12:12 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14888.4160.838336.537708@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 03:55:28PM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> Message-ID: <20001201161212.A12372@kronos.cnri.reston.va.us> On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote: >Is there any documentation for the Quixote unittest tool? The Example >page is helpful, but it feels like there are some details that are not >explained. I don't believe we've written docs at all for internal use. What details seem to be missing? 
--amk From jeremy at alum.mit.edu Fri Dec 1 22:21:27 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri, 1 Dec 2000 16:21:27 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <20001201161212.A12372@kronos.cnri.reston.va.us> References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> <20001201161212.A12372@kronos.cnri.reston.va.us> Message-ID: <14888.5719.844387.435471@bitdiddle.concentric.net> >>>>> "AMK" == Andrew Kuchling writes: AMK> On Fri, Dec 01, 2000 at 03:55:28PM -0500, Jeremy Hylton wrote: >> Is there any documentation for the Quixote unittest tool? The >> Example page is helpful, but it feels like there are some details >> that are not explained. AMK> I don't believe we've written docs at all for internal use. AMK> What details seem to be missing? Details: - I assume setup/shutdown are equivalent to setUp/tearDown - Is it possible to override constructor for TestScenario? - Is there something equivalent to PyUnit self.assert_ - What does parse_args() do? - What does run_scenarios() do? - If I have multiple scenarios, how do I get them to run? 
Jeremy From akuchlin at mems-exchange.org Fri Dec 1 22:34:30 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Fri, 1 Dec 2000 16:34:30 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14888.5719.844387.435471@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Fri, Dec 01, 2000 at 04:21:27PM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <14888.4160.838336.537708@bitdiddle.concentric.net> <20001201161212.A12372@kronos.cnri.reston.va.us> <14888.5719.844387.435471@bitdiddle.concentric.net> Message-ID: <20001201163430.A12417@kronos.cnri.reston.va.us> On Fri, Dec 01, 2000 at 04:21:27PM -0500, Jeremy Hylton wrote: > - I assume setup/shutdown are equivalent to setUp/tearDown Correct. > - Is it possible to override constructor for TestScenario? Beats me; I see no reason why you couldn't, though. > - Is there something equivalent to PyUnit self.assert_ Probably test_bool(), I guess: self.test_bool('self.run.is_draft()') asserts that self.run.is_draft() will return true. Or does self.assert_() do something more? > - What does parse_args() do? > - What does run_scenarios() do? > - If I have multiple scenarios, how do I get them to run? These 3 questions are all related, really. At the bottom of our test scripts, we have the following stereotyped code:

if __name__ == "__main__":
    (scenarios, options) = parse_args()
    run_scenarios (scenarios, options)

parse_args() ensures consistent arguments to test scripts; -c measures code coverage, -v is verbose, etc.
It also looks in the __main__ module and finds all subclasses of TestScenario, so you can do:

python test_process_run.py                             # Runs all N scenarios
python test_process_run.py ProcessRunTest              # Runs all cases for 1 scenario
python test_process_run.py ProcessRunTest:check_access # Runs one test case
                                                       # in one scenario class

--amk From tim.one at home.com Fri Dec 1 22:47:54 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 1 Dec 2000 16:47:54 -0500 Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <14887.64402.88530.714821@bitdiddle.concentric.net> Message-ID: [Jeremy Hylton] > There was recently some idle chatter in Guido's living room about > using a unit testing framework (like PyUnit) for the Python regression > test suite. We're also writing tests for some DC projects, and need > to decide what framework to use. > > Does anyone have opinions on test frameworks? A quick web search > turned up PyUnit (pyunit.sourceforge.net) and a script by Tres Seaver > that implements xUnit-style unit tests. Are there other tools > we should consider? My own doctest is loved by people other than just me, but is aimed at ensuring that examples in docstrings work exactly as shown (which is why it starts with "doc" instead of "test"). > Is anyone else interested in migrating the current test suite to a new > framework? Yes. > I hope the new framework will allow us to improve the test > suite in a number of ways: > > - run an entire test suite to completion instead of stopping on > the first failure doctest does that. > - clearer reporting of what went wrong Ditto. > - better support for conditional tests, e.g. write a test for > httplib that only runs if the network is up. This is tied into > better error reporting, since the current test suite could only > report that httplib succeeded or failed. A doctest test is simply an interactive Python session pasted into a docstring (or more than one session, and/or interspersed with prose).
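To make that concrete, a tiny invented module of this shape (the `gcd` function is a hypothetical example, not from this thread) carries its own test: the interactive session pasted into the docstring *is* the test case, and `doctest.testmod()` re-runs it.

```python
def gcd(a, b):
    """Greatest common divisor of two non-negative integers.

    The lines below are an ordinary captured interactive session;
    doctest re-executes them and compares the output.

    >>> gcd(12, 8)
    4
    >>> gcd(7, 3)
    1
    """
    while b:
        a, b = b, a % b
    return a

if __name__ == "__main__":
    import doctest
    doctest.testmod()   # silent if every docstring example still holds
```

Run directly, the module prints nothing on success and a per-example report on failure, which addresses Jeremy's run-to-completion and clear-reporting wishes above.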
If you can write an example in the interactive shell, doctest will verify it still works as advertised. This allows for embedding unit tests into the docs for each function, method and class. Nothing about them "looks like" an artificial test tacked on: the examples in the docs *are* the test cases. I need to try the other frameworks. I dare say doctest is ideal for computational functions, where the intended input->output relationship can be clearly explicated via examples. It's useless for GUIs. Usefulness varies accordingly between those extremes (doctest is natural exactly to the extent that a captured interactive session is helpful for documentation purposes). testing-ain't-easy-ly y'rs - tim From barry at digicool.com Sat Dec 2 04:52:29 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Fri, 1 Dec 2000 22:52:29 -0500 Subject: [Python-Dev] PEP 231, __findattr__() Message-ID: <14888.29181.355023.669030@anthem.concentric.net> I've just uploaded PEP 231, which describes a new hook in the instance access mechanism, called __findattr__() after a similar mechanism that exists in Jython (but is not exposed at the Python layer). You can do all kinds of interesting things with __findattr__(), including implement the __of__() protocol of ExtensionClass, and thus implicit and explicit acquisitions, in pure Python. You can also do Java Bean-like interfaces and C++-like access control. The PEP contains sample implementations of all of these, although the latter isn't as clean as I'd like, due to other restrictions in Python. My hope is that __findattr__() would eliminate most, if not all, the need for ExtensionClass, at least within the Zope and ZODB contexts. I haven't tried to implement Persistent using it though. Since it's a long PEP, I won't include it here. You can read about it at this URL http://python.sourceforge.net/peps/pep-0231.html It includes a link to the patch implementing this feature on SourceForge. 
Enjoy, -Barry From moshez at math.huji.ac.il Sat Dec 2 10:11:50 2000 From: moshez at math.huji.ac.il (Moshe Zadka) Date: Sat, 2 Dec 2000 11:11:50 +0200 (IST) Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: On Fri, 1 Dec 2000, Barry A. Warsaw wrote: > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). There's one thing that bothers me about this: what exactly is "the call stack"? Let me clarify: what happens when you have threads. Both machine-level threads and stackless threads confuse the issues here, not to mention stackless continuations. Can you add a few words to the PEP about dealing with those? From mal at lemburg.com Sat Dec 2 11:03:11 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Sat, 02 Dec 2000 11:03:11 +0100 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: <3A28C8DF.E430484F@lemburg.com> "Barry A. Warsaw" wrote: > > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). > > You can do all kinds of interesting things with __findattr__(), > including implement the __of__() protocol of ExtensionClass, and thus > implicit and explicit acquisitions, in pure Python. You can also do > Java Bean-like interfaces and C++-like access control. The PEP > contains sample implementations of all of these, although the latter > isn't as clean as I'd like, due to other restrictions in Python. > > My hope is that __findattr__() would eliminate most, if not all, the > need for ExtensionClass, at least within the Zope and ZODB contexts. > I haven't tried to implement Persistent using it though.
The PEP does define when and how __findattr__() is called, but makes no statement about what it should do or return... Here's a slightly different idea: Given the name, I would expect it to go look for an attribute and then return the attribute and its container (this doesn't seem to be what you have in mind here, though). An alternative approach given the semantics above would then be to first try a __getattr__() lookup and revert to __findattr__() in case this fails. I don't think there is any need to overload __setattr__() in such a way, because you cannot be sure which object actually gets the new attribute. By exposing the functionality using a new builtin, findattr(), this could be used for all the examples you give too. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From barry at digicool.com Sat Dec 2 17:50:02 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Sat, 2 Dec 2000 11:50:02 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> <3A28C8DF.E430484F@lemburg.com> Message-ID: <14889.10298.621133.961677@anthem.concentric.net> >>>>> "M" == M writes: M> The PEP does define when and how __findattr__() is called, M> but makes no statement about what it should do or return... Good point. I've clarified that in the PEP. M> Here's a slightly different idea: M> Given the name, I would expect it to go look for an attribute M> and then return the attribute and its container (this doesn't M> seem to be what you have in mind here, though). No, because some applications won't need a wrapped object. E.g. in the Java bean example, it just returns the attribute (which is stored with a slightly different name). 
M> An alternative approach given the semantics above would then be M> to first try a __getattr__() lookup and revert to M> __findattr__() in case this fails. I don't think this is as useful. What would that buy you that you can't already do today? The key concept here is that you want to give the class first crack to interpose on every attribute access. You want this hook to get called before anybody else can get at, or set, your attributes. That gives you (the class) total control to implement whatever policy is useful. M> I don't think there is any need to overload __setattr__() in M> such a way, because you cannot be sure which object actually M> gets the new attribute. M> By exposing the functionality using a new builtin, findattr(), M> this could be used for all the examples you give too. No, because then people couldn't use the object in the normal dot-notational way. -Barry From tismer at tismer.com Sat Dec 2 17:27:33 2000 From: tismer at tismer.com (Christian Tismer) Date: Sat, 02 Dec 2000 18:27:33 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: <3A2922F5.C2E0D10@tismer.com> Hi Barry, "Barry A. Warsaw" wrote: > > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). > > You can do all kinds of interesting things with __findattr__(), > including implement the __of__() protocol of ExtensionClass, and thus > implicit and explicit acquisitions, in pure Python. You can also do > Java Bean-like interfaces and C++-like access control. The PEP > contains sample implementations of all of these, although the latter > isn't as clean as I'd like, due to other restrictions in Python. > > My hope is that __findattr__() would eliminate most, if not all, the > need for ExtensionClass, at least within the Zope and ZODB contexts. 
> I haven't tried to implement Persistent using it though. I have been using ExtensionClass for quite a long time, and I have to say that you indeed eliminate most of its need through this super-elegant idea. Congratulations! Besides acquisition and persistency interception, wrapping plain C objects and giving them Class-like behavior while retaining fast access to internal properties but being able to override methods by Python methods was my other use of ExtensionClass. I assume this is the other "20%" part you mention, which is much harder to achieve? But that part also looks easier to implement now, by the support of the __findattr__ method. > Since it's a long PEP, I won't include it here. You can read about it > at this URL > > http://python.sourceforge.net/peps/pep-0231.html Great. I had to read it twice, but it was fun. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From tismer at tismer.com Sat Dec 2 17:55:21 2000 From: tismer at tismer.com (Christian Tismer) Date: Sat, 02 Dec 2000 18:55:21 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: Message-ID: <3A292979.60BB1731@tismer.com> Moshe Zadka wrote: > > On Fri, 1 Dec 2000, Barry A. Warsaw wrote: > > > I've just uploaded PEP 231, which describes a new hook in the instance > > access mechanism, called __findattr__() after a similar mechanism that > > exists in Jython (but is not exposed at the Python layer). > > There's one thing that bothers me about this: what exactly is "the > call stack"? Let me clarify: what happens when you have threads. > Both machine-level threads and stackless threads confuse the issues > here, not to mention stackless continuations.
> Can you add a few words to the PEP about dealing with those? As far as I understood the patch (just skimmed), there is no stack involved directly, but the instance increments and decrements a variable infindattr.

+ 	if (v != NULL && !inst->infindaddr &&
+ 	    (func = inst->in_class->cl_findattr))
+ 	{
+ 		PyObject *args, *res;
+ 		args = Py_BuildValue("(OOO)", inst, name, v);
+ 		if (args == NULL)
+ 			return -1;
+ 		++inst->infindaddr;
+ 		res = PyEval_CallObject(func, args);
+ 		--inst->infindaddr;

This is: The call modifies the instance's state, while calling the findattr method. You are right: I see a serious problem with this. It doesn't even need continuations to get things messed up. Guido's proposed coroutines, together with uThread-Switching, might be able to enter the same instance twice with ease. Barry, on second thought, I feel this can become a problem in the future. This infindattr attribute only works correctly if we are guaranteed to use strict stack order of execution. What you're *intending* to do is to tell the PyEval_CallObject that it should not find the __findattr__ attribute. But this should be done only for this call and all of its descendants, but no *fresh* access from elsewhere. The hard way to get out of this would be to stop scheduling in that case. Maybe this is very cheap, but quite inelegant. We have a quite peculiar system state here: A function call acts like an escape, to make all subsequent calls behave differently, until this call is finished. Without blocking microthreads, a clean way to do this would be a search up in the frame chain, if there is a running __findattr__ method of this object. Fairly expensive. Well, the problem also exists with real threads, if they are allowed to switch in such a context. I fear it is necessary to either block this stuff until it is ready, or to maintain some thread-wise structure for the state of this object. Ok, after thinking some more, I'll start an extra message to Barry on this topic.
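The per-instance flag Christian is dissecting can be emulated in present-day Python to see the behavior for yourself. This is a hypothetical sketch, not the patch's actual mechanism: it uses `__getattribute__` (which 2000-era Python lacked) and a Python-level boolean where the patch uses a C struct field, but the consequence is the same, namely that one flag guards the *whole* instance while the hook runs.

```python
class FindAttrBase:
    """Emulates the patch: a single per-instance flag blocks re-entry."""

    def __init__(self):
        object.__setattr__(self, "_in_findattr", False)

    def __getattribute__(self, name):
        get = object.__getattribute__
        # While the guard is set (or for private names), fall back to a
        # plain lookup -- this is what prevents infinite recursion.
        if name.startswith("_") or get(self, "_in_findattr"):
            return get(self, name)
        object.__setattr__(self, "_in_findattr", True)
        try:
            return get(self, "__findattr__")(name)
        finally:
            # Another thread or coroutine switching in before this line
            # would silently skip the hook: the exact problem discussed.
            object.__setattr__(self, "_in_findattr", False)


class Traced(FindAttrBase):
    def __init__(self):
        FindAttrBase.__init__(self)
        object.__setattr__(self, "x", 1)

    def __findattr__(self, name):
        # Inside the hook, attribute access does not recurse into it.
        return ("found", object.__getattribute__(self, name))
```

Here `Traced().x` goes through the hook once and returns `("found", 1)`, while the lookup the hook itself performs is served by the plain path.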
cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From tismer at tismer.com Sat Dec 2 18:21:18 2000 From: tismer at tismer.com (Christian Tismer) Date: Sat, 02 Dec 2000 19:21:18 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> Message-ID: <3A292F8D.7C616449@tismer.com> "Barry A. Warsaw" wrote: > > I've just uploaded PEP 231, which describes a new hook in the instance > access mechanism, called __findattr__() after a similar mechanism that > exists in Jython (but is not exposed at the Python layer). Ok, as I announced already, here are some thoughts on __findattr__, system state, and how it could work. Looking at your patch, I realize that you are blocking __findattr__ for your whole instance, until this call ends. This is not what you want to do, I guess. This ends up affecting the whole system state when threads are involved. Also you cannot use __findattr__ on any other attribute during this call. What you most probably want is this: __findattr__ should not be invoked again for this instance, with this attribute name, for this "thread", until you are done. The correct way to find out whether __findattr__ is active or not would be to look up the frame chain and inspect it. Moshe also asked about continuations: I think this would resolve quite fine. However we jump around, the current chain of frames dictates the semantics of __findattr__. It even applies to Guido's tamed coroutines, given that an explicit switch were allowed in the context of __findattr__. In a sense, we get some kind of dynamic context here, since we need to do a lookup for something in the dynamic call chain.
I guess this would be quite messy to implement, and inefficient. Isn't there a way to accomplish the desired effect without modifying the instance? In the context of __findattr__, *we* know that we don't want to get a recursive call. Let's assume __getattr__ and __setattr__ had yet another optional parameter: infindattr, defaulting to 0. We would then have to pass a positive value in this context, which would tell object.c not to try to invoke __findattr__ again. With explicit passing of state, no problems with threads can occur. Readability might improve as well. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From moshez at math.huji.ac.il Sun Dec 3 14:14:43 2000 From: moshez at math.huji.ac.il (Moshe Zadka) Date: Sun, 3 Dec 2000 15:14:43 +0200 (IST) Subject: [Python-Dev] Another Python Developer Missing Message-ID: Gordon McMillan is not a possible assignee in the assign_to field. -- Moshe Zadka -- 95855124 http://moshez.org From tim.one at home.com Sun Dec 3 18:35:36 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 3 Dec 2000 12:35:36 -0500 Subject: [Python-Dev] Another Python Developer Missing In-Reply-To: Message-ID: [Moshe Zadka] > Gordon McMillan is not a possible assignee in the assign_to field. We almost never add people as Python developers unless they ask for that, since it comes with responsibility as well as riches beyond the dreams of avarice. If Gordon would like to apply, we won't charge him any interest until 2001. From mal at lemburg.com Sun Dec 3 20:21:11 2000 From: mal at lemburg.com (M.-A.
Lemburg) Date: Sun, 03 Dec 2000 20:21:11 +0100 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib urllib.py,1.107,1.108 References: <200012031830.KAA30620@slayer.i.sourceforge.net> Message-ID: <3A2A9D27.AF43D665@lemburg.com> "Martin v. Löwis" wrote:
>
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv30506
>
> Modified Files:
> 	urllib.py
> Log Message:
> Convert Unicode strings to byte strings before passing them into specific
> protocols. Closes bug #119822.
>
> ...
> +
> + def toBytes(url):
> +     """toBytes(u"URL") --> 'URL'."""
> +     # Most URL schemes require ASCII. If that changes, the conversion
> +     # can be relaxed
> +     if type(url) is types.UnicodeType:
> +         try:
> +             url = url.encode("ASCII")

You should make this: 'ascii' -- encoding names are lower case per convention (and the implementation has a short-cut to speed up conversion to 'ascii' -- not for 'ASCII').

> +         except UnicodeError:
> +             raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters")

Would it be better to use a simple ValueError here ? (UnicodeError is a subclass of ValueError, but the error doesn't really have something to do with Unicode conversions...)
> + return url
>
> def unwrap(url):

-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tismer at tismer.com Sun Dec 3 21:01:07 2000 From: tismer at tismer.com (Christian Tismer) Date: Sun, 03 Dec 2000 22:01:07 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib filecmp.py,1.6,1.7 References: <200012032048.MAA10353@slayer.i.sourceforge.net> Message-ID: <3A2AA683.3840AA8A@tismer.com> Moshe Zadka wrote:
>
> Update of /cvsroot/python/python/dist/src/Lib
> In directory slayer.i.sourceforge.net:/tmp/cvs-serv9465
>
> Modified Files:
> 	filecmp.py
> Log Message:
> Call of _cmp had wrong number of parameters.
> Fixed definition of _cmp.
...
> !     return not abs(cmp(a, b, sh, st))
>       except os.error:
>           return 2

Ugh! Wouldn't that be a fine chance to rename the cmp function in this module? Overriding a built-in is really not nice to have in a library. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From moshez at math.huji.ac.il Sun Dec 3 22:01:07 2000 From: moshez at math.huji.ac.il (Moshe Zadka) Date: Sun, 3 Dec 2000 23:01:07 +0200 (IST) Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib filecmp.py,1.6,1.7 In-Reply-To: <3A2AA683.3840AA8A@tismer.com> Message-ID: On Sun, 3 Dec 2000, Christian Tismer wrote: > Ugh! Wouldn't that be a fine chance to rename the cmp > function in this module? Overriding a built-in > is really not nice to have in a library. The fine chance was when we moved cmp.py->filecmp.py. Now it would just break backwards compatibility.
-- Moshe Zadka -- 95855124 http://moshez.org From tismer at tismer.com Sun Dec 3 21:12:15 2000 From: tismer at tismer.com (Christian Tismer) Date: Sun, 03 Dec 2000 22:12:15 +0200 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Libfilecmp.py,1.6,1.7 References: Message-ID: <3A2AA91F.843E2BAE@tismer.com> Moshe Zadka wrote: > > On Sun, 3 Dec 2000, Christian Tismer wrote: > > > Ugh! Wouldn't that be a fine chance to rename the cmp > > function in this module? Overriding a built-in > > is really not nice to have in a library. > > The fine chance was when we moved cmp.py->filecmp.py. > Now it would just break backwards compatability. Yes, I see. cmp belongs to the module's interface. Maybe it could be renamed anyway, and be assigned to cmp at the very end of the file, but not using cmp anywhere in the code. My first reaction on reading the patch was "juck!" since I didn't know this module. python-dev/null - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From martin at loewis.home.cs.tu-berlin.de Sun Dec 3 22:56:44 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 3 Dec 2000 22:56:44 +0100 Subject: [Python-Dev] PEP 231, __findattr__() Message-ID: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> > Isn't there a way to accomplish the desired effect without modifying > the instance? In the context of __findattr__, *we* know that we > don't want to get a recursive call. Let's assume __getattr__ and > __setattr__ had yet another optional parameter: infindattr, > defaulting to 0. We would than have to pass a positive value in > this context, which would object.c tell to not try to invoke > __findattr__ again. Who is "we" here? 
The Python code implementing __findattr__? How would it pass a value to __setattr__? It doesn't call __setattr__, instead it has "self.__myfoo = x"... I agree that the current implementation is not thread-safe. To solve that, you'd need to associate with each instance not a single "infindattr" attribute, but a whole set of them - one per "thread of execution" (which would be a thread-id in most threading systems). Of course, that would need some cooperation from any thread scheme (including uthreads), which would need to provide an identification for a "calling context". Regards, Martin From martin at loewis.home.cs.tu-berlin.de Sun Dec 3 23:07:17 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 3 Dec 2000 23:07:17 +0100 Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108 Message-ID: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de> > You should make this: 'ascii' -- encoding names are lower case per > convention (and the implementation has a short-cut to speed up > conversion to 'ascii' -- not for 'ASCII'). With conventions, it is a difficult story. I'm pretty certain that users typically see that particular American standard as ASCII (to the extent of calling it "a s c two"), not ascii. As for speed - feel free to change the code if you think it matters. > + raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters") > Would it be better to use a simple ValueError here ? (UnicodeError > is a subclass of ValueError, but the error doesn't really have > something to do with Unicode conversions...) Why does it not have to do with Unicode conversion? A conversion from Unicode to ASCII was attempted, and failed. I guess I would be more open to suggested changes if you had put them into the patch manager at the time you've reviewed the patch...
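The subclass relationship this exchange turns on is easy to check directly (shown with present-day Python; the URL string is an invented example):

```python
# A failed ASCII conversion raises a UnicodeError, and because
# UnicodeError subclasses ValueError, "except ValueError" catches it too.
assert issubclass(UnicodeError, ValueError)

url = "http://example.com/caf\u00e9"    # contains a non-ASCII character
try:
    url.encode("ascii")
except ValueError as exc:                # the broader class suffices
    caught = exc

assert isinstance(caught, UnicodeError)  # still the more specific type
```

So callers who catch `ValueError` lose nothing, while callers who care specifically about encoding failures can still catch the narrower `UnicodeError`, which is the trade-off Martin and MAL are weighing.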
Regards, Martin From tismer at tismer.com Sun Dec 3 22:38:11 2000 From: tismer at tismer.com (Christian Tismer) Date: Sun, 03 Dec 2000 23:38:11 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> Message-ID: <3A2ABD43.AB56BD60@tismer.com> "Martin v. Loewis" wrote: > > > Isn't there a way to accomplish the desired effect without modifying > > the instance? In the context of __findattr__, *we* know that we > > don't want to get a recursive call. Let's assume __getattr__ and > > __setattr__ had yet another optional parameter: infindattr, > > defaulting to 0. We would than have to pass a positive value in > > this context, which would object.c tell to not try to invoke > > __findattr__ again. > > Who is "we" here? The Python code implementing __findattr__? How would > it pass a value to __setattr__? It doesn't call __setattr__, instead > it has "self.__myfoo = x"... Ouch - right! Sorry :) > I agree that the current implementation is not thread-safe. To solve > that, you'd need to associate with each instance not a single > "infindattr" attribute, but a whole set of them - one per "thread of > execution" (which would be a thread-id in most threading systems). Of > course, that would need some cooperation from the any thread scheme > (including uthreads), which would need to provide an identification > for a "calling context". Right, that is one possible way to do it. I also thought about some alternatives, but they all sound too complicated to justify them. Also I don't think this is only thread-related, since a mess can happen even with an explicit coroutine jump. Furthermore, how to deal with multiple attribute names? The function works incorrectly if __findattr__ tries to inspect another attribute. IMO, the state of the current interpreter changes here (or should do so), and this changed state needs to be carried down with all subsequent function calls.
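Martin's "one guard per thread of execution" can be sketched as a small helper. This is hypothetical, not PEP 231 code: `threading.get_ident` stands in for whatever "calling context" identification a microthread system would have to provide.

```python
import threading

class PerThreadGuard:
    """Tracks which threads are currently inside the hook, so only a
    recursive call from the *same* context is suppressed."""

    def __init__(self):
        self._active = set()            # ids of contexts inside the hook
        self._lock = threading.Lock()   # protects the set itself

    def enter(self):
        """True if this context may run the hook; marks it as running."""
        tid = threading.get_ident()
        with self._lock:
            if tid in self._active:
                return False            # re-entry from the same context
            self._active.add(tid)
            return True

    def leave(self):
        with self._lock:
            self._active.discard(threading.get_ident())
```

A hook implementation would call `enter()` before dispatching and `leave()` in a finally clause; independent threads no longer block each other. Christian's caveat remains: coroutines multiplexed onto one OS thread share one id and would still collide unless the scheduler supplies its own context id.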
confused - ly chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From mal at lemburg.com Sun Dec 3 23:51:10 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Sun, 03 Dec 2000 23:51:10 +0100 Subject: [Python-Dev] Re: CVS: python/dist/src/Lib urllib.py,1.107,1.108 References: <200012032207.XAA03394@loewis.home.cs.tu-berlin.de> Message-ID: <3A2ACE5E.A9F860A8@lemburg.com> "Martin v. Loewis" wrote: > > > You should make this: 'ascii' -- encoding names are lower case per > > convention (and the implementation has a short-cut to speed up > > conversion to 'ascii' -- not for 'ASCII'). > > With conventions, it is a difficult story. I'm pretty certain that > users typically see that particular american standard as ASCII (to the > extend of calling it "a s c two"), not ascii. It's a convention in the codec registry design and used as such in the Unicode implementation. > As for speed - feel free to change the code if you think it matters. Hey... this was just a suggestion. I thought that you didn't know of the internal short-cut and wanted to hint at it. > > + raise UnicodeError("URL "+repr(url)+" contains non-ASCII characters") > > > Would it be better to use a simple ValueError here ? (UnicodeError > > is a subclass of ValueError, but the error doesn't really have > > something to do with Unicode conversions...) > > Why does it not have to do with Unicode conversion? A conversion from > Unicode to ASCII was attempted, and failed. Sure, but the fact that URLs have to be ASCII is not something that is enforced by the Unicode implementation. > I guess I would be more open to suggested changes if you had put them > into the patch manager at the time you've reviewed the patch... 
I didn't review the patch, only the summary... Don't have much time to look into these things closely right now, so all I can do is comment. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From barry at scottb.demon.co.uk Mon Dec 4 01:55:32 2000 From: barry at scottb.demon.co.uk (Barry Scott) Date: Mon, 4 Dec 2000 00:55:32 -0000 Subject: [Python-Dev] A house upon the sand In-Reply-To: <20001130181937.B21596@ludwig.cnri.reston.va.us> Message-ID: <000201c05d8c$e7a15b10$060210ac@private> I fully support Greg Ward's view. If string was removed I'd not update the old code but add in my own string module. Given the effort you guys went to to keep the C extension protocol the same (in the context of crashing on importing a 1.5 dll into 2.0) I'm amazed you think that string could be removed... Could you split the lib into blessed and backward compatibility sections? Then by some suitable mechanism I can choose the compatibility I need? Oh and as for join obviously a method of a list... ['thats','better'].join(' ') Barry From fredrik at pythonware.com Mon Dec 4 11:37:18 2000 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 4 Dec 2000 11:37:18 +0100 Subject: [Python-Dev] unit testing and Python regression test References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> Message-ID: <00e701c05dde$2d77c240$0900a8c0@SPIFF> andrew kuchling wrote: > Someone remembered my post of 23 Nov, I see... The only other test > framework I know of is the unittest.py inside Quixote, written because > we thought PyUnit was kind of clunky. the pythonware teams agree -- we've been using an internal reimplementation of Kent Beck's original Smalltalk work, but we're switching to unittest.py. > Obviously I think the Quixote unittest.py is the best choice for the stdlib. +1 from here.
From mal at lemburg.com Mon Dec 4 12:14:20 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 04 Dec 2000 12:14:20 +0100 Subject: [Python-Dev] PEP 231, __findattr__() References: <14888.29181.355023.669030@anthem.concentric.net> <3A28C8DF.E430484F@lemburg.com> <14889.10298.621133.961677@anthem.concentric.net> Message-ID: <3A2B7C8C.D6B889EE@lemburg.com> "Barry A. Warsaw" wrote: > > >>>>> "M" == M writes: > > M> The PEP does define when and how __findattr__() is called, > M> but makes no statement about what it should do or return... > > Good point. I've clarified that in the PEP. > > M> Here's a slightly different idea: > > M> Given the name, I would expect it to go look for an attribute > M> and then return the attribute and its container (this doesn't > M> seem to be what you have in mind here, though). > > No, because some applications won't need a wrapped object. E.g. in > the Java bean example, it just returns the attribute (which is stored > with a slightly different name). I was thinking of a standardised helper which could then be used for all kinds of attribute retrieval techniques. Acquisition would be easy to do, access control too. In most cases __findattr__ would simply return (self, self.attrname). > M> An alternative approach given the semantics above would then be > M> to first try a __getattr__() lookup and revert to > M> __findattr__() in case this fails. > > I don't think this is as useful. What would that buy you that you > can't already do today? Forget that idea... *always* calling __findattr__ is the more useful way, just like you intended. > The key concept here is that you want to give the class first crack to > interpose on every attribute access. You want this hook to get called > before anybody else can get at, or set, your attributes. That gives > you (the class) total control to implement whatever policy is useful. Right. 
> M> I don't think there is any need to overload __setattr__() in > M> such a way, because you cannot be sure which object actually > M> gets the new attribute. > > M> By exposing the functionality using a new builtin, findattr(), > M> this could be used for all the examples you give too. > > No, because then people couldn't use the object in the normal > dot-notational way. Uhm, why not ? -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gvwilson at nevex.com Mon Dec 4 15:40:58 2000 From: gvwilson at nevex.com (Greg Wilson) Date: Mon, 4 Dec 2000 09:40:58 -0500 Subject: [Python-Dev] Q: Python standard library re-org plans/schedule? In-Reply-To: <20001201145827.D16751@kronos.cnri.reston.va.us> Message-ID: Hi, everyone. A potential customer has asked whether there are any plans to re-organize and rationalize the Python standard library. If there are any firm plans, and a schedule (however tentative), I'd be grateful for a pointer. Thanks, Greg From barry at digicool.com Mon Dec 4 16:13:23 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Mon, 4 Dec 2000 10:13:23 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> Message-ID: <14891.46227.785856.307437@anthem.concentric.net> >>>>> "MvL" == Martin v Loewis writes: MvL> I agree that the current implementation is not MvL> thread-safe. To solve that, you'd need to associate with each MvL> instance not a single "infindattr" attribute, but a whole set MvL> of them - one per "thread of execution" (which would be a MvL> thread-id in most threading systems). Of course, that would MvL> need some cooperation from any thread scheme (including MvL> uthreads), which would need to provide an identification for MvL> a "calling context". I'm still catching up on several hundred emails over the weekend.
I had a sneaking suspicion that infindattr wasn't thread-safe, so I'm convinced this is a bug in the implementation. One approach might be to store the info in the thread state object (isn't that how the recursive repr stop flag is stored?) That would also save having to allocate an extra int for every instance (yuck) but might impose a bit more of a performance overhead. I'll work more on this later today. -Barry From jeremy at alum.mit.edu Mon Dec 4 16:23:10 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon, 4 Dec 2000 10:23:10 -0500 (EST) Subject: [Python-Dev] unit testing and Python regression test In-Reply-To: <00e701c05dde$2d77c240$0900a8c0@SPIFF> References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <00e701c05dde$2d77c240$0900a8c0@SPIFF> Message-ID: <14891.46814.359333.76720@bitdiddle.concentric.net> >>>>> "FL" == Fredrik Lundh writes: FL> andrew kuchling wrote: >> Someone remembered my post of 23 Nov, I see... The only other >> test framework I know of is the unittest.py inside Quixote, >> written because we thought PyUnit was kind of clunky. FL> the pythonware team agrees -- we've been using an internal FL> reimplementation of Kent Beck's original Smalltalk work, but FL> we're switching to unittest.py. Can you provide any specifics about what you like about unittest.py (perhaps as opposed to PyUnit)? Jeremy From guido at python.org Mon Dec 4 16:20:11 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 04 Dec 2000 10:20:11 -0500 Subject: [Python-Dev] Q: Python standard library re-org plans/schedule? In-Reply-To: Your message of "Mon, 04 Dec 2000 09:40:58 EST." References: Message-ID: <200012041520.KAA20979@cj20424-a.reston1.va.home.com> > Hi, everyone. A potential customer has asked whether there are any > plans to re-organize and rationalize the Python standard library. > If there are any firm plans, and a schedule (however tentative), > I'd be grateful for a pointer.
Alas, none that I know of except the ineffable Python 3000 schedule. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Mon Dec 4 16:46:53 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Mon, 4 Dec 2000 10:46:53 -0500 Subject: [Python-Dev] Quixote unit testing docs (Was: unit testing) In-Reply-To: <14891.46814.359333.76720@bitdiddle.concentric.net>; from jeremy@alum.mit.edu on Mon, Dec 04, 2000 at 10:23:10AM -0500 References: <14887.64402.88530.714821@bitdiddle.concentric.net> <20001201145827.D16751@kronos.cnri.reston.va.us> <00e701c05dde$2d77c240$0900a8c0@SPIFF> <14891.46814.359333.76720@bitdiddle.concentric.net> Message-ID: <20001204104653.A19387@kronos.cnri.reston.va.us> Prodded by Jeremy, I went and actually wrote some documentation for the Quixote unittest.py; please see . The HTML is from a manually hacked Library Reference, so ignore the broken image links and other formatting goofiness. In case anyone needs it, the LaTeX is in /files/python/. The plain text version comes out to around 290 lines; I can post it to this list if that's desired. --amk From pf at artcom-gmbh.de Mon Dec 4 18:59:54 2000 From: pf at artcom-gmbh.de (Peter Funk) Date: Mon, 4 Dec 2000 18:59:54 +0100 (MET) Subject: Tim Peter's doctest compared to Quixote unit testing (was Re: [Python-Dev] Quixote unit testing docs) In-Reply-To: <20001204104653.A19387@kronos.cnri.reston.va.us> from Andrew Kuchling at "Dec 4, 2000 10:46:53 am" Message-ID: Hi all, Andrew Kuchling: > ... I ... actually wrote some documentation for > the Quixote unittest.py; please see > . [...] > comes out to around 290 lines; I can post it to this list if that's > desired. After reading Andrew's docs, I think Quixote basically offers three additional features if compared with Tim Peters' 'doctest':

1. integration of Skip Montanaro's code coverage analysis.

2.
the idea of Scenario objects, useful for sharing the setup needed to test related functions or methods of a class (same start condition).

3. Some useful functions to check whether the result returned by some test fulfills certain properties, without having to be as explicit as a cut-and-paste from an interactive interpreter session would have been.

As I've pointed out before in private mail to Jeremy, I've used Tim Peters' 'doctest.py' to accomplish all testing of Python apps in our company. In doctest each doc string is an independent unit, which starts fresh. Sometimes this leads to duplicated setup stuff, which is needed to test each method of a set of related methods from a class. This is distracting if you intend the test cases to take their double role of being at the same time useful documentation examples for the intended use of the provided API. Tim_one: Do you read this? What do you think about the idea to add something like the following two functions to 'doctest':

use_module_scenario() -- imports all objects created and preserved during execution of the module doc string examples.

use_class_scenario() -- imports all objects created and preserved during the execution of doc string examples of a class. Only allowed in doc string examples of methods.

This would make it easy to provide the same setup scenario to a group of related test cases. As far as I understand, doctest handles test-shutdown automatically, iff the doc string test examples leave no persistent resources behind. Regards, Peter From moshez at zadka.site.co.il Tue Dec 5 04:31:18 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Tue, 05 Dec 2000 05:31:18 +0200 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: Message from barry@digicool.com (Barry A. Warsaw) of "Mon, 04 Dec 2000 10:13:23 EST."
<14891.46227.785856.307437@anthem.concentric.net> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> Message-ID: <20001205033118.9135CA817@darjeeling.zadka.site.co.il> > I'm still catching up on several hundred emails over the weekend. I > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm > convinced this is a bug in the implementation. One approach might be > to store the info in the thread state object I don't think this is a good idea -- continuations and coroutines might mess it up. Maybe the right thing is to mess with the *compilation* of __findattr__ so that it would call __setattr__ and __getattr__ with special flags that stop them from calling __findattr__? This is ugly, but I can't think of a better way. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tismer at tismer.com Mon Dec 4 19:35:19 2000 From: tismer at tismer.com (Christian Tismer) Date: Mon, 04 Dec 2000 20:35:19 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> Message-ID: <3A2BE3E7.60A8E220@tismer.com> Moshe Zadka wrote: > > > I'm still catching up on several hundred emails over the weekend. I > > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm > > convinced this is a bug in the implementation. One approach might be > > to store the info in the thread state object > > I don't think this is a good idea -- continuations and coroutines might > mess it up. Maybe the right thing is to mess with the *compilation* of > __findattr__ so that it would call __setattr__ and __getattr__ with > special flags that stop them from calling __findattr__? This is > ugly, but I can't think of a better way. 
Yeah, this is what I tried to say by "different machine state"; compiling different behavior in the case of a special method is an interesting idea. It is limited somewhat, since the changed system state is not inherited by called functions. But if __findattr__ performs its one, single task in its body alone, we are fine. still-thinking-of-alternatives - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From tismer at tismer.com Mon Dec 4 19:52:43 2000 From: tismer at tismer.com (Christian Tismer) Date: Mon, 04 Dec 2000 20:52:43 +0200 Subject: [Python-Dev] A house upon the sand References: <000201c05d8c$e7a15b10$060210ac@private> Message-ID: <3A2BE7FB.831F2F93@tismer.com> Barry Scott wrote: > > I fully support Greg Ward's view. If string was removed I'd not > update the old code but add in my own string module. > > Given the effort you guys went to to keep the C extension protocol the > same (in the context of crashing on importing a 1.5 dll into 2.0) I'm > amazed you think that string could be removed... > > Could you split the lib into blessed and backward compatibility sections? > Then by some suitable mechanism I can choose the compatibility I need? > > Oh and as for join obviously a method of a list... > > ['thats','better'].join(' ') The above is the way it is defined in JavaScript. But in JavaScript, the list join method performs an implicit str() on the list elements. As has been discussed some time ago, Python's lists are too versatile to justify a string-centric method. Marc André pointed out that one could do a reduction with the semantics of the "+" operator, but Guido said that he wouldn't like to see [2, 3, 5].join(7) being reduced to 2+7+3+7+5 == 24.
That could only be avoided if there were a way to distinguish numeric addition from concatenation. but-I-could-live-with-it - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From barry at digicool.com Mon Dec 4 22:23:00 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Mon, 4 Dec 2000 16:23:00 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> Message-ID: <14892.2868.982013.313562@anthem.concentric.net> >>>>> "CT" == Christian Tismer writes: CT> You want most probably do this: __findattr__ should not be CT> invoked again for this instance, with this attribute name, for CT> this "thread", until you are done. First, I think the rule should be "__findattr__ should not be invoked again for this instance, in this thread, until you are done". I.e. once in __findattr__, you want all subsequent attribute references to bypass findattr, because presumably, your instance now has complete control for all accesses in this thread. You don't want to limit it to just the currently named attribute. Second, if "this thread" is defined as _PyThreadState_Current, then we have a simple solution, as I mapped out earlier. We do a PyThreadState_GetDict() and store the instance in that dict on entry to __findattr__ and remove it on exit from __findattr__. If the instance can be found in the current thread's dict, we bypass __findattr__. >>>>> "MZ" == Moshe Zadka writes: MZ> I don't think this is a good idea -- continuations and MZ> coroutines might mess it up. You might be right, but I'm not sure. 
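[Barry's PyThreadState_GetDict() bookkeeping above is C-level machinery, but the idea can be sketched in pure Python: keep a per-thread set of instance ids, add the instance on the way into the hook, and bypass the hook whenever the id is already present. Every name below is illustrative; this is not the proposed implementation.]

```python
import threading

# One "instances currently inside the hook" set per thread: the
# pure-Python analogue of stashing the instance in the thread state dict.
_active = threading.local()

class Guarded:
    """Illustrative class whose attribute hook bypasses itself on
    re-entrant access from the same thread."""

    def __getattr__(self, name):
        ids = getattr(_active, "ids", None)
        if ids is None:
            ids = _active.ids = set()
        if id(self) in ids:
            # Already inside the hook for this instance in this thread:
            # behave as if the hook were absent.
            raise AttributeError(name)
        ids.add(id(self))            # "enter": record in per-thread state
        try:
            if name == "combo":
                # This nested attribute access re-enters __getattr__,
                # is bypassed by the guard, and falls back to the default.
                return getattr(self, "inner", "<bypassed>")
            if name.endswith("_x"):
                return name.upper()  # the "policy" being enforced
            raise AttributeError(name)
        finally:
            ids.discard(id(self))    # "exit": remove on the way out

g = Guarded()
print(g.foo_x)   # -> FOO_X        (policy applies normally)
print(g.combo)   # -> <bypassed>   (nested access skipped the policy)
```

[The same bookkeeping done on the instance itself, as in the patch, is what makes the current implementation thread-unsafe; keying it on the thread removes that problem, at the cost Moshe raises for coroutines.]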
If we make __findattr__ thread safe according to the definition above, and if uthread/coroutine/continuation safety can be accomplished by the __findattr__ programmer's discipline, then I think that is enough. IOW, if we can tell the __findattr__ author to not relinquish the uthread explicitly during the __findattr__ call, we're cool. Oh, and as long as we're not somehow substantially reducing the utility of __findattr__ by making that restriction. What I worry about is re-entrancy that isn't under the programmer's control, like the Real Thread-safety problem. -Barry From barry at digicool.com Mon Dec 4 23:58:33 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Mon, 4 Dec 2000 17:58:33 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <3A2C0E0D.E042D026@tismer.com> Message-ID: <14892.8601.41178.81475@anthem.concentric.net> >>>>> "CT" == Christian Tismer writes: CT> Hmm. WHat do you think about Moshe's idea to change compiling CT> of the method? It has the nice advantage that there are no CT> Thread-safety problems by design. The only drawback is that CT> the contract of not-calling-myself only holds for this CT> function. I'm not sure I understand what Moshe was proposing. Moshe: are you saying that we should change the way the compiler works, so that it somehow recognizes this special case? I'm not sure I like that approach. I think I want something more runtime-y, but I'm not sure why (maybe just because I'm more comfortable mucking about in the run-time than in the compiler). -Barry From guido at python.org Tue Dec 5 00:16:17 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 04 Dec 2000 18:16:17 -0500 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: Your message of "Mon, 04 Dec 2000 16:23:00 EST." 
<14892.2868.982013.313562@anthem.concentric.net> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> Message-ID: <200012042316.SAA23081@cj20424-a.reston1.va.home.com> I'm unconvinced by the __findattr__ proposal as it now stands.

- Do you really think that JimF would do away with ExtensionClasses if __findattr__ was introduced? I kinda doubt it. See [*footnote]. It seems that *using* __findattr__ is expensive (even if *not* using is cheap :-).

- Why is deletion not supported? What if you want to enforce a policy on deletions too?

- It's ugly to use the same call for get and set. The examples indicate that it's not such a great idea: every example has *two* tests whether it's get or set. To share a policy, the proper thing to do is to write a method that either get or set can use.

- I think it would be sufficient to *only* use __findattr__ for getattr -- __setattr__ and __delattr__ already have full control. The "one routine to implement the policy" argument doesn't really hold, I think.

- The PEP says that the "in-findattr" flag is set on the instance. We've already determined that this is not thread-safe. This is not just a bug in the implementation -- it's a bug in the specification. I also find it ugly. But if we decide to do this, it can go in the thread-state -- if we ever add coroutines, we have to decide on what stuff to move from the thread state to the coroutine state anyway.

- It's also easy to conceive situations where recursive __findattr__ calls on the same instance in the same thread/coroutine are perfectly desirable -- e.g. when __findattr__ ends up calling a method that uses a lot of internal machinery of the class. You don't want all the machinery to have to be aware of the fact that it may be called with __findattr__ on the stack and without it.
So perhaps it may be better to only treat the body of __findattr__ itself special, as Moshe suggested. What does Jython do here?

- The code examples require a *lot* of effort to understand. These are complicated issues! (I rewrote the Bean example using __getattr__ and __setattr__ and found no need for __findattr__; the __getattr__ version is simpler and easier to understand. I'm still studying the other __findattr__ examples.)

- The PEP really isn't that long, except for the code examples. I recommend reading the patch first -- the patch is probably shorter than any specification of the feature can be.

--Guido van Rossum (home page: http://www.python.org/~guido/) [*footnote] There's an easy way (that few people seem to know) to cause __getattr__ to be called for virtually all attribute accesses: put *all* (user-visible) attributes in a separate dictionary. If you want to prevent access to this dictionary too (for Zope security enforcement), make it a global indexed by id() -- a destructor (__del__) can take care of deleting entries here. From martin at loewis.home.cs.tu-berlin.de Tue Dec 5 00:10:43 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 5 Dec 2000 00:10:43 +0100 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <14891.46227.785856.307437@anthem.concentric.net> (barry@digicool.com) References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> Message-ID: <200012042310.AAA00786@loewis.home.cs.tu-berlin.de> > I'm still catching up on several hundred emails over the weekend. I > had a sneaking suspicion that infindattr wasn't thread-safe, so I'm > convinced this is a bug in the implementation. One approach might be > to store the info in the thread state object (isn't that how the > recursive repr stop flag is stored?) Whether this works depends on how exactly the info is stored.
A single flag won't be sufficient, since multiple objects may have __findattr__ in progress in a given thread. With a set of instances, it would work, though. Regards, Martin From martin at loewis.home.cs.tu-berlin.de Tue Dec 5 00:13:15 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 5 Dec 2000 00:13:15 +0100 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <20001205033118.9135CA817@darjeeling.zadka.site.co.il> (message from Moshe Zadka on Tue, 05 Dec 2000 05:31:18 +0200) References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> Message-ID: <200012042313.AAA00832@loewis.home.cs.tu-berlin.de> > I don't think this is a good idea -- continuations and coroutines > might mess it up. If coroutines and continuations operate preemptively, then they should present themselves as an implementation of the thread API; perhaps the thread API needs to be extended to allow for such a feature. If yielding control is in the hands of the implementation, it would be easy to rule out a context switch while findattr is in progress. Regards, Martin From martin at loewis.home.cs.tu-berlin.de Tue Dec 5 00:19:37 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 5 Dec 2000 00:19:37 +0100 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: <14892.8601.41178.81475@anthem.concentric.net> (barry@digicool.com) References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <3A2C0E0D.E042D026@tismer.com> <14892.8601.41178.81475@anthem.concentric.net> Message-ID: <200012042319.AAA00877@loewis.home.cs.tu-berlin.de> > I'm not sure I understand what Moshe was proposing.
Moshe: are you > saying that we should change the way the compiler works, so that it > somehow recognizes this special case? I'm not sure I like that > approach. I think I want something more runtime-y, but I'm not sure > why (maybe just because I'm more comfortable mucking about in the > run-time than in the compiler). I guess you are also uncomfortable with the problem that the compile-time analysis cannot "see" through levels of indirection. E.g. if findattr was implemented as

    return self.compute_attribute(real_attribute)

then compile-time analysis could figure out to call compute_attribute directly. However, that method may be implemented as

    def compute_attribute(self, name):
        return self.mapping[name]

where the access to mapping could not be detected statically. Regards, Martin From tismer at tismer.com Mon Dec 4 22:35:09 2000 From: tismer at tismer.com (Christian Tismer) Date: Mon, 04 Dec 2000 23:35:09 +0200 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> Message-ID: <3A2C0E0D.E042D026@tismer.com> "Barry A. Warsaw" wrote: > > >>>>> "CT" == Christian Tismer writes: > > CT> You want most probably do this: __findattr__ should not be > CT> invoked again for this instance, with this attribute name, for > CT> this "thread", until you are done. > > First, I think the rule should be "__findattr__ should not be invoked > again for this instance, in this thread, until you are done". Maybe this is better. Surely easier. :) [ThreadState solution - well fine so far] > MZ> I don't think this is a good idea -- continuations and > MZ> coroutines might mess it up. > > You might be right, but I'm not sure.
> > If we make __findattr__ thread safe according to the definition above, > and if uthread/coroutine/continuation safety can be accomplished by > the __findattr__ programmer's discipline, then I think that is enough. > IOW, if we can tell the __findattr__ author to not relinquish the > uthread explicitly during the __findattr__ call, we're cool. Oh, and > as long as we're not somehow substantially reducing the utility of > __findattr__ by making that restriction. > > What I worry about is re-entrancy that isn't under the programmer's > control, like the Real Thread-safety problem. Hmm. What do you think about Moshe's idea to change compiling of the method? It has the nice advantage that there are no Thread-safety problems by design. The only drawback is that the contract of not-calling-myself only holds for this function. I don't know how Threadstates scale up when there are more things like these invented. Well, for the moment, the simple solution with Stackless would just be to let the interpreter recurse in this call, the same as it happens during __init__ and anything else that isn't easily turned into tail-recursion. It just blocks :-) ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From barry at digicool.com Tue Dec 5 03:54:23 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Mon, 4 Dec 2000 21:54:23 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> Message-ID: <14892.22751.921264.156010@anthem.concentric.net> >>>>> "GvR" == Guido van Rossum writes: GvR> - Do you really think that JimF would do away with GvR> ExtensionClasses if __findattr__ was introduced? I kinda GvR> doubt it. See [*footnote]. It seems that *using* GvR> __findattr__ is expensive (even if *not* using is cheap :-). That's not even the real reason why JimF wouldn't stop using ExtensionClass. He's already got too much code invested in EC. However EC can be a big pill to swallow for some applications because it's a C extension (and because it has some surprising non-Pythonic side effects). In those situations, a pure Python approach, even though slower, is useful. GvR> - Why is deletion not supported? What if you want to enforce GvR> a policy on deletions too? It could be, without much work. GvR> - It's ugly to use the same call for get and set. The GvR> examples indicate that it's not such a great idea: every GvR> example has *two* tests whether it's get or set. To share a GvR> policy, the proper thing to do is to write a method that GvR> either get or set can use. I don't have strong feelings either way. GvR> - I think it would be sufficient to *only* use __findattr__ GvR> for getattr -- __setattr__ and __delattr__ already have full GvR> control. The "one routine to implement the policy" argument GvR> doesn't really hold, I think. What about the ability to use "normal" x.name attribute access syntax inside the hook? Let me guess your answer. :) GvR> - The PEP says that the "in-findattr" flag is set on the GvR> instance. We've already determined that this is not GvR> thread-safe.
This is not just a bug in the implementation -- GvR> it's a bug in the specification. I also find it ugly. But GvR> if we decide to do this, it can go in the thread-state -- if GvR> we ever add coroutines, we have to decide on what stuff to GvR> move from the thread state to the coroutine state anyway. Right. That's where we've ended up in subsequent messages on this thread. GvR> - It's also easy to conceive situations where recursive GvR> __findattr__ calls on the same instance in the same GvR> thread/coroutine are perfectly desirable -- e.g. when GvR> __findattr__ ends up calling a method that uses a lot of GvR> internal machinery of the class. You don't want all the GvR> machinery to have to be aware of the fact that it may be GvR> called with __findattr__ on the stack and without it. Hmm, okay, I don't really understand your example. I suppose I'm envisioning __findattr__ as a way to provide an interface to clients of the class. Maybe it's a bean interface, maybe it's an acquisition interface or an access control interface. The internal machinery has to know something about how that interface is implemented, so whether __findattr__ is recursive or not doesn't seem to enter into it. And also, allowing __findattr__ to be recursive will just impose different constraints on the internal machinery methods, just like __setattr__ currently does. I.e. you better know that you're in __setattr__ and not do self.name type things, or you'll recurse forever. GvR> So perhaps it may be better to only treat the body of GvR> __findattr__ itself special, as Moshe suggested. Maybe I'm being dense, but I'm not sure exactly what this means, or how you would do this. GvR> What does Jython do here? It's not exactly equivalent, because Jython's __findattr__ can't call back into Python. GvR> - The code examples require a *lot* of effort to understand. GvR> These are complicated issues! 
(I rewrote the Bean example GvR> using __getattr__ and __setattr__ and found no need for GvR> __findattr__; the __getattr__ version is simpler and easier GvR> to understand. I'm still studying the other __findattr__ GvR> examples.) Is it simpler because you separated out the set and get behavior? If __findattr__ only did getting, I think it would be a lot simpler too (but I'd still be interested in seeing your __getattr__-only example). The acquisition examples are complicated because I wanted to support the same interface that EC's acquisition classes support. All that detail isn't necessary for example code. GvR> - The PEP really isn't that long, except for the code GvR> examples. I recommend reading the patch first -- the patch GvR> is probably shorter than any specification of the feature can GvR> be. Would it be more helpful to remove the examples? If so, where would you put them? It's certainly useful to have examples someplace I think. GvR> There's an easy way (that few people seem to know) to cause GvR> __getattr__ to be called for virtually all attribute GvR> accesses: put *all* (user-visible) attributes in a separate GvR> dictionary. If you want to prevent access to this dictionary GvR> too (for Zope security enforcement), make it a global indexed GvR> by id() -- a destructor (__del__) can take care of deleting GvR> entries here. Presumably that'd be a module global, right? Maybe within Zope that could be protected, but outside of that, that global's always going to be accessible. So are methods, even if given private names. And I don't think that such code would be any more readable since instead of self.name you'd see stuff like

    def __getattr__(self, name):
        global instdict
        mydict = instdict[id(self)]
        obj = mydict[name]
        ...

    def __setattr__(self, name, val):
        global instdict
        mydict = instdict[id(self)]
        mydict[name] = val
        ...

and that /might/ be a problem with Jython currently, because id()'s may be reused.
And relying on __del__ may have unfortunate side effects when viewed in conjunction with garbage collection. You're probably still unconvinced, but are you dead-set against it? I can try implementing __findattr__() as a pre-__getattr__ hook only. Then we can live with the current __setattr__() restrictions and see what the examples look like in that situation. -Barry From guido at python.org Tue Dec 5 13:54:20 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 05 Dec 2000 07:54:20 -0500 Subject: [Python-Dev] PEP 231, __findattr__() In-Reply-To: Your message of "Mon, 04 Dec 2000 21:54:23 EST." <14892.22751.921264.156010@anthem.concentric.net> References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> <14892.22751.921264.156010@anthem.concentric.net> Message-ID: <200012051254.HAA25502@cj20424-a.reston1.va.home.com> > >>>>> "GvR" == Guido van Rossum writes: > > GvR> - Do you really think that JimF would do away with > GvR> ExtensionClasses if __findattr__ was introduced? I kinda > GvR> doubt it. See [*footnote]. It seems that *using* > GvR> __findattr__ is expensive (even if *not* using is cheap :-). > > That's not even the real reason why JimF wouldn't stop using > ExtensionClass. He's already got too much code invested in EC. > However EC can be a big pill to swallow for some applications because > it's a C extension (and because it has some surprising non-Pythonic > side effects). In those situations, a pure Python approach, even > though slower, is useful. Agreed. But I'm still hoping to find the silver bullet that lets Jim (and everybody else) do what ExtensionClass does without needing another extension. > GvR> - Why is deletion not supported? What if you want to enforce > GvR> a policy on deletions too? > > It could be, without much work.
Then it should be -- except I prefer to do only getattr anyway, see below. > GvR> - It's ugly to use the same call for get and set. The > GvR> examples indicate that it's not such a great idea: every > GvR> example has *two* tests whether it's get or set. To share a > GvR> policy, the proper thing to do is to write a method that > GvR> either get or set can use. > > I don't have strong feelings either way. What does Jython do? I thought it only did set (hence the name :-). I think there's no *need* for findattr to catch the setattr operation, because __setattr__ *already* gets invoked on each set not just ones where the attr doesn't yet exist. > GvR> - I think it would be sufficient to *only* use __findattr__ > GvR> for getattr -- __setattr__ and __delattr__ already have full > GvR> control. The "one routine to implement the policy" argument > GvR> doesn't really hold, I think. > > What about the ability to use "normal" x.name attribute access syntax > inside the hook? Let me guess your answer. :) Aha! You got me there. Clearly the REAL reason for wanting __findattr__ is the no-recursive-calls rule -- which is also the most uncooked feature... Traditional getattr hooks don't need this as much because they don't get called when the attribute already exists; traditional setattr hooks deal with it by switching on the attribute name. The no-recursive-calls rule certainly SEEMS an attractive way around this. But I'm not sure that it really is... I need to get my head around this more. (The only reason I'm still posting this reply is to test the new mailing lists setup via mail.python.org.) > GvR> - The PEP says that the "in-findattr" flag is set on the > GvR> instance. We've already determined that this is not > GvR> thread-safe. This is not just a bug in the implementation -- > GvR> it's a bug in the specification. I also find it ugly. 
> GvR> But if we decide to do this, it can go in the thread-state --
> GvR> if we ever add coroutines, we have to decide on what stuff to
> GvR> move from the thread state to the coroutine state anyway.
>
> Right.  That's where we've ended up in subsequent messages on this
> thread.
>
> GvR> - It's also easy to conceive situations where recursive
> GvR> __findattr__ calls on the same instance in the same
> GvR> thread/coroutine are perfectly desirable -- e.g. when
> GvR> __findattr__ ends up calling a method that uses a lot of
> GvR> internal machinery of the class.  You don't want all the
> GvR> machinery to have to be aware of the fact that it may be
> GvR> called with __findattr__ on the stack and without it.
>
> Hmm, okay, I don't really understand your example.  I suppose I'm
> envisioning __findattr__ as a way to provide an interface to clients
> of the class.  Maybe it's a bean interface, maybe it's an acquisition
> interface or an access control interface.  The internal machinery has
> to know something about how that interface is implemented, so whether
> __findattr__ is recursive or not doesn't seem to enter into it.

But the class is also a client of itself, and not all cases where it
is a client of itself are inside a findattr call.  Take your bean
example.  Suppose your bean class also has a spam() method.  The
findattr code needs to account for this, e.g.:

    def __findattr__(self, name, *args):
        if name == "spam" and not args:
            return self.spam
        ...original body here...

Or you have to add a _get_spam() method:

    def _get_spam(self):
        return self.spam

Either solution gets tedious if there are a lot of methods; instead,
findattr could check if the attr is defined on the class, and then
return that:

    def __findattr__(self, name, *args):
        if not args and name[0] != '_' and hasattr(self.__class__, name):
            return getattr(self, name)
        ...original body here...

Anyway, let's go back to the spam method.  Suppose it references
self.foo.  The findattr machinery will access it.  Fine.
But now consider another attribute (bar) with _set_bar() and _get_bar() methods that do a little more. Maybe bar is really calculated from the value of self.foo. Then _get_bar cannot use self.foo (because it's inside findattr so findattr won't resolve it, and self.foo doesn't actually exist on the instance) so it has to use self.__myfoo. Fine -- after all this is inside a _get_* handler, which knows it's being called from findattr. But what if, instead of needing self.foo, _get_bar wants to call self.spam() in order? Then self.spam() is being called from inside findattr, so when it access self.foo, findattr isn't used -- and it fails with an AttributeError! Sorry for the long detour, but *that's* the problem I was referring to. I think the scenario is quite realistic. > And also, allowing __findattr__ to be recursive will just impose > different constraints on the internal machinery methods, just like > __setattr__ currently does. I.e. you better know that you're in > __setattr__ and not do self.name type things, or you'll recurse > forever. Actually, this is usually solved by having __setattr__ check for specific names only, and for others do self.__dict__[name] = value; that way, recursive __setattr__ calls are okay. Similar for __getattr__ (which has to raise AttributeError for unrecognized names). > GvR> So perhaps it may be better to only treat the body of > GvR> __findattr__ itself special, as Moshe suggested. > > Maybe I'm being dense, but I'm not sure exactly what this means, or > how you would do this. Read Moshe's messages (and Martin's replies) again. I don't care that much for it so I won't explain it again. > GvR> What does Jython do here? > > It's not exactly equivalent, because Jython's __findattr__ can't call > back into Python. I'd say that Jython's __findattr__ is an entirely different beast than what we have here. 
Its main purpose in life appears to be a getattr equivalent that
returns NULL instead of raising an exception when the attribute isn't
found -- which is reasonable because from within Java, testing for
null is much cheaper than checking for an exception, and you often
need to look whether a given attribute exists and do some default
action if not.  (In fact, I'd say that CPython could also use a
findattr of this kind...)

This is really too bad.  Based on the name similarity and things I
thought you'd said in private before, I thought that they would be
similar.  Then the experience with Jython would be a good argument for
adding a findattr hook to CPython.  But now that they are totally
different beasts it doesn't help at all.

> GvR> - The code examples require a *lot* of effort to understand.
> GvR> These are complicated issues!  (I rewrote the Bean example
> GvR> using __getattr__ and __setattr__ and found no need for
> GvR> __findattr__; the __getattr__ version is simpler and easier
> GvR> to understand.  I'm still studying the other __findattr__
> GvR> examples.)
>
> Is it simpler because you separated out the set and get behavior?  If
> __findattr__ only did getting, I think it would be a lot simpler too
> (but I'd still be interested in seeing your __getattr__-only
> example).

Here's my getattr example.  It's more lines of code, but cleaner IMHO:

    class Bean:
        def __init__(self, x):
            self.__myfoo = x

        def __isprivate(self, name):
            return name.startswith('_')

        def __getattr__(self, name):
            if self.__isprivate(name):
                raise AttributeError, name
            return getattr(self, "_get_" + name)()

        def __setattr__(self, name, value):
            if self.__isprivate(name):
                self.__dict__[name] = value
            else:
                return getattr(self, "_set_" + name)(value)

        def _set_foo(self, x):
            self.__myfoo = x

        def _get_foo(self):
            return self.__myfoo

    b = Bean(3)
    print b.foo
    b.foo = 9
    print b.foo

> The acquisition examples are complicated because I wanted
> to support the same interface that EC's acquisition classes support.
> All that detail isn't necessary for example code.

I *still* have to study the examples... :-(  Will do next.

> GvR> - The PEP really isn't that long, except for the code
> GvR> examples.  I recommend reading the patch first -- the patch
> GvR> is probably shorter than any specification of the feature can
> GvR> be.
>
> Would it be more helpful to remove the examples?  If so, where would
> you put them?  It's certainly useful to have examples someplace I
> think.

No, my point is that the examples need more explanation.  Right now
the EC example is over 200 lines of brain-exploding code! :-)

> GvR> There's an easy way (that few people seem to know) to cause
> GvR> __getattr__ to be called for virtually all attribute
> GvR> accesses: put *all* (user-visible) attributes in a separate
> GvR> dictionary.  If you want to prevent access to this dictionary
> GvR> too (for Zope security enforcement), make it a global indexed
> GvR> by id() -- a destructor (__del__) can take care of deleting
> GvR> entries here.
>
> Presumably that'd be a module global, right?  Maybe within Zope that
> could be protected,

Yes.

> but outside of that, that global's always going to
> be accessible.  So are methods, even if given private names.

Aha!  Another thing that I expect has been on your agenda for a long
time, but which isn't explicit in the PEP (AFAICT): findattr gives
*total* control over attribute access, unlike __getattr__ and
__setattr__ and private name mangling, which can all be defeated.
And this may be one of the things that Jim is after with
ExtensionClasses in Zope.

Although I believe that in DTML, he doesn't trust this: he uses
source-level (or bytecode-level) transformations to turn all X.Y
operations into a call into a security manager.  So I'm not sure that
the argument is very strong.
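Guido's remark that __getattr__, __setattr__, and private name mangling "can all be defeated" is easy to demonstrate. A minimal sketch (the `Guarded` class is invented for illustration; modern Python 3 syntax):

```python
class Guarded:
    def __init__(self):
        self.__secret = 42          # mangled to _Guarded__secret

    def __setattr__(self, name, value):
        # pretend to enforce a policy on every write
        self.__dict__[name] = value

g = Guarded()

# The "private" name is not reachable under its declared spelling...
try:
    g.__secret
except AttributeError:
    pass

# ...but the mangled name is public knowledge, so the protection is
# advisory only:
print(g._Guarded__secret)           # prints 42

# Likewise, __setattr__ is bypassed entirely by writing to __dict__:
g.__dict__["_Guarded__secret"] = 99
```

This is exactly why total enforcement needs something outside the normal hook machinery, whether that is findattr, restricted execution, or Zope's bytecode transformations.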
> And I
> don't think that such code would be any more readable since instead of
> self.name you'd see stuff like
>
>     def __getattr__(self, name):
>         global instdict
>         mydict = instdict[id(self)]
>         obj = mydict[name]
>         ...
>
>     def __setattr__(self, name, val):
>         global instdict
>         mydict = instdict[id(self)]
>         instdict[name] = val
>         ...
>
> and that /might/ be a problem with Jython currently, because id()'s
> may be reused.  And relying on __del__ may have unfortunate side
> effects when viewed in conjunction with garbage collection.

Fair enough.  I withdraw the suggestion, and propose restricted
execution instead.  There, you can use Bastions -- which have problems
of their own, but you do get total control.

> You're probably still unconvinced, but are you dead-set against
> it?  I can try implementing __findattr__() as a pre-__getattr__ hook
> only.  Then we can live with the current __setattr__() restrictions
> and see what the examples look like in that situation.

I am dead-set against introducing a feature that I don't fully
understand.  Let's continue this discussion.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From bckfnn at worldonline.dk Tue Dec 5 16:40:10 2000
From: bckfnn at worldonline.dk (Finn Bock)
Date: Tue, 05 Dec 2000 15:40:10 GMT
Subject: [Python-Dev] PEP 231, __findattr__()
In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> <14892.22751.921264.156010@anthem.concentric.net> <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
Message-ID: <3a2d0c29.242749@smtp.worldonline.dk>

On Tue, 05 Dec 2000 07:54:20 -0500, you wrote:

>> GvR> What does Jython do here?
>>
>> It's not exactly equivalent, because Jython's __findattr__ can't call
>> back into Python.
>
>I'd say that Jython's __findattr__ is an entirely different beast than
>what we have here.  Its main purpose in life appears to be a
>getattr equivalent that returns NULL instead of raising an exception
>when the attribute isn't found -- which is reasonable because from
>within Java, testing for null is much cheaper than checking for an
>exception, and you often need to look whether a given attribute exists
>and do some default action if not.

Correct.  It is also the method to override when making a new builtin
type and it will be called on such a type subclass regardless of the
presence of any __getattr__ hook and __dict__ content.  So I think it
has some of the properties which Barry wants.

regards,
finn

From greg at cosc.canterbury.ac.nz Wed Dec 6 00:07:06 2000
From: greg at cosc.canterbury.ac.nz (greg at cosc.canterbury.ac.nz)
Date: Wed, 06 Dec 2000 12:07:06 +1300 (NZDT)
Subject: Are you all mad? (Re: [Python-Dev] PEP 231, __findattr__())
In-Reply-To: <200012051254.HAA25502@cj20424-a.reston1.va.home.com>
Message-ID: <200012052307.MAA01082@s454.cosc.canterbury.ac.nz>

I can't believe you're even considering a magic dynamically-scoped
flag that invisibly changes the semantics of fundamental operations.
To me the idea is utterly insane!

If I understand correctly, the problem is that if you do something
like

    def __findattr__(self, name):
        if name == 'spam':
            return self.__dict__['spam']

then self.__dict__ is going to trigger a recursive __findattr__ call.

It seems to me that if you're going to have some sort of hook that is
always called on any x.y reference, you need some way of explicitly
bypassing it and getting at the underlying machinery.  I can think of
a couple of ways:

1) Make the __dict__ attribute special, so that accessing it always
   bypasses __findattr__.

2) Provide some other way of getting direct access to the attributes
   of an object, e.g. new builtins called peekattr() and pokeattr().
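Greg's peekattr()/pokeattr() builtins are hypothetical, but their intended behavior can be sketched in a few lines of present-day Python (names and semantics assumed from his description; this sketch only covers instance-dict attributes, while a real builtin would also have to consider class attributes):

```python
def peekattr(obj, name, default=None):
    # Read straight from the instance dict, bypassing __getattr__-style
    # hooks entirely.
    return obj.__dict__.get(name, default)

def pokeattr(obj, name, value):
    # Write straight into the instance dict, bypassing __setattr__-style
    # hooks entirely.
    obj.__dict__[name] = value

class Hooked:
    def __getattr__(self, name):
        raise AttributeError("hook intercepted %r" % name)

h = Hooked()
pokeattr(h, "spam", 1)
assert peekattr(h, "spam") == 1            # no hook fired
assert peekattr(h, "eggs", "n/a") == "n/a" # missing -> default, no hook
```

Reading `obj.__dict__` itself does not recurse here because classic `__getattr__` is only consulted when normal lookup fails, which is precisely the special-casing option 1) proposes to make official for the findattr hook.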
This assumes that you always know when you write a particular access whether you want it to be a "normal" or "special" one, so that you can use the appropriate mechanism. Are there any cases where this is not true? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From barry at digicool.com Wed Dec 6 03:20:40 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 5 Dec 2000 21:20:40 -0500 Subject: [Python-Dev] PEP 231, __findattr__() References: <200012032156.WAA03301@loewis.home.cs.tu-berlin.de> <14891.46227.785856.307437@anthem.concentric.net> <20001205033118.9135CA817@darjeeling.zadka.site.co.il> <14892.2868.982013.313562@anthem.concentric.net> <200012042316.SAA23081@cj20424-a.reston1.va.home.com> <14892.22751.921264.156010@anthem.concentric.net> <200012051254.HAA25502@cj20424-a.reston1.va.home.com> <3a2d0c29.242749@smtp.worldonline.dk> Message-ID: <14893.41592.701128.58110@anthem.concentric.net> >>>>> "FB" == Finn Bock writes: FB> Correct. It is also the method to override when making a new FB> builtin type and it will be called on such a type subclass FB> regardless of the presence of any __getattr__ hook and FB> __dict__ content. So I think it have some of the properties FB> which Barry wants. We had a discussion about this PEP at our group meeting today. Rather than write it all twice, I'm going to try to update the PEP and patch tonight. I think what we came up with will solve most of the problems raised, and will be implementable in Jython (I'll try to work up a Jython patch too, if I don't fall asleep first :) -Barry From barry at digicool.com Wed Dec 6 03:54:36 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 5 Dec 2000 21:54:36 -0500 Subject: Are you all mad? 
(Re: [Python-Dev] PEP 231, __findattr__()) References: <200012051254.HAA25502@cj20424-a.reston1.va.home.com> <200012052307.MAA01082@s454.cosc.canterbury.ac.nz> Message-ID: <14893.43628.61063.905227@anthem.concentric.net> >>>>> "greg" == writes: | 1) Make the __dict__ attribute special, so that accessing | it always bypasses __findattr__. You're not far from what I came up with right after our delicious lunch. We're going to invent a new protocol which passes __dict__ into the method as an argument. That way self.__dict__ doesn't need to be special cased at all because you can get at all the attributes via a local! So no recursion stop hack is necessary. More in the updated PEP and patch. -Barry From dgoodger at bigfoot.com Thu Dec 7 05:33:33 2000 From: dgoodger at bigfoot.com (David Goodger) Date: Wed, 06 Dec 2000 23:33:33 -0500 Subject: [Python-Dev] unit testing and Python regression test Message-ID: There is another unit testing implementation out there, OmPyUnit, available from: http://www.objectmentor.com/freeware/downloads.html -- David Goodger dgoodger at bigfoot.com Open-source projects: - The Go Tools Project: http://gotools.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net (soon!) From fdrake at users.sourceforge.net Thu Dec 7 07:26:54 2000 From: fdrake at users.sourceforge.net (Fred L. 
Drake) Date: Wed, 6 Dec 2000 22:26:54 -0800 Subject: [Python-Dev] [development doc updates] Message-ID: <200012070626.WAA22103@orbital.p.sourceforge.net> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Lots of small changes, but most important, more DOM documentation: http://python.sourceforge.net/devel-docs/lib/module-xml.dom.html From guido at python.org Thu Dec 7 18:48:53 2000 From: guido at python.org (Guido van Rossum) Date: Thu, 07 Dec 2000 12:48:53 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons Message-ID: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> After perusing David Ascher's proposal, several versions of his patches, and hundreds of emails exchanged on this subject (almost all of this dated April or May of 1998), I've produced a reasonable semblance of PEP 207. Get it from CVS or here on the web: http://python.sourceforge.net/peps/pep-0207.html I'd like to hear your comments, praise, and criticisms! The PEP still needs work; in particular, the minority point of view back then (that comparisons should return only Boolean results) is not adequately represented (but I *did* work in a reference to tabnanny, to ensure Tim's support :-). I'd like to work on a patch next, but I think there will be interference with Neil's coercion patch. I'm not sure how to resolve that yet; maybe I'll just wait until Neil's coercion patch is checked in. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Dec 7 18:54:51 2000 From: guido at python.org (Guido van Rossum) Date: Thu, 07 Dec 2000 12:54:51 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) Message-ID: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> I'm maybe about three quarters on the way with writing PEP 230 -- far enough along to be asking for comments.
Get it from CVS or go to: http://python.sourceforge.net/peps/pep-0230.html A prototype implementation in Python is included in the PEP; I think this shows that the implementation is not too complex (Paul Prescod's fear about my proposal). This is pretty close to what I proposed earlier (Nov 5), except that I have added warning category classes (inspired by Paul's proposal). This class also serves as the exception to be raised when warnings are turned into exceptions. Do I need to include a discussion of Paul's counter-proposal and why I rejected it? --Guido van Rossum (home page: http://www.python.org/~guido/) From Barrett at stsci.edu Thu Dec 7 23:49:02 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Thu, 7 Dec 2000 17:49:02 -0500 (EST) Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays Message-ID: <14896.1191.240597.632888@nem-srvr.stsci.edu> What is the status of PEP 209? I see David Ascher is the champion of this PEP, but nothing has been written up. Is the intention of this PEP to make the current Numeric a built-in feature of Python or to re-implement and replace the current Numeric module? The reason that I ask these questions is because I'm working on a prototype of a new N-dimensional Array module which I call Numeric 2. This new module will be much more extensible than the current Numeric. For example, new array types and universal functions can be loaded or imported on demand. We also intend to implement a record (or C-structure) type, because 1-D arrays or lists of records are a common data structure for storing photon events in astronomy and related fields. The current Numeric does not handle record types efficiently, particularly when the data type is not aligned and is in non-native endian format. To handle such data, temporary arrays must be created and alignment and byte-swapping done on them. Numeric 2 does such pre- and post-processing inside the inner-most loop which is more efficient in both time and memory. 
It also does type conversion at this level which is consistent with that proposed for PEP 208. Since many scientific users would like direct access to the array data via C pointers, we have investigated using the buffer object. We have not had much success with it, because of its implementation. I have scanned the python-dev mailing list for discussions of this issue and found that it now appears to be deprecated. My opinion on this is that a new _fundamental_ built-in type should be created for memory allocation with features and an interface similar to the _mmap_ object. I'll call this a _malloc_ object. This would allow Numeric 2 to use either object interchangeably depending on the circumstance. The _string_ type could also benefit from this new object by using a read-only version of it. Since it's an object, its memory area should be safe from inadvertent deletion. Because of these and other new features in Numeric 2, I have a keen interest in the status of PEPs 207, 208, 211, 225, and 228; and also in the proposed buffer object. I'm willing to implement this new _malloc_ object if members of the python-dev list are in agreement. Actually I see no alternative, given the current design of Numeric 2, since the Array class will initially be written completely in Python and will need a mutable memory buffer, while the _string_ type is meant to be a read-only object. All comments welcome. -- Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From DavidA at ActiveState.com Fri Dec 8 02:13:04 2000 From: DavidA at ActiveState.com (David Ascher) Date: Thu, 7 Dec 2000 17:13:04 -0800 (Pacific Standard Time) Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays In-Reply-To: <14896.1191.240597.632888@nem-srvr.stsci.edu> Message-ID: On Thu, 7 Dec 2000, Paul Barrett wrote: > What is the status of PEP 209?
I see David Ascher is the champion of > this PEP, but nothing has been written up. Is the intention of this I put my name on the PEP just to make sure it wasn't forgotten. If someone wants to champion it, their name should go on it. --david From guido at python.org Fri Dec 8 17:10:50 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 08 Dec 2000 11:10:50 -0500 Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays In-Reply-To: Your message of "Thu, 07 Dec 2000 17:49:02 EST." <14896.1191.240597.632888@nem-srvr.stsci.edu> References: <14896.1191.240597.632888@nem-srvr.stsci.edu> Message-ID: <200012081610.LAA30679@cj20424-a.reston1.va.home.com> > What is the status of PEP 209? I see David Ascher is the champion of > this PEP, but nothing has been written up. Is the intention of this > PEP to make the current Numeric a built-in feature of Python or to > re-implement and replace the current Numeric module? David has already explained why his name is on it -- basically, David's name is on several PEPs but he doesn't currently have any time to work on these, so other volunteers are most welcome to join. It is my understanding that the current Numeric is sufficiently messy in implementation and controversial in semantics that it would not be a good basis to start from. However, I do think that a basic multi-dimensional array object would be a welcome addition to core Python. > The reason that I ask these questions is because I'm working on a > prototype of a new N-dimensional Array module which I call Numeric 2. > This new module will be much more extensible than the current Numeric. > For example, new array types and universal functions can be loaded or > imported on demand. We also intend to implement a record (or > C-structure) type, because 1-D arrays or lists of records are a common > data structure for storing photon events in astronomy and related > fields. 
I'm not familiar with the use of computers in astronomy and related fields, so I'll take your word for that! :-) > The current Numeric does not handle record types efficiently, > particularly when the data type is not aligned and is in non-native > endian format. To handle such data, temporary arrays must be created > and alignment and byte-swapping done on them. Numeric 2 does such > pre- and post-processing inside the inner-most loop which is more > efficient in both time and memory. It also does type conversion at > this level which is consistent with that proposed for PEP 208. > > Since many scientific users would like direct access to the array data > via C pointers, we have investigated using the buffer object. We have > not had much success with it, because of its implementation. I have > scanned the python-dev mailing list for discussions of this issue and > found that it now appears to be deprecated. Indeed. I think it's best to leave the buffer object out of your implementation plans. There are several problems with it, and one of the backburner projects is to redesign it to be much more to the point (providing less, not more functionality). > My opinion on this is that a new _fundamental_ built-in type should be > created for memory allocation with features and an interface similar > to the _mmap_ object. I'll call this a _malloc_ object. This would > allow Numeric 2 to use either object interchangeably depending on the > circumstance. The _string_ type could also benefit from this new > object by using a read-only version of it. Since its an object, it's > memory area should be safe from inadvertent deletion. Interesting. I'm actually not sufficiently familiar with mmap to comment. But would the existing array module's array object be at all useful? You can get to the raw bytes in C (using the C buffer API, which is not deprecated) and it is extensible. 
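The array module Guido points to provides exactly that: homogeneous, contiguous, extensible storage whose raw bytes are reachable from C. A quick sketch of the Python side (shown with the modern method name `tobytes()`; in Python 2.0 the equivalent was `tostring()`):

```python
from array import array

a = array('d', [1.0, 2.0, 3.0])  # 'd' = C double, stored contiguously
a.append(4.0)                     # extensible, unlike a fixed buffer
raw = a.tobytes()                 # the underlying bytes, itemsize per element
assert len(raw) == len(a) * a.itemsize
```

For Numeric-style work its limitation is that it is strictly one-dimensional and supports only the basic C numeric typecodes, which is why a record-capable, N-dimensional object needed a new design.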
> Because of these and other new features in Numeric 2, I have a keen
> interest in the status of PEPs 207, 208, 211, 225, and 228; and also
> in the proposed buffer object.

Here are some quick comments on the mentioned PEPs.

207: Rich Comparisons.  This will go into Python 2.1.  (I just
finished the first draft of the PEP, please read it and comment.)

208: Reworking the Coercion Model.  This will go into Python 2.1.
Neil Schemenauer has mostly finished the patches already.  Please
comment.

211: Adding New Linear Algebra Operators (Greg Wilson).  This is
unlikely to go into Python 2.1.  I don't like the idea much.  If you
disagree, please let me know!  (Also, a choice has to be made between
211 and 225; I don't want to accept both, so until 225 is rejected,
211 is in limbo.)

225: Elementwise/Objectwise Operators (Zhu, Lielens).  This will
definitely not go into Python 2.1.  It adds too many new operators.

228: Reworking Python's Numeric Model.  This is a total pie-in-the-sky
PEP, and this kind of change is not likely to happen before Python
3000.

> I'm willing to implement this new _malloc_ object if members of the
> python-dev list are in agreement.  Actually I see no alternative,
> given the current design of Numeric 2, since the Array class will
> initially be written completely in Python and will need a mutable
> memory buffer, while the _string_ type is meant to be a read-only
> object.

Would you be willing to take over authorship of PEP 209?  David Ascher
and the Numeric Python community will thank you.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Fri Dec 8 19:43:39 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 13:43:39 -0500
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: Your message of "Thu, 30 Nov 2000 17:46:52 EST."
References: Message-ID: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>

After the last round of discussion, I was left with the idea that the
best thing we could do to help destructive iteration is to introduce a
{}.popitem() that returns an arbitrary (key, value) pair and deletes
it.  I wrote about this:

> > One more concern: if you repeatedly remove the *first* item, the hash
> > table will start looking lopsided.  Since we don't resize the hash
> > table on deletes, maybe picking an item at random (but not using an
> > expensive random generator!) would be better.

and Tim replied:

> Which is the reason SETL doesn't specify *which* set item is removed: if
> you always start looking at "the front" of a dict that's being consumed, the
> dict fills with turds without shrinking, you skip over them again and again,
> and consuming the entire dict is still quadratic time.
>
> Unfortunately, while using a random start point is almost always quicker
> than that, the expected time for consuming the whole dict remains quadratic.
>
> The clearest way around that is to save a per-dict search finger, recording
> where the last search left off.  Start from its current value.  Failure if
> it wraps around.  This is linear time in non-pathological cases (a
> pathological case is one in which it isn't linear time ).

I've implemented this, except I use a static variable for the finger
instead of a per-dict finger.  I'm concerned about adding 4-8 extra
bytes to each dict object for a feature that most dictionaries never
need.  So, instead, I use a single shared finger.  This works just as
well as long as this is used for a single dictionary.  For multiple
dictionaries (either used by the same thread or in different threads),
it'll work almost as well, although it's possible to make up a
pathological example that would work quadratically.

An easy example of such a pathological example is to call popitem()
for two identical dictionaries in lock step.  Comments please!
We could:

- Live with the pathological cases.

- Forget the whole thing; and then also forget about firstkey()
  etc. which has the same problem only worse.

- Fix the algorithm.  Maybe jumping criss-cross through the hash table
  like lookdict does would improve that; but I don't understand the
  math used for that ("Cycle through GF(2^n)-{0}" ???).

I've placed a patch on SourceForge:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470

The algorithm is:

    static PyObject *
    dict_popitem(dictobject *mp, PyObject *args)
    {
        static int finger = 0;
        int i;
        dictentry *ep;
        PyObject *res;

        if (!PyArg_NoArgs(args))
            return NULL;
        if (mp->ma_used == 0) {
            PyErr_SetString(PyExc_KeyError,
                            "popitem(): dictionary is empty");
            return NULL;
        }
        i = finger;
        if (i >= mp->ma_size)
            i = 0;
        while ((ep = &mp->ma_table[i])->me_value == NULL) {
            i++;
            if (i >= mp->ma_size)
                i = 0;
        }
        finger = i+1;
        res = PyTuple_New(2);
        if (res != NULL) {
            PyTuple_SET_ITEM(res, 0, ep->me_key);
            PyTuple_SET_ITEM(res, 1, ep->me_value);
            Py_INCREF(dummy);
            ep->me_key = dummy;
            ep->me_value = NULL;
            mp->ma_used--;
        }
        return res;
    }

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Fri Dec 8 19:51:49 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 13:51:49 -0500
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
Message-ID: <200012081851.NAA32254@cj20424-a.reston1.va.home.com>

Moshe proposes to add an overridable function sys.displayhook(obj)
which will be called by the interpreter for the PRINT_EXPR opcode,
instead of hardcoding the behavior.  The default implementation will
of course have the current behavior, but this makes it much simpler to
experiment with alternatives, e.g. using str() instead of repr() (or
to choose between str() and repr() based on the type).

Moshe has asked me to pronounce on this PEP.  I've thought about it,
and I'm now all for it.
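The hook described above did ship (sys.displayhook has existed since Python 2.1), so the str()-instead-of-repr() experiment Guido mentions can be sketched directly. A minimal sketch in modern Python; the builtins._ bookkeeping is an assumption mirroring the default hook's behavior:

```python
import builtins
import sys

def str_displayhook(obj):
    # Called by the interactive interpreter for each expression-statement
    # result (the PRINT_EXPR opcode).
    if obj is None:
        return                  # the default hook also suppresses None
    print(str(obj))             # the default hook would use repr() here
    builtins._ = obj            # keep the '_' convenience binding up to date

sys.displayhook = str_displayhook
```

After installing this at a REPL, evaluating a string expression would display its unquoted str() form instead of the quoted repr() form, which is exactly the kind of experiment the PEP is meant to enable.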
Moshe (or anyone else), please submit a patch to SF that shows the complete implementation! --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at home.com Fri Dec 8 20:06:50 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 8 Dec 2000 14:06:50 -0500 Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: [Guido, on sharing a search finger and getting worse-than-linear behavior in a simple test case] See my reply on SourceForge (crossed in the mails). I predict that fixing this in an acceptable way (not bulletproof, but linear-time for all predictably common cases) is a two-character change. Surprise, although maybe I'm hallucinating (would someone please confirm?): when I went to the SF patch manager page to look for your patch (using the Open Patches view), I couldn't find it. My guess is that if there are "too many" patches to fit on one screen, then unlike the SF *bug* manager, you don't get any indication that more patches exist or any control to go to the next page. From barry at digicool.com Fri Dec 8 20:18:26 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Fri, 8 Dec 2000 14:18:26 -0500 Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: <14897.13314.469255.853298@anthem.concentric.net> >>>>> "TP" == Tim Peters writes: TP> Surprise, although maybe I'm hallucinating (would someone TP> please confirm?): when I went to the SF patch manager page to TP> look for your patch (using the Open Patches view), I couldn't TP> find it. My guess is that if there are "too many" patches to TP> fit on one screen, then unlike the SF *bug* manager, you don't TP> get any indication that more patches exist or any control to TP> go to the next page. I haven't checked recently, but this was definitely true a few weeks ago. 
I think I even submitted an admin request on it, but I don't remember for sure. -Barry From Barrett at stsci.edu Fri Dec 8 22:22:39 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Fri, 8 Dec 2000 16:22:39 -0500 (EST) Subject: [Python-Dev] What's up with PEP 209: Adding Multidimensional Arrays In-Reply-To: <200012081610.LAA30679@cj20424-a.reston1.va.home.com> References: <14896.1191.240597.632888@nem-srvr.stsci.edu> <200012081610.LAA30679@cj20424-a.reston1.va.home.com> Message-ID: <14897.10309.686024.254701@nem-srvr.stsci.edu> Guido van Rossum writes: > > What is the status of PEP 209? I see David Ascher is the champion of > > this PEP, but nothing has been written up. Is the intention of this > > PEP to make the current Numeric a built-in feature of Python or to > > re-implement and replace the current Numeric module? > > David has already explained why his name is on it -- basically, > David's name is on several PEPs but he doesn't currently have any time > to work on these, so other volunteers are most welcome to join. > > It is my understanding that the current Numeric is sufficiently messy > in implementation and controversial in semantics that it would not be > a good basis to start from. That is our (Rick, Perry, and I) belief also. > However, I do think that a basic multi-dimensional array object would > be a welcome addition to core Python. That's re-assuring. > Indeed. I think it's best to leave the buffer object out of your > implementation plans. There are several problems with it, and one of > the backburner projects is to redesign it to be much more to the point > (providing less, not more functionality). I agree and have already made the decision to leave it out. > > My opinion on this is that a new _fundamental_ built-in type should be > > created for memory allocation with features and an interface similar > > to the _mmap_ object. I'll call this a _malloc_ object. 
> > This would
> > allow Numeric 2 to use either object interchangeably depending on the
> > circumstance. The _string_ type could also benefit from this new
> > object by using a read-only version of it. Since it's an object, its
> > memory area should be safe from inadvertent deletion.

> Interesting. I'm actually not sufficiently familiar with mmap to
> comment. But would the existing array module's array object be at all
> useful? You can get to the raw bytes in C (using the C buffer API,
> which is not deprecated) and it is extensible.

I tried using this but had problems. I'll look into it again.

> > Because of these and other new features in Numeric 2, I have a keen
> > interest in the status of PEPs 207, 208, 211, 225, and 228; and also
> > in the proposed buffer object.

> Here are some quick comments on the mentioned PEPs.

I've got these PEPs on my desk and will comment on them when I can.

> > I'm willing to implement this new _malloc_ object if members of the
> > python-dev list are in agreement. Actually I see no alternative,
> > given the current design of Numeric 2, since the Array class will
> > initially be written completely in Python and will need a mutable
> > memory buffer, while the _string_ type is meant to be a read-only
> > object.

> Would you be willing to take over authorship of PEP 209? David Ascher
> and the Numeric Python community will thank you.

Yes, I'd gladly wield vast and inconsiderate power over unsuspecting pythoneers. ;-)

-- Paul

From guido at python.org Fri Dec 8 23:58:03 2000
From: guido at python.org (Guido van Rossum)
Date: Fri, 08 Dec 2000 17:58:03 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Thu, 07 Dec 2000 12:54:51 EST." <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com>
Message-ID: <200012082258.RAA02389@cj20424-a.reston1.va.home.com>

Nobody seems to care much about the warnings PEP so far. What's up?
Are you all too busy buying presents for the holidays? Then get me some too, please? :-)

> http://python.sourceforge.net/peps/pep-0230.html

I've now produced a prototype implementation for the C code:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102715&group_id=5470

Issues:

- This defines a C API PyErr_Warn(category, message) instead of Py_Warn(message, category) as the PEP proposes. I actually like this better: it's consistent with PyErr_SetString() etc. rather than with the Python warn(message[, category]) function.

- This calls the Python module from C. We'll have to see if this is fast enough. I wish I could postpone the import of warnings.py until the first call to PyErr_Warn(), but unfortunately the warning category classes must be initialized first (so they can be passed into PyErr_Warn()). The current version of warnings.py imports rather a lot of other modules (e.g. re and getopt); this can be reduced by placing those imports inside the functions that use them.

- All the issues listed in the PEP.

Please comment!

BTW: somebody overwrote the PEP on SourceForge with an older version. Please remember to do a "cvs update" before running "make install" in the peps directory!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From gstein at lyra.org Sat Dec 9 00:26:51 2000
From: gstein at lyra.org (Greg Stein)
Date: Fri, 8 Dec 2000 15:26:51 -0800
Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Dec 08, 2000 at 01:43:39PM -0500
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: <20001208152651.H30644@lyra.org>

On Fri, Dec 08, 2000 at 01:43:39PM -0500, Guido van Rossum wrote:
>...
> Comments please! We could:
>
> - Live with the pathological cases.

I agree: live with it. The typical case will operate just fine.

> - Forget the whole thing; and then also forget about firstkey()
> etc.
which has the same problem only worse. No opinion. > - Fix the algorithm. Maybe jumping criss-cross through the hash table > like lookdict does would improve that; but I don't understand the > math used for that ("Cycle through GF(2^n)-{0}" ???). No need. The keys were inserted randomly, so sequencing through is effectively random. :-) >... > static PyObject * > dict_popitem(dictobject *mp, PyObject *args) > { > static int finger = 0; > int i; > dictentry *ep; > PyObject *res; > > if (!PyArg_NoArgs(args)) > return NULL; > if (mp->ma_used == 0) { > PyErr_SetString(PyExc_KeyError, > "popitem(): dictionary is empty"); > return NULL; > } > i = finger; > if (i >= mp->ma_size) > ir = 0; Should be "i = 0" Cheers, -g -- Greg Stein, http://www.lyra.org/ From tismer at tismer.com Sat Dec 9 17:44:14 2000 From: tismer at tismer.com (Christian Tismer) Date: Sat, 09 Dec 2000 18:44:14 +0200 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> Message-ID: <3A32615E.D39B68D2@tismer.com> Guido van Rossum wrote: > > After the last round of discussion, I was left with the idea that the > best thing we could do to help destructive iteration is to introduce a > {}.popitem() that returns an arbitrary (key, value) pair and deletes > it. I wrote about this: > > > > One more concern: if you repeatedly remove the *first* item, the hash > > > table will start looking lobsided. Since we don't resize the hash > > > table on deletes, maybe picking an item at random (but not using an > > > expensive random generator!) would be better. > > and Tim replied: > > > Which is the reason SETL doesn't specify *which* set item is removed: if > > you always start looking at "the front" of a dict that's being consumed, the > > dict fills with turds without shrinking, you skip over them again and again, > > and consuming the entire dict is still quadratic time. 
> > > > Unfortunately, while using a random start point is almost always quicker > > than that, the expected time for consuming the whole dict remains quadratic. > > > > The clearest way around that is to save a per-dict search finger, recording > > where the last search left off. Start from its current value. Failure if > > it wraps around. This is linear time in non-pathological cases (a > > pathological case is one in which it isn't linear time ). > > I've implemented this, except I use a static variable for the finger > intead of a per-dict finger. I'm concerned about adding 4-8 extra > bytes to each dict object for a feature that most dictionaries never > need. So, instead, I use a single shared finger. This works just as > well as long as this is used for a single dictionary. For multiple > dictionaries (either used by the same thread or in different threads), > it'll work almost as well, although it's possible to make up a > pathological example that would work qadratically. > > An easy example of such a pathological example is to call popitem() > for two identical dictionaries in lock step. > > Comments please! We could: > > - Live with the pathological cases. > > - Forget the whole thing; and then also forget about firstkey() > etc. which has the same problem only worse. > > - Fix the algorithm. Maybe jumping criss-cross through the hash table > like lookdict does would improve that; but I don't understand the > math used for that ("Cycle through GF(2^n)-{0}" ???). That algorithm is really a gem which you should know, so let me try to explain it. Intro: A little story about finite field theory (very basic). ------------------------------------------------------------- For every prime p and every power p^n, there exists a Galois Field ( GF(p^n) ), which is a finite field. The additive group is called "elementary Abelian", it is commutative, and it looks a little like a vector space, since addition works in cycles modulo p for every p cell. 
The multiplicative group is cyclic, and it never touches 0. Cyclic groups are generated by a single primitive element. The powers of that element make up all the other elements. For all elements x of the multiplicative group GF(p^n)*, the equality x^(p^n - 1) == 1 holds. A generator element is therefore a primitive (p^n - 1)th root of unity.

From nas at arctrix.com Sat Dec 9 12:30:06 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 9 Dec 2000 03:30:06 -0800
Subject: [Python-Dev] PEP 208 and __coerce__
Message-ID: <20001209033006.A3737@glacier.fnational.com>

While working on the implementation of PEP 208, I discovered that __coerce__ has some surprising properties. Initially I implemented __coerce__ so that the numeric operation currently being performed was called on the values returned by __coerce__. This caused test_class to blow up due to code like this:

    class Test:
        def __coerce__(self, other):
            return (self, other)

The 2.0 implementation "solves" this by not calling __coerce__ again if the objects returned by __coerce__ are instances. This has the effect of making code like:

    class A:
        def __coerce__(self, other):
            return B(), other

    class B:
        def __coerce__(self, other):
            return 1, other

    A() + 1

fail to work in the expected way. The question is: how should __coerce__ work? One option is to leave it working the way it does in 2.0. Alternatively, I could change it so that if coerce returns (self, *) then __coerce__ is not called again.

Neil

From mal at lemburg.com Sat Dec 9 19:49:29 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 09 Dec 2000 19:49:29 +0100
Subject: [Python-Dev] PEP 208 and __coerce__
References: <20001209033006.A3737@glacier.fnational.com>
Message-ID: <3A327EB9.BD2CA3CC@lemburg.com>

Neil Schemenauer wrote:
>
> While working on the implementation of PEP 208, I discovered that
> __coerce__ has some surprising properties.
> Initially I
> implemented __coerce__ so that the numeric operation currently
> being performed was called on the values returned by __coerce__.
> This caused test_class to blow up due to code like this:
>
>     class Test:
>         def __coerce__(self, other):
>             return (self, other)
>
> The 2.0 implementation "solves" this by not calling __coerce__ again if the
> objects returned by __coerce__ are instances. This has the
> effect of making code like:
>
>     class A:
>         def __coerce__(self, other):
>             return B(), other
>
>     class B:
>         def __coerce__(self, other):
>             return 1, other
>
>     A() + 1
>
> fail to work in the expected way. The question is: how should
> __coerce__ work? One option is to leave it working the way it does
> in 2.0. Alternatively, I could change it so that if coerce
> returns (self, *) then __coerce__ is not called again.

+0 -- the idea behind PEP 208 is to get rid of the centralized coercion mechanism, so fixing it to allow yet more obscure variants should be carefully considered. I see __coerce__ et al. as old-style mechanisms -- operator methods have much more information available to do the right thing than the single bottleneck __coerce__.

-- Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From tim.one at home.com Sat Dec 9 21:49:04 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 9 Dec 2000 15:49:04 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
Message-ID: 

[Guido]
> ...
> I've implemented this, except I use a static variable for the finger
> instead of a per-dict finger. I'm concerned about adding 4-8 extra
> bytes to each dict object for a feature that most dictionaries never
> need.

It's a bit ironic that dicts are guaranteed to be at least 1/3 wasted space .
Let's pick on Christian's idea to reclaim a few bytes of that. > So, instead, I use a single shared finger. This works just as > well as long as this is used for a single dictionary. For multiple > dictionaries (either used by the same thread or in different threads), > it'll work almost as well, although it's possible to make up a > pathological example that would work qadratically. > > An easy example of such a pathological example is to call popitem() > for two identical dictionaries in lock step. Please see my later comments attached to the patch: http://sourceforge.net/patch/?func=detailpatch&patch_id=102733&group_id=5470 In short, for me (truly) identical dicts perform well with or without my suggestion, while dicts cloned via dict.copy() perform horribly with or without my suggestion (their internal structures differ); still curious as to whether that's also true for you (am I looking at a Windows bug? I don't see how, but it's possible ...). In any case, my suggestion turned out to be worthless on my box. Playing around via simulations suggests that a shared finger is going to be disastrous when consuming more than one dict unless they have identical internal structure (not just compare equal). As soon as they get a little out of synch, it just gets worse with each succeeding probe. > Comments please! We could: > > - Live with the pathological cases. How boring . > - Forget the whole thing; and then also forget about firstkey() > etc. which has the same problem only worse. I don't know that this is an important idea for dicts in general (it is important for sets) -- it's akin to an xrange for dicts. But then I've had more than one real-life program that built giant dicts then ran out of memory trying to iterate over them! I'd like to fix that. > - Fix the algorithm. Maybe jumping criss-cross through the hash table > like lookdict does would improve that; but I don't understand the > math used for that ("Cycle through GF(2^n)-{0}" ???). 
Christian explained that well (thanks!). However, I still don't see any point to doing that business in .popitem(): when inserting keys, the jitterbug probe sequence has the crucial benefit of preventing primary clustering when keys collide. But when we consume a dict, we just want to visit every slot as quickly as possible.

[Christian]
> Appendix, on the use of finger:
> -------------------------------
>
> Instead of using a global finger variable, you can do the
> following (involving a cast from object to int) :
>
> - if the 0'th slot of the dict is non-empty:
>   return this element and insert the dummy element
>   as key. Set the value field to what the Dictionary Algorithm
>   would give for the removed object's hash. This is the
>   next finger.
> - else:
>   treat the value field of the 0'th slot as the last finger.
>   If it is zero, initialize it with 2^n-1.
>   Repetitively use the DA until you find an entry. Save
>   the finger in slot 0 again.
>
> This doesn't cost an extra slot, and even when the dictionary
> is written between removals, the chance to lose the finger
> is just 1:(2^n-1) on every insertion.

I like that, except:

1) As above, I don't believe the GF business buys anything over a straightforward search when consuming a dict.

2) Overloading the value field bristles with problems, in part because it breaks the invariant that a slot is unused if and only if the value field is NULL, in part because C doesn't guarantee that you can get away with casting an arbitrary int to a pointer and back again.

None of the problems in #2 arise if we abuse the me_hash field instead, so the attached does that.
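[Archive note: a pure-Python model may help see why the search finger makes consumption linear. This is an illustrative simulation only -- a list with None marking empty slots stands in for the hash table, and popitem_with_finger is a made-up name, not the C patch.]

```python
def popitem_with_finger(table, state):
    """Pop an arbitrary non-empty slot from 'table' (None == empty),
    resuming the scan at the saved finger instead of always at slot 0."""
    n = len(table)
    i = state.get("finger", 0) % n
    while table[i] is None:
        i = (i + 1) % n
    item, table[i] = table[i], None
    state["finger"] = i + 1  # later pops skip the already-consumed prefix
    return item

# Consuming the whole table visits each slot O(1) times in total;
# without the finger, every pop would rescan the growing run of
# emptied slots, giving the quadratic behavior discussed above.
table = [(k, k * k) for k in range(8)]
state = {}
items = [popitem_with_finger(table, state) for _ in range(8)]
```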
Here's a typical run of Guido's test case using this (on an 866MHz machine w/ 256Mb RAM -- the early values jump all over the place from run to run):

run = 0
log2size = 10 size = 1024
    7.4 usec per item to build (total 0.008 sec)
    3.4 usec per item to destroy twins (total 0.003 sec)
log2size = 11 size = 2048
    6.7 usec per item to build (total 0.014 sec)
    3.4 usec per item to destroy twins (total 0.007 sec)
log2size = 12 size = 4096
    7.0 usec per item to build (total 0.029 sec)
    3.7 usec per item to destroy twins (total 0.015 sec)
log2size = 13 size = 8192
    7.1 usec per item to build (total 0.058 sec)
    5.9 usec per item to destroy twins (total 0.048 sec)
log2size = 14 size = 16384
    14.7 usec per item to build (total 0.241 sec)
    6.4 usec per item to destroy twins (total 0.105 sec)
log2size = 15 size = 32768
    12.2 usec per item to build (total 0.401 sec)
    3.9 usec per item to destroy twins (total 0.128 sec)
log2size = 16 size = 65536
    7.8 usec per item to build (total 0.509 sec)
    4.0 usec per item to destroy twins (total 0.265 sec)
log2size = 17 size = 131072
    7.9 usec per item to build (total 1.031 sec)
    4.1 usec per item to destroy twins (total 0.543 sec)

The last one is over 100 usec per item using the original patch (with or without my first suggestion).

if-i-were-a-betting-man-i'd-say-"bingo"-ly y'rs - tim

Drop-in replacement for the popitem in the patch:

static PyObject *
dict_popitem(dictobject *mp, PyObject *args)
{
    int i = 0;
    dictentry *ep;
    PyObject *res;

    if (!PyArg_NoArgs(args))
        return NULL;
    if (mp->ma_used == 0) {
        PyErr_SetString(PyExc_KeyError,
                        "popitem(): dictionary is empty");
        return NULL;
    }
    /* Set ep to "the first" dict entry with a value.  We abuse the hash
     * field of slot 0 to hold a search finger:
     * If slot 0 has a value, use slot 0.
     * Else slot 0 is being used to hold a search finger,
     * and we use its hash value as the first index to look.
     */
    ep = &mp->ma_table[0];
    if (ep->me_value == NULL) {
        i = (int)ep->me_hash;
        /* The hash field may be uninitialized trash, or it
         * may be a real hash value, or it may be a legit
         * search finger, or it may be a once-legit search
         * finger that's out of bounds now because it
         * wrapped around or the table shrunk -- simply
         * make sure it's in bounds now.
         */
        if (i >= mp->ma_size || i < 1)
            i = 1;  /* skip slot 0 */
        while ((ep = &mp->ma_table[i])->me_value == NULL) {
            i++;
            if (i >= mp->ma_size)
                i = 1;
        }
    }
    res = PyTuple_New(2);
    if (res != NULL) {
        PyTuple_SET_ITEM(res, 0, ep->me_key);
        PyTuple_SET_ITEM(res, 1, ep->me_value);
        Py_INCREF(dummy);
        ep->me_key = dummy;
        ep->me_value = NULL;
        mp->ma_used--;
    }
    assert(mp->ma_table[0].me_value == NULL);
    mp->ma_table[0].me_hash = i + 1;  /* next place to start */
    return res;
}

From tim.one at home.com Sat Dec 9 22:09:30 2000
From: tim.one at home.com (Tim Peters)
Date: Sat, 9 Dec 2000 16:09:30 -0500
Subject: [Python-Dev] RE: {}.popitem() (was Re: {}.first[key,value,item] ...)
In-Reply-To: 
Message-ID: 

> assert(mp->ma_table[0].me_value == NULL);
> mp->ma_table[0].me_hash = i + 1;  /* next place to start */

Ack, those two lines should move up into the "if (res != NULL)" block.

errors-are-error-prone-ly y'rs - tim

From gvwilson at nevex.com Sun Dec 10 17:11:09 2000
From: gvwilson at nevex.com (Greg Wilson)
Date: Sun, 10 Dec 2000 11:11:09 -0500
Subject: [Python-Dev] re: So You Want to Write About Python?
Message-ID: 

Hi, folks. Jon Erickson (Doctor Dobb's Journal), Frank Willison (O'Reilly), and I (professional loose cannon) are doing a workshop at IPC on writing books and magazine articles about Python. It would be great to have a few articles (in various stages of their lives) and/or book proposals from people on this list to use as examples. So, if you think the world oughta know about the things you're doing, and would like to use this to help get yourself motivated to start writing, please drop me a line.
I'm particularly interested in:

- the real-world issues involved in moving to Unicode

- non-trivial XML processing using SAX and DOM (where "non-trivial" means "including namespaces, entity references, error handling, and all that")

- the theory and practice of stackless, generators, and continuations

- the real-world tradeoffs between the various memory management schemes that are now available for Python

- feature comparisons of various Foobars that can be used with Python (where "Foobar" could be "GUI toolkit", "IDE", "web scripting toolkit", or just about anything else)

- performance analysis and tuning of Python itself (as an example of how you speed up real applications --- this is something that matters a lot in the real world, but tends to get forgotten in school)

- just about anything else that you wish someone had written for you before you started your last big project

Thanks,
Greg

From paul at prescod.net Sun Dec 10 19:02:27 2000
From: paul at prescod.net (Paul Prescod)
Date: Sun, 10 Dec 2000 10:02:27 -0800
Subject: [Python-Dev] Warning Framework (PEP 230)
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com>
Message-ID: <3A33C533.ABA27C7C@prescod.net>

Guido van Rossum wrote:
>
> Nobody seems to care much about the warnings PEP so far. What's up?
> Are you all too busy buying presents for the holidays? Then get me
> some too, please? :-)

My opinions:

* it should be a built-in or keyword, not a function in "sys". Warning is supposed to be as easy as possible so people will do it often. sys.argv and sys.stdout annoy me as it is.

* the term "level" applied to warnings typically means "warning level" as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something.

* this level idea gives rise to another question. What if I want to see the full stack context of a warning? Do I have to implement a whole new warning output hook?
It seems like I should be able to specify this as a command line option alongside the action.

* I prefer ":*:*:" to ":::" for leaving parts of the warning spec out.

* should there be a sys.formatwarning? What if I want to redirect warnings to a socket -- I'd like to use the standard formatting machinery. Or vice versa, I might want to change the formatting but not override the destination.

* there should be a "RuntimeWarning" -- base category for warnings about dubious runtime behaviors (e.g. integer division truncated value)

* it should be possible to strip warnings as an optimization step. That may require interpreter and syntax support.

* warnings will usually be tied to tests which the user will want to be able to optimize out also. (e.g. if __debug__ and type(foo)==StringType: warn "Should be Unicode!")

I propose:

    >>> warn conditional, message[, category]

to be very parallel with

    >>> assert conditional, message

I'm not proposing the use of the assert keyword anymore, but I am trying to reuse the syntax for familiarity. Perhaps -Wstrip would strip warnings out of the bytecode.

Paul Prescod

From nas at arctrix.com Sun Dec 10 14:46:46 2000
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 10 Dec 2000 05:46:46 -0800
Subject: [Python-Dev] Reference implementation for PEP 208 (coercion)
Message-ID: <20001210054646.A5219@glacier.fnational.com>

SourceForge uploads are not working. The latest version of the patch for PEP 208 is here:

    http://arctrix.com/nas/python/coerce-6.0.diff

Operations on instances now call __coerce__ if it exists. I think the patch is now complete. Converting other builtin types to "new style numbers" can be done with a separate patch.

Neil

From guido at python.org Sun Dec 10 23:17:08 2000
From: guido at python.org (Guido van Rossum)
Date: Sun, 10 Dec 2000 17:17:08 -0500
Subject: [Python-Dev] Warning Framework (PEP 230)
In-Reply-To: Your message of "Sun, 10 Dec 2000 10:02:27 PST."
<3A33C533.ABA27C7C@prescod.net>
References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net>
Message-ID: <200012102217.RAA12550@cj20424-a.reston1.va.home.com>

> My opinions:
>
> * it should be a built-in or keyword, not a function in "sys". Warning
> is supposed to be as easy as possible so people will do it often.

Disagree. Warnings are there mostly for the Python system to warn the Python programmer. The most heavy use will come from the standard library, not from user code.

> sys.argv and sys.stdout annoy me as it is.

Too bad.

> * the term "level" applied to warnings typically means "warning level"
> as in -W1 -W2 -Wall. Proposal: call it "stacklevel" or something.

Good point.

> * this level idea gives rise to another question. What if I want to see
> the full stack context of a warning? Do I have to implement a whole new
> warning output hook? It seems like I should be able to specify this as a
> command line option alongside the action.

Turn warnings into errors and you'll get a full traceback. If you really want a full traceback without exiting, some creative use of sys._getframe() and the traceback module will probably suit you well.

> * I prefer ":*:*:" to ":::" for leaving parts of the warning spec out.

I don't.

> * should there be a sys.formatwarning? What if I want to redirect
> warnings to a socket -- I'd like to use the standard formatting
> machinery. Or vice versa, I might want to change the formatting but not
> override the destination.

Good point. I'm changing this to:

    def showwarning(message, category, filename, lineno, file=None):
        """Hook to write a warning to a file; replace if you like."""

and

    def formatwarning(message, category, filename, lineno):
        """Hook to format a warning the standard way."""

> * there should be a "RuntimeWarning" -- base category for warnings
> about dubious runtime behaviors (e.g. integer division truncated value)

OK.
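[Archive note: the showwarning()/formatwarning() split discussed here survives in today's warnings module, with an extra trailing line argument, so Paul's redirect-but-keep-the-standard-formatting case can be written as follows. A sketch using modern signatures; the list stands in for a socket or any other destination.]

```python
import warnings

captured = []

def showwarning_to_list(message, category, filename, lineno,
                        file=None, line=None):
    # Reuse the standard formatter, but send the text elsewhere
    # (a list here; a socket write would work the same way).
    captured.append(warnings.formatwarning(message, category,
                                           filename, lineno, line))

warnings.simplefilter("always")      # make sure the warning isn't filtered
warnings.showwarning = showwarning_to_list
warnings.warn("integer division truncated value", RuntimeWarning)
```

Replacing warnings.showwarning is the documented override point; formatwarning keeps the familiar "file:lineno: Category: message" layout.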
> * it should be possible to strip warnings as an optimization step. That > may require interpreter and syntax support. I don't see the point of this. I think this comes from our different views on who should issue warnings. > * warnings will usually be tied to tests which the user will want to be > able to optimize out also. (e.g. if __debug__ and type(foo)==StringType: > warn "Should be Unicode!") > > I propose: > > >>> warn conditional, message[, category] Sorry, this is not worth a new keyword. > to be very parallel with > > >>> assert conditional, message > > I'm not proposing the use of the assert keyword anymore, but I am trying > to reuse the syntax for familiarity. Perhaps -Wstrip would strip > warnings out of the bytecode. Why? --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at effbot.org Mon Dec 11 01:16:25 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Mon, 11 Dec 2000 01:16:25 +0100 Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com> Message-ID: <000901c06307$9a814d60$3c6340d5@hagrid> Guido wrote: > Moshe proposes to add an overridable function sys.displayhook(obj) > which will be called by the interpreter for the PRINT_EXPR opcode, > instead of hardcoding the behavior. The default implementation will > of course have the current behavior, but this makes it much simpler to > experiment with alternatives, e.g. using str() instead of repr() (or > to choose between str() and repr() based on the type). hmm. instead of patching here and there, what's stopping us from doing it the right way? 
I'd prefer something like:

    import code

    class myCLI(code.InteractiveConsole):
        def displayhook(self, data):
            # non-standard display hook
            print str(data)

    sys.setcli(myCLI())

(in other words, why not move the *entire* command line interface over to Python code)

From guido at python.org Mon Dec 11 03:24:20 2000
From: guido at python.org (Guido van Rossum)
Date: Sun, 10 Dec 2000 21:24:20 -0500
Subject: [Python-Dev] PEP 217 - Display Hook for Interactive Use
In-Reply-To: Your message of "Mon, 11 Dec 2000 01:16:25 +0100." <000901c06307$9a814d60$3c6340d5@hagrid>
References: <200012081851.NAA32254@cj20424-a.reston1.va.home.com> <000901c06307$9a814d60$3c6340d5@hagrid>
Message-ID: <200012110224.VAA12844@cj20424-a.reston1.va.home.com>

> Guido wrote:
> > Moshe proposes to add an overridable function sys.displayhook(obj)
> > which will be called by the interpreter for the PRINT_EXPR opcode,
> > instead of hardcoding the behavior. The default implementation will
> > of course have the current behavior, but this makes it much simpler to
> > experiment with alternatives, e.g. using str() instead of repr() (or
> > to choose between str() and repr() based on the type).

Effbot regurgitates:

> hmm. instead of patching here and there, what's stopping us
> from doing it the right way? I'd prefer something like:
>
>     import code
>
>     class myCLI(code.InteractiveConsole):
>         def displayhook(self, data):
>             # non-standard display hook
>             print str(data)
>
>     sys.setcli(myCLI())
>
> (in other words, why not move the *entire* command line interface
> over to Python code)

Indeed, this is why I've been hesitant to bless Moshe's hack. I finally decided to go for it because I don't see this redesign of the CLI happening anytime soon. In order to do it right, it would require a redesign of the parser input handling, which is probably the oldest code in Python (short of the long integer math, which predates Python by several years).
The current code module is a hack, alas, and doesn't always get it right the same way as the *real* CLI does things. So, rather than wait forever for the perfect solution, I think it's okay to settle for less sooner. "Now is better than never." --Guido van Rossum (home page: http://www.python.org/~guido/) From paulp at ActiveState.com Mon Dec 11 07:59:29 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Sun, 10 Dec 2000 22:59:29 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com> Message-ID: <3A347B51.ADB3F12C@ActiveState.com> Guido van Rossum wrote: > >... > > Disagree. Warnings are there mostly for the Python system to warn the > Python programmer. The most heavy use will come from the standard > library, not from user code. Most Python code is part of some library or another. It may not be the standard library but it's still a library. Perl and Java both make warnings (especially about deprecation) very easy *for user code*. > > * it should be possible to strip warnings as an optimization step. That > > may require interpreter and syntax support. > > I don't see the point of this. I think this comes from our different > views on who should issue warnings. Everyone who creates a reusable library will want to issue warnings. That is to say, most serious Python programmers. Anyhow, let's presume that it is only the standard library that issues warnings (for argument's sake). What if I have a speed-critical module that triggers warnings in an inner loop? Turning off the warning doesn't turn off the overhead of the warning infrastructure. I should be able to turn off the overhead easily -- ideally from the Python command line. And I still feel that part of that "overhead" is in the code that tests to determine whether to issue the warnings. 
There should be a way to turn off that overhead also. Paul From paulp at ActiveState.com Mon Dec 11 08:23:17 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Sun, 10 Dec 2000 23:23:17 -0800 Subject: [Python-Dev] Online help PEP Message-ID: <3A3480E5.C2577AE6@ActiveState.com> PEP: ??? Title: Python Online Help Version: $Revision: 1.0 $ Author: paul at prescod.net, paulp at activestate.com (Paul Prescod) Status: Draft Type: Standards Track Python-Version: 2.1 Status: Incomplete Abstract This PEP describes a command-line driven online help facility for Python. The facility should be able to build on existing documentation facilities such as the Python documentation and docstrings. It should also be extensible for new types and modules. Interactive use: Simply typing "help" describes the help function (through repr overloading). "help" can also be used as a function: The function takes the following forms of input: help( "string" ) -- built-in topic or global help( ) -- docstring from object or type help( "doc:filename" ) -- filename from Python documentation If you ask for a global, it can be a fully-qualified name such as help("xml.dom"). You can also use the facility from a command-line python --help if In either situation, the output does paging similar to the "more" command. Implementation The help function is implemented in an onlinehelp module which is demand-loaded. There should be options for fetching help information from environments other than the command line through the onlinehelp module: onlinehelp.gethelp(object_or_string) -> string It should also be possible to override the help display function by assigning to onlinehelp.displayhelp(object_or_string). The module should be able to extract module information from either the HTML or LaTeX versions of the Python documentation. Links should be accommodated in a "lynx-like" manner. 
Over time, it should also be able to recognize when docstrings are in "special" syntaxes like structured text, HTML and LaTeX and decode them appropriately. A prototype implementation is available with the Python source distribution as nondist/sandbox/doctools/onlinehelp.py. Built-in Topics help( "intro" ) - What is Python? Read this first! help( "keywords" ) - What are the keywords? help( "syntax" ) - What is the overall syntax? help( "operators" ) - What operators are available? help( "builtins" ) - What functions, types, etc. are built-in? help( "modules" ) - What modules are in the standard library? help( "copyright" ) - Who owns Python? help( "moreinfo" ) - Where is there more information? help( "changes" ) - What changed in Python 2.0? help( "extensions" ) - What extensions are installed? help( "faq" ) - What questions are frequently asked? help( "ack" ) - Who has done work on Python lately? Security Issues This module will attempt to import modules with the same names as requested topics. Don't use the modules if you are not confident that everything in your pythonpath is from a trusted source. Local Variables: mode: indented-text indent-tabs-mode: nil End: From tim.one at home.com Mon Dec 11 08:36:57 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 11 Dec 2000 02:36:57 -0500 Subject: [Python-Dev] FW: [Python-Help] indentation Message-ID: While we're talking about pluggable CLIs, I share this fellow's confusion over IDLE's CLI variant: block code doesn't "look right" under IDLE because sys.ps2 doesn't exist under IDLE. Some days you can't make *anybody* happy . -----Original Message----- ... Subject: [Python-Help] indentation Sent: Sunday, December 10, 2000 7:32 AM ... My Problem has to do with identation: I put the following to idle: >>> if not 1: print 'Hallo' else: SyntaxError: invalid syntax I get the Message above. I know that else must be 4 spaces to the left, but idle doesn't let me do this. I have only the alternative to put to most left point. 
But than I disturb the block structure and I get again the error message. I want to have it like this: >>> if not 1: print 'Hallo' else: Can you help me? ... From fredrik at pythonware.com Mon Dec 11 12:36:53 2000 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon, 11 Dec 2000 12:36:53 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> Message-ID: <033701c06366$ab746580$0900a8c0@SPIFF> christian wrote: > That algorithm is really a gem which you should know, > so let me try to explain it. I think someone just won the "brain exploder 2000" award ;-) to paraphrase Bertrand Russell, "Mathematics may be defined as the subject where I never know what you are talking about, nor whether what you are saying is true" cheers /F From thomas at xs4all.net Mon Dec 11 13:12:09 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Mon, 11 Dec 2000 13:12:09 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF>; from fredrik@pythonware.com on Mon, Dec 11, 2000 at 12:36:53PM +0100 References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: <20001211131208.G4396@xs4all.nl> On Mon, Dec 11, 2000 at 12:36:53PM +0100, Fredrik Lundh wrote: > christian wrote: > > That algorithm is really a gem which you should know, > > so let me try to explain it. > I think someone just won the "brain exploder 2000" award ;-) By acclamation, I'd expect. I know it was the best laugh I had since last week's Have I Got News For You, even though trying to understand it made me glad I had boring meetings to recuperate in ;) Highschool-dropout-ly y'rs, -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
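The indentation confusion in the forwarded message above comes down to alignment: the interactive compiler only accepts `else` when it lines up exactly with its `if`, while IDLE's auto-indent keeps the cursor at the depth of the block body. A minimal sketch of both cases with the built-in compile() in 'single' mode (the mode the interactive prompt uses):

```python
# An aligned 'else' is accepted in 'single' (interactive) mode.
src_ok = "if not 1:\n    x = 'Hallo'\nelse:\n    x = 'other'\n"
code_ok = compile(src_ok, "<stdin>", "single")

# An 'else' left at the indentation of the block body -- which is where
# IDLE's auto-indent puts the cursor -- is rejected outright.
src_bad = "if not 1:\n    x = 'Hallo'\n    else:\n        x = 'other'\n"
try:
    compile(src_bad, "<stdin>", "single")
    raised = False
except SyntaxError:
    raised = True
```

In IDLE the workaround is to delete the auto-indent before typing `else:`, so the keyword returns to the same column as the `if`.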
From mal at lemburg.com Mon Dec 11 13:33:18 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 11 Dec 2000 13:33:18 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: <3A34C98E.7C42FD24@lemburg.com> Fredrik Lundh wrote: > > christian wrote: > > That algorithm is really a gem which you should know, > > so let me try to explain it. > > I think someone just won the "brain exploder 2000" award ;-) > > to paraphrase Bertrand Russell, > > "Mathematics may be defined as the subject where I never > know what you are talking about, nor whether what you are > saying is true" Hmm, I must have missed that one... care to repost ? -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tismer at tismer.com Mon Dec 11 14:49:48 2000 From: tismer at tismer.com (Christian Tismer) Date: Mon, 11 Dec 2000 15:49:48 +0200 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: <3A34DB7C.FF7E82CE@tismer.com> Fredrik Lundh wrote: > > christian wrote: > > That algorithm is really a gem which you should know, > > so let me try to explain it. > > I think someone just won the "brain exploder 2000" award ;-) > > to paraphrase Bertrand Russell, > > "Mathematics may be defined as the subject where I never > know what you are talking about, nor whether what you are > saying is true" :-)) Well, I was primarily targeting Guido, who said that he came from math, and one cannot study math without standing a basic algebra course, I think. 
I tried my best to explain it for those who know at least how groups, fields, rings and automorphisms work. Going into more details of the theory would be off-topic for python-dev, but I will try it in an upcoming DDJ article. As you might have guessed, I didn't do this just for fun. It is the old game of explaining what is there, convincing everybody that you at least know what you are talking about, and then three days later coming up with an improved application of the theory. Today is Monday, 2 days left. :-) ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From guido at python.org Mon Dec 11 16:12:24 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 10:12:24 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: Your message of "Mon, 11 Dec 2000 15:49:48 +0200." <3A34DB7C.FF7E82CE@tismer.com> References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> Message-ID: <200012111512.KAA23622@cj20424-a.reston1.va.home.com> > Fredrik Lundh wrote: > > > > christian wrote: > > > That algorithm is really a gem which you should know, > > > so let me try to explain it. > > > > I think someone just won the "brain exploder 2000" award ;-) > > > > to paraphrase Bertrand Russell, > > > > "Mathematics may be defined as the subject where I never > > know what you are talking about, nor whether what you are > > saying is true" > > :-)) > > Well, I was primarily targeting Guido, who said that he > came from math, and one cannot study math without standing > a basic algebra course, I think. 
I tried my best to explain > it for those who know at least how groups, fields, rings > and automorphisms work. Going into more details of the > theory would be off-topic for python-dev, but I will try > it in an upcoming DDJ article. I do have a math degree, but it is 18 years old and I had to give up after the first paragraph of your explanation. It made me vividly recall the first and only class on Galois Theory that I ever took -- after one hour I realized that this was not for me and I didn't have a math brain after all. I went back to the basement where the software development lab was (i.e. a row of card punches :-). > As you might have guessed, I didn't do this just for fun. > It is the old game of explaining what is there, convincing > everybody that you at least know what you are talking about, > and then three days later coming up with an improved > application of the theory. > > Today is Monday, 2 days left. :-) I'm very impressed. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Dec 11 16:15:02 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 10:15:02 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Sun, 10 Dec 2000 22:59:29 PST." <3A347B51.ADB3F12C@ActiveState.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <200012102217.RAA12550@cj20424-a.reston1.va.home.com> <3A347B51.ADB3F12C@ActiveState.com> Message-ID: <200012111515.KAA23764@cj20424-a.reston1.va.home.com> [me] > > Disagree. Warnings are there mostly for the Python system to warn the > > Python programmer. The most heavy use will come from the standard > > library, not from user code. [Paul Prescod] > Most Python code is part of some library or another. It may not be the > standard library but its still a library. 
Perl and Java both make > warnings (especially about deprecation) very easy *for user code*. Hey. I'm not making it impossible to use warnings. I'm making it very easy. All you have to do is put "from warnings import warn" at the top of your library module. Requesting a built-in or even a new statement is simply excessive. > > > * it should be possible to strip warnings as an optimization step. That > > > may require interpreter and syntax support. > > > > I don't see the point of this. I think this comes from our different > > views on who should issue warnings. > > Everyone who creates a reusable library will want to issue warnings. > That is to say, most serious Python programmers. > > Anyhow, let's presume that it is only the standard library that issues > warnings (for argument's sake). What if I have a speed-critical module > that triggers warnings in an inner loop? Turning off the warning doesn't > turn off the overhead of the warning infrastructure. I should be able to > turn off the overhead easily -- ideally from the Python command line. > And I still feel that part of that "overhead" is in the code that tests > to determine whether to issue the warnings. There should be a way to > turn off that overhead also. So rewrite your code so that it doesn't trigger the warning. When you get a warning, you're doing something that could be done in a better way. So don't whine about the performance. It's a quality of implementation issue whether C code that tests for issues that deserve warnings can do the test without slowing down code that doesn't deserve a warning. Ditto for standard library code. Here's an example. I expect there will eventually (not in 2.1 yet!) be warnings in the deprecated string module. If you get such a warning in a time-critical piece of code, the solution is to use string methods -- not to whine about the performance of the backwards compatibility code. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From barry at digicool.com Mon Dec 11 17:02:29 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Mon, 11 Dec 2000 11:02:29 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> Message-ID: <14900.64149.910989.998348@anthem.concentric.net> Some of my thoughts after reading the PEP and Paul/Guido's exchange. - A function in the warn module is better than one in the sys module. "from warnings import warn" is good enough to not warrant a built-in. I get the sense that the PEP description is behind Guido's current implementation here. - When PyErr_Warn() returns 1, does that mean a warning has been transmuted into an exception, or some other exception occurred during the setting of the warning? (I think I know, but the PEP could be clearer here). - It would be nice if lineno can be a range specification. Other matches are based on regexps -- think of this as a line number regexp. - Why not do setupwarnings() in site.py? - Regexp matching on messages should be case insensitive. - The second argument to sys.warn() or PyErr_Warn() can be any class, right? If so, it's easy for me to have my own warning classes. What if I want to set up my own warnings filters? Maybe if `action' could be a callable as well as a string. Then in my IDE, I could set that to "mygui.popupWarningsDialog". -Barry From guido at python.org Mon Dec 11 16:57:33 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 10:57:33 -0500 Subject: [Python-Dev] Online help PEP In-Reply-To: Your message of "Sun, 10 Dec 2000 23:23:17 PST." <3A3480E5.C2577AE6@ActiveState.com> References: <3A3480E5.C2577AE6@ActiveState.com> Message-ID: <200012111557.KAA24266@cj20424-a.reston1.va.home.com> I approve of the general idea. Barry, please assign a PEP number. > PEP: ??? 
> Title: Python Online Help > Version: $Revision: 1.0 $ > Author: paul at prescod.net, paulp at activestate.com (Paul Prescod) > Status: Draft > Type: Standards Track > Python-Version: 2.1 > Status: Incomplete > > Abstract > > This PEP describes a command-line driven online help facility > for Python. The facility should be able to build on existing > documentation facilities such as the Python documentation > and docstrings. It should also be extensible for new types and > modules. > > Interactive use: > > Simply typing "help" describes the help function (through repr > overloading). Cute -- like license, copyright, credits I suppose. > "help" can also be used as a function: > > The function takes the following forms of input: > > help( "string" ) -- built-in topic or global Why does a global require string quotes? > help( ) -- docstring from object or type > help( "doc:filename" ) -- filename from Python documentation I'm missing help() -- table of contents I'm not sure if the table of contents should be printed by the repr output. > If you ask for a global, it can be a fully-qualfied name such as > help("xml.dom"). Why are the string quotes needed? When are they useful? > You can also use the facility from a command-line > > python --help if Is this really useful? Sounds like Perlism to me. > In either situation, the output does paging similar to the "more" > command. Agreed. But how to implement paging in a platform-dependent manner? On Unix, os.system("more") or "$PAGER" is likely to work. On Windows, I suppose we could use its MORE, although that's pretty braindead. On the Mac? Also, inside IDLE or Pythonwin, invoking the system pager isn't a good idea. > Implementation > > The help function is implemented in an onlinehelp module which is > demand-loaded. What does "demand-loaded" mean in a Python context? 
> There should be options for fetching help information from > environments other than the command line through the onlinehelp > module: > > onlinehelp.gethelp(object_or_string) -> string Good idea. > It should also be possible to override the help display function by > assigning to onlinehelp.displayhelp(object_or_string). Good idea. Pythonwin and IDLE could use this. But I'd like it to work at least "okay" if they don't. > The module should be able to extract module information from either > the HTML or LaTeX versions of the Python documentation. Links should > be accommodated in a "lynx-like" manner. I think this is beyond the scope. The LaTeX isn't installed anywhere (and processing would be too much work). The HTML is installed only on Windows, where there already is a way to get it to pop up in your browser (actually two: it's in the Start menu, and also in IDLE's Help menu). > Over time, it should also be able to recognize when docstrings are > in "special" syntaxes like structured text, HTML and LaTeX and > decode them appropriately. A standard syntax for docstrings is under development, PEP 216. I don't agree with the proposal there, but in any case the help PEP should not attempt to legalize a different format than PEP 216. > A prototype implementation is available with the Python source > distribution as nondist/sandbox/doctools/onlinehelp.py. Neat. I noticed that in a 24-line screen, the pagesize must be set to 21 to avoid stuff scrolling off the screen. Maybe there's an off-by-3 error somewhere? I also noticed that it always prints '1' when invoked as a function. The new license pager in site.py avoids this problem. help("operators") and several others raise an AttributeError('handledocrl'). The "lynx-like links" don't work. > Built-in Topics > > help( "intro" ) - What is Python? Read this first! > help( "keywords" ) - What are the keywords? > help( "syntax" ) - What is the overall syntax? > help( "operators" ) - What operators are available? 
> help( "builtins" ) - What functions, types, etc. are built-in? > help( "modules" ) - What modules are in the standard library? > help( "copyright" ) - Who owns Python? > help( "moreinfo" ) - Where is there more information? > help( "changes" ) - What changed in Python 2.0? > help( "extensions" ) - What extensions are installed? > help( "faq" ) - What questions are frequently asked? > help( "ack" ) - Who has done work on Python lately? I think it's naive to expect this help facility to replace browsing the website or the full documentation package. There should be one entry that says to point your browser there (giving the local filesystem URL if available), and that's it. The rest of the online help facility should be concerned with exposing doc strings. > Security Issues > > This module will attempt to import modules with the same names as > requested topics. Don't use the modules if you are not confident > that everything in your pythonpath is from a trusted source. Yikes! Another reason to avoid the "string" -> global variable option. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Dec 11 17:53:37 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 11:53:37 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 11:02:29 EST." <14900.64149.910989.998348@anthem.concentric.net> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> Message-ID: <200012111653.LAA24545@cj20424-a.reston1.va.home.com> > Some of my thoughts after reading the PEP and Paul/Guido's exchange. > > - A function in the warn module is better than one in the sys module. > "from warnings import warn" is good enough to not warrant a > built-in. I get the sense that the PEP description is behind > Guido's currently implementation here. Yes. 
I've updated the PEP to match the (2nd) implementation. > - When PyErr_Warn() returns 1, does that mean a warning has been > transmuted into an exception, or some other exception occurred > during the setting of the warning? (I think I know, but the PEP > could be clearer here). I've clarified this now: it returns 1 in either case. You have to do exception handling in either case. I'm not telling why -- you don't need to know. The caller of PyErr_Warn() should not attempt to catch the exception -- if that's your intent, you shouldn't be calling PyErr_Warn(). And PyErr_Warn() is complicated enough that it has to allow raising an exception. > - It would be nice if lineno can be a range specification. Other > matches are based on regexps -- think of this as a line number > regexp. Too much complexity already. > - Why not do setupwarnings() in site.py? See the PEP and the current implementation. The delayed-loading of the warnings module means that we have to save the -W options as sys.warnoptions. (This also makes them work when multiple interpreters are used -- they all get the -W options.) > - Regexp matching on messages should be case insensitive. Good point! Done in my version of the code. > - The second argument to sys.warn() or PyErr_Warn() can be any class, > right? Almost. It must be derived from __builtin__.Warning. > If so, it's easy for me to have my own warning classes. > What if I want to set up my own warnings filters? Maybe if `action' > could be a callable as well as a string. Then in my IDE, I could > set that to "mygui.popupWarningsDialog". No, for that purpose you would override warnings.showwarning(). 
--Guido van Rossum (home page: http://www.python.org/~guido/) From thomas at xs4all.net Mon Dec 11 17:58:39 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Mon, 11 Dec 2000 17:58:39 +0100 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <14900.64149.910989.998348@anthem.concentric.net>; from barry@digicool.com on Mon, Dec 11, 2000 at 11:02:29AM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> Message-ID: <20001211175839.H4396@xs4all.nl> On Mon, Dec 11, 2000 at 11:02:29AM -0500, Barry A. Warsaw wrote: > - A function in the warn module is better than one in the sys module. > "from warnings import warn" is good enough to not warrant a > built-in. I get the sense that the PEP description is behind > Guido's current implementation here. +1 on this. I have a response to Guido's first posted PEP on my laptop, but due to a weekend in Germany wasn't able to post it before he updated the PEP. I guess I can delete the arguments for this, now ;) but let's just say I think 'sys' is being a bit overused, and the case of a function in sys and its data in another module is just plain silly. > - When PyErr_Warn() returns 1, does that mean a warning has been > transmuted into an exception, or some other exception occurred > during the setting of the warning? (I think I know, but the PEP > could be clearer here). How about returning 1 for 'warning turned into exception' and -1 for 'normal exception' ? It would be slightly more similar to other functions if '-1' meant 'exception', and it would be easy to put in an if statement -- and still allow C code to ignore the produced error, if it wanted to. > - It would be nice if lineno can be a range specification. Other > matches are based on regexps -- think of this as a line number > regexp. +0 on this... 
I'm not sure if such fine-grained control is really necessary. I liked the hint at 'per function' granularity, but I realise it's tricky to do right, what with naming issues and all that. > - Regexp matching on messages should be case insensitive. How about being able to pass in compiled regexp objects as well as strings ? I haven't looked at the implementation at all, so I'm not sure how expensive it would be, but it might also be nice to have users (= programmers) pass in an object with its own 'match' method, so you can 'interactively' decide whether or not to raise an exception, popup a window, and what not. Sort of like letting 'action' be a callable, which I think is a good idea as well. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido at python.org Mon Dec 11 18:11:02 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 12:11:02 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 17:58:39 +0100." <20001211175839.H4396@xs4all.nl> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> Message-ID: <200012111711.MAA24818@cj20424-a.reston1.va.home.com> > > - When PyErr_Warn() returns 1, does that mean a warning has been > > transmuted into an exception, or some other exception occurred > > during the setting of the warning? (I think I know, but the PEP > > could be clearer here). > > How about returning 1 for 'warning turned into exception' and -1 for 'normal > exception' ? It would be slightly more similar to other functions if '-1' > meant 'exception', and it would be easy to put in an if statement -- and > still allow C code to ignore the produced error, if it wanted to. Why would you want this? The user clearly said that they wanted the exception! 
--Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at effbot.org Mon Dec 11 18:13:10 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Mon, 11 Dec 2000 18:13:10 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com> Message-ID: <009a01c06395$a9da3220$3c6340d5@hagrid> > Hmm, I must have missed that one... care to repost ? doesn't everyone here read the daily URL? here's a link: http://mail.python.org/pipermail/python-dev/2000-December/010913.html From barry at digicool.com Mon Dec 11 18:18:04 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Mon, 11 Dec 2000 12:18:04 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> Message-ID: <14901.3149.109401.151742@anthem.concentric.net> >>>>> "GvR" == Guido van Rossum writes: GvR> I've clarified this now: it returns 1 in either case. You GvR> have to do exception handling in either case. I'm not GvR> telling why -- you don't need to know. The caller of GvR> PyErr_Warn() should not attempt to catch the exception -- if GvR> that's your intent, you shouldn't be calling PyErr_Warn(). GvR> And PyErr_Warn() is complicated enough that it has to allow GvR> raising an exception. Makes sense. >> - It would be nice if lineno can be a range specification. >> Other matches are based on regexps -- think of this as a line >> number regexp. GvR> Too much complexity already. Okay, no biggie I think. >> - Why not do setupwarnings() in site.py? GvR> See the PEP and the current implementation. 
The GvR> delayed-loading of the warnings module means that we have to GvR> save the -W options as sys.warnoptions. (This also makes GvR> them work when multiple interpreters are used -- they all get GvR> the -W options.) Cool. >> - Regexp matching on messages should be case insensitive. GvR> Good point! Done in my version of the code. Cool. >> - The second argument to sys.warn() or PyErr_Warn() can be any >> class, right? GvR> Almost. It must be derived from __builtin__.Warning. __builtin__.Warning == exceptions.Warning, right? >> If so, it's easy for me to have my own warning classes. What >> if I want to set up my own warnings filters? Maybe if `action' >> could be a callable as well as a string. Then in my IDE, I >> could set that to "mygui.popupWarningsDialog". GvR> No, for that purpose you would override GvR> warnings.showwarning(). Cool. Looks good. -Barry From thomas at xs4all.net Mon Dec 11 19:04:56 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Mon, 11 Dec 2000 19:04:56 +0100 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012111711.MAA24818@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 12:11:02PM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com> Message-ID: <20001211190455.I4396@xs4all.nl> On Mon, Dec 11, 2000 at 12:11:02PM -0500, Guido van Rossum wrote: > > How about returning 1 for 'warning turned into exception' and -1 for 'normal > > exception' ? It would be slightly more similar to other functions if '-1' > > meant 'exception', and it would be easy to put in an if statement -- and > > still allow C code to ignore the produced error, if it wanted to. > Why would you want this? The user clearly said that they wanted the > exception! 
The difference is that in one case, the user will see the original warning-turned-exception, and in the other she won't -- the warning will be lost. At best she'll see (by looking at the traceback) the code intended to give a warning (that might or might not have been turned into an exception) and failed. The warning code might decide to do something additional to notify the user of the thing it intended to warn about, which ended up as a 'real' exception because of something else. It's no biggy, obviously, except that if you change your mind it will be hard to add it without breaking code. Even if you explicitly state the return value should be tested for boolean value, not greater-than-zero value. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From guido at python.org Mon Dec 11 19:16:58 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 13:16:58 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 19:04:56 +0100." <20001211190455.I4396@xs4all.nl> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <20001211175839.H4396@xs4all.nl> <200012111711.MAA24818@cj20424-a.reston1.va.home.com> <20001211190455.I4396@xs4all.nl> Message-ID: <200012111816.NAA25214@cj20424-a.reston1.va.home.com> > > > How about returning 1 for 'warning turned into exception' and -1 for 'normal > > > exception' ? It would be slightly more similar to other functions if '-1' > > > meant 'exception', and it would be easy to put in an if statement -- and > > > still allow C code to ignore the produced error, if it wanted to. > > > Why would you want this? The user clearly said that they wanted the > > exception!
> > The difference is that in one case, the user will see the original > warning-turned-exception, and in the other she won't -- the warning will be > lost. At best she'll see (by looking at the traceback) the code intended to > give a warning (that might or might not have been turned into an exception) > and failed. Yes -- this is a standard convention in Python. If there's a bug in code that is used to raise or handle an exception, you get a traceback from that bug. > The warning code might decide to do something additional to > notify the user of the thing it intended to warn about, which ended up as a > 'real' exception because of something else. Nah. The warning code shouldn't worry about that. If there's a bug in PyErr_Warn(), that should get top priority until it's fixed. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Mon Dec 11 19:12:56 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 11 Dec 2000 19:12:56 +0100 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34C98E.7C42FD24@lemburg.com> <009a01c06395$a9da3220$3c6340d5@hagrid> Message-ID: <3A351928.3A41C970@lemburg.com> Fredrik Lundh wrote: > > > Hmm, I must have missed that one... care to repost ? > > doesn't everyone here read the daily URL? No time for pull logic... only push logic ;-) > here's a link: > http://mail.python.org/pipermail/python-dev/2000-December/010913.html Thanks. A very nice introduction indeed. The only thing which didn't come through in the first reading: why do we need GF(p^n)'s in the first place ? The second reading then made this clear: we need to assure that by iterating through the set of possible coefficients we can actually reach all slots in the dictionary... a gem indeed.
Now if we could only figure out an equally simple way of producing perfect hash functions on-the-fly we could eliminate the need for the PyObject_Compare()s... ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tim.one at home.com Mon Dec 11 21:22:55 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 11 Dec 2000 15:22:55 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <033701c06366$ab746580$0900a8c0@SPIFF> Message-ID: [/F, on Christian's GF tutorial] > I think someone just won the "brain exploder 2000" award ;-) Well, anyone can play. When keys collide, what we need is a function f(i) such that repeating i = f(i) visits every int in (0, 2**N) exactly once before setting i back to its initial value, for a fixed N and where the first i is in (0, 2**N). This is the quickest:

def f(i):
    i -= 1
    if i == 0:
        i = 2**N-1
    return i

Unfortunately, this leads to performance-destroying "primary collisions" (see Knuth, or any other text w/ a section on hashing). Other *good* possibilities include a pseudo-random number generator of maximal period, or viewing the ints in (0, 2**N) as bit vectors indicating set membership and generating all subsets of an N-element set in a Gray code order. The *form* of the function dictobject.c actually uses is:

def f(i):
    i <<= 1
    if i >= 2**N:
        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

which is suitably non-linear and as fast as the naive method. Given the form of the function, you don't need any theory at all to find a value for MAGIC_CONSTANT_DEPENDING_ON_N that simply works. In fact, I verified all the magic values in dictobject.c via brute force, because the person who contributed the original code botched the theory slightly and gave us some values that didn't work.
I'll rely on the theory if and only if we have to extend this to 64-bit machines someday: I'm too old to wait for a brute search of a space with 2**64 elements . mathematics-is-a-battle-against-mortality-ly y'rs - tim From greg at cosc.canterbury.ac.nz Mon Dec 11 22:46:11 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 12 Dec 2000 10:46:11 +1300 (NZDT) Subject: [Python-Dev] Online help PEP In-Reply-To: <200012111557.KAA24266@cj20424-a.reston1.va.home.com> Message-ID: <200012112146.KAA01771@s454.cosc.canterbury.ac.nz> Guido: > Paul Prescod: > > In either situation, the output does paging similar to the "more" > > command. > Agreed. Only if it can be turned off! I usually prefer to use the scrolling capabilities of whatever shell window I'm using rather than having some program's own idea of how to do paging forced upon me when I don't want it. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From moshez at zadka.site.co.il Tue Dec 12 07:33:02 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Tue, 12 Dec 2000 08:33:02 +0200 (IST) Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) Message-ID: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il> On Mon, 11 Dec 2000 15:22:55 -0500, "Tim Peters" wrote: > Well, anyone can play. When keys collide, what we need is a function f(i) > such that repeating > i = f(i) > visits every int in (0, 2**N) exactly once before setting i back to its > initial value, for a fixed N and where the first i is in (0, 2**N). OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)? Did 0 harm you in your childhood, and you're trying to get back? <0 wink>. If we had an affine operation, instead of a linear one, we could have [0, 2**n). 
I won't repeat the proof here but changing

> def f(i):
>     i <<= 1
      i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i

Makes you waltz all over [0, 2**n) if the original made you complete (0, 2**n). if-i'm-wrong-then-someone-should-shoot-me-to-save-me-the-embarrassment-ly y'rs, Z. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tim.one at home.com Mon Dec 11 23:38:56 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 11 Dec 2000 17:38:56 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: <20001212063302.05E0BA82E@darjeeling.zadka.site.co.il> Message-ID: [Tim] > Well, anyone can play. When keys collide, what we need is a > function f(i) such that repeating > i = f(i) > visits every int in (0, 2**N) exactly once before setting i back to its > initial value, for a fixed N and where the first i is in (0, 2**N). [Moshe Zadka] > OK, maybe this is me being *real* stupid, but why? Why not [0, 2**n)? > Did 0 harm you in your childhood, and you're trying to get > back? <0 wink>. We don't need f at all unless we've already determined there's a collision at some index h. The i sequence is used to offset h (mod 2**N). An increment of 0 would waste time (h+0 == h, but we've already done a full compare on the h'th table entry and already determined it wasn't equal to what we're looking for). IOW, there are only 2**N-1 slots still of interest by the time f is needed.

> If we had an affine operation, instead of a linear one, we could have
> [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you complete
> (0, 2**n).

But, Moshe! The proof would have been the most interesting part .
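The maximal-period claim in this thread is easy to check by brute force, exactly as Tim says he did for dictobject.c's magic values. Here is a small sketch in modern Python; the function names, the choice of N, and the search range are illustrative, not the actual dictobject.c constants:

```python
# Brute-force check of the probe recurrence discussed above:
#     i <<= 1
#     if i >= 2**N:
#         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
# A good magic constant makes the sequence visit every int in
# (0, 2**N) exactly once before returning to its starting value.

def cycle_length(n_bits, magic, start=1):
    """Steps taken before the recurrence returns to `start`."""
    i = start
    steps = 0
    while True:
        i <<= 1
        if i >= 1 << n_bits:
            i ^= magic
        steps += 1
        if i == start:
            return steps

def find_magics(n_bits):
    """Magic constants giving the maximal period 2**n_bits - 1."""
    full = (1 << n_bits) - 1
    good = []
    # Any workable magic must have bit n_bits set (to clear the
    # shifted-out bit), so search [2**n, 2**(n+1)).
    for magic in range(1 << n_bits, 1 << (n_bits + 1)):
        i, seen, ok = 1, set(), True
        for _ in range(full):
            i <<= 1
            if i >= 1 << n_bits:
                i ^= magic
            if i in seen or not 0 < i <= full:
                ok = False
                break
            seen.add(i)
        if ok and i == 1:  # full cycle: all of (0, 2**n) visited once
            good.append(magic)
    return good

print(find_magics(3))       # [11, 13]
print(cycle_length(3, 11))  # 7 == 2**3 - 1, the full cycle
```

Note that find_magics(2) yields [7], matching the N = 2, magic = 7 test program Tim attaches later in this thread.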
From gstein at lyra.org Tue Dec 12 01:15:50 2000 From: gstein at lyra.org (Greg Stein) Date: Mon, 11 Dec 2000 16:15:50 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012111653.LAA24545@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 11:53:37AM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> Message-ID: <20001211161550.Y7732@lyra.org> On Mon, Dec 11, 2000 at 11:53:37AM -0500, Guido van Rossum wrote: >... > > - The second argument to sys.warn() or PyErr_Warn() can be any class, > > right? > > Almost. It must be derived from __builtin__.Warning. Since you must do "from warnings import warn" before using the warnings, then I think it makes sense to put the Warning classes into the warnings module. (e.g. why increase the size of the builtins?) Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at python.org Tue Dec 12 01:39:31 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 19:39:31 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 16:15:50 PST." <20001211161550.Y7732@lyra.org> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> Message-ID: <200012120039.TAA02983@cj20424-a.reston1.va.home.com> > Since you must do "from warnings import warn" before using the warnings, > then I think it makes sense to put the Warning classes into the warnings > module. (e.g. why increase the size of the builtins?) 
I don't particularly care whether the Warning category classes are builtins, but I can't declare them in the warnings module. Typical use from C is: if (PyErr_Warn(PyExc_DeprecationWarning, "the strop module is deprecated")) return NULL; PyErr_Warn() imports the warnings module on its first call. But the value of PyExc_DeprecationWarning c.s. must be available *before* the first call, so they can't be imported from the warnings module! My first version imported warnings at the start of the program, but this almost doubled the start-up time, hence the design where the module is imported only when needed. The most convenient place to create the Warning category classes is in the _exceptions module; doing it the easiest way there means that they are automatically exported to __builtin__. This doesn't bother me enough to try and hide them. --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Tue Dec 12 02:11:02 2000 From: gstein at lyra.org (Greg Stein) Date: Mon, 11 Dec 2000 17:11:02 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012120039.TAA02983@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 07:39:31PM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> Message-ID: <20001211171102.C7732@lyra.org> On Mon, Dec 11, 2000 at 07:39:31PM -0500, Guido van Rossum wrote: > > Since you must do "from warnings import warn" before using the warnings, > > then I think it makes sense to put the Warning classes into the warnings > > module. (e.g. why increase the size of the builtins?) 
> > I don't particularly care whether the Warning category classes are > builtins, but I can't declare them in the warnings module. Typical > use from C is: > > if (PyErr_Warn(PyExc_DeprecationWarning, > "the strop module is deprecated")) > return NULL; > > PyErr_Warn() imports the warnings module on its first call. But the > value of PyExc_DeprecationWarning c.s. must be available *before* the > first call, so they can't be imported from the warnings module! Do the following: pywarn.h or pyerrors.h: #define PyWARN_DEPRECATION "DeprecationWarning" ... if (PyErr_Warn(PyWARN_DEPRECATION, "the strop module is deprecated")) return NULL; The PyErr_Warn would then use the string to dynamically look up / bind to the correct value from the warnings module. By using the symbolic constant, you will catch typos in the C code (e.g. if people passed raw strings, then a typo won't be found until runtime; using symbols will catch the problem at compile time). The above strategy will allow for fully-delayed loading, and for all the warnings to be located in the "warnings" module. Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at python.org Tue Dec 12 02:21:41 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Dec 2000 20:21:41 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Mon, 11 Dec 2000 17:11:02 PST." <20001211171102.C7732@lyra.org> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> <20001211171102.C7732@lyra.org> Message-ID: <200012120121.UAA04576@cj20424-a.reston1.va.home.com> > > PyErr_Warn() imports the warnings module on its first call. But the > > value of PyExc_DeprecationWarning c.s. 
must be available *before* the > > first call, so they can't be imported from the warnings module! > > Do the following: > > pywarn.h or pyerrors.h: > > #define PyWARN_DEPRECATION "DeprecationWarning" > > ... > if (PyErr_Warn(PyWARN_DEPRECATION, > "the strop module is deprecated")) > return NULL; > > The PyErr_Warn would then use the string to dynamically look up / bind to > the correct value from the warnings module. By using the symbolic constant, > you will catch typos in the C code (e.g. if people passed raw strings, then > a typo won't be found until runtime; using symbols will catch the problem at > compile time). > > The above strategy will allow for fully-delayed loading, and for all the > warnings to be located in the "warnings" module. Yeah, that would be a possibility, if it was deemed evil that the warnings appear in __builtin__. I don't see what's so evil about that. (There's also the problem that the C code must be able to create new warning categories, as long as they are derived from the Warning base class. Your approach above doesn't support this. I'm sure you can figure a way around that too. But I prefer to hear why you think it's needed first.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Tue Dec 12 02:26:00 2000 From: gstein at lyra.org (Greg Stein) Date: Mon, 11 Dec 2000 17:26:00 -0800 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012120121.UAA04576@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Dec 11, 2000 at 08:21:41PM -0500 References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <3A33C533.ABA27C7C@prescod.net> <14900.64149.910989.998348@anthem.concentric.net> <200012111653.LAA24545@cj20424-a.reston1.va.home.com> <20001211161550.Y7732@lyra.org> <200012120039.TAA02983@cj20424-a.reston1.va.home.com> <20001211171102.C7732@lyra.org> <200012120121.UAA04576@cj20424-a.reston1.va.home.com> Message-ID: <20001211172600.E7732@lyra.org> On Mon, Dec 11, 2000 at 08:21:41PM -0500, Guido van Rossum wrote: >... > > The above strategy will allow for fully-delayed loading, and for all the > > warnings to be located in the "warnings" module. > > Yeah, that would be a possibility, if it was deemed evil that the > warnings appear in __builtin__. I don't see what's so evil about > that. > > (There's also the problem that the C code must be able to create new > warning categories, as long as they are derived from the Warning base > class. Your approach above doesn't support this. I'm sure you can > figure a way around that too. But I prefer to hear why you think it's > needed first.) I'm just attempting to avoid dumping more names into __builtins__ is all. 
I don't believe there is anything intrinsically bad about putting more names in there, but avoiding the kitchen-sink metaphor for __builtins__ has got to be a Good Thing :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From guido at python.org Tue Dec 12 14:43:59 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Dec 2000 08:43:59 -0500 Subject: [Python-Dev] Request review of gdbm patch Message-ID: <200012121343.IAA06713@cj20424-a.reston1.va.home.com> I'm asking for a review of the patch to gdbm at http://sourceforge.net/patch/?func=detailpatch&patch_id=102638&group_id=5470 I asked the author for clarification and this is what I got. Can anybody suggest what to do? His mail doesn't give me much confidence in the patch. :-( --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message Date: Tue, 12 Dec 2000 13:24:13 +0100 From: Damjan To: Guido van Rossum Subject: Re: your gdbm patch for Python On Mon, Dec 11, 2000 at 03:51:03PM -0500, Guido van Rossum wrote: > I'm looking at your patch at SourceForge: First, I'm sorry it was such a mess of a patch, but I couldn't figure out how to send a more elaborate comment. (But then again, I wouldn't have an email from Guido van Rossum in my mail-box, to show off to my friends :) > and wondering two things: > > (1) what does the patch do? > > (2) why does the patch remove the 'f' / GDBM_FAST option? From the gdbm info page: ...The following may also be logically or'd into the database flags: GDBM_SYNC, which causes all database operations to be synchronized to the disk, and GDBM_NOLOCK, which prevents the library from performing any locking on the database file. The option GDBM_FAST is now obsolete, since `gdbm' defaults to no-sync mode... ^^^^^^^^ (1) My patch adds two options to the gdbm.open(..) function. These are 'u' for GDBM_NOLOCK, and 's' for GDBM_SYNC. (2) GDBM_FAST is obsolete because gdbm defaults to GDBM_FAST, so it's removed.
I'm also thinking about adding lock and unlock methods to the gdbm object, but it seems that a gdbm database can only be locked and not unlocked. - -- Damjan Georgievski | Skopje, Macedonia ------- End of Forwarded Message From mal at lemburg.com Tue Dec 12 14:49:40 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Tue, 12 Dec 2000 14:49:40 +0100 Subject: [Python-Dev] Codec aliasing and naming Message-ID: <3A362CF4.2082A606@lemburg.com> I just wanted to inform you of a change I plan for the standard encodings search function to enable better support for aliasing of encoding names. The current implementation caches the aliases returned from the codecs .getaliases() function in the encodings lookup cache rather than in the alias cache. As a consequence, the hyphen to underscore mapping is not applied to the aliases. A codec would have to return a list of all combinations of names with hyphens and underscores in order to emulate the standard lookup behaviour. I have a patch which fixes this and also assures that aliases cannot be overwritten by codecs which register at some later point in time. This assures that we won't run into situations where a codec import suddenly overrides behaviour of previously active codecs. I would also like to propose the use of a new naming scheme for codecs which enables drop-in installation. As discussed on the i18n-sig list, people would like to install codecs without having users call a codec registration function or touch site.py. The standard search function in the encodings package has a nice property (which I only noticed after the fact ;) which allows using Python package names in the encoding names, e.g.
you can install a package 'japanese' and then access the codecs in that package using 'japanese.shiftjis' without having to bother registering a new codec search function for the package -- the encodings package search function will redirect the lookup to the 'japanese' package. Using package names in the encoding name has several advantages:

* you know where the codec comes from
* you can have multiple codecs for the same encoding
* drop-in installation without registration is possible
* the need for a non-default encoding package is visible in the source code
* you no longer need to drop new codecs into the Python standard lib

Perhaps someone could add a note about this possibility to the codec docs ?! If no one objects, I'll apply the patch for the enhanced alias support later today. Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at python.org Tue Dec 12 14:57:01 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Dec 2000 08:57:01 -0500 Subject: [Python-Dev] Codec aliasing and naming In-Reply-To: Your message of "Tue, 12 Dec 2000 14:49:40 +0100." <3A362CF4.2082A606@lemburg.com> References: <3A362CF4.2082A606@lemburg.com> Message-ID: <200012121357.IAA06846@cj20424-a.reston1.va.home.com> > Perhaps someone could add a note about this possibility > to the codec docs ?! You can check it in yourself or mail it to Fred or submit it to SF... I don't expect anyone else will jump in and document this properly. > If no one objects, I'll apply the patch for the enhanced alias > support later today. Fine with me (but I don't use codecs -- where's the Dutch language support? :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Tue Dec 12 15:38:20 2000 From: mal at lemburg.com (M.-A.
Lemburg) Date: Tue, 12 Dec 2000 15:38:20 +0100 Subject: [Python-Dev] Codec aliasing and naming References: <3A362CF4.2082A606@lemburg.com> <200012121357.IAA06846@cj20424-a.reston1.va.home.com> Message-ID: <3A36385C.60C7F2B@lemburg.com> Guido van Rossum wrote: > > > Perhaps someone could add a note about this possibility > > to the codec docs ?! > > You can check it in yourself or mail it to Fred or submit it to SF... > I don't expect anyone else will jump in and document this properly. I'll submit a bug report so that this doesn't get lost in the archives. Don't have time for it myself... alas, no one really does seem to have time these days ;-) > > If no one objects, I'll apply the patch for the enhanced alias > > support later today. > > Fine with me (but I don't use codecs -- where's the Dutch language > support? :-). OK. About the Dutch language support: this would make a nice Christmas fun-project... a new standard module which interfaces to babel.altavista.com (hmm, they don't list Dutch as a supported language yet, but maybe if we bug them enough... ;). -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From paulp at ActiveState.com Tue Dec 12 19:11:13 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Tue, 12 Dec 2000 10:11:13 -0800 Subject: [Python-Dev] Online help PEP References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> Message-ID: <3A366A41.1A14EFD4@ActiveState.com> Guido van Rossum wrote: > >... > > help( "string" ) -- built-in topic or global > > Why does a global require string quotes? It doesn't, but if you happen to say help( "dir" ) instead of help( dir ), I think it should do the right thing. > I'm missing > > help() -- table of contents > > I'm not sure if the table of contents should be printed by the repr > output.
I don't see any benefit in having different behaviors for help and help(). > > If you ask for a global, it can be a fully-qualified name such as > > help("xml.dom"). > Why are the string quotes needed? When are they useful? When you haven't imported the thing you are asking about. Or when the string comes from another UI like an editor window, command line or web form. > > You can also use the facility from a command-line > > > > python --help if > Is this really useful? Sounds like Perlism to me. I'm just trying to make it easy to quickly get answers to Python questions. I could totally see someone writing code in VIM switching to a bash window to type: python --help os.path.dirname That's a lot easier than:

$ python
Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import os
>>> help(os.path.dirname)

And what does it hurt? > > In either situation, the output does paging similar to the "more" > > command. > Agreed. But how to implement paging in a platform-dependent manner? On Unix, os.system("more") or "$PAGER" is likely to work. On Windows, I suppose we could use its MORE, although that's pretty braindead. On the Mac? Also, inside IDLE or Pythonwin, invoking the system pager isn't a good idea. The current implementation does paging internally. You could override it to use the system pager (or no pager). > What does "demand-loaded" mean in a Python context? When you "touch" the help object, it loads the onlinehelp module which has the real implementation. The thing in __builtins__ is just a lightweight proxy. > > It should also be possible to override the help display function by > > assigning to onlinehelp.displayhelp(object_or_string). > Good idea. Pythonwin and IDLE could use this. But I'd like it to work at least "okay" if they don't. Agreed.
> > The module should be able to extract module information from either > > the HTML or LaTeX versions of the Python documentation. Links should > > be accommodated in a "lynx-like" manner. > I think this is beyond the scope. Well, we have to do one of:

* re-write a subset of the docs in a form that can be accessed from the command line
* access the existing docs in a form that's installed
* auto-convert the docs into a form that's compatible

I've already implemented HTML parsing and LaTeX parsing is actually not that far off. I just need impetus to finish a LaTeX-parsing project I started on my last vacation. The reason that LaTeX is interesting is because it would be nice to be able to move documentation from existing LaTeX files into docstrings. > The LaTeX isn't installed anywhere > (and processing would be too much work). > The HTML is installed only > on Windows, where there already is a way to get it to pop up in your > browser (actually two: it's in the Start menu, and also in IDLE's Help > menu). If the documentation becomes an integral part of the Python code, then it will be installed. It's ridiculous that it isn't already. ActivePython does install the docs on all platforms. > A standard syntax for docstrings is under development, PEP 216. I > don't agree with the proposal there, but in any case the help PEP > should not attempt to legalize a different format than PEP 216. I won't hold my breath for a standard Python docstring format. I've gone out of my way to make the code format independent. > Neat. I noticed that in a 24-line screen, the pagesize must be set to > 21 to avoid stuff scrolling off the screen. Maybe there's an off-by-3 > error somewhere? Yes. > I also noticed that it always prints '1' when invoked as a function. > The new license pager in site.py avoids this problem. Okay. > help("operators") and several others raise an > AttributeError('handledocrl'). Fixed. > The "lynx-like links" don't work. I don't think that's implemented yet.
> I think it's naive to expect this help facility to replace browsing > the website or the full documentation package. There should be one > entry that says to point your browser there (giving the local > filesystem URL if available), and that's it. The rest of the online > help facility should be concerned with exposing doc strings. I don't want to replace the documentation. But there is no reason we should set out to make it incomplete. If it's integrated with the HTML then people can choose whatever access mechanism is easiest for them. Right now I'm trying hard not to be "naive". Realistically, nobody is going to write a million docstrings between now and Python 2.1. It is much more feasible to leverage the existing documentation that Fred and others have spent months on. > > Security Issues > > > > This module will attempt to import modules with the same names as > > requested topics. Don't use the modules if you are not confident > > that everything in your pythonpath is from a trusted source. > Yikes! Another reason to avoid the "string" -> global variable > option. I don't think we should lose that option. People will want to look up information from non-executable environments like command lines, GUIs and web pages. Perhaps you can point me to techniques for extracting information from Python modules and packages without executing them. Paul From guido at python.org Tue Dec 12 21:46:09 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 12 Dec 2000 15:46:09 -0500 Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE Message-ID: <200012122046.PAA16915@cj20424-a.reston1.va.home.com> ------- Forwarded Message Date: Tue, 12 Dec 2000 12:38:20 -0800 From: noreply at sourceforge.net To: noreply at sourceforge.net Subject: SourceForge: PROJECT DOWNTIME NOTICE ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS This update is being sent to project administrators only and contains important information regarding your project. Please read it in its entirety.
INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION As noted in the sitewide email sent this week, the SourceForge.net infrastructure is being upgraded (and relocated). As part of this project, plans are underway to further increase capacity and responsiveness. We are scheduling the relocation of the systems serving project subdomain web pages. IMPORTANT: This move will affect you in the following ways:

1. Service and availability of SourceForge.net and the development tools provided will continue uninterrupted.
2. Project page webservers hosting subdomains (yourprojectname.sourceforge.net) will be down Friday December 15 from 9PM PST (12AM EST) until 3AM PST.
3. CVS will be unavailable (read only part of the time) from 7PM until 3AM PST
4. Mailing lists and mail aliases will be unavailable until 3AM PST

- --------------------- This email was sent from sourceforge.net. To change your email receipt preferences, please visit the site and edit your account via the "Account Maintenance" link. Direct any questions to admin at sourceforge.net, or reply to this email. ------- End of Forwarded Message From greg at cosc.canterbury.ac.nz Tue Dec 12 23:42:01 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 13 Dec 2000 11:42:01 +1300 (NZDT) Subject: [Python-Dev] Online help PEP In-Reply-To: <3A366A41.1A14EFD4@ActiveState.com> Message-ID: <200012122242.LAA01902@s454.cosc.canterbury.ac.nz> Paul Prescod: > Guido: > > Why are the string quotes needed? When are they useful? > When you haven't imported the thing you are asking about. It would be interesting if the quoted form allowed you to extract doc info from a module *without* having the side effect of importing it. This could no doubt be done for pure Python modules. Would be rather tricky for extension modules, though, I expect.
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From barry at digicool.com Wed Dec 13 03:21:36 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 12 Dec 2000 21:21:36 -0500 Subject: [Python-Dev] Two new PEPs, 232 & 233 Message-ID: <14902.56624.20961.768525@anthem.concentric.net> I've just uploaded two new PEPs. 232 is a revision of my pre-PEP era function attribute proposal. 233 is Paul Prescod's proposal for an on-line help facility. http://python.sourceforge.net/peps/pep-0232.html http://python.sourceforge.net/peps/pep-0233.html Let the games begin, -Barry From tim.one at home.com Wed Dec 13 04:34:35 2000 From: tim.one at home.com (Tim Peters) Date: Tue, 12 Dec 2000 22:34:35 -0500 Subject: [Python-Dev] {}.popitem() (was Re: {}.first[key,value,item] ...) In-Reply-To: Message-ID: [Moshe Zadka] > If we had an affine operation, instead of a linear one, we could have > [0, 2**n). I won't repeat the proof here but changing
>
> def f(i):
>     i <<= 1
>     i^=1 # This is the line I added
>     if i >= 2**N:
>         i ^= MAGIC_CONSTANT_DEPENDING_ON_N
>     return i
>
> Makes you waltz all over [0, 2**n) if the original made you comple (0, 2**n). [Tim] > But, Moshe! The proof would have been the most interesting part . Turns out the proof would have been intensely interesting, as you can see by running the attached with and without the new line commented out. don't-ever-trust-a-theoretician-ly y'rs - tim

N = 2
MAGIC_CONSTANT_DEPENDING_ON_N = 7

def f(i):
    i <<= 1
    # i^=1 # This is the line I added
    if i >= 2**N:
        i ^= MAGIC_CONSTANT_DEPENDING_ON_N
    return i

i = 1
for nothing in range(4):
    print i,
    i = f(i)
print i

From amk at mira.erols.com Wed Dec 13 04:55:33 2000 From: amk at mira.erols.com (A.M.
Kuchling) Date: Tue, 12 Dec 2000 22:55:33 -0500 Subject: [Python-Dev] Splitting up _cursesmodule Message-ID: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> At 2502 lines, _cursesmodule.c is cumbersomely large. I've just received a patch from Thomas Gellekum that adds support for the panel library that will add another 500 lines. I'd like to split the C file into several subfiles (_curses_panel.c, _curses_window.c, etc.) that get #included from the master _cursesmodule.c file. Do the powers that be approve of this idea? --amk From tim.one at home.com Wed Dec 13 04:54:20 2000 From: tim.one at home.com (Tim Peters) Date: Tue, 12 Dec 2000 22:54:20 -0500 Subject: FW: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE Message-ID: FYI, looks like SourceForge is scheduled to be unusable in a span covering late Friday thru early Saturday (OTT -- One True Time, defined by the clocks in Guido's house). -----Original Message----- From: python-dev-admin at python.org [mailto:python-dev-admin at python.org]On Behalf Of Guido van Rossum Sent: Tuesday, December 12, 2000 3:46 PM To: python-dev at python.org Subject: [Python-Dev] SourceForge: PROJECT DOWNTIME NOTICE ------- Forwarded Message Date: Tue, 12 Dec 2000 12:38:20 -0800 From: noreply at sourceforge.net To: noreply at sourceforge.net Subject: SourceForge: PROJECT DOWNTIME NOTICE ATTENTION SOURCEFORGE PROJECT ADMINISTRATORS This update is being sent to project administrators only and contains important information regarding your project. Please read it in its entirety. INFRASTRUCTURE UPGRADE, EXPANSION AND RELOCATION As noted in the sitewide email sent this week, the SourceForge.net infrastructure is being upgraded (and relocated). As part of this projects, plans are underway to further increase capacity and responsiveness. We are scheduling the relocation of the systems serving project subdomain web pages. IMPORTANT: This move will affect you in the following ways: 1. 
Service and availability of SourceForge.net and the development tools provided will continue uninterupted. 2. Project page webservers hosting subdomains (yourprojectname.sourceforge.net) will be down Friday December 15 from 9PM PST (12AM EST) until 3AM PST. 3. CVS will be unavailable (read only part of the time) from 7PM until 3AM PST 4. Mailing lists and mail aliases will be unavailable until 3AM PST --------------------- This email was sent from sourceforge.net. To change your email receipt preferences, please visit the site and edit your account via the "Account Maintenance" link. Direct any questions to admin at sourceforge.net, or reply to this email. ------- End of Forwarded Message _______________________________________________ Python-Dev mailing list Python-Dev at python.org http://www.python.org/mailman/listinfo/python-dev From esr at thyrsus.com Wed Dec 13 05:29:17 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 12 Dec 2000 23:29:17 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> Message-ID: <20001212232917.A22839@thyrsus.com> A.M. Kuchling : > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > received a patch from Thomas Gellekum that adds support for the panel > library that will add another 500 lines. I'd like to split the C file > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > get #included from the master _cursesmodule.c file. > > Do the powers that be approve of this idea? I doubt I qualify as a power that be, but I'm certainly +1 on panel support. -- Eric S. Raymond The biggest hypocrites on gun control are those who live in upscale developments with armed security guards -- and who want to keep other people from having guns to defend themselves. 
But what about lower-income people living in high-crime, inner city neighborhoods? Should such people be kept unarmed and helpless, so that limousine liberals can 'make a statement' by adding to the thousands of gun laws already on the books?" --Thomas Sowell From fdrake at acm.org Wed Dec 13 07:24:01 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 13 Dec 2000 01:24:01 -0500 (EST) Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> Message-ID: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> A.M. Kuchling writes: > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > received a patch from Thomas Gellekum that adds support for the panel > library that will add another 500 lines. I'd like to split the C file > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > get #included from the master _cursesmodule.c file. Would it be reasonable to add panel support as a second extension module? Is there really a need for them to be in the same module, since the panel library is a separate library? -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From gstein at lyra.org Wed Dec 13 08:58:38 2000 From: gstein at lyra.org (Greg Stein) Date: Tue, 12 Dec 2000 23:58:38 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com>; from amk@mira.erols.com on Tue, Dec 12, 2000 at 10:55:33PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> Message-ID: <20001212235838.T8951@lyra.org> On Tue, Dec 12, 2000 at 10:55:33PM -0500, A.M. Kuchling wrote: > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > received a patch from Thomas Gellekum that adds support for the panel > library that will add another 500 lines. 
I'd like to split the C file > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > get #included from the master _cursesmodule.c file. Why should they be #included? I thought that we can build multiple .c files into a module... Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Wed Dec 13 09:05:05 2000 From: gstein at lyra.org (Greg Stein) Date: Wed, 13 Dec 2000 00:05:05 -0800 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects dictobject.c,2.68,2.69 In-Reply-To: <200012130102.RAA31828@slayer.i.sourceforge.net>; from tim_one@users.sourceforge.net on Tue, Dec 12, 2000 at 05:02:49PM -0800 References: <200012130102.RAA31828@slayer.i.sourceforge.net> Message-ID: <20001213000505.U8951@lyra.org> On Tue, Dec 12, 2000 at 05:02:49PM -0800, Tim Peters wrote: > Update of /cvsroot/python/python/dist/src/Objects > In directory slayer.i.sourceforge.net:/tmp/cvs-serv31776/python/dist/src/objects > > Modified Files: > dictobject.c > Log Message: > Bring comments up to date (e.g., they still said the table had to be > a prime size, which is in fact never true anymore ...). >... > --- 55,78 ---- > > /* > ! There are three kinds of slots in the table: > ! > ! 1. Unused. me_key == me_value == NULL > ! Does not hold an active (key, value) pair now and never did. Unused can > ! transition to Active upon key insertion. This is the only case in which > ! me_key is NULL, and is each slot's initial state. > ! > ! 2. Active. me_key != NULL and me_key != dummy and me_value != NULL > ! Holds an active (key, value) pair. Active can transition to Dummy upon > ! key deletion. This is the only case in which me_value != NULL. > ! > ! 3. Dummy. me_key == dummy && me_value == NULL > ! Previously held an active (key, value) pair, but that was deleted and an > ! active pair has not yet overwritten the slot. Dummy can transition to > ! Active upon key insertion. Dummy slots cannot be made Unused again > ! 
(cannot have me_key set to NULL), else the probe sequence in case of > ! collision would have no way to know they were once active. 4. The popitem finger. :-) Cheers, -g -- Greg Stein, http://www.lyra.org/ From moshez at zadka.site.co.il Wed Dec 13 20:19:53 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Wed, 13 Dec 2000 21:19:53 +0200 (IST) Subject: [Python-Dev] Splitting up _cursesmodule Message-ID: <20001213191953.7208DA82E@darjeeling.zadka.site.co.il> On Tue, 12 Dec 2000 23:29:17 -0500, "Eric S. Raymond" wrote: > A.M. Kuchling : > > At 2502 lines, _cursesmodule.c is cumbersomely large. I've just > > received a patch from Thomas Gellekum that adds support for the panel > > library that will add another 500 lines. I'd like to split the C file > > into several subfiles (_curses_panel.c, _curses_window.c, etc.) that > > get #included from the master _cursesmodule.c file. > > > > Do the powers that be approve of this idea? > > I doubt I qualify as a power that be, but I'm certainly +1 on panel support. I'm +1 on panel support, but that seems the wrong solution. Why not have several C moudles (_curses_panel,...) and manage a more unified namespace with the Python wrapper modules? /curses/panel.py -- from _curses_panel import * etc. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From akuchlin at mems-exchange.org Wed Dec 13 13:44:23 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Wed, 13 Dec 2000 07:44:23 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 01:24:01AM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> Message-ID: <20001213074423.A30348@kronos.cnri.reston.va.us> [CC'ing Thomas Gellekum ] On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr. 
wrote: > Would it be reasonable to add panel support as a second extension >module? Is there really a need for them to be in the same module, >since the panel library is a separate library? Quite possibly, though the patch isn't structured that way. The panel module will need access to the type object for the curses window object, so it'll have to ensure that _curses is already imported, but that's no problem. Thomas, do you feel capable of implementing it as a separate module, or should I work on it? Probably a _cursesmodule.h header will have to be created to make various definitions available to external users of the basic objects in _curses. (Bonus: this means that the menu and form libraries, if they ever get wrapped, can be separate modules, too.) --amk From tg at melaten.rwth-aachen.de Wed Dec 13 15:00:46 2000 From: tg at melaten.rwth-aachen.de (Thomas Gellekum) Date: 13 Dec 2000 15:00:46 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 07:44:23 -0500" References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> Message-ID: Andrew Kuchling writes: > [CC'ing Thomas Gellekum ] > > On Wed, Dec 13, 2000 at 01:24:01AM -0500, Fred L. Drake, Jr. wrote: > > Would it be reasonable to add panel support as a second extension > >module? Is there really a need for them to be in the same module, > >since the panel library is a separate library? > > Quite possibly, though the patch isn't structured that way. The panel > module will need access to the type object for the curses window > object, so it'll have to ensure that _curses is already imported, but > that's no problem. You mean as separate modules like import curses import panel ? Hm. A panel object is associated with a window object, so it's created from a window method. 
This means you'd need to add window.new_panel() to PyCursesWindow_Methods[] and curses.update_panels(), curses.panel_above() and curses.panel_below() (or whatever they're called after we're through discussing this ;-)) to PyCurses_Methods[]. Also, the curses.panel_{above,below}() wrappers need access to the list_of_panels via find_po(). > Thomas, do you feel capable of implementing it as a separate module, > or should I work on it? It's probably finished a lot sooner when you do it; OTOH, it would be fun to try it. Let's carry this discussion a bit further. > Probably a _cursesmodule.h header will have > to be created to make various definitions available to external > users of the basic objects in _curses. That's easy. The problem is that we want to extend those basic objects in _curses. > (Bonus: this means that the > menu and form libraries, if they ever get wrapped, can be separate > modules, too.) Sure, if we solve this for panel, the others are a SMOP. :-) tg From guido at python.org Wed Dec 13 15:31:52 2000 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 09:31:52 -0500 Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src README,1.106,1.107 In-Reply-To: Your message of "Wed, 13 Dec 2000 06:14:35 PST." <200012131414.GAA20849@slayer.i.sourceforge.net> References: <200012131414.GAA20849@slayer.i.sourceforge.net> Message-ID: <200012131431.JAA21243@cj20424-a.reston1.va.home.com> > + --with-cxx=<compiler>: Some C++ compilers require that main() is > + compiled with the C++ compiler if there is any C++ code in the application. > + Specifically, g++ on a.out systems may require that to support > + construction of global objects. With this option, the main() function > + of Python will be compiled with <compiler>; use that only if you > + plan to use C++ extension modules, and if your compiler requires > + compilation of main() as a C++ program. Thanks for documenting this; see my continued reservation in the (reopened) bug report.
Another question remains regarding the docs though: why is it bad to always compile main.c with a C++ compiler? --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Wed Dec 13 16:19:01 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 13 Dec 2000 10:19:01 -0500 (EST) Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> Message-ID: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> Thomas Gellekum writes: > You mean as separate modules like > > import curses > import panel Or better yet: import curses import curses.panel > ? Hm. A panel object is associated with a window object, so it's > created from a window method. This means you'd need to add > window.new_panel() to PyCursesWindow_Methods[] and > curses.update_panels(), curses.panel_above() and curses.panel_below() > (or whatever they're called after we're through discussing this ;-)) > to PyCurses_Methods[]. Do these new functions have to be methods on the window objects, or can they be functions in the new module that take a window as a parameter? The underlying window object can certainly provide slots for the use of the panel (forms, ..., etc.) bindings, and simply initialize them to NULL (or whatever) for newly created windows. > Also, the curses.panel_{above,below}() wrappers need access to the > list_of_panels via find_po(). There's no reason that underlying utilities can't be provided by _curses using a CObject. The Extending & Embedding manual has a section on using CObjects to provide a C API to a module without having to link to it directly. > That's easy. The problem is that we want to extend those basic objects > in _curses. Again, I'm curious about the necessity of this. I suspect it can be avoided. 
I think the approach I've hinted at above will allow you to avoid this, and will allow the panel (forms, ...) support to be added simply by adding additional modules as they are written and the underlying libraries are installed on the host. I know the question of including these modules in the core distribution has come up before, but the resurgence in interest in these makes me want to bring it up again: Does the curses package (and the associated C extension(s)) belong in the standard library, or does it make sense to spin out a distutils-based package? I've no objection to them being in the core, but it seems that the release cycle may want to diverge from Python's. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From guido at python.org Wed Dec 13 16:48:50 2000 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 10:48:50 -0500 Subject: [Python-Dev] Online help PEP In-Reply-To: Your message of "Tue, 12 Dec 2000 10:11:13 PST." <3A366A41.1A14EFD4@ActiveState.com> References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> <3A366A41.1A14EFD4@ActiveState.com> Message-ID: <200012131548.KAA21344@cj20424-a.reston1.va.home.com> [Paul's PEP] > > > help( "string" ) -- built-in topic or global [me] > > Why does a global require string quotes? [Paul] > It doesn't, but if you happen to say > > help( "dir" ) instead of help( dir ), I think it should do the right > thing. Fair enough. > > I'm missing > > > > help() -- table of contents > > > > I'm not sure if the table of contents should be printed by the repr > > output. > > I don't see any benefit in having different behaviors for help and > help(). Having the repr() overloading invoke the pager is dangerous. The beta version of the license command did this, and it caused some strange side effects, e.g. vars(__builtins__) would start reading from input and confuse the users. 
The new version's repr() returns the desired string if it's less than a page, and 'Type license() to see the full license text' if the pager would need to be invoked. > > > If you ask for a global, it can be a fully-qualfied name such as > > > help("xml.dom"). > > > > Why are the string quotes needed? When are they useful? > > When you haven't imported the thing you are asking about. Or when the > string comes from another UI like an editor window, command line or web > form. The implied import is a major liability. If you can do this without importing (e.g. by source code inspection), fine. Otherwise, you might issue some kind of message like "you must first import XXX.YYY". > > > You can also use the facility from a command-line > > > > > > python --help if > > > > Is this really useful? Sounds like Perlism to me. > > I'm just trying to make it easy to quickly get answers to Python > questions. I could totally see someone writing code in VIM switching to > a bash window to type: > > python --help os.path.dirname > > That's alot easier than: > > $ python > Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32 > Type "copyright", "credits" or "license" for more information. > >>> import os > >>> help(os.path.dirname) > > And what does it hurt? The hurt is code bloat in the interpreter and creeping featurism. If you need command line access to the docs (which may be a reasonable thing to ask for, although to me it sounds backwards :-), it's better to provide a separate command, e.g. pythondoc. (Analog to perldoc.) > > > In either situation, the output does paging similar to the "more" > > > command. > > > > Agreed. But how to implement paging in a platform-dependent manner? > > On Unix, os.system("more") or "$PAGER" is likely to work. On Windows, > > I suppose we could use its MORE, although that's pretty braindead. On > > the Mac? Also, inside IDLE or Pythonwin, invoking the system pager > > isn't a good idea. 
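An internal pager with an overridable display hook, of the kind being discussed, might be sketched like this (paginate and displayhelp are placeholder names, not the PEP's actual API):

```python
def paginate(text, pagesize=23):
    """Split text into screen-sized pages."""
    lines = text.split('\n')
    return ['\n'.join(lines[i:i + pagesize])
            for i in range(0, len(lines), pagesize)]

def _internal_pager(text):
    """Default display hook: print one page, wait, print the next."""
    pages = paginate(text)
    for n, page in enumerate(pages):
        print(page)
        if n + 1 < len(pages):
            input('-- more --')

# A GUI such as IDLE or Pythonwin would rebind this hook to route
# help text into its own window instead of the terminal.
displayhelp = _internal_pager
```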
> > The current implementation does paging internally. You could override it > to use the system pager (or no pager). Yes. Please add that option to the PEP. > > What does "demand-loaded" mean in a Python context? > > When you "touch" the help object, it loads the onlinehelp module which > has the real implementation. The thing in __builtins__ is just a > lightweight proxy. Please suggest an implementation. > > > It should also be possible to override the help display function by > > > assigning to onlinehelp.displayhelp(object_or_string). > > > > Good idea. Pythonwin and IDLE could use this. But I'd like it to > > work at least "okay" if they don't. > > Agreed. Glad you're so agreeable. :) > > > The module should be able to extract module information from either > > > the HTML or LaTeX versions of the Python documentation. Links should > > > be accommodated in a "lynx-like" manner. > > > > I think this is beyond the scope. > > Well, we have to do one of: > > * re-write a subset of the docs in a form that can be accessed from the > command line > * access the existing docs in a form that's installed > * auto-convert the docs into a form that's compatible I really don't think that this tool should attempt to do everything. If someone *really* wants to browse the existing (large) doc set in a terminal emulation window, let them use lynx and point it to the documentation set. (I agree that the HTML docs should be installed, by the way.) > I've already implemented HTML parsing and LaTeX parsing is actually not > that far off. I just need impetus to finish a LaTeX-parsing project I > started on my last vacation. A LaTeX parser would be most welcome -- if it could replace latex2html! That old Perl program is really ready for retirement. (Ask Fred.) > The reason that LaTeX is interesting is because it would be nice to be > able to move documentation from existing LaTeX files into docstrings. That's what some people think. I disagree that it would be either feasible or a good idea to put all documentation for a typical module in its doc strings. > > The LaTeX isn't installed anywhere > > (and processing would be too much work). > > The HTML is installed only > > on Windows, where there already is a way to get it to pop up in your > > browser (actually two: it's in the Start menu, and also in IDLE's Help > > menu). > > If the documentation becomes an integral part of the Python code, then > it will be installed. It's ridiculous that it isn't already. Why is that ridiculous? It's just as easy to access them through the web for most people. If it's not, they are available in easily downloadable tarballs supporting a variety of formats. That's just too much to be included in the standard RPMs. (Also, latex2html requires so much hand-holding, and is so slow, that it's really not a good idea to let "make install" install the HTML by default.) > ActivePython does install the docs on all platforms. Great. More power to you. > > A standard syntax for docstrings is under development, PEP 216. I > don't agree with the proposal there, but in any case the help PEP > should not attempt to legalize a different format than PEP 216. > > I won't hold my breath for a standard Python docstring format. I've gone > out of my way to make the code format independent.. To tell you the truth, I'm not holding my breath either. :-) So your code should just dump the doc string on stdout without interpreting it in any way (except for paging). > > Neat. I noticed that in a 24-line screen, the pagesize must be set to > 21 to avoid stuff scrolling off the screen. Maybe there's an off-by-3 > error somewhere? > > Yes. It's buggier than just that. The output of the pager prints an extra "| " at the start of each page except for the first, and the first page is a line longer than subsequent pages. BTW, another bug: try help(cgi).
It's nice that it gives the default value for arguments, but the defaults for FieldStorage.__init__ happen to include os.environ. Its entire value is dumped -- which causes the pager to be off (it wraps over about 20 lines for me). I think you may have to truncate long values a bit, e.g. by using the repr module. > > I also noticed that it always prints '1' when invoked as a function. > The new license pager in site.py avoids this problem. > Okay. Where's the check-in? :-) > > help("operators") and several others raise an > AttributeError('handledocrl'). > Fixed. > > The "lynx-line links" don't work. > I don't think that's implemented yet. I'm not sure what you intended to implement there. I prefer to see the raw URLs, then I can do whatever I normally do to paste them into my preferred webbrowser (which is *not* lynx :-). > > I think it's naive to expect this help facility to replace browsing > the website or the full documentation package. There should be one > entry that says to point your browser there (giving the local > filesystem URL if available), and that's it. The rest of the online > help facility should be concerned with exposing doc strings. > > I don't want to replace the documentation. But there is no reason we > should set out to make it incomplete. If its integrated with the HTML > then people can choose whatever access mechanism is easiest for them > right now > > I'm trying hard not to be "naive". Realistically, nobody is going to > write a million docstrings between now and Python 2.1. It is much more > feasible to leverage the existing documentation that Fred and others > have spent months on. I said above, and I'll say it again: I think the majority of people would prefer to use their standard web browser to read the standard docs. It's not worth the effort to try to make those accessible through help().
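The truncation Guido suggests is exactly what the repr module (named reprlib in today's stdlib) provides; a minimal sketch:

```python
import reprlib  # the module was simply called `repr` in the Python 2.0 era

shortener = reprlib.Repr()
shortener.maxstring = 40  # clip long strings
shortener.maxdict = 4     # show at most four dict items

def format_default(value):
    """Render an argument default compactly for help output."""
    return shortener.repr(value)
```

Applied to something like os.environ, this yields a one-line summary ending in "..." instead of twenty wrapped lines of environment variables.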
In fact, I'd encourage the development of a command-line-invoked help facility that shows doc strings in the user's preferred web browser -- the webbrowser module makes this trivial. > > > Security Issues > > > > > > This module will attempt to import modules with the same names as > > > requested topics. Don't use the modules if you are not confident > > > that everything in your pythonpath is from a trusted source. > > Yikes! Another reason to avoid the "string" -> global variable > > option. > > I don't think we should lose that option. People will want to look up > information from non-executable environments like command lines, GUIs > and web pages. Perhaps you can point me to techniques for extracting > information from Python modules and packages without executing them. I don't know specific tools, but any serious docstring processing tool ends up parsing the source code for this very reason, so there's probably plenty of prior art. --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Wed Dec 13 17:07:22 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 13 Dec 2000 11:07:22 -0500 (EST) Subject: [Python-Dev] Online help PEP In-Reply-To: <200012131548.KAA21344@cj20424-a.reston1.va.home.com> References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> <3A366A41.1A14EFD4@ActiveState.com> <200012131548.KAA21344@cj20424-a.reston1.va.home.com> Message-ID: <14903.40634.569192.704368@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > A LaTeX parser would be most welcome -- if it could replace > latex2html! That old Perl program is really ready for retirement. > (Ask Fred.) Note that Doc/tools/sgmlconv/latex2esis.py already includes a moderate start at a LaTeX parser. Paragraph marking is done as a separate step in Doc/tools/sgmlconv/docfixer.py, but I'd like to push that down into the LaTeX handler.
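For the record, extracting a docstring without executing the module is straightforward by parsing the source. In today's stdlib the ast module does this directly (in 2000 the parser module played that role); a sketch:

```python
import ast

def module_docstring(path):
    """Return a module's docstring by parsing its source,
    without importing (and therefore without executing) it."""
    with open(path) as f:
        tree = ast.parse(f.read(), filename=path)
    return ast.get_docstring(tree)
```

Because the file is only parsed, any import-time side effects -- the security issue Paul's PEP warns about -- never fire.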
(Note that these tools are mostly broken at the moment, except for latex2esis.py, which does most of what I need other than paragraph marking.) -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From Barrett at stsci.edu Wed Dec 13 17:34:40 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Wed, 13 Dec 2000 11:34:40 -0500 (EST) Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) In-Reply-To: <20001210054646.A5219@glacier.fnational.com> References: <20001210054646.A5219@glacier.fnational.com> Message-ID: <14903.41669.883591.420446@nem-srvr.stsci.edu> Neil Schemenauer writes: > Sourceforge unloads are not working. The lastest version of the > patch for PEP 208 is here: > > http://arctrix.com/nas/python/coerce-6.0.diff > > Operations on instances now call __coerce__ if it exists. I > think the patch is now complete. Converting other builtin types > to "new style numbers" can be done with a separate patch. My one concern about this patch is whether the non-commutativity of operators is preserved. This issue being important for matrix operations (not to be confused with element-wise array operations). -- Paul From guido at python.org Wed Dec 13 17:45:12 2000 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 11:45:12 -0500 Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) In-Reply-To: Your message of "Wed, 13 Dec 2000 11:34:40 EST." <14903.41669.883591.420446@nem-srvr.stsci.edu> References: <20001210054646.A5219@glacier.fnational.com> <14903.41669.883591.420446@nem-srvr.stsci.edu> Message-ID: <200012131645.LAA21719@cj20424-a.reston1.va.home.com> > Neil Schemenauer writes: > > Sourceforge unloads are not working. The lastest version of the > > patch for PEP 208 is here: > > > > http://arctrix.com/nas/python/coerce-6.0.diff > > > > Operations on instances now call __coerce__ if it exists. I > > think the patch is now complete. 
Converting other builtin types > > to "new style numbers" can be done with a separate patch. > > My one concern about this patch is whether the non-commutativity of > operators is preserved. This issue being important for matrix > operations (not to be confused with element-wise array operations). Yes, this is preserved. (I'm spending most of my waking hours understanding this patch -- it is a true piece of wizardry.) --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Wed Dec 13 18:38:00 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Wed, 13 Dec 2000 18:38:00 +0100 Subject: [Python-Dev] Reference implementation for PEP 208 (coercion) References: <20001210054646.A5219@glacier.fnational.com> <14903.41669.883591.420446@nem-srvr.stsci.edu> <200012131645.LAA21719@cj20424-a.reston1.va.home.com> Message-ID: <3A37B3F7.5640FAFC@lemburg.com> Guido van Rossum wrote: > > > Neil Schemenauer writes: > > > Sourceforge unloads are not working. The lastest version of the > > > patch for PEP 208 is here: > > > > > > http://arctrix.com/nas/python/coerce-6.0.diff > > > > > > Operations on instances now call __coerce__ if it exists. I > > > think the patch is now complete. Converting other builtin types > > > to "new style numbers" can be done with a separate patch. > > > > My one concern about this patch is whether the non-commutativity of > > operators is preserved. This issue being important for matrix > > operations (not to be confused with element-wise array operations). > > Yes, this is preserved. (I'm spending most of my waking hours > understanding this patch -- it is a true piece of wizardry.) The fact that coercion didn't allow detection of parameter order was the initial cause for my try at fixing it back then. I was confronted with the fact that at C level there was no way to tell whether the operands were in the order left, right or right, left -- as a result I used a gross hack in mxDateTime to still make this work... 
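The operand-order information Marc-Andre needed can be expressed at the Python level with reflected methods; here is a toy non-commutative type (illustrative only, unrelated to the actual mxDateTime code):

```python
class Mat:
    """Toy 'matrix' whose subtraction must know the operand order."""
    def __init__(self, v):
        self.v = v

    def __sub__(self, other):
        # self is the left operand.
        return ('left', self.v, other)

    def __rsub__(self, other):
        # Called when the left operand can't handle the subtraction,
        # so right-hand dispatch still sees the true operand order.
        return ('right', other, self.v)
```

Both Mat(3) - 2 and 2 - Mat(3) dispatch into Mat, but through different methods, so the order of the operands is never lost.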
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From esr at thyrsus.com Wed Dec 13 22:01:46 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 13 Dec 2000 16:01:46 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> Message-ID: <20001213160146.A24753@thyrsus.com> Fred L. Drake, Jr. : > I know the question of including these modules in the core > distribution has come up before, but the resurgence in interest in > these makes me want to bring it up again: Does the curses package > (and the associated C extension(s)) belong in the standard library, or > does it make sense to spin out a distutils-based package? I've no > objection to them being in the core, but it seems that the release > cycle may want to diverge from Python's. Curses needs to be in the core for political reasons. Specifically, to support CML2 without requiring any extra packages or downloads beyond the stock Python interpreter. And what makes CML2 so constrained and so important? It's my bid to replace the Linux kernel's configuration machinery. It has many advantages over the existing config system, but the linux developers are *very* resistant to adding things to the kernel's minimum build kit. Python alone may prove too much for them to swallow (though there are hopeful signs they will); Python plus a separately downloadable curses module would definitely be too much. 
Guido attaches sufficient importance to getting Python into the kernel build machinery that he approved adding ncurses to the standard modules on that basis. This would be a huge design win for us, raising Python's visibility considerably. So curses must stay in the core. I don't have a requirement for panels; my present curses front end simulates them. But if panels were integrated into the core I could simplify the front-end code significantly. Every line I can remove from my stuff (even if it, in effect, is just migrating into the Python core) makes it easier to sell CML2 into the kernel. -- Eric S. Raymond "Experience should teach us to be most on our guard to protect liberty when the government's purposes are beneficent... The greatest dangers to liberty lurk in insidious encroachment by men of zeal, well meaning but without understanding." -- Supreme Court Justice Louis Brandeis From jheintz at isogen.com Wed Dec 13 22:10:32 2000 From: jheintz at isogen.com (John D. Heintz) Date: Wed, 13 Dec 2000 15:10:32 -0600 Subject: [Python-Dev] Announcing ZODB-Corba code release Message-ID: <3A37E5C8.7000800@isogen.com> Here is the first release of code that exposes a ZODB database through CORBA (omniORB). The code is functioning, the docs are sparse, and it should work on your machines. ;-) I am only going to be in town for the next two days, then I will be unavailable until Jan 1. See http://www.zope.org/Members/jheintz/ZODB_CORBA_Connection to download the code. It's not perfect, but it works for me. Enjoy, John -- . . . . . . . . . . . . . . . . . . . . . . . . John D. Heintz | Senior Engineer 1016 La Posada Dr. | Suite 240 | Austin TX 78752 T 512.633.1198 | jheintz at isogen.com w w w . d a t a c h a n n e l . c o m From guido at python.org Wed Dec 13 22:19:01 2000 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 16:19:01 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Your message of "Wed, 13 Dec 2000 16:01:46 EST."
<20001213160146.A24753@thyrsus.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> Message-ID: <200012132119.QAA11060@cj20424-a.reston1.va.home.com> > So curses must stay in the core. I don't have a requirement for > panels; my present curses front end simulates them. But if panels were > integrated into the core I could simplify the front-end code > significantly. Every line I can remove from my stuff (even if it, in > effect, is just migrating into the Python core) makes it easier to > sell CML2 into the kernel. On the other hand you may want to be conservative. You already have to require Python 2.0 (I presume). The panel stuff will be available in 2.1 at the earliest. You probably shouldn't throw out your panel emulation until your code has already been accepted... --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at loewis.home.cs.tu-berlin.de Wed Dec 13 22:56:27 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 13 Dec 2000 22:56:27 +0100 Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107 Message-ID: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de> > Another question remains regarding the docs though: why is it bad to > always compile main.c with a C++ compiler? For the whole thing to work, it may also be necessary to link the entire application with a C++ compiler; that in turn may bind to the C++ library. Linking with the system's C++ library means that the Python executable cannot be as easily exchanged between installations of the operating system - you'd also need to have the right version of the C++ library to run it. If the C++ library is static, that may also increase the size of the executable. 
I can't really point to a specific problem that would occur on a specific system I use if main() was compiled with a C++ compiler. However, on the systems I use (Windows, Solaris, Linux), you can build C++ extension modules even if Python was not compiled as a C++ application. On Solaris and Windows, you'd also have to choose the C++ compiler you want to use (MSVC++, SunPro CC, or g++); in turn, different C++ runtime systems would be linked into the application. Regards, Martin From esr at thyrsus.com Wed Dec 13 23:03:59 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 13 Dec 2000 17:03:59 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132119.QAA11060@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 04:19:01PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> Message-ID: <20001213170359.A24915@thyrsus.com> Guido van Rossum : > > So curses must stay in the core. I don't have a requirement for > > panels; my present curses front end simulates them. But if panels were > > integrated into the core I could simplify the front-end code > > significantly. Every line I can remove from my stuff (even if it, in > > effect, is just migrating into the Python core) makes it easier to > > sell CML2 into the kernel. > > On the other hand you may want to be conservative. You already have > to require Python 2.0 (I presume). The panel stuff will be available > in 2.1 at the earliest. You probably shouldn't throw out your panel > emulation until your code has already been accepted... Yes, that's how I am currently expecting it to play out -- but if the 2.4.0 kernel is delayed another six months, I'd change my mind.
I'll explain this, because python-dev people should grok what the surrounding politics and timing are. I actually debated staying with 1.5.2 as a base version. What changed my mind was two things. One: by going to 2.0 I could drop close to 600 lines and three entire support modules from CML2, slimming down its footprint in the kernel tree significantly (by more than 10% of the entire code volume, actually). Second: CML2 is not going to be seriously evaluated until 2.4.0 final is out. Linus made this clear when I demoed it for him at LWE. My best guess about when that will happen is late January into February. By the time Red Hat issues its next distro after that (probably May or thenabouts), it's a safe bet 2.0 will be on it, and everywhere else. But if the 2.4.0 kernel slips another six months yet again, and our 2.1 comes out relatively quickly (like, just before the 9th Python Conference :-)), then we *might* have time to get 2.1 into the distros before CML2 gets the imprimatur. So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel will be delayed yet again :-). -- Eric S. Raymond Ideology, politics and journalism, which luxuriate in failure, are impotent in the face of hope and joy. -- P. J. O'Rourke From nas at arctrix.com Wed Dec 13 16:37:45 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 07:37:45 -0800 Subject: [Python-Dev] CVS: python/dist/src README,1.106,1.107 In-Reply-To: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Wed, Dec 13, 2000 at 10:56:27PM +0100 References: <200012132156.WAA01345@loewis.home.cs.tu-berlin.de> Message-ID: <20001213073745.C17148@glacier.fnational.com> These are issues to consider for Python 3000 as well. AFAIK, C++ ABIs are a nightmare. Neil From fdrake at acm.org Wed Dec 13 23:29:25 2000 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 13 Dec 2000 17:29:25 -0500 (EST) Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213170359.A24915@thyrsus.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> Message-ID: <14903.63557.282592.796169@cj42289-a.reston1.va.home.com> Eric S. Raymond writes: > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel > will be delayed yet again :-). Politics aside, I think development of curses-related extensions like panels and forms doesn't need to be delayed. I've posted what I think are relevant technical comments already, and leave it up to the developers of any new modules to get them written -- I don't know enough curses to offer any help there. Regardless of how the curses package is distributed and deployed, I don't see any reason to delay development in its existing location in the Python CVS repository. -Fred -- Fred L. Drake, Jr.
PythonLabs at Digital Creations From nas at arctrix.com Wed Dec 13 16:41:54 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 07:41:54 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> Message-ID: <20001213074154.D17148@glacier.fnational.com> On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote: > CML2 is not going to be seriously evaluated until 2.4.0 final > is out. Linus made this clear when I demoed it for him at LWE. > My best guess about when that will happen is late January into > Februrary. By the time Red Hat issues its next distro after > that (probably May or thenabouts) it's a safe bet 2.0 will be > on it, and everywhere else. I don't think that is a very safe bet. Python 2.0 missed the Debian Potato boat. I have no idea when Woody is expected to be released but I expect it may take longer than that if history is any indication. Neil From guido at python.org Thu Dec 14 00:03:31 2000 From: guido at python.org (Guido van Rossum) Date: Wed, 13 Dec 2000 18:03:31 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Your message of "Wed, 13 Dec 2000 07:41:54 PST." 
<20001213074154.D17148@glacier.fnational.com> References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> Message-ID: <200012132303.SAA12434@cj20424-a.reston1.va.home.com> > I don't think that is a very safe bet. Python 2.0 missed the > Debian Potato boat. This may have had to do more with the unresolved GPL issues. I recently received a mail from Stallman indicating that an agreement with CNRI has been reached; they have agreed (in principle, at least) to specific changes to the CNRI license that will defuse the choice-of-law clause when it is combined with GPL-licensed code "in a non-separable way". A glitch here is that the BeOpen license probably has to be changed too, but I believe that that's all doable. > I have no idea when Woody is expected to be > released but I expect it may take longer than that if history is > any indication. And who or what is Woody? 
Feeling-left-out, --Guido van Rossum (home page: http://www.python.org/~guido/) From gstein at lyra.org Thu Dec 14 00:16:09 2000 From: gstein at lyra.org (Greg Stein) Date: Wed, 13 Dec 2000 15:16:09 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> Message-ID: <20001213151609.E8951@lyra.org> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: >... > > I have no idea when Woody is expected to be > > released but I expect it may take longer than that if history is > > any indication. > > And who or what is Woody? One of the Debian releases. Dunno if it is the "next" release, but there ya go. 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From gstein at lyra.org Thu Dec 14 00:18:34 2000 From: gstein at lyra.org (Greg Stein) Date: Wed, 13 Dec 2000 15:18:34 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213170359.A24915@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 13, 2000 at 05:03:59PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> Message-ID: <20001213151834.F8951@lyra.org> On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote: >... > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel > will be delayed yet again :-). The kernel is not going to be delayed that much. Linus wants it to go out this month. Worst case, I could see January. But no way on six months. But as Fred said: that should not change panels going into the curses support at all. You can always have a "compat.py" module in CML2 that provides functionality for prior-to-2.1 releases of Python. I'd also be up for a separate _curses_panels module, loaded into the curses package. Cheers, -g -- Greg Stein, http://www.lyra.org/ From esr at thyrsus.com Thu Dec 14 00:33:02 2000 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 13 Dec 2000 18:33:02 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001213151834.F8951@lyra.org>; from gstein@lyra.org on Wed, Dec 13, 2000 at 03:18:34PM -0800 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213151834.F8951@lyra.org> Message-ID: <20001213183302.A25160@thyrsus.com> Greg Stein : > On Wed, Dec 13, 2000 at 05:03:59PM -0500, Eric S. Raymond wrote: > >... > > So, gentlemen, vote for panels to go in if you think the 2.4.0 kernel > > will be delayed yet again :-). > > The kernel is not going to be delayed that much. Linus wants it to go out > this month. Worst case, I could see January. But no way on six months. I know what Linus wants. That's why I'm estimating end of January or early February -- the man's error curve on these estimates has a certain, er, *consistency* about it. -- Eric S. Raymond Alcohol still kills more people every year than all `illegal' drugs put together, and Prohibition only made it worse. Oppose the War On Some Drugs!
From nas at arctrix.com Wed Dec 13 18:18:48 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 09:18:48 -0800 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> Message-ID: <20001213091848.A17326@glacier.fnational.com> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: > > I don't think that is a very safe bet. Python 2.0 missed the > > Debian Potato boat. > > This may have had to do more with the unresolved GPL issues. I can't remember the exact dates but I think Debian Potato was frozen before Python 2.0 was released. Once a Debian release is frozen packages are not upgraded except under unusual circumstances. > I recently received a mail from Stallman indicating that an > agreement with CNRI has been reached; they have agreed (in > principle, at least) to specific changes to the CNRI license > that will defuse the choice-of-law clause when it is combined > with GPL-licensed code "in a non-separable way". A glitch here > is that the BeOpen license probably has to be changed too, but > I believe that that's all doable. This is great news. > > I have no idea when Woody is expected to be > > released but I expect it may take longer than that if history is > > any indication. > > And who or what is Woody? Woody would be another character from the Pixar movie "Toy Story" (just like Rex, Bo, Potato, Slink, and Hamm). 
I believe Bruce Perens used to work at Pixar. Debian uses a code name for the development release until a release number is assigned. This avoids some problems but has the disadvantage of confusing people who are not familiar with Debian. I should have said "the next stable release of Debian". Neil (aka nas at debian.org) From akuchlin at mems-exchange.org Thu Dec 14 01:26:32 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Wed, 13 Dec 2000 19:26:32 -0500 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <14903.37733.339921.131872@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Wed, Dec 13, 2000 at 10:19:01AM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> Message-ID: <20001213192632.A30585@kronos.cnri.reston.va.us> On Wed, Dec 13, 2000 at 10:19:01AM -0500, Fred L. Drake, Jr. wrote: > Do these new functions have to be methods on the window objects, or >can they be functions in the new module that take a window as a >parameter? The underlying window object can certainly provide slots Panels and windows have a 1-1 association, but they're separate objects. The window.new_panel function could become just a method which takes a window as its first argument; it would only need the TypeObject for PyCursesWindow, in order to do typechecking. > > Also, the curses.panel_{above,below}() wrappers need access to the > > list_of_panels via find_po(). The list_of_panels is used only in the curses.panel module, so it could be private to that module, since only panel-related functions care about it. I'm ambivalent about the list_of_panels. It's a linked list storing (PyWindow, PyPanel) pairs. Probably it should use a dictionary instead of implementing a little list, just to reduce the amount of code.
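The dictionary version of that bookkeeping is tiny. A hypothetical pure-Python sketch of the window-to-panel association (the real list_of_panels and find_po live in C; the names and classes here are stand-ins kept only for illustration, keying on window identity since the C code compares window pointers):

```python
# Hypothetical sketch: replace a hand-rolled linked list of
# (PyWindow, PyPanel) pairs with a dict keyed by window identity.
_panels = {}


class Window:
    """Stand-in for a curses window object."""


class Panel:
    """Stand-in for a panel; registers itself against its window."""

    def __init__(self, window):
        self.window = window
        _panels[id(window)] = self  # replaces the linked-list insertion


def find_po(window):
    # A dict lookup replaces the linear scan over the linked list.
    return _panels.get(id(window))


w = Window()
p = Panel(w)
print(find_po(w) is p)  # True
```

The mapping stays private to the panel module, as suggested above; nothing outside it needs to see the registry.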
>does it make sense to spin out a distutils-based package? I've no >objection to them being in the core, but it seems that the release >cycle may want to diverge from Python's. Consensus seemed to be to leave it in; I'd have no objection to removing it, but either course is fine with me. So, I suggest we create _curses_panel.c, which would be available as curses.panel. (A panel.py module could then add any convenience functions that are required.) Thomas, do you want to work on this, or should I? --amk From nas at arctrix.com Wed Dec 13 18:43:06 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 09:43:06 -0800 Subject: [Python-Dev] OT: Debian and Python In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100 References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> Message-ID: <20001213094306.C17326@glacier.fnational.com> On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote: > Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0. > Someone created a separate set of 2.0-packages, but they didn't include > readline and gdbm support because of the licencing issues. (Posted on c.l.py > sometime this week.) I've had Python packages for Debian stable for a while. I guess I should have posted a link: http://arctrix.com/nas/python/debian/ Most useful modules are enabled. > I'm *almost* tempted enough to learn enough about > dpkg/.deb files to build my own licence-be-damned set It's quite easy. Debian source packages are basically a diff.
Applying the diff will create a "debian" directory and in that directory will be a makefile called "rules". Use the target "binary" to create new binary packages. Good things to know are that you must be in the source directory when you run the makefile (i.e. ./debian/rules binary). You should be running a shell under fakeroot to get the install permissions right (running "fakeroot" will do). You need to have the Debian developer tools installed. There is a list somewhere on debian.org. "apt-get source <package>" will get, extract and patch a package ready for tweaking and building (handy for getting stuff from unstable to run on stable). This is too off topic for python-dev. If anyone needs more info they can email me directly. Neil From thomas at xs4all.net Thu Dec 14 01:05:34 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 14 Dec 2000 01:05:34 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <200012132303.SAA12434@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Dec 13, 2000 at 06:03:31PM -0500 References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> Message-ID: <20001214010534.M4396@xs4all.nl> On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: > > I don't think that is a very safe bet. Python 2.0 missed the Debian > > Potato boat. > > This may have had to do more with the unresolved GPL issues. This is very likely. Debian is very licence -- or at least GPL -- aware.
Which is a pity, really, because I already prefer it over RedHat in all other cases (and RedHat is also pretty licence aware, just less piously, devoutly, beyond-practicality-IMHO dedicated to the GPL.) > > I have no idea when Woody is expected to be released but I expect it may > > take longer than that if history is any indication. BTW, I believe Debian uses a fairly steady release schedule, something like an unstable->stable switch every year or 6 months or so ? I seem to recall seeing something like that on the debian website, but can't check right now. > And who or what is Woody? Woody is Debian's current development branch, the current bearer of the alias 'unstable'. It'll become Debian 2.3 (I believe, I don't pay attention to version numbers, I just run unstable :) once it's stabilized. 'potato' is the previous development branch, and currently the 'stable' branch. You can compare them with 'rawhide' and 'redhat-7.0', respectively :) (With the enormous difference that you can upgrade your debian install to a new version (even the devel version, or update your machine to the latest devel snapshot) while you are using it, without having to reboot ;) Note to the debian-pythoneers: woody still carries Python 1.5.2, not 2.0. Someone created a separate set of 2.0-packages, but they didn't include readline and gdbm support because of the licencing issues. (Posted on c.l.py sometime this week.) I'm *almost* tempted enough to learn enough about dpkg/.deb files to build my own licence-be-damned set, but it'd be a lot of work to mirror the current debian 1.5.2 set of packages (which include numeric, imaging, mxTools, GTK/GNOME, and a shitload of 3rd party modules) in 2.0. Ponder, maybe it could be done semi-automatically, from the src-deb's of those packages. By the way, in woody, there are 52 packages with 'python' in the name, and 32 with 'perl' in the name... 
Pity all of my perl-hugging hippy-friends are still blindly using RedHat, and refuse to listen to my calls from the Debian/Python-dark-side :-) Oh, and the names 'woody' and 'potato' came from the movie Toy Story, in case you wondered ;) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From esr at snark.thyrsus.com Thu Dec 14 01:46:37 2000 From: esr at snark.thyrsus.com (Eric S. Raymond) Date: Wed, 13 Dec 2000 19:46:37 -0500 Subject: [Python-Dev] Business related to the upcoming Python conference Message-ID: <200012140046.TAA25289@snark.thyrsus.com> I'm sending this to python-dev because I believe most or all of the reviewers for my PC9 paper are on this list. Paul, would you please forward to any who were not? First, my humble apologies for not having got my PC9 reviews in on time. I diligently read my assigned papers early, but I couldn't do the reviews early because of technical problems with my Foretec account -- and then I couldn't do them late because the pre-deadline crunch happened while I was on a ten-day speaking and business trip in Japan and California, with mostly poor or nonexistent Internet access. Matters were not helped by a nasty four-month-old problem in my personal life coming to a head right in the middle of the trip. Nor by the fact that the trip included the VA Linux Systems annual stockholders' meeting and the toughest Board of Directors' meeting in my tenure. We had to hammer out a strategic theory of what to do now that the dot-com companies who used to be our best customers aren't getting funded any more. Unfortunately, it's at times like this that Board members earn their stock options. Management oversight. Fiduciary responsibility. Mumble... Second, the feedback I received on the paper was *excellent*, and I will be making many of the recommended changes. I've already extended the discussion of "Why Python?" including addressing the weaknesses of Scheme and Prolog for this application.
I have said more about uses of CML2 beyond the Linux kernel. I am working on a discussion of the politics of CML2 adoption, but may save that for the stand-up talk rather than the written paper. I will try to trim the CML2 language reference for the final version. (The reviewer who complained about the lack of references on the SAT problem should be pleased to hear that URLs to relevant papers are in fact included in the masters. I hope they show in the final version as rendered for publication.) -- Eric S. Raymond The Constitution is not neutral. It was designed to take the government off the backs of the people. -- Justice William O. Douglas From moshez at zadka.site.co.il Thu Dec 14 13:22:24 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Thu, 14 Dec 2000 14:22:24 +0200 (IST) Subject: [Python-Dev] Splitting up _cursesmodule Message-ID: <20001214122224.739EEA82E@darjeeling.zadka.site.co.il> On Wed, 13 Dec 2000 07:41:54 -0800, Neil Schemenauer wrote: > I don't think that is a very safe bet. Python 2.0 missed the > Debian Potato boat. By a long time -- potato was frozen for a few months when 2.0 came out. > I have no idea when Woody is expected to be > released but I expect it may take longer than that if history is > any indication. My bet is that woody starts freezing as soon as 2.4.0 is out. Note that once it starts freezing, 2.1 doesn't have a shot of getting in, regardless of how long it takes to freeze. OTOH, since in woody time there's a good chance for the "testing" distribution, a lot more people would be running something that *can* and *will* upgrade to 2.1 almost as soon as it is out. (For the record, most of the Debian users I know run woody on their server) -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses!
From jeremy at alum.mit.edu Thu Dec 14 06:04:43 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 14 Dec 2000 00:04:43 -0500 (EST) Subject: [Python-Dev] new draft of PEP 227 Message-ID: <14904.21739.804346.650062@bitdiddle.concentric.net> I've got a new draft of PEP 227. The terminology and wording are more convoluted than they need to be. I'll do at least one revision just to say things more clearly, but I'd appreciate comments on the proposed spec if you can read the current draft. Jeremy From cgw at fnal.gov Thu Dec 14 07:03:01 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Thu, 14 Dec 2000 00:03:01 -0600 (CST) Subject: [Python-Dev] Memory leaks in tupleobject.c Message-ID: <14904.25237.654143.861733@buffalo.fnal.gov> I've been running a set of memory-leak tests against the latest Python and have found that running "test_extcall" leaks memory. This gave me a strange sense of deja vu, having fixed this once before... From nas at arctrix.com Thu Dec 14 00:43:43 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Wed, 13 Dec 2000 15:43:43 -0800 Subject: [Python-Dev] Memory leaks in tupleobject.c In-Reply-To: <14904.25237.654143.861733@buffalo.fnal.gov>; from cgw@fnal.gov on Thu, Dec 14, 2000 at 12:03:01AM -0600 References: <14904.25237.654143.861733@buffalo.fnal.gov> Message-ID: <20001213154343.A18303@glacier.fnational.com> On Thu, Dec 14, 2000 at 12:03:01AM -0600, Charles G Waldman wrote: > date: 2000/10/05 19:36:49; author: nascheme; state: Exp; lines: +24 -86 > Simplify _PyTuple_Resize by not using the tuple free list and dropping > support for the last_is_sticky flag. A few hard to find bugs may be > fixed by this patch since the old code was buggy. > > The 2.47 patch seems to have re-introduced the memory leak which was > fixed in 2.31. Maybe the old code was buggy, but the "right thing" > would have been to fix it, not to throw it away.... if _PyTuple_Resize > simply ignores the tuple free list, memory will be leaked. Guilty as charged. 
Can you explain how the current code is leaking memory? I can see one problem with deallocating size=0 tuples. Are there any more leaks? Neil From cgw at fnal.gov Thu Dec 14 07:57:05 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Thu, 14 Dec 2000 00:57:05 -0600 (CST) Subject: [Python-Dev] Memory leaks in tupleobject.c In-Reply-To: <20001213154343.A18303@glacier.fnational.com> References: <14904.25237.654143.861733@buffalo.fnal.gov> <20001213154343.A18303@glacier.fnational.com> Message-ID: <14904.28481.292539.354303@buffalo.fnal.gov> Neil Schemenauer writes: > Guilty as charged. Can you explain how the current code is > leaking memory? I can see one problem with deallocating size=0 > tuples. Are there any more leaks? Actually, I think I may have spoken too hastily - it's late and I'm tired and I should be sleeping rather than staring at the screen (like I've been doing since 8:30 this morning) - I jumped to conclusions - I'm not really sure that it was your patch that caused the leak; all I can say with 100% certainty is that if you run "test_extcall" in a loop, memory usage goes through the ceiling.... It's not just the cyclic garbage caused by the "saboteur" function because even with this commented out, the memory leak persists. I'm actually trying to track down a different memory leak, something which is currently causing trouble in one of our production servers (more about this some other time) and just as a sanity check I ran my little "leaktest.py" script over all the test_*.py modules in the distribution, and found that test_extcall triggers leaks... having analyzed and fixed this once before (see the CVS logs for tupleobject.c), I jumped to conclusions about the reason for its return. I'll take a more clear-headed and careful look tomorrow and post something (hopefully) a little more conclusive. It may have been some other change that caused this memory leak to re-appear. 
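A leaktest.py-style check of the kind Charles describes can be sketched in a few lines (hypothetical names; written for modern Python, where the old reload() builtin lives in importlib, and using ru_maxrss, whose units are platform-specific, kilobytes on Linux):

```python
import importlib
import resource

def rss_kb():
    # Peak resident set size of this process so far (kilobytes on Linux).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def watch_reload(module, rounds=5):
    # Re-import the module repeatedly, recording peak memory after each
    # round; a sequence that keeps climbing suggests a leak on import.
    sizes = []
    for _ in range(rounds):
        importlib.reload(module)
        sizes.append(rss_kb())
    return sizes
```

Since ru_maxrss is a high-water mark it never decreases; what matters is whether it keeps growing round after round, which is the same signal one would eyeball with ps or top.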
If you feel inclined to investigate, just do "reload(test.test_extcall)" in a loop and watch the memory usage with ps or top or what-have-you... -C From paulp at ActiveState.com Thu Dec 14 08:00:21 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Wed, 13 Dec 2000 23:00:21 -0800 Subject: [Python-Dev] new draft of PEP 227 References: <14904.21739.804346.650062@bitdiddle.concentric.net> Message-ID: <3A387005.6725DAAE@ActiveState.com> Jeremy Hylton wrote: > > I've got a new draft of PEP 227. The terminology and wording are more > convoluted than they need to be. I'll do at least one revision just > to say things more clearly, but I'd appreciate comments on the > proposed spec if you can read the current draft. It set me to thinking: Python should never require declarations. But would it necessarily be a problem for Python to have a variable declaration syntax? Might not the existence of declarations simplify some aspects of the proposal and of backwards compatibility? Along the same lines, might a new rule make Python code more robust? We could say that a local can only shadow a global if the local is formally declared. It's pretty rare that there is a good reason to shadow a global and Python makes it too easy to do accidentally. Paul Prescod From paulp at ActiveState.com Thu Dec 14 08:29:35 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Wed, 13 Dec 2000 23:29:35 -0800 Subject: [Python-Dev] Online help scope Message-ID: <3A3876DF.5554080C@ActiveState.com> I think Guido and I are pretty far apart on the scope and requirements of this online help thing so I'd like some clarification and opinions from the peanut gallery. Consider these scenarios:

a) Signature

>>> help( dir )
dir([object]) -> list of strings

b) Usage hint

>>> help( dir )
dir([object]) -> list of strings

Return an alphabetized list of names comprising (some of) the attributes of the given object. Without an argument, the names in the current scope are listed.
With an instance argument, only the instance attributes are returned. With a class argument, attributes of the base class are not returned. For other types or arguments, this may list members or methods.

c) Complete documentation, paged (man-style)

>>> help( dir )
dir([object]) -> list of strings

Without arguments, return the list of names in the current local symbol table. With an argument, attempts to return a list of valid attributes for that object. This information is gleaned from the object's __dict__, __methods__ and __members__ attributes, if defined. The list is not necessarily complete; e.g., for classes, attributes defined in base classes are not included, and for class instances, methods are not included. The resulting list is sorted alphabetically. For example:

>>> import sys
>>> dir()
['sys']
>>> dir(sys)
['argv', 'exit', 'modules', 'path', 'stderr', 'stdin', 'stdout']

d) Complete documentation in a user-chosen hypertext window

>>> help( dir )
(Netscape or lynx pops up)

I'm thinking that maybe we need two functions:

* help
* pythondoc

pythondoc("dir") would launch the Python documentation for the "dir" command. > That'S What Some People Think. I Disagree That It Would Be Either > Feasible Or A Good Idea To Put All Documentation For A Typical Module > In Its Doc Strings. Java and Perl people do it regularly. I think that in the greater world of software development, the inline model has won (or is winning) and I don't see a compelling reason to fight the tide. There will always be out-of-line tutorials, discussions, books etc. The canonical module documentation could be inline. That improves the likelihood of it being maintained. The LaTeX documentation is a major bottleneck and moving to XML or SGML will not help. Programmers do not want to learn documentation systems or syntaxes. They want to write code and comments.
> I said above, and I'll say it again: I think the majority of people > would prefer to use their standard web browser to read the standard > docs. It's not worth the effort to try to make those accessible > through help(). No matter what we decide on the issue above, reusing the standard documentation is the only practical way of populating the help system in the short-term. Right now, today, there is a ton of documentation that exists only in LaTeX and HTML. Tons of modules have no docstrings. Keywords have no docstrings. Compare the docstring for urllib.urlretrieve to the HTML documentation. In fact, you've given me a good idea: if the HTML is not available locally, I can access it over the web. Paul Prescod From paulp at ActiveState.com Thu Dec 14 08:29:53 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Wed, 13 Dec 2000 23:29:53 -0800 Subject: [Python-Dev] Online help PEP References: <3A3480E5.C2577AE6@ActiveState.com> <200012111557.KAA24266@cj20424-a.reston1.va.home.com> <3A366A41.1A14EFD4@ActiveState.com> <200012131548.KAA21344@cj20424-a.reston1.va.home.com> Message-ID: <3A3876F1.D3E65E90@ActiveState.com> Guido van Rossum wrote: > > Having the repr() overloading invoke the pager is dangerous. The beta > version of the license command did this, and it caused some strange > side effects, e.g. vars(__builtins__) would start reading from input > and confuse the users. The new version's repr() returns the desired > string if it's less than a page, and 'Type license() to see the full > license text' if the pager would need to be invoked. I'll add this to the PEP. > The implied import is a major liability. If you can do this without > importing (e.g. by source code inspection), fine. Otherwise, you > might issue some kind of message like "you must first import XXX.YYY". Okay, I'll add to the PEP that an open issue is what strategy to use, but that we want to avoid implicit import. > The hurt is code bloat in the interpreter and creeping featurism. 
> If you need command line access to the docs (which may be a reasonable > thing to ask for, although to me it sounds backwards :-), it's better > to provide a separate command, e.g. pythondoc. (Analog to perldoc.) Okay, I'll add a pythondoc proposal to the PEP. > Yes. Please add that option to the PEP. Done. > > > What does "demand-loaded" mean in a Python context? > > > > When you "touch" the help object, it loads the onlinehelp module which > > has the real implementation. The thing in __builtins__ is just a > > lightweight proxy. > > Please suggest an implementation. In the PEP. > Glad You'Re So Agreeable. :) What happened to your capitalization? elisp gone awry? > ... > To Tell You The Truth, I'M Not Holding My Breath Either. :-) So your > code should just dump the doc string on stdout without interpreting it > in any way (except for paging). I'll do this for the first version. > It's buggier than just that. The output of the pager prints an extra > "| " at the start of each page except for the first, and the first > page is a line longer than subsequent pages. For some reason that I now forget, that code is pretty hairy. > BTW, another bug: try help(cgi). It's nice that it gives the default > value for arguments, but the defaults for FieldStorage.__init__ happen > to include os.environ. Its entire value is dumped -- which causes the > pager to be off (it wraps over about 20 lines for me). I think you > may have to truncate long values a bit, e.g. by using the repr module. Okay. There are a lot of little things we need to figure out. Such as whether we should print out docstrings for private methods etc. >... > I don't know specific tools, but any serious docstring processing tool > ends up parsing the source code for this very reason, so there's > probably plenty of prior art. Okay, I'll look into it.
Paul From tim.one at home.com Thu Dec 14 08:35:00 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 14 Dec 2000 02:35:00 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A387005.6725DAAE@ActiveState.com> Message-ID: [Paul Prescod] > ... > Along the same lines, might a new rule make Python code more robust? > We could say that a local can only shadow a global if the local is > formally declared. It's pretty rare that there is a good reason to > shadow a global and Python makes it too easy to do accidentally. I've rarely seen problems due to shadowing a global, but have often seen problems due to shadowing a builtin. Alas, if this rule were extended to builtins too-- where it would do the most good --then the names of builtins would effectively become reserved words (any code shadowing them today would be broken until declarations were added, and any code working today may break tomorrow if a new builtin were introduced that happened to have the same name as a local). From pf at artcom-gmbh.de Thu Dec 14 08:42:59 2000 From: pf at artcom-gmbh.de (Peter Funk) Date: Thu, 14 Dec 2000 08:42:59 +0100 (MET) Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: <200012132039.MAA07496@slayer.i.sourceforge.net> from Moshe Zadka at "Dec 13, 2000 12:39:24 pm" Message-ID: Hi, I think the following change is incompatible and will break applications. At least I have some server type applications that rely on 'allow_reuse_address' defaulting to 0, because they use the 'address already in use' exception, to make sure, that exactly one server process is running on this port. One of these applications, which is BTW built on top of Fredrik Lundhs 'xmlrpclib', fails to work if I change this default in SocketServer.py. Would you please explain the reasoning behind this change?
Moshe Zadka:
> *** SocketServer.py	2000/09/01 03:25:14	1.19
> --- SocketServer.py	2000/12/13 20:39:17	1.20
> ***************
> *** 158,162 ****
>       request_queue_size = 5
>
> !     allow_reuse_address = 0
>
>       def __init__(self, server_address, RequestHandlerClass):
> --- 158,162 ----
>       request_queue_size = 5
>
> !     allow_reuse_address = 1
>
>       def __init__(self, server_address, RequestHandlerClass):

Regards, Peter -- Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260 office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen) From paul at prescod.net Thu Dec 14 08:57:30 2000 From: paul at prescod.net (Paul Prescod) Date: Wed, 13 Dec 2000 23:57:30 -0800 Subject: [Python-Dev] new draft of PEP 227 References: Message-ID: <3A387D6A.782E6A3B@prescod.net> Tim Peters wrote: > > ... > > I've rarely seen problems due to shadowing a global, but have often seen > problems due to shadowing a builtin. Really? I think that there are two different issues here. One is consciously choosing to create a new variable but not understanding that there already exists a variable by that name. (i.e. str, list). Another is trying to assign to a global but actually shadowing it. There is no way that anyone coming from another language is going to consider this transcript reasonable:

>>> a=5
>>> def show():
...     print a
...
>>> def set(val):
...     a=val
...
>>> a
5
>>> show()
5
>>> set(10)
>>> show()
5

It doesn't seem to make any sense. My solution is to make the assignment in "set" illegal unless you add a declaration that says: "No, really. I mean it. Override that sucker." As the PEP points out, overriding is seldom a good idea so the requirement to declare would be rarely invoked. Actually, one could argue that there is no good reason to even *allow* the shadowing of globals. You can always add an underscore to the end of the variable name to disambiguate.
> Alas, if this rule were extended to > builtins too-- where it would do the most good --then the names of builtins > would effectively become reserved words (any code shadowing them today would > be broken until declarations were added, and any code working today may > break tomorrow if a new builtin were introduced that happened to have the > same name as a local). I have no good solutions to the shadowing-builtins accidently problem. But I will say that those sorts of problems are typically less subtle: str = "abcdef" ... str(5) # You'll get a pretty good error message here! The "right answer" in terms of namespace theory is to consistently refer to builtins with a prefix (whether "__builtins__" or "$") but that's pretty unpalatable from an aesthetic point of view. Paul Prescod From tim.one at home.com Thu Dec 14 09:41:19 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 14 Dec 2000 03:41:19 -0500 Subject: [Python-Dev] Online help scope In-Reply-To: <3A3876DF.5554080C@ActiveState.com> Message-ID: [Paul Prescod] > I think Guido and I are pretty far apart on the scope and requirements > of this online help thing so I'd like some clarification and opinions > from the peanut gallery. > > Consider these scenarios > > a) Signature > ... > b) Usage hint > ... > c) Complete documentation, paged(man-style) > ... > d) Complete documentation in a user-chosen hypertext window > ... Guido's style guide has a lot to say about docstrings, suggesting that they were intended to support two scenarios: #a+#b together (the first line of a multi-line docstring), and #c+#d together (the entire docstring). In this respect I think Guido was (consciously or not) aping elisp's conventions, up to but not including the elisp convention for naming the arguments in the first line of a docstring. The elisp conventions were very successful (simple, and useful in practice), so aping them is a good thing. 
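The elisp-style convention described above (the first line of a docstring as the summary, the whole string as the full reference) is easy to exploit mechanically; a minimal sketch, with hypothetical helper names:

```python
def brief(obj):
    # The one-line summary: by convention, the first line of the docstring.
    doc = (obj.__doc__ or "").strip()
    return doc.split("\n", 1)[0]

def full(obj):
    # The complete reference text: the whole docstring, unprocessed.
    return obj.__doc__ or "(no docstring)"
```

brief(len) then yields the one-line summary from len's docstring, while full(len) returns everything; the two correspond to the signature/usage-hint and complete-documentation scenarios being debated.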
We've had stalemate ever since: there isn't a single style of writing docstrings in practice because no single docstring processor has been blessed, while no docstring processor can gain momentum before being blessed. Every attempt to date has erred by trying to do too much, thus attracting so much complaint that it can't ever become blessed. The current argument over PEP 233 appears to be more of the same. The way to break the stalemate is to err on the side of simplicity: just cater to the two obvious (first-line vs whole-string) cases, and for existing docstrings only. HTML vs plain text is fluff. Paging vs non-paging is fluff. Dumping to stdout vs displaying in a browser is fluff. Jumping through hoops for functions and modules whose authors didn't bother to write docstrings is fluff. Etc. People fight over fluff until it fills the air and everyone chokes to death on it <0.9 wink>. Something dirt simple can get blessed, and once *anything* is blessed, a million docstrings will bloom. [Guido] > That'S What Some People Think. I Disagree That It Would Be Either > Feasible Or A Good Idea To Put All Documentation For A Typical Module > In Its Doc Strings. I'm with Paul on this one: that's what module.__doc__ is for, IMO (Javadoc is great, Eiffel's embedded doc tools are great, Perl POD is great, even REBOL's interactive help is great). All that Java, Eiffel, Perl and REBOL have in common, and that Python lacks, is *a* blessed system, no matter how crude. [back to Paul] > ... > No matter what we decide on the issue above, reusing the standard > documentation is the only practical way of populating the help system > in the short-term. Right now, today, there is a ton of documentation > that exists only in LaTeX and HTML. Tons of modules have no docstrings. Then write tools to automatically create docstrings from the LaTeX and HTML, but *check in* the results (i.e., add the docstrings so created to the codebase), and keep the help system simple. > Keywords have no docstrings.
Neither do integers, but they're obvious too . From thomas at xs4all.net Thu Dec 14 10:13:49 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 14 Dec 2000 10:13:49 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: <20001214010534.M4396@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 14, 2000 at 01:05:34AM +0100 References: <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> Message-ID: <20001214101348.N4396@xs4all.nl> On Thu, Dec 14, 2000 at 01:05:34AM +0100, Thomas Wouters wrote: > By the way, in woody, there are 52 packages with 'python' in the name, and > 32 with 'perl' in the name... Ah, not true, sorry. I shouldn't have posted off-topic stuff after being awoken by machine-down-alarms ;) That was just what my reasonably-default install had installed. Debian has what looks like most CPAN modules as packages, too, so it's closer to a 110/410 spread (python/perl.) Still, not a bad number :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mal at lemburg.com Thu Dec 14 11:32:58 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 14 Dec 2000 11:32:58 +0100 Subject: [Python-Dev] new draft of PEP 227 References: <14904.21739.804346.650062@bitdiddle.concentric.net> Message-ID: <3A38A1DA.7EC49149@lemburg.com> Jeremy Hylton wrote: > > I've got a new draft of PEP 227. The terminology and wording are more > convoluted than they need to be. I'll do at least one revision just > to say things more clearly, but I'd appreciate comments on the > proposed spec if you can read the current draft. 
The PEP doesn't mention the problems I pointed out about breaking the lookup schemes w/r to symbols in methods, classes and globals. Please add a comment about this to the PEP + maybe the example I gave in one the posts to python-dev about it. I consider the problem serious enough to limit the nested scoping to lambda functions (or functions in general) only if that's possible. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Thu Dec 14 11:55:38 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 14 Dec 2000 11:55:38 +0100 Subject: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule) References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> Message-ID: <3A38A72A.4011B5BD@lemburg.com> Thomas Wouters wrote: > > On Wed, Dec 13, 2000 at 06:03:31PM -0500, Guido van Rossum wrote: > > > I don't think that is a very safe bet. Python 2.0 missed the Debian > > > Potato boat. > > > > This may have had to do more with the unresolved GPL issues. > > This is very likely. Debian is very licence -- or at least GPL -- aware. > Which is a pity, really, because I already prefer it over RedHat in all > other cases (and RedHat is also pretty licence aware, just less piously, > devoutly, beyond-practicality-IMHO dedicated to the GPL.) 
About the GPL issue: as I understood Guido's post, RMS still regards the choice of law clause as being incompatible with the GPL (heck, doesn't this guy ever think about international trade terms, the United Nations Convention on International Sale of Goods or local law in one of the 200+ countries where you could deploy GPLed software... is the GPL only meant for US programmers ?). I am currently rewriting my open source licenses as well and among other things I chose to integrate a choice of law clause as well. Seeing RMS' view of things, I guess that my license will be regarded as incompatible with the GPL which is sad even though I'm in good company... e.g. the Apache license, the Zope license, etc. Dual licensing is not possible as it would reopen the loopholes in the GPL I tried to fix in my license. Any idea on how to proceed ? Another issue: since Python doesn't link Python scripts, is it still true that if one (pure) Python package is covered by the GPL, then all other packages needed by that application will also fall under GPL ?
Thanks, -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From gstein at lyra.org Thu Dec 14 12:57:43 2000 From: gstein at lyra.org (Greg Stein) Date: Thu, 14 Dec 2000 03:57:43 -0800 Subject: (offtopic) Re: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <3A38A72A.4011B5BD@lemburg.com>; from mal@lemburg.com on Thu, Dec 14, 2000 at 11:55:38AM +0100 References: <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213160146.A24753@thyrsus.com> <200012132119.QAA11060@cj20424-a.reston1.va.home.com> <20001213170359.A24915@thyrsus.com> <20001213074154.D17148@glacier.fnational.com> <200012132303.SAA12434@cj20424-a.reston1.va.home.com> <20001214010534.M4396@xs4all.nl> <3A38A72A.4011B5BD@lemburg.com> Message-ID: <20001214035742.Z8951@lyra.org> On Thu, Dec 14, 2000 at 11:55:38AM +0100, M.-A. Lemburg wrote: >... > I am currently rewriting my open source licenses as well and among > other things I chose to integrate a choice of law clause as well. > Seeing RMS' view of things, I guess that my license will be regarded > as incompatible to the GPL which is sad even though I'm in good > company... e.g. the Apache license, the Zope license, etc. Dual > licensing is not possible as it would reopen the loop-wholes in the > GPL I tried to fix in my license. Any idea on how to proceed ? Only RMS is under the belief that the Apache license is incompatible. It is either clause 4 or 5 (I forget which) where we state that certain names (e.g. "Apache") cannot be used in derived products' names and promo materials. RMS views this as an "additional restriction on redistribution", which is apparently not allowed by the GPL. We (the ASF) generally feel he is being a royal pain in the ass with this. 
We've sent him a big, long email asking for clarification / resolution, but haven't heard back (we sent it a month or so ago). Basically, his FUD creates views such as yours ("the Apache license is incompatible with the GPL") because people just take his word for it. We plan to put together a web page to outline our own thoughts and licensing beliefs/philosophy. We're also planning to rev our license to rephrase/alter the particular clause, but for logistic purposes (putting the project name in there ties it to the particular project; we want a generic ASF license that can be applied to all of the projects without a search/replace). At this point, the ASF is taking the position of ignoring him and his controlling attitude(*) and beliefs. There is the outstanding letter to him, but that doesn't really change our point of view. Cheers, -g (*) for a person espousing freedom, it is rather ironic just how much of a control freak he is (stemming from a no-compromise position to guarantee peoples' freedoms, he always wants things done his way) -- Greg Stein, http://www.lyra.org/ From tg at melaten.rwth-aachen.de Thu Dec 14 14:07:12 2000 From: tg at melaten.rwth-aachen.de (Thomas Gellekum) Date: 14 Dec 2000 14:07:12 +0100 Subject: [Python-Dev] Splitting up _cursesmodule In-Reply-To: Andrew Kuchling's message of "Wed, 13 Dec 2000 19:26:32 -0500" References: <200012130355.WAA00656@207-172-112-211.s211.tnt4.ann.va.dialup.rcn.com> <14903.5633.941563.568690@cj42289-a.reston1.va.home.com> <20001213074423.A30348@kronos.cnri.reston.va.us> <14903.37733.339921.131872@cj42289-a.reston1.va.home.com> <20001213192632.A30585@kronos.cnri.reston.va.us> Message-ID: Andrew Kuchling writes: > I'm ambivalent about the list_of_panels. It's a linked list storing > (PyWindow, PyPanel) pairs. Probably it should use a dictionary > instead of implementing a little list, just to reduce the amount of > code. I don't like it either, so feel free to shred it. 
As I said, this is the first (piece of an) extension module I've written and I thought it would be easier to implement a little list than to manage a Python list or such in C. > So, I suggest we create _curses_panel.c, which would be available as > curses.panel. (A panel.py module could then add any convenience > functions that are required.) > > Thomas, do you want to work on this, or should I? Just do it. I'll try to add more examples in the meantime. tg From fredrik at pythonware.com Thu Dec 14 14:19:08 2000 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu, 14 Dec 2000 14:19:08 +0100 Subject: [Python-Dev] fuzzy logic? Message-ID: <015101c065d0$717d1680$0900a8c0@SPIFF> here's a simple (but somewhat strange) test program: def spam(): a = 1 if (0): global a print "global a" a = 2 def egg(): b = 1 if 0: global b print "global b" b = 2 egg() spam() print a print b if I run this under 1.5.2, I get: 2 Traceback (innermost last): File "", line 19, in ? NameError: b From gstein at lyra.org Thu Dec 14 14:42:11 2000 From: gstein at lyra.org (Greg Stein) Date: Thu, 14 Dec 2000 05:42:11 -0800 Subject: [Python-Dev] fuzzy logic? In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>; from fredrik@pythonware.com on Thu, Dec 14, 2000 at 02:19:08PM +0100 References: <015101c065d0$717d1680$0900a8c0@SPIFF> Message-ID: <20001214054210.G8951@lyra.org> I would take a guess that the "if 0:" is optimized away *before* the inspection for a "global" statement. But the compiler doesn't know how to optimize away "if (0):", so the global statement remains. Ah. Just checked. Look at compile.c::com_if_stmt(). There is a call to "is_constant_false()" in there. Heh. Looks like is_constant_false() could be made a bit smarter. But the point is valid: you can make is_constant_false() as smart as you want, and you'll still end up with "funny" global behavior. 
Cheers, -g

On Thu, Dec 14, 2000 at 02:19:08PM +0100, Fredrik Lundh wrote:
> here's a simple (but somewhat strange) test program:
>
> def spam():
>     a = 1
>     if (0):
>         global a
>         print "global a"
>     a = 2
>
> def egg():
>     b = 1
>     if 0:
>         global b
>         print "global b"
>     b = 2
>
> egg()
> spam()
>
> print a
> print b
>
> if I run this under 1.5.2, I get:
>
> 2
> Traceback (innermost last):
>   File "", line 19, in ?
> NameError: b
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev

-- Greg Stein, http://www.lyra.org/ From mwh21 at cam.ac.uk Thu Dec 14 14:58:24 2000 From: mwh21 at cam.ac.uk (Michael Hudson) Date: 14 Dec 2000 13:58:24 +0000 Subject: [Python-Dev] fuzzy logic? In-Reply-To: "Fredrik Lundh"'s message of "Thu, 14 Dec 2000 14:19:08 +0100" References: <015101c065d0$717d1680$0900a8c0@SPIFF> Message-ID: 1) Is there anything in the standard library that does the equivalent of

import symbol, token

def decode_ast(ast):
    if token.ISTERMINAL(ast[0]):
        return (token.tok_name[ast[0]], ast[1])
    else:
        return (symbol.sym_name[ast[0]],) + tuple(map(decode_ast, ast[1:]))

so that, eg:

>>> pprint.pprint(decode.decode_ast(parser.expr("0").totuple()))
('eval_input',
 ('testlist',
  ('test',
   ('and_test',
    ('not_test',
     ('comparison',
      ('expr',
       ('xor_expr',
        ('and_expr',
         ('shift_expr',
          ('arith_expr',
           ('term',
            ('factor',
             ('power',
              ('atom', ('NUMBER', '0'))))))))))))))),
 ('NEWLINE', ''),
 ('ENDMARKER', ''))

? Should there be? (Especially if it was a bit better written). ... and Greg's just said everything else I wanted to! Cheers, M. -- please realize that the Common Lisp community is more than 40 years old. collectively, the community has already been where every clueless newbie will be going for the next three years. so relax, please.
-- Erik Naggum, comp.lang.lisp From guido at python.org Thu Dec 14 15:51:26 2000 From: guido at python.org (Guido van Rossum) Date: Thu, 14 Dec 2000 09:51:26 -0500 Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: Your message of "Thu, 14 Dec 2000 08:42:59 +0100." References: Message-ID: <200012141451.JAA15637@cj20424-a.reston1.va.home.com> > I think the following change is incompatible and will break applications. > > At least I have some server type applications that rely on > 'allow_reuse_address' defaulting to 0, because they use > the 'address already in use' exception, to make sure, that exactly one > server process is running on this port. One of these applications, > which is BTW build on top of Fredrik Lundhs 'xmlrpclib' fails to work, > if I change this default in SocketServer.py. > > Would you please explain the reasoning behind this change? The reason for the patch is that without this, if you kill a TCP server and restart it right away, you'll get a 'port in use' error -- TCP has some kind of strange wait period after a connection is closed before it can be reused. The patch avoids this error. As far as I know, with TCP, code using SO_REUSEADDR still cannot bind to the port when another process is already using it, but for UDP, the semantics may be different. Is your server using UDP? Try this patch if your problem is indeed related to UDP:

*** SocketServer.py	2000/12/13 20:39:17	1.20
--- SocketServer.py	2000/12/14 14:48:16
***************
*** 268,273 ****
--- 268,275 ----
      """UDP server class."""

+     allow_reuse_address = 0
+
      socket_type = socket.SOCK_DGRAM

      max_packet_size = 8192

If this works for you, I'll check it in, of course.
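The attribute being patched here is consulted during binding; a short sketch in modern Python (socketserver is the Python 3 spelling of SocketServer) showing how a subclass opts in and how to confirm that SO_REUSEADDR really was applied to the listening socket:

```python
import socket
import socketserver

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        self.request.sendall(b"ok")

class ReusingTCPServer(socketserver.TCPServer):
    # server_bind() sets SO_REUSEADDR on the listening socket when this
    # is true, letting a restarted server rebind a port that is still
    # sitting in TCP's TIME_WAIT state.
    allow_reuse_address = True

server = ReusingTCPServer(("127.0.0.1", 0), EchoHandler)
reuse = server.socket.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
server.server_close()
print(reuse != 0)  # True: the option was applied during binding
```

Applications like Peter's, which rely on the 'address already in use' exception to enforce a single server instance, can keep the old behaviour simply by setting allow_reuse_address back to a false value in their own subclass.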
--Guido van Rossum (home page: http://www.python.org/~guido/) From jeremy at alum.mit.edu Thu Dec 14 15:52:37 2000 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Thu, 14 Dec 2000 09:52:37 -0500 (EST) Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A38A1DA.7EC49149@lemburg.com> References: <14904.21739.804346.650062@bitdiddle.concentric.net> <3A38A1DA.7EC49149@lemburg.com> Message-ID: <14904.57013.371474.691948@bitdiddle.concentric.net> >>>>> "MAL" == M -A Lemburg writes: MAL> Jeremy Hylton wrote: >> >> I've got a new draft of PEP 227. The terminology and wording are >> more convoluted than they need to be. I'll do at least one >> revision just to say things more clearly, but I'd appreciate >> comments on the proposed spec if you can read the current draft. MAL> The PEP doesn't mention the problems I pointed out about MAL> breaking the lookup schemes w/r to symbols in methods, classes MAL> and globals. I believe it does. There was some discussion on python-dev and with others in private email about how classes should be handled. The relevant section of the specification is: If a name is used within a code block, but it is not bound there and is not declared global, the use is treated as a reference to the nearest enclosing function region. (Note: If a region is contained within a class definition, the name bindings that occur in the class block are not visible to enclosed functions.) MAL> Please add a comment about this to the PEP + maybe the example MAL> I gave in one the posts to python-dev about it. I consider the MAL> problem serious enough to limit the nested scoping to lambda MAL> functions (or functions in general) only if that's possible. If there was some other concern you had, then I don't know what it was. I recall that you had a longish example that raised a NameError immediately :-). Jeremy From mal at lemburg.com Thu Dec 14 16:02:33 2000 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Thu, 14 Dec 2000 16:02:33 +0100 Subject: [Python-Dev] new draft of PEP 227 References: <14904.21739.804346.650062@bitdiddle.concentric.net> <3A38A1DA.7EC49149@lemburg.com> <14904.57013.371474.691948@bitdiddle.concentric.net> Message-ID: <3A38E109.54C07565@lemburg.com> Jeremy Hylton wrote: > > >>>>> "MAL" == M -A Lemburg writes: > > MAL> Jeremy Hylton wrote: > >> > >> I've got a new draft of PEP 227. The terminology and wording are > >> more convoluted than they need to be. I'll do at least one > >> revision just to say things more clearly, but I'd appreciate > >> comments on the proposed spec if you can read the current draft. > > MAL> The PEP doesn't mention the problems I pointed out about > MAL> breaking the lookup schemes w/r to symbols in methods, classes > MAL> and globals. > > I believe it does. There was some discussion on python-dev and > with others in private email about how classes should be handled. > > The relevant section of the specification is: > > If a name is used within a code block, but it is not bound there > and is not declared global, the use is treated as a reference to > the nearest enclosing function region. (Note: If a region is > contained within a class definition, the name bindings that occur > in the class block are not visible to enclosed functions.) Well hidden ;-) Honestly, I think that you should either make this specific case more visible to readers of the PEP since this single detail would produce most of the problems with nested scopes. BTW, what about nested classes ? AFAIR, the PEP only talks about nested functions. > MAL> Please add a comment about this to the PEP + maybe the example > MAL> I gave in one the posts to python-dev about it. I consider the > MAL> problem serious enough to limit the nested scoping to lambda > MAL> functions (or functions in general) only if that's possible. > > If there was some other concern you had, then I don't know what it > was. 
I recall that you had a longish example that raised a NameError
> immediately :-).

The idea behind the example should have been clear, though:

    x = 1

    class C:
        x = 2
        def test(self):
            print x

-- Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From fdrake at acm.org Thu Dec 14 16:09:57 2000
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 14 Dec 2000 10:09:57 -0500 (EST)
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: 
References: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: <14904.58053.282537.260186@cj42289-a.reston1.va.home.com>

Michael Hudson writes:
> 1) Is there anything is the standard library that does the equivalent
> of

No, but I have a chunk of code that does in a different way. Where in the library do you think it belongs? The compiler package sounds like the best place, but that's not installed by default. (Jeremy, is that likely to change soon?)

-Fred

-- Fred L. Drake, Jr.
PythonLabs at Digital Creations

From mwh21 at cam.ac.uk Thu Dec 14 16:47:33 2000
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 14 Dec 2000 15:47:33 +0000
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: "Fred L. Drake, Jr."'s message of "Thu, 14 Dec 2000 10:09:57 -0500 (EST)"
References: <015101c065d0$717d1680$0900a8c0@SPIFF> <14904.58053.282537.260186@cj42289-a.reston1.va.home.com>
Message-ID: 

"Fred L. Drake, Jr." writes:
> Michael Hudson writes:
> > 1) Is there anything is the standard library that does the equivalent
> > of
>
> No, but I have a chunk of code that does in a different way.

I'm guessing everyone who's played with the parser much does, hence the suggestion. I agree my implementation is probably not optimal - I just threw it together as quickly as I could!

> Where in the library do you think it belongs? The compiler package
> sounds like the best place, but that's not installed by default.
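For reference, MAL's example from the PEP 227 exchange above, run under the rule Jeremy quoted (bindings made in a class block are not visible to functions nested inside it), in modern print syntax:

```python
# Under the PEP 227 rule quoted earlier, the x bound in the class block
# is NOT visible to the enclosed function: the method's free variable x
# skips the class scope and resolves to the module-level binding.
x = 1

class C:
    x = 2            # becomes the attribute C.x
    def test(self):
        return x     # resolves to the global x, not C.x

print(C().test())  # -> 1  (C.x is still 2, reachable only as self.x or C.x)
```

This is exactly the behavior that shipped: the class body gets its own namespace for attribute creation, but it never participates in the lexical scope chain of its methods.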
> (Jeremy, is that likely to change soon?) Actually, I'd have thought the parser module would be most natural, but that would probably mean doing the _module.c trick, and it's probably not worth the bother. OTOH, it seems that wrapping any given extension module in a python module is becoming if anything the norm, so maybe it is. Cheers, M. -- I don't remember any dirty green trousers. -- Ian Jackson, ucam.chat From nowonder at nowonder.de Thu Dec 14 16:50:10 2000 From: nowonder at nowonder.de (Peter Schneider-Kamp) Date: Thu, 14 Dec 2000 16:50:10 +0100 Subject: [Python-Dev] [PEP-212] new draft Message-ID: <3A38EC32.210BD1A2@nowonder.de> In an attempt to revive PEP 212 - Loop counter iteration I have updated the draft. The HTML version can be found at: http://python.sourceforge.net/peps/pep-0212.html I will appreciate any form of comments and/or criticisms. Peter P.S.: Now I have posted it - should I update the Post-History? Or is that for posts to c.l.py? From pf at artcom-gmbh.de Thu Dec 14 16:56:08 2000 From: pf at artcom-gmbh.de (Peter Funk) Date: Thu, 14 Dec 2000 16:56:08 +0100 (MET) Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: <200012141451.JAA15637@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 14, 2000 9:51:26 am" Message-ID: Hi, Moshes checkin indeed makes a lot of sense. Sorry for the irritation. Guido van Rossum: > The reason for the patch is that without this, if you kill a TCP server > and restart it right away, you'll get a 'port in use" error -- TCP has > some kind of strange wait period after a connection is closed before > it can be reused. The patch avoids this error. > > As far as I know, with TCP, code using SO_REUSEADDR still cannot bind > to the port when another process is already using it, but for UDP, the > semantics may be different. > > Is your server using UDP? 
No, and I must admit that I didn't test carefully enough: from a quick look at my process listing I assumed there were indeed two server processes running concurrently, which would have broken the needed mutual exclusion. But the second process went into a sleep-and-retry-to-connect loop, which I simply forgot about. This loop was initially built into my server to wait until the "strange wait period" you mentioned above was over, or until a certain number of retries had been exceeded. I guess I can take this ugly work-around out with Python 2.0 and newer, since the BaseHTTPServer.py shipped with Python 2.0 already contains an allow_reuse_address = 1 default in the HTTPServer class.

BTW: I took my old W. Richard Stevens "Unix Network Programming" from the shelf. After rereading the rather terse paragraph about SO_REUSEADDR, I guess the wait period is necessary to make sure that there is no connect pending from an outside client on this TCP port. I can't find anything about UDP and REUSE.

Regards, Peter

From guido at python.org Thu Dec 14 17:17:27 2000
From: guido at python.org (Guido van Rossum)
Date: Thu, 14 Dec 2000 11:17:27 -0500
Subject: [Python-Dev] Online help scope
In-Reply-To: Your message of "Wed, 13 Dec 2000 23:29:35 PST." <3A3876DF.5554080C@ActiveState.com>
References: <3A3876DF.5554080C@ActiveState.com>
Message-ID: <200012141617.LAA16179@cj20424-a.reston1.va.home.com>

> I think Guido and I are pretty far apart on the scope and requirements
> of this online help thing so I'd like some clarification and opinions
> from the peanut gallery.

I started replying but I think Tim's said it all. Let's do something dead simple.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry at digicool.com Thu Dec 14 18:14:01 2000
From: barry at digicool.com (Barry A.
Warsaw)
Date: Thu, 14 Dec 2000 12:14:01 -0500
Subject: [Python-Dev] [PEP-212] new draft
References: <3A38EC32.210BD1A2@nowonder.de>
Message-ID: <14904.65497.940293.975775@anthem.concentric.net>

>>>>> "PS" == Peter Schneider-Kamp writes:

  PS> P.S.: Now I have posted it - should I update the Post-History?
  PS> Or is that for posts to c.l.py?

Originally, I'd thought of it as tracking the posting history to c.l.py. I'm not sure how useful that header is after all -- maybe in just giving a start into the python-list archives...

-Barry

From tim.one at home.com Thu Dec 14 18:33:41 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 12:33:41 -0500
Subject: [Python-Dev] fuzzy logic?
In-Reply-To: <015101c065d0$717d1680$0900a8c0@SPIFF>
Message-ID: 

Note that the behavior of both functions is undefined ("Names listed in a global statement must not be used in the same code block textually preceding that global statement", from the Lang Ref, and "if" does not introduce a new code block in Python's terminology). But you'll get the same outcome via these trivial variants, which sidestep that problem:

    def spam():
        if (0):
            global a
            print "global a"
        a = 2

    def egg():
        if 0:
            global b
            print "global b"
        b = 2

*Now* you can complain .

> -----Original Message-----
> From: python-dev-admin at python.org [mailto:python-dev-admin at python.org]On
> Behalf Of Fredrik Lundh
> Sent: Thursday, December 14, 2000 8:19 AM
> To: python-dev at python.org
> Subject: [Python-Dev] fuzzy logic?
>
> here's a simple (but somewhat strange) test program:
>
> def spam():
>     a = 1
>     if (0):
>         global a
>         print "global a"
>     a = 2
>
> def egg():
>     b = 1
>     if 0:
>         global b
>         print "global b"
>     b = 2
>
> egg()
> spam()
>
> print a
> print b
>
> if I run this under 1.5.2, I get:
>
> 2
> Traceback (innermost last):
>   File "", line 19, in ?
> NameError: b > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://www.python.org/mailman/listinfo/python-dev From tim.one at home.com Thu Dec 14 19:46:09 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 14 Dec 2000 13:46:09 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule) In-Reply-To: <3A38A72A.4011B5BD@lemburg.com> Message-ID: [MAL] > About the GPL issue: as I understood Guido's post, RMS still regards > the choice of law clause as being incompatible to the GPL Yes. Actually, I don't know what RMS really thinks -- his public opinions on legal issues appear to be echoes of what Eben Moglen tells him. Like his views or not, Moglen is a tenured law professor > (heck, doesn't this guy ever think about international trade terms, > the United Nations Convention on International Sale of Goods > or local law in one of the 200+ countries where you could deploy > GPLed software... Yes. > is the GPL only meant for US programmers ?). No. Indeed, that's why the GPL is grounded in copyright law, because copyright law is the most uniform (across countries) body of law we've got. Most commentary I've seen suggests that the GPL has its *weakest* legal legs in the US! > I am currently rewriting my open source licenses as well and among > other things I chose to integrate a choice of law clause as well. > Seeing RMS' view of things, I guess that my license will be regarded > as incompatible to the GPL Yes. > which is sad even though I'm in good company... e.g. the Apache > license, the Zope license, etc. Dual licensing is not possible as > it would reopen the loop-wholes in the GPL I tried to fix in my > license. Any idea on how to proceed ? 
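(An aside, returning to the fuzzy-logic thread above: Tim's variants can be run as-is in modern Python, where the rule is no longer fuzzy. `global` is a compile-time directive that applies to the whole function block, even when it sits in a branch that never executes; the syntax below is only updated for the print function.)

```python
# Tim's variants in modern syntax: "global" is processed when the
# function is compiled (by the symbol-table pass), so it takes effect
# even though neither "if 0:" branch ever runs.
def spam():
    if (0):
        global a
        print("global a")
    a = 2

def egg():
    if 0:
        global b
        print("global b")
    b = 2

egg()
spam()
print(a, b)
```

Under 1.5.2 the unparenthesized `if 0:` block was optimized away before the `global b` was ever seen, which is why Fredrik's original printed 2 for `a` but raised NameError for `b`; modern CPython records both declarations, so both names end up as module globals set to 2.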
You can wait to see how the CNRI license turns out, then copy it if it's successful; you can approach the FSF directly; you can stop trying to do it yourself and reuse some license that's already been blessed by the FSF; or you can give up on GPL compatibility (according to the FSF). I don't see any other choices. > Another issue: since Python doesn't link Python scripts, is it > still true that if one (pure) Python package is covered by the GPL, > then all other packages needed by that application will also fall > under GPL ? Sorry, couldn't make sense of the question. Just as well, since you should ask about it on a GNU forum anyway . From mal at lemburg.com Thu Dec 14 21:02:05 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 14 Dec 2000 21:02:05 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: Message-ID: <3A39273D.4AE24920@lemburg.com> Tim Peters wrote: > > [MAL] > > About the GPL issue: as I understood Guido's post, RMS still regards > > the choice of law clause as being incompatible to the GPL > > Yes. Actually, I don't know what RMS really thinks -- his public opinions > on legal issues appear to be echoes of what Eben Moglen tells him. Like his > views or not, Moglen is a tenured law professor But it's his piece of work, isn't it ? He's the one who can change it. > > (heck, doesn't this guy ever think about international trade terms, > > the United Nations Convention on International Sale of Goods > > or local law in one of the 200+ countries where you could deploy > > GPLed software... > > Yes. Strange, then how come he sees the choice of law clause as a problem: without explicitely ruling out the applicability of the UN CISC, this clause is waived by it anyway... at least according to a specialist on software law here in Germany. > > is the GPL only meant for US programmers ?). > > No. 
Indeed, that's why the GPL is grounded in copyright law, because > copyright law is the most uniform (across countries) body of law we've got. > Most commentary I've seen suggests that the GPL has its *weakest* legal legs > in the US! Huh ? Just an example: in Germany customer rights assure a 6 month warranty on everything you buy or obtain in some other way. Liability is another issue: there are some very unpleasant laws which render most of the "no liability" paragraphs in licenses useless in Germany. Even better: since the license itself is written in English a German party could simply consider the license non-binding, since he or she hasn't agreed to accept contract in foreign languages. France has similar interpretations. > > I am currently rewriting my open source licenses as well and among > > other things I chose to integrate a choice of law clause as well. > > Seeing RMS' view of things, I guess that my license will be regarded > > as incompatible to the GPL > > Yes. > > > which is sad even though I'm in good company... e.g. the Apache > > license, the Zope license, etc. Dual licensing is not possible as > > it would reopen the loop-wholes in the GPL I tried to fix in my > > license. Any idea on how to proceed ? > > You can wait to see how the CNRI license turns out, then copy it if it's > successful; you can approach the FSF directly; you can stop trying to do it > yourself and reuse some license that's already been blessed by the FSF; or > you can give up on GPL compatibility (according to the FSF). I don't see > any other choices. I guess I'll go with the latter. > > Another issue: since Python doesn't link Python scripts, is it > > still true that if one (pure) Python package is covered by the GPL, > > then all other packages needed by that application will also fall > > under GPL ? > > Sorry, couldn't make sense of the question. Just as well, since you should > ask about it on a GNU forum anyway . 
Isn't this question (whether the GPL virus applies to byte-code as well) important to Python programmers as well ? Oh well, nevermind... it's still nice to hear that CNRI and RMS have finally made up their minds to render Python GPL-compatible -- whatever this means ;-)

-- Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From cgw at fnal.gov Thu Dec 14 22:06:43 2000
From: cgw at fnal.gov (Charles G Waldman)
Date: Thu, 14 Dec 2000 15:06:43 -0600 (CST)
Subject: [Python-Dev] memory leaks
Message-ID: <14905.13923.659879.100243@buffalo.fnal.gov>

The following code (extracted from test_extcall.py) leaks memory:

    class Foo:
        def method(self, arg1, arg2):
            return arg1 + arg2

    def f():
        err = None
        try:
            Foo.method(*(1, 2, 3))
        except TypeError, err:
            pass
        del err

One-line fix (also posted to Sourceforge):

--- Python/ceval.c	2000/10/30 17:15:19	2.213
+++ Python/ceval.c	2000/12/14 20:54:02
@@ -1905,8 +1905,7 @@
 					     class))) {
 				PyErr_SetString(PyExc_TypeError,
 	 "unbound method must be called with instance as first argument");
-				x = NULL;
-				break;
+				goto extcall_fail;
 			}
 		}
 	}

I think that there are a bunch more memory leaks lurking around... this only fixes one of them. I'll send more info as I find out what's going on.

From tim.one at home.com Thu Dec 14 22:28:09 2000
From: tim.one at home.com (Tim Peters)
Date: Thu, 14 Dec 2000 16:28:09 -0500
Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL
In-Reply-To: <3A39273D.4AE24920@lemburg.com>
Message-ID: 

I'm not going to argue about the GPL. Take it up with the FSF! I will say that if you do get the FSF's attention, Moglen will have an instant counter to any objection you're likely to raise -- he's been thinking about this for 10 years, and he's heard it all. And in our experience, RMS won't commit to anything before running it past Moglen.
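(Back to Charles Waldman's memory-leak report for a moment: a leak of this flavor, a reference not released on an error path, is typically confirmed from Python by running the failing call in a loop and checking that the number of live objects stays flat. `sys.gettotalrefcount` gives an exact count but only exists in debug builds, so this sketch counts gc-tracked objects instead; `grows` is a hypothetical helper, and the reproduction is adapted to modern Python, where the unbound call fails with a different TypeError, missing argument rather than bad first argument, but still exercises the error path.)

```python
import gc

class Foo:
    def method(self, arg1, arg2):
        return arg1 + arg2

def f():
    try:
        Foo.method(*(1, 2))   # too few args -> TypeError, like the 2.0 repro
    except TypeError:
        pass

def grows(func, n=1000):
    """Return the change in gc-tracked object count after n calls."""
    func()            # warm up any caches allocated on the first call
    gc.collect()
    before = len(gc.get_objects())
    for _ in range(n):
        func()
    gc.collect()
    return len(gc.get_objects()) - before

print(grows(f) < 50)  # a leak-free interpreter shows no growth
```

On a pre-fix interpreter the C-level leak would not show up in this object count at all (the tuple's refcount leaked without keeping it gc-visible), which is why debug builds and `sys.gettotalrefcount` were the tool of choice for hunts like this one.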
[MAL] > But it's his [RMS's] piece of work, isn't it ? He's the one who can > change it. Akin to saying Python is Guido's piece of work. Yes, no, kinda, more true at some times than others, ditto respects. RMS has consistently said that any changes for the next version of the GPL will take at least a year, due to extensive legal review required first. Would be more clearly true to say that the first version of the GPL was RMS's alone -- but version 2 came out in 1991. > ... > Strange, then how come he sees the choice of law clause as a problem: > without explicitely ruling out the applicability of the UN CISC, > this clause is waived by it anyway... at least according to a > specialist on software law here in Germany. > ... [and other "who knows?" objections] ... Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re: [License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who dismissed it almost offhandedly as "layman's commentary". You'll have to ask him why: MAL, we're not lawyers. We're incompetent to have this discussion -- or at least I am, and Moglen thinks you are too . >>> Another issue: since Python doesn't link Python scripts, is it >>> still true that if one (pure) Python package is covered by the GPL, >>> then all other packages needed by that application will also fall >>> under GPL ? [Tim] >> Sorry, couldn't make sense of the question. Just as well, >> since you should ask about it on a GNU forum anyway . [MAL] > Isn't this question (whether the GPL virus applies to byte-code > as well) important to Python programmers as well ? I don't know -- like I said, I couldn't make sense of the question, i.e. I couldn't figure out what it is you're asking. I *suspect* it's based on a misunderstanding of the GPL; for example, gcc is a GPL'ed application that requires stuff from the OS in order to do its job of compiling, but that doesn't mean that every OS it runs on falls under the GPL. 
The GPL contains no restrictions on *use*, it restricts only copying, modifying and distributing (the specific rights granted by copyright law). I don't see any way to read the GPL as restricting your ability to distribute a GPL'ed program P on its own, no matter what the status of the packages that P may rely upon for operation. The GPL is also not viral in the sense that it cannot infect an unwitting victim. Nothing whatsoever you do or don't do can make *any* other program Q "fall under" the GPL -- only Q's owner can set the license for Q. The GPL purportedly can prevent you from distributing (but not from using) a program that links with a GPL'ed program, but that doesn't appear to be what you're asking about. Or is it? If you were to put, say, mxDateTime, under the GPL, then yes, I believe the FSF would claim I could not distribute my program T that uses mxDateTime unless T were also under the GPL or a GPL-compatible license. But if mxDateTime is not under the GPL, then nothing I do with T can magically change the mxDateTime license to the GPL (although if your mxDateTime license allows me to redistribute mxDateTime under a different license, then it allows me to ship a copy of mxDateTime under the GPL). That said, the whole theory of GPL linking is muddy to me, especially since the word "link" (and its variants) doesn't appear in the GPL. > Oh well, nevermind... it's still nice to hear that CNRI and RMS > have finally made up their minds to render Python GPL-compatible -- > whatever this means ;-) I'm not sure it means anything yet. CNRI and the FSF believed they reached agreement before, but that didn't last after Moglen and Kahn each figured out what the other was really suggesting. From mal at lemburg.com Thu Dec 14 23:25:31 2000 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Thu, 14 Dec 2000 23:25:31 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: Message-ID: <3A3948DB.9165E404@lemburg.com> Tim Peters wrote: > > I'm not going to argue about the GPL. Take it up with the FSF! Sorry, I got a bit carried away -- I don't want to take it up with the FSF, simply because I couldn't care less. What's bugging me is that this one guy is splitting the OSS world in two even though both halfs actually want the same thing: software which you can use for free with full source code. I find that a very poor situation. > I will say > that if you do get the FSF's attention, Moglen will have an instant counter > to any objection you're likely to raise -- he's been thinking about this for > 10 years, and he's heard it all. And in our experience, RMS won't commit to > anything before running it past Moglen. > > [MAL] > > But it's his [RMS's] piece of work, isn't it ? He's the one who can > > change it. > > Akin to saying Python is Guido's piece of work. Yes, no, kinda, more true > at some times than others, ditto respects. RMS has consistently said that > any changes for the next version of the GPL will take at least a year, due > to extensive legal review required first. Would be more clearly true to say > that the first version of the GPL was RMS's alone -- but version 2 came out > in 1991. Point taken. > > ... > > Strange, then how come he sees the choice of law clause as a problem: > > without explicitely ruling out the applicability of the UN CISC, > > this clause is waived by it anyway... at least according to a > > specialist on software law here in Germany. > > ... [and other "who knows?" objections] ... > > Guido quoted the text of your Wed, 06 Sep 2000 14:19:09 +0200 "Re: > [License-py20] Re: GPL incompability as seen from Europe" msg to Moglen, who > dismissed it almost offhandedly as "layman's commentary". You'll have to > ask him why: MAL, we're not lawyers. 
We're incompetent to have this
> discussion -- or at least I am, and Moglen thinks you are too .

I'm not a lawyer either, but I am able to apply common sense and know about German trade laws.

Anyway, here's a reference which covers all the controversial subjects. It's in German, but these guys qualify as lawyers ;-) ...

	http://www.ifross.de/ifross_html/index.html

There's also a book on the subject in German which covers all aspects of software licensing. Here's the reference in case anyone cares:

	Jochen Marly, Softwareüberlassungsverträge, C.H. Beck, München, 2000

> >>> Another issue: since Python doesn't link Python scripts, is it
> >>> still true that if one (pure) Python package is covered by the GPL,
> >>> then all other packages needed by that application will also fall
> >>> under GPL ?
>
> [Tim]
> >> Sorry, couldn't make sense of the question. Just as well,
> >> since you should ask about it on a GNU forum anyway .
>
> [MAL]
> > Isn't this question (whether the GPL virus applies to byte-code
> > as well) important to Python programmers as well ?
>
> I don't know -- like I said, I couldn't make sense of the question, i.e. I
> couldn't figure out what it is you're asking. I *suspect* it's based on a
> misunderstanding of the GPL; for example, gcc is a GPL'ed application that
> requires stuff from the OS in order to do its job of compiling, but that
> doesn't mean that every OS it runs on falls under the GPL. The GPL contains
> no restrictions on *use*, it restricts only copying, modifying and
> distributing (the specific rights granted by copyright law). I don't see
> any way to read the GPL as restricting your ability to distribute a GPL'ed
> program P on its own, no matter what the status of the packages that P may
> rely upon for operation.

This is very controversial: if an application Q needs a GPLed library P to work, then P and Q form a new whole in the sense of the GPL. And this even though P wasn't even distributed together with Q.
Don't ask me why, but that's how RMS and folks look at it. It can be argued that the dynamic linker actually integrates P into Q, but is the same argument valid for a Python program Q which relies on a GPLed package P ? (The relationship between Q and P is one of providing interfaces -- there is no call address patching required for the setup to work.) > The GPL is also not viral in the sense that it cannot infect an unwitting > victim. Nothing whatsoever you do or don't do can make *any* other program > Q "fall under" the GPL -- only Q's owner can set the license for Q. The GPL > purportedly can prevent you from distributing (but not from using) a program > that links with a GPL'ed program, but that doesn't appear to be what you're > asking about. Or is it? No. What's viral about the GPL is that you can turn an application into a GPLed one by merely linking the two together -- that's why e.g. the libc is distributed under the LGPL which doesn't have this viral property. > If you were to put, say, mxDateTime, under the GPL, then yes, I believe the > FSF would claim I could not distribute my program T that uses mxDateTime > unless T were also under the GPL or a GPL-compatible license. But if > mxDateTime is not under the GPL, then nothing I do with T can magically > change the mxDateTime license to the GPL (although if your mxDateTime > license allows me to redistribute mxDateTime under a different license, then > it allows me to ship a copy of mxDateTime under the GPL). > > That said, the whole theory of GPL linking is muddy to me, especially since > the word "link" (and its variants) doesn't appear in the GPL. True. > > Oh well, nevermind... it's still nice to hear that CNRI and RMS > > have finally made up their minds to render Python GPL-compatible -- > > whatever this means ;-) > > I'm not sure it means anything yet. 
CNRI and the FSF believed they reached > agreement before, but that didn't last after Moglen and Kahn each figured > out what the other was really suggesting. Oh boy... -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From greg at cosc.canterbury.ac.nz Fri Dec 15 00:19:09 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 15 Dec 2000 12:19:09 +1300 (NZDT) Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <3A3948DB.9165E404@lemburg.com> Message-ID: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> "M.-A. Lemburg" : > if an application Q needs a GPLed > library P to work, then P and Q form a new whole in the sense of > the GPL. I don't see how Q can *need* any particular library P to work. The most it can need is some library with an API which is compatible with P's. So I don't buy that argument. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Fri Dec 15 00:58:24 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 15 Dec 2000 12:58:24 +1300 (NZDT) Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A387005.6725DAAE@ActiveState.com> Message-ID: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> Paul Prescod : > We could say that a local can only shadow a global > if the local is formally declared. How do you intend to enforce that? Seems like it would require a test on every assignment to a local, to make sure nobody has snuck in a new global since the function was compiled. > Actually, one could argue that there is no good reason to > even *allow* the shadowing of globals. 
If shadowing were completely disallowed, it would make it impossible to write a completely self-contained function whose source could be moved from one environment to another without danger of it breaking. I wouldn't like the language to have a characteristic like that. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Fri Dec 15 01:06:12 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Fri, 15 Dec 2000 13:06:12 +1300 (NZDT) Subject: [Python-Dev] Online help scope In-Reply-To: Message-ID: <200012150006.NAA02154@s454.cosc.canterbury.ac.nz> Tim Peters : > [Paul Prescod] > > Keywords have no docstrings. > Neither do integers, but they're obvious too . Oh, I don't know, it could be useful. >>> help(2) The first prime number. >>> help(2147483647) sys.maxint, the largest Python small integer. >>> help(42) The answer to the ultimate question of life, the universe and everything. See also: ultimate_question. >>> help("ultimate_question") [Importing research.mice.earth] [Calling earth.find_ultimate_question] This may take about 10 million years, please be patient... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From barry at digicool.com Fri Dec 15 01:33:16 2000 From: barry at digicool.com (Barry A. 
Warsaw) Date: Thu, 14 Dec 2000 19:33:16 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <3A3948DB.9165E404@lemburg.com> <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> Message-ID: <14905.26316.407495.981198@anthem.concentric.net> >>>>> "GE" == Greg Ewing writes: GE> I don't see how Q can *need* any particular library P to GE> work. The most it can need is some library with an API which GE> is compatible with P's. So I don't buy that argument. It's been my understanding that the FSF's position on this is as follows. If the only functional implementation of the API is GPL'd software then simply writing your code against that API is tantamount to linking with that software. Their reasoning is that the clear intent of the programmer (shut up, Chad) is to combine the program with GPL code. As soon as there is a second, non-GPL implementation of the API, you're fine because while you may not distribute your program with the GPL'd software linked in, those who receive your software wouldn't be forced to combine GPL and non-GPL code. -Barry From tim.one at home.com Fri Dec 15 04:01:36 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 14 Dec 2000 22:01:36 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <3A3948DB.9165E404@lemburg.com> Message-ID: [MAL] > Sorry, I got a bit carried away -- I don't want to take it up > with the FSF, simply because I couldn't care less. Well, nobody else is able to Pronounce on what the FSF believes or will do. Which tells me that you're not really interested in playing along with the FSF here after all -- which we both knew from the start anyway . > What's bugging me is that this one guy is splitting the OSS world There are many people on the FSF bandwagon. I'm not one of them, but I can count. > in two even though both halfs actually want the same thing: software > which you can use for free with full source code. I find that a very > poor situation. 
RMS would not agree that both halves want the same thing; to the contrary, he's openly contemptuous of the Open Source movement -- which you also knew from the start. > [stuff about German law I won't touch with 12-foot schnitzel] OTOH, a German FSF advocate assured me: I also tend to forget that the system of the law works different in the US as in Germany. In Germany something that most people will believe (called "common grounds") play a role in the court. So if you knew, because it is widely known what the GPL means, than it is harder to attack that in court. In the US, when something gets to court it doesn't matter at all what people believed about it. Heck, we'll let mass murderers go free if a comma was in the wrong place in a 1592 statute, or send a kid to jail for life for using crack cocaine instead of the flavor favored by stockbrokers . I hope the US is unique in that respect, but it does make the GPL weaker here because even if *everyone* in our country believed the GPL means what RMS says it means, a US court would give that no weight in its logic-chopping. >>> Another issue: since Python doesn't link Python scripts, is it >>> still true that if one (pure) Python package is covered by the GPL, >>> then all other packages needed by that application will also fall >>> under GPL ? > This is very controversial: if an application Q needs a GPLed > library P to work, then P and Q form a new whole in the sense of > the GPL. And this even though P wasn't even distributed together > with Q. Don't ask me why, but that's how RMS and folks look at it. Understood, but have you reread your question above, which I've said twice I can't make sense of? That's not what you were asking about. Your question above asks, if anything, the opposite: the *application* Q is GPL'ed, and the question above asks whether that means the *Ps* it depends on must also be GPL'ed.
To the best of my ability, I've answered "NO" to that one, and "YES" to the question it appears you meant to ask. > It can be argued that the dynamic linker actually integrates > P into Q, but is the same argument valid for a Python program Q > which relies on a GPLed package P ? (The relationship between > Q and P is one of providing interfaces -- there is no call address > patching required for the setup to work.) As before, I believe the FSF will say YES. Unless there's also a non-GPL'ed implementation of the same interface that people could use just as well. See my extended mxDateTime example too. > ... > No. What's viral about the GPL is that you can turn an application > into a GPLed one by merely linking the two together No, you cannot. You can link them together all day without any hassle. What you cannot do is *distribute* it unless the aggregate is first placed under the GPL (or a GPL-compatible license) too. If you distribute it without taking that step, that doesn't turn it into a GPL'ed application either -- in that case you've simply (& supposedly) violated the license on P, so your distribution was simply (& supposedly) illegal. And that is in fact the end result that people who knowingly use the GPL want (granting that it appears most people who use the GPL do so unknowing of its consequences). > -- that's why e.g. the libc is distributed under the LGPL which > doesn't have this viral property. You should read RMS on why glibc is under the LGPL: http://www.fsf.org/philosophy/why-not-lgpl.html It will at least disabuse you of the notion that RMS and you are after the same thing . 
From paulp at ActiveState.com Fri Dec 15 05:02:08 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Thu, 14 Dec 2000 20:02:08 -0800 Subject: [Python-Dev] new draft of PEP 227 References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> Message-ID: <3A3997C0.F977AF51@ActiveState.com> Greg Ewing wrote: > > Paul Prescod : > > > We could say that a local can only shadow a global > > if the local is formally declared. > > How do you intend to enforce that? Seems like it would > require a test on every assignment to a local, to make > sure nobody has snuck in a new global since the function > was compiled. I would expect that all of the checks would be at compile-time. Except for __dict__ hackery, I think it is doable. Python already keeps track of all assignments to locals and all assignments to globals in a function scope. The only addition is keeping track of assignments at a global scope. > > Actually, one could argue that there is no good reason to > > even *allow* the shadowing of globals. > > If shadowing were completely disallowed, it would make it > impossible to write a completely self-contained function > whose source could be moved from one environment to another > without danger of it breaking. I wouldn't like the language > to have a characteristic like that. That seems like a very esoteric requirement. How often do you have functions that do not rely *at all* on their environment (other functions, import statements, global variables). When you move code you have to do some rewriting or customizing of the environment in 94% of the cases. How much effort do you want to spend on the other 6%? Also, there are tools that are designed to help you move code without breaking programs (refactoring editors). They can just as easily handle renaming local variables as adding import statements and fixing up function calls. Paul Prescod From mal at lemburg.com Fri Dec 15 11:05:59 2000 From: mal at lemburg.com (M.-A. 
Lemburg) Date: Fri, 15 Dec 2000 11:05:59 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> Message-ID: <3A39ED07.6B3EE68E@lemburg.com> Greg Ewing wrote: > > "M.-A. Lemburg" : > > if an application Q needs a GPLed > > library P to work, then P and Q form a new whole in the sense of > > the GPL. > > I don't see how Q can *need* any particular library P > to work. The most it can need is some library with > an API which is compatible with P's. So I don't > buy that argument. It's the view of the FSF, AFAIK. You can't distribute an application in binary which dynamically links against libreadline (which is GPLed) on the user's machine, since even though you don't distribute libreadline the application running on the user's machine is considered the "whole" in terms of the GPL. FWIW, I don't agree with that view either, but that's probably because I'm a programmer and not a lawyer :) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 15 11:25:12 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 11:25:12 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: Message-ID: <3A39F188.E366B481@lemburg.com> Tim Peters wrote: > > [Tim and MAL talking about the FSF and their views] > > [Tim and MAL showing off as hobby advocates ;-)] > > >>> Another issue: since Python doesn't link Python scripts, is it > >>> still true that if one (pure) Python package is covered by the GPL, > >>> then all other packages needed by that application will also fall > >>> under GPL ? > > > This is very controversial: if an application Q needs a GPLed > > library P to work, then P and Q form a new whole in the sense of > > the GPL. 
And this even though P wasn't even distributed together > > with Q. Don't ask me why, but that's how RMS and folks look at it. > > Understood, but have you reread your question above, which I've said twice I > can't make sense of? I know, it was backwards. Take an example: I have a program which wants to process MP3 files in some way. Now because of some stroke of luck, all Python MP3 modules out there are covered by the GPL. Now I could write an application which uses a certain interface and then tell the user to install the MP3 module separately. As Barry mentioned, this setup will cause distribution of my application to be illegal because I could have only done so by putting the application under the GPL. > You should read RMS on why glibc is under the LGPL: > > http://www.fsf.org/philosophy/why-not-lgpl.html > > It will at least disabuse you of the notion that RMS and you are after the > same thing . :-) Let's stop this discussion and get back to those cheerful things like Christmas Bells and Santa Claus... :-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From amk at mira.erols.com Fri Dec 15 14:27:24 2000 From: amk at mira.erols.com (A.M. Kuchling) Date: Fri, 15 Dec 2000 08:27:24 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE Message-ID: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> unicodeobject.c contains this code:

    PyErr_Format(PyExc_ValueError,
                 "unsupported format character '%c' (0x%x) "
                 "at index %i",
                 c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));

c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits, so '%\u3000' % 1 results in an error message containing "'\000' (0x3000)". Is this worth fixing? I'd say no, since the hex value is more useful for Unicode strings anyway.
(I still wanted to mention this little buglet, since I just touched this bit of code.) --amk From jack at oratrix.nl Fri Dec 15 15:26:15 2000 From: jack at oratrix.nl (Jack Jansen) Date: Fri, 15 Dec 2000 15:26:15 +0100 Subject: [Python-Dev] reuse of address default value (was Re: [Python-checkins] CVS: python/dist/src/Lib SocketServer.py) In-Reply-To: Message by Guido van Rossum , Thu, 14 Dec 2000 09:51:26 -0500 , <200012141451.JAA15637@cj20424-a.reston1.va.home.com> Message-ID: <20001215142616.705993B9B44@snelboot.oratrix.nl> > The reason for the patch is that without this, if you kill a TCP server > and restart it right away, you'll get a 'port in use" error -- TCP has > some kind of strange wait period after a connection is closed before > it can be reused. The patch avoids this error. Well, actually there's a pretty good reason for the "port in use" behaviour: the TCP standard more-or-less requires it. A srchost/srcport/dsthost/dstport combination should not be reused until the maximum TTL has passed, because there may still be "old" retransmissions around. Especially the "open" packets are potentially dangerous. Setting the reuse bit while you're debugging is fine, but setting it in general is not a very good idea... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++ www.oratrix.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm From guido at python.org Fri Dec 15 15:31:19 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 09:31:19 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: Your message of "Thu, 14 Dec 2000 20:02:08 PST." 
<3A3997C0.F977AF51@ActiveState.com> References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> Message-ID: <200012151431.JAA19799@cj20424-a.reston1.va.home.com> > Greg Ewing wrote: > > > > Paul Prescod : > > > > > We could say that a local can only shadow a global > > > if the local is formally declared. > > > > How do you intend to enforce that? Seems like it would > > require a test on every assignment to a local, to make > > sure nobody has snuck in a new global since the function > > was compiled. > > I would expect that all of the checks would be at compile-time. Except > for __dict__ hackery, I think it is doable. Python already keeps track > of all assignments to locals and all assignments to globals in a > function scope. The only addition is keeping track of assignments at a > global scope. > > > > Actually, one could argue that there is no good reason to > > > even *allow* the shadowing of globals. > > > > If shadowing were completely disallowed, it would make it > > impossible to write a completely self-contained function > > whose source could be moved from one environment to another > > without danger of it breaking. I wouldn't like the language > > to have a characteristic like that. > > That seems like a very esoteric requirement. How often do you have > functions that do not rely *at all* on their environment (other > functions, import statements, global variables). > > When you move code you have to do some rewriting or customizing of the > environment in 94% of the cases. How much effort do you want to spend on > the other 6%? Also, there are tools that are designed to help you move > code without breaking programs (refactoring editors). They can just as > easily handle renaming local variables as adding import statements and > fixing up function calls. Can we cut this out please? Paul is misguided. There's no reason to forbid a local shadowing a global. All languages with nested scopes allow this. 
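The behavior being debated can be pinned down in a few lines. A minimal sketch (the function names are illustrative only, not from the thread):

```python
x = 1  # a module-level ("global") name

def shadows():
    x = 2          # assignment makes x local; the global x is untouched
    return x

def rebinds():
    global x       # the explicit declaration opts out of shadowing
    x = 3
    return x

def confusing():
    y = x          # raises UnboundLocalError: the assignment below makes
    x = 4          # x local for the *whole* function body, even up here
    return y

assert shadows() == 2 and x == 1
assert rebinds() == 3 and x == 3
try:
    confusing()
except UnboundLocalError:
    pass  # the trap behind the "confusing ... and error causing" complaint
```

The first two functions show what "assignment is declaration" means in practice; the third is the gotcha the objections center on.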
--Guido van Rossum (home page: http://www.python.org/~guido/) From barry at digicool.com Fri Dec 15 17:17:08 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Fri, 15 Dec 2000 11:17:08 -0500 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> Message-ID: <14906.17412.221040.895357@anthem.concentric.net> >>>>> "M" == M writes: M> It's the view of the FSF, AFAIK. You can't distribute an M> application in binary which dynamically links against M> libreadline (which is GPLed) on the user's machine, since even M> though you don't distribute libreadline the application running M> on the user's machine is considered the "whole" in terms of the M> GPL. M> FWIW, I don't agree with that view either, but that's probably M> because I'm a programmer and not a lawyer :) I'm not sure I agree with that view either, but mostly because there is a non-GPL replacement for parts of the readline API: http://www.cstr.ed.ac.uk/downloads/editline.html Don't know anything about it, so it may not be featureful enough for Python's needs, but if licensing is really a problem, it might be worth looking into. -Barry From paulp at ActiveState.com Fri Dec 15 17:16:37 2000 From: paulp at ActiveState.com (Paul Prescod) Date: Fri, 15 Dec 2000 08:16:37 -0800 Subject: [Python-Dev] new draft of PEP 227 References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com> Message-ID: <3A3A43E5.347AAF6C@ActiveState.com> Guido van Rossum wrote: > > ... > > Can we cut this out please? Paul is misguided. There's no reason to > forbid a local shadowing a global. All languages with nested scopes > allow this. Python is the only one I know of that implicitly shadows without requiring some form of declaration. JavaScript has it right: reading and writing of globals are symmetrical. 
In the rare case that you explicitly want to shadow, you need a declaration. Python's rule is confusing, implicit and error causing. In my opinion, of course. If you are dead-set against explicit declarations then I would say that disallowing the ambiguous construct is better than silently treating it as a declaration. Paul Prescod From guido at python.org Fri Dec 15 17:23:07 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 11:23:07 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: Your message of "Fri, 15 Dec 2000 08:16:37 PST." <3A3A43E5.347AAF6C@ActiveState.com> References: <200012142358.MAA02149@s454.cosc.canterbury.ac.nz> <3A3997C0.F977AF51@ActiveState.com> <200012151431.JAA19799@cj20424-a.reston1.va.home.com> <3A3A43E5.347AAF6C@ActiveState.com> Message-ID: <200012151623.LAA27630@cj20424-a.reston1.va.home.com> > Python is the only one I know of that implicitly shadows without > requiring some form of declaration. JavaScript has it right: reading and > writing of globals are symmetrical. In the rare case that you explicitly > want to shadow, you need a declaration. Python's rule is confusing, > implicit and error causing. In my opinion, of course. If you are > dead-set against explicit declarations then I would say that disallowing > the ambiguous construct is better than silently treating it as a > declaration. Let's agree to differ. This will never change. In Python, assignment is declaration. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Dec 15 18:01:33 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 12:01:33 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE In-Reply-To: Your message of "Fri, 15 Dec 2000 08:27:24 EST." 
<200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> Message-ID: <200012151701.MAA28058@cj20424-a.reston1.va.home.com>

> unicodeobject.c contains this code:
>
>     PyErr_Format(PyExc_ValueError,
>                  "unsupported format character '%c' (0x%x) "
>                  "at index %i",
>                  c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat));
>
> c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits, > so '%\u3000' % 1 results in an error message containing "'\000' > (0x3000)". Is this worth fixing? I'd say no, since the hex value is > more useful for Unicode strings anyway. (I still wanted to mention > this little buglet, since I just touched this bit of code.)

Sounds like the '%c' should just be deleted. --Guido van Rossum (home page: http://www.python.org/~guido/) From bckfnn at worldonline.dk Fri Dec 15 18:05:42 2000 From: bckfnn at worldonline.dk (Finn Bock) Date: Fri, 15 Dec 2000 17:05:42 GMT Subject: [Python-Dev] CWD in sys.path. Message-ID: <3a3a480b.28490597@smtp.worldonline.dk> Hi, I'm trying to understand the initialization of sys.path and especially if CWD is supposed to be included in sys.path by default. (I understand the purpose of sys.path[0], that is not the focus of my question). My setup is Python 2.0 on Win2000, no PYTHONHOME or PYTHONPATH envvars. In this setup, an empty string exists as sys.path[1], but I'm unsure if this is by careful design or some freak accident. The empty entry is added because HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath does *not* have any subkey. There is a default value, but that value appears to be ignored. If I add a subkey "foo": HKEY_LOCAL_MACHINE\SOFTWARE\Python\PythonCore\2.0\PythonPath\foo with a default value of "d:\foo", the CWD is no longer in sys.path.
i:\java\jython.cvs\org\python\util>d:\Python20\python.exe -S
Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', 'd:\\foo', 'D:\\PYTHON20\\DLLs', 'D:\\PYTHON20\\lib', 'D:\\PYTHON20\\lib\\plat-win', 'D:\\PYTHON20\\lib\\lib-tk', 'D:\\PYTHON20']
>>>

I noticed that some of the PYTHONPATH macros in PC/config.h include the '.', others do not. So, to put it as a question (for jython): Should CWD be included in sys.path? Are there situations (like embedding) where CWD shouldn't be in sys.path? regards, finn From guido at python.org Fri Dec 15 18:12:03 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 12:12:03 -0500 Subject: [Python-Dev] CWD in sys.path. In-Reply-To: Your message of "Fri, 15 Dec 2000 17:05:42 GMT." <3a3a480b.28490597@smtp.worldonline.dk> References: <3a3a480b.28490597@smtp.worldonline.dk> Message-ID: <200012151712.MAA02544@cj20424-a.reston1.va.home.com> On Unix, CWD is not in sys.path except as sys.path[0]. --Guido van Rossum (home page: http://www.python.org/~guido/) From moshez at zadka.site.co.il Sat Dec 16 02:43:41 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Sat, 16 Dec 2000 03:43:41 +0200 (IST) Subject: [Python-Dev] new draft of PEP 227 Message-ID: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il> On Fri, 15 Dec 2000 08:16:37 -0800, Paul Prescod wrote: > Python is the only one I know of that implicitly shadows without > requiring some form of declaration. Perl and Scheme permit implicit shadowing too. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tismer at tismer.com Fri Dec 15 17:42:18 2000 From: tismer at tismer.com (Christian Tismer) Date: Fri, 15 Dec 2000 18:42:18 +0200 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL (Splitting up _cursesmodule) References: Message-ID: <3A3A49EA.5D9418E@tismer.com> Tim Peters wrote: ...
> > Another issue: since Python doesn't link Python scripts, is it > > still true that if one (pure) Python package is covered by the GPL, > > then all other packages needed by that application will also fall > > under GPL ? > > Sorry, couldn't make sense of the question. Just as well, since you should > ask about it on a GNU forum anyway . The GNU license is transitive. It automatically extends to other parts of a project, unless they are identifiable, independent developments. As soon as a couple of modules are published together, based upon one GPL-ed module, this propagates. I think this is what MAL meant? Anyway, I'd be interested to hear what the GNU forum says. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From amk at mira.erols.com Fri Dec 15 19:10:34 2000 From: amk at mira.erols.com (A.M. Kuchling) Date: Fri, 15 Dec 2000 13:10:34 -0500 Subject: [Python-Dev] What to do about PEP 229? Message-ID: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> I began writing the fabled fancy setup script described in PEP 229, and then realized there was duplication going on here. The code in setup.py would need to know what libraries, #defines, &c., are needed by each module in order to check if they're needed and set them. But if Modules/Setup can be used to override setup.py's behaviour, then much of this information would need to be in that file, too; the details of compiling a module are in two places. Possibilities: 1) Setup contains fully-loaded module descriptions, and the setup script drops unneeded bits. For example, the socket module requires -lnsl on some platforms.
The Setup file would contain "socket socketmodule.c -lnsl" on all platforms, and setup.py would check for an nsl library and only use if it's there. This seems dodgy to me; what if -ldbm is needed on one platform and -lndbm on another? 2) Drop setup completely and just maintain setup.py, with some different overriding mechanism. This is more radical. Adding a new module is then not just a matter of editing a simple text file; you'd have to modify setup.py, making it more like maintaining an autoconf script. Remember, the underlying goal of PEP 229 is to have the out-of-the-box Python installation you get from "./configure;make" contain many more useful modules; right now you wouldn't get zlib, syslog, resource, any of the DBM modules, PyExpat, &c. I'm not wedded to using Distutils to get that, but think that's the only practical way; witness the hackery required to get the DB module automatically compiled. You can also wave your hands in the direction of packagers such as ActiveState or Red Hat, and say "let them make to compile everything". But this problem actually inconveniences *me*, since I always build Python myself and have to extensively edit Setup, so I'd like to fix the problem. Thoughts? --amk From nas at arctrix.com Fri Dec 15 13:03:04 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Fri, 15 Dec 2000 04:03:04 -0800 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <14906.17412.221040.895357@anthem.concentric.net>; from barry@digicool.com on Fri, Dec 15, 2000 at 11:17:08AM -0500 References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> Message-ID: <20001215040304.A22056@glacier.fnational.com> On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. 
Warsaw wrote: > I'm not sure I agree with that view either, but mostly because there > is a non-GPL replacement for parts of the readline API: > > http://www.cstr.ed.ac.uk/downloads/editline.html It doesn't work with the current readline module. It is much smaller than readline and works just as well in my experience. Would there be any interest in including a copy with the standard distribution? The license is quite nice (X11 type). Neil From nas at arctrix.com Fri Dec 15 13:14:50 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Fri, 15 Dec 2000 04:14:50 -0800 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012151509.HAA18093@slayer.i.sourceforge.net>; from gvanrossum@users.sourceforge.net on Fri, Dec 15, 2000 at 07:09:46AM -0800 References: <200012151509.HAA18093@slayer.i.sourceforge.net> Message-ID: <20001215041450.B22056@glacier.fnational.com> On Fri, Dec 15, 2000 at 07:09:46AM -0800, Guido van Rossum wrote: > Update of /cvsroot/python/python/dist/src/Lib > In directory slayer.i.sourceforge.net:/tmp/cvs-serv18082 > > Modified Files: > httplib.py > Log Message: > Get rid of string functions. Can you explain the logic behind this recent interest in removing string functions from the standard library? Is it performance? Some unicode issue? I don't have a great attachment to string.py but I also don't see the justification for the amount of work it requires. Neil From guido at python.org Fri Dec 15 20:29:37 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 14:29:37 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Fri, 15 Dec 2000 04:14:50 PST." <20001215041450.B22056@glacier.fnational.com> References: <200012151509.HAA18093@slayer.i.sourceforge.net> <20001215041450.B22056@glacier.fnational.com> Message-ID: <200012151929.OAA03073@cj20424-a.reston1.va.home.com> > Can you explain the logic behind this recent interest in removing > string functions from the standard library?
Is it performance? > Some unicode issue? I don't have a great attachment to string.py > but I also don't see the justification for the amount of work it > requires. I figure that at *some* point we should start putting our money where our mouth is, deprecate most uses of the string module, and start warning about it. Not in 2.1 probably, given my experience below. As a realistic test of the warnings module I played with some warnings about the string module, and then found that, say, most of the std library modules use it, triggering an extraordinary amount of warnings. I then decided to experiment with the conversion. I quickly found out it's too much work to do manually, so I'll hold off until someone comes up with a tool that does 99% of the work. (The selection of std library modules to convert manually was triggered by something pretty random -- I decided to silence a particular cron job I was running. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From Barrett at stsci.edu Fri Dec 15 20:32:10 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Fri, 15 Dec 2000 14:32:10 -0500 (EST) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> Message-ID: <14906.17712.830224.481130@nem-srvr.stsci.edu> Guido, Here are my comments on PEP 207. (I've also gone back and read most of the 1998 discussion. What a tedious, in terms of time, but enlightening, in terms of content, discussion that was.) | - New function: | | PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op) | | This performs the requested rich comparison, returning a Python | object or raising an exception. The 3rd argument must be one of | LT, LE, EQ, NE, GT or GE. I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc.
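Since the function quoted above returns "a Python object or raising an exception", a comparison no longer has to produce a boolean at all. A minimal sketch of what that permits, using a hypothetical element-wise vector type (illustrative only, not from the PEP):

```python
class Vec:
    """Hypothetical container whose __lt__ compares element-wise."""
    def __init__(self, items):
        self.items = list(items)

    def __lt__(self, other):
        # Returns a list of flags, one per element pair, instead of a
        # single boolean -- the kind of result rich comparisons allow.
        return [a < b for a, b in zip(self.items, other.items)]

flags = Vec([1, 5, 3]) < Vec([2, 4, 6])
assert flags == [True, False, True]
```

This is exactly the numeric-array use case raised later in the thread: the `<` expression yields per-element results rather than a single truth value.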
| Classes | | - Classes can define new special methods __lt__, __le__, __gt__, | __ge__, __eq__, __ne__ to override the corresponding operators. | (You gotta love the Fortran heritage.) If a class overrides | __cmp__ as well, it is only used by PyObject_Compare(). Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to __lt__, __le__, __eq__, etc. I'm not keen on the FORTRAN derived symbolism. I also find it contrary to Python's heritage of being clear and concise. I don't mind typing __lessequal__ (or __less_equal__) once per class for the additional clarity. | - Should we even bother upgrading the existing types? Isn't this question partly related to the coercion issue and which type of comparison takes precedence? And if so, then I would think the answer would be 'yes'. Or better still see below my suggestion of adding poor and rich comparison operators along with matrix-type operators. - If so, how should comparisons on container types be defined? Suppose we have a list whose items define rich comparisons. How should the itemwise comparisons be done? For example:

    def __lt__(a, b): # a < b
        for i in range(min(len(a), len(b))):
            ai, bi = a[i], b[i]
            if ai < bi: return 1
            if ai == bi: continue
            if ai > bi: return 0
            raise TypeError, "incomparable item types"
        return len(a) < len(b)

This uses the same sequence of comparisons as cmp(), so it may as well use cmp() instead:

    def __lt__(a, b): # a < b
        for i in range(min(len(a), len(b))):
            c = cmp(a[i], b[i])
            if c < 0: return 1
            if c == 0: continue
            if c > 0: return 0
            assert 0 # unreachable
        return len(a) < len(b)

And now there's not really a reason to change lists to rich comparisons. I don't understand this example. If a[i] and b[i] define rich comparisons, then 'a[i] < b[i]' is likely to return a non-boolean value. Yet the 'if' statement expects a boolean value. I don't see how the above example will work. This example also makes me think that the proposals for new operators (ie. PEP 211 and 225) are a good idea. The discussion of rich comparisons in 1998 also lends some support to this.
I can see many uses for two types of comparison operators (as well as the proposed matrix-type operators), one set for poor or boolean comparisons and one for rich or non-boolean comparisons. For example, numeric arrays can define both. Rich comparison operators would return an array of boolean values, while poor comparison operators return a boolean value by performing an implied 'and.reduce' operation. These operators provide clarity and conciseness, without much change to current Python behavior. -- Paul From guido at python.org Fri Dec 15 20:51:04 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 14:51:04 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Fri, 15 Dec 2000 14:32:10 EST." <14906.17712.830224.481130@nem-srvr.stsci.edu> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> Message-ID: <200012151951.OAA03219@cj20424-a.reston1.va.home.com> > Here are my comments on PEP 207. (I've also gone back and read most > of the 1998 discussion. What a tedious, in terms of time, but > enlightening, in terms of content, discussion that was.) > > | - New function: > | > | PyObject *PyObject_RichCompare(PyObject *, PyObject *, enum cmp_op) > | > | This performs the requested rich comparison, returning a Python > | object or raising an exception. The 3rd argument must be one of > | LT, LE, EQ, NE, GT or GE. > > I'd much prefer '<', '<=', '=', etc. to LT, LE, EQ, etc. This is only at the C level. Having to do a string compare is too slow. Since some of these are multi-character symbols, a character constant doesn't suffice (multi-character character constants are not portable). > | Classes > | > | - Classes can define new special methods __lt__, __le__, __gt__, > | __ge__, __eq__, __ne__ to override the corresponding operators. > | (You gotta love the Fortran heritage.) If a class overrides > | __cmp__ as well, it is only used by PyObject_Compare(). 
> > Likewise, I'd prefer __less__, __lessequal__, __equal__, etc. to > __lt__, __le__, __eq__, etc. I'm not keen on the FORTRAN derived > symbolism. I also find it contrary to Python's heritage of being > clear and concise. I don't mind typing __lessequal__ (or > __less_equal__) once per class for the additional clarity. I don't care about Fortran, but you just showed why I think the short operator names are better: there's less guessing or disagreement about how they are to be spelled. E.g. should it be __lessthan__ or __less_than__ or __less__? > | - Should we even bother upgrading the existing types? > > Isn't this question partly related to the coercion issue and which > type of comparison takes precedence? And if so, then I would think > the answer would be 'yes'. It wouldn't make much of a difference -- comparisons between different types of numbers would get the same outcome either way. > Or better still see below my suggestion of > adding poor and rich comparison operators along with matrix-type > operators. > > > - If so, how should comparisons on container types be defined? > Suppose we have a list whose items define rich comparisons. How > should the itemwise comparisons be done? For example:
>
>     def __lt__(a, b): # a < b
>         for i in range(min(len(a), len(b))):
>             ai, bi = a[i], b[i]
>             if ai < bi: return 1
>             if ai == bi: continue
>             if ai > bi: return 0
>             raise TypeError, "incomparable item types"
>         return len(a) < len(b)
>
> This uses the same sequence of comparisons as cmp(), so it may as well use cmp() instead:
>
>     def __lt__(a, b): # a < b
>         for i in range(min(len(a), len(b))):
>             c = cmp(a[i], b[i])
>             if c < 0: return 1
>             if c == 0: continue
>             if c > 0: return 0
>             assert 0 # unreachable
>         return len(a) < len(b)
>
> And now there's not really a reason to change lists to rich > comparisons. > > I don't understand this example. If a[i] and b[i] define rich > comparisons, then 'a[i] < b[i]' is likely to return a non-boolean > value.
Yet the 'if' statement expects > a boolean value. I don't see > how the above example will work. Sorry. I was thinking of list items that contain objects that respond to the new overloading protocol, but still return Boolean outcomes. My conclusion is that __cmp__ is just as well. > This example also makes me think that the proposals for new operators > (ie. PEP 211 and 225) are a good idea. The discussion of rich > comparisons in 1998 also lends some support to this. I can see many > uses for two types of comparison operators (as well as the proposed > matrix-type operators), one set for poor or boolean comparisons and > one for rich or non-boolean comparisons. For example, numeric arrays > can define both. Rich comparison operators would return an array of > boolean values, while poor comparison operators return a boolean value > by performing an implied 'and.reduce' operation. These operators > provide clarity and conciseness, without much change to current Python > behavior. Maybe. That can still be decided later. Right now, adding operators is not on the table for 2.1 (if only because there are two conflicting PEPs); adding rich comparisons *is* on the table because it doesn't change the parser (and because the rich comparisons idea was already pretty much worked out two years ago). --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at home.com Fri Dec 15 22:08:02 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 15 Dec 2000 16:08:02 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012151929.OAA03073@cj20424-a.reston1.va.home.com> Message-ID: [Neil Schemenauer] > Can you explain the logic behind this recent interest in removing > string functions from the standard library? Is it performance? > Some unicode issue? I don't have a great attachment to string.py > but I also don't see the justification for the amount of work it > requires.
[Guido] > I figure that at *some* point we should start putting our money where > our mouth is, deprecate most uses of the string module, and start > warning about it. Not in 2.1 probably, given my experience below. I think this begs Neil's questions: *is* our mouth there, and if so, why? The only public notice of impending string module deprecation anyone came up with was a vague note on the 1.6 web page, and one not repeated in any of the 2.0 release material. "string" is right up there with "os" and "sys" as a FIM (Frequently Imported Module), so the required code changes will be massive. As a user, I don't see what's in it for me to endure that pain: the string module functions work fine! Neither are they warts in the language, any more than that we say sin(pi) instead of pi.sin(). Keeping the functions around doesn't hurt anybody that I can see. > As a realistic test of the warnings module I played with some warnings > about the string module, and then found that, say, most of the std > library modules use it, triggering an extraordinary amount of > warnings. I then decided to experiment with the conversion. I > quickly found out it's too much work to do manually, so I'll hold off > until someone comes up with a tool that does 99% of the work. Ah, so that's the *easy* way to kill this crusade -- forget I said anything. From Barrett at stsci.edu Fri Dec 15 22:20:20 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Fri, 15 Dec 2000 16:20:20 -0500 (EST) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <200012151951.OAA03219@cj20424-a.reston1.va.home.com> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> <200012151951.OAA03219@cj20424-a.reston1.va.home.com> Message-ID: <14906.33325.5784.118110@nem-srvr.stsci.edu> >> This example also makes me think that the proposals for new operators >> (ie. PEP 211 and 225) are a good idea.
The discussion of rich >> comparisons in 1998 also lends some support to this. I can see many >> uses for two types of comparison operators (as well as the proposed >> matrix-type operators), one set for poor or boolean comparisons and >> one for rich or non-boolean comparisons. For example, numeric arrays >> can define both. Rich comparison operators would return an array of >> boolean values, while poor comparison operators return a boolean value >> by performing an implied 'and.reduce' operation. These operators >> provide clarity and conciseness, without much change to current Python >> behavior. > > Maybe. That can still be decided later. Right now, adding operators > is not on the table for 2.1 (if only because there are two conflicting > PEPs); adding rich comparisons *is* on the table because it doesn't > change the parser (and because the rich comparisons idea was already > pretty much worked out two years ago). Yes, it was worked out previously _assuming_ rich comparisons do not use any new operators. But let's stop for a moment and contemplate adding rich comparisons along with new comparison operators. What do we gain? 1. The current boolean operator behavior does not have to change, and hence will be backward compatible. 2. It eliminates the need to decide whether or not rich comparisons take precedence over boolean comparisons. 3. The new operators add additional behavior without directly impacting current behavior and the use of them is unambiguous, at least in relation to current Python behavior. You know by the operator what type of comparison will be returned. This should appease Jim Fulton, based on his arguments in 1998 about comparison operators always returning a boolean value. 4. Compound objects, such as lists, could implement both rich and boolean comparisons. The boolean comparison would remain as is, while the rich comparison would return a list of boolean values.
Current behavior doesn't change; just a new feature, which you may or may not choose to use, is added. If we go one step further and add the matrix-style operators along with the comparison operators, we can provide a consistent user interface to array/complex operations without changing current Python behavior. If a user has no need for these new operators, he doesn't have to use them or even know about them. All we've done is made Python richer, but I believe without making it more complex. For example, all element-wise operations could have a ':' appended to them, e.g. '+:', '<:', etc.; and will define element-wise addition, element-wise less-than, etc. The traditional '*', '/', etc. operators can then be used for matrix operations, which will appease the Matlab people. Therefore, I don't think rich comparisons and matrix-type operators should be considered separable. I really think you should consider this suggestion. It appeases many groups while providing a consistent and clear user interface, without greatly impacting current Python behavior. Always-causing-havoc-at-the-last-moment-ly Yours, Paul -- Dr. Paul Barrett Space Telescope Science Institute Phone: 410-338-4475 ESS/Science Software Group FAX: 410-338-4767 Baltimore, MD 21218 From guido at python.org Fri Dec 15 22:23:46 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 16:23:46 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Fri, 15 Dec 2000 16:08:02 EST." References: Message-ID: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> > "string" is right up there with "os" and "sys" as a FIM (Frequently > Imported Module), so the required code changes will be massive. As > a user, I don't see what's in it for me to endure that pain: the > string module functions work fine! Neither are they warts in the > language, any more than that we say sin(pi) instead of pi.sin(). > Keeping the functions around doesn't hurt anybody that I can see. Hm.
I'm not saying that this one will be easy. But I don't like having "two ways to do it". It means more learning, etc. (you know the drill). We could have chosen to make the strop module support Unicode; instead, we chose to give string objects methods and promote the use of those methods instead of the string module. (And in a generous mood, we also supported Unicode in the string module -- by providing wrappers that invoke string methods.) If you're saying that we should give users ample time for the transition, I'm with you. If you're saying that you think the string module is too prominent to ever start deprecating its use, I'm afraid we have a problem. I'd also like to note that using the string module's wrappers incurs the overhead of a Python function call -- using string methods is faster. Finally, I like the look of fields[i].strip().lower() much better than that of string.lower(string.strip(fields[i])) -- an actual example from mimetools.py. Ideally, I would like to deprecate the entire string module, so that I can place a single warning at its top. This will cause a single warning to be issued for programs that still use it (no matter how many times it is imported). Unfortunately, there are a couple of things that still need it: string.letters etc., and string.maketrans(). --Guido van Rossum (home page: http://www.python.org/~guido/) From gvwilson at nevex.com Fri Dec 15 22:43:47 2000 From: gvwilson at nevex.com (Greg Wilson) Date: Fri, 15 Dec 2000 16:43:47 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <14906.33325.5784.118110@nem-srvr.stsci.edu> Message-ID: <002901c066e0$1b3f13c0$770a0a0a@nevex.com> Hi, Paul; thanks for your mail. W.r.t. adding matrix operators to Python, you may want to take a look at the counter-arguments in PEP 0211 (attached). Basically, I spoke with the authors of GNU Octave (a GPL'd clone of MATLAB) about what users really used. 
They felt that the only matrix operator that really mattered was matrix-matrix multiply; other operators (including the left and right division operators that even experienced MATLAB users often mix up) were second order at best, and were better handled with methods or functions. Thanks, Greg p.s. PEP 0225 (also attached) is an alternative to PEP 0211 which would add most of the MATLAB-ish operators to Python. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pep-0211.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: pep-0225.txt URL: From guido at python.org Fri Dec 15 22:55:46 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 16:55:46 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Fri, 15 Dec 2000 16:20:20 EST." <14906.33325.5784.118110@nem-srvr.stsci.edu> References: <200012071748.MAA26523@cj20424-a.reston1.va.home.com> <14906.17712.830224.481130@nem-srvr.stsci.edu> <200012151951.OAA03219@cj20424-a.reston1.va.home.com> <14906.33325.5784.118110@nem-srvr.stsci.edu> Message-ID: <200012152155.QAA03879@cj20424-a.reston1.va.home.com> > > Maybe. That can still be decided later. Right now, adding operators > > is not on the table for 2.1 (if only because there are two conflicting > > PEPs); adding rich comparisons *is* on the table because it doesn't > > change the parser (and because the rich comparisons idea was already > > pretty much worked out two years ago). > > Yes, it was worked out previously _assuming_ rich comparisons do not > use any new operators. > > But let's stop for a moment and contemplate adding rich comparisons > along with new comparison operators. What do we gain? > > 1. The current boolean operator behavior does not have to change, and > hence will be backward compatible. What incompatibility do you see in the current proposal? > 2. 
It eliminates the need to decide whether or not rich comparisons > take precedence over boolean comparisons. Only if you want different semantics -- that's only an issue for NumPy. > 3. The new operators add additional behavior without directly impacting > current behavior and the use of them is unambiguous, at least in > relation to current Python behavior. You know by the operator what > type of comparison will be returned. This should appease Jim > Fulton, based on his arguments in 1998 about comparison operators > always returning a boolean value. As you know, I'm now pretty close to Jim. :-) He seemed pretty mellow about this now. > 4. Compound objects, such as lists, could implement both rich > and boolean comparisons. The boolean comparison would remain as > is, while the rich comparison would return a list of boolean > values. Current behavior doesn't change; just a new feature, which > you may or may not choose to use, is added. > > If we go one step further and add the matrix-style operators along > with the comparison operators, we can provide a consistent user > interface to array/complex operations without changing current Python > behavior. If a user has no need for these new operators, he doesn't > have to use them or even know about them. All we've done is made > Python richer, but I believe without making it more complex. For > example, all element-wise operations could have a ':' appended to > them, e.g. '+:', '<:', etc.; and will define element-wise addition, > element-wise less-than, etc. The traditional '*', '/', etc. operators > can then be used for matrix operations, which will appease the Matlab > people. > > Therefore, I don't think rich comparisons and matrix-type operators > should be considered separable. I really think you should consider > this suggestion. It appeases many groups while providing a consistent > and clear user interface, without greatly impacting current Python > behavior.
> > Always-causing-havoc-at-the-last-moment-ly Yours, I think you misunderstand. Rich comparisons are mostly about allowing the separate overloading of <, <=, ==, !=, >, and >=. This is useful in its own light. If you don't want to use this overloading facility for elementwise comparisons in NumPy, that's fine with me. Nobody says you have to -- it's just that you *could*. Read my lips: there won't be *any* new operators in 2.1. There will be a better way to overload the existing Boolean operators, and they will be able to return non-Boolean results. That's useful in other situations besides NumPy. Feel free to lobby for elementwise operators -- but based on the discussion about this subject so far, I don't give it much of a chance even past Python 2.1. They would add a lot of baggage to the language (e.g. the table of operators in all Python books would be about twice as long) and by far most users don't care about them. (Read the intro to 211 for some of the concerns -- this PEP tries to make the addition palatable by adding exactly *one* new operator.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Dec 15 23:16:34 2000 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Dec 2000 17:16:34 -0500 Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: Your message of "Fri, 08 Dec 2000 17:58:03 EST." <200012082258.RAA02389@cj20424-a.reston1.va.home.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> Message-ID: <200012152216.RAA11098@cj20424-a.reston1.va.home.com> I've checked in the essential parts of the warnings PEP, and closed the SF patch. I haven't checked in the examples in the patch -- it's too early for that. But I figured that it's easier to revise the code once it's checked in. I'm pretty confident that it works as advertised.
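[Editor's note: the checked-in PEP 230 machinery can be exercised with a short sketch using the warnings module as it eventually shipped. The old_spam function is a hypothetical example; catch_warnings(record=True) is a later stdlib addition, used here only to make the sketch self-contained and testable.]

```python
import warnings

def old_spam():
    # Hypothetical deprecated function demonstrating warnings.warn();
    # stacklevel=2 attributes the warning to the caller, not to old_spam.
    warnings.warn("old_spam() is deprecated", DeprecationWarning, stacklevel=2)
    return 42

# Record the warning instead of printing it, so we can inspect it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # ensure the warning isn't filtered out
    result = old_spam()

print(caught[0].category.__name__)    # -> DeprecationWarning
```

The filter machinery (simplefilter, the -W command-line option) is what lets whole categories of warnings be silenced or turned into errors, which is exactly why a single module-level warning would suffice for deprecating string.py.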
Still missing is documentation: the warnings module, the new API functions, and the new command line option should all be documented. I'll work on that over the holidays. I consider the PEP done. --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Fri Dec 15 23:21:24 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:21:24 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> Message-ID: <3A3A9964.A6B3DD11@lemburg.com> Neil Schemenauer wrote: > > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote: > > I'm not sure I agree with that view either, but mostly because there > > is a non-GPL replacement for parts of the readline API: > > > > http://www.cstr.ed.ac.uk/downloads/editline.html > > It doesn't work with the current readline module. It is much > smaller than readline and works just as well in my experience. > Would there be any interest in including a copy with the standard > distribution? The license is quite nice (X11 type). +1 from here -- line editing is simply a very important part of an interactive prompt and readline is not only big, slow and full of strange surprises, but also GPLed ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 15 23:24:34 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:24:34 +0100 Subject: [Python-Dev] Use of %c and Py_UNICODE References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> Message-ID: <3A3A9A22.E9BA9551@lemburg.com> "A.M. 
Kuchling" wrote: > > unicodeobject.c contains this code: > > PyErr_Format(PyExc_ValueError, > "unsupported format character '%c' (0x%x) " > "at index %i", > c, c, fmt -1 - PyUnicode_AS_UNICODE(uformat)); > > c is a Py_UNICODE; applying C's %c to it only takes the lowest 8 bits, > so '%\u3000' % 1 results in an error message containing "'\000' > (0x3000)". Is this worth fixing? I'd say no, since the hex value is > more useful for Unicode strings anyway. (I still wanted to mention > this little buglet, since I just touched this bit of code.) Why would you want to fix it ? Format characters will always be ASCII and thus 7-bit -- there's really no need to expand the set of possibilities beyond 8 bits ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From fdrake at acm.org Fri Dec 15 23:22:34 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 15 Dec 2000 17:22:34 -0500 (EST) Subject: [Python-Dev] Warning Framework (PEP 230) In-Reply-To: <200012152216.RAA11098@cj20424-a.reston1.va.home.com> References: <200012071754.MAA26557@cj20424-a.reston1.va.home.com> <200012082258.RAA02389@cj20424-a.reston1.va.home.com> <200012152216.RAA11098@cj20424-a.reston1.va.home.com> Message-ID: <14906.39338.795843.947683@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > Still missing is documentation: the warnings module, the new API > functions, and the new command line option should all be documented. > I'll work on that over the holidays. I've assigned a bug to you in case you forget. I've given it a "show-stopper" priority level, so I'll feel good ripping the code out if you don't get docs written in time. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From mal at lemburg.com Fri Dec 15 23:39:18 2000 From: mal at lemburg.com (M.-A.
Lemburg) Date: Fri, 15 Dec 2000 23:39:18 +0100 Subject: [Python-Dev] What to do about PEP 229? References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> Message-ID: <3A3A9D96.80781D61@lemburg.com> "A.M. Kuchling" wrote: > > I began writing the fabled fancy setup script described in PEP 229, > and then realized there was duplication going on here. The code in > setup.py would need to know what libraries, #defines, &c., are needed > by each module in order to check if they're needed and set them. But > if Modules/Setup can be used to override setup.py's behaviour, then > much of this information would need to be in that file, too; the > details of compiling a module are in two places. > > Possibilities: > > 1) Setup contains fully-loaded module descriptions, and the setup > script drops unneeded bits. For example, the socket module > requires -lnsl on some platforms. The Setup file would contain > "socket socketmodule.c -lnsl" on all platforms, and setup.py would > check for an nsl library and only use it if it's there. > > This seems dodgy to me; what if -ldbm is needed on one platform and > -lndbm on another? Can't distutils try both and then settle for the working combination ? [distutils isn't really ready for auto-configure yet, but Greg has already provided most of the needed functionality -- it's just not well integrated into the rest of the build process in version 1.0.1 ... BTW, where is Greg ? I haven't heard from him in quite a while.] > 2) Drop setup completely and just maintain setup.py, with some > different overriding mechanism. This is more radical. Adding a > new module is then not just a matter of editing a simple text file; > you'd have to modify setup.py, making it more like maintaining an > autoconf script. Why not parse Setup and use it as input to distutils setup.py ?
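[Editor's note: the "parse Setup and feed it to setup.py" idea suggested above can be sketched as follows. This is a hypothetical illustration, not code from the thread; parse_setup_line is an invented helper, and the line format is the classic Modules/Setup one quoted earlier ("socket socketmodule.c -lnsl").]

```python
def parse_setup_line(line):
    """Split a classic Modules/Setup line into (name, sources, libraries),
    the pieces a Distutils Extension() would need."""
    parts = line.split()
    name, rest = parts[0], parts[1:]
    sources = [p for p in rest if p.endswith('.c')]
    libraries = [p[2:] for p in rest if p.startswith('-l')]
    return name, sources, libraries

print(parse_setup_line("socket socketmodule.c -lnsl"))
# -> ('socket', ['socketmodule.c'], ['nsl'])
```

The tuple maps directly onto a Distutils extension description, which is essentially what PEP 229's setup.py ended up constructing by hand; the problem AMK raises is that the comments in Setup ("Uncomment the following line for Solaris") carry information no such parser can recover.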
> Remember, the underlying goal of PEP 229 is to have the out-of-the-box > Python installation you get from "./configure;make" contain many more > useful modules; right now you wouldn't get zlib, syslog, resource, any > of the DBM modules, PyExpat, &c. I'm not wedded to using Distutils to > get that, but think that's the only practical way; witness the hackery > required to get the DB module automatically compiled. > > You can also wave your hands in the direction of packagers such as > ActiveState or Red Hat, and say "let them make the effort to compile everything". > But this problem actually inconveniences *me*, since I always build > Python myself and have to extensively edit Setup, so I'd like to fix > the problem. > > Thoughts? Nice idea :-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 15 23:44:15 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:44:15 +0100 Subject: [Python-Dev] Death to string functions! References: <200012151509.HAA18093@slayer.i.sourceforge.net> <20001215041450.B22056@glacier.fnational.com> <200012151929.OAA03073@cj20424-a.reston1.va.home.com> Message-ID: <3A3A9EBF.3F9306B6@lemburg.com> Guido van Rossum wrote: > > > Can you explain the logic behind this recent interest in removing > > string functions from the standard library? Is it performance? > > Some unicode issue? I don't have a great attachment to string.py > > but I also don't see the justification for the amount of work it > > requires. > > I figure that at *some* point we should start putting our money where > our mouth is, deprecate most uses of the string module, and start > warning about it. Not in 2.1 probably, given my experience below.
> > As a realistic test of the warnings module I played with some warnings > about the string module, and then found that, say, most of the std > library modules use it, triggering an extraordinary amount of > warnings. I then decided to experiment with the conversion. I > quickly found out it's too much work to do manually, so I'll hold off > until someone comes up with a tool that does 99% of the work. This would also help a lot of programmers out there who are stuck with 100k LOCs of Python code using string.py ;) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal at lemburg.com Fri Dec 15 23:49:01 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Fri, 15 Dec 2000 23:49:01 +0100 Subject: [Python-Dev] Death to string functions! References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> Message-ID: <3A3A9FDD.E6F021AF@lemburg.com> Guido van Rossum wrote: > > Ideally, I would like to deprecate the entire string module, so that I > can place a single warning at its top. This will cause a single > warning to be issued for programs that still use it (no matter how > many times it is imported). Unfortunately, there are a couple of > things that still need it: string.letters etc., and > string.maketrans(). Can't we come up with a module similar to unicodedata[.py] ? string.py could then still provide the interfaces, but the implementation would live in stringdata.py [Perhaps we won't need stringdata by then...
Unicode will have taken over and the discussion be moot ;-)] -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From thomas at xs4all.net Fri Dec 15 23:54:25 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 15 Dec 2000 23:54:25 +0100 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: <20001215040304.A22056@glacier.fnational.com>; from nas@arctrix.com on Fri, Dec 15, 2000 at 04:03:04AM -0800 References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> Message-ID: <20001215235425.A29681@xs4all.nl> On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote: > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote: > > I'm not sure I agree with that view either, but mostly because there > > is a non-GPL replacement for parts of the readline API: > > > > http://www.cstr.ed.ac.uk/downloads/editline.html > > It doesn't work with the current readline module. It is much > smaller than readline and works just as well in my experience. > Would there be any interest in including a copy with the standard > distribution? The license is quite nice (X11 type). Definitely +1 from here. Readline reminds me of the cold war, for some reason. (Actually, multiple reasons ;) I don't have time to do it myself, unfortunately, or I would. (Looking at editline has been on my TODO list for a while... :P) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From martin at loewis.home.cs.tu-berlin.de Sat Dec 16 13:32:30 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v.
Loewis) Date: Sat, 16 Dec 2000 13:32:30 +0100 Subject: [Python-Dev] PEP 226 Message-ID: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> I remember earlier discussion on the Python 2.1 release schedule, and never managed to comment on those. I believe that Python contributors and maintainers did an enormous job in releasing Python 2, which took quite some time from everybody's life. I think it is unrealistic to expect the same amount of commitment for the next release, especially if that release appears just a few months after the previous release (that is, one month from now). So I'd like to ask the release manager to take that into account. I'm not quite sure what kind of action I expect; possible alternatives are: - declare 2.1 a pure bug fix release only; with a minimal set of new features. In particular, don't push for completion of PEPs; everybody should then accept that most features that are currently discussed will appear in Python 2.2. - move the schedule for Python 2.1 back (or is it forward?) by, say, a few months. This will give people some time to do the things that did not get the right amount of attention during the 2.0 release, and will still allow working on new and interesting features. Just my 0.02EUR, Martin From guido at python.org Sat Dec 16 17:38:28 2000 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Dec 2000 11:38:28 -0500 Subject: [Python-Dev] PEP 226 In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100." <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> Message-ID: <200012161638.LAA13888@cj20424-a.reston1.va.home.com> > I remember earlier discussion on the Python 2.1 release schedule, and > never managed to comment on those. > > I believe that Python contributors and maintainers did an enormous > job in releasing Python 2, which took quite some time from everybody's > life.
I think it is unrealistic to expect the same amount of > commitment for the next release, especially if that release appears > just a few months after the previous release (that is, one month from > now). > > So I'd like to ask the release manager to take that into > account. I'm not quite sure what kind of action I expect; possible > alternatives are: > - declare 2.1 a pure bug fix release only; with a minimal set of new > features. In particular, don't push for completion of PEPs; everybody > should then accept that most features that are currently discussed > will appear in Python 2.2. > > - move the schedule for Python 2.1 back (or is it forward?) by, say, a > few months. This will give people some time to do the things that did > not get the right amount of attention during the 2.0 release, and will > still allow working on new and interesting features. > > Just my 0.02EUR, You're right -- 2.0 (including 1.6) was a monumental effort, and I'm grateful to all who contributed. I don't expect that 2.1 will be anywhere near the same amount of work! Let's look at what's on the table.

    0042                Small Feature Requests            Hylton
 SD 205  pep-0205.txt   Weak References                   Drake
 S  207  pep-0207.txt   Rich Comparisons                  Lemburg, van Rossum
 S  208  pep-0208.txt   Reworking the Coercion Model      Schemenauer
 S  217  pep-0217.txt   Display Hook for Interactive Use  Zadka
 S  222  pep-0222.txt   Web Library Enhancements          Kuchling
 I  226  pep-0226.txt   Python 2.1 Release Schedule       Hylton
 S  227  pep-0227.txt   Statically Nested Scopes          Hylton
 S  230  pep-0230.txt   Warning Framework                 van Rossum
 S  232  pep-0232.txt   Function Attributes               Warsaw
 S  233  pep-0233.txt   Python Online Help                Prescod

From guido at python.org Sat Dec 16 17:46:32 2000 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Dec 2000 11:46:32 -0500 Subject: [Python-Dev] PEP 226 In-Reply-To: Your message of "Sat, 16 Dec 2000 13:32:30 +0100."
<200012161232.NAA01779@loewis.home.cs.tu-berlin.de> References: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> Message-ID: <200012161646.LAA13947@cj20424-a.reston1.va.home.com> [Oops, I posted a partial edit of this message by mistake before.] > I remember earlier discussion on the Python 2.1 release schedule, and > never managed to comment on those. > > I believe that Python contributors and maintainers did an enormous > job in releasing Python 2, which took quite some time from everybody's > life. I think it is unrealistic to expect the same amount of > commitment for the next release, especially if that release appears > just a few months after the previous release (that is, one month from > now). > > So I'd like to ask the release manager to take that into > account. I'm not quite sure what kind of action I expect; possible > alternatives are: > - declare 2.1 a pure bug fix release only; with a minimal set of new > features. In particular, don't push for completion of PEPs; everybody > should then accept that most features that are currently discussed > will appear in Python 2.2. > > - move the schedule for Python 2.1 back (or is it forward?) by, say, a > few months. This will give people some time to do the things that did > not get the right amount of attention during the 2.0 release, and will > still allow working on new and interesting features. > > Just my 0.02EUR, You're right -- 2.0 (including 1.6) was a monumental effort, and I'm grateful to all who contributed. I don't expect that 2.1 will be anywhere near the same amount of work! Let's look at what's on the table. These are listed as Active PEPs -- under serious consideration for Python 2.1: > 0042 Small Feature Requests Hylton We can do some of these or leave them. > 0205 Weak References Drake This one's open. > 0207 Rich Comparisons Lemburg, van Rossum This is really not that much work -- I would've done it already if I weren't distracted by the next one.
> 0208 Reworking the Coercion Model Schemenauer

Neil has most of this under control. I don't doubt for a second that it will be finished.

> 0217 Display Hook for Interactive Use Zadka

Probably a 20-line fix.

> 0222 Web Library Enhancements Kuchling

Up to Andrew. If he doesn't get to it, no big deal.

> 0226 Python 2.1 Release Schedule Hylton

I still think this is realistic -- a release before the conference seems doable!

> 0227 Statically Nested Scopes Hylton

This one's got a 50% chance at least. Jeremy seems motivated to do it.

> 0230 Warning Framework van Rossum

Done except for documentation.

> 0232 Function Attributes Warsaw

We need to discuss this more, but it's not much work to implement.

> 0233 Python Online Help Prescod

If Paul can control his urge to want to solve everything at once, I see no reason why this one couldn't find its way into 2.1.

Now, officially the PEP deadline is closed today: the schedule says "16-Dec-2000: 2.1 PEPs ready for review". That means that no new PEPs will be considered for inclusion in 2.1, and PEPs not in the active list won't be considered either. But the PEPs in the list above are all ready for review, even if we don't agree with all of them. I'm actually more worried about the ever-growing number of bug reports and submitted patches. But that's for another time. --Guido van Rossum (home page: http://www.python.org/~guido/) From akuchlin at mems-exchange.org Sun Dec 17 01:09:28 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Sat, 16 Dec 2000 19:09:28 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:24:34PM +0100 References: <200012151327.IAA00696@207-172-57-238.s238.tnt2.ann.va.dialup.rcn.com> <3A3A9A22.E9BA9551@lemburg.com> Message-ID: <20001216190928.A6703@kronos.cnri.reston.va.us> On Fri, Dec 15, 2000 at 11:24:34PM +0100, M.-A. Lemburg wrote: >Why would you want to fix it ?
Format characters will always >be ASCII and thus 7-bit -- there's really no need to expand the >set of possibilities beyond 8 bits ;-) This message is for characters that aren't format characters, which therefore includes all characters >127. --amk From akuchlin at mems-exchange.org Sun Dec 17 01:17:39 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Sat, 16 Dec 2000 19:17:39 -0500 Subject: [Python-Dev] What to do about PEP 229? In-Reply-To: <3A3A9D96.80781D61@lemburg.com>; from mal@lemburg.com on Fri, Dec 15, 2000 at 11:39:18PM +0100 References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com> Message-ID: <20001216191739.B6703@kronos.cnri.reston.va.us> On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote: >Can't distutils try both and then settle for the working combination ? I'm worried about subtle problems; what if an unneeded -lfoo drags in a customized malloc, or has symbols which conflict with some other library? >... BTW, where is Greg ? I haven't heard from him in quite a while.] Still around; he just hasn't been posting much these days. >Why not parse Setup and use it as input to distutils setup.py ? That was option 1. The existing Setup format doesn't really contain enough intelligence, though; the intelligence is usually in comments such as "Uncomment the following line for Solaris". So either the Setup format is modified (bad, since we'd break existing 3rd-party packages that still use a Makefile.pre.in), or I give up and just do everything in a setup.py. --amk From guido at python.org Sun Dec 17 03:38:01 2000 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Dec 2000 21:38:01 -0500 Subject: [Python-Dev] What to do about PEP 229? In-Reply-To: Your message of "Sat, 16 Dec 2000 19:17:39 EST."
<20001216191739.B6703@kronos.cnri.reston.va.us> References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com> <3A3A9D96.80781D61@lemburg.com> <20001216191739.B6703@kronos.cnri.reston.va.us> Message-ID: <200012170238.VAA14466@cj20424-a.reston1.va.home.com> > >Why not parse Setup and use it as input to distutils setup.py ? > > That was option 1. The existing Setup format doesn't really contain > enough intelligence, though; the intelligence is usually in comments > such as "Uncomment the following line for Solaris". So either the > Setup format is modified (bad, since we'd break existing 3rd-party > packages that still use a Makefile.pre.in), or I give up and just do > everything in a setup.py. Forget Setup. Convert it and be done with it. There really isn't enough there to hang on to. We'll support Setup format (through the makesetup script and the Misc/Makefile.pre.in file) for 3rd party b/w compatibility, but we won't need to use it ourselves. (Too bad for 3rd party documentation that describes the Setup format. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at home.com Sun Dec 17 08:34:27 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 17 Dec 2000 02:34:27 -0500 Subject: [Python-Dev] Use of %c and Py_UNICODE In-Reply-To: <20001216190928.A6703@kronos.cnri.reston.va.us> Message-ID: [MAL] > Why would you want to fix it ? Format characters will always > be ASCII and thus 7-bit -- there's really no need to expand the > set of possibilities beyond 8 bits ;-) [AMK] > This message is for characters that aren't format characters, which > therefore includes all characters >127. I'm with the wise man who suggested dropping the %c in this case and just displaying the hex value. Although it would be more readable to drop the %c if and only if the bogus format character isn't printable 7-bit ASCII. Which is obvious, yes? A new if/else isn't going to hurt anything.
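Tim's if/else amounts to something like the following -- a minimal sketch in Python rather than a patch to the actual getargs.c C code, with a made-up helper name:

```python
def format_char_repr(c):
    # Show the bogus format character itself only when it is
    # printable 7-bit ASCII; otherwise fall back to the hex value.
    if 32 <= ord(c) < 127:
        return "'%c' (0x%02x)" % (c, ord(c))
    return "0x%02x" % ord(c)
```

A printable character such as 'q' would be reported as 'q' (0x71), while a stray byte like '\xff' would come out as just 0xff.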
From tim.one at home.com Sun Dec 17 08:57:01 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 17 Dec 2000 02:57:01 -0500 Subject: [Python-Dev] PEP 226 In-Reply-To: <200012161232.NAA01779@loewis.home.cs.tu-berlin.de> Message-ID: [Martin v. Loewis] > ... > - move the schedule for Python 2.1 back (or is it forward?) by, say, a > few months. This will give people some time to do the things that did > not get the right amount of attention during the 2.0 release, and will > still allow work on new and interesting features. Just a stab in the dark, but is one of your real concerns the spotty state of Unicode support in the std libraries? If so, nobody working on the PEPs Guido identified would be likely to work on improving Unicode support even if the PEPs vanished. I don't know how Unicode support is going to improve, but in the absence of visible work in that direction-- or even A Plan to get some --I doubt we're going to hold up 2.1 waiting for magic. no-feature-is-ever-done-ly y'rs - tim From tim.one at home.com Sun Dec 17 09:30:24 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 17 Dec 2000 03:30:24 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: <3A387D6A.782E6A3B@prescod.net> Message-ID: [Tim] >> I've rarely seen problems due to shadowing a global, but have often >> seen problems due to shadowing a builtin. [Paul Prescod] > Really? Yes. > I think that there are two different issues here. One is consciously > choosing to create a new variable but not understanding that there > already exists a variable by that name. (i.e. str, list). Yes, and that's what I've often seen, typically long after the original code is written: someone sticks in some debugging output, or makes a small change to the implementation, and introduces e.g.

    str = some_preexisting_var + ":"
    yadda(str)

"Suddenly" the program misbehaves in baffling ways.
They're "baffling" because the errors do not occur on the lines where the changes were made, and are almost never related to the programmer's intent when making the changes. > Another is trying to assign to a global but actually shadowing it. I've rarely seen that. > There is no way that anyone coming from another language is going > to consider this transcript reasonable: True, but I don't really care: everyone gets burned once, the better ones eventually learn to use classes instead of mutating globals, and even the dull get over it. It is not, in my experience, an on-going problem for anyone. But I still get burned regularly by shadowing builtins. The burns are not fatal, however, and I can't think of an ointment less painful than the blisters.

> >>> a=5
> >>> def show():
> ...     print a
> ...
> >>> def set(val):
> ...     a=val
> ...
> >>> a
> 5
> >>> show()
> 5
> >>> set(10)
> >>> show()
> 5
>
> It doesn't seem to make any sense. My solution is to make the assignment
> in "set" illegal unless you add a declaration that says: "No, really. I
> mean it. Override that sucker." As the PEP points out, overriding is
> seldom a good idea so the requirement to declare would be rarely
> invoked.

I expect it would do less harm to introduce a compile-time warning for locals that are never referenced (such as the "a" in "set"). > ... > The "right answer" in terms of namespace theory is to consistently refer > to builtins with a prefix (whether "__builtins__" or "$") but that's > pretty unpalatable from an aesthetic point of view. Right, that's one of the ointments I won't apply to my own code, so wouldn't think of asking others to either. WRT mutable globals, people who feel they have to use them would be well served to adopt a naming convention. For example, begin each name with "g" and capitalize the second letter. This can make global-rich code much easier to follow (I've done-- and very happily --similar things in Javascript and C++).
From pf at artcom-gmbh.de Sun Dec 17 10:59:11 2000 From: pf at artcom-gmbh.de (Peter Funk) Date: Sun, 17 Dec 2000 10:59:11 +0100 (MET) Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 15, 2000 4:23:46 pm" Message-ID: Hi, Guido van Rossum: > If you're saying that you think the string module is too prominent to > ever start deprecating its use, I'm afraid we have a problem. I strongly believe the string module is too prominent. > I'd also like to note that using the string module's wrappers incurs > the overhead of a Python function call -- using string methods is > faster. I think most care more about readability than about run time performance. For people without much OOP experience, the method syntax hurts readability. > Finally, I like the look of fields[i].strip().lower() much better than > that of string.lower(string.strip(fields[i])) -- an actual example > from mimetools.py. Hmmmm.... Maybe this is just a matter of taste? Like my preference for '<>' instead of '!='? Personally I still like the old-fashioned form more. Especially if string.join() or string.split() are involved. Since Python 1.5.2 will stay around for several years, keeping backward compatibility in our Python coding is still a major issue for us. So we won't change our Python coding style soon, if ever. > Ideally, I would like to deprecate the entire string module, so that I [...] I share Mark Lutz and Tim Peters' opinion that this crusade will do more harm than good to the Python community. IMO this is a really bad idea. Just my $0.02, Peter From martin at loewis.home.cs.tu-berlin.de Sun Dec 17 12:13:09 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v.
Loewis) Date: Sun, 17 Dec 2000 12:13:09 +0100 Subject: [Python-Dev] PEP 226 In-Reply-To: References: Message-ID: <200012171113.MAA00733@loewis.home.cs.tu-berlin.de> > Just a stab in the dark, but is one of your real concerns the spotty state > of Unicode support in the std libraries? Not at all. I really responded to amk's message # All the PEPs for 2.1 are supposed to be complete for Dec. 16, and # some of those PEPs are pretty complicated. I'm a bit worried that # it's been so quiet on python-dev lately, especially after the # previous two weeks of lively discussion. I just thought that something was wrong here - contributing to a free software project ought to be fun for contributors, not a cause for worries. There-are-other-things-but-i18n-although-they-are-not-that-interesting y'rs, Martin From guido at python.org Sun Dec 17 15:38:07 2000 From: guido at python.org (Guido van Rossum) Date: Sun, 17 Dec 2000 09:38:07 -0500 Subject: [Python-Dev] new draft of PEP 227 In-Reply-To: Your message of "Sun, 17 Dec 2000 03:30:24 EST." References: Message-ID: <200012171438.JAA21603@cj20424-a.reston1.va.home.com> > I expect it would do less harm to introduce a compile-time warning for > locals that are never referenced (such as the "a" in "set"). Another warning that would be quite useful (and trap similar cases) would be "local variable used before set". --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Dec 17 15:40:40 2000 From: guido at python.org (Guido van Rossum) Date: Sun, 17 Dec 2000 09:40:40 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Sun, 17 Dec 2000 10:59:11 +0100." References: Message-ID: <200012171440.JAA21620@cj20424-a.reston1.va.home.com> > I think most care more about readability than about run time performance. > For people without much OOP experience, the method syntax hurts > readability. I don't believe one bit of this.
By that standard, we would do better to define a new module "list" and start writing list.append(L, x) for L.append(x). > I share Mark Lutz and Tim Peters' opinion that this crusade will do > more harm than good to the Python community. IMO this is a really bad > idea. You are entitled to your opinion, but given that your arguments seem very weak I will continue to ignore it (except to argue with you :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at digicool.com Sun Dec 17 17:17:12 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Sun, 17 Dec 2000 11:17:12 -0500 Subject: [Python-Dev] Death to string functions! References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> Message-ID: <14908.59144.321167.419762@anthem.concentric.net> >>>>> "PF" == Peter Funk writes: PF> Hmmmm.... Maybe this is just a matter of taste? Like my PF> preference for '<>' instead of '!='? Personally I still like PF> the old-fashioned form more. Especially if string.join() or PF> string.split() are involved. Hey cool! I prefer <> over != too, but I also (not surprisingly) strongly prefer string methods over string module functions. TOOWTDI-MA-ly y'rs, -Barry From gvwilson at nevex.com Sun Dec 17 17:25:17 2000 From: gvwilson at nevex.com (Greg Wilson) Date: Sun, 17 Dec 2000 11:25:17 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <14908.59144.321167.419762@anthem.concentric.net> Message-ID: <000201c06845$f1afdb40$770a0a0a@nevex.com> +1 on deprecating string functions. Every Python book and tutorial (including mine) emphasizes Python's simplicity and lack of Perl-ish redundancy; the more we practice what we preach, the more persuasive this argument is. Greg (who admittedly only has a few thousand lines of Python to maintain) From pf at artcom-gmbh.de Sun Dec 17 18:40:06 2000 From: pf at artcom-gmbh.de (Peter Funk) Date: Sun, 17 Dec 2000 18:40:06 +0100 (MET) Subject: [Python-Dev] Death to string functions!
In-Reply-To: <200012171440.JAA21620@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Dec 17, 2000 9:40:40 am" Message-ID: [string.function(S, ...) vs. S.method(...)] Guido van Rossum: > I don't believe one bit of this. By that standard, we would do better > to define a new module "list" and start writing list.append(L, x) for > L.append(x). list objects have only very few methods. Strings have so many methods. Some of them have names that clash easily with the method names of other kinds of objects. Since there are no type declarations in Python, looking at the code in isolation and seeing a line

    i = string.index(some_parameter)

tells at first glance that some_parameter should be a string object, even if the doc string of this function is too terse. However in

    i = some_parameter.index()

it could be a list, a database or whatever. > You are entitled to your opinion, but given that your arguments seem > very weak I will continue to ignore it (except to argue with you :-). I see. But given the time frame that the string module wouldn't go away any time soon, I guess I have a lot of time to either think about some stronger arguments or to get finally accustomed to that new style of coding. But since we have to keep compatibility with Python 1.5.2 for at least the next two years, chances for the latter are bad.
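The two styles under discussion, side by side. (To keep the sketch runnable on later Pythons, `str.strip`/`str.lower` stand in for the `string` module functions they mirror; the spelling in the function-style comment is the 1.5.2-era one.)

```python
s = "  Hello World  "

# function style, as the string module spells it:
#     string.lower(string.strip(s))
old_style = str.lower(str.strip(s))

# method style, as proposed for new code:
new_style = s.strip().lower()

assert old_style == new_style == "hello world"
```

Both produce the same result; the disagreement in the thread is purely about which reads better and which survives longer.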
Regards and have a nice vacation, Peter From mwh21 at cam.ac.uk Sun Dec 17 19:18:24 2000 From: mwh21 at cam.ac.uk (Michael Hudson) Date: 17 Dec 2000 18:18:24 +0000 Subject: (offtopic) RE: [Python-Dev] Python 2.0 license and GPL In-Reply-To: Thomas Wouters's message of "Fri, 15 Dec 2000 23:54:25 +0100" References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> <20001215235425.A29681@xs4all.nl> Message-ID: Thomas Wouters writes: > On Fri, Dec 15, 2000 at 04:03:04AM -0800, Neil Schemenauer wrote: > > On Fri, Dec 15, 2000 at 11:17:08AM -0500, Barry A. Warsaw wrote: > > > I'm not sure I agree with that view either, but mostly because there > > > is a non-GPL replacement for parts of the readline API: > > > > > > http://www.cstr.ed.ac.uk/downloads/editline.html > > > > It doesn't work with the current readline module. It is much > > smaller than readline and works just as well in my experience. > > Would there be any interest in including a copy with the standard > > distribution? The license is quite nice (X11 type). > > Definitely +1 from here. Readline reminds me of the cold war, for > some reason. (Actually, multiple reasons ;) I don't have time to do > it myself, unfortunately, or I would. (Looking at editline has been > on my TODO list for a while... :P) It wouldn't be particularly hard to rewrite editline in Python (we have termios & the terminal handling functions in curses - and even ioctl if we get really keen). I've been hacking on my own Python line reader on and off for a while; it's still pretty buggy, but if you're feeling brave you could look at: http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.0.0.tar.gz To try it out, unpack it, cd into the ./pyrl directory and try:

>>> import foo  # sorry
>>> foo.test_loop()

It sort of imitates the Python command prompt, except that it doesn't actually execute the code you type.
You need a recent _cursesmodule.c for it to work. Cheers, M. -- 41. Some programming languages manage to absorb change, but withstand progress. -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html From thomas at xs4all.net Sun Dec 17 19:30:38 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Sun, 17 Dec 2000 19:30:38 +0100 Subject: [Python-Dev] Death to string functions! In-Reply-To: <000201c06845$f1afdb40$770a0a0a@nevex.com>; from gvwilson@nevex.com on Sun, Dec 17, 2000 at 11:25:17AM -0500 References: <14908.59144.321167.419762@anthem.concentric.net> <000201c06845$f1afdb40$770a0a0a@nevex.com> Message-ID: <20001217193038.C29681@xs4all.nl> On Sun, Dec 17, 2000 at 11:25:17AM -0500, Greg Wilson wrote: > +1 on deprecating string functions. How wonderfully ambiguous! Do you mean string methods, or the string module? :) FWIW, I agree that in time, the string module should be deprecated. But I also think that 'in time' should be a considerable timespan. Don't deprecate it before everything it provides is available through some other means. Wait a bit longer than that, even, before calling it deprecated -- that scares people off. And then keep it for practically forever (until Py3K) just to support old code. And don't forget to document it as 'deprecated' everywhere, not just in one minor release note. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tismer at tismer.com Sun Dec 17 18:38:31 2000 From: tismer at tismer.com (Christian Tismer) Date: Sun, 17 Dec 2000 19:38:31 +0200 Subject: [Python-Dev] The Dictionary Gem is polished! References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> Message-ID: <3A3CFA17.ED26F51A@tismer.com> Old topic: {}.popitem() (was Re: {}.first[key,value,item] ...)
Christian Tismer wrote: > > Fredrik Lundh wrote: > > > > christian wrote: > > > That algorithm is really a gem which you should know, > > > so let me try to explain it. > > > > I think someone just won the "brain exploder 2000" award ;-) > As you might have guessed, I didn't do this just for fun. > It is the old game of explaining what is there, convincing > everybody that you at least know what you are talking about, > and then three days later coming up with an improved > application of the theory. > > Today is Monday, 2 days left. :-) Ok, today is Sunday, I had no time to finish this. But now it is here.

===========================
===== Claim: =====
===========================

- Dictionary access time can be improved with a minimal change
- On the hash() function:

All objects are supposed to provide a hash function which is as good as possible. Good means to provide a wide range of different hash values for different keys. Problem: There are hash functions which are "good" in this sense, but they do not spread their randomness uniformly over the 32 bits. Example: Integers use their own value as hash. This is ok, as far as the integers are uniformly distributed. But if they all contain a high power of two, for instance, the low bits give a very bad hash function. Take a dictionary with integers range(1000) as keys and access all entries. Then use a dictionary with the integers shifted left by 16. Access time is slowed down by a factor of 100, since every access is a linear search now. This is not an urgent problem, although applications exist where this can play a role (memory addresses for instance can have high factors of two when people do statistics on page accesses...). While this is not a big problem, it is ugly enough to think of a solution.

Solution 1:
-------------
Try to involve more bits of the hash value by doing extra shuffling, either a) in the dictlook function, or b) in the hash generation itself.
I believe both can't really be justified for a rare problem. But how about changing the existing solution in a way that an improvement is gained without extra cost?

Solution 2: (*the* solution)
----------------------------
Some people may remember what I wrote about re-hashing functions through the multiplicative group GF(2^n)*, and I don't want to repeat this here. The simple idea can be summarized quickly: The original algorithm uses multiplication by polynomials, and it is guaranteed that these re-hash values are jittering through all possible nonzero patterns of the n bits. Observation: We are using an operation of a finite field. This means that the inverse of multiplication also exists!

Old algorithm (multiplication):
    shift the index left by 1
    if index > mask:
        xor the index with the generator polynomial

New algorithm (division):
    if low bit of index set:
        xor the index with the generator polynomial
    shift the index right by 1

What does this mean? Not so much, we are just cycling through our bit patterns in reverse order. But now for the big difference. First change: We change from multiplication to division. Second change: We do not mask the hash value before! The second change is what I was after: By not masking the hash value when computing the initial index, all the existing bits in the hash come into play. This can be seen like a polynomial division, but the initial remainder (the hash value) was not normalized. After a number of steps, all the extra bits are wheeled into our index, but not wasted by masking them off. That gives our re-hash some more randomness. When all the extra bits are sucked in, the guaranteed single-visit cycle begins. There cannot be more than 27 extra cycles in the worst case (dict size = 32, so there are 27 bits to consume). I do not expect any bad effect from this modification.
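The two rehash steps, and the collision problem they address, can be sketched in a few lines (a 32-slot table; the polynomial 32 + 5 is taken from the polys table in the attached script; the function names here are made up):

```python
MASK = 31        # 32-slot table
POLY = 32 + 5    # generator polynomial for GF(2^5)-{0}

def rehash_old(incr):
    # multiplication by x: shift left, reduce by the polynomial on overflow
    incr = incr << 1
    if incr > MASK:
        incr = incr ^ POLY
    return incr

def rehash_new(incr):
    # division by x: the inverse operation, cycling in reverse order;
    # with an unmasked hash, one extra high bit is folded in per step
    if incr & 1:
        incr = incr ^ POLY
    return incr >> 1

# the problem being fixed: hashes like i << 16 all collide when only
# the masked low bits are used to pick a slot
assert {(i << 16) & MASK for i in range(1000)} == {0}
```

Starting from any nonzero 5-bit value, either step visits all 31 nonzero patterns before repeating, which is the single-visit cycle guarantee the text relies on.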
Here are some results; dictionaries have 1000 entries:

timing for strings old= 5.097 new= 5.088
timing for bad integers (<<10) old=101.540 new=12.610
timing for bad integers (<<16) old=571.210 new=19.220

On strings, both algorithms behave the same. On numbers, they differ dramatically. While the current algorithm is 110 times slower on a worst-case dict (quadratic behavior), the new algorithm accounts a little for the extra cycle, but is only 4 times slower.

Alternative implementation: The above approach is conservative in the sense that it tries not to slow down the current implementation in any way. An alternative would be to consume all of the extra bits at once. But this would add an extra "warmup" loop like this to the algorithm:

    while index > mask:
        if low bit of index set:
            xor the index with the generator polynomial
        shift the index right by 1

This is of course a very good digest of the higher bits, since it is a polynomial division and not just some bit xor-ing which might give quite predictable cancellations, therefore it is "the right way" in my sense. It might be cheap, but would add over 20 cycles to every small dict. I therefore don't think it is worth doing. Personally, I prefer the solution to merge the bits during the actual lookup, since it suffices to get access time from quadratic down to logarithmic. Attached is a direct translation of the relevant parts of dictobject.c into Python, with both algorithms implemented. cheers - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today?
http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3, 256 + 29,
    512 + 17, 1024 + 9, 2048 + 5, 4096 + 83, 8192 + 27, 16384 + 43,
    32768 + 3, 65536 + 45, 131072 + 9, 262144 + 39, 524288 + 39,
    1048576 + 9, 2097152 + 5, 4194304 + 3, 8388608 + 33, 16777216 + 27,
    33554432 + 9, 67108864 + 71, 134217728 + 39, 268435456 + 9,
    536870912 + 5, 1073741824 + 83, 0
]

class NULL: pass

class Dictionary:
    dummy = ""

    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0):
                return ep
            freeslot = NULL
        ###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
        ###### TO HERE
        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i+incr)&mask]
            if (ep[me_key] is NULL):
                if (freeslot != NULL):
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy):
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                  (ep[me_hash] == _hash and
                   cmp(ep[me_key], key) == 0)):
                return ep
            # Cycle through GF(2^n)-{0}
            ###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
            ###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL):
            old_value = ep[me_value]
            ep[me_value] = value
        else:
            if (ep[me_key] is NULL):
                mp.ma_fill = mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots
        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused):
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1
        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL
        newtable = map(lambda x, y=_nullentry: y[:], range(newsize))
        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0
        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key], ep[me_hash], ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots
        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
        ## /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2):
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append((_key, _value))
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords):
    import time
    time = time.time
    appl = apply
    if args is None:
        args = ()
    if type(args) != type(()):
        args = (args,)
    rep = range(n)
    dummyarg = ("",)
    dummykw = {}
    dummyfunc = len
    if keywords:
        before = time()
        for i in rep:
            res = appl(dummyfunc, dummyarg, dummykw)
        empty = time()-before
        before = time()
        for i in rep:
            res = appl(func, args, keywords)
    else:
        before = time()
        for i in rep:
            res = appl(dummyfunc, dummyarg)
        empty = time()-before
        before = time()
        for i in rep:
            res = appl(func, args)
    after = time()
    return round(after-before-empty, 4), res

def test(lis, dic):
    for key in lis: dic[key]

def nulltest(lis, dic):
    for key in lis: dic

def string_dicts():
    d1 = Dictionary()  # original
    d2 = Dictionary(1) # other rehash
    for i in range(1000):
        s = str(i) * 5
        d1[s] = d2[s] = i
    return d1, d2

def badnum_dicts():
    d1 = Dictionary()  # original
    d2 = Dictionary(1) # other rehash
    for i in range(1000):
        bad = i << 16
        d1[bad] = d2[bad] = i
    return d1, d2

def do_test(dict, keys, n):
    t0 = timing(nulltest, (keys, dict), n)[0]
    t1 = timing(test, (keys, dict), n)[0]
    return t1-t0

if __name__ == "__main__":
    sdold, sdnew = string_dicts()
    bdold, bdnew = badnum_dicts()
    print "timing for strings old=%.3f new=%.3f" % (
        do_test(sdold, sdold.keys(), 100),
        do_test(sdnew, sdnew.keys(), 100) )
    print "timing for bad integers old=%.3f new=%.3f" % (
        do_test(bdold, bdold.keys(), 10) *10,
        do_test(bdnew, bdnew.keys(), 10) *10)

"""
D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610
"""

From fdrake at acm.org Sun Dec 17 19:49:58 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun, 17 Dec 2000 13:49:58 -0500 (EST) Subject: [Python-Dev] Death to string functions! In-Reply-To: <20001217193038.C29681@xs4all.nl> References: <14908.59144.321167.419762@anthem.concentric.net> <000201c06845$f1afdb40$770a0a0a@nevex.com> <20001217193038.C29681@xs4all.nl> Message-ID: <14909.2774.158973.760077@cj42289-a.reston1.va.home.com> Thomas Wouters writes: > FWIW, I agree that in time, the string module should be deprecated. But I > also think that 'in time' should be a considerable timespan. Don't deprecate *If* most functions in the string module are going to be deprecated, that should be done *now*, so that the documentation will include the appropriate warning to users. When they should actually be removed is another matter, and I think Guido is sufficiently aware of their widespread use and won't remove them too quickly -- his creation of Python isn't the reason he's *accepted* as BDFL, it just made it a possibility. He's had to actually *earn* the BDFL position, I think. With regard to converting the standard library to string methods: that needs to be done as part of the deprecation. The code in the library is commonly used as example code, and should be good example code wherever possible. > support old code. And don't forget to document it 'deprecated' everywhere, > not just one minor release note. When Guido tells me exactly what is deprecated, the documentation will be updated with proper deprecation notices in the appropriate places. -Fred -- Fred L. Drake, Jr.
PythonLabs at Digital Creations

From tismer at tismer.com  Sun Dec 17 19:10:07 2000
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 17 Dec 2000 20:10:07 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
	<3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
	<3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <3A3D017F.62AD599F@tismer.com>

Christian Tismer wrote:
...
(my timings)

Attached is the updated script with the timings mentioned in the last posting. Sorry, I posted an older version before.

-- 
Christian Tismer             :^)
Mission Impossible 5oftware  :    Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :    PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com

-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3,
    256 + 29, 512 + 17, 1024 + 9, 2048 + 5, 4096 + 83,
    8192 + 27, 16384 + 43, 32768 + 3, 65536 + 45,
    131072 + 9, 262144 + 39, 524288 + 39, 1048576 + 9,
    2097152 + 5, 4194304 + 3, 8388608 + 33, 16777216 + 27,
    33554432 + 9, 67108864 + 71, 134217728 + 39,
    268435456 + 9, 536870912 + 5, 1073741824 + 83,
    0
]

class NULL:
    pass

class Dictionary:
    dummy = "<dummy key>"

    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        mask = mp.ma_size - 1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                cmp(ep[me_key], key) == 0):
                return ep
            freeslot = NULL

        ###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # even the shifting may not be worth it.
            incr = _hash ^ (_hash >> 3)
        ###### TO HERE

        if (not incr):
            incr = mask
        while 1:
            ep = ep0[(i+incr) & mask]
            if (ep[me_key] is NULL):
                if (freeslot != NULL):
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy):
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                  (ep[me_hash] == _hash and
                   cmp(ep[me_key], key) == 0)):
                return ep

            # Cycle through GF(2^n)-{0}
            ###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
            ###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL):
            old_value = ep[me_value]
            ep[me_value] = value
        else:
            if (ep[me_key] is NULL):
                mp.ma_fill = mp.ma_fill + 1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used + 1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots
        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused):
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1
        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL
        newtable = map(lambda x, y=_nullentry: y[:], range(newsize))

        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0

        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key], ep[me_hash], ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots
        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
        ## /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2):
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key,
                      _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append((_key, _value))
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def timing(func, args=None, n=1, **keywords):
    import time
    time = time.time
    appl = apply
    if args is None:
        args = ()
    if type(args) != type(()):
        args = (args,)
    rep = range(n)
    dummyarg = ("",)
    dummykw = {}
    dummyfunc = len
    if keywords:
        before = time()
        for i in rep:
            res = appl(dummyfunc, dummyarg, dummykw)
        empty = time() - before
        before = time()
        for i in rep:
            res = appl(func, args, keywords)
    else:
        before = time()
        for i in rep:
            res = appl(dummyfunc, dummyarg)
        empty = time() - before
        before = time()
        for i in rep:
            res = appl(func, args)
    after = time()
    return round(after - before - empty, 4), res

def test(lis, dic):
    for key in lis:
        dic[key]

def nulltest(lis, dic):
    for key in lis:
        dic

def string_dicts():
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    for i in range(1000):
        s = str(i) * 5
        d1[s] = d2[s] = i
    return d1, d2

def badnum_dicts():
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(1000):
        bad = i << shift
        d1[bad] = d2[bad] = i
    return d1, d2

def do_test(dict, keys, n):
    t0 = timing(nulltest, (keys, dict), n)[0]
    t1 = timing(test, (keys, dict), n)[0]
    return t1 - t0

EXTREME = 1

if __name__ == "__main__":
    sdold, sdnew = string_dicts()
    bdold, bdnew = badnum_dicts()
    print "timing for strings old=%.3f new=%.3f" % (
        do_test(sdold, sdold.keys(), 100),
        do_test(sdnew, sdnew.keys(), 100) )
    print "timing for bad integers old=%.3f new=%.3f" % (
        do_test(bdold, bdold.keys(), 10) * 10,
        do_test(bdnew, bdnew.keys(), 10) * 10)

"""
Results with a shift of 10 (EXTREME=0):

D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):

D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""

From lutz at rmi.net  Sun Dec 17 20:09:47 2000
From: lutz at rmi.net (Mark Lutz)
Date: Sun, 17 Dec 2000 12:09:47 -0700
Subject: [Python-Dev] Death to string functions!
References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com>
Message-ID: <001f01c0685c$ef555200$7bdb5da6@vaio>

As a longstanding Python advocate and user, I find this thread disturbing, and feel compelled to add a few words:

> > [Tim wrote:]
> > "string" is right up there with "os" and "sys" as a FIM (Frequently
> > Imported Module), so the required code changes will be massive. As
> > a user, I don't see what's in it for me to endure that pain: the
> > string module functions work fine! Neither are they warts in the
> > language, any more than that we say sin(pi) instead of pi.sin().
> > Keeping the functions around doesn't hurt anybody that I can see.
>
> [Guido wrote:]
> Hm. I'm not saying that this one will be easy. But I don't like
> having "two ways to do it". It means more learning, etc. (you know
> the drill).

But with all due respect, there are already _lots_ of places in Python that provide at least two ways to do something already. Why be so strict on this one alone? Consider lambda and def; tuples and lists; map and for loops; the loop else and boolean exit flags; and so on. The notion of Python forcing a single solution is largely a myth.
And as someone who makes a living teaching this stuff, I can tell you that none of the existing redundancies prevent anyone from learning Python.

More to the point, many of those shiny new features added to 2.0 fall squarely into this category too, and are completely redundant with other tools. Consider list comprehensions and simple loops; extended print statements and sys.std* assignments; augmented assignment statements and simpler ones. Eliminating redundancy at a time when we're also busy introducing it seems a tough goal to sell. I understand the virtues of aesthetics too, but removing the string module seems an incredibly arbitrary application of it.

> If you're saying that you think the string module is too prominent to
> ever start deprecating its use, I'm afraid we have a problem.
>
> [...]
> Ideally, I'd like to deprecate the entire string module, so that I
> can place a single warning at its top. This will cause a single
> warning to be issued for programs that still use it (no matter how
> many times it is imported).

And to me, this seems the real crux of the matter. For a decade now, the string module _has_ been the right way to do it. And today, half a million Python developers absolutely rely on it as an essential staple in their toolbox. What could possibly be wrong with keeping it around for backward compatibility, albeit as a less recommended option? If almost every Python program ever written suddenly starts issuing warning messages, then I think we do have a problem indeed.

Frankly, a Python that changes without regard to its user base seems an ominous thing to me. And keep in mind that I like Python; others will look much less generously upon a tool that seems inclined to rip the rug out from under its users. Trust me on this; I've already heard the rumblings out there.

So please: can we keep string around? Like it or not, we're way past the point of removing such core modules at this point.
Such a radical change might pass in a future non-backward-compatible Python mutation; I'm not sure such a different system will still be "Python", but that's a topic for another day.

All IMHO, of course,

--Mark Lutz  (http://www.rmi.net/~lutz)

From tim.one at home.com  Sun Dec 17 20:50:55 2000
From: tim.one at home.com (Tim Peters)
Date: Sun, 17 Dec 2000 14:50:55 -0500
Subject: [Python-Dev] SourceForge SSH silliness
Message-ID:

Starting last night, I get this msg whenever I update Python code w/ CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python:

"""
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that the host key has just been changed.
Please contact your system administrator.
Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message.
Password authentication is disabled to avoid trojan horses.
"""

This is SourceForge's doing, and is permanent (they've changed keys on their end). Here's a link to a thread that may or may not make sense to you:

    http://sourceforge.net/forum/forum.php?forum_id=52867

Deleting the sourceforge entries from my .ssh/known_hosts file worked for me. But everyone in the thread above who tried it says that they haven't been able to get scp working again (I haven't tried it yet ...).

From paulp at ActiveState.com  Sun Dec 17 21:04:27 2000
From: paulp at ActiveState.com (Paul Prescod)
Date: Sun, 17 Dec 2000 12:04:27 -0800
Subject: [Python-Dev] Pragmas and warnings
Message-ID: <3A3D1C4B.8F08A744@ActiveState.com>

A couple of other threads started me to thinking that there are a couple of things missing from our warnings framework.

Many languages have pragmas that allow you to turn warnings on and off in code.
For instance, I should be able to put a pragma at the top of a module that uses string functions to say: "I know that this module doesn't adhere to the latest Python conventions. Please don't warn me about it." I should also be able to put a declaration that says: "I'm really paranoid about shadowing globals and builtins. Please warn me when I do that."

Batch and visual linters could also use the declarations to customize their behaviors.

And of course we have a stack of other features that could use pragmas:

 * type signatures
 * Unicode syntax declarations
 * external object model language binding hints
 * ...

A case could be made that warning pragmas could use a totally different syntax from "user-defined" pragmas. I don't care much.

 Paul

From thomas at xs4all.net  Sun Dec 17 22:00:08 2000
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 17 Dec 2000 22:00:08 +0100
Subject: [Python-Dev] SourceForge SSH silliness
In-Reply-To: ; from tim.one@home.com on Sun, Dec 17, 2000 at 02:50:55PM -0500
References:
Message-ID: <20001217220008.D29681@xs4all.nl>

On Sun, Dec 17, 2000 at 02:50:55PM -0500, Tim Peters wrote:

> Starting last night, I get this msg whenever I update Python code w/
> CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python:
> """
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @ WARNING: HOST IDENTIFICATION HAS CHANGED! @
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> Someone could be eavesdropping on you right now (man-in-the-middle attack)!
> It is also possible that the host key has just been changed.
> Please contact your system administrator.
> Add correct host key in C:\Code/.ssh/known_hosts to get rid of this message.
> Password authentication is disabled to avoid trojan horses.
> """
> This is SourceForge's doing, and is permanent (they've changed keys on their
> end). Here's a link to a thread that may or may not make sense to you:
> http://sourceforge.net/forum/forum.php?forum_id=52867
> Deleting the sourceforge entries from my .ssh/known_hosts file worked for
> me. But everyone in the thread above who tried it says that they haven't
> been able to get scp working again (I haven't tried it yet ...).

What sourceforge did was switch Linux distributions, and upgrade. The switch doesn't really matter for the SSH problem, because recent Debian and recent RedHat releases both use a new ssh, the OpenBSD ssh implementation. Apparently, it isn't entirely backwards compatible with old versions of F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This might or might not be your problem; if it is, you should get a decent message that gives a relatively clear message such as 'cypher type 'idea' not supported'. You should be able to pass the '-c' option to scp/ssh to use a different cypher, like 3des (aka triple-des.) Or maybe the windows versions have a menu to configure that kind of thing :)

Another possible problem is that it might not have good support for older protocol versions. The 'current' protocol version, at least for 'ssh1', is 1.5. The one message on the sourceforge thread above that actually mentions a version in the *cough* bugreport is using an older ssh that only supports protocol version 1.4. Since that particular version of F-secure ssh has known problems (why else would they release 16 more versions ?) I'd suggest anyone with problems first try a newer version. I hope that doesn't break WinCVS, but it would suck if it did :P

If that doesn't work, which is entirely possible, it might be an honest bug in the OpenBSD ssh that Sourceforge is using. If anyone cared, we could do a bit of experimenting with the openssh-2.0 betas installed by Debian woody (unstable) to see if the problem occurs there as well.

-- 
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
From greg at cosc.canterbury.ac.nz  Mon Dec 18 00:05:41 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Dec 2000 12:05:41 +1300 (NZDT)
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <20001216014341.5BA97A82E@darjeeling.zadka.site.co.il>
Message-ID: <200012172305.MAA02512@s454.cosc.canterbury.ac.nz>

Moshe Zadka:

> Perl and Scheme permit implicit shadowing too.

But Scheme always requires declarations!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz     +--------------------------------------+

From martin at loewis.home.cs.tu-berlin.de  Mon Dec 18 00:45:56 2000
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 18 Dec 2000 00:45:56 +0100
Subject: [Python-Dev] Death to string functions!
Message-ID: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de>

> But with all due respect, there are already _lots_ of places in
> Python that provide at least two ways to do something already.

Exactly. My favourite one here is string exceptions, which have quite some analogy to the string module. At some time, there were only string exceptions. Then, instance exceptions were added; some releases later they were considered the better choice, so the standard library was converted to use them. Still, there is no sign whatsoever that anybody plans to deprecate string exceptions.

I believe the string module will get less importance over time. Comparing it with string exceptions, that may well be 5 years. It seems there are two ways of "deprecation": a loud "we will remove that, change your code", and a silent "strings have methods" (i.e. don't mention the module when educating people). The latter approach requires educators to agree that the module is "uninteresting", and people to really not use it once they find out it exists.
I think deprecation should be only attempted once there is a clear sign that people don't use it massively for new code anymore. Removal should only occur if keeping the module is less pain than maintaining it.

Regards,
Martin

From skip at mojam.com  Mon Dec 18 00:55:10 2000
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 17 Dec 2000 17:55:10 -0600 (CST)
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
Message-ID: <14909.21086.92774.940814@beluga.mojam.com>

I executed cvs update today (removing the sourceforge machines from .ssh/known_hosts worked fine for me, btw) followed by a configure and a make clean. The last step failed with this output:

    ...
    make[1]: Entering directory `/home/beluga/skip/src/python/dist/src/Modules'
    Makefile.pre.in:20: *** missing separator.  Stop.
    make[1]: Leaving directory `/home/beluga/skip/src/python/dist/src/Modules'
    make: [clean] Error 2 (ignored)

I found the following at line 20 of Modules/Makefile.pre.in:

    @SET_CXX@

I then tried a cvs annotate on that file but saw that line 20 had been there since rev 1.60 (16-Dec-99). I then checked the top-level Makefile.in thinking something must have changed in the clean target recently, but cvs annotate shows no recent changes there either:

    1.1  (guido 24-Dec-93): clean: localclean
    1.1  (guido 24-Dec-93): -for i in $(SUBDIRS); do \
    1.74 (guido 19-May-98):     if test -d $$i; then \
    1.24 (guido 20-Jun-96):         (echo making clean in subdirectory $$i; cd $$i; \
    1.4  (guido 01-Aug-94):         if test -f Makefile; \
    1.4  (guido 01-Aug-94):         then $(MAKE) clean; \
    1.4  (guido 01-Aug-94):         else $(MAKE) -f Makefile.*in clean; \
    1.4  (guido 01-Aug-94):         fi); \
    1.74 (guido 19-May-98):     else true; fi; \
    1.1  (guido 24-Dec-93): done

Make distclean succeeded so I tried the following:

    make distclean
    ./configure
    make clean

but the last step still failed. Any idea why make clean is now failing (for me)? Can anyone else reproduce this problem?
Skip

From greg at cosc.canterbury.ac.nz  Mon Dec 18 01:02:32 2000
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 18 Dec 2000 13:02:32 +1300 (NZDT)
Subject: [Python-Dev] Use of %c and Py_UNICODE
In-Reply-To: <3A3A9A22.E9BA9551@lemburg.com>
Message-ID: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>

"M.-A. Lemburg":

> Format characters will always
> be ASCII and thus 7-bit -- theres really no need to expand the
> set of possibilities beyond 8 bits ;-)

But the error message is being produced because the character is NOT a valid format character. One of the reasons for that might be because it's not in the 7-bit range!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz     +--------------------------------------+

From MarkH at ActiveState.com  Mon Dec 18 07:02:27 2000
From: MarkH at ActiveState.com (Mark Hammond)
Date: Mon, 18 Dec 2000 17:02:27 +1100
Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
In-Reply-To: <14909.21086.92774.940814@beluga.mojam.com>
Message-ID:

> I found the following at line 20 of Modules/Makefile.pre.in:
>
>     @SET_CXX@

I don't have time to investigate this specific problem, but I definitely had problems with SET_CXX around 6 months back. This was trying to build an external C++ application, so may be different. My message and other followups at the time implied no one really knew, and everyone agreed it was likely SET_CXX was broken :-( I even referenced the CVS checkin that I thought broke it.

Mark.

From mal at lemburg.com  Mon Dec 18 10:58:37 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 10:58:37 +0100
Subject: [Python-Dev] Pragmas and warnings
References: <3A3D1C4B.8F08A744@ActiveState.com>
Message-ID: <3A3DDFCD.34AB05B2@lemburg.com>

Paul Prescod wrote:
>
> A couple of other threads started me to thinking that there are a couple
> of things missing from our warnings framework.
>
> Many languages have pragmas that allow you to turn warnings on and off in
> code. For instance, I should be able to put a pragma at the top of a
> module that uses string functions to say: "I know that this module
> doesn't adhere to the latest Python conventions. Please don't warn me
> about it." I should also be able to put a declaration that says: "I'm
> really paranoid about shadowing globals and builtins. Please warn me
> when I do that."
>
> Batch and visual linters could also use the declarations to customize
> their behaviors.
>
> And of course we have a stack of other features that could use pragmas:
>
> * type signatures
> * Unicode syntax declarations
> * external object model language binding hints
> * ...
>
> A case could be made that warning pragmas could use a totally different
> syntax from "user-defined" pragmas. I don't care much.

There was a long thread about this some months ago. We agreed to add a new keyword to the language (I think it was "define") which then uses a very simple syntax which can be interpreted at compile time to modify the behaviour of the compiler, e.g.

    define <symbol> = <constant literal>

There was also a discussion about allowing limited forms of expressions instead of the constant literal.

    define source_encoding = "utf-8"

was the original motivation for this, but (as always ;) the usefulness for other application areas was quickly recognized, e.g. to enable compilation in optimization mode on a per module basis.

PS: "define" is perhaps not obscure enough as keyword...
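Paul's two wishes ("please don't warn me" and "please warn me") map fairly directly onto the filter interface that Python's `warnings` module ended up providing shortly after this thread (it appeared in 2.1). Here is a sketch in present-day Python, not anything that existed at the time of the discussion; `old_api` is a made-up stand-in for a deprecated call.

```python
import warnings

def old_api():
    # Hypothetical deprecated call, for illustration only.
    warnings.warn("old_api is deprecated", DeprecationWarning, stacklevel=2)
    return 42

# "Please don't warn me about it": silence the category for a region,
# roughly what a module-level pragma would spell.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    value = old_api()        # runs silently

# "Please warn me when I do that": escalate the same warning to an error.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        old_api()
    except DeprecationWarning:
        value = -1           # the warning now stops execution

print(value)
```

The same filters can also be installed process-wide with `warnings.filterwarnings(..., module=...)`, which is about as close to a per-module pragma as the eventual design got.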
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:      http://www.egenix.com/
Consulting:   http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal at lemburg.com  Mon Dec 18 11:04:08 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:04:08 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>
Message-ID: <3A3DE118.3355896D@lemburg.com>

Greg Ewing wrote:
>
> "M.-A. Lemburg":
>
> > Format characters will always
> > be ASCII and thus 7-bit -- theres really no need to expand the
> > set of possibilities beyond 8 bits ;-)
>
> But the error message is being produced because the
> character is NOT a valid format character. One of the
> reasons for that might be because it's not in the
> 7-bit range!

True.

I think removing %c completely in that case is the right solution (in case you don't want to convert the Unicode char using the default encoding to a string first).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:      http://www.egenix.com/
Consulting:   http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal at lemburg.com  Mon Dec 18 11:09:16 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:09:16 +0100
Subject: [Python-Dev] What to do about PEP 229?
References: <200012151810.NAA01121@207-172-146-21.s21.tnt3.ann.va.dialup.rcn.com>
	<3A3A9D96.80781D61@lemburg.com> <20001216191739.B6703@kronos.cnri.reston.va.us>
Message-ID: <3A3DE24C.DA0B2F6C@lemburg.com>

Andrew Kuchling wrote:
>
> On Fri, Dec 15, 2000 at 11:39:18PM +0100, M.-A. Lemburg wrote:
> > Can't distutils try both and then settle for the working combination ?
>
> I'm worried about subtle problems; what if an unneeded -lfoo drags in
> a customized malloc, or has symbols which conflict with some other
> library.

In that case, I think the user will have to decide.
setup.py should then default to not integrating the module in question and issue a warning telling the user what to look for and how to call setup.py in order to add the right combination of libs.

> > ... BTW, where is Greg ? I haven't heard from him in quite a while.]
>
> Still around; he just hasn't been posting much these days.

Good to know :)

> > Why not parse Setup and use it as input to distutils setup.py ?
>
> That was option 1. The existing Setup format doesn't really contain
> enough intelligence, though; the intelligence is usually in comments
> such as "Uncomment the following line for Solaris". So either the
> Setup format is modified (bad, since we'd break existing 3rd-party
> packages that still use a Makefile.pre.in), or I give up and just do
> everything in a setup.py.

I would still like a simple input to setup.py -- one that doesn't require hacking setup.py just to enable a few more modules.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:      http://www.egenix.com/
Consulting:   http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From fredrik at effbot.org  Mon Dec 18 11:15:26 2000
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 18 Dec 2000 11:15:26 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>
	<3A3DE118.3355896D@lemburg.com>
Message-ID: <004a01c068db$72403170$3c6340d5@hagrid>

mal wrote:

> > But the error message is being produced because the
> > character is NOT a valid format character. One of the
> > reasons for that might be because it's not in the
> > 7-bit range!
>
> True.
>
> I think removing %c completely in that case is the right
> solution (in case you don't want to convert the Unicode char
> using the default encoding to a string first).

how likely is it that a human programmer will use a bad formatting character that's not in the ASCII range?
-1 on removing it -- people shouldn't have to learn the octal ASCII table just to be able to fix trivial typos.

+1 on mapping the character back to a string in the same way as "repr" -- that is, print ASCII characters as is, map anything else to an octal escape.

+0 on leaving it as it is, or mapping non-printables to "?".

From mal at lemburg.com  Mon Dec 18 11:34:02 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:34:02 +0100
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
	<3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
	<3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com>
Message-ID: <3A3DE81A.4B825D89@lemburg.com>

> Here some results, dictionaries have 1000 entries:
>
> timing for strings old= 5.097 new= 5.088
> timing for bad integers (<<10) old=101.540 new=12.610
> timing for bad integers (<<16) old=571.210 new=19.220

Even though I think concentrating on string keys would provide more performance boost for Python in general, I think you have a point there. +1 from here.

BTW, would changing the hash function on strings from the simple XOR scheme to something a little smarter help improve the performance too (e.g. most strings used in programming never use the 8-th bit) ?

I also think that we could inline the string compare function in dictobject:lookdict_string to achieve even better performance. Currently it uses a function which doesn't trigger compiler inlining.

And finally: I think a generic PyString_Compare() API would be useful in a lot of places where strings are being compared (e.g. dictionaries and keyword parameters). Unicode already has such an API (along with dozens of other useful APIs which are not available for strings).
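The `<<16` pathology in the timings quoted above is easy to reproduce outside the dict implementation itself: integers hash to their own value (as noted earlier in the thread), and with a power-of-two table the slot index is just `hash & (size-1)`, so keys whose low bits are all zero pile into a single slot. A minimal illustration, not from the original messages; the table size of 1024 is an arbitrary choice for the example:

```python
# With a power-of-two table, the slot index keeps only the low bits
# of the hash, and small integers hash to themselves.
size = 1024                # power-of-two table size
mask = size - 1

good_keys = range(1000)                     # low bits well spread
bad_keys = [i << 16 for i in range(1000)]   # low 16 bits always zero

good_slots = {hash(k) & mask for k in good_keys}
bad_slots = {hash(k) & mask for k in bad_keys}

print(len(good_slots))   # 1000 distinct slots
print(len(bad_slots))    # 1 -- every "bad" key lands in slot 0
```

Every probe sequence then starts from the same slot, which is why the old perturbation-free rehash degenerated into a linear search for these keys.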
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:      http://www.egenix.com/
Consulting:   http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From mal at lemburg.com  Mon Dec 18 11:41:38 2000
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 18 Dec 2000 11:41:38 +0100
Subject: [Python-Dev] Use of %c and Py_UNICODE
References: <200012180002.NAA02518@s454.cosc.canterbury.ac.nz>
	<3A3DE118.3355896D@lemburg.com> <004a01c068db$72403170$3c6340d5@hagrid>
Message-ID: <3A3DE9E2.77FF0FA9@lemburg.com>

Fredrik Lundh wrote:
>
> mal wrote:
>
> > > But the error message is being produced because the
> > > character is NOT a valid format character. One of the
> > > reasons for that might be because it's not in the
> > > 7-bit range!
> >
> > True.
> >
> > I think removing %c completely in that case is the right
> > solution (in case you don't want to convert the Unicode char
> > using the default encoding to a string first).
>
> how likely is it that a human programmer will use a bad formatting
> character that's not in the ASCII range?

Not very likely... the most common case of this error is probably the use of % as percent sign in a formatting string. The next character in those cases is usually whitespace.

> -1 on removing it -- people shouldn't have to learn the octal ASCII
> table just to be able to fix trivial typos.
>
> +1 on mapping the character back to a string in the same way as
> "repr" -- that is, print ASCII characters as is, map anything else to
> an octal escape.
>
> +0 on leaving it as it is, or mapping non-printables to "?".

Agreed.
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:      http://www.egenix.com/
Consulting:   http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From tismer at tismer.com  Mon Dec 18 12:08:34 2000
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 18 Dec 2000 13:08:34 +0200
Subject: [Python-Dev] The Dictionary Gem is polished!
References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com>
	<3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF>
	<3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com>
	<3A3DE81A.4B825D89@lemburg.com>
Message-ID: <3A3DF032.5F86AD15@tismer.com>

"M.-A. Lemburg" wrote:
>
> > Here some results, dictionaries have 1000 entries:
> >
> > timing for strings old= 5.097 new= 5.088
> > timing for bad integers (<<10) old=101.540 new=12.610
> > timing for bad integers (<<16) old=571.210 new=19.220
>
> Even though I think concentrating on string keys would provide more
> performance boost for Python in general, I think you have a point
> there. +1 from here.
>
> BTW, would changing the hash function on strings from the simple
> XOR scheme to something a little smarter help improve the performance
> too (e.g. most strings used in programming never use the 8-th
> bit) ?

Yes, it would. I spent the rest of last night doing more accurate tests, also refined the implementation (using longs for the shifts etc), and turned from timing over to trip counting, i.e. a dict counts every round through the re-hash. That showed two things:

- The bits used from the string hash are not well distributed
- Using a "warmup wheel" on the hash to suck all bits in gives the same
  quality of hashes as random numbers.

I will publish some results later today.

> I also think that we could inline the string compare function
> in dictobject:lookdict_string to achieve even better performance.
> Currently it uses a function which doesn't trigger compiler
> inlining.

Sure!
ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From guido at python.org Mon Dec 18 15:20:22 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 09:20:22 -0500 Subject: [Python-Dev] The Dictionary Gem is polished! In-Reply-To: Your message of "Sun, 17 Dec 2000 19:38:31 +0200." <3A3CFA17.ED26F51A@tismer.com> References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> Message-ID: <200012181420.JAA25063@cj20424-a.reston1.va.home.com> > Problem: There are hash functions which are "good" in this sense, > but they do not spread their randomness uniformly over the > 32 bits. > > Example: Integers use their own value as hash. > This is ok, as far as the integers are uniformly distributed. > But if they all contain a high power of two, for instance, > the low bits give a very bad hash function. > > Take a dictionary with integers range(1000) as keys and access > all entries. Then use a dictionary with the integers shifted > left by 16. > Access time is slowed down by a factor of 100, since every > access is a linear search now. Ai. I think what happened is this: long ago, the hash table sizes were primes, or at least not powers of two! I'll leave it to the more mathematically-inclined to judge your solution... --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Dec 18 15:52:35 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 09:52:35 -0500 Subject: [Python-Dev] Did something change in Makefile.in or Modules/Makefile.pre.in?
In-Reply-To: Your message of "Sun, 17 Dec 2000 17:55:10 CST." <14909.21086.92774.940814@beluga.mojam.com> References: <14909.21086.92774.940814@beluga.mojam.com> Message-ID: <200012181452.JAA04372@cj20424-a.reston1.va.home.com>
> Make distclean succeeded so I tried the following:
>
> make distclean
> ./configure
> make clean
>
> but the last step still failed. Any idea why make clean is now failing (for me)? Can anyone else reproduce this problem?
Yes. I don't understand it, but this takes care of it:
make distclean
./configure
make Makefiles # <--------- !!!
make clean
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Dec 18 15:54:20 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 09:54:20 -0500 Subject: [Python-Dev] Pragmas and warnings In-Reply-To: Your message of "Mon, 18 Dec 2000 10:58:37 +0100." <3A3DDFCD.34AB05B2@lemburg.com> References: <3A3D1C4B.8F08A744@ActiveState.com> <3A3DDFCD.34AB05B2@lemburg.com> Message-ID: <200012181454.JAA04394@cj20424-a.reston1.va.home.com> > There was a long thread about this some months ago. We agreed > to add a new keyword to the language (I think it was "define") I don't recall agreeing. :-) This is PEP material. For 2.2, please! --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Mon Dec 18 15:56:33 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Mon, 18 Dec 2000 15:56:33 +0100 Subject: [Python-Dev] Pragmas and warnings References: <3A3D1C4B.8F08A744@ActiveState.com> <3A3DDFCD.34AB05B2@lemburg.com> <200012181454.JAA04394@cj20424-a.reston1.va.home.com> Message-ID: <3A3E25A1.DFD2BDBF@lemburg.com> Guido van Rossum wrote: > > > There was a long thread about this some months ago. We agreed > > to add a new keyword to the language (I think it was "define") > > I don't recall agreeing. :-) Well, maybe it was a misinterpretation on my part... you said something like "add a new keyword and live with the consequences".
AFAIR, of course :-) > This is PEP material. For 2.2, please! Ok. -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at python.org Mon Dec 18 16:15:26 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 10:15:26 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Sun, 17 Dec 2000 12:09:47 MST." <001f01c0685c$ef555200$7bdb5da6@vaio> References: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> <001f01c0685c$ef555200$7bdb5da6@vaio> Message-ID: <200012181515.KAA04571@cj20424-a.reston1.va.home.com> [Mark Lutz] > So please: can we keep string around? Like it or not, we're > way past the point of removing such core modules at this point. Of course we're keeping string around. I already said that for backwards compatibility reasons it would not disappear before Py3K. I think there's a misunderstanding about the meaning of deprecation, too. That word doesn't mean to remove a feature. It doesn't even necessarily mean to warn every time a feature is used. It just means (to me) that at some point in the future the feature will change or disappear, there's a new and better way to do it, and that we encourage users to start using the new way, to save them from work later. In my mind, there's no reason to start emitting warnings about every deprecated feature. The warnings are only needed late in the deprecation cycle. PEP 5 says "There must be at least a one-year transition period between the release of the transitional version of Python and the release of the backwards incompatible version." Can we now stop getting all bent out of shape over this? String methods *are* recommended over equivalent string functions. Those string functions *are* already deprecated, in the informal sense (i.e. just that it is recommended to use string methods instead). 
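The two spellings being compared can be put side by side. The old string-module functions existed in Python 2.0's string.py; the stand-in functions below merely mirror their behavior so the equivalence can be shown on any interpreter -- a sketch, not the 2.0 string module itself.

```python
# Function style (old string module) vs. method style, side by side.
# string_strip/string_lower are stand-ins for string.strip/string.lower,
# defined here only so the comparison is runnable without the old module.

def string_strip(s):   # stand-in for string.strip
    return s.strip()

def string_lower(s):   # stand-in for string.lower
    return s.lower()

field = "  MIME-Version  "
old_style = string_lower(string_strip(field))   # function style
new_style = field.strip().lower()               # method style

print(old_style == new_style)  # True
print(new_style)               # mime-version
```

The method style also avoids the extra Python-level function call per operation, which is the overhead argument made below.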
This *should* (take notice, Fred!) be documented per 2.1. We won't however be issuing run-time warnings about the use of string functions until much later. (Lint-style tools may start warning sooner -- that's up to the author of the lint tool to decide.) Note that I believe Java makes a useful distinction that PEP 5 misses: it defines both deprecated features and obsolete features. *Deprecated* features are simply features for which a better alternative exists. *Obsolete* features are features that are only being kept around for backwards compatibility. Deprecated features may also be (and usually are) *obsolescent*, meaning they will become obsolete in the future. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Dec 18 16:22:09 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 10:22:09 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Mon, 18 Dec 2000 00:45:56 +0100." <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> References: <200012172345.AAA00877@loewis.home.cs.tu-berlin.de> Message-ID: <200012181522.KAA04597@cj20424-a.reston1.va.home.com> > At some time, there were only string exceptions. Then, instance > exceptions were added, some releases later they were considered the > better choice, so the standard library was converted to use them. > Still, there is no sign whatsoever that anybody plans to deprecate > string exceptions. Now there is: I hereby state that I officially deprecate string exceptions. Py3K won't support them, and it *may* even require that all exception classes are derived from Exception. > I believe the string module will get less importance over > time. Comparing it with string exception, that may be well 5 years. > It seems there are two ways of "deprecation": a loud "we will remove > that, change your code", and a silent "strings have methods" > (i.e. don't mention the module when educating people). 
The latter > approach requires educators to agree that the module is > "uninteresting", and people to really not use once they find out it > exists. Exactly. This is what I hope will happen. I certainly hope that Mark Lutz has already started teaching string methods! > I think deprecation should be only attempted once there is a clear > sign that people don't use it massively for new code anymore. Right. So now we're on the first step: get the word out! > Removal should only occur if keeping the module [is] less pain than > maintaining it. Exactly. Guess where the string module falls today. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From Barrett at stsci.edu Mon Dec 18 17:50:49 2000 From: Barrett at stsci.edu (Paul Barrett) Date: Mon, 18 Dec 2000 11:50:49 -0500 (EST) Subject: [Python-Dev] PEP 207 -- Rich Comparisons Message-ID: <14910.16431.554136.374725@nem-srvr.stsci.edu> Guido van Rossum writes: > > > > 1. The current boolean operator behavior does not have to change, and > > hence will be backward compatible. > > What incompatibility do you see in the current proposal? You have to choose between using rich comparisons or boolean comparisons. You can't use both for the same (rich/complex) object. > > 2. It eliminates the need to decide whether or not rich comparisons > > takes precedence over boolean comparisons. > > Only if you want different semantics -- that's only an issue for NumPy. No. I think NumPy is the tip of the iceberg, when discussing new semantics. Most users don't consider these broader semantic issues, because Python doesn't give them the opportunity to do so. I can see possible scenarios of using both boolean and non-boolean comparisons for Python lists and dictionaries in addition to NumPy. I chose to use Python because it provides a richer framework than other languages. When Python fails to provide such benefits, I'll move to another language. 
I moved from PERL to Python because the multi-dimensional array syntax is vastly better in Python than PERL, though as a novice I don't have to know that it exists. What I'm proposing here is in a similar vein. > > 3. The new operators add additional behavior without directly impacting > > current behavior and the use of them is unambiguous, at least in > > relation to current Python behavior. You know by the operator what > > type of comparison will be returned. This should appease Jim > > Fulton, based on his arguments in 1998 about comparison operators > > always returning a boolean value. > > As you know, I'm now pretty close to Jim. :-) He seemed pretty mellow > about this now. Yes, I would hope so! It appears though that you misunderstand me. My point was that I tend to agree with Jim Fulton's arguments for a limited interpretation of the current comparison operators. I too expect them to return a boolean result. I have never felt comfortable using such comparison operators in an array context, e.g. as in the array language, IDL. It just looks wrong. So my suggestion is to create new ones whose implicit meaning is to provide element-wise or rich comparison behavior. And to add similar behavior for the other operators for consistency. Can someone provide an example in mathematics where comparison operators are used in a non-boolean, i.e. rich comparison, context. If so, this might shut me up! > > 4. Compound objects, such as lists, could implement both rich > > and boolean comparisons. The boolean comparison would remain as > > is, while the rich comparison would return a list of boolean > > values. Current behavior doesn't change; just a new feature, which > > you may or may not choose to use, is added. > > > > If we go one step further and add the matrix-style operators along > > with the comparison operators, we can provide a consistent user > > interface to array/complex operations without changing current Python > > behavior.
If a user has no need for these new operators, he doesn't > > have to use them or even know about them. All we've done is made > > Python richer, but I believe with making it more complex. For Phrase should be: "but I believe without making it more complex.". ------- > > example, all element-wise operations could have a ':' appended to > > them, e.g. '+:', '<:', etc.; and will define element-wise addition, > > element-wise less-than, etc. The traditional '*', '/', etc. operators > > can then be used for matrix operations, which will appease the Matlab > > people. > > > > Therefore, I don't think rich comparisons and matrix-type operators > > should be considered separable. I really think you should consider > > this suggestion. It appeases many groups while providing a consistent > > and clear user interface, while greatly impacting current Python > > behavior. The last phrase should read: "while not greatly impacting current --- Python behavior." > > > > Always-causing-havoc-at-the-last-moment-ly Yours, > > I think you misunderstand. Rich comparisons are mostly about allowing > the separate overloading of <, <=, ==, !=, >, and >=. This is useful > in its own light. No, I do understand. I've read most of the early discussions on this issue and one of those issues was about having to choose between boolean and rich comparisons and what should take precedence, when both may be appropriate. I'm suggesting an alternative here. > If you don't want to use this overloading facility for elementwise > comparisons in NumPy, that's fine with me. Nobody says you have to -- > it's just that you *could*. Yes, I understand. > Read my lips: there won't be *any* new operators in 2.1. OK, I didn't expect this to make it into 2.1. > There will be a better way to overload the existing Boolean operators, > and they will be able to return non-Boolean results. That's useful in > other situations besides NumPy. Yes, I agree, this should be done anyway.
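To make the overloading mechanism under discussion concrete: with rich comparisons, each operator is overloaded separately and is free to return a non-boolean result. The Vec class below is hypothetical, invented purely to illustrate the NumPy-style elementwise semantics -- a sketch of what *could* be done, not what any library here does.

```python
# Sketch of per-operator ("rich") comparison overloading: __lt__ may
# return something other than a single truth value. Vec is a made-up
# class showing elementwise comparison, NumPy-style.

class Vec:
    def __init__(self, data):
        self.data = list(data)

    def __lt__(self, other):
        # elementwise: a list of booleans rather than one boolean
        return [a < b for a, b in zip(self.data, other.data)]

print(Vec([1, 5, 3]) < Vec([2, 4, 6]))  # [True, False, True]
```

Nothing forces a type to use this: a class that wants plain boolean "<" simply returns True or False, which is exactly the "you *could*, not you must" point made above.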
I'm just not sure that the implicit meaning that these comparison operators are being given is the best one. I'm just looking for ways to incorporate rich comparisons into a broader framework, numpy just currently happens to be the primary example of this proposal. Assuming the current comparison operator overloading is already implemented and has been used to implement rich comparisons for some objects, then my rich comparison proposal would cause confusion. This is what I'm trying to avoid. > Feel free to lobby for elementwise operators -- but based on the > discussion about this subject so far, I don't give it much of a chance > even past Python 2.1. They would add a lot of baggage to the language > (e.g. the table of operators in all Python books would be about twice > as long) and by far the most users don't care about them. (Read the > intro to 211 for some of the concerns -- this PEP tries to make the > addition palatable by adding exactly *one* new operator.) So! Introductory books don't have to discuss these additional operators. I don't have to know about XML and socket modules to start using Python effectively, nor do I have to know about 'zip' or list comprehensions. These additions decrease the code size and increase efficiency, but don't really add any new expressive power that can't already be done by a 'for' loop. I'll try to convince myself that this suggestion is crazy and not bother you with this issue for awhile. Cheers, Paul From guido at python.org Mon Dec 18 18:18:11 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 12:18:11 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Mon, 18 Dec 2000 11:50:49 EST." <14910.16431.554136.374725@nem-srvr.stsci.edu> References: <14910.16431.554136.374725@nem-srvr.stsci.edu> Message-ID: <200012181718.MAA14030@cj20424-a.reston1.va.home.com> Paul Barret: > > > 1. 
The current boolean operator behavior does not have to change, and > > > hence will be backward compatible. Guido van Rossum: > > What incompatibility do you see in the current proposal? Paul Barret: > You have to choose between using rich comparisons or boolean > comparisons. You can't use both for the same (rich/complex) object. Sure. I thought that the NumPy folks were happy with this. Certainly two years ago they seemed to be. > > > 2. It eliminates the need to decide whether or not rich comparisons > > > takes precedence over boolean comparisons. > > > > Only if you want different semantics -- that's only an issue for NumPy. > > No. I think NumPy is the tip of the iceberg, when discussing new > semantics. Most users don't consider these broader semantic issues, > because Python doesn't give them the opportunity to do so. I can see > possible scenarios of using both boolean and non-boolean comparisons > for Python lists and dictionaries in addition to NumPy. That's the same argument that has been made for new operators all along. I've explained already why they are not on the table for 2.1. > I chose to use Python because it provides a richer framework than > other languages. When Python fails to provide such benefits, I'll > move to another language. I moved from PERL to Python because the > multi-dimensional array syntax is vastly better in Python than PERL, > though as a novice I don't have to know that it exists. What I'm > proposing here is in a similar vein. > > > > 3. The new operators add additional behavior without directly impacting > > > current behavior and the use of them is unambiguous, at least in > > > relation to current Python behavior. You know by the operator what > > > type of comparison will be returned. This should appease Jim > > > Fulton, based on his arguments in 1998 about comparison operators > > > always returning a boolean value. > > > > As you know, I'm now pretty close to Jim. :-) He seemed pretty mellow > > about this now.
> > Yes, I would hope so! > > It appears though that you misunderstand me. My point was that I tend > to agree with Jim Fulton's arguments for a limited interpretation of > the current comparison operators. I too expect them to return a > boolean result. I have never felt comfortable using such comparison > operators in an array context, e.g. as in the array language, IDL. It > just looks wrong. So my suggestion is to create new ones whose > implicit meaning is to provide element-wise or rich comparison > behavior. And to add similar behavior for the other operators for > consistency. > > Can someone provide an example in mathematics where comparison > operators are used in a non-boolean, i.e. rich comparison, context. > If so, this might shut me up! Not me (I no longer consider myself a mathematician :-). Why are you requiring an example from math though? Again, you will be able to make this argument to the NumPy folks when they are ready to change the meaning of A<B. > > > 4. Compound objects, such as lists, could implement both rich > > > and boolean comparisons. The boolean comparison would remain as > > > is, while the rich comparison would return a list of boolean > > > values. Current behavior doesn't change; just a new feature, which > > > you may or may not choose to use, is added. > > > > > > If we go one step further and add the matrix-style operators along > > > with the comparison operators, we can provide a consistent user > > > interface to array/complex operations without changing current Python > > > behavior. If a user has no need for these new operators, he doesn't > > > have to use them or even know about them. All we've done is made > > > Python richer, but I believe with making it more complex. For > > Phrase should be: "but I believe without making it more complex.". > ------- > > > > example, all element-wise operations could have a ':' appended to > > > them, e.g.
'+:', '<:', etc.; and will define element-wise addition, > > > element-wise less-than, etc. The traditional '*', '/', etc. operators > > > can then be used for matrix operations, which will appease the Matlab > > > people. > > > > > > Therefore, I don't think rich comparisons and matrix-type operators > > > should be considered separable. I really think you should consider > > > this suggestion. It appeases many groups while providing a consistent > > > and clear user interface, while greatly impacting current Python > > > behavior. > > The last phrase should read: "while not greatly impacting current > --- > Python behavior." I don't see any argument for elementwise operators here that I haven't heard before, and AFAIK it's all in the two PEPs. > > > Always-causing-havoc-at-the-last-moment-ly Yours, > > > > I think you misunderstand. Rich comparisons are mostly about allowing > > the separate overloading of <, <=, ==, !=, >, and >=. This is useful > > in its own light. > > No, I do understand. I've read most of the early discussions on this > issue and one of those issues was about having to choose between > boolean and rich comparisons and what should take precedence, when > both may be appropriate. I'm suggesting an alternative here. Note that Python doesn't decide which should take precedence. The implementer of an individual extension type decides what his comparison operators will return. > > If you don't want to use this overloading facility for elementwise > > comparisons in NumPy, that's fine with me. Nobody says you have to -- > > it's just that you *could*. > > Yes, I understand. > > > Read my lips: there won't be *any* new operators in 2.1. > > OK, I didn't expect this to make it into 2.1. > > > There will be a better way to overload the existing Boolean operators, > > and they will be able to return non-Boolean results. That's useful in > > other situations besides NumPy. > > Yes, I agree, this should be done anyway.
I'm just not sure that the > implicit meaning that these comparison operators are being given is > the best one. I'm just looking for ways to incorporate rich > comparisons into a broader framework, numpy just currently happens to > be the primary example of this proposal. > > Assuming the current comparison operator overloading is already > implemented and has been used to implement rich comparisons for some > objects, then my rich comparison proposal would cause confusion. This > is what I'm trying to avoid. AFAIK, rich comparisons haven't been used anywhere to return non-Boolean results. > > Feel free to lobby for elementwise operators -- but based on the > > discussion about this subject so far, I don't give it much of a chance > > even past Python 2.1. They would add a lot of baggage to the language > > (e.g. the table of operators in all Python books would be about twice > > as long) and by far the most users don't care about them. (Read the > > intro to 211 for some of the concerns -- this PEP tries to make the > > addition palatable by adding exactly *one* new operator.) > > So! Introductory books don't have to discuss these additional > operators. I don't have to know about XML and socket modules to start > using Python effectively, nor do I have to know about 'zip' or list > comprehensions. These additions decrease the code size and increase > efficiency, but don't really add any new expressive power that can't > already be done by a 'for' loop. > > I'll try to convince myself that this suggestion is crazy and not > bother you with this issue for awhile. Happy holidays nevertheless. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at home.com Mon Dec 18 19:38:13 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 18 Dec 2000 13:38:13 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <14910.16431.554136.374725@nem-srvr.stsci.edu> Message-ID: [Paul Barrett] > ... 
> Can someone provide an example in mathematics where comparison > operators are used in a non-boolean, ie. rich comparison, context. > If so, this might shut me up! By my informal accounting, over the years there have been more requests for three-outcome comparison operators than for elementwise ones, although the three-outcome lobby isn't organized so is less visible. It's a natural request for anyone working with partial orderings (a < b -> one of {yes, no, unordered}). Another large group of requests comes from people working with variants of fuzzy logic, where it's desired that the comparison operators be definable to return floats (intuitively corresponding to the probability that the stated relation "is true"). Another desire comes from the symbolic math camp, which would like to be able to-- as is possible for "+", "*", etc --define "<" so that e.g. "x < y" return an object capturing that somebody *asked* for "x < y"; they're not interested in numeric or Boolean results so much as symbolic expressions. "<" is used for all these things in the literature too. Whatever. "<" and friends are just collections of pixels. Add 300 new operator symbols, and people will want to redefine all of them at will too. draw-a-line-in-the-sand-and-the-wind-blows-it-away-ly y'rs - tim From tim.one at home.com Mon Dec 18 21:37:13 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 18 Dec 2000 15:37:13 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012152123.QAA03566@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > ... > If you're saying that we should give users ample time for the > transition, I'm with you. Then we're with each other, for suitably large values of "ample" . > If you're saying that you think the string module is too prominent to > ever start deprecating its use, I'm afraid we have a problem. We may. Time will tell. It needs a conversion tool, else I think it's unsellable. > ... 
> I'd also like to note that using the string module's wrappers incurs > the overhead of a Python function call -- using string methods is > faster. > > Finally, I like the look of fields[i].strip().lower() much better than > that of string.lower(string.strip(fields[i])) -- an actual example > from mimetools.py. I happen to like string methods better myself; I don't think that's at issue (except that loads of people apparently don't like "join" as a string method -- idiots ). The issue to me is purely breaking old code someday -- "string" is in very heavy use, and unlike as when deprecating regex in favor of re (either pre or especially sre), string methods aren't orders of magnitude better than the old way; and also unlike regex-vs-re it's not the case that the string module has become unmaintainable (to the contrary, string.py has become trivial). IOW, this one would be unprecedented fiddling. > ... > Note that I believe Java makes a useful distinction that PEP 5 misses: > it defines both deprecated features and obsolete features. > *Deprecated* features are simply features for which a better > alternative exists. *Obsolete* features are features that are only > being kept around for backwards compatibility. Deprecated features > may also be (and usually are) *obsolescent*, meaning they will become > obsolete in the future. I agree it would be useful to define these terms, although those particular definitions appear to be missing the most important point from the user's POV (not a one says "going away someday"). A Google search on "java obsolete obsolescent deprecated" doesn't turn up anything useful, so I doubt the usages you have in mind come from Java (it has "deprecated", but doesn't appear to have any well-defined meaning for the others). In keeping with the religious nature of the battle-- and religion offers precise terms for degrees of damnation! 
--I suggest: struggling -- a supported feature; the initial state of all features; may transition to Anathematized anathematized -- this feature is now cursed, but is supported; may transition to Condemned or Struggling; intimacy with Anathematized features is perilous condemned -- a feature scheduled for crucifixion; may transition to Crucified, Anathematized (this transition is called "a pardon"), or Struggling (this transition is called "a miracle"); intimacy with Condemned features is suicidal crucified -- a feature that is no longer supported; may transition to Resurrected resurrected -- a once-Crucified feature that is again supported; may transition to Condemned, Anathematized or Struggling; although since Resurrection is a state of grace, there may be no point in human time at which a feature is identifiably Resurrected (i.e., it may *appear*, to the unenlightened, that a feature moved directly from Crucified to Anathematized or Struggling or Condemned -- although saying so out loud is heresy). From tismer at tismer.com Mon Dec 18 23:58:03 2000 From: tismer at tismer.com (Christian Tismer) Date: Mon, 18 Dec 2000 23:58:03 +0100 Subject: [Python-Dev] The Dictionary Gem is polished! References: <200012081843.NAA32225@cj20424-a.reston1.va.home.com> <3A32615E.D39B68D2@tismer.com> <033701c06366$ab746580$0900a8c0@SPIFF> <3A34DB7C.FF7E82CE@tismer.com> <3A3CFA17.ED26F51A@tismer.com> <200012181420.JAA25063@cj20424-a.reston1.va.home.com> Message-ID: <3A3E967B.BE404114@tismer.com> Guido van Rossum wrote: [me, expanding on hashes, integers,and how to tame them cheaply] > Ai. I think what happened is this: long ago, the hash table sizes > were primes, or at least not powers of two! At some time I will wake up and they tell me that I'm reducible :-) > I'll leave it to the more mathematically-inclined to judge your > solution... I love small lists! - ciao - chris +1 (being a member, hopefully) -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! 
Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From greg at cosc.canterbury.ac.nz Tue Dec 19 00:04:42 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Dec 2000 12:04:42 +1300 (NZDT) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Message-ID: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz> [Paul Barrett] > ... > Can someone provide an example in mathematics where comparison > operators are used in a non-boolean, ie. rich comparison, context. > If so, this might shut me up! Not exactly mathematical, but some day I'd like to create a database access module which lets you say things like mydb = OpenDB("inventory") parts = mydb.parts tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42) Of course, to really make this work I need to be able to overload "and" and "or" as well, but that's a whole 'nother PEP... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Tue Dec 19 00:32:51 2000 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Dec 2000 18:32:51 -0500 Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: Your message of "Tue, 19 Dec 2000 12:04:42 +1300." 
<200012182304.MAA02642@s454.cosc.canterbury.ac.nz> References: <200012182304.MAA02642@s454.cosc.canterbury.ac.nz> Message-ID: <200012182332.SAA18456@cj20424-a.reston1.va.home.com> > Not exactly mathematical, but some day I'd like to create > a database access module which lets you say things like > > mydb = OpenDB("inventory") > parts = mydb.parts > tuples = mydb.retrieve(parts.name, parts.number).where(parts.quantity >= 42) > > Of course, to really make this work I need to be able > to overload "and" and "or" as well, but that's a whole > 'nother PEP... Believe it or not, in 1998 we already had a suggestion for overloading these too. This is hinted at in David Ascher's proposal (the Appendix of PEP 208) where objects could define __boolean_and__ to overload x and y. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at home.com (Tim Peters) Message-ID: Sounds good to me! It's a very cheap way to get the high bits into play. > i = (~_hash) & mask The ~ here seems like pure superstition to me (and the comments in the C code don't justify it at all -- I added a nag of my own about that the last time I checked in dictobject.c -- and see below for a bad consequence of doing ~). > # note that we do not mask! > # even the shifting may not be worth it. > incr = _hash ^ (_hash >> 3) The shifting was just another cheap trick to get *some* influence from the high bits. It's very limited, though. Toss it (it appears to be from the "random operations yield random results" matchbook school of design). [MAL] > BTW, would changing the hash function on strings from the simple > XOR scheme to something a little smarter help improve the performance > too (e.g. most strings used in programming never use the 8-th > bit) ? Don't understand -- the string hash uses multiplication: x = (1000003*x) ^ *p++; in a loop. Replacing "^" there by "+" should yield slightly better results. As is, string hashes are a lot like integer hashes, in that "consecutive" strings J001 J002 J003 J004 ... yield hashes very close together in value.
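The clustering claim is easy to check by transliterating the quoted C loop to Python. This is a sketch of the 2.0-era string hash (initial x = first byte << 7, the multiply/XOR loop, then XOR with the length), with 32-bit masking standing in for C's wraparound arithmetic; it is not byte-for-byte the CPython function.

```python
# Transliteration of the string hash quoted above:
#   x = *p << 7; while (--len >= 0) x = (1000003*x) ^ *p++; x ^= len;
# Masking to 32 bits emulates C unsigned-long wraparound.

def old_string_hash(s):
    if not s:
        return 0
    x = ord(s[0]) << 7
    for ch in s:
        x = ((1000003 * x) ^ ord(ch)) & 0xFFFFFFFF
    return x ^ len(s)

h1, h2 = old_string_hash("J001"), old_string_hash("J002")
print(h1 ^ h2)  # 3 -- only the lowest two bits differ
```

Since "J001" and "J002" share their first three characters, the hashes differ only by the XOR of the final bytes ('1' ^ '2' == 3), which is exactly the "consecutive strings yield close hashes" behavior described.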
But, because the current dict algorithm uses ~ on the full hash but does not use ~ on the initial increment, (~hash)+incr too often yields the same result for distinct hashes (i.e., there's a systematic (but weak) form of clustering). Note that Python is doing something very unusual: hashes are *usually* designed to yield an approximation to randomness across all bits. But Python's hashes never achieve that. This drives theoreticians mad (like the fellow who originally came up with the GF idea), but tends to work "better than random" in practice (e.g., a truly random hash function would almost certainly produce many collisions when fed a fat range of consecutive integers but still less than half the table size; but Python's trivial "identity" integer hash produces no collisions in that common case). [Christian] > - The bits used from the string hash are not well distributed > - using a "warmup wheel" on the hash to suck all bits in > gives the same quality of hashes like random numbers. See above and be very cautious: none of Python's hash functions produce well-distributed bits, and-- in effect --that's why Python dicts often perform "better than random" on common data. Even what you've done so far appears to provide marginally worse statistics for Guido's favorite kind of test case ("worse" in two senses: total number of collisions (a measure of amortized lookup cost), and maximum collision chain length (a measure of worst-case lookup cost)): d = {} for i in range(N): d[repr(i)] = i check-in-one-thing-then-let-it-simmer-ly y'rs - tim From tismer at tismer.com Tue Dec 19 02:16:27 2000 From: tismer at tismer.com (Christian Tismer) Date: Tue, 19 Dec 2000 02:16:27 +0100 Subject: [Python-Dev] The Dictionary Gem is polished! References: Message-ID: <3A3EB6EB.C79A3896@tismer.com> Greg Wilson wrote: > > > > > Here some results, dictionaries have 1000 entries: > > I will publish some results later today. > > In Doctor Dobb's Journal, right? 
:-) We'd *really* like this article... Well, the results are not so bad: I stopped testing computation time for the Python dictionary implementation, in favor of "trips". How many trips does the re-hash take in a dictionary? Tests were run for dictionaries of size 1000, 2000, 3000, 4000. Dictionary 1 consists of i, formatted as string. Dictionary 2 consists of strings containing the binary of i. Dictionary 3 consists of random numbers. Dictionary 4 consists of i << 16. Algorithms: old is the original dictionary algorithm implemented in Python (probably quite correct now, using longs :-) new is the proposed incremental bits-suck-in-division algorithm. new2 is a version of new, where all extra bits of the hash function are wheeled in in advance. The computation time of this is not negligible, so please use this result for reference only. Here the results: (bad integers(old) not computed for n>1000 )
"""
D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293 new=302 new2=221
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=499500 new=13187 new2=999
trips for random integers old=377 new=371 new2=393
trips for windows names old=230 new=207 new2=200
N=2000
trips for strings old=1093 new=1109 new2=786
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=26455 new2=1999
trips for random integers old=691 new=710 new2=741
trips for windows names old=503 new=542 new2=564
N=3000
trips for strings old=810 new=839 new2=609
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=38681 new2=2999
trips for random integers old=762 new=740 new2=735
trips for windows names old=712 new=711 new2=691
N=4000
trips for strings old=1850 new=1843 new2=1375
trips for bin strings old=0 new=0 new2=0
trips for bad integers old=0 new=52994 new2=3999
trips for random integers old=1440 new=1450 new2=1414
trips for windows names old=1449 new=1434 new2=1457
D:\crml_doc\platf\py>
"""
Interpretation:
---------------
Short
numeric strings show a slightly too high trip number. This means that the hash() function could be enhanced. But the effect would be below 10 percent compared to random hashes, therefore not worth it. Binary representations of numbers as strings still create perfect hash numbers. Bad integers (complete hash clash due to high power of 2) are handled fairly well by the new algorithm. "new2" shows that they can be brought down to nearly perfect hashes just by applying the "hash melting wheel":
Windows names are almost upper case, and almost verbose. They appear to perform nearly as well as random numbers. This means: The Python string hash function is very good for a wide area of applications.
In Summary: I would try to modify the string hash function slightly for short strings, but only if this does not negatively affect the results of above.
Summary of summary: There is no really low hanging fruit in string hashing.
ciao - chris
-- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com
-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team
# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.
##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/ polys = [ 4 + 3, 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3, 256 + 29, 512 + 17, 1024 + 9, 2048 + 5, 4096 + 83, 8192 + 27, 16384 + 43, 32768 + 3, 65536 + 45, 131072 + 9, 262144 + 39, 524288 + 39, 1048576 + 9, 2097152 + 5, 4194304 + 3, 8388608 + 33, 16777216 + 27, 33554432 + 9, 67108864 + 71, 134217728 + 39, 268435456 + 9, 536870912 + 5, 1073741824 + 83, 0 ] polys = map(long, polys) class NULL: pass class Dictionary: dummy = "" def __init__(mp, newalg=0): mp.ma_size = 0 mp.ma_poly = 0 mp.ma_table = [] mp.ma_fill = 0 mp.ma_used = 0 mp.oldalg = not newalg mp.warmup = newalg>1 mp.trips = 0 def getTrips(self): trips = self.trips self.trips = 0 return trips def lookdict(mp, key, _hash): me_hash, me_key, me_value = range(3) # rec slots dummy = mp.dummy mask = mp.ma_size-1 ep0 = mp.ma_table i = (~_hash) & mask ep = ep0[i] if ep[me_key] is NULL or ep[me_key] == key: return ep if ep[me_key] == dummy: freeslot = ep else: if (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0) : return ep freeslot = NULL ###### FROM HERE if mp.oldalg: incr = (_hash ^ (_hash >> 3)) & mask else: # note that we do not mask! # the shifting is worth it in the incremental case. ## added after posting to python-dev: uhash = _hash & 0xffffffffl if mp.warmup: incr = uhash mask2 = 0xffffffffl ^ mask while mask2 > mask: if (incr & 1): incr = incr ^ mp.ma_poly incr = incr >> 1 mask2 = mask2>>1 # this loop *can* be sped up by tables # with precomputed multiple shifts. # But I'm not sure if it is worth it at all. 
else: incr = uhash ^ (uhash >> 3) ###### TO HERE if (not incr): incr = mask while 1: mp.trips = mp.trips+1 ep = ep0[int((i+incr)&mask)] if (ep[me_key] is NULL) : if (freeslot is not NULL) : return freeslot else: return ep if (ep[me_key] == dummy) : if (freeslot == NULL): freeslot = ep elif (ep[me_key] == key or (ep[me_hash] == _hash and cmp(ep[me_key], key) == 0)) : return ep # Cycle through GF(2^n)-{0} ###### FROM HERE if mp.oldalg: incr = incr << 1 if (incr > mask): incr = incr ^ mp.ma_poly else: # new algorithm: do a division if (incr & 1): incr = incr ^ mp.ma_poly incr = incr >> 1 ###### TO HERE def insertdict(mp, key, _hash, value): me_hash, me_key, me_value = range(3) # rec slots ep = mp.lookdict(key, _hash) if (ep[me_value] is not NULL) : old_value = ep[me_value] ep[me_value] = value else : if (ep[me_key] is NULL): mp.ma_fill=mp.ma_fill+1 ep[me_key] = key ep[me_hash] = _hash ep[me_value] = value mp.ma_used = mp.ma_used+1 def dictresize(mp, minused): me_hash, me_key, me_value = range(3) # rec slots oldsize = mp.ma_size oldtable = mp.ma_table MINSIZE = 4 newsize = MINSIZE for i in range(len(polys)): if (newsize > minused) : newpoly = polys[i] break newsize = newsize << 1 else: return -1 _nullentry = range(3) _nullentry[me_hash] = 0 _nullentry[me_key] = NULL _nullentry[me_value] = NULL newtable = map(lambda x,y=_nullentry:y[:], range(newsize)) mp.ma_size = newsize mp.ma_poly = newpoly mp.ma_table = newtable mp.ma_fill = 0 mp.ma_used = 0 for ep in oldtable: if (ep[me_value] is not NULL): mp.insertdict(ep[me_key],ep[me_hash],ep[me_value]) return 0 # PyDict_GetItem def __getitem__(op, key): me_hash, me_key, me_value = range(3) # rec slots if not op.ma_table: raise KeyError, key _hash = hash(key) return op.lookdict(key, _hash)[me_value] # PyDict_SetItem def __setitem__(op, key, value): mp = op _hash = hash(key) ## /* if fill >= 2/3 size, double in size */ if (mp.ma_fill*3 >= mp.ma_size*2) : if (mp.dictresize(mp.ma_used*2) != 0): if (mp.ma_fill+1 > mp.ma_size): 
raise MemoryError mp.insertdict(key, _hash, value) # more interface functions def keys(self): me_hash, me_key, me_value = range(3) # rec slots res = [] for _hash, _key, _value in self.ma_table: if _value is not NULL: res.append( _key) return res def values(self): me_hash, me_key, me_value = range(3) # rec slots res = [] for _hash, _key, _value in self.ma_table: if _value is not NULL: res.append( _value) return res def items(self): me_hash, me_key, me_value = range(3) # rec slots res = [] for _hash, _key, _value in self.ma_table: if _value is not NULL: res.append( (_key, _value) ) return res def __cmp__(self, other): mine = self.items() others = other.items() mine.sort() others.sort() return cmp(mine, others) ###################################################### ## tests def test(lis, dic): for key in lis: dic[key] def nulltest(lis, dic): for key in lis: dic def string_dicts(n): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash d3 = Dictionary(2) # with warmup for i in range(n): s = str(i) #* 5 #s = chr(i%256) + chr(i>>8)## d1[s] = d2[s] = d3[s] = i return d1, d2, d3 def istring_dicts(n): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash d3 = Dictionary(2) # with warmup for i in range(n): s = chr(i%256) + chr(i>>8) d1[s] = d2[s] = d3[s] = i return d1, d2, d3 def random_dicts(n): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash d3 = Dictionary(2) # with warmup from whrandom import randint import sys keys = [] for i in range(n): keys.append(randint(0, sys.maxint-1)) for i in keys: d1[i] = d2[i] = d3[i] = i return d1, d2, d3 def badnum_dicts(n): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash d3 = Dictionary(2) # with warmup shift = 10 if EXTREME: shift = 16 for i in range(n): bad = i << 16 d2[bad] = d3[bad] = i if n <= 1000: d1[bad] = i return d1, d2, d3 def names_dicts(n): d1 = Dictionary() # original d2 = Dictionary(1) # other rehash d3 = Dictionary(2) # with warmup import win32con keys = 
win32con.__dict__.keys() if len(keys) < n: keys = [] for s in keys[:n]: d1[s] = d2[s] = d3[s] = s return d1, d2, d3 def do_test(dict): keys = dict.keys() dict.getTrips() # reset test(keys, dict) return dict.getTrips() EXTREME=1 if __name__ == "__main__": for N in (1000,2000,3000,4000): sdold, sdnew, sdnew2 = string_dicts(N) idold, idnew, idnew2 = istring_dicts(N) bdold, bdnew, bdnew2 = badnum_dicts(N) rdold, rdnew, rdnew2 = random_dicts(N) ndold, ndnew, ndnew2 = names_dicts(N) print "N=%d" %N print "trips for strings old=%d new=%d new2=%d" % tuple( map(do_test, (sdold, sdnew, sdnew2)) ) print "trips for bin strings old=%d new=%d new2=%d" % tuple( map(do_test, (idold, idnew, idnew2)) ) print "trips for bad integers old=%d new=%d new2=%d" % tuple( map(do_test, (bdold, bdnew, bdnew2))) print "trips for random integers old=%d new=%d new2=%d" % tuple( map(do_test, (rdold, rdnew, rdnew2))) print "trips for windows names old=%d new=%d new2=%d" % tuple( map(do_test, (ndold, ndnew, ndnew2))) """ Results with a shift of 10 (EXTREME=0): D:\crml_doc\platf\py>python dictest.py timing for strings old=5.097 new=5.088 timing for bad integers old=101.540 new=12.610 Results with a shift of 16 (EXTREME=1): D:\crml_doc\platf\py>python dictest.py timing for strings old=5.218 new=5.147 timing for bad integers old=571.210 new=19.220 """ From tismer at tismer.com Tue Dec 19 02:51:32 2000 From: tismer at tismer.com (Christian Tismer) Date: Tue, 19 Dec 2000 02:51:32 +0100 Subject: [Python-Dev] Re: The Dictionary Gem is polished! References: Message-ID: <3A3EBF23.750CF761@tismer.com> Tim Peters wrote: > > Sounds good to me! It's a very cheap way to get the high bits into play. That's what I wanted to hear. It's also the reason why I try to stay conservative: Just do an obviously useful bit, but do not break any of the inherent benefits, like those "better than random" amenities. 
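The "division" step at the heart of the proposal is tiny, and its key property is easy to check in isolation. Here is a self-contained sketch (the polynomial 16 + 3 is taken from the polys table in dictest.py above; the function name is mine):

```python
def division_wheel(poly, nbits):
    # One backward "division" step per probe: if the low bit is set,
    # fold in the polynomial, then shift right.
    incr, seen = 1, []
    for _ in range(2 ** nbits - 1):
        seen.append(incr)
        if incr & 1:
            incr ^= poly
        incr >>= 1
    return seen

# x^4 + x + 1 (encoded as 16 + 3, as in the polys table) is irreducible
# over GF(2), so the step visits every nonzero 4-bit increment exactly
# once before repeating -- the probe sequence cannot get stuck:
values = division_wheel(16 + 3, 4)
assert sorted(values) == list(range(1, 16))
```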
Python's dictionary algorithm appears to be "near perfect" and of "never touch but veery carefully or redo it completely". I tried the tightrope walk of just adding a tiny topping.

> > i = (~_hash) & mask

Yes that stuff was 2 hours last nite :-) I just decided to not touch it. Arbitrary crap! Although an XOR with hash >> number of mask bits would perform much better (in many cases but not all). Anyway, simple shifting cannot solve general bit distribution problems. Nor can I :-)

> The ~ here seems like pure superstition to me (and the comments in the C
> code don't justify it at all -- I added a nag of my own about that the last
> time I checked in dictobject.c -- and see below for a bad consequence of
> doing ~).
>
> > # note that we do not mask!
> > # even the shifting my not be worth it.
> > incr = _hash ^ (_hash >> 3)
>
> The shifting was just another cheap trick to get *some* influence from the
> high bits. It's very limited, though. Toss it (it appears to be from the
> "random operations yield random results" matchbook school of
> design).

Now, comment it out, and you see my new algorithm perform much worse. I just kept it since it had an advantage on "my case". (bad guy I know). And I wanted to have an argument for my change to get accepted. "No cost, just profit, nearly the same" was what I tried to sell.

> [MAL]
> > BTW, would changing the hash function on strings from the simple
> > XOR scheme to something a little smarter help improve the performance
> > too (e.g. most strings used in programming never use the 8-th
> > bit) ?
>
> Don't understand -- the string hash uses multiplication:
>
> x = (1000003*x) ^ *p++;
>
> in a loop. Replacing "^" there by "+" should yield slightly better results.

For short strings, this prime has bad influence on the low bits, making it perform suboptimally for small dicts. See the new2 algo which funnily corrects for that.
The reason is obvious: Just look at the bit pattern of 1000003: '0xf4243' Without giving proof, this smells like bad bit distribution on small strings to me. You smell it too, right? > As is, string hashes are a lot like integer hashes, in that "consecutive" > strings > > J001 > J002 > J003 > J004 > ... > > yield hashes very close together in value. A bad generator in that case. I'll look for a better one. > But, because the current dict > algorithm uses ~ on the full hash but does not use ~ on the initial > increment, (~hash)+incr too often yields the same result for distinct hashes > (i.e., there's a systematic (but weak) form of clustering). You name it. > Note that Python is doing something very unusual: hashes are *usually* > designed to yield an approximation to randomness across all bits. But > Python's hashes never achieve that. This drives theoreticians mad (like the > fellow who originally came up with the GF idea), but tends to work "better > than random" in practice (e.g., a truly random hash function would almost > certainly produce many collisions when fed a fat range of consecutive > integers but still less than half the table size; but Python's trivial > "identity" integer hash produces no collisions in that common case). A good reason to be careful with changes(ahem). > [Christian] > > - The bits used from the string hash are not well distributed > > - using a "warmup wheel" on the hash to suck all bits in > > gives the same quality of hashes like random numbers. > > See above and be very cautious: none of Python's hash functions produce > well-distributed bits, and-- in effect --that's why Python dicts often > perform "better than random" on common data. 
Even what you've done so far > appears to provide marginally worse statistics for Guido's favorite kind of > test case ("worse" in two senses: total number of collisions (a measure of > amortized lookup cost), and maximum collision chain length (a measure of > worst-case lookup cost)): > > d = {} > for i in range(N): > d[repr(i)] = i Nah, I did quite a lot of tests, and the trip number shows a variation of about 10%, without judging old or new for better. This is just the randomness inside. > check-in-one-thing-then-let-it-simmer-ly y'rs - tim This is why I think to be even more conservative: Try to use a division wheel, but with the inverses of the original primitive roots, just in order to get at Guido's results :-) making-perfect-hashes-of-interneds-still-looks-promising - ly y'rs - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From greg at cosc.canterbury.ac.nz Tue Dec 19 04:07:56 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue, 19 Dec 2000 16:07:56 +1300 (NZDT) Subject: [Python-Dev] PEP 207 -- Rich Comparisons In-Reply-To: <200012182332.SAA18456@cj20424-a.reston1.va.home.com> Message-ID: <200012190307.QAA02663@s454.cosc.canterbury.ac.nz> > The problem I have with this is that the code to evaluate g() has to > be generated twice! I have an idea how to fix that. There need to be two methods, __boolean_and_1__ and __boolean_and_2__. The first operand is evaluated and passed to __boolean_and_1__. If it returns a result, that becomes the result of the expression, and the second operand is short-circuited. 
If __boolean_and_1__ raises a NeedOtherOperand exception (or there is no __boolean_and_1__ method), the second operand is evaluated, and both operands are passed to __boolean_and_2__. The bytecode would look something like BOOLEAN_AND_1 label BOOLEAN_AND_2 label: ... I'll make a PEP out of this one day if I get enthusiastic enough. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg at cosc.canterbury.ac.nz +--------------------------------------+ From tim.one at home.com Tue Dec 19 05:55:33 2000 From: tim.one at home.com (Tim Peters) Date: Mon, 18 Dec 2000 23:55:33 -0500 Subject: [Python-Dev] The Dictionary Gem is polished! In-Reply-To: <3A3EB6EB.C79A3896@tismer.com> Message-ID: Something else to ponder: my tests show that the current ("old") algorithm performs much better (somewhat worse than "new2" == new algorithm + warmup) if incr is simply initialized like so instead: if mp.oldalg: incr = (_hash & 0xffffffffL) % (mp.ma_size - 1) That's another way to get all the bits to contribute to the result. Note that a mod by size-1 is analogous to "casting out nines" in decimal: it's the same as breaking hash into fixed-sized pieces from the right (10 bits each if size=2**10, etc), adding the pieces together, and repeating that process until only one piece remains. IOW, it's a degenerate form of division, but works well all the same. It didn't improve over that when I tried a mod by the largest prime less than the table size (which suggests we're sucking all we can out of the *probe* sequence given a sometimes-poor starting index). However, it's subject to the same weak clustering phenomenon as the old method due to the ill-advised "~hash" operation in computing the initial index. 
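Tim's "casting out nines" analogy can be checked directly: for a modulus of the form 2**k - 1, repeatedly adding the k-bit pieces of a hash is the same as reducing it mod 2**k - 1. A small illustration (function name mine):

```python
def fold_pieces(h, k):
    # Add k-bit pieces of h from the right until one piece remains.
    # Since 2**k == 1 (mod 2**k - 1), each fold preserves h mod (2**k - 1).
    m = (1 << k) - 1
    while h > m:
        h = (h & m) + (h >> k)
    return 0 if h == m else h

# With a table of size 2**10, "mod by size-1" means mod 1023:
for h in (0, 42, 1023, 1024, 123456789, 2**32 - 1):
    assert fold_pieces(h, 10) == h % 1023
```

So every piece of the hash contributes to the increment, which is exactly the property the mod-by-size-1 trick is after.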
If ~ is also thrown away, it's as good as new2 (here I've tossed out the "windows names", and "old" == existing algorithm except (a) get rid of ~ when computing index and (b) do mod by size-1 when computing incr): N=1000 trips for strings old=230 new=261 new2=239 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=999 new=13212 new2=999 trips for random integers old=399 new=421 new2=410 N=2000 trips for strings old=787 new=1066 new2=827 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=0 new=26481 new2=1999 trips for random integers old=652 new=733 new2=650 N=3000 trips for strings old=547 new=760 new2=617 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=0 new=38701 new2=2999 trips for random integers old=724 new=743 new2=768 N=4000 trips for strings old=1311 new=1657 new2=1315 trips for bin strings old=0 new=0 new2=0 trips for bad integers old=0 new=53014 new2=3999 trips for random integers old=1476 new=1587 new2=1493 The new and new2 values differ in minor ways from the ones you posted because I got rid of the ~ (the ~ has a bad interaction with "additive" means of computing incr, because the ~ tends to move the index in the opposite direction, and these moves in opposite directions tend to cancel out when computing incr+index the first time). too-bad-mod-is-expensive!-ly y'rs - tim From tim.one at home.com Tue Dec 19 06:50:01 2000 From: tim.one at home.com (Tim Peters) Date: Tue, 19 Dec 2000 00:50:01 -0500 Subject: [Python-Dev] SourceForge SSH silliness In-Reply-To: <20001217220008.D29681@xs4all.nl> Message-ID: [Tim] > Starting last night, I get this msg whenever I update Python code w/ > CVSROOT=:ext:tim_one at cvs.python.sourceforge.net:/cvsroot/python: > > """ > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ > @ WARNING: HOST IDENTIFICATION HAS CHANGED! @ > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ > IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY! 
> Someone could be eavesdropping on you right now
> (man-in-the-middle attack)!
> It is also possible that the host key has just been changed.
> Please contact your system administrator.
> Add correct host key in C:\Code/.ssh/known_hosts to get rid of
> this message.
> Password authentication is disabled to avoid trojan horses.
> """
>
> This is SourceForge's doing, and is permanent (they've changed
> keys on their end). ...

[Thomas Wouters]
> What sourceforge did was switch Linux distributions, and upgrade.
> The switch doesn't really matter for the SSH problem, because recent
> Debian and recent RedHat releases both use a new ssh, the OpenBSD
> ssh implementation.
> Apparently, it isn't entirely backwards compatible to old versions of
> F-secure ssh. For one thing, it doesn't support the 'idea' cypher. This
> might or might not be your problem; if it is, you should get a decent
> message that gives a relatively clear message such as 'cypher type 'idea'
> not supported'.
> ... [and quite a bit more] ...

I hope you're feeling better today . "The problem" was one the warning msg spelled out: "It is also possible that the host key has just been changed.". SF changed keys. That's the whole banana right there. Deleting the sourceforge keys from known_hosts fixed it (== convinced ssh to install new SF keys the next time I connected).

From tim.one at home.com Tue Dec 19 06:58:45 2000
From: tim.one at home.com (Tim Peters)
Date: Tue, 19 Dec 2000 00:58:45 -0500
Subject: [Python-Dev] new draft of PEP 227
In-Reply-To: <200012171438.JAA21603@cj20424-a.reston1.va.home.com>
Message-ID: 

[Tim]
> I expect it would do less harm to introduce a compile-time warning for
> locals that are never referenced (such as the "a" in "set").

[Guido]
> Another warning that would be quite useful (and trap similar cases)
> would be "local variable used before set".
Java elevated that last one to a compile-time error, via its "definite assignment" rules: you not only have to make sure a local is bound before reference, you have to make it *obvious* to the compiler that it's bound before reference. I think this is a Good Thing, because with intense training, people can learn to think like a compiler too . Seriously, in several of the cases where gcc warned about "maybe used before set" in the Python implementation, the warnings were bogus but it was non-trivial to deduce that. Such code is very brittle under modification, and the definite assignment rules make that path to error a non-starter. Example: def f(N): if N > 0: for i in range(N): if i == 0: j = 42 else: f2(i) elif N <= 0: j = 24 return j It's a Crime Against Humanity to make the code reader *deduce* that j is always bound by the time "return" is executed. From guido at python.org Tue Dec 19 07:08:14 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 01:08:14 -0500 Subject: [Python-Dev] Error: syncmail script missing Message-ID: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> I just checked in the documentation for the warnings module. (Check it out!) When I ran "cvs commit" in the Doc directory, it said, amongst other things: sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory I suppose this may be a side effect of the transition to new hardware of the SourceForge CVS archive. (Which, by the way, has dramatically improved the performance of typical CVS operations -- I am no longer afraid to do a cvs diff or cvs log in Emacs, or to do a cvs update just to be sure.) Could some of the Powers That Be (Fred or Barry :-) check into what happened to the syncmail script? --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Tue Dec 19 07:10:04 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Tue, 19 Dec 2000 01:10:04 -0500 (EST) Subject: [Python-Dev] Error: syncmail script missing In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> Message-ID: <14910.64444.662460.48236@cj42289-a.reston1.va.home.com> Guido van Rossum writes: > Could some of the Powers That Be (Fred or Barry :-) check into what > happened to the syncmail script? We've seen this before, but I'm not sure what it was. Barry, do you recall? Had the Python interpreter landed in a different directory? Or perhaps the location of the CVS repository is different, so syncmail isn't where loginfo says. Tomorrow... scp to SF appears broken as well. ;( -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From tim.one at home.com Tue Dec 19 07:16:15 2000 From: tim.one at home.com (Tim Peters) Date: Tue, 19 Dec 2000 01:16:15 -0500 Subject: [Python-Dev] Error: syncmail script missing In-Reply-To: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > I just checked in the documentation for the warnings module. (Check > it out!) Everyone should note that this means Guido will be taking his traditional post-release vacation almost immediately . > When I ran "cvs commit" in the Doc directory, it said, amongst other > things: > > sh: /cvsroot/python/CVSROOT/syncmail: No such file or directory > > I suppose this may be a side effect of the transition to new hardware > of the SourceForge CVS archive. The lack of checkin mail was first noted on a Jython list. Finn wisely replied that he'd just sit back and wait for the CPython people to figure out how to fix it. > ... > Could some of the Powers That Be (Fred or Barry :-) check into what > happened to the syncmail script? Don't worry, I'll do my part by nagging them in your absence . Bon holiday voyage! 
From cgw at fnal.gov Tue Dec 19 07:32:15 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Tue, 19 Dec 2000 00:32:15 -0600 (CST) Subject: [Python-Dev] cycle-GC question Message-ID: <14911.239.12288.546710@buffalo.fnal.gov> The following program: import rexec while 1: x = rexec.RExec() del x leaks memory at a fantastic rate. It seems clear (?) that this is due to the call to "set_rexec" at rexec.py:140, which creates a circular reference between the `rexec' and `hooks' objects. (There's even a nice comment to that effect). I'm curious however as to why the spiffy new cyclic-garbage collector doesn't pick this up? Just-wondering-ly y'rs, cgw From tim_one at email.msn.com Tue Dec 19 10:24:18 2000 From: tim_one at email.msn.com (Tim Peters) Date: Tue, 19 Dec 2000 04:24:18 -0500 Subject: [Python-Dev] RE: The Dictionary Gem is polished! In-Reply-To: <3A3EBF23.750CF761@tismer.com> Message-ID: [Christian Tismer] > ... > For short strings, this prime has bad influence on the low bits, > making it perform supoptimally for small dicts. > See the new2 algo which funnily corrects for that. > The reason is obvious: Just look at the bit pattern > of 1000003: '0xf4243' > > Without giving proof, this smells like bad bit distribution on small > strings to me. You smell it too, right? > ... [Tim] > As is, string hashes are a lot like integer hashes, in that > "consecutive" strings > > J001 > J002 > J003 > J004 > ... > > yield hashes very close together in value. [back to Christian] > A bad generator in that case. I'll look for a better one. Not necessarily! It's for that same reason "consecutive strings" can have "better than random" behavior today. And consecutive strings-- like consecutive ints --are a common case. 
Here are the numbers for the synthesized string cases: N=1000 trips for strings old=293 new=302 new2=221 trips for bin strings old=0 new=0 new2=0 N=2000 trips for strings old=1093 new=1109 new2=786 trips for bin strings old=0 new=0 new2=0 N=3000 trips for strings old=810 new=839 new2=609 trips for bin strings old=0 new=0 new2=0 N=4000 trips for strings old=1850 new=1843 new2=1375 trips for bin strings old=0 new=0 new2=0 Here they are again, after doing nothing except changing the "^" to "+" in the string hash, i.e. replacing x = (1000003*x) ^ *p++; by x = (1000003*x) + *p++; N=1000 trips for strings old=140 new=127 new2=108 trips for bin strings old=0 new=0 new2=0 N=2000 trips for strings old=480 new=434 new2=411 trips for bin strings old=0 new=0 new2=0 N=3000 trips for strings old=821 new=857 new2=631 trips for bin strings old=0 new=0 new2=0 N=4000 trips for strings old=1892 new=1852 new2=1476 trips for bin strings old=0 new=0 new2=0 The first two sizes are dramatically better, the last two a wash. If you want to see a real disaster, replace the "+" with "*" : N=1000 trips for strings old=71429 new=6449 new2=2048 trips for bin strings old=81187 new=41117 new2=41584 N=2000 trips for strings old=26882 new=9300 new2=6103 trips for bin strings old=96018 new=46932 new2=42408 I got tired of waiting at that point ... suspecting-a-better-string-hash-is-hard-to-find-ly y'rs - tim From martin at loewis.home.cs.tu-berlin.de Tue Dec 19 12:58:17 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 19 Dec 2000 12:58:17 +0100 Subject: [Python-Dev] Death to string functions! Message-ID: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> > I agree it would be useful to define these terms, although those > particular definitions appear to be missing the most important point > from the user's POV (not a one says "going away someday"). PEP 4 says # Usage of a module may be `deprecated', which means that it may be # removed from a future Python release. 
Proposals for better wording are welcome (and yes, I still have to get the comments that I got into the document). Regards, Martin From guido at python.org Tue Dec 19 15:48:47 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 09:48:47 -0500 Subject: [Python-Dev] cycle-GC question In-Reply-To: Your message of "Tue, 19 Dec 2000 00:32:15 CST." <14911.239.12288.546710@buffalo.fnal.gov> References: <14911.239.12288.546710@buffalo.fnal.gov> Message-ID: <200012191448.JAA28737@cj20424-a.reston1.va.home.com> > The following program: > > import rexec > while 1: > x = rexec.RExec() > del x > > leaks memory at a fantastic rate. > > It seems clear (?) that this is due to the call to "set_rexec" at > rexec.py:140, which creates a circular reference between the `rexec' > and `hooks' objects. (There's even a nice comment to that effect). > > I'm curious however as to why the spiffy new cyclic-garbage collector > doesn't pick this up? Me too. I turned on gc debugging (gc.set_debug(077) :-) and got messages suggesting that it is not collecting everything. The output looks like this: . . . gc: collecting generation 0... gc: objects in each generation: 764 6726 89174 gc: done. gc: collecting generation 1... gc: objects in each generation: 0 8179 89174 gc: done. gc: collecting generation 0... gc: objects in each generation: 764 0 97235 gc: done. gc: collecting generation 0... gc: objects in each generation: 757 747 97184 gc: done. gc: collecting generation 0... gc: objects in each generation: 764 1386 97184 gc: done. gc: collecting generation 0... gc: objects in each generation: 757 2082 97184 gc: done. gc: collecting generation 0... gc: objects in each generation: 764 2721 97184 gc: done. gc: collecting generation 0... gc: objects in each generation: 757 3417 97184 gc: done. gc: collecting generation 0... gc: objects in each generation: 764 4056 97184 gc: done. . . . With the third number growing each time a "generation 1" collection is done. 
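The cycle that set_rexec creates can be reduced to a miniature, self-contained sketch (class and attribute names here are illustrative, not the actual rexec code). A plain two-object cycle like this one *is* found by the collector, which is what makes the report above puzzling:

```python
import gc

class Hooks:
    pass

class RExecLike:
    def __init__(self):
        self.hooks = Hooks()
        self.hooks.rexec = self  # circular reference, as in set_rexec

gc.collect()   # start from a clean slate
gc.disable()   # so no automatic collection sneaks in during the loop
for _ in range(100):
    x = RExecLike()
    del x        # refcounting alone cannot reclaim the pair
unreachable = gc.collect()  # but an explicit collection can
gc.enable()
assert unreachable >= 200   # at least the 100 RExecLike/Hooks pairs
```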
Maybe Neil can shed some light? The gc.garbage list is empty. This is about as much as I know about the GC stuff... --Guido van Rossum (home page: http://www.python.org/~guido/)

From petrilli at amber.org Tue Dec 19 16:25:18 2000 From: petrilli at amber.org (Christopher Petrilli) Date: Tue, 19 Dec 2000 10:25:18 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 12:58:17PM +0100 References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> Message-ID: <20001219102518.A14288@trump.amber.org>

So I was thinking about this whole thing, and wondering why it was that seeing things like:

    " ".join(aList)

bugged me to no end, while:

    aString.lower()

didn't seem to look wrong. I finally put my finger on it, and I haven't seen anyone mention it, so I guess I'll do so. To me, the concept of "join" on a string is just not quite kosher; instead it should be something like this:

    aList.join(" ")

or if you want it without the indirection:

    ['item', 'item', 'item'].join(" ")

Now *THAT* looks right to me. The example of a join method on a string just doesn't quite gel in my head, and I did some thinking and digging, and well, when I pulled up my Smalltalk browser, things like join are done on Collections, not on Strings. You're joining the collection, not the string.

Perhaps in a rush to move some things that were "string related" in the string module into methods on the strings themselves (something I whole-heartedly support), we moved a few too many things there---things that semantically don't really belong as methods on a string object.

How this gets resolved, I don't know... but I know a lot of people have looked at the string methods---and they each keep coming back to 1 or 2 that bug them... and I think it's those that really aren't methods of a string, but instead something that operates with strings, but expects other things.
Chris -- | Christopher Petrilli | petrilli at amber.org From guido at python.org Tue Dec 19 16:37:15 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 10:37:15 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Tue, 19 Dec 2000 10:25:18 EST." <20001219102518.A14288@trump.amber.org> References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org> Message-ID: <200012191537.KAA28909@cj20424-a.reston1.va.home.com> > So I was thinking about this whole thing, and wondering why it was > that seeing things like: > > " ".join(aList) > > bugged me to no end, while: > > aString.lower() > > didn't seem to look wrong. I finally put my finger on it, and I > haven't seen anyone mention it, so I guess I'll do so. To me, the > concept of "join" on a string is just not quite kosher, instead it > should be something like this: > > aList.join(" ") > > or if you want it without the indirection: > > ['item', 'item', 'item'].join(" ") > > Now *THAT* looks right to me. The example of a join method on a > string just doesn't quite gel in my head, and I did some thinking and > digging, and well, when I pulled up my Smalltalk browser, things like > join are done on Collections, not on Strings. You're joining the > collection, not the string. > > Perhaps in a rush to move some things that were "string related" in > the string module into methods on the strings themselves (something I > whole-heartedly support), we moved a few too many things > there---things that symantically don't really belong as methods on a > string object. > > How this gets resolved, I don't know... but I know a lot of people > have looked at the string methods---and they each keep coming back to > 1 or 2 that bug them... and I think it's those that really aren't > methods of a string, but instead something that operates with strings, > but expects other things. 
Boy, are you stirring up a can of worms that we've been through many times before! Nothing you say hasn't been said at least a hundred times before, on this list as well as on c.l.py.

The problem is that if you want to make this a method on lists, you'll also have to make it a method on tuples, and on arrays, and on NumPy arrays, and on any user-defined type that implements the sequence protocol... That's just not reasonable to expect. There really seem to be only two possibilities that don't have this problem: (1) make it a built-in, or (2) make it a method on strings. We chose (2) for uniformity, and to avoid the potential confusion with os.path.join(), which is sometimes imported as a local.

If " ".join(L) bugs you, try this:

    space = " "  # This could be a global
    .
    .
    .
    s = space.join(L)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry at digicool.com Tue Dec 19 16:46:55 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 19 Dec 2000 10:46:55 -0500 Subject: [Python-Dev] Death to string functions! References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org> Message-ID: <14911.33519.764029.306876@anthem.concentric.net>

>>>>> "CP" == Christopher Petrilli writes:

CP> So I was thinking about this whole thing, and wondering why it
CP> was that seeing things like:
CP>
CP>     " ".join(aList)
CP>
CP> bugged me to no end, while:
CP>
CP>     aString.lower()
CP>
CP> didn't seem to look wrong. I finally put my finger on it, and
CP> I haven't seen anyone mention it, so I guess I'll do so.

Actually, it has been debated to death. ;) This looks better:

    SPACE = ' '
    SPACE.join(aList)

That reads good to me ("space-join this list") and that's how I always write it. That said, there are certainly lots of people who agree with you.
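Option (1), a builtin, is easy to sketch by delegating to the string method — this is only an illustration of the idea, not anything that shipped:

```python
def join(seq, sep=" "):
    # Hypothetical builtin: join any sequence of strings with `sep`.
    # Delegating to the string method is what keeps it generic --
    # str.join() only requires its argument to be iterable.
    return sep.join(seq)

print(join(["item", "item", "item"]))       # item item item
print(join(("usr", "local", "bin"), "/"))   # usr/local/bin
```

Because str.join() accepts any iterable of strings, this one definition already covers lists, tuples, and user-defined sequences alike — which is the uniformity argument in a nutshell.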
You can't put join() on sequences though, until you have builtin base-classes, or interfaces, or protocols or some such construct, because otherwise you'd have to add it to EVERY sequence, including classes that act like sequences. One idea that I believe has merit is to consider adding join() to the builtins, probably with a signature like: join(aList, aString) -> aString This horse has been whacked pretty good too, but I don't remember seeing a patch or a pronouncement. -Barry From nas at arctrix.com Tue Dec 19 09:53:36 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 19 Dec 2000 00:53:36 -0800 Subject: [Python-Dev] cycle-GC question In-Reply-To: <200012191448.JAA28737@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Dec 19, 2000 at 09:48:47AM -0500 References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com> Message-ID: <20001219005336.A303@glacier.fnational.com> On Tue, Dec 19, 2000 at 09:48:47AM -0500, Guido van Rossum wrote: > > import rexec > > while 1: > > x = rexec.RExec() > > del x > > > > leaks memory at a fantastic rate. > > > > It seems clear (?) that this is due to the call to "set_rexec" at > > rexec.py:140, which creates a circular reference between the `rexec' > > and `hooks' objects. (There's even a nice comment to that effect). Line 140 is not the only place a circular reference is created. There is another one which is trickier to find: def add_module(self, mname): if self.modules.has_key(mname): return self.modules[mname] self.modules[mname] = m = self.hooks.new_module(mname) m.__builtins__ = self.modules['__builtin__'] return m If the module being added is __builtin__ then m.__builtins__ = m. The GC currently doesn't track modules. I guess it should. It might be possible to avoid this circular reference but I don't know enough about how RExec works. 
Would something like:

    def add_module(self, mname):
        if self.modules.has_key(mname):
            return self.modules[mname]
        self.modules[mname] = m = self.hooks.new_module(mname)
        if mname != '__builtin__':
            m.__builtins__ = self.modules['__builtin__']
        return m

do the trick?

Neil

From fredrik at effbot.org Tue Dec 19 16:39:49 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Tue, 19 Dec 2000 16:39:49 +0100 Subject: [Python-Dev] Death to string functions! References: <200012191158.MAA10475@loewis.home.cs.tu-berlin.de> <20001219102518.A14288@trump.amber.org> Message-ID: <008301c069d3$76560a20$3c6340d5@hagrid>

"Christopher Petrilli" wrote:
> didn't seem to look wrong. I finally put my finger on it, and I
> haven't seen anyone mention it, so I guess I'll do so. To me, the
> concept of "join" on a string is just not quite kosher, instead it
> should be something like this:
>
> aList.join(" ")
>
> or if you want it without the indirection:
>
> ['item', 'item', 'item'].join(" ")
>
> Now *THAT* looks right to me.

why do we keep coming back to this? aString.join can do anything string.join can do, but aList.join cannot. if you don't understand why, check the archives.

From martin at loewis.home.cs.tu-berlin.de Tue Dec 19 16:44:48 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 19 Dec 2000 16:44:48 +0100 Subject: [Python-Dev] cycle-GC question Message-ID: <200012191544.QAA11408@loewis.home.cs.tu-berlin.de>

> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects. (There's even a nice comment to that effect).

It's not all that clear that *this* is the cycle. In fact, it is not.

> I'm curious however as to why the spiffy new cyclic-garbage
> collector doesn't pick this up?

It's an interesting problem, so I spent this afternoon investigating it.
I soon found that I need a tool, so I introduced a new function gc.getreferents which, when given an object, returns a list of objects referring to that object. The patch for that feature is in http://sourceforge.net/patch/?func=detailpatch&patch_id=102925&group_id=5470

Applying that function recursively, I can get an output that looks like this:

dictionary 0x81f4f24
dictionary 0x81f4f24 (seen)
dictionary 0x81f4f24 (seen)
dictionary 0x8213bc4
dictionary 0x820869c
dictionary 0x820866c (seen)
dictionary 0x8213bf4
dictionary 0x820866c (seen)
dictionary 0x8214144
dictionary 0x820866c (seen)

Each indentation level shows the objects which refer to the outer-next object, e.g. the dictionary 0x820869c refers to the RExec instance, and the RHooks instance refers to that dictionary. Clearly, the dictionary 0x820869c is the RHooks' __dict__, and the reference belongs to the 'rexec' key in that dictionary. The recursion stops only when an object has been seen before (so it's a cycle, or other non-tree graph), or if there are no referents (the lists created to do the iteration are ignored).

So it appears that the r_import method is referenced from some dictionary, but that dictionary is not referenced anywhere??? Checking the actual structures shows that rexec creates a __builtin__ module, which has a dictionary that has an __import__ key. So the reference to the method comes from the __builtin__ module, which in turn is referenced as the RExec's .modules attribute, giving another cycle.

However, module objects don't participate in garbage collection. Therefore, gc.getreferents cannot traverse a module, and the garbage collector won't find a cycle involving a garbage module. I just submitted a bug report, http://sourceforge.net/bugs/?func=detailbug&bug_id=126345&group_id=5470 which suggests that modules should also participate in garbage collection.
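Today's gc.get_referrers() does what Martin's gc.getreferents patch describes; the recursive dump can be sketched like this (the depth cut-off and the "(seen)" marker for cycles are additions of this sketch):

```python
import gc

def referrer_tree(obj, depth=0, seen=None, out=None, max_depth=2):
    # Recursively list the objects referring to `obj`, one line per
    # object, indented by depth -- a sketch of the traversal Martin
    # describes (his patch called the function gc.getreferents).
    if seen is None:
        seen, out = set(), []
    line = "%s%s 0x%x" % ("  " * depth, type(obj).__name__, id(obj))
    if id(obj) in seen:
        out.append(line + " (seen)")   # cycle: stop here
        return out
    out.append(line)
    seen.add(id(obj))
    if depth < max_depth:              # keep the dump manageable
        for ref in gc.get_referrers(obj):
            referrer_tree(ref, depth + 1, seen, out, max_depth)
    return out

# Two lists referring to each other show the "(seen)" marker
# exactly where Martin's dump does:
a = []
b = [a]
a.append(b)
for entry in referrer_tree(a):
    print(entry)
```

Note that frames and module dicts also show up as referrers, which is precisely why a traversal like this stops at a fixed depth.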
Regards, Martin From guido at python.org Tue Dec 19 17:01:46 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 11:01:46 -0500 Subject: [Python-Dev] cycle-GC question In-Reply-To: Your message of "Tue, 19 Dec 2000 00:53:36 PST." <20001219005336.A303@glacier.fnational.com> References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com> <20001219005336.A303@glacier.fnational.com> Message-ID: <200012191601.LAA29015@cj20424-a.reston1.va.home.com> > might be possible to avoid this circular reference but I don't > know enough about how RExec works. Would something like: > > def add_module(self, mname): > if self.modules.has_key(mname): > return self.modules[mname] > self.modules[mname] = m = self.hooks.new_module(mname) > if mname != '__builtin__': > m.__builtins__ = self.modules['__builtin__'] > return m > > do the trick? That's certainly a good thing to do (__builtin__ has no business having a __builtins__!), but (in my feeble experiment) it doesn't make the leaks go away. Note that almost every module participates heavily in cycles: whenever you define a function f(), f.func_globals is the module's __dict__, which also contains a reference to f. Similar for classes, with an extra hop via the class object and its __dict__. --Guido van Rossum (home page: http://www.python.org/~guido/) From cgw at fnal.gov Tue Dec 19 17:06:06 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Tue, 19 Dec 2000 10:06:06 -0600 (CST) Subject: [Python-Dev] cycle-GC question In-Reply-To: <20001219005336.A303@glacier.fnational.com> References: <14911.239.12288.546710@buffalo.fnal.gov> <200012191448.JAA28737@cj20424-a.reston1.va.home.com> <20001219005336.A303@glacier.fnational.com> Message-ID: <14911.34670.664178.418523@buffalo.fnal.gov> Neil Schemenauer writes: > > Line 140 is not the only place a circular reference is created. 
> There is another one which is trickier to find: > > def add_module(self, mname): > if self.modules.has_key(mname): > return self.modules[mname] > self.modules[mname] = m = self.hooks.new_module(mname) > m.__builtins__ = self.modules['__builtin__'] > return m > > If the module being added is __builtin__ then m.__builtins__ = m. > The GC currently doesn't track modules. I guess it should. It > might be possible to avoid this circular reference but I don't > know enough about how RExec works. Would something like: > > def add_module(self, mname): > if self.modules.has_key(mname): > return self.modules[mname] > self.modules[mname] = m = self.hooks.new_module(mname) > if mname != '__builtin__': > m.__builtins__ = self.modules['__builtin__'] > return m > > do the trick? No... if you change "add_module" in exactly the way you suggest (without worrying about whether it breaks the functionality of rexec!) and run the test while 1: rexec.REXec() you will find that it still leaks memory at a prodigious rate. So, (unless there is yet another module-level cyclic reference) I don't think this theory explains the problem. From martin at loewis.home.cs.tu-berlin.de Tue Dec 19 17:07:04 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 19 Dec 2000 17:07:04 +0100 Subject: [Python-Dev] cycle-GC question Message-ID: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de> > There is another one which is trickier to find: [__builtin__.__builtins__ == __builtin__] > Would something like: [do not add builtins to builtin > work? No, because there is another one that is even trickier to find :-) >>> print r >>> print r.modules['__builtin__'].open.im_self Please see my other message; I think modules should be gc'ed. 
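Guido's point about f.func_globals (spelled f.__globals__ in later Pythons) is easy to check: any function is part of a cycle with the namespace it was defined in, which is why a collector that skips modules misses so much. A minimal sketch, building the namespace by hand rather than using a real module:

```python
# A function and its defining namespace always form a cycle: the
# namespace dict holds the function, and the function's globals
# pointer is that same dict.
ns = {}
exec("def f(): pass", ns)
f = ns["f"]

assert f.__globals__ is ns   # function -> namespace dict
assert ns["f"] is f          # namespace dict -> function: a cycle
```

The same holds one hop further for classes: module dict -> class -> class __dict__ -> methods -> module dict.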
Regards, Martin

From nas at arctrix.com Tue Dec 19 10:24:29 2000 From: nas at arctrix.com (Neil Schemenauer) Date: Tue, 19 Dec 2000 01:24:29 -0800 Subject: [Python-Dev] cycle-GC question In-Reply-To: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Dec 19, 2000 at 05:07:04PM +0100 References: <200012191607.RAA11680@loewis.home.cs.tu-berlin.de> Message-ID: <20001219012429.A520@glacier.fnational.com>

On Tue, Dec 19, 2000 at 05:07:04PM +0100, Martin v. Loewis wrote:
> I think modules should be gc'ed.

I agree. It's easy to do. If no one does over Christmas I will do it before 2.1 is released.

Neil

From tismer at tismer.com Tue Dec 19 16:48:58 2000 From: tismer at tismer.com (Christian Tismer) Date: Tue, 19 Dec 2000 17:48:58 +0200 Subject: [Python-Dev] The Dictionary Gem is polished! References: Message-ID: <3A3F836A.DEDF1011@tismer.com>

Tim Peters wrote:
>
> Something else to ponder: my tests show that the current ("old") algorithm
> performs much better (somewhat worse than "new2" == new algorithm + warmup)
> if incr is simply initialized like so instead:
>
> if mp.oldalg:
>     incr = (_hash & 0xffffffffL) % (mp.ma_size - 1)

Sure. I did this as well, but didn't consider a division since it is said to be too slow. But this is very platform dependent. On Pentiums this might not be noticeable.

> That's another way to get all the bits to contribute to the result. Note
> that a mod by size-1 is analogous to "casting out nines" in decimal: it's
> the same as breaking hash into fixed-sized pieces from the right (10 bits
> each if size=2**10, etc), adding the pieces together, and repeating that
> process until only one piece remains. IOW, it's a degenerate form of
> division, but works well all the same. It didn't improve over that when I
> tried a mod by the largest prime less than the table size (which suggests
> we're sucking all we can out of the *probe* sequence given a sometimes-poor
> starting index).

Again I tried this too.
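Tim's "casting out nines" analogy above can be verified directly: reducing a hash modulo 2**k - 1 gives the same answer as repeatedly adding k-bit pieces of it. The chunk size 10 matches his size=2**10 example; the helper name is made up:

```python
def chunk_sum_mod(h, k=10):
    # Fold `h` into k-bit pieces from the right and add them until a
    # single piece remains -- Tim's "degenerate form of division".
    mask = (1 << k) - 1
    while h > mask:
        h = (h & mask) + (h >> k)
    return 0 if h == mask else h   # 2**k - 1 itself reduces to 0

# Identical to a real mod by size-1 for a 2**10-entry table:
assert all(chunk_sum_mod(h) == h % ((1 << 10) - 1) for h in range(10**5))
```

The identity works because 2**k is congruent to 1 modulo 2**k - 1, so every k-bit "digit" contributes with weight 1 — exactly like summing decimal digits modulo 9.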
Instead of the largest near prime I used the nearest prime. Remarkably the nearest prime is identical to the primitive element in a lot of cases. But no improvement over the modulus. > > However, it's subject to the same weak clustering phenomenon as the old > method due to the ill-advised "~hash" operation in computing the initial > index. If ~ is also thrown away, it's as good as new2 (here I've tossed out > the "windows names", and "old" == existing algorithm except (a) get rid of ~ > when computing index and (b) do mod by size-1 when computing incr): ... > The new and new2 values differ in minor ways from the ones you posted > because I got rid of the ~ (the ~ has a bad interaction with "additive" > means of computing incr, because the ~ tends to move the index in the > opposite direction, and these moves in opposite directions tend to cancel > out when computing incr+index the first time). Remarkable. > too-bad-mod-is-expensive!-ly y'rs - tim Yes. The wheel is cheapest yet. ciao - chris -- Christian Tismer :^) Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net 14163 Berlin : PGP key -> http://wwwkeys.pgp.net PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com From just at letterror.com Tue Dec 19 18:11:55 2000 From: just at letterror.com (Just van Rossum) Date: Tue, 19 Dec 2000 18:11:55 +0100 Subject: [Python-Dev] Death to string functions! Message-ID: Barry wrote: >Actually, it has been debated to death. ;) This looks better: > > SPACE = ' ' > SPACE.join(aList) > >That reads good to me ("space-join this list") and that's how I always >write it. I just did a quick scan through the 1.5.2 library, and _most_ occurrances of string.join() are used with a string constant for the second argument. There is a whole bunch of one-arg string.join()'s, too. 
Recommending replacing all of these (not to mention all the code "out there") with named constants seems plain silly. Sure, " ".join() is the most "logical" choice for Python as it stands, but it's definitely not the most intuitive, as evidenced by the number of times this comes up on c.l.py: to many people it simply "looks wrong". Maybe this is the deal: joiner.join() makes a whole lot of sense from an _implementation_ standpoint, but a whole lot less as a public interface. It's easy to explain why join() can't be a method of sequences (in Python), but that alone doesn't justify a string method. string.join() is not quite unlike map() and friends: map() wouldn't be so bad as a sequence method, but that isn't practical for exactly the same reasons: so it's a builtin. (And not a function method...) So, making join() a builtin makes a whole lot of sense. Not doing this because people sometimes use a local reference to os.path.join seems awfully backward. Hm, maybe joiner.join() could become a special method: joiner.__join__(), that way other objects could define their own implementation for join(). (Hm, wouldn't be the worst thing for, say, a file path object...) Just From barry at digicool.com Tue Dec 19 18:20:07 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 19 Dec 2000 12:20:07 -0500 Subject: [Python-Dev] Death to string functions! References: Message-ID: <14911.39111.710940.342986@anthem.concentric.net> >>>>> "JvR" == Just van Rossum writes: JvR> Recommending replacing all of these (not to mention all the JvR> code "out there") with named constants seems plain silly. Until there's a tool to do the migration, I don't (personally) recommend wholesale migration. For new code I write though, I usually do it the way I described (which is intuitive to me, but then so is moving your fingers at a blinding speed up and down 5 long strips of metal to cause low bowel-tickling rumbly noises). JvR> So, making join() a builtin makes a whole lot of sense. 
Not JvR> doing this because people sometimes use a local reference to JvR> os.path.join seems awfully backward. I agree. Have we agreed on the semantics and signature of builtin join() though? Is it just string.join() stuck in builtins? -Barry From fredrik at effbot.org Tue Dec 19 18:25:49 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Tue, 19 Dec 2000 18:25:49 +0100 Subject: [Python-Dev] Death to string functions! References: <14911.39111.710940.342986@anthem.concentric.net> Message-ID: <012901c069e0$bd724fb0$3c6340d5@hagrid> Barry wrote: > JvR> So, making join() a builtin makes a whole lot of sense. Not > JvR> doing this because people sometimes use a local reference to > JvR> os.path.join seems awfully backward. > > I agree. Have we agreed on the semantics and signature of builtin > join() though? Is it just string.join() stuck in builtins? +1 (let's leave the __join__ slot and other super-generalized variants for 2.2) From thomas at xs4all.net Tue Dec 19 18:54:34 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Tue, 19 Dec 2000 18:54:34 +0100 Subject: [Python-Dev] SourceForge SSH silliness In-Reply-To: ; from tim.one@home.com on Tue, Dec 19, 2000 at 12:50:01AM -0500 References: <20001217220008.D29681@xs4all.nl> Message-ID: <20001219185434.E29681@xs4all.nl> On Tue, Dec 19, 2000 at 12:50:01AM -0500, Tim Peters wrote: > [Thomas Wouters] > > What sourceforge did was switch Linux distributions, and upgrade. > > ... [and quite a bit more] ... > I hope you're feeling better today . "The problem" was one the wng > msg spelled out: "It is also possible that the host key has just been > changed.". SF changed keys. That's the whole banana right there. Deleting > the sourceforge keys from known_hosts fixed it (== convinced ssh to install > new SF keys the next time I connected). Well, if you'd read the thread , you'll notice that other people had problems even after that. I'm glad you're not one of them, though :) -- Thomas Wouters Hi! I'm a .signature virus! 
copy me into your .signature file to help me spread! From barry at digicool.com Tue Dec 19 19:22:19 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 19 Dec 2000 13:22:19 -0500 Subject: [Python-Dev] Error: syncmail script missing References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> Message-ID: <14911.42843.284822.935268@anthem.concentric.net> Folks, Python wasn't installed on the new SF CVS machine, which was why syncmail was broken. My thanks to the SF guys for quickly remedying this situation! Please give it a test. -Barry From barry at digicool.com Tue Dec 19 19:23:32 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 19 Dec 2000 13:23:32 -0500 Subject: [Python-Dev] Error: syncmail script missing References: <200012190608.BAA25007@cj20424-a.reston1.va.home.com> <14911.42843.284822.935268@anthem.concentric.net> Message-ID: <14911.42916.573600.922606@anthem.concentric.net> >>>>> "BAW" == Barry A Warsaw writes: BAW> Python wasn't installed on the new SF CVS machine, which was BAW> why syncmail was broken. My thanks to the SF guys for BAW> quickly remedying this situation! BTW, it's currently Python 1.5.2. From tismer at tismer.com Tue Dec 19 18:34:14 2000 From: tismer at tismer.com (Christian Tismer) Date: Tue, 19 Dec 2000 19:34:14 +0200 Subject: [Python-Dev] Re: The Dictionary Gem is polished! References: Message-ID: <3A3F9C16.562F9D9F@tismer.com> Again... Tim Peters wrote: > > Sounds good to me! It's a very cheap way to get the high bits into play. ... > [Christian] > > - The bits used from the string hash are not well distributed > > - using a "warmup wheel" on the hash to suck all bits in > > gives the same quality of hashes like random numbers. > > See above and be very cautious: none of Python's hash functions produce > well-distributed bits, and-- in effect --that's why Python dicts often > perform "better than random" on common data. 
> Even what you've done so far appears to provide marginally worse
> statistics for Guido's favorite kind of test case ("worse" in two senses:
> total number of collisions (a measure of amortized lookup cost), and
> maximum collision chain length (a measure of worst-case lookup cost)):
>
> d = {}
> for i in range(N):
>     d[repr(i)] = i

I will look into this.

> check-in-one-thing-then-let-it-simmer-ly y'rs - tim

Are you saying I should check the thing in? Really?

In another reply to this message I was saying

"""
This is why I think to be even more conservative:
Try to use a division wheel, but with the inverses
of the original primitive roots, just in order to
get at Guido's results :-)
"""

This was a religious desire, but such an inverse cannot exist. Well, all inverses exist, but it is an error to think that they can produce similar bit patterns. Changing the root means changing the whole system, since we have just a *representation* of a group, via polynomial coefficients. A simple example which renders my thought useless is this: There is no general pattern that can turn a physical right shift into a left shift, for all bit combinations.

Anyway, how can I produce a nearly complete scheme like today with the same "cheaper than random" properties? Ok, we have to stick with the given polynomials to stay compatible, and we also have to shift left. How do we then rotate the random bits in? Well, we can in fact do a rotation of the whole index, moving the highest bit into the lowest. Too bad that this isn't supported in C. It is a native machine instruction on X86 machines. We would then have:

    incr = ROTATE_LEFT(incr, 1)
    if (incr > mask):
        incr = incr ^ mp.ma_poly

The effect is similar to the "old" algorithm, bits are shifted left. Only if the hash happens to have high bits, they appear in the modulus.
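Both update rules under discussion are easy to model in a few lines of Python: the rotate Christian wants (a single ROL instruction on x86, though not directly expressible in C) and the shift-and-fold step, which for the table's primitive polynomials really does visit all of GF(2^n)-{0}. The function names here are made up:

```python
def rotl32(x, n=1):
    # 32-bit rotate-left: the ROTATE_LEFT(incr, 1) step above.
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def probe_cycle(poly, nbits):
    # Shift left, folding with the polynomial whenever the value
    # leaves the table: the classic GF(2^n)-{0} cycling step.
    mask = (1 << nbits) - 1
    incr, seen = 1, set()
    while incr not in seen:
        seen.add(incr)
        incr <<= 1
        if incr > mask:
            incr ^= poly

    return seen

assert rotl32(0x80000000) == 1             # top bit wraps to the bottom
assert len(probe_cycle(32 + 5, 5)) == 31   # full period: all of GF(2^5)-{0}
```

The full-period assertion is what "cycle through GF(2^n)-{0}" in the polynomial table's comment promises: every nonzero increment occurs once before the sequence repeats, so the probe sequence can never get stuck.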
On the current "faster than random" cases, I assume that high bits in the hash are less likely than low bits, so it is more likely that an entry finds its good place in the dict, before bits are rotated in. Hence the "good" cases would be kept.

I did all tests again, now including maximum trip length, and added a "rotate-left" version as well:

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293/9 new=302/7 new2=221/7 rot=278/5
trips for bad integers old=499500/999 new=13187/31 new2=999/1 rot=16754/31
trips for random integers old=360/8 new=369/8 new2=358/6 rot=356/7
trips for windows names old=230/5 new=207/7 new2=200/5 rot=225/5
N=2000
trips for strings old=1093/11 new=1109/10 new2=786/6 rot=1082/8
trips for bad integers old=0/0 new=26455/32 new2=1999/1 rot=33524/34
trips for random integers old=704/7 new=686/8 new2=685/7 rot=693/7
trips for windows names old=503/8 new=542/9 new2=564/6 rot=529/7
N=3000
trips for strings old=810/5 new=839/6 new2=609/5 rot=796/5
trips for bad integers old=0/0 new=38681/36 new2=2999/1 rot=49828/38
trips for random integers old=708/5 new=723/7 new2=724/5 rot=722/6
trips for windows names old=712/6 new=711/5 new2=691/5 rot=738/9
N=4000
trips for strings old=1850/9 new=1843/8 new2=1375/11 rot=1848/10
trips for bad integers old=0/0 new=52994/39 new2=3999/1 rot=66356/38
trips for random integers old=1395/9 new=1397/8 new2=1435/9 rot=1394/13
trips for windows names old=1449/8 new=1434/8 new2=1457/11 rot=1513/9
D:\crml_doc\platf\py>

Concerning trip length, rotate is better than old in most cases. Random integers seem to withstand any of these procedures. For bad integers, rot naturally takes more trips than new, since the path to the bits is longer.

All in all I don't see more than marginal differences between the approaches, and I tend to stick with "new", since it is cheapest to implement.
(it does not cost anything and might instead be a little cheaper for some compilers, since it does not reference the mask variable).

I'd say let's do the patch -- ciao - chris

--
Christian Tismer             :^)
Mission Impossible 5oftware  :    Have a break! Take a ride on Python's
Kaunstr. 26                  :   *Starship* http://starship.python.net
14163 Berlin                 :    PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
      where do you want to jump today?   http://www.stackless.com

-------------- next part --------------
## dictest.py
## Test of a new rehash algorithm
## Chris Tismer
## 2000-12-17
## Mission Impossible 5oftware Team

# The following is a partial re-implementation of
# Python dictionaries in Python.
# The original algorithm was literally turned
# into Python code.

##/*
##Table of irreducible polynomials to efficiently cycle through
##GF(2^n)-{0}, 2<=n<=30.
##*/
polys = [
    4 + 3,
    8 + 3,
    16 + 3,
    32 + 5,
    64 + 3,
    128 + 3,
    256 + 29,
    512 + 17,
    1024 + 9,
    2048 + 5,
    4096 + 83,
    8192 + 27,
    16384 + 43,
    32768 + 3,
    65536 + 45,
    131072 + 9,
    262144 + 39,
    524288 + 39,
    1048576 + 9,
    2097152 + 5,
    4194304 + 3,
    8388608 + 33,
    16777216 + 27,
    33554432 + 9,
    67108864 + 71,
    134217728 + 39,
    268435456 + 9,
    536870912 + 5,
    1073741824 + 83,
    0
]
polys = map(long, polys)

class NULL: pass

class Dictionary:
    dummy = "<dummy key>"

    def __init__(mp, newalg=0):
        mp.ma_size = 0
        mp.ma_poly = 0
        mp.ma_table = []
        mp.ma_fill = 0
        mp.ma_used = 0
        mp.oldalg = not newalg
        mp.warmup = newalg==2
        mp.rotleft = newalg==3
        mp.trips = 0
        mp.tripmax = 0

    def getTrips(self):
        trips, tripmax = self.trips, self.tripmax
        self.trips = self.tripmax = 0
        return trips, tripmax

    def lookdict(mp, key, _hash):
        me_hash, me_key, me_value = range(3) # rec slots
        dummy = mp.dummy
        mask = mp.ma_size-1
        ep0 = mp.ma_table
        i = (~_hash) & mask
        ep = ep0[i]
        if ep[me_key] is NULL or ep[me_key] == key:
            return ep
        if ep[me_key] == dummy:
            freeslot = ep
        else:
            if (ep[me_hash] == _hash and
                    cmp(ep[me_key], key) == 0):
                return ep
            freeslot = NULL
        ###### FROM HERE
        if mp.oldalg:
            incr = (_hash ^ (_hash >> 3)) & mask
        else:
            # note that we do not mask!
            # the shifting is worth it in the incremental case.
            ## added after posting to python-dev:
            uhash = _hash & 0xffffffffl
            if mp.warmup:
                incr = uhash
                mask2 = 0xffffffffl ^ mask
                while mask2 > mask:
                    if (incr & 1):
                        incr = incr ^ mp.ma_poly
                    incr = incr >> 1
                    mask2 = mask2>>1
                # this loop *can* be sped up by tables
                # with precomputed multiple shifts.
                # But I'm not sure if it is worth it at all.
            else:
                incr = uhash ^ (uhash >> 3)
        ###### TO HERE
        if (not incr):
            incr = mask
        triplen = 0
        while 1:
            mp.trips = mp.trips+1
            triplen = triplen+1
            if triplen > mp.tripmax:
                mp.tripmax = triplen
            ep = ep0[int((i+incr)&mask)]
            if (ep[me_key] is NULL):
                if (freeslot is not NULL):
                    return freeslot
                else:
                    return ep
            if (ep[me_key] == dummy):
                if (freeslot == NULL):
                    freeslot = ep
            elif (ep[me_key] == key or
                  (ep[me_hash] == _hash and
                   cmp(ep[me_key], key) == 0)):
                return ep
            # Cycle through GF(2^n)-{0}
            ###### FROM HERE
            if mp.oldalg:
                incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            elif mp.rotleft:
                if incr & 0x80000000L:
                    incr = (incr << 1) | 1
                else:
                    incr = incr << 1
                if (incr > mask):
                    incr = incr ^ mp.ma_poly
            else:
                # new algorithm: do a division
                if (incr & 1):
                    incr = incr ^ mp.ma_poly
                incr = incr >> 1
            ###### TO HERE

    def insertdict(mp, key, _hash, value):
        me_hash, me_key, me_value = range(3) # rec slots
        ep = mp.lookdict(key, _hash)
        if (ep[me_value] is not NULL):
            old_value = ep[me_value]
            ep[me_value] = value
        else:
            if (ep[me_key] is NULL):
                mp.ma_fill = mp.ma_fill+1
            ep[me_key] = key
            ep[me_hash] = _hash
            ep[me_value] = value
            mp.ma_used = mp.ma_used+1

    def dictresize(mp, minused):
        me_hash, me_key, me_value = range(3) # rec slots
        oldsize = mp.ma_size
        oldtable = mp.ma_table
        MINSIZE = 4
        newsize = MINSIZE
        for i in range(len(polys)):
            if (newsize > minused):
                newpoly = polys[i]
                break
            newsize = newsize << 1
        else:
            return -1
        _nullentry = range(3)
        _nullentry[me_hash] = 0
        _nullentry[me_key] = NULL
        _nullentry[me_value] = NULL
        newtable = map(lambda x, y=_nullentry: y[:], range(newsize))
        mp.ma_size = newsize
        mp.ma_poly = newpoly
        mp.ma_table = newtable
        mp.ma_fill = 0
        mp.ma_used = 0
        for ep in oldtable:
            if (ep[me_value] is not NULL):
                mp.insertdict(ep[me_key], ep[me_hash], ep[me_value])
        return 0

    # PyDict_GetItem
    def __getitem__(op, key):
        me_hash, me_key, me_value = range(3) # rec slots
        if not op.ma_table:
            raise KeyError, key
        _hash = hash(key)
        return op.lookdict(key, _hash)[me_value]

    # PyDict_SetItem
    def __setitem__(op, key, value):
        mp = op
        _hash = hash(key)
        ## /* if fill >= 2/3 size, double in size */
        if (mp.ma_fill*3 >= mp.ma_size*2):
            if (mp.dictresize(mp.ma_used*2) != 0):
                if (mp.ma_fill+1 > mp.ma_size):
                    raise MemoryError
        mp.insertdict(key, _hash, value)

    # more interface functions
    def keys(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_key)
        return res

    def values(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append(_value)
        return res

    def items(self):
        me_hash, me_key, me_value = range(3) # rec slots
        res = []
        for _hash, _key, _value in self.ma_table:
            if _value is not NULL:
                res.append((_key, _value))
        return res

    def __cmp__(self, other):
        mine = self.items()
        others = other.items()
        mine.sort()
        others.sort()
        return cmp(mine, others)

######################################################
## tests

def test(lis, dic):
    for key in lis:
        dic[key]

def nulltest(lis, dic):
    for key in lis:
        dic

def string_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = str(i) #* 5
        #s = chr(i%256) + chr(i>>8)##
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def istring_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    for i in range(n):
        s = chr(i%256) + chr(i>>8)
        d1[s] = d2[s] = d3[s] = d4[s] = i
    return d1, d2, d3, d4

def random_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    from whrandom import randint
    import sys
    keys = []
    for i in range(n):
        keys.append(randint(0, sys.maxint-1))
    for i in keys:
        d1[i] = d2[i] = d3[i] = d4[i] = i
    return d1, d2, d3, d4

def badnum_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    shift = 10
    if EXTREME:
        shift = 16
    for i in range(n):
        bad = i << 16
        d2[bad] = d3[bad] = d4[bad] = i
        if n <= 1000:
            d1[bad] = i
    return d1, d2, d3, d4

def names_dicts(n):
    d1 = Dictionary()   # original
    d2 = Dictionary(1)  # other rehash
    d3 = Dictionary(2)  # with warmup
    d4 = Dictionary(3)  # rotleft
    import win32con
    keys = win32con.__dict__.keys()
    if len(keys) < n:
        keys = []
    for s in keys[:n]:
        d1[s] = d2[s] = d3[s] = d4[s] = s
    return d1, d2, d3, d4

def do_test(dict):
    keys = dict.keys()
    dict.getTrips() # reset
    test(keys, dict)
    return "%d/%d" % dict.getTrips()

EXTREME=1

if __name__ == "__main__":
    for N in (1000,2000,3000,4000):
        sdold, sdnew, sdnew2, sdrot = string_dicts(N)
        #idold, idnew, idnew2, idrot = istring_dicts(N)
        bdold, bdnew, bdnew2, bdrot = badnum_dicts(N)
        rdold, rdnew, rdnew2, rdrot = random_dicts(N)
        ndold, ndnew, ndnew2, ndrot = names_dicts(N)
        fmt = "old=%s new=%s new2=%s rot=%s"
        print "N=%d" %N
        print ("trips for strings "+fmt) % tuple(
            map(do_test, (sdold, sdnew, sdnew2, sdrot)) )
        #print ("trips for bin strings "+fmt) % tuple(
        #    map(do_test, (idold, idnew, idnew2, idrot)) )
        print ("trips for bad integers "+fmt) % tuple(
            map(do_test, (bdold, bdnew, bdnew2, bdrot)))
        print ("trips for random integers "+fmt) % tuple(
            map(do_test, (rdold, rdnew, rdnew2, rdrot)))
        print ("trips for windows names "+fmt) % tuple(
            map(do_test, (ndold, ndnew, ndnew2, ndrot)))

"""
Results with a shift of 10 (EXTREME=0):

D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.097 new=5.088
timing for bad integers old=101.540 new=12.610

Results with a shift of 16 (EXTREME=1):

D:\crml_doc\platf\py>python dictest.py
timing for strings old=5.218 new=5.147
timing for bad integers old=571.210 new=19.220
"""

From just at letterror.com Tue Dec 19 19:46:18 2000 From: just at letterror.com (Just van Rossum) Date: Tue, 19 Dec 2000 19:46:18 +0100 Subject: [Python-Dev] Death to string functions! In-Reply-To: <14911.39111.710940.342986@anthem.concentric.net> References: Message-ID: At 12:20 PM -0500 19-12-2000, Barry A. Warsaw wrote: >I agree. Have we agreed on the semantics and signature of builtin >join() though? Is it just string.join() stuck in builtins? Yep. I'm with /F that further generalization can be done later. Oh, does this mean that "".join() becomes deprecated? (Nice test case for the warning framework...) Just From barry at digicool.com Tue Dec 19 19:56:45 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Tue, 19 Dec 2000 13:56:45 -0500 Subject: [Python-Dev] Death to string functions! References: Message-ID: <14911.44909.414520.788073@anthem.concentric.net> >>>>> "JvR" == Just van Rossum writes: JvR> Oh, does this mean that "".join() becomes deprecated? Please, no. From guido at python.org Tue Dec 19 19:56:39 2000 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Dec 2000 13:56:39 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: Your message of "Tue, 19 Dec 2000 13:56:45 EST." <14911.44909.414520.788073@anthem.concentric.net> References: <14911.44909.414520.788073@anthem.concentric.net> Message-ID: <200012191856.NAA30524@cj20424-a.reston1.va.home.com> > >>>>> "JvR" == Just van Rossum writes: > > JvR> Oh, does this mean that "".join() becomes deprecated? > > Please, no. No.
--Guido van Rossum (home page: http://www.python.org/~guido/) From just at letterror.com Tue Dec 19 20:15:19 2000 From: just at letterror.com (Just van Rossum) Date: Tue, 19 Dec 2000 20:15:19 +0100 Subject: [Python-Dev] Death to string functions! In-Reply-To: <14911.44909.414520.788073@anthem.concentric.net> References: Message-ID: At 1:56 PM -0500 19-12-2000, Barry A. Warsaw wrote: >>>>>> "JvR" == Just van Rossum writes: > > JvR> Oh, does this mean that "".join() becomes deprecated? > >Please, no. And keep two non-deprecated ways to do the same thing? I'm not saying it should be removed, just that the powers that be declare that _one_ of them is the preferred way. And-if-that-one-isn't-builtin-join()-I-don't-know-why-to-even-bother y'rs -- Just From greg at cosc.canterbury.ac.nz Tue Dec 19 23:35:05 2000 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed, 20 Dec 2000 11:35:05 +1300 (NZDT) Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012191537.KAA28909@cj20424-a.reston1.va.home.com> Message-ID: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz> Guido: > Boy, are you stirring up a can of worms that we've been through many > times before! Nothing you say hasn't been said at least a hundred > times before, on this list as well as on c.l.py. And I'll wager you'll continue to hear them said at regular intervals for a long time to come, because you've done something which a lot of people feel very strongly was a mistake, and they have some very rational arguments as to why it was a mistake, whereas you don't seem to have any arguments to the contrary which those people are likely to find convincing. > There really seem to be only two possibilities that don't have this > problem: (1) make it a built-in, or (2) make it a method on strings. False dichotomy. Some other possibilities: (3) Use an operator. (4) Leave it in the string module! Really, I don't see what would be so bad about that. 
You still need somewhere to put all the string-related constants, so why not keep the string module for those, plus the few functions that don't have any other obvious place?

> If " ".join(L) bugs you, try this:
>
> space = " " # This could be a global
> .
> .
> .
> s = space.join(L)

Surely you must realise that this completely fails to address Mr. Petrilli's concern?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz     +--------------------------------------+

From akuchlin at mems-exchange.org Wed Dec 20 15:40:58 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Wed, 20 Dec 2000 09:40:58 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: ; from noreply@sourceforge.net on Tue, Dec 19, 2000 at 07:02:05PM -0800 References: Message-ID: <20001220094058.A17623@kronos.cnri.reston.va.us> On Tue, Dec 19, 2000 at 07:02:05PM -0800, noreply at sourceforge.net wrote:

>Date: 2000-Dec-19 19:02
>By: tim_one
>Unrelated to your patch but in the same area: the other msg, "ord()
>expected string or Unicode character", doesn't read right. The type
>names in question are "string" and "unicode":
>
>>>> type("")
><type 'string'>
>>>> type(u"")
><type 'unicode'>
>>>>
>
>"character" is out of place, or not in enough places.

Just thought I'd mention that, since *you're* so cute! Is it OK to refer to 8-bit strings under that name? How about "expected an 8-bit string or Unicode string", when the object passed to ord() isn't of the right type. Similarly, when the value is of the right type but has length>1, the message is "ord() expected a character, length-%d string found". Should that be "length-%d (string / unicode) found)" And should the type names be changed to '8-bit string'/'Unicode string', maybe?
--amk From barry at digicool.com Wed Dec 20 16:39:30 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Wed, 20 Dec 2000 10:39:30 -0500 Subject: [Python-Dev] IGNORE - this is only a test Message-ID: <14912.53938.280864.596141@anthem.concentric.net> Testing the new MX for python.org... From fdrake at acm.org Wed Dec 20 17:57:09 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 20 Dec 2000 11:57:09 -0500 (EST) Subject: [Python-Dev] scp with SourceForge Message-ID: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> I've not been able to get scp to work with SourceForge since they upgraded their machines. ssh works fine. Is this related to the protocol mismatch problem that was discussed earlier? My ssh tells me "SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote sshd is sending its version as "Remote protocol version 1.99, remote software version OpenSSH_2.2.0p1". Was there a reasonable way to deal with this? I'm running Linux-Mandrake 7.1 with very little customization or extra stuff. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From tismer at tismer.com Wed Dec 20 17:31:00 2000 From: tismer at tismer.com (Christian Tismer) Date: Wed, 20 Dec 2000 18:31:00 +0200 Subject: [Python-Dev] Re: The Dictionary Gem is polished! References: <3A3F9C16.562F9D9F@tismer.com> Message-ID: <3A40DEC4.5F659E8E@tismer.com> Christian Tismer wrote: ... When talking about left rotation, an error crept in. Sorry!

> We would then have:
>
> incr = ROTATE_LEFT(incr, 1)
> if (incr > mask):
>     incr = incr ^ mp.ma_poly

If incr contains the high bits of the hash, then the above must be replaced by

    incr = ROTATE_LEFT(incr, 1)
    if (incr & (mask+1)):
        incr = incr ^ mp.ma_poly

or the multiplicative group is not guaranteed to be generated, obviously. This doesn't change my results, rotating right is still my choice.
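The corrected update step can be rendered as a small self-contained sketch (assuming a 32-bit left rotation; the helper names here are illustrative, not from the actual patch):

```python
def rotate_left_32(x):
    # 32-bit left rotation by one: the high bit wraps around to bit 0.
    x &= 0xffffffff
    return ((x << 1) | (x >> 31)) & 0xffffffff

def next_incr(incr, poly, mask):
    # Corrected step: reduce by the polynomial when the rotated value
    # has the bit just above the table mask set (mask+1), rather than
    # whenever it merely exceeds mask.
    incr = rotate_left_32(incr)
    if incr & (mask + 1):
        incr = incr ^ poly
    return incr
```

For example, with the size-8 polynomial 8 + 3 from dictest.py, next_incr(4, 11, 7) rotates 4 to 8 and reduces it to 3, keeping the increment inside the nonzero residues of the field.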
ciao - chris

D:\crml_doc\platf\py>python dictest.py
N=1000
trips for strings old=293/9 new=302/7 new2=221/7 rot=272/8
trips for bad integers old=499500/999 new=13187/31 new2=999/1 rot=16982/27
trips for random integers old=339/9 new=337/7 new2=343/10 rot=342/8
trips for windows names old=230/5 new=207/7 new2=200/5 rot=225/6
N=2000
trips for strings old=1093/11 new=1109/10 new2=786/6 rot=1090/9
trips for bad integers old=0/0 new=26455/32 new2=1999/1 rot=33985/31
trips for random integers old=747/10 new=733/7 new2=734/7 rot=728/8
trips for windows names old=503/8 new=542/9 new2=564/6 rot=521/11
N=3000
trips for strings old=810/5 new=839/6 new2=609/5 rot=820/6
trips for bad integers old=0/0 new=38681/36 new2=2999/1 rot=50985/26
trips for random integers old=709/4 new=728/5 new2=767/5 rot=711/6
trips for windows names old=712/6 new=711/5 new2=691/5 rot=727/7
N=4000
trips for strings old=1850/9 new=1843/8 new2=1375/11 rot=1861/9
trips for bad integers old=0/0 new=52994/39 new2=3999/1 rot=67986/26
trips for random integers old=1584/9 new=1606/8 new2=1505/9 rot=1579/8
trips for windows names old=1449/8 new=1434/8 new2=1457/11 rot=1476/7

--
Christian Tismer             :^)
Mission Impossible 5oftware  :    Have a break! Take a ride on Python's
Kaunstr. 26                  :   *Starship* http://starship.python.net
14163 Berlin                 :    PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
      where do you want to jump today?   http://www.stackless.com

From tim.one at home.com Wed Dec 20 20:52:40 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 20 Dec 2000 14:52:40 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: [Fred L. Drake, Jr.] > I've not been able to get scp to work with SourceForge since they > upgraded their machines. ssh works fine. Same here.
In particular, I can use ssh to log in to shell.sourceforge.net, but attempts to scp there act like this (breaking long lines by hand with \n\t):

> scp -v pep-0042.html tim_one at shell.sourceforge.net:/home/groups/python/htdocs/peps
Executing: host shell.sourceforge.net, user tim_one, command scp -v -t
	/home/groups/python/htdocs/peps
SSH Version 1.2.14 [winnt-4.0-x86], protocol version 1.4.
Standard version.  Does not use RSAREF.
ssh_connect: getuid 0 geteuid 0 anon 0
Connecting to shell.sourceforge.net [216.136.171.201] port 22.
Connection established.
Remote protocol version 1.99, remote software version OpenSSH_2.2.0p1
Waiting for server public key.
Received server public key (768 bits) and host key (1024 bits).
Host 'shell.sourceforge.net' is known and matches the host key.
Initializing random; seed file C:\Code/.ssh/random_seed
IDEA not supported, using 3des instead.
Encryption type: 3des
Sent encrypted session key.
Received encrypted confirmation.
Trying RSA authentication with key 'sourceforge'
Server refused our key.
Doing password authentication.
Password: **** here tim enteredth his password ****
Sending command: scp -v -t /home/groups/python/htdocs/peps
Entering interactive session.

And there it sits forever. Several others report the same symptom on SF forums, and assorted unresolved SF Support and Bug reports. We don't know what your symptom is!

> Is this related to the protocol mismatch problem that was discussed
> earlier?

Doubt it. Most commentators pin the blame elsewhere.

> ...
> Was there a reasonable way to deal with this?

A new note was added to http://sourceforge.net/support/?func=detailsupport&support_id=110235&group_id=1 today, including:

"""
Re: Shell server

We're also aware of the number of problems on the shell server with respect to restrictive permissions on some programs - and sourcing of shell environments.

We're also aware of the troubles with scp and transferring files.
As a work around, we recommend either editing files on the shell server, or scping files to the shell server from external hosts to the shell server, whilst logged in to the shell server. """ So there you go: scp files to the shell server from external hosts to the shell server whilst logged in to the shell server . Is scp working for *anyone*??? From fdrake at acm.org Wed Dec 20 21:17:58 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 20 Dec 2000 15:17:58 -0500 (EST) Subject: [Python-Dev] scp with SourceForge In-Reply-To: References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com> Tim Peters writes: > And there it sits forever. Several others report the same symptom on SF > forums, and assorted unresolved SF Support and Bug reports. We don't know > what your symptom is! Exactly the same. > So there you go: scp files to the shell server from external hosts to the > shell server whilst logged in to the shell server . Yeah, that really helps.... NOT! All I want to be able to do is post a new development version of the documentation. ;-( -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From bckfnn at worldonline.dk Wed Dec 20 21:23:33 2000 From: bckfnn at worldonline.dk (Finn Bock) Date: Wed, 20 Dec 2000 20:23:33 GMT Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <3a411449.5247545@smtp.worldonline.dk> [Fred L. Drake] > I've not been able to get scp to work with SourceForge since they >upgraded their machines. ssh works fine. Is this related to the >protocol mismatch problem that was discussed earlier? My ssh tells me >"SSH Version OpenSSH-1.2.2, protocol version 1.5.", and the remote >sshd is sending it's version as "Remote protocol version 1.99, remote >software version OpenSSH_2.2.0p1". 
> Was there a reasonable way to deal with this? I'm running >Linux-Mandrake 7.1 with very little customization or extra stuff. I managed to update the jython website by logging into the shell machine by ssh and doing a ftp back to my machine (using the IP number). That isn't exactly reasonable, but I was desperate. regards, finn From tim.one at home.com Wed Dec 20 21:42:11 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 20 Dec 2000 15:42:11 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14913.5110.271684.107030@cj42289-a.reston1.va.home.com> Message-ID: [Tim] > So there you go: scp files to the shell server from external > hosts to the shell server whilst logged in to the shell server . [Fred] > Yeah, that really helps.... NOT! All I want to be able to do is > post a new development version of the documentation. ;-( All I want to do is make a measly change to a PEP -- I'm afraid it doesn't ask how trivial your intents are. If some suck^H^H^H^Hdeveloper admits that scp works for them, maybe we can mail them stuff and have *them* copy it over. no-takers-so-far-though-ly y'rs - tim From barry at digicool.com Wed Dec 20 21:49:00 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Wed, 20 Dec 2000 15:49:00 -0500 Subject: [Python-Dev] scp with SourceForge References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <14913.6972.934625.840781@anthem.concentric.net> >>>>> "TP" == Tim Peters writes: TP> So there you go: scp files to the shell server from external TP> hosts to the shell server whilst logged in to the shell server TP> . Psheesh, /that/ was obvious. Did you even have to ask? TP> Is scp working for *anyone*??? Nope, same thing happens to me; it just hangs. 
-Barry From tim.one at home.com Wed Dec 20 21:53:38 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 20 Dec 2000 15:53:38 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14913.6972.934625.840781@anthem.concentric.net> Message-ID: [Tim, quoting a bit of immortal SF support prose] > TP> So there you go: scp files to the shell server from external > TP> hosts to the shell server whilst logged in to the shell server > TP> . [Barry] > Psheesh, /that/ was obvious. Did you even have to ask? Actually, isn't this easy to do on Linux? That is, run an ssh server (whatever) on your home machine, log in to the SF shell (which everyone seems able to do), then scp whatever your_home_IP_address:your_home_path from the SF shell? Heck, I can even get that to work on Windows, except I don't know how to set up anything on my end to accept the connection . > TP> Is scp working for *anyone*??? > Nope, same thing happens to me; it just hangs. That's good to know -- since nobody else mentioned this, Fred probably figured he was unique. not-that-he-isn't-it's-just-that-he's-not-ly y'rs - tim From fdrake at acm.org Wed Dec 20 21:52:10 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed, 20 Dec 2000 15:52:10 -0500 (EST) Subject: [Python-Dev] scp with SourceForge In-Reply-To: References: <14913.6972.934625.840781@anthem.concentric.net> Message-ID: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com> Tim Peters writes: > Actually, isn't this easy to do on Linux? That is, run an ssh server > (whatever) on your home machine, log in to the SF shell (which everyone > seems able to do), then > > scp whatever your_home_IP_address:your_home_path > > from the SF shell? Heck, I can even get that to work on Windows, except I > don't know how to set up anything on my end to accept the connection . Err, yes, that's easy to do, but... that means putting your private key on SourceForge. They're a great bunch of guys, but they can't have my private key! -Fred -- Fred L. 
Drake, Jr. PythonLabs at Digital Creations From tim.one at home.com Wed Dec 20 22:06:07 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 20 Dec 2000 16:06:07 -0500 Subject: [Python-Dev] scp with SourceForge In-Reply-To: <14913.7162.824838.63143@cj42289-a.reston1.va.home.com> Message-ID: [Fred] > Err, yes, that's easy to do, but... that means putting your private > key on SourceForge. They're a great bunch of guys, but they can't > have my private key! So generate a unique one-shot key pair for the life of the copy. I can do that for you on Windows if you lack a real OS . From thomas at xs4all.net Wed Dec 20 23:59:49 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Wed, 20 Dec 2000 23:59:49 +0100 Subject: [Python-Dev] scp with SourceForge In-Reply-To: ; from tim.one@home.com on Wed, Dec 20, 2000 at 02:52:40PM -0500 References: <14912.58597.779009.178647@cj42289-a.reston1.va.home.com> Message-ID: <20001220235949.F29681@xs4all.nl> On Wed, Dec 20, 2000 at 02:52:40PM -0500, Tim Peters wrote: > So there you go: scp files to the shell server from external hosts to the > shell server whilst logged in to the shell server . > Is scp working for *anyone*??? Not for me, anyway. And I'm not just saying that to avoid scp-duty :) And I'm using the same ssh version, which works fine on all other machines. It probably has to do with the funky setup Sourceforge uses. (Try looking at 'df' and 'cat /proc/mounts', and comparing the two -- you'll see what I mean :) That also means I'm not tempted to try and reproduce it, obviously :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From tim.one at home.com Thu Dec 21 04:24:12 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 20 Dec 2000 22:24:12 -0500 Subject: [Python-Dev] Death to string functions! In-Reply-To: <200012192235.LAA02763@s454.cosc.canterbury.ac.nz> Message-ID: [Guido] >> Boy, are you stirring up a can of worms that we've been through many >> times before! 
Nothing you say hasn't been said at least a hundred >> times before, on this list as well as on c.l.py. [Greg Ewing] > And I'll wager you'll continue to hear them said at regular intervals > for a long time to come, because you've done something which a lot of > people feel very strongly was a mistake, and they have some very > rational arguments as to why it was a mistake, whereas you don't seem > to have any arguments to the contrary which those people are likely to > find convincing. Then it's a wash: Guido doesn't find their arguments convincing either, and ties favor the status quo even in the absence of BDFLness. >> There really seem to be only two possibilities that don't have this >> problem: (1) make it a built-in, or (2) make it a method on strings. > False dichotomy. Some other possibilities: > > (3) Use an operator. Oh, that's likely . > (4) Leave it in the string module! Really, I don't see what > would be so bad about that. You still need somewhere to put > all the string-related constants, so why not keep the string > module for those, plus the few functions that don't have > any other obvious place? Guido said he wants to deprecate the entire string module, so that Python can eventually warn on the mere presence of "import string". That's what he said when I earlier ranted in favor of keeping the string module around. My guess is that making it a builtin is the only alternative that stands any chance at this point. >> If " ".join(L) bugs you, try this: >> >> space = " " # This could be a global >> . >> . >> . >> s = space.join(L) > Surely you must realise that this completely fails to > address Mr. Petrilli's concern? Don't know about Guido, but I don't realize that, and we haven't heard back from Charles. His objections were raised the first day " ".join was suggested, space.join was suggested almost immediately after, and that latter suggestion did seem to pacify at least several objectors. 
Don't know whether it makes Charles happier, but since it *has* made others happier in the past, it's not unreasonable to imagine that Charles might like it too. if-we're-to-be-swayed-by-his-continued-outrage-afraid-it-will- have-to-come-from-him-ly y'rs - tim From tim.one at home.com Thu Dec 21 08:44:19 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 21 Dec 2000 02:44:19 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: <20001220094058.A17623@kronos.cnri.reston.va.us> Message-ID: [Andrew Kuchling] > Is it OK to refer to 8-bit strings under that name? > How about "expected an 8-bit string or Unicode string", when the > object passed to ord() isn't of the right type. > > Similarly, when the value is of the right type but has length>1, > the message is "ord() expected a character, length-%d string found". > Should that be "length-%d (string / unicode) found)" > > And should the type names be changed to '8-bit string'/'Unicode > string', maybe? Actually, upon reflection I think it was a mistake to add all these "or Unicode" clauses to the error msgs to begin with. Python used to have only one string type, we're saying that's also a hope for the future, and in the meantime I know I'd have no trouble understanding "string" as including both 8-bit strings and Unicode strings. So we should say "8-bit string" or "Unicode string" when *only* one of those is allowable. So "ord() expected string ..." instead of (even a repaired version of) "ord() expected string or Unicode character ..." but-i'm-not-even-motivated-enough-to-finish-this-sig- From tim.one at home.com Thu Dec 21 09:52:54 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 21 Dec 2000 03:52:54 -0500 Subject: [Python-Dev] RE: The Dictionary Gem is polished! In-Reply-To: <3A3F9C16.562F9D9F@tismer.com> Message-ID: [Christian Tismer] > Are you saying I should check the thing in? Really? Of course. 
The first thing you talked about showed a major improvement in some bad cases, did no harm in the others, and both results were more than just plausible -- they made compelling sense and were backed by simulation. So why not check it in? It's a clear net win! Stuff since then has been a spattering of maybe-good maybe-bad maybe-neutral ideas that hasn't gotten anywhere conclusive. What I want to avoid is another "Unicode compression" scenario, where we avoid grabbing a clear win for months just because it may not be the best possible of all conceivable compression schemes -- and then mistakes get made in a last-second rush to get *any* improvement. Checking in a clear improvement today does not preclude checking in a better one next week . > ... > Ok, we have to stick with the given polynomials to stay > compatible, Na, feel free to explore that too, if you like. It really should get some study! The polys there now are utterly arbitrary: of all polys that happen to be irreducible and that have x as a primitive root in the induced multiplicative group, these are simply the smallest when viewed as binary integers. That's because they were *found* by trying all odd binary ints with odd parity (even ints and ints with even parity necessarily correspond to reducible polys), starting with 2**N+3 and going up until finding the first one that was both irreducible and had x as a primitive root. There's no theory at all that I know of to say that any such poly is any better for this purpose than any other. And they weren't tested for that either -- they're simply the first ones "that worked at all" in a brute search. Besides, Python's "better than random" dict behavior -- when it obtains! -- is almost entirely due to the fact that its hash functions produce distinct starting indices more often than a random hash function would.
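The primitive-root property described here -- repeated multiplication by x modulo the polynomial cycles through every element of GF(2^n)-{0} -- can be checked directly with the "old algorithm" probe step from the posted dictest.py. A minimal sketch, using poly values from that table:

```python
def probe_cycle_length(poly, nbits):
    # Apply the shift/xor probe update (multiply by x, reduce by the
    # polynomial on overflow) starting from 1, and count the steps
    # until the value returns to 1.
    mask = (1 << nbits) - 1
    incr = 1
    steps = 0
    while True:
        incr = incr << 1
        if incr > mask:
            incr = incr ^ poly
        steps += 1
        if incr == 1:
            return steps
```

When x is a primitive root, the cycle length is exactly 2**n - 1, i.e. every nonzero increment is visited before the sequence repeats -- which is what guarantees the probe sequence eventually hits every slot.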
The contribution of the GF-based probe sequence in case of collision is to avoid the terrible behavior most other forms of probe sequence would cause given that Python's hash functions also tend to fill solid contiguous slices of the table more often than would a random hash function. [stuff about rotation] > ... > Too bad that this isn't supported in C. It is a native > machine instruction on X86 machines. Guido long ago rejected hash functions based on rotation for this reason; he's not likely to approve of rotations more in the probe sequence . A similar frustration is that almost all modern CPUs have a fast instruction to get at the high 32 bits of a 32x32->64 bit multiply: another way to get the high bits of the hash code into play is to multiply the 32-bit hash code by a 32-bit constant (see Knuth for "Fibonacci hashing" details), and take the least-significant N bits of the *upper* 32 bits of the 64-bit product as the initial table index. If the constant is chosen correctly, this defines a permutation on the space of 32-bit unsigned ints, and can be very effective at "scrambling" arithmetic progressions (which Python's hash functions often produce). But C doesn't give a decent way to get at that either. > ... > On the current "faster than random" cases, I assume that > high bits in the hash are less likely than low bits, I'm not sure what this means. As the comment in dictobject.c says, it's common for Python's hash functions to return a result with lots of leading zeroes. But the lookup currently applies ~ to those first (which is a bad idea -- see earlier msgs), so the actual hash that gets *used* often has lots of leading ones. > so it is more likely that an entry finds its good place in the dict, > before bits are rotated in. hence the "good" cases would be kept. I can agree with this easily if I read the above as asserting that in the very good cases today, the low bits of hashes (whether or not ~ is applied) vary more than the high bits.
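The multiplicative ("Fibonacci hashing") scheme sketched above fits in a few lines. This is an illustration of the technique under the stated assumptions, not code from CPython; the multiplier is Knuth's floor(2**32 / phi) constant:

```python
KNUTH_MULT = 2654435769  # floor(2**32 / golden_ratio), per Knuth

def fib_index(h, nbits):
    # 32x32 -> 64 bit multiply; the initial table index is the
    # least-significant nbits of the upper 32 bits of the product.
    prod = (h & 0xffffffff) * KNUTH_MULT
    high = (prod >> 32) & 0xffffffff
    return high & ((1 << nbits) - 1)
```

Unlike plain masking, this scrambles arithmetic progressions: hashes 0, 8, 16, ... fill only every eighth slot of a 256-slot table under a plain mask, but spread across far more slots under the multiply.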
> Random integers seem to withstand any of these procedures. If you wanted to, you could *define* random this way . > ... > I'd say let's do the patch -- ciao - chris full-circle-ly y'rs - tim From mal at lemburg.com Thu Dec 21 12:16:27 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 21 Dec 2000 12:16:27 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix References: Message-ID: <3A41E68B.6B12CD71@lemburg.com> Tim Peters wrote: > > [Andrew Kuchling] > > Is it OK to refer to 8-bit strings under that name? > > How about "expected an 8-bit string or Unicode string", when the > > object passed to ord() isn't of the right type. > > > > Similarly, when the value is of the right type but has length>1, > > the message is "ord() expected a character, length-%d string found". > > Should that be "length-%d (string / unicode) found)" > > > > And should the type names be changed to '8-bit string'/'Unicode > > string', maybe? > > Actually, upon reflection I think it was a mistake to add all these "or > Unicode" clauses to the error msgs to begin with. Python used to have only > one string type, we're saying that's also a hope for the future, and in the > meantime I know I'd have no trouble understanding "string" as including both > 8-bit strings and Unicode strings. > > So we should say "8-bit string" or "Unicode string" when *only* one of those > is allowable. So > > "ord() expected string ..." > > instead of (even a repaired version of) > > "ord() expected string or Unicode character ..." I think this has to do with understanding that there are two string types in Python 2.0 -- a novice won't notice this until she sees the error message. My understanding is similar to yours, "string" should mean "any string object" and in cases where the difference between 8-bit string and Unicode matters, these should be referred to as "8-bit string" and "Unicode string". 
Still, I think it is a good idea to make people aware of the possibility of passing Unicode objects to these functions, so perhaps the idea of adding both possibilities to error messages is not such a bad idea for 2.1. The next phases would be converting all messages back to "string" and then converting all strings to Unicode ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From akuchlin at mems-exchange.org Thu Dec 21 19:37:19 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Thu, 21 Dec 2000 13:37:19 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: ; from tim.one@home.com on Thu, Dec 21, 2000 at 02:44:19AM -0500 References: <20001220094058.A17623@kronos.cnri.reston.va.us> Message-ID: <20001221133719.B11880@kronos.cnri.reston.va.us> On Thu, Dec 21, 2000 at 02:44:19AM -0500, Tim Peters wrote: >So we should say "8-bit string" or "Unicode string" when *only* one of those >is allowable. So OK... how about this patch?
Index: bltinmodule.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v retrieving revision 2.185 diff -u -r2.185 bltinmodule.c --- bltinmodule.c 2000/12/20 15:07:34 2.185 +++ bltinmodule.c 2000/12/21 18:36:54 @@ -1524,13 +1524,14 @@ } } else { PyErr_Format(PyExc_TypeError, - "ord() expected string or Unicode character, " \ + "ord() expected string of length 1, but " \ "%.200s found", obj->ob_type->tp_name); return NULL; } PyErr_Format(PyExc_TypeError, - "ord() expected a character, length-%d string found", + "ord() expected a character, " + "but string of length %d found", size); return NULL; } From thomas at xs4all.net Fri Dec 22 16:21:43 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 22 Dec 2000 16:21:43 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: ; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800 References: Message-ID: <20001222162143.A5515@xs4all.nl> On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply at sourceforge.net wrote: > * Guido-style: 8-column hard-tab indents. > * New style: 4-column space-only indents. Hm, I must have missed this... Is 'new style' the preferred style, as its name suggests, or is Guido mounting a rebellion to adhere to the One True Style (or rather his own version of it, which just has the * in pointer type declarations wrong ? :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From fdrake at acm.org Fri Dec 22 16:31:21 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 22 Dec 2000 10:31:21 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <20001222162143.A5515@xs4all.nl> References: <20001222162143.A5515@xs4all.nl> Message-ID: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> Thomas Wouters writes: > Hm, I must have missed this... 
Is 'new style' the preferred style, as its name suggests, or is Guido mounting a rebellion to adhere to the One True Style (or rather his own version of it, which just has the * in pointer type declarations wrong? :) Guido has grudgingly granted that new code in the "New style" is acceptable, mostly because many people complain that "Guido style" causes too much code to get scrunched up on the right margin. The "New style" is more like the recommendations for Python code as well, so it's easier for Python programmers to read (Tabs are hard to read clearly! ;). -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From cgw at fnal.gov Fri Dec 22 16:43:45 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Fri, 22 Dec 2000 09:43:45 -0600 (CST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> Message-ID: <14915.30385.201343.360880@buffalo.fnal.gov> Fred L. Drake, Jr. writes: > Guido has grudgingly granted that new code in the "New style" is acceptable, mostly because many people complain that "Guido style" causes too much code to get scrunched up on the right margin. I am reminded of Linus Torvalds' comments on this subject (see /usr/src/linux/Documentation/CodingStyle): Now, some people will claim that having 8-character indentations makes the code move too far to the right, and makes it hard to read on a 80-character terminal screen. The answer to that is that if you need more than 3 levels of indentation, you're screwed anyway, and should fix your program. From fdrake at acm.org Fri Dec 22 16:58:56 2000 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Dec 2000 10:58:56 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.30385.201343.360880@buffalo.fnal.gov> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> Message-ID: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> Charles G Waldman writes: > I am reminded of Linus Torvalds' comments on this subject (see /usr/src/linux/Documentation/CodingStyle): The catch, of course, is Python/ceval.c, where breaking it up can hurt performance. People scream when you do things like that.... -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From cgw at fnal.gov Fri Dec 22 17:07:47 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Fri, 22 Dec 2000 10:07:47 -0600 (CST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> Message-ID: <14915.31827.250987.283364@buffalo.fnal.gov> Fred L. Drake, Jr. writes: > The catch, of course, is Python/ceval.c, where breaking it up can hurt performance. People scream when you do things like that.... Quoting again from the same source: Use helper functions with descriptive names (you can ask the compiler to in-line them if you think it's performance-critical, and it will probably do a better job of it than you would have done). But I should have pointed out that I was quoting the great Linus mostly for entertainment/cultural value, and was not really trying to add fuel to the fire.
In other words, a message that I thought was amusing, but probably shouldn't have sent ;-) From fdrake at acm.org Fri Dec 22 17:20:52 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 22 Dec 2000 11:20:52 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14915.31827.250987.283364@buffalo.fnal.gov> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> <14915.31827.250987.283364@buffalo.fnal.gov> Message-ID: <14915.32612.252115.562296@cj42289-a.reston1.va.home.com> Charles G Waldman writes: > But I should have pointed out that I was quoting the great Linus > mostly for entertainment/cultural value, and was not really trying to > add fuel to the fire. In other words, a message that I thought was > amusing, but probably shouldn't have sent ;-) I understood the intent; I think he's really got a point. There are a few places in Python where it would really help to break things up! -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From fredrik at effbot.org Fri Dec 22 17:33:37 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Fri, 22 Dec 2000 17:33:37 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support References: <20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov> <14915.32612.252115.562296@cj42289-a.reston1.va.home.com> Message-ID: <004b01c06c34$f08151c0$e46940d5@hagrid> Fred wrote: > I understood the intent; I think he's really got a point. There are > a few places in Python where it would really help to break things up! if that's what you want, maybe you could start by putting the INLINE stuff back again? 
(if C/C++ compatibility is a problem, put it inside a cplusplus ifdef, and mark it as "for internal use only. don't use inline on public interfaces") From fdrake at acm.org Fri Dec 22 17:36:15 2000 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri, 22 Dec 2000 11:36:15 -0500 (EST) Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <004b01c06c34$f08151c0$e46940d5@hagrid> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> <14915.31827.250987.283364@buffalo.fnal.gov> <14915.32612.252115.562296@cj42289-a.reston1.va.home.com> <004b01c06c34$f08151c0$e46940d5@hagrid> Message-ID: <14915.33535.520957.215310@cj42289-a.reston1.va.home.com> Fredrik Lundh writes: > if that's what you want, maybe you could start by > putting the INLINE stuff back again? I could not see the value in the inline stuff that configure was setting up, and still don't. > (if C/C++ compatibility is a problem, put it inside a > cplusplus ifdef, and mark it as "for internal use only. > don't use inline on public interfaces") We should be able to come up with something reasonable, but I don't have time right now, and my head isn't currently wrapped around C compilers. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From akuchlin at cnri.reston.va.us Fri Dec 22 19:01:43 2000 From: akuchlin at cnri.reston.va.us (Andrew Kuchling) Date: Fri, 22 Dec 2000 13:01:43 -0500 Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: ; from noreply@sourceforge.net on Fri, Dec 22, 2000 at 07:07:03AM -0800 References: Message-ID: <20001222130143.B7127@newcnri.cnri.reston.va.us> On Fri, Dec 22, 2000 at 07:07:03AM -0800, noreply at sourceforge.net wrote: > * Guido-style: 8-column hard-tab indents. > * New style: 4-column space-only indents. > * _curses style: 2 column indents. 
> >I'd prefer "New style", myself. New style it is. (Barry, is the "python" style in cc-mode.el going to be changed to new style, or a "python2" style added?) I've been wanting to reformat _cursesmodule.c to match the Python style for some time. Probably I'll do that a little while after the panel module has settled down a bit. Fred, did you look at the use of the CObject for exposing the API? Did that look reasonable? Also, should py_curses.h go in the Include/ subdirectory instead of Modules/? --amk From fredrik at effbot.org Fri Dec 22 19:03:43 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Fri, 22 Dec 2000 19:03:43 +0100 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support References: <20001222162143.A5515@xs4all.nl><14915.29641.806901.661707@cj42289-a.reston1.va.home.com><14915.30385.201343.360880@buffalo.fnal.gov><14915.31296.56181.260479@cj42289-a.reston1.va.home.com><14915.31827.250987.283364@buffalo.fnal.gov><14915.32612.252115.562296@cj42289-a.reston1.va.home.com><004b01c06c34$f08151c0$e46940d5@hagrid> <14915.33535.520957.215310@cj42289-a.reston1.va.home.com> Message-ID: <006701c06c41$896a1a00$e46940d5@hagrid> Fred wrote: > > if that's what you want, maybe you could start by > > putting the INLINE stuff back again? > > I could not see the value in the inline stuff that configure was > setting up, and still don't. the INLINE stuff guarantees that "inline" is defined to be whatever directive the compiler uses for explicit inlining. quoting the autoconf docs: If the C compiler supports the keyword inline, do nothing. Otherwise define inline to __inline__ or __inline if it accepts one of those, otherwise define inline to be empty as a result, you can always use "inline" in your code, and have it do the right thing on all compilers that support ex- plicit inlining (all modern C compilers, in practice). 
::: to deal with people compiling Python with a C compiler, but linking it with a C++ compiler, the config.h.in file could be written as: /* Define "inline" to be whatever the C compiler calls it. To avoid problems when mixing C and C++, make sure to only use "inline" for internal interfaces. */ #ifndef __cplusplus #undef inline #endif From akuchlin at mems-exchange.org Fri Dec 22 20:40:15 2000 From: akuchlin at mems-exchange.org (A.M. Kuchling) Date: Fri, 22 Dec 2000 14:40:15 -0500 Subject: [Python-Dev] PEP 222 draft Message-ID: <200012221940.OAA01936@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com> I've completed a draft of PEP 222 (sort of -- note the XXX comments in the text for things that still need to be resolved). This is being posted to python-dev, python-web-modules, and python-list/comp.lang.python, to get comments on the proposed interface. I'm on all three lists, but would prefer to see followups on python-list/comp.lang.python, so if you can reply there, please do so. --amk Abstract This PEP proposes a set of enhancements to the CGI development facilities in the Python standard library. Enhancements might be new features, new modules for tasks such as cookie support, or removal of obsolete code. The intent is to incorporate the proposals emerging from this document into Python 2.1, due to be released in the first half of 2001. Open Issues This section lists changes that have been suggested, but about which no firm decision has yet been made. In the final version of this PEP, this section should be empty, as all the changes should be classified as accepted or rejected. cgi.py: We should not be told to create our own subclass just so we can handle file uploads. As a practical matter, I have yet to find the time to do this right, so I end up reading cgi.py's temp file into, at best, another file. Some of our legacy code actually reads it into a second temp file, then into a final destination! 
And even if we did, that would mean creating yet another object with its __init__ call and associated overhead. cgi.py: Currently, query data with no `=' are ignored. Even if keep_blank_values is set, queries like `...?value=&...' are returned with blank values but queries like `...?value&...' are completely lost. It would be great if such data were made available through the FieldStorage interface, either as entries with None as values, or in a separate list. Utility function: build a query string from a list of 2-tuples Dictionary-related utility classes: NoKeyErrors (returns an empty string, never a KeyError), PartialStringSubstitution (returns the original key string, never a KeyError) New Modules This section lists details about entire new packages or modules that should be added to the Python standard library. * fcgi.py : A new module adding support for the FastCGI protocol. Robin Dunn's code needs to be ported to Windows, though. Major Changes to Existing Modules This section lists details of major changes to existing modules, whether in implementation or in interface. The changes in this section therefore carry greater degrees of risk, either in introducing bugs or a backward incompatibility. The cgi.py module would be deprecated. (XXX A new module or package name hasn't been chosen yet: 'web'? 'cgilib'?) Minor Changes to Existing Modules This section lists details of minor changes to existing modules. These changes should have relatively small implementations, and have little risk of introducing incompatibilities with previous versions. Rejected Changes The changes listed in this section were proposed for Python 2.1, but were rejected as unsuitable. For each rejected change, a rationale is given describing why the change was deemed inappropriate. * An HTML generation module is not part of this PEP. Several such modules exist, ranging from HTMLgen's purely programming interface to ASP-inspired simple templating to DTML's complex templating. 
There's no indication of which templating module to enshrine in the standard library, and that probably means that no module should be so chosen. * cgi.py: Allowing a combination of query data and POST data. This doesn't seem to be standard at all, and therefore is dubious practice. Proposed Interface XXX open issues: naming convention (studlycaps or underline-separated?); need to look at the cgi.parse*() functions and see if they can be simplified, too. Parsing functions: carry over most of the parse* functions from cgi.py # The Response class borrows most of its methods from Zope's # HTTPResponse class. class Response: """ Attributes: status: HTTP status code to return headers: dictionary of response headers body: string containing the body of the HTTP response """ def __init__(self, status=200, headers={}, body=""): pass def setStatus(self, status, reason=None): "Set the numeric HTTP response code" pass def setHeader(self, name, value): "Set an HTTP header" pass def setBody(self, body): "Set the body of the response" pass def setCookie(self, name, value, path = '/', comment = None, domain = None, max_age = None, expires = None, secure = 0 ): "Set a cookie" pass def expireCookie(self, name): "Remove a cookie from the user" pass def redirect(self, url): "Redirect the browser to another URL" pass def __str__(self): "Convert entire response to a string" pass def dump(self): "Return a string representation useful for debugging" pass # XXX methods for specific classes of error: serverError, badRequest, etc.? class Request: """ Attributes: XXX should these be dictionaries, or dictionary-like objects?
.headers : dictionary containing HTTP headers .cookies : dictionary of cookies .fields : data from the form .env : environment dictionary """ def __init__(self, environ=os.environ, stdin=sys.stdin, keep_blank_values=1, strict_parsing=0): """Initialize the request object, using the provided environment and standard input.""" pass # Should people just use the dictionaries directly? def getHeader(self, name, default=None): pass def getCookie(self, name, default=None): pass def getField(self, name, default=None): "Return field's value as a string (even if it's an uploaded file)" pass def getUploadedFile(self, name): """Returns a file object that can be read to obtain the contents of an uploaded file. XXX should this report an error if the field isn't actually an uploaded file? Or should it wrap a StringIO around simple fields for consistency? """ def getURL(self, n=0, query_string=0): """Return the URL of the current request, chopping off 'n' path components from the right. Eg. if the URL is "http://foo.com/bar/baz/quux", n=2 would return "http://foo.com/bar". Does not include the query string (if any) """ def getBaseURL(self, n=0): """Return the base URL of the current request, adding 'n' path components to the end to recreate more of the whole URL. Eg. if the request URL is "http://foo.com/q/bar/baz/qux", n=0 would return "http://foo.com/", and n=2 "http://foo.com/q/bar". Returned URL does not include the query string, if any. """ def dump(self): "String representation suitable for debugging output" pass # Possibilities? I don't know if these are worth doing in the # basic objects. def getBrowser(self): "Returns Mozilla/IE/Lynx/Opera/whatever" def isSecure(self): "Return true if this is an SSLified request" # Module-level function def wrapper(func, logfile=sys.stderr): """ Calls the function 'func', passing it the arguments (request, response, logfile). Exceptions are trapped and sent to the file 'logfile'. 
""" # This wrapper will detect if it's being called from the command-line, # and if so, it will run in a debugging mode; name=value pairs # can be entered on standard input to set field values. # (XXX how to do file uploads in this syntax?) Copyright This document has been placed in the public domain. From tim.one at home.com Fri Dec 22 20:31:07 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 22 Dec 2000 14:31:07 -0500 Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support) In-Reply-To: <20001222162143.A5515@xs4all.nl> Message-ID: [Thomas Wouters] >> * Guido-style: 8-column hard-tab indents. >> * New style: 4-column space-only indents. > > Hm, I must have missed this... Is 'new style' the preferred style, as > its name suggests, or is Guido mounting a rebellion to adhere to the > One True Style (or rather his own version of it, which just has > the * in pointer type declarations wrong ? :) Every time this comes up wrt C code, 1. Fred repeats that he thinks Guido caved in (but doesn't supply a reference to anything saying so). 2. Guido repeats that he prefers old-style (but in a wishy-washy way that leaves it uncertain (*)). 3. Fredrik and/or I repeat a request for a BDFL Pronouncement. 4. And there the thread ends. It's *very* hard to find this history in the Python-Dev archives because these threads always have subject lines like this one originally had ("RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support"). Fred already did the #1 bit in this thread. You can consider this msg the repeat of #3. Since Guido is out of town, we can skip #2 and go straight to #4 early . (*) Two examples of #2 from this year: Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/ Modules mmapmodule.c,2.1,2.2 From: Guido van Rossum Date: Fri, 31 Mar 2000 07:10:45 -0500 > Can we change the 8-space-tab rule for all new C code that goes in? 
I know that we can't practically change existing code right now, but for new C code, I propose we use no tab characters, and we use a 4-space block indentation. Actually, this one was formatted for 8-space indents but using 4-space tabs, so in my editor it looked like 16-space indents! Given that we don't want to change existing code, I'd prefer to stick with 1-tab 8-space indents. Subject: Re: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules linuxaudiodev.c,2.2,2.3 From: Guido van Rossum Date: Sat, 08 Jul 2000 09:39:51 -0500 > Aren't tabs preferred as C-source indents, instead of 4-spaces? At least, that's what I see in Python/*.c and Object/*.c, but I only vaguely recall it from the style document... Yes, you're right. From fredrik at effbot.org Fri Dec 22 21:37:35 2000 From: fredrik at effbot.org (Fredrik Lundh) Date: Fri, 22 Dec 2000 21:37:35 +0100 Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support) References: Message-ID: <00e201c06c57$052fff00$e46940d5@hagrid> > 3. Fredrik and/or I repeat a request for a BDFL Pronouncement. and. From akuchlin at mems-exchange.org Fri Dec 22 22:09:47 2000 From: akuchlin at mems-exchange.org (A.M. Kuchling) Date: Fri, 22 Dec 2000 16:09:47 -0500 Subject: [Python-Dev] Reviving the bookstore Message-ID: <200012222109.QAA02737@207-172-57-45.s45.tnt2.ann.va.dialup.rcn.com> Since the PSA isn't doing anything for us any longer, I've been working on reviving the bookstore at a new location with a new affiliate code. A draft version is up at its new home, http://www.kuchling.com/bookstore/ . Please take a look and offer comments. Book authors, please take a look at the entry for your book and let me know about any corrections. Links to reviews of books would also be really welcomed.
I'd like to abolish having book listings with no description or review, so if you notice a book that you've read has no description, please feel free to submit a description and/or review. --amk From tim.one at home.com Sat Dec 23 08:15:59 2000 From: tim.one at home.com (Tim Peters) Date: Sat, 23 Dec 2000 02:15:59 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: <3A41E68B.6B12CD71@lemburg.com> Message-ID: [Tim] > ... > So we should say "8-bit string" or "Unicode string" when *only* > one of those is allowable. So > > "ord() expected string ..." > > instead of (even a repaired version of) > > "ord() expected string or Unicode character ..." [MAL] > I think this has to do with understanding that there are two > string types in Python 2.0 -- a novice won't notice this until > she sees the error message. Except that this error msg has nothing to do with how many string types there are: they didn't pass *any* flavor of string when they get this msg. At the time they pass (say) a float to ord(), that there are currently two flavors of string is more information than they need to know. > My understanding is similar to yours, "string" should mean > "any string object" and in cases where the difference between > 8-bit string and Unicode matters, these should be referred to > as "8-bit string" and "Unicode string". In that happy case of universal harmony, the msg above should say just "string" and leave it at that. > Still, I think it is a good idea to make people aware of the > possibility of passing Unicode objects to these functions, Me too. > so perhaps the idea of adding both possibilies to error messages > is not such a bad idea for 2.1. But not that. The user is trying to track down their problem. Advertising an irrelevant (to their problem) distinction at that time of crisis is simply spam. TypeError: ord() requires an 8-bit string or a Unicode string. 
On the other hand, you'd be surprised to discover all the things you can pass to chr(): it's not just ints. Long ints are also accepted, by design, and due to an obscure bug in the Python internals, you can also pass floats, which get truncated to ints. > The next phases would be converting all messages back to "string" > and then convert all strings to Unicode ;-) Then we'll save a lot of work by skipping the need for the first half of that -- unless you're volunteering to do all of it . From tim.one at home.com Sat Dec 23 08:16:29 2000 From: tim.one at home.com (Tim Peters) Date: Sat, 23 Dec 2000 02:16:29 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102955] bltinmodule.c warning fix In-Reply-To: <20001221133719.B11880@kronos.cnri.reston.va.us> Message-ID: [Tim] > So we should say "8-bit string" or "Unicode string" when *only* > one of those is allowable. [Andrew] > OK... how about this patch? +1 from me. And maybe if you offer to send a royalty to Marc-Andre each time it's printed, he'll back down from wanting to use the error msgs as a billboard . > Index: bltinmodule.c > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Python/bltinmodule.c,v > retrieving revision 2.185 > diff -u -r2.185 bltinmodule.c > --- bltinmodule.c 2000/12/20 15:07:34 2.185 > +++ bltinmodule.c 2000/12/21 18:36:54 > @@ -1524,13 +1524,14 @@ > } > } else { > PyErr_Format(PyExc_TypeError, > - "ord() expected string or Unicode character, " \ > + "ord() expected string of length 1, but " \ > "%.200s found", obj->ob_type->tp_name); > return NULL; > } > > PyErr_Format(PyExc_TypeError, > - "ord() expected a character, length-%d string found", > + "ord() expected a character, " > + "but string of length %d found", > size); > return NULL; > } From barry at digicool.com Sat Dec 23 17:43:37 2000 From: barry at digicool.com (Barry A. 
Warsaw) Date: Sat, 23 Dec 2000 11:43:37 -0500 Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support References: <20001222130143.B7127@newcnri.cnri.reston.va.us> Message-ID: <14916.54841.418495.194558@anthem.concentric.net> >>>>> "AK" == Andrew Kuchling writes: AK> New style it is. (Barry, is the "python" style in cc-mode.el AK> going to be changed to new style, or a "python2" style added?) There should probably be a second style added to cc-mode.el. I haven't maintained that package in a long time, but I'll work out a patch and send it to the current maintainer. Let's call it "python2". -Barry From cgw at fnal.gov Sat Dec 23 18:09:57 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Sat, 23 Dec 2000 11:09:57 -0600 (CST) Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: <14916.54841.418495.194558@anthem.concentric.net> References: <20001222130143.B7127@newcnri.cnri.reston.va.us> <14916.54841.418495.194558@anthem.concentric.net> Message-ID: <14916.56421.370499.762023@buffalo.fnal.gov> Barry A. Warsaw writes: > There should probably be a second style added to cc-mode.el. I > haven't maintained that package in a long time, but I'll work out a > patch and send it to the current maintainer. Let's call it > "python2". Maybe we should wait for the BDFL's pronouncement? From barry at digicool.com Sat Dec 23 20:24:42 2000 From: barry at digicool.com (Barry A. Warsaw) Date: Sat, 23 Dec 2000 14:24:42 -0500 Subject: [Python-Dev] [Patch #102813] _cursesmodule: Add panel support References: <20001222130143.B7127@newcnri.cnri.reston.va.us> <14916.54841.418495.194558@anthem.concentric.net> <14916.56421.370499.762023@buffalo.fnal.gov> Message-ID: <14916.64506.56351.443287@anthem.concentric.net> >>>>> "CGW" == Charles G Waldman writes: CGW> Barry A. Warsaw writes: >> There should probably be a second style added to cc-mode.el. 
I haven't maintained that package in a long time, but I'll work out a patch and send it to the current maintainer. Let's call it "python2". CGW> Maybe we should wait for the BDFL's pronouncement? Sure, at least before submitting a patch. Here's the simple one-liner you can add to your .emacs file to play with the new style in the meantime. -Barry (c-add-style "python2" '("python" (c-basic-offset . 4))) From tim.one at home.com Sun Dec 24 05:04:47 2000 From: tim.one at home.com (Tim Peters) Date: Sat, 23 Dec 2000 23:04:47 -0500 Subject: [Python-Dev] PEP 208 and __coerce__ In-Reply-To: <20001209033006.A3737@glacier.fnational.com> Message-ID: [Neil Schemenauer Saturday, December 09, 2000 6:30 AM] > While working on the implementation of PEP 208, I discovered that __coerce__ has some surprising properties. Initially I implemented __coerce__ so that the numeric operation currently being performed was called on the values returned by __coerce__. This caused test_class to blow up due to code like this: > > class Test: > def __coerce__(self, other): > return (self, other) > > The 2.0 "solves" this by not calling __coerce__ again if the objects returned by __coerce__ are instances. If C.__coerce__ doesn't *know* it can do the full job, it should return None. This is what's documented, too: a coerce method should return a pair consisting of objects of the same type, or return None. It's always going to be somewhat clumsy since what you really want is double (or, in the case of pow, sometimes triple) dispatch. Now there's a deliberate cheat that may not have gotten documented comprehensibly: when __coerce__ returns a pair, Python does not check to verify both elements are of the same class. That's because "a pair consisting of objects of the same type" is often not what you *want* from coerce.
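The contract Tim describes -- return a usable pair, or None when the full job can't be done -- can be sketched as follows. The class name is invented for illustration, and since __coerce__ is a Python 2 protocol (the interpreter no longer calls it in Python 3), the method is invoked directly here rather than through `+`:

```python
class Meters:
    """Toy numeric type following the documented __coerce__ contract:
    return a pair the operation can use, or None to signal that this
    class cannot do the full coercion job."""
    def __init__(self, n):
        self.n = float(n)

    def __coerce__(self, other):
        if isinstance(other, Meters):
            return self, other            # already compatible
        if isinstance(other, (int, float)):
            return self, Meters(other)    # promote the scalar
        return None                       # can't coerce this; say so

m = Meters(2)
a, b = m.__coerce__(3)
print(b.n)                  # 3.0: the scalar was promoted
print(m.__coerce__("x"))    # None: strings are not our job
```

This matches the rule stated above: a __coerce__ that can only get partway there must return None rather than a half-coerced pair.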
For example, if I've got a matrix class M, then in M() + 42 I really don't want M.__coerce__ "promoting" 42 to a multi-gigabyte matrix matching the shape and size of M(). M.__add__ can deal with that much more efficiently if it gets 42 directly. OTOH, M.__coerce__ may want to coerce types other than scalar numbers to conform to the shape and size of self, or fiddle self to conform to some other type. What Python accepts back from __coerce__ has to be flexible enough to allow all of those without further interference from the interpreter (just ask MAL : the *real* problem in practice is making coerce more of a help than a burden to the end user; outside of int->long->float->complex (which is itself partly broken, because long->float can lose precision or even fail outright), "coercion to a common type" is almost never quite right; note that C99 introduces distinct imaginary and complex types, because even auto-conversion of imaginary->complex can be a royal PITA!). > This has the effect of making code like: > > class A: > def __coerce__(self, other): > return B(), other > > class B: > def __coerce__(self, other): > return 1, other > > A() + 1 > > fail to work in the expected way. I have no idea how you expected that to work. Neither coerce() method looks reasonable: they don't follow the rules for coerce methods. If A thinks it needs to create a B() and have coercion "start over from scratch" with that, then it should do so explicitly: class A: def __coerce__(self, other): return coerce(B(), other) > The question is: how should __coerce__ work? This can't be answered by a one-liner: the intended behavior is documented by a complex set of rules at the bottom of Lang Ref 3.3.6 ("Emulating numeric types"). Alternatives should be written up as a diff against those rules, which Guido worked hard on in years past -- more than once, too . From esr at thyrsus.com Mon Dec 25 10:17:23 2000 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Mon, 25 Dec 2000 04:17:23 -0500 Subject: [Python-Dev] Tkinter support under RH 7.0? Message-ID: <20001225041723.A9567@thyrsus.com> I just upgraded to Red Hat 7.0 and installed Python 2.0. Anybody have a recipe for making Tkinter support work in this environment? -- Eric S. Raymond "Government is not reason, it is not eloquence, it is force; like fire, a troublesome servant and a fearful master. Never for a moment should it be left to irresponsible action." -- George Washington, in a speech of January 7, 1790 From thomas at xs4all.net Mon Dec 25 11:59:45 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Mon, 25 Dec 2000 11:59:45 +0100 Subject: [Python-Dev] Tkinter support under RH 7.0? In-Reply-To: <20001225041723.A9567@thyrsus.com>; from esr@thyrsus.com on Mon, Dec 25, 2000 at 04:17:23AM -0500 References: <20001225041723.A9567@thyrsus.com> Message-ID: <20001225115945.A25820@xs4all.nl> On Mon, Dec 25, 2000 at 04:17:23AM -0500, Eric S. Raymond wrote: > I just upgraded to Red Hat 7.0 and installed Python 2.0. Anybody have > a recipe for making Tkinter support work in this environment? I installed Python 2.0 + Tkinter both from the BeOpen rpms and later from source (for various reasons) and both were a breeze. I didn't really use the 2.0+tkinter rpm version until I needed Numpy and various other things and had to revert to the self-compiled version, but it seemed to work fine. As far as I can recall, there's only two things you have to keep in mind: the tcl/tk version that comes with RedHat 7.0 is 8.3, so you have to adjust the Tkinter section of Modules/Setup accordingly, and some of the RedHat-supplied scripts stop working because they use deprecated modules (at least 'rand') and use the socket.socket call wrong. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From esr at thyrsus.com Wed Dec 27 20:37:50 2000 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 27 Dec 2000 14:37:50 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues Message-ID: <20001227143750.A26894@thyrsus.com> I have 2.0 up and running on RH7.0, compiled from sources. In the process, I discovered a couple of issues: 1. The curses module is commented out in the default Modules/Setup file. This is not good, as it may lead careless distribution builders to ship Python 2.0s that will not be able to support the curses front end in CML2. Supporting CML2 (and thus getting Python the "design win" of being involved in the Linux kernel build) was the major point of integrating the curses module into the Python core. It is possible that one little "#" may have blown that. 2. The default Modules/Setup file assumes that various Tkinter-related libraries are in /usr/local. But /usr would be a more appropriate choice under most circumstances. Most Linux users now install their Tcl/Tk stuff from RPMs or .deb packages that place the binaries and libraries under /usr. Under most other Unixes (e.g. Solaris) they were there to begin with. 3. The configure machinery could be made to deduce more about the contents of Modules/Setup than it does now. In particular, it's silly that the person building Python has to fill in the locations of X libraries when configure is in principle perfectly capable of finding them. -- Eric S. Raymond Our society won't be truly free until "None of the Above" is always an option. From guido at digicool.com Wed Dec 27 22:04:27 2000 From: guido at digicool.com (Guido van Rossum) Date: Wed, 27 Dec 2000 16:04:27 -0500 Subject: C indentation style (was RE: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support) In-Reply-To: Your message of "Fri, 22 Dec 2000 14:31:07 EST." References: Message-ID: <200012272104.QAA22278@cj20424-a.reston1.va.home.com> > 2. Guido repeats that he prefers old-style (but in a wishy-washy way that > leaves it uncertain (*)).
OK, since a pronouncement is obviously needed, here goes: Python C source code should be indented using tabs only. Exceptions: (1) If 3rd party code is already written using a different style, it can stay that way, especially if it's a large volume that would be hard to reformat. But only if it is consistent within a file or set of files (e.g. a 3rd party patch will have to conform to the prevailing style in the patched file). (2) Occasionally (e.g. in ceval.c) there is code that's very deeply nested. I will allow 4-space indents for the innermost nesting levels here. Other C whitespace nits: - Always place spaces around assignment operators, comparisons, &&, ||. - No space between function name and left parenthesis. - Always a space between a keyword ('if', 'for' etc.) and left paren. - No space inside parentheses, brackets etc. - No space before a comma or semicolon. - Always a space after a comma (and semicolon, if not at end of line). - Use ``return x;'' instead of ``return(x)''. --Guido van Rossum (home page: http://www.python.org/~guido/) From cgw at fnal.gov Wed Dec 27 23:17:31 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Wed, 27 Dec 2000 16:17:31 -0600 (CST) Subject: [Python-Dev] sourceforge: problems with bug list? Message-ID: <14922.27259.456364.750295@buffalo.fnal.gov> Is it just me, or is anybody else getting this error when trying to access the bug list? > An error occured in the logger. ERROR: pg_atoi: error in "5470/": > can't parse "/" From akuchlin at mems-exchange.org Wed Dec 27 23:39:35 2000 From: akuchlin at mems-exchange.org (Andrew Kuchling) Date: Wed, 27 Dec 2000 17:39:35 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500 References: <20001227143750.A26894@thyrsus.com> Message-ID: <20001227173935.A25605@kronos.cnri.reston.va.us> On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote: >1. 
The curses module is commented out in the default Modules/Setup >file. This is not good, as it may lead careless distribution builders It always has been commented out. Good distributions ship with most of the available modules enabled; I can't say if RH7.0 counts as a good distribution or not (still on 6.2). >3. The configure machinery could be made to deduce more about the contents >of Modules/Setup than it does now. In particular, it's silly that the person This is the point of PEP 229 and patch #102588, which uses a setup.py script to build extension modules. (I need to upload an updated version of the patch which actually includes setup.py -- thought I did that, but apparently not...) The patch is still extremely green, but I think it's the best course; witness the tissue of hackery required to get the bsddb module automatically detected and built. --amk From guido at digicool.com Wed Dec 27 23:54:26 2000 From: guido at digicool.com (Guido van Rossum) Date: Wed, 27 Dec 2000 17:54:26 -0500 Subject: [Python-Dev] Re: [Patches] [Patch #102813] _cursesmodule: Add panel support In-Reply-To: Your message of "Fri, 22 Dec 2000 10:58:56 EST." <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> References: <20001222162143.A5515@xs4all.nl> <14915.29641.806901.661707@cj42289-a.reston1.va.home.com> <14915.30385.201343.360880@buffalo.fnal.gov> <14915.31296.56181.260479@cj42289-a.reston1.va.home.com> Message-ID: <200012272254.RAA22931@cj20424-a.reston1.va.home.com> > Charles G Waldman writes: > > I am reminded of Linus Torvalds comments on this subject (see > > /usr/src/linux/Documentation/CodingStyle): Fred replied: > The catch, of course, is Python/ceval.c, where breaking it up can > hurt performance. People scream when you do things like that.... Funny, Jeremy is doing just that, and it doesn't seem to be hurting performance at all. See http://sourceforge.net/patch/?func=detailpatch&patch_id=102337&group_id=5470 (though this is not quite finished).
--Guido van Rossum (home page: http://www.python.org/~guido/) From esr at thyrsus.com Thu Dec 28 00:05:46 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Dec 2000 18:05:46 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001227173935.A25605@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Dec 27, 2000 at 05:39:35PM -0500 References: <20001227143750.A26894@thyrsus.com> <20001227173935.A25605@kronos.cnri.reston.va.us> Message-ID: <20001227180546.A4365@thyrsus.com> Andrew Kuchling : > >1. The curses module is commented out in the default Modules/Setup > >file. This is not good, as it may lead careless distribution builders > > It always has been commented out. Good distributions ship with most > of the available modules enabled; I can't say if RH7.0 counts as a > good distribution or not (still on 6.2). I think this needs to change. If curses is a core facility now, the default build should treat it as one. -- Eric S. Raymond If a thousand men were not to pay their tax-bills this year, that would ... [be] the definition of a peaceable revolution, if any such is possible. -- Henry David Thoreau From tim.one at home.com Thu Dec 28 01:44:29 2000 From: tim.one at home.com (Tim Peters) Date: Wed, 27 Dec 2000 19:44:29 -0500 Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109 In-Reply-To: Message-ID: [Barry Warsaw] > Modified Files: > python-mode.el > Log Message: > (python-font-lock-keywords): Add highlighting of `as' as a keyword, > but only in "import foo as bar" statements (including optional > preceding `from' clause). Oh, that's right, try to make IDLE look bad, will you? I've got half a mind to take up the challenge. Unfortunately, I only have half a mind in total, so you may get away with this backstabbing for a while.
From thomas at xs4all.net Thu Dec 28 10:53:31 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 10:53:31 +0100 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001227143750.A26894@thyrsus.com>; from esr@thyrsus.com on Wed, Dec 27, 2000 at 02:37:50PM -0500 References: <20001227143750.A26894@thyrsus.com> Message-ID: <20001228105331.A6042@xs4all.nl> On Wed, Dec 27, 2000 at 02:37:50PM -0500, Eric S. Raymond wrote: > I have 2.0 up and running on RH7.0, compiled from sources. In the process, > I discovered a couple of issues: > 1. The curses module is commented out in the default Modules/Setup > file. This is not good, as it may lead careless distribution builders > to ship Python 2.0s that will not be able to support the curses front > end in CML2. Supporting CML2 (and thus getting Python the "design > win" of being involved in the Linux kernel build) was the major point > of integrating the curses module into the Python core. It is possible > that one little "#" may have blown that. Note that Tkinter is off by default too. And readline. And ssl. And the use of shared libraries. We *can't* enable the cursesmodule by default, because we don't know what the system's curses library is called. We'd have to auto-detect that before we can enable it (and lots of other modules) automatically, and that's a lot of work. I personally favour autoconf for the job, but since amk is already busy on using distutils, I'm not going to work on that. > 2.The default Modules/Setup file assumes that various Tkinter-related libraries > are in /usr/local. But /usr would be a more appropriate choice under most > circumstances. Most Linux users now install their Tcl/Tk stuff from RPMs > or .deb packages that place the binaries and libraries under /usr. Under > most other Unixes (e.g. Solaris) they were there to begin with. This is nonsense. The line above it specifically states 'edit to reflect where your Tcl/Tk headers are'. 
And besides from the issue whether they are usually found in /usr (I don't believe so, not even on Solaris, but 'my' Solaris box doesn't even have tcl/tk) /usr/local is a perfectly sane choice, since /usr is already included in the include-path, but /usr/local usually is not. > 3. The configure machinery could be made to deduce more about the contents > of Modules/Setup than it does now. In particular, it's silly that the person > building Python has to fill in the locations of X libraries when > configure is in principle perfectly capable of finding them. In principle, I agree. It's a lot of work, though. For instance, Debian stores the Tcl/Tk headers in /usr/include/tcl, which means you can compile for more than one tcl version, by just changing your include path and the library you link with. And there are undoubtedly several other variants out there. Should we really make the Setup file default to Linux, and leave other operating systems in the dark about what it might be on their system ? I think people with Linux and without clue are the least likely people to compile their own Python, since Linux distributions already come with a decent enough Python. And, please, let's assume the people assembling those know how to read ? Maybe we just need a HOWTO document covering Setup ? (Besides, won't this all be fixed when CML2 comes with a distribution, Eric ? They'll *have* to have working curses/tkinter then :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From MarkH at ActiveState.com Thu Dec 28 13:34:09 2000 From: MarkH at ActiveState.com (Mark Hammond) Date: Thu, 28 Dec 2000 23:34:09 +1100 Subject: [Python-Dev] Fwd: try...else Message-ID: <3A4B3341.5010707@ActiveState.com> Spotted on c.l.python. Although Pythonwin is mentioned, python.exe gives the same results - as does 1.5.2. Seems a reasonable question... [Also, if Robin hasn't been invited to join us here, I think it could make some sense...] Mark.
-------- Original Message -------- Subject: try...else Date: Fri, 22 Dec 2000 18:02:27 +0000 From: Robin Becker Newsgroups: comp.lang.python I had expected that in try: except: else the else clause always got executed, but it seems not for return PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32.Portions Copyright 1994-2000 Mark Hammond (MarkH at ActiveState.com) - see 'Help/About PythonWin' for further copyright information. >>> def bang(): .... try: .... return 'return value' .... except: .... print 'bang failed' .... else: .... print 'bang succeeded' .... >>> bang() 'return value' >>> is this a 'feature' or bug. The 2.0 docs seem not to mention return/continue except for try finally. -- Robin Becker From mal at lemburg.com Thu Dec 28 15:45:49 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 15:45:49 +0100 Subject: [Python-Dev] Fwd: try...else References: <3A4B3341.5010707@ActiveState.com> Message-ID: <3A4B521D.4372224A@lemburg.com> Mark Hammond wrote: > > Spotted on c.l.python. Although Pythonwin is mentioned, python.exe > gives the same results - as does 1.5.2. > > Seems a reasonable question... > > [Also, if Robin hasn't been invited to join us here, I think it could > make some sense...] > > Mark. > -------- Original Message -------- > Subject: try...else > Date: Fri, 22 Dec 2000 18:02:27 +0000 > From: Robin Becker > Newsgroups: comp.lang.python > > I had expected that in try: except: else > the else clause always got executed, but it seems not for return I think Robin mixed up try...finally with try...except...else. The finally clause is executed even in case an exception occurred. He does have a point however that 'return' will bypass try...else and try...finally clauses. I don't think we can change that behaviour, though, as it would break code. 
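[Editor's note: the behaviour Robin ran into can be pinned down with a short demo, written here in current Python 3 syntax (combining except/else/finally in one statement only became legal in 2.5). A return inside the try block skips the else clause entirely; note, though, that a finally clause does still run before the value is handed back, as Moshe demonstrates later in this thread.]

```python
events = []

def bang():
    try:
        return 'return value'
    except Exception:
        events.append('except')
    else:
        events.append('else')      # skipped: the try block exited via return
    finally:
        events.append('finally')   # runs even though the try block returned

result = bang()
print(result, events)
```

Running this prints the returned string with only 'finally' recorded: the else clause fires only when the try block falls off the end without an exception, a return, a break, or a continue.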
> PythonWin 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on > win32.Portions Copyright 1994-2000 Mark Hammond (MarkH at ActiveState.com) > - see 'Help/About PythonWin' for further copyright information. > >>> def bang(): > .... try: > .... return 'return value' > .... except: > .... print 'bang failed' > .... else: > .... print 'bang succeeded' > .... > >>> bang() > 'return value' > >>> > > is this a 'feature' or bug. The 2.0 docs seem not to mention > return/continue except for try finally. > -- > Robin Becker > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://www.python.org/mailman/listinfo/python-dev -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From guido at digicool.com Thu Dec 28 16:04:23 2000 From: guido at digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 10:04:23 -0500 Subject: [Python-Dev] chomp()? Message-ID: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Someone just posted a patch to implement s.chomp() as a string method: http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470 Pseudo code (for those not aware of the Perl function by that name): def chomp(s): if s[-2:] == '\r\n': return s[:-2] if s[-1:] == '\r' or s[-1:] == '\n': return s[:-1] return s I.e. it removes a trailing \r\n, \r, or \n. Any comments? Is this needed given that we have s.rstrip() already? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at digicool.com Thu Dec 28 16:30:57 2000 From: guido at digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 10:30:57 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: Your message of "Wed, 27 Dec 2000 14:37:50 EST." 
<20001227143750.A26894@thyrsus.com> References: <20001227143750.A26894@thyrsus.com> Message-ID: <200012281530.KAA26049@cj20424-a.reston1.va.home.com> Eric, I think your recent posts have shown a worldview that's a bit too Eric-centered. :-) Not all the world is Linux. CML2 isn't the only Python application that matters. Python world domination is not a goal. There is no Eric conspiracy! :-) That said, I think that the future is bright: Andrew is already working on a much more intelligent configuration manager. I believe it would be a mistake to enable curses by default using the current approach to module configuration: it doesn't compile out of the box on every platform, and you wouldn't believe how much email I get from clueless Unix users trying to build Python when there's a problem like that in the distribution. So I'd rather wait for Andrew's work. You could do worse than help him with that, to further your goal! --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Thu Dec 28 16:41:23 2000 From: fdrake at acm.org (Fred L. Drake) Date: Thu, 28 Dec 2000 10:41:23 -0500 Subject: [Python-Dev] chomp()? In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: On Thu, 28 Dec 2000 10:04:23 -0500, Guido wrote: > Someone just posted a patch to implement s.chomp() as a > string method: ... > Any comments? Is this needed given that we have > s.rstrip() already? I've always considered this a different operation from rstrip(). When you intend to be as surgical in your changes as possible, it is important *not* to use rstrip(). I don't feel strongly that it needs to be implemented in C, though I imagine people who do a lot of string processing feel otherwise. It's just hard to beat the performance difference if you are doing this a lot. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From barry at digicool.com Thu Dec 28 17:00:36 2000 From: barry at digicool.com (Barry A.
Warsaw) Date: Thu, 28 Dec 2000 11:00:36 -0500 Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Misc python-mode.el,3.108,3.109 References: Message-ID: <14923.25508.668453.186209@anthem.concentric.net> >>>>> "TP" == Tim Peters writes: TP> [Barry Warsaw] >> Modified Files: python-mode.el Log Message: >> (python-font-lock-keywords): Add highlighting of `as' as a >> keyword, but only in "import foo as bar" statements (including >> optional preceding `from' clause). TP> Oh, that's right, try to make IDLE look bad, will you? I've TP> got half a mind to take up the challenge. Unfortunately, I TP> only have half a mind in total, so you may get away with this TP> backstabbing for a while. With my current network (un)connectivity, I feel like a nuclear sub which can only surface once a month to receive low frequency orders from some remote antenna farm out in New Brunswick. Just think of me as a rogue commander who tries to do as much damage as possible when he's not joyriding in the draft-wake of giant squids. rehoming-all-remaining-missiles-at-the-Kingdom-of-Timbotia-ly y'rs, -Barry From esr at thyrsus.com Thu Dec 28 17:01:54 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 11:01:54 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <200012281530.KAA26049@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 10:30:57AM -0500 References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> Message-ID: <20001228110154.D32394@thyrsus.com> Guido van Rossum : > Not all the world is Linux. CML2 isn't the only Python application > that matters. Python world domination is not a goal. There is no > Eric conspiracy! :-) Perhaps I misunderstood you, then. I thought you considered CML2 a potentially important design win, and that was why curses didn't get dropped from the core. Have you changed your mind about this?
If Python world domination is not a goal then I can only conclude that you haven't had your morning coffee yet :-). There's a more general question here about what it means for something to be in the core language. Developers need to have a clear, bright-line picture of what they can count on to be present. To me this implies that it's the job of the Python maintainers to make sure that a facility declared "core" by its presence in the standard library documentation is always present, for maximum "batteries are included" effect. Yes, dealing with cross-platform variations in linking curses is a pain -- but dealing with that kind of pain so the Python user doesn't have to is precisely our job. Or so I understand it, anyway. -- Eric S. Raymond Conservatism is the blind and fear-filled worship of dead radicals. From moshez at zadka.site.co.il Thu Dec 28 17:51:32 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: 28 Dec 2000 16:51:32 -0000 Subject: [Python-Dev] chomp()? In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <20001228165132.8025.qmail@stimpy.scso.com> On Thu, 28 Dec 2000, Guido van Rossum wrote: > Someone just posted a patch to implement s.chomp() as a string method: ... > Any comments? Is this needed given that we have s.rstrip() already? Yes. i = 0 for line in fileinput.input(): print '%d: %s' % (i, line.chomp()) i += 1 I want that operation to be invertible by sed 's/^[0-9]*: //' From guido at digicool.com Thu Dec 28 18:08:18 2000 From: guido at digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 12:08:18 -0500 Subject: [Python-Dev] scp to sourceforge Message-ID: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> I've seen a thread on this but there was no conclusive answer, so I'm reopening this. I can't SCP updated PEPs to the SourceForge machine. The "pep2html.py -i" command just hangs.
I can ssh into shell.sourceforge.net just fine, but scp just hangs. "scp -v" prints a bunch of things suggesting that it can authenticate itself just fine, ending with these three lines: cj20424-a.reston1.va.home.com: RSA authentication accepted by server. cj20424-a.reston1.va.home.com: Sending command: scp -v -t . cj20424-a.reston1.va.home.com: Entering interactive session. and then nothing. It just sits there. Would somebody please figure out a way to update the PEPs? It's kind of pathetic to see the website not have the latest versions... --Guido van Rossum (home page: http://www.python.org/~guido/) From moshez at zadka.site.co.il Thu Dec 28 17:28:07 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: 28 Dec 2000 16:28:07 -0000 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B521D.4372224A@lemburg.com> References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> Message-ID: <20001228162807.7229.qmail@stimpy.scso.com> On Thu, 28 Dec 2000, "M.-A. Lemburg" wrote: > He does have a point however that 'return' will bypass > try...else and try...finally clauses. I don't think we can change > that behaviour, though, as it would break code. It doesn't bypass try..finally: >>> def foo(): ... try: ... print "hello" ... return ... finally: ... print "goodbye" ... >>> foo() hello goodbye From guido at digicool.com Thu Dec 28 17:43:26 2000 From: guido at digicool.com (Guido van Rossum) Date: Thu, 28 Dec 2000 11:43:26 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: Your message of "Thu, 28 Dec 2000 11:01:54 EST." <20001228110154.D32394@thyrsus.com> References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> <20001228110154.D32394@thyrsus.com> Message-ID: <200012281643.LAA26687@cj20424-a.reston1.va.home.com> > Guido van Rossum : > > Not all the world is Linux. CML2 isn't the only Python application > > that matters. Python world domination is not a goal. 
There is no > > Eric conspiracy! :-) > > Perhaps I misunderstood you, then. I thought you considered CML2 a > potentially important design win, and that was why curses didn't get > dropped from the core. Have you changed your mind about this? Supporting CML2 was one of the reasons to keep curses in the core, but not the only one. Linux kernel configuration is so far removed from my daily use of computers that I don't have a good way to judge its importance in the grand scheme of things. Since you obviously consider it very important, and since I generally trust your judgement (except on the issue of firearms :-), your plea for keeping, and improving, curses support in the Python core made a difference in my decision. And don't worry, I don't expect to change that decision -- though I personally still find it curious that curses is so important. I find curses-style user interfaces pretty pathetic, and wished that Linux migrated to a real GUI for configuration. (And the linuxconf approach does *not* qualify as a real GUI. :-) > If Python world domination is not a goal then I can only conclude that > you haven't had your morning coffee yet :-). Sorry to disappoint you, Eric. I gave up coffee years ago. :-) I was totally serious though: my personal satisfaction doesn't come from Python world domination. Others seem to have that goal, and if it doesn't inconvenience me too much I'll play along, but in the end I've got some goals in my personal life that are much more important. > There's a more general question here about what it means for something > to be in the core language. Developers need to have a clear, > bright-line picture of what they can count on to be present. To me > this implies that it's the job of the Python maintainers to make sure > that a facility declared "core" by its presence in the standard > library documentation is always present, for maximum "batteries are > included" effect. We do the best we can.
Using the current module configuration system, it's a one-character edit to enable curses if you need it. With Andrew's new scheme, it will be automatic. > Yes, dealing with cross-platform variations in linking curses is a > pain -- but dealing with that kind of pain so the Python user doesn't > have to is precisely our job. Or so I understand it, anyway. So help Andrew: http://python.sourceforge.net/peps/pep-0229.html --Guido van Rossum (home page: http://www.python.org/~guido/) From mal at lemburg.com Thu Dec 28 17:52:36 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 17:52:36 +0100 Subject: [Python-Dev] Fwd: try...else References: <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> Message-ID: <3A4B6FD3.9B576E9A@lemburg.com> Moshe Zadka wrote: > > On Thu, 28 Dec 2000, "M.-A. Lemburg" wrote: > > > He does have a point however that 'return' will bypass > > try...else and try...finally clauses. I don't think we can change > > that behaviour, though, as it would break code. > > It doesn't bypass try..finally: > > >>> def foo(): > ... try: > ... print "hello" > ... return > ... finally: > ... print "goodbye" > ... > >>> foo() > hello > goodbye Hmm, that must have changed between Python 1.5 and more recent versions: Python 1.5: >>> def f(): ... try: ... return 1 ... finally: ... print 'finally' ... >>> f() 1 >>> Python 2.0: >>> def f(): ... try: ... return 1 ... finally: ... print 'finally' ... 
>>> f() finally 1 >>> -- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From moshez at stimpy.scso.com Thu Dec 28 17:59:32 2000 From: moshez at stimpy.scso.com (Moshe Zadka) Date: 28 Dec 2000 16:59:32 -0000 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B6FD3.9B576E9A@lemburg.com> References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> Message-ID: <20001228165932.8143.qmail@stimpy.scso.com> On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" wrote: [about try..finally not playing well with return] > Hmm, that must have changed between Python 1.5 and more recent > versions: I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2? From esr at thyrsus.com Thu Dec 28 18:20:48 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 12:20:48 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001228105331.A6042@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 10:53:31AM +0100 References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> Message-ID: <20001228122048.A1381@thyrsus.com> Thomas Wouters : > > 1. The curses module is commented out in the default Modules/Setup > > file. This is not good, as it may lead careless distribution builders > > to ship Python 2.0s that will not be able to support the curses front > > end in CML2. Supporting CML2 (and thus getting Python the "design > > win" of being involved in the Linux kernel build) was the major point > > of integrating the curses module into the Python core. It is possible > > that one little "#" may have blown that. > > Note that Tkinter is off by default too. And readline. And ssl. And the use > of shared libraries. 
IMO ssl isn't an issue because it's not documented as being in the standard module set. Readline is a minor issue because raw_input()'s functionality changes somewhat if it's not linked, but I think we can live with this -- the change isn't visible to calling programs. Hm. It appears tkinter isn't documented in the standard set of modules either. Interesting. Technically this means I don't have a problem with it not being built in by default, but I think there is a problem here... My more general point is that right now Python has three classes of modules: 1. Documented as being in the core and built in by default. 2. Not documented as being in the core and not built in by default. 3. Documented as being in the core but not built in by default. My more general claim is that the existence of class 3 is a problem, because it compromises the "batteries are included" effect -- it means Python users don't have a bright-line test for what will be present in every Python (or at least every Python on an operating system theoretically feature-compatible with theirs). My struggle to get CML2 adopted brings this problem into particularly sharp focus because the kernel group is allergic to big footprints or having to download extension modules to do a build. But the issue is really broader than that. I think we ought to be migrating stuff out of class 3 into class 1 where possible and to class 2 only where unavoidable. > We *can't* enable the cursesmodule by default, because > we don't know what the system's curses library is called. We'd have to > auto-detect that before we can enable it (and lots of other modules) > automatically, and that's a lot of work. I personally favour autoconf for > the job, but since amk is already busy on using distutils, I'm not going to > work on that. Yes, we need to do a lot more autodetection -- this is a large part of my point.
I have nothing against distutils, but I don't see how it solves this problem unless we assume that we'll always have Python already available on any platform where we're building Python. I'm willing to put my effort where my mouth is on this. I have a lot of experience with autoconf; I'm willing to write some of these nasty config tests.

> > 2. The default Modules/Setup file assumes that various Tkinter-related libraries
> > are in /usr/local. But /usr would be a more appropriate choice under most
> > circumstances. Most Linux users now install their Tcl/Tk stuff from RPMs
> > or .deb packages that place the binaries and libraries under /usr. Under
> > most other Unixes (e.g. Solaris) they were there to begin with.
>
> This is nonsense. The line above it specifically states 'edit to reflect
> where your Tcl/Tk headers are'. And besides from the issue whether they are
> usually found in /usr (I don't believe so, not even on Solaris, but 'my'
> Solaris box doesn't even have tcl/tk,) /usr/local is a perfectly sane
> choice, since /usr is already included in the include-path, but /usr/local
> usually is not.

Is it? That is not clear from the comment. Perhaps this is just a documentation problem. I'll look again.

> > 3. The configure machinery could be made to deduce more about the contents
> > of Modules/Setup than it does now. In particular, it's silly that the
> > person building Python has to fill in the locations of X libraries when
> > configure is in principle perfectly capable of finding them.
>
> In principle, I agree. It's a lot of work, though. For instance, Debian
> stores the Tcl/Tk headers in /usr/include/tcl, which means you can
> compile for more than one tcl version, by just changing your include path
> and the library you link with. And there are undoubtedly several other
> variants out there.

As I said to Guido, I think it is exactly our job to deal with this sort of grottiness.
One of Python's major selling points is supposed to be cross-platform consistency of the API. If we fail to do what you're describing, we're failing to meet Python users' reasonable expectations for the language. > Should we really make the Setup file default to Linux, and leave other > operating systems in the dark about what it might be on their system ? I > think people with Linux and without clue are the least likely people to > compile their own Python, since Linux distributions already come with a > decent enough Python. And, please, lets assume the people assembling those > know how to read ? Please note that I am specifically *not* advocating making the build defaults Linux-centric. That's not my point at all. > Maybe we just need a HOWTO document covering Setup ? That would be a good idea. > (Besides, won't this all be fixed when CML2 comes with a distribution, Eric ? > They'll *have* to have working curses/tkinter then :-) I'm concerned that it will work the other way around, that CML2 won't happen if the core does not reliably include these facilities. In itself CML2 not happening wouldn't be the end of the world of course, but I'm pushing on this because I think the larger issue of class 3 modules is actually important to the health of Python and needs to be attacked seriously. -- Eric S. Raymond The Bible is not my book, and Christianity is not my religion. I could never give assent to the long, complicated statements of Christian dogma. -- Abraham Lincoln From cgw at fnal.gov Thu Dec 28 18:36:06 2000 From: cgw at fnal.gov (Charles G Waldman) Date: Thu, 28 Dec 2000 11:36:06 -0600 (CST) Subject: [Python-Dev] chomp()? In-Reply-To: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <14923.31238.65155.496546@buffalo.fnal.gov> Guido van Rossum writes: > Someone just posted a patch to implement s.chomp() as a string method: > I.e. it removes a trailing \r\n, \r, or \n. 
> > Any comments? Is this needed given that we have s.rstrip() already?

-1 from me. P=NP (Python is not Perl). "Chomp" is an excessively cute name. And like you said, this is too much like "rstrip" to merit a separate method.

From esr at thyrsus.com Thu Dec 28 18:41:17 2000 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 28 Dec 2000 12:41:17 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <200012281643.LAA26687@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 11:43:26AM -0500 References: <20001227143750.A26894@thyrsus.com> <200012281530.KAA26049@cj20424-a.reston1.va.home.com> <20001228110154.D32394@thyrsus.com> <200012281643.LAA26687@cj20424-a.reston1.va.home.com> Message-ID: <20001228124117.B1381@thyrsus.com>

Guido van Rossum :
> Supporting CML2 was one of the reasons to keep curses in the core, but
> not the only one. Linux kernel configuration is so far removed from
> my daily use of computers that I don't have a good way to judge its
> importance in the grand scheme of things. Since you obviously
> consider it very important, and since I generally trust your judgement
> (except on the issue of firearms :-), your plea for keeping, and
> improving, curses support in the Python core made a difference in my
> decision. And don't worry, I don't expect to change that decision
> -- though I personally still find it curious that curses is so important.
> I find curses-style user interfaces pretty pathetic, and wished that
> Linux migrated to a real GUI for configuration. (And the linuxconf
> approach does *not* qualify as a real GUI. :-)

Thank you, that makes your priorities much clearer. Actually I agree with you that curses interfaces are mostly pretty pathetic. A lot of people still like them, though, because they tend to be fast and lightweight. Then, too, a really well-designed curses interface can in fact be good enough that the usability gain from GUIizing is marginal.
My favorite examples of this are mutt and slrn. The fact that GUI programs have failed to make much headway against this is not simply due to user conservatism; it's genuinely hard to see how a GUI interface could be made significantly better. And unfortunately, there is a niche where it is still important to support curses interfacing independently of anyone's preferences in interface style -- early in the system-configuration process before one has bootstrapped to the point where X is reliably available. I hasten to add that this is not just *my* problem -- one of your more important Python constituencies in a practical sense is the guys who maintain Red Hat's installer.

> I was totally serious though: my personal satisfaction doesn't come
> from Python world domination. Others seem to have that goal, and if it
> doesn't inconvenience me too much I'll play along, but in the end I've
> got some goals in my personal life that are much more important.

There speaks the new husband :-). OK. So what *do* you want from Python? Personally, BTW, my goal is not exactly Python world domination either -- it's that the world should be dominated by the language that has the least tendency to produce grotty fragile code (remember that I tend to obsess about the software-quality problem :-)). Right now that's Python.

-- Eric S. Raymond The people of the various provinces are strictly forbidden to have in their possession any swords, short swords, bows, spears, firearms, or other types of arms. The possession of unnecessary implements makes difficult the collection of taxes and dues and tends to foment uprisings. -- Toyotomi Hideyoshi, dictator of Japan, August 1588

From mal at lemburg.com Thu Dec 28 18:43:13 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 18:43:13 +0100 Subject: [Python-Dev] chomp()?
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <3A4B7BB1.F09660ED@lemburg.com>

Guido van Rossum wrote:
>
> Someone just posted a patch to implement s.chomp() as a string method:
>
> http://sourceforge.net/patch/?func=detailpatch&patch_id=103029&group_id=5470
>
> Pseudo code (for those not aware of the Perl function by that name):
>
> def chomp(s):
>     if s[-2:] == '\r\n':
>         return s[:-2]
>     if s[-1:] == '\r' or s[-1:] == '\n':
>         return s[:-1]
>     return s
>
> I.e. it removes a trailing \r\n, \r, or \n.
>
> Any comments? Is this needed given that we have s.rstrip() already?

We already have .splitlines() which does the above (remove line breaks) not only for a single line, but for many lines at once. Even better: .splitlines() also does the right thing for Unicode.

-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

From mal at lemburg.com Thu Dec 28 20:06:33 2000 From: mal at lemburg.com (M.-A. Lemburg) Date: Thu, 28 Dec 2000 20:06:33 +0100 Subject: [Python-Dev] Fwd: try...else References: <3A4B6FD3.9B576E9A@lemburg.com>, <3A4B521D.4372224A@lemburg.com>, <3A4B3341.5010707@ActiveState.com> <20001228162807.7229.qmail@stimpy.scso.com> <20001228165932.8143.qmail@stimpy.scso.com> Message-ID: <3A4B8F39.58C64EFB@lemburg.com>

Moshe Zadka wrote:
>
> On Thu, 28 Dec 2000 17:52:36 +0100, "M.-A. Lemburg" wrote:
>
> [about try..finally not playing well with return]
> > Hmm, that must have changed between Python 1.5 and more recent
> > versions:
>
> I posted a 1.5.2 test. So it changed between 1.5 and 1.5.2?

Sorry, false alarm: there was a bug in my patched 1.5 version. The original 1.5 version does not show the described behaviour.
-- Marc-Andre Lemburg ______________________________________________________________________ Company: http://www.egenix.com/ Consulting: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

From thomas at xs4all.net Thu Dec 28 21:21:15 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 21:21:15 +0100 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B521D.4372224A@lemburg.com>; from mal@lemburg.com on Thu, Dec 28, 2000 at 03:45:49PM +0100 References: <3A4B3341.5010707@ActiveState.com> <3A4B521D.4372224A@lemburg.com> Message-ID: <20001228212115.C1811@xs4all.nl>

On Thu, Dec 28, 2000 at 03:45:49PM +0100, M.-A. Lemburg wrote:

> > I had expected that in try: except: else
> > the else clause always got executed, but it seems not for return

> I think Robin mixed up try...finally with try...except...else.
> The finally clause is executed even in case an exception occurred.

(MAL and I already discussed this in private mail: Robin did mean try/except/else, and 'finally' already executes when returning directly from the 'try' block, even in Python 1.5)

> He does have a point however that 'return' will bypass
> try...else and try...finally clauses. I don't think we can change
> that behaviour, though, as it would break code.

This code:

    try:
        return
    except:
        pass
    else:
        print "returning"

will indeed not print 'returning', but I believe it's by design. I'm against changing it, in any case, and not just because it'd break code :) If you want something that always executes, use a 'finally'. Or don't return from the 'try', but return in the 'else' clause. The 'except' clause is documented to execute if a matching exception occurs, and 'else' if no exception occurs. Maybe the intent of the 'else' clause would be clearer if it was documented to 'execute if the try: clause finishes without an exception being raised' ?
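The semantics Thomas describes can be sketched as follows (a minimal, illustrative example in modern Python 3 syntax -- the function names are not from the thread; the original discussion predates the combined try/except/else/finally form):

```python
events = []

def with_return():
    try:
        return "from try"
    except Exception:
        events.append("except")
    else:
        events.append("else")      # skipped: the 'try' block returned
    finally:
        events.append("finally")   # always runs, even on return

def without_return():
    try:
        pass
    except Exception:
        events.append("except")
    else:
        events.append("else")      # runs: control fell off the end of 'try'
    finally:
        events.append("finally")

with_return()
without_return()
print(events)  # ['finally', 'else', 'finally']
```

That is, 'return' bypasses the 'else' clause but not the 'finally' clause, exactly as debated above.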
The 'else' clause isn't executed when you 'break' or (after applying my continue-in-try patch ;) 'continue' out of the 'try', either. Robin... Did I already reply to this, on python-list or to you directly ? I distinctly remember writing that post, but I'm not sure if it arrived. Maybe I didn't send it after all, or maybe something on mail.python.org is detaining it ?

-- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From thomas at xs4all.net Thu Dec 28 19:19:06 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 19:19:06 +0100 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001228122048.A1381@thyrsus.com>; from esr@thyrsus.com on Thu, Dec 28, 2000 at 12:20:48PM -0500 References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com> Message-ID: <20001228191906.F1281@xs4all.nl>

On Thu, Dec 28, 2000 at 12:20:48PM -0500, Eric S. Raymond wrote:

> My more general point is that right now Python has three classes of
> modules:
> 1. Documented as being in the core and built in by default.
> 2. Not documented as being in the core and not built in by default.
> 3. Documented as being in the core but not built in by default.
> My more general claim is that the existence of class 3 is a problem,
> because it compromises the "batteries are included" effect -- it means
> Python users don't have a bright-line test for what will be present in
> every Python (or at least every Python on an operating system
> theoretically feature-compatible with theirs).

It depends on your definition of 'being in the core'. Some of the things that are 'in the core' are simply not possible on all platforms. So if you want really portable code, you don't want to use them. Other features are available on all systems that matter [to you], so you don't really care about it, just use them, and at best document that they need feature X.
There is also the subtle difference between a Python user and a Python compiler/assembler (excuse my overloading of the terms, but you know what I mean). People who choose to compile their own Python should realize that they might disable or misconfigure some parts of it. I personally trust most people that assemble OS distributions to compile a proper Python binary + modules, but I think a HOWTO isn't a bad idea -- unless we autoconf everything. > I think we ought to be migrating stuff out > of class 3 into class 1 where possible and to class 2 only where > unavoidable. [ and ] > I'm willing to put my effort where my mouth is on this. I have a lot > of experience with autoconf; I'm willing to write some of these nasty > config tests. [ and ] > As I said to Guido, I think it is exactly our job to deal with this sort > of grottiness. One of Python's major selling points is supposed to be > cross-platform consistency of the API. If we fail to do what you're > describing, we're failing to meet Python users' reasonable expectations > for the language. [ and ] > Please note that I am specifically *not* advocating making the build defaults > Linux-centric. That's not my point at all. I apologize for the tone of my previous post, and the above snippet. I'm not trying to block progress here ;) I'm actually all for autodetecting as much as possible, and more than willing to put effort into it as well (as long as it's deemed useful, and isn't supplanted by a distutils variant weeks later.) And I personally have my doubts about the distutils variant, too, but that's partly because I have little experience with distutils. If we can work out a deal where both autoconf and distutils are an option, I'm happy to write a few, if not all, autoconf tests for the currently disabled modules. So, Eric, let's split the work. I'll do Tkinter if you do curses. :) However, I'm also keeping those oddball platforms that just don't support some features in mind. 
If you want truly portable code, you have to work at it. I think it's perfectly okay to say "your Python needs to have the curses module or the tkinter module compiled in -- contact your administrator if it has neither". There will still be platforms that don't have curses, or syslog, or crypt(), though hopefully none of them will be Linux. Oh, and I also apologize for possibly duplicating what has already been said by others. I haven't seen anything but this post (which was CC'd to me directly) since I posted my reply to Eric, due to the ululating bouts of delay on mail.python.org. Maybe DC should hire some *real* sysadmins, instead of those silly programmer-kniggits ? >:-> -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From mwh21 at cam.ac.uk Thu Dec 28 19:27:48 2000 From: mwh21 at cam.ac.uk (Michael Hudson) Date: Thu, 28 Dec 2000 18:27:48 +0000 (GMT) Subject: [Python-Dev] Fwd: try...else In-Reply-To: <3A4B521D.4372224A@lemburg.com> Message-ID: On Thu, 28 Dec 2000, M.-A. Lemburg wrote: > I think Robin mixed up try...finally with try...except...else. I think so too. > The finally clause is executed even in case an exception occurred. > > He does have a point however that 'return' will bypass > try...else and try...finally clauses. I don't think we can change > that behaviour, though, as it would break code. return does not skip finally clauses[1]. In my not especially humble opinion, the current behaviour is the Right Thing. I'd have to think for a moment before saying what Robin's example would print, but I think the alternative would disturb me far more. Cheers, M. [1] In fact the flow of control on return is very similar to that of an exception - ooh, look at that implementation... From esr at thyrsus.com Thu Dec 28 20:17:51 2000 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Thu, 28 Dec 2000 14:17:51 -0500 Subject: [Python-Dev] Miscellaneous 2.0 installation issues In-Reply-To: <20001228191906.F1281@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 07:19:06PM +0100 References: <20001227143750.A26894@thyrsus.com> <20001228105331.A6042@xs4all.nl> <20001228122048.A1381@thyrsus.com> <20001228191906.F1281@xs4all.nl> Message-ID: <20001228141751.B2528@thyrsus.com> Thomas Wouters : > > My more general claim is that the existence of class 3 is a problem, > > because it compromises the "batteries are included" effect -- it means > > Python users don't have a bright-line test for what will be present in > > every Python (or at least every Python on an operating system > > theoretically feature-compatible with theirs). > > It depends on your definition of 'being in the core'. Some of the things > that are 'in the core' are simply not possible on all platforms. So if you > want really portable code, you don't want to use them. Other features are > available on all systems that matter [to you], so you don't really care > about it, just use them, and at best document that they need feature X. I understand. We can't, for example, guarantee to duplicate the Windows- specific stuff in the Unix port (nor would we want to in most cases :-)). However, I think "we build in curses/Tkinter everywhere the corresponding libraries exist" is a guarantee we can and should make. Similarly for other modules presently in class 3. > There is also the subtle difference between a Python user and a Python > compiler/assembler (excuse my overloading of the terms, but you know what I > mean). Yes. We have three categories here: 1. People who use python for applications (what I've been calling users) 2. People who configure Python binary packages for distribution (what you call a "compiler/assembler" and I think of as a "builder"). 3. People who hack Python itself. Problem is that "developer" is very ambiguous in this context... 
> People who choose to compile their own Python should realize that > they might disable or misconfigure some parts of it. I personally trust most > people that assemble OS distributions to compile a proper Python binary + > modules, but I think a HOWTO isn't a bad idea -- unless we autoconf > everything. I'd like to see both things happen (HOWTO and autoconfing) and am willing to work on both. > I apologize for the tone of my previous post, and the above snippet. No offense taken at all, I assure you. > I'm not > trying to block progress here ;) I'm actually all for autodetecting as much > as possible, and more than willing to put effort into it as well (as long as > it's deemed useful, and isn't supplanted by a distutils variant weeks > later.) And I personally have my doubts about the distutils variant, too, > but that's partly because I have little experience with distutils. If we can > work out a deal where both autoconf and distutils are an option, I'm happy > to write a few, if not all, autoconf tests for the currently disabled > modules. I admit I'm not very clear on the scope of what distutils is supposed to handle, and how. Perhaps amk can enlighten us? > So, Eric, let's split the work. I'll do Tkinter if you do curses. :) You've got a deal. I'll start looking at the autoconf code. I've already got a fair idea how to do this. -- Eric S. Raymond No one who's seen it in action can say the phrase "government help" without either laughing or crying. From tim.one at home.com Fri Dec 29 03:59:53 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 28 Dec 2000 21:59:53 -0500 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> Message-ID: [Guido] > I've seen a thread on this but there was no conclusive answer, so I'm > reopening this. 
It hasn't budged an inch since then: your "Entering interactive session" problem is the same one everyone has; it gets reported on SF's bug and/or support managers at least daily; SF has not fixed it yet; these days they don't even respond to scp bug reports anymore; the cause appears to be SF's custom sfshell, and only SF can change that; the only known workarounds are to (a) modify files on SF directly (they suggest vi ), or (b) initiate scp *from* SF, using your local machine as a server (if you can do that -- I cannot, or at least haven't succeeded).

From martin at loewis.home.cs.tu-berlin.de Thu Dec 28 23:52:02 2000 From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Thu, 28 Dec 2000 23:52:02 +0100 Subject: [Python-Dev] curses in the core? Message-ID: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>

> If curses is a core facility now, the default build should treat it
> as one. ...
> IMO ssl isn't an issue because it's not documented as being in the
> standard module set. ...
> 3. Documented as being in the core but not built in by default.
> My more general claim is that the existence of class 3 is a problem

In the case of curses, I believe there is a documentation error in the 2.0 documentation. The curses package is listed under "Generic Operating System Services". I believe this is wrong; it should be listed under "Unix Specific Services". Unless I'm mistaken, the curses module is not available on the Mac and on Windows. With that change, the curses module would then fall into Eric's category 2 (Not documented as being in the core and not built in by default). That documentation change should be carried out even if curses is autoconfigured; autoconf is used on Unix only, anyway.

Regards, Martin

P.S. The "Python Library Reference" content page does not mention the word "core" at all, except as part of asyncore...
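Martin's point about platform-specific modules is exactly why portable programs end up feature-testing at import time. A minimal sketch of that pattern (the HAVE_CURSES flag and the fallback string are illustrative assumptions, not anything from the thread):

```python
# Degrade gracefully when an optional module (curses here) was not
# built into this Python -- the "class 3" situation discussed above.
try:
    import curses
    HAVE_CURSES = True
except ImportError:
    curses = None
    HAVE_CURSES = False

def pick_frontend():
    # Prefer the curses UI when available; otherwise fall back to a
    # plain line-oriented interface.
    return "curses" if HAVE_CURSES else "line-oriented"

print(pick_frontend())
```

The drawback, of course, is that every application has to carry this boilerplate -- which is the substance of Eric's "bright-line test" complaint.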
From thomas at xs4all.net Thu Dec 28 23:58:25 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Thu, 28 Dec 2000 23:58:25 +0100 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <200012281708.MAA26899@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Dec 28, 2000 at 12:08:18PM -0500 References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> Message-ID: <20001228235824.E1811@xs4all.nl> On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote: > I've seen a thread on this but there was no conclusive answer, so I'm > reopening this. Actually there was: it's all SourceForge's fault. (At least that's my professional opinion ;) They honestly have a strange setup, though how strange and to what end I cannot tell. > Would somebody please figure out a way to update the PEPs? It's kind > of pathetic to see the website not have the latest versions... The way to update the peps is by ssh'ing into shell.sourceforge.net, and then scp'ing the files from your work repository to the htdocs/peps directory. That is, until SF fixes the scp problem. This method works (I just updated all PEPs to up-to-date CVS versions) but it's a bit cumbersome. And it only works if you have ssh access to your work environment. And it's damned hard to script; I tried playing with a single ssh command that did all the work, but between shell weirdness, scp weirdness and a genuine bash bug I couldn't figure it out. I assume that SF is aware of the severity of this problem, and is working on something akin to a fix or workaround. Until then, I can do an occasional update of the PEPs, for those that can't themselves. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! 
From thomas at xs4all.net Fri Dec 29 00:05:28 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 29 Dec 2000 00:05:28 +0100 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <20001228235824.E1811@xs4all.nl>; from thomas@xs4all.net on Thu, Dec 28, 2000 at 11:58:25PM +0100 References: <200012281708.MAA26899@cj20424-a.reston1.va.home.com> <20001228235824.E1811@xs4all.nl> Message-ID: <20001229000528.F1811@xs4all.nl>

On Thu, Dec 28, 2000 at 11:58:25PM +0100, Thomas Wouters wrote:
> On Thu, Dec 28, 2000 at 12:08:18PM -0500, Guido van Rossum wrote:
> > Would somebody please figure out a way to update the PEPs? It's kind
> > of pathetic to see the website not have the latest versions...
>
> The way to update the peps is by ssh'ing into shell.sourceforge.net, and
> then scp'ing the files from your work repository to the htdocs/peps
[ blah blah ]

And then they fixed it ! At least, for me, direct scp now works fine. (I should've tested that before posting my blah blah, sorry.) Anybody else, like people using F-secure ssh (unix or windows) experience the same ?

-- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From MarkH at ActiveState.com Fri Dec 29 00:15:01 2000 From: MarkH at ActiveState.com (Mark Hammond) Date: Fri, 29 Dec 2000 10:15:01 +1100 Subject: [Python-Dev] chomp()? In-Reply-To: <14923.31238.65155.496546@buffalo.fnal.gov> Message-ID:

> -1 from me. P=NP (Python is not Perl). "Chomp" is an
> excessively cute name.
> And like you said, this is too much like "rstrip" to merit a separate
> method.

My thoughts exactly. I can't remember _ever_ wanting to chomp() when rstrip() wasn't perfectly suitable. I'm sure it happens, but not often enough to introduce an ambiguous new function purely for "feature parity" with Perl. Mark.
In-Reply-To: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Thu, Dec 28, 2000 at 11:52:02PM +0100 References: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de> Message-ID: <20001228182528.A10743@thyrsus.com> Martin v. Loewis : > In the case of curses, I believe there is a documentation error in the > 2.0 documentation. The curses packages is listed under "Generic > Operating System Services". I believe this is wrong, it should be listed > as "Unix Specific Services". I agree that this is an error and should be fixed. > Unless I'm mistaken, the curses module is not available on the Mac and > on Windows. With that change, the curses module would then fall into > Eric's category 2 (Not documented as being in the core and not built > in by default). Well...that's a definitional question that is part of the larger issue here. What does being in the Python core mean? There are two potential definitions: 1. Documentation says it's available on all platforms. 2. Documentation restricts it to one of the three platform groups (Unix/Windows/Mac) but implies that it will be available on any OS in that group. I think the second one is closer to what application programmers thinking about which batteries are included expect. But I could be persuaded otherwise by a good argument. -- Eric S. Raymond The difference between death and taxes is death doesn't get worse every time Congress meets -- Will Rogers From akuchlin at mems-exchange.org Fri Dec 29 01:33:36 2000 From: akuchlin at mems-exchange.org (A.M. Kuchling) Date: Thu, 28 Dec 2000 19:33:36 -0500 Subject: [Python-Dev] Bookstore completed Message-ID: <200012290033.TAA01295@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> OK, I think I'm ready to declare the Python bookstore complete enough to go public. Before I set up redirects from www.python.org, please take another look. (More book descriptions would be helpful...) 
http://www.amk.ca/bookstore/ --amk From akuchlin at mems-exchange.org Fri Dec 29 01:46:16 2000 From: akuchlin at mems-exchange.org (A.M. Kuchling) Date: Thu, 28 Dec 2000 19:46:16 -0500 Subject: [Python-Dev] Help wanted with setup.py script Message-ID: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> Want to help with the laudable goal of automating the Python build process? It'll need lots of testing on many different platforms, and I'd like to start the process now. First, download the setup.py script from http://www.amk.ca/files/python/setup.py Next, drop it in the root directory of your Python source tree and run "python setup.py build". If it dies with an exception, let me know. (Replies to this list are OK.) If it runs to completion, look in the Modules/build/lib. directory to see which modules got built. (On my system, is "linux-i686-2.0", but of course this will depend on your platform.) Is anything missing that should have been built? (_tkinter.so is the prime candidate; the autodetection code is far too simple at the moment and assumes one particular version of Tcl and Tk.) Did an attempt at building a module fail? These indicate problems autodetecting something, so if you can figure out how to find the required library or include file, let me know what to do. --amk From fdrake at acm.org Fri Dec 29 05:12:18 2000 From: fdrake at acm.org (Fred L. Drake) Date: Thu, 28 Dec 2000 23:12:18 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <20001228212115.C1811@xs4all.nl> Message-ID: On Thu, 28 Dec 2000 21:21:15 +0100, Thomas Wouters wrote: > The 'except' clause is documented to execute if a > matching exception occurs, > and 'else' if no exception occurs. Maybe the intent of > the 'else' clause This can certainly be clarified in the documentation -- please file a bug report at http://sourceforge.net/projects/python/. Thanks! -Fred -- Fred L. Drake, Jr. 
PythonLabs at Digital Creations

From tim.one at home.com Fri Dec 29 05:25:44 2000 From: tim.one at home.com (Tim Peters) Date: Thu, 28 Dec 2000 23:25:44 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: <20001228212115.C1811@xs4all.nl> Message-ID:

[Fred, suggested doc change near the end]

[Thomas Wouters]
> (MAL and I already discussed this in private mail: Robin did mean
> try/except/else, and 'finally' already executes when returning
> directly from the 'try' block, even in Python 1.5)
>
> This code:
>
> try:
>     return
> except:
>     pass
> else:
>     print "returning"
>
> will indeed not print 'returning', but I believe it's by design.
> I'm against changing it, in any case, and not just because it'd
> break code :) If you want something that always executes, use a
> 'finally'. Or don't return from the 'try', but return in the
> 'else' clause.

Guido's out of town again, so I'll channel him: Thomas is correct on all counts. In try/else, the "else" clause should execute if and only if control "falls off the end" of the "try" block. IOW, consider:

    try:
        arbitrary stuff
    x = 1

An "else" clause added to that "try" should execute when and only when the code as written executes the "x = 1" after the block. When "arbitrary stuff" == "return", control does not fall off the end, so "else" shouldn't trigger. Same thing if "arbitrary stuff" == "break" and we're inside a loop, or "continue" after Thomas's patch gets accepted.

> The 'except' clause is documented to execute if a matching
> exception occurs, and 'else' if no exception occurs.

Yup, and that's imprecise: the same words are used to describe (part of) when 'finally' executes, but they weren't intended to be the same.

> Maybe the intent of the 'else' clause would be clearer if it
> was documented to 'execute if the try: clause finishes without
> an exception being raised' ?

Sorry, I don't find that any clearer.
Let's be explicit:

    The optional 'else' clause is executed when the 'try' clause
    terminates by any means other than an exception or executing a
    'return', 'continue' or 'break' statement. Exceptions in the
    'else' clause are not handled by the preceding 'except' clauses.

> The 'else' clause isn't executed when you 'break' or (after > applying my continue-in-try patch ;) 'continue' out of the > 'try', either. Hey, now you're channeling me! Be afraid -- be very afraid. From moshez at zadka.site.co.il Fri Dec 29 15:42:44 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Fri, 29 Dec 2000 16:42:44 +0200 (IST) Subject: [Python-Dev] chomp()? In-Reply-To: <3A4B7BB1.F09660ED@lemburg.com> References: <3A4B7BB1.F09660ED@lemburg.com>, <200012281504.KAA25892@cj20424-a.reston1.va.home.com> Message-ID: <20001229144244.D5AD0A84F@darjeeling.zadka.site.co.il> On Thu, 28 Dec 2000, "M.-A. Lemburg" wrote: [about chomp] > We already have .splitlines() which does the above (remove > line breaks) not only for a single line, but for many lines at once. > > Even better: .splitlines() also does the right thing for Unicode. OK, I retract my earlier +1, and instead I move that this be added to the FAQ. Where is the FAQ maintained nowadays? The grail link doesn't work anymore. -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From loewis at informatik.hu-berlin.de Fri Dec 29 17:52:13 2000 From: loewis at informatik.hu-berlin.de (Martin von Loewis) Date: Fri, 29 Dec 2000 17:52:13 +0100 (MET) Subject: [Python-Dev] Re: [Patch #103002] Fix for #116285: Properly raise UnicodeErrors Message-ID: <200012291652.RAA20251@pandora.informatik.hu-berlin.de> [resent since python.org ran out of disk space] > My only problem with it is your copyright notice. AFAIK, patches to > the Python core cannot contain copyright notices without proper > license information. OTOH, I don't think that these minor changes > really warrant adding a complete license paragraph.
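[The try/else rule Tim spells out in the thread above -- 'else' runs only when control falls off the end of the 'try' block -- can be checked with a short script. Modern Python 3 syntax; the behaviour it demonstrates is the same one discussed for 2.0:]

```python
def with_return(log):
    try:
        log.append("try")
        return
    except Exception:
        log.append("except")
    else:
        log.append("else")   # skipped: 'return' leaves the block early

def with_fallthrough(log):
    try:
        log.append("try")
    except Exception:
        log.append("except")
    else:
        log.append("else")   # runs: control fell off the end of 'try'

log1, log2, log3 = [], [], []
with_return(log1)
with_fallthrough(log2)

for i in range(2):
    try:
        log3.append("try %d" % i)
        break
    except Exception:
        pass
    else:
        log3.append("else")  # skipped: 'break' leaves the block early

print(log1)  # ['try']
print(log2)  # ['try', 'else']
print(log3)  # ['try 0']
```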
I'd like to get an "official" clarification on this question. Is it the case that patches containing copyright notices are only accepted if they are accompanied by license information? I agree that the changes are minor, I also believe that I hold the copyright to the changes whether I attach a notice or not (at least according to our local copyright law). What concerns me is that without such a notice, gencodec.py looks as if CNRI holds the copyright to it. I'm not willing to assign the copyright of my changes to CNRI, and I'd like to avoid the impression of doing so. What is even more concerning is that CNRI also holds the copyright to the generated files, even though they are derived from information made available by the Unicode consortium! Regards, Martin From tim.one at home.com Fri Dec 29 20:56:36 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 29 Dec 2000 14:56:36 -0500 Subject: [Python-Dev] scp to sourceforge In-Reply-To: <20001229000528.F1811@xs4all.nl> Message-ID: [Thomas Wouters] > And then they fixed it ! At least, for me, direct scp now works > fine. (I should've tested that before posting my blah blah, sorry.) I tried it immediately before posting my blah-blah yesterday, and it was still hanging. > Anybody else, like people using F-secure ssh (unix or windows) > experience the same ? Same here: I tried it again just now (under Win98 cmdline ssh/scp) and it worked fine! We're in business again. Thanks for fixing it, Thomas. now-if-only-we-could-get-python-dev-email-on-an-approximation-to-the-same-day-it's-sent-ly y'rs - tim From tim.one at home.com Fri Dec 29 21:27:40 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 29 Dec 2000 15:27:40 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: Message-ID: [Robin Becker] > The 2.0 docs clearly state 'The optional else clause is executed when no > exception occurs in the try clause.' This makes it sound as though it > gets executed on the 'way out'. Of course.
That's not what the docs meant, though, and Guido is not going to change the implementation now because that would break code that relies on how Python has always *worked* in these cases. The way Python works is also the way Guido intended it to work (I'm allowed to channel him when he's on vacation <0.9 wink>). Indeed, that's why I suggested a specific doc change. If your friend would also be confused by that, then we still have a problem; else we don't. From tim.one at home.com Fri Dec 29 21:37:08 2000 From: tim.one at home.com (Tim Peters) Date: Fri, 29 Dec 2000 15:37:08 -0500 Subject: [Python-Dev] Fwd: try...else In-Reply-To: Message-ID: [Fred] > This can certainly be clarified in the documentation -- > please file a bug report at http://sourceforge.net/projects/python/. Here you go: https://sourceforge.net/bugs/?func=detailbug&bug_id=127098&group_id=5470 From thomas at xs4all.net Fri Dec 29 21:59:16 2000 From: thomas at xs4all.net (Thomas Wouters) Date: Fri, 29 Dec 2000 21:59:16 +0100 Subject: [Python-Dev] Fwd: try...else In-Reply-To: ; from tim.one@home.com on Fri, Dec 29, 2000 at 03:27:40PM -0500 References: Message-ID: <20001229215915.L1281@xs4all.nl> On Fri, Dec 29, 2000 at 03:27:40PM -0500, Tim Peters wrote: > Indeed, that's why I suggested a specific doc change. If your friend would > also be confused by that, then we still have a problem; else we don't. Note that I already uploaded a patch to fix the docs, assigned to fdrake, using Tim's wording exactly. (patch #103045) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! From moshez at zadka.site.co.il Sun Dec 31 01:33:30 2000 From: moshez at zadka.site.co.il (Moshe Zadka) Date: Sun, 31 Dec 2000 02:33:30 +0200 (IST) Subject: [Python-Dev] FAQ Horribly Out Of Date Message-ID: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Hi! The current FAQ is horribly out of date.
I think the FAQ-Wizard method has proven itself not very efficient (for example, apparently no one noticed until now that it's not working <0.2 wink>). Is there any hope of putting the FAQ in Misc/, having a script which scp's it to the SF page, and making that the official FAQ? On a related note, what is the current status of the PSA? Is it officially dead? -- Moshe Zadka This is a signature anti-virus. Please stop the spread of signature viruses! From tim.one at home.com Sat Dec 30 21:48:08 2000 From: tim.one at home.com (Tim Peters) Date: Sat, 30 Dec 2000 15:48:08 -0500 Subject: [Python-Dev] Most everything is busted Message-ID: Add this error to the pot:

"""
http://www.python.org/cgi-bin/moinmoin

Proxy Error

The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /cgi-bin/moinmoin.

Reason: Document contains no data
-------------------------------------------------------------------
Apache/1.3.9 Server at www.python.org Port 80
"""

Also, as far as I can tell:

+ news->mail for c.l.py hasn't delivered anything for well over 24 hours.
+ No mail to Python-Dev has showed up in the archives (let alone been delivered) since Fri, 29 Dec 2000 16:42:44 +0200 (IST).
+ The other Python mailing lists appear equally dead.

time-for-a-new-year!-ly y'rs - tim From barry at wooz.org Sun Dec 31 02:06:23 2000 From: barry at wooz.org (Barry A. Warsaw) Date: Sat, 30 Dec 2000 20:06:23 -0500 Subject: [Python-Dev] Re: Most everything is busted References: Message-ID: <14926.34447.60988.553140@anthem.concentric.net> >>>>> "TP" == Tim Peters writes:

TP> + news->mail for c.l.py hasn't delivered anything for well
TP> over 24 hours.

TP> + No mail to Python-Dev has showed up in the archives (let
TP> alone been delivered) since Fri, 29 Dec 2000 16:42:44 +0200
TP> (IST).

TP> + The other Python mailing lists appear equally dead.
There's a stupid, stupid bug in Mailman 2.0, which I've just fixed and (hopefully) unjammed things on the Mailman end[1]. We're still probably subject to the Postfix delays unfortunately; I think those are DNS related, and I've gotten a few other reports of DNS oddities, which I've forwarded off to the DC sysadmins. I don't think that particular problem will be fixed until after the New Year. relax-and-enjoy-the-quiet-ly y'rs, -Barry [1] For those who care: there's a resource throttle in qrunner which limits the number of files any single qrunner process will handle. qrunner does a listdir() on the qfiles directory and ignores any .msg file it finds (it only does the bulk of the processing on the corresponding .db files). But it performs the throttle check on every file in listdir() so depending on the order that listdir() returns and the number of files in the qfiles directory, the throttle check might get triggered before any .db file is seen. Wedge city. This is serious enough to warrant a Mailman 2.0.1 release, probably mid-next week. From gstein at lyra.org Sun Dec 31 11:19:50 2000 From: gstein at lyra.org (Greg Stein) Date: Sun, 31 Dec 2000 02:19:50 -0800 Subject: [Python-Dev] FAQ Horribly Out Of Date In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200 References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Message-ID: <20001231021950.M28628@lyra.org> On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote: >... > On a related note, what is the current status of the PSA? Is it officially > dead? The PSA was always kind of a (legal) fiction with the basic intent to help provide some funding for Python development. Since that isn't occurring at CNRI any more, the PSA is a bit moot. There was always some idea that maybe the PSA would be the "sponsor" (and possibly the beneficiary) of the conferences. That wasn't ever really formalized either. 
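[Barry's footnote above about the Mailman 2.0 qrunner wedge can be illustrated with a sketch -- hypothetical function names, not the real qrunner code. The buggy throttle counts every directory entry, .msg files included, so a run of .msg files can exhaust the budget before a single .db file is handled:]

```python
def run_queue_buggy(entries, throttle=3):
    # Mimics the bug: the throttle check fires on *every* entry,
    # even the .msg files that are otherwise skipped.
    handled = []
    for count, name in enumerate(entries, start=1):
        if count > throttle:
            break
        if name.endswith(".msg"):
            continue            # skipped, but it already spent budget
        handled.append(name)    # process the .db file
    return handled

def run_queue_fixed(entries, throttle=3):
    # Count only the .db files actually processed.
    handled = []
    for name in entries:
        if name.endswith(".msg"):
            continue
        if len(handled) >= throttle:
            break
        handled.append(name)
    return handled

# If listdir() happens to return the .msg files first, the buggy
# version processes nothing at all -- wedge city:
listing = ["a.msg", "b.msg", "c.msg", "a.db", "b.db"]
print(run_queue_buggy(listing))  # []
print(run_queue_fixed(listing))  # ['a.db', 'b.db']
```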
From akuchlin at cnri.reston.va.us Sun Dec 31 16:58:12 2000 From: akuchlin at cnri.reston.va.us (Andrew Kuchling) Date: Sun, 31 Dec 2000 10:58:12 -0500 Subject: [Python-Dev] FAQ Horribly Out Of Date In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Sun, Dec 31, 2000 at 02:33:30AM +0200 References: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Message-ID: <20001231105812.A12168@newcnri.cnri.reston.va.us> On Sun, Dec 31, 2000 at 02:33:30AM +0200, Moshe Zadka wrote: >The current FAQ is horribly out of date. I think the FAQ-Wizard method >has proven itself not very efficient (for example, apparently no one >noticed until now that it's not working <0.2 wink>). Is there any It also leads to one section of the FAQ (#3, I think) having something like 60 questions jumbled together. IMHO the FAQ should be a text file, perhaps in the PEP format so it can be converted to HTML, and it should have an editor who'll arrange it into smaller sections. Any volunteers? (Must ... resist ... urge to volunteer myself... help me, Spock...) --amk From skip at mojam.com Sun Dec 31 20:25:18 2000 From: skip at mojam.com (Skip Montanaro) Date: Sun, 31 Dec 2000 13:25:18 -0600 (CST) Subject: [Python-Dev] plz test bsddb using shared linkage Message-ID: <14927.34846.153117.764547@beluga.mojam.com> A bug was filed on SF contending that the default linkage for bsddb should be shared instead of static because some Linux systems ship multiple versions of libdb. Would those of you who can and do build bsddb (probably only unixoids of some variety) please give this simple test a try? 
Uncomment the *shared* line in Modules/Setup.config.in, re-run configure, build Python and then try:

    import bsddb
    db = bsddb.btopen("/tmp/dbtest.db", "c")
    db["1"] = "1"
    print db["1"]
    db.close()
    del db

If this doesn't fail for anyone I'll check the change in and close the bug report, otherwise I'll add a(nother) comment to the bug report that *shared* breaks bsddb for others and close the bug report. Thx, Skip From skip at mojam.com Sun Dec 31 20:26:16 2000 From: skip at mojam.com (Skip Montanaro) Date: Sun, 31 Dec 2000 13:26:16 -0600 (CST) Subject: [Python-Dev] plz test bsddb using shared linkage Message-ID: <14927.34904.20832.319647@beluga.mojam.com> oops, forgot the bug report is at http://sourceforge.net/bugs/?func=detailbug&bug_id=126564&group_id=5470 for those of you who do not monitor python-bugs-list. S From tim.one at home.com Sun Dec 31 21:28:47 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 31 Dec 2000 15:28:47 -0500 Subject: [Python-Dev] FAQ Horribly Out Of Date In-Reply-To: <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> Message-ID: [Moshe Zadka] > The current FAQ is horribly out of date. The password is Spam. Fix it. > I think the FAQ-Wizard method has proven itself not very > efficient (for example, apparently no one noticed until now > that it's not working <0.2 wink>). I'm afraid almost nothing on python.org with an active component works today (not searches, not the FAQ Wizard, not the 2.0 Wiki, ...). If history is any clue, these will remain broken until Guido gets back from vacation. > Is there any hope putting the FAQ in Misc/, having a script > which scp's it to the SF page, and making that the official FAQ? Would be OK by me. I'm more concerned that the whole of python.org has barely been updated since March; huge chunks of the FAQ are still relevant, but, e.g., the Job Board hasn't been touched in over 3 months; the News got so out of date Guido deleted the whole section; etc.
> On a related note, what is the current status of the PSA? Is it > officially dead? It appears that CNRI can only think about one thing at a time <0.5 wink>. For the last 6 months, that thing has been the license. If they ever resolve the GPL compatibility issue, maybe they can be persuaded to think about the PSA. In the meantime, I'd suggest you not renew. From tim.one at home.com Sun Dec 31 23:12:43 2000 From: tim.one at home.com (Tim Peters) Date: Sun, 31 Dec 2000 17:12:43 -0500 Subject: [Python-Dev] plz test bsddb using shared linkage In-Reply-To: <14927.34846.153117.764547@beluga.mojam.com> Message-ID: [Skip Montanaro] > ... > Would those of you who can and do build bsddb (probably only > unixoids of some variety) please give this simple test a try? Just noting that bsddb already ships with the Windows installer as a (shared) DLL. But it's an old (1.85?) Windows port from Sam Rushing.
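[For modern readers: Skip's smoke test translates almost line for line to current Python, where the bsddb module is no longer in the standard library. The dbm module is used here as a stand-in -- it is not what the thread was testing -- to exercise the same open/store/read/close round trip:]

```python
import dbm
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "dbtest.db")
db = dbm.open(path, "c")   # "c": create the database if needed
db["1"] = "1"              # str keys/values are encoded to bytes
print(db["1"])             # dbm returns values as bytes: b'1'
db.close()
```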